Sample vs. population
-Population- Includes all the possible people with the characteristic to be studied. Ex: All the students attending Guilford College.
-Sample- A subgroup of the population with the characteristic to be studied. Ex: The freshman students attending Guilford College.
Different types of sampling
-Simple Random Sample- To reduce bias when choosing people for a study, this method gives each person in the entire population an equal chance of being selected. Ex: Place all freshman student ID numbers in a computer database, similar to a lottery, and then randomly select 50 ID numbers of students who would then participate in the study.
-Stratified Random Sampling- Divides the population into sections or levels created along lines of impact (race, gender, age, etc.), then randomly selects participants separately from each section. Less prone to bias than a simple random sample, because every section is sure to be represented. Ex: Divide Guilford College freshmen by their sex (male, female, etc.). Then pull a random sample from each of those groups.
-Random Cluster Sampling- Separate the population into naturally occurring subgroups, then randomly select a few of those subgroups to study. All of the participants belonging to the chosen subgroups will participate in the study. Ex: At Guilford College all entering freshmen belong to an FYE homeroom. If a researcher is interested in studying the average alcohol consumption of freshmen, she would randomly select a specific number of these FYE homerooms to study. Every student in the selected FYE homerooms would then participate in the study.
-Convenience Sample- The most biased method of creating a sample for study. This method relies on who happens to be accessible and willing to participate in the study. The data yielded do not necessarily represent the population of study. Ex: The researcher sits outside the Guilford College campus cafeteria and asks students to fill out surveys on alcohol consumption. The students who do not pass by the cafeteria, or who refuse to complete the survey, will not be considered.
-Sample Bias- Occurs whenever some parts of the population have a greater chance of participating in a study than others. All methods of collecting a sample carry some risk of bias.
Ex: A researcher selects five students from the Guilford College cafeteria to represent the entire student body. Not all students have a meal plan or eat in the cafeteria.
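As a rough sketch, the three random sampling methods above could be written in Python like this. The roster, its field names ("id", "sex", "homeroom"), the group sizes, and the function names are all made up for illustration; they are assumptions, not part of the original notes.

```python
import random

# Hypothetical roster: 200 freshmen, each with an ID, a sex, and an FYE homeroom.
roster = [
    {"id": i, "sex": "female" if i % 2 == 0 else "male", "homeroom": f"FYE-{i % 10}"}
    for i in range(200)
]

def simple_random_sample(population, k, seed=None):
    """Every member of the population has the same chance of being picked."""
    return random.Random(seed).sample(population, k)

def stratified_random_sample(population, key, k_per_stratum, seed=None):
    """Split the population into strata, then sample randomly inside each one."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(person[key], []).append(person)
    return {name: rng.sample(members, k_per_stratum) for name, members in strata.items()}

def random_cluster_sample(population, key, n_clusters, seed=None):
    """Randomly pick whole clusters; everyone in a chosen cluster participates."""
    rng = random.Random(seed)
    cluster_names = sorted({person[key] for person in population})
    chosen = set(rng.sample(cluster_names, n_clusters))
    return [person for person in population if person[key] in chosen]

lottery = simple_random_sample(roster, 50, seed=1)             # 50 random IDs
by_sex = stratified_random_sample(roster, "sex", 25, seed=1)   # 25 from each sex
homerooms = random_cluster_sample(roster, "homeroom", 3, seed=1)  # 3 whole homerooms
```

A convenience sample has no counterpart here on purpose: it involves no random selection, which is why it is the most biased of the four.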
Independent vs. dependent variable
-Independent- The variable the researcher has control over to manipulate or change. It affects the dependent variable. Also known as the predictor or explanatory variable. Ex: At Guilford College, a researcher wants to test the theory that vegetable and fruit consumption can decrease the number of classes students miss. The independent variable is the amount of fruits and vegetables that the students consume.
-Dependent- The variable you measure in a study; the response. The dependent variable responds to the independent variable. Ex: At Guilford College, a researcher wants to test the theory that vegetable and fruit consumption can decrease the number of classes students miss. The dependent variable is the number of classes missed.
Levels of measurement
-Nominal- Categorical. No hierarchy or ranking. Ex: The freshmen at Guilford College were asked to name their favorite movie genre.
-Ordinal- Ordered categories, in a hierarchy. May consist of ranges. The intervals are not necessarily equally spaced. Ex: The researcher asks freshmen at Guilford College to report the amount of alcohol they consume per week by choosing a range: A) 0-3, B) 4-6, C) 7-10, D) 11 or more.
-Interval- Same distance between the values. No real 0. Ex: The researcher asks freshmen at Guilford College to report the temperature of their dorm room in degrees Fahrenheit. A reading of 0 degrees does not mean there is no temperature.
-Ratio- Same distance between the values. True 0. Ex: The researcher asks freshmen at Guilford College to report the amount of alcohol they consume per week as a number. Values can range from 0 to infinity.
Statistical distributions (negative skew; positive skew; normal; bimodal) VISUAL
Normal- When the curve is in the middle of the graph with equal tails on both sides. The mean, median, and mode are similar.
Ex: [Figure: normal curve over a number line from 1 to 13, with the mean, median, and mode labeled]
Positive skew- When the curve is on the left side, with the tail streaming to the right. The mean is larger than the median.
Ex: [Figure: positively skewed curve]
Negative skew- When the distribution curve is on the right side, with the tail streaming to the left. The mean is smaller than the median.
Ex: [Figure: negatively skewed curve]
Bimodal- When there are two peaks within the data's range. The peaks do not have to be equal in height; the distribution contains two modes.
Measures of central tendency- Measurements used to help describe data and its distribution.
Mean- The average of a data set. Use only for interval and ratio data. Calculated by adding all the individual values, then dividing by the number of values. Ex: Five students at Guilford College were surveyed on their weekly consumption of alcoholic servings. Results: 0, 2, 4, 10, 4. To calculate: (0+2+4+10+4)/5 = 20/5 = 4 (Mean)
Median- After lining up the individual data from lowest to highest, the researcher identifies the middle number. If there is an even number of values (resulting in two middle numbers), take the mean of the two middle values as the median. Can be used for ordinal, interval, and ratio data. Ex: 0, 2, 4, 4, 10 Median = 4
Mode- The category or number that appears most often. Can be used for nominal, ordinal, interval, and ratio data. Ex: Mode = 4
In a normal distribution: The mean, median, and mode are all the same.
In a positively skewed distribution: Mean > median.
In a negatively skewed distribution: Median > mean.
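The worked example above can be checked with a short Python sketch that uses the standard library's statistics module. The variable names and the second, skewed data set are assumptions added only for illustration.

```python
import statistics

# Weekly alcoholic servings reported by the five surveyed students.
servings = [0, 2, 4, 10, 4]

mean = statistics.mean(servings)      # (0 + 2 + 4 + 10 + 4) / 5
median = statistics.median(servings)  # middle value of 0, 2, 4, 4, 10
mode = statistics.mode(servings)      # value that appears most often

print(mean, median, mode)  # 4 4 4

# With an even number of values, the median is the mean of the two middle values.
even_median = statistics.median([0, 2, 4, 10])  # (2 + 4) / 2
print(even_median)  # 3.0

# Hypothetical positively skewed data: one large value pulls the mean above the median.
skewed = [0, 1, 1, 2, 2, 3, 20]
print(statistics.mean(skewed) > statistics.median(skewed))  # True
```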
Percentages vs. proportions (how to calculate each)
Percentage: Says how many out of 100. Helpful when comparing two or more large groups. To calculate, divide the smaller number by the total number (the larger number), then multiply by 100. Ex: If 146 of the Guilford freshman sample group (N=400) reported that they do not drink alcohol, then 146/400 = 0.365 and 0.365 x 100 = 36.5%, so 36.5% of the freshman sample group do not drink.
Proportion: The number as a part of one. To calculate, divide the same way as for a percentage, but do NOT multiply by 100 at the end. The result is usually a decimal. Ex: If 146 of the Guilford freshman sample group (N=400) reported that they do not drink alcohol, then the proportion would be 146/400 = 0.365 (three hundred sixty-five thousandths).
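A minimal Python sketch of the two calculations, using the 146-out-of-400 example. The function names are assumptions chosen for illustration.

```python
def proportion(part, total):
    """Part of one: divide the part by the total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total

def percentage(part, total):
    """Out of 100: the same division, then multiplied by 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * part / total

non_drinkers = 146   # freshmen who reported that they do not drink
sample_size = 400    # N

print(proportion(non_drinkers, sample_size))  # 0.365
print(percentage(non_drinkers, sample_size))  # 36.5
```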
Variation (range; standard deviation) VISUAL
Range: The difference between the highest number and the lowest (largest number - smallest number = range). Sensitive to outliers.
Ex: [Figure: number line from 1 to 13 with the low value (Low=3) marked]
The range is calculated as follows: 9 - 3 = 6
Standard Deviation: A number that indicates how much the values in a data set differ from the mean of the set. Applies to a single data set. It is a measure of spread, or dispersion, and the most used measure of variability. 1SD=68%, 2SD=95%, 3SD=99.7% of the values in a normally distributed data set.
Ex: Guilford College students were asked to report how many times they ate in the cafeteria per week, between 0 and 11. The resulting figure shows the distribution of responses. The mean of this sample was 7. The standard deviation is about 2. It can be concluded that about 95% of the data fall within 2 standard deviations of the mean: 7 - 2(2) = 3 and 7 + 2(2) = 11, so between 3 and 11.
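A small Python sketch of range and standard deviation. The list of cafeteria visits is invented so that its mean is 7 and its standard deviation is about 2; it is not the data behind the original figure. The sketch uses the sample standard deviation (statistics.stdev), which is also an assumption, since the notes do not say which version was used.

```python
import statistics

def value_range(values):
    """Largest number minus smallest number."""
    return max(values) - min(values)

def share_within(values, k):
    """Fraction of values that lie within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    low, high = mean - k * sd, mean + k * sd
    inside = [v for v in values if low <= v <= high]
    return len(inside) / len(values)

# Hypothetical weekly cafeteria visits for 11 students.
visits = [3, 5, 5, 6, 7, 7, 7, 8, 9, 9, 11]

print(value_range(visits))        # 11 - 3
print(statistics.mean(visits))    # mean of the sample
print(statistics.stdev(visits))   # roughly 2
print(share_within(visits, 1))    # share of values within 1 SD of the mean
print(share_within(visits, 2))    # share of values within 2 SD of the mean
```

With only 11 made-up values the shares will not match 68% and 95% exactly; those figures describe a normal distribution, which a large sample approximates more closely.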
Standard error
SE = margin of error for the means of multiple data sets. It helps determine how well a sample mean can be generalized to a wider group. To calculate, divide the standard deviation by the square root of the sample size. The larger the sample, the smaller the SE of the mean. The less variability in a population, the smaller the SE of the mean. 1SE=68%, 2SE=95%, 3SE=99.7% for a normal distribution of sample means. Ex: Given the standard deviation is 4 and the sample size is 100, SE = 4/10 (the square root of 100 is 10). It can be determined that the standard error is 0.4.
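The example above can be sketched in Python as follows. The function name is an assumption; the formula is the one the example uses (standard deviation divided by the square root of the sample size).

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of the sample size."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

print(standard_error(4, 100))  # 4 / 10 = 0.4

# A larger sample gives a smaller SE; less variability does too.
print(standard_error(4, 400))  # 4 / 20 = 0.2
print(standard_error(2, 100))  # 2 / 10 = 0.2
```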
Confidence intervals- (SD and SE) A measurement of the level of confidence a researcher has when using a statistic from a sample to estimate a parameter of the population, when the data are normally distributed. To calculate a 68% CI, the statistician would subtract 1SD from the data mean (or 1SE for population means) to get the low number of the confidence interval, and add 1SD (or 1SE for population means) to the data mean to get the high number of the CI. The CI for a sample follows 1SD=68%, 2SD=95%, and 3SD=99.7%. The same rule applies to population means, using SE. Ex: Given the mean of the data is 20 and the standard deviation is 2, the 68% confidence interval for 1SD would be 18 to 22 (20 - 2 = 18 and 20 + 2 = 22).
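A minimal Python sketch of the rule described above. The function name and the lookup table are assumptions for illustration; the multipliers simply encode the 68 / 95 / 99.7 rule from these notes, and the spread can be either an SD or an SE.

```python
# Number of SDs (or SEs) to go out from the mean for each confidence level.
MULTIPLIER = {68: 1, 95: 2, 99.7: 3}

def interval(center, spread, level=68):
    """Return (low, high): the mean minus and plus k times the SD or SE."""
    k = MULTIPLIER[level]
    return (center - k * spread, center + k * spread)

# Example from the notes: mean 20, SD 2.
print(interval(20, 2, 68))  # (18, 22)
print(interval(20, 2, 95))  # (16, 24)

# Same rule with a standard error of 0.4 in place of the SD.
low, high = interval(20, 0.4, 95)
print(low, high)
```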
(ALL images by Pauravi Shippen-How)
[Figure: number line from 0 to 11 with the mean marked and High=11; Mean ± 2SD = 7 ± 2(2) = 3, 11]
Primary Learning Strategy: My primary learning strategy is visual. I chose statistical distribution, including the normal curve, positive skew, negative skew, and bimodal. Additionally, I included a visual for range and standard deviation. I chose to represent statistical distribution through a visual because I can better identify not only the curve of each distribution according to the data given, but also the relationship of the mean, median, and mode depending on the shape of the curve. Moreover, the range and standard deviation were initially difficult for me to understand in terms of their differences. However, physically drawing a number line and seeing how each deviation falls along the line helped me to better distinguish between the highest and lowest values in the range. Additionally, the differences between the confidence interval and the standard deviation or standard error are more apparent on a number line when distinguishing between 68%, 95%, and 99.7%.