Statistics in Education
Course Packet
Dr. Rachel Vannatta
EDFI 641
Table of Contents
Video #1 Introduction to Statistics ..... 2
Video #2 Frequency Distributions ..... 6
Video #3 Central Tendencies & Variability ..... 10
Video #4 Probability & z Score ..... 18
    M-n-M Activity ..... 19
Video #5 Distribution of Sample Means ..... 24
Video #6 Hypothesis Testing ..... 28
Video #7 t Test ..... 38
Video #8 t Test of Independent Samples ..... 44
    Interpreting Research ..... 49
Video #9 t Test of Related Samples ..... 50
    Interpreting Research ..... 54
    Coke vs. Pepsi Experiment ..... 55
Video #10 ANOVA ..... 57
    Interpreting Research ..... 63
Video #11 Correlation & Regression ..... 64
    Interpreting Research ..... 73
Video #12 Chi Square ..... 75
    Interpreting Research ..... 81
    Statistical Test Grid ..... 82
Unit Normal (z-score) Table ..... 84
t Distribution Table ..... 88
F Distribution (ANOVA) Table ..... 89
Pearson Correlation Table ..... 90
Chi Square Distribution Table ..... 91
Page 2
Two major types of statistical methods
descriptive stats: summarize, organize and simplify data (e.g., mean, standard deviation, tables, graphs, distributions)
    data
    raw score
inferential stats: techniques that allow us to study samples and make generalizations about the population from which they were selected (e.g., t test, ANOVA, correlation)
    sampling error: amount of error between the sample statistic and the population parameter (degree to which the sample differs from the population)
    random sampling: used to minimize error between sample and population
Inferential statistics also allow us to study relationships between/among variables that the sample holds.
variable: characteristic/condition that differs among individuals (gender, height, test scores, IQ)
construct: hypothetical concept/theory used to organize observations
operational definition: defines a construct in terms of how it is measured
Types of Variables
categorical variable (discrete): consists of separate categories (e.g., gender, religion, classification of personality)
quantitative variable (continuous): can be divided into an infinite number of fractional parts (e.g., height, time, age)
independent variable: usually a treatment that has been manipulated (control group versus experimental group), usually categorical
dependent variable: usually the effect, usually quantitative
confounding variable: an uncontrolled variable that creates a difference between the control and experimental groups
Variables determine the type of relationship being studied
    mutual
    causal
Groups must be compared to examine cause and effect
categorical variable groups are created by a
Page 3
Page 4
Research Designs
Correlational: studies relationships among 2 or more variables to explain or predict behaviors
usually both IV and DV are quantitative
example: Teacher studies the relationship between English grades and overall GPA.

Experimental: examines cause and effect; manipulates a treatment and tests the outcome; compares the experimental and control groups (groups are randomly created)
IV = nominal; DV = interval/ratio
example: Researcher compares grades of a group of students that receives computer-assisted instruction to a group that receives none. Groups were created through random assignment.

Quasi-Experimental: examines cause and effect; indirectly manipulates a treatment and tests the outcome; compares the experimental and control groups (uses existing groups)
IV = nominal; DV = interval/ratio
example: Researcher compares grades of a group of students that receives computer-assisted instruction to a group that receives none. Existing groups were used.

Causal Comparative: examines cause and effect (cautiously); compares groups created by some categorical characteristic (gender, religion, ethnicity)
IV = nominal; DV = interval/ratio
example: Researcher compares final grades of male and female students.
Most research is guided by a hypothesis, a prediction about the effect of the treatment.
Measurement Scales
Nominal: numbers have NO numerical value but represent categories (religion, ethnicity, occupation, gender)
Ordinal: numbers represent a rank (1 being the best); intervals can vary (e.g., class rank, Olympic ordinals)
Interval: numbers have typical numerical value; intervals are equal; no real zero (e.g., temperature, test score)
Ratio: same as interval but has a real zero (e.g., money, time)
Page 5
Identify the measurement scale (nominal, ordinal, interval, ratio) for each.
_________________ 5. Size of school district (small, medium, large)
_________________ 6. Rank of faculty on their teaching ratings
_________________ 7. Social security number
_________________ 8. Color of person's eyes
_________________ 9. IQ scores
_________________ 10. Degrees in Fahrenheit
_________________ 11. Religious affiliation
_________________ 12. Medalists in an Olympic event
_________________ 13. Income in actual dollars
Page 6
X     f
10    1
9     4
8     5
6     6
5     2
4     2
Proportion and Percents of Frequency Distributions
Proportion: relative frequency; measures the fraction of the total group that is associated with each score; most often appears as a decimal
$p = \frac{f}{N}$
Percentage: percent of the total group that is associated with each score
$\text{percentage} = p(100) = \frac{f}{N}(100)$
X     f     p=f/N     %=p(100)     cum f     cum%
10    1               5            1         5
9     4               20           5         25
8     5               25           10        50
6     6               30           16        80
5     2               10           18        90
4     2               10           20        100
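The columns of the table above can be reproduced with a short script. This is only a sketch: the function name `frequency_table` is made up for illustration, and the cumulative columns are accumulated from the highest score downward because that is how the table above is laid out.

```python
from collections import Counter

def frequency_table(scores):
    """Return rows of (X, f, p, percent, cum_f, cum_percent), highest score first."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cum_f = 0
    for x in sorted(counts, reverse=True):
        f = counts[x]
        cum_f += f                      # running total of frequencies
        rows.append((x, f, f / n, f * 100 / n, cum_f, cum_f * 100 / n))
    return rows

# Raw scores rebuilt from the X and f columns of the table (N = 20)
scores = [10] * 1 + [9] * 4 + [8] * 5 + [6] * 6 + [5] * 2 + [4] * 2
rows = frequency_table(scores)
for row in rows:
    print(row)
```

Each printed row lists X, f, the proportion p, the percentage, cum f, and cum%, in that order.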
Page 7
Helpful Hints
use the following equation to determine the number of intervals and the interval width that are appropriate for the data:
$\text{number of intervals} = \frac{\text{highest score} - \text{lowest score} + 1}{\text{interval width}}$
ALWAYS round up the number of intervals! It is impossible to have a fourth of an interval at the end of the distribution. So even if the number of intervals (using the above formula) equals 8.25, round up to 9!
try different widths until an appropriate number of intervals is calculated
Example: N=25
51, 55, 57, 60, 63, 66, 68, 69, 70, 72, 74, 74, 74, 75, 77, 79, 83, 84, 85, 85, 88, 90, 92, 95, 98
$\text{number of intervals} = \frac{98 - 51 + 1}{5} = \frac{48}{5} = 9.6$ (round up to 10)

Interval    f
95-99       2
90-94       2
85-89       3
80-84       2
75-79       3
70-74       5
65-69       3
60-64       2
55-59       2
50-54       1
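The grouped frequency table for this example can be checked with a short sketch. The function name `grouped_frequencies` is hypothetical, and the code assumes that the lowest interval starts at a multiple of the interval width (50 here), which matches the example but is not a rule stated in the packet.

```python
import math

def grouped_frequencies(scores, width):
    """Return (number of intervals from the formula, list of (low, high, f) rows)."""
    low, high = min(scores), max(scores)
    # (highest - lowest + 1) / width, always rounded up
    n_intervals = math.ceil((high - low + 1) / width)
    # Assumption: the bottom interval begins at a multiple of the width
    start = (low // width) * width
    rows = []
    lo = start
    while lo <= high:
        hi = lo + width - 1
        f = sum(1 for x in scores if lo <= x <= hi)
        rows.append((lo, hi, f))
        lo += width
    return n_intervals, rows[::-1]      # highest interval first, as in the table

data = [51, 55, 57, 60, 63, 66, 68, 69, 70, 72, 74, 74, 74, 75, 77,
        79, 83, 84, 85, 85, 88, 90, 92, 95, 98]
n_intervals, table = grouped_frequencies(data, 5)
print(n_intervals)
for lo, hi, f in table:
    print(f"{lo}-{hi}: {f}")
```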
keep in mind that since a continuous variable contains an infinite number of points, a score is not assigned a single point but rather an interval with boundaries, also called real limits, that separate a score from the adjacent scores.
Example: X = 88
upper real limit = 88.5
lower real limit = 87.5
therefore, a score of 87.75 would fall in the interval of X = 88
Frequency Distribution Graphs
Use an x-axis to represent scores and a y-axis to represent frequencies
List scores increasing in value from left to right
List frequencies in increasing value from bottom to top
The height of the y-axis should be approximately 2/3 to 3/4 of the length of the x-axis
Page 8
Creating a Grouped Frequency Histogram: follow rules for Grouped Frequency Table
Histogram: used for interval/ratio data; a bar represents an interval (real limits of the score or class interval); bars touch each other to represent the continuous nature of the data; height corresponds to frequency
Example: Using data from the Grouped Frequency Table on previous page
[Figure: grouped frequency histogram; x-axis: class intervals 50-54 through 95-99; y-axis: frequency, 1 to 5]
Bar Graph: used for nominal/ordinal data; a bar represents a category; bars do not touch
Frequency Distribution Polygon: used for interval/ratio data; a single dot represents an individual score or a class interval; dots are connected
Distribution Curve: shows relative frequencies for the population; smooth
Normal: symmetrical; greatest frequency in the middle, smallest frequency in the extremes (tails)
Positively Skewed: smallest frequency in the positive (right) end of the distribution
Negatively Skewed: smallest frequency in the negative (left) end of the distribution
Page 9
10, 15, 18, 22, 25, 26, 29, 31, 33, 33, 34, 37, 38, 39, 39, 40, 40, 40, 41, 42, 42, 43, 44, 45, 46, 46, 47, 48, 49, 50 1. Using the data above, do the following: a. construct a histogram based upon the grouped frequency distribution b. determine the distribution type (normal, positive, negative) from the histogram
Page 10
Central tendency: describes a group of individuals with a single measurement that is most representative of all individuals
Types: mean, median, and mode

Mean: arithmetic average
used for interval/ratio (quantitative) data
computed by adding all the scores and dividing by the number of scores
Population mean: $\mu = \frac{\sum X}{N}$
Sample mean: $\bar{X} = \frac{\sum X}{n}$

Median: the midpoint; the score that divides the distribution exactly in half; 50% are above and below the median
used for ordinal data or when: there is a skewed distribution, some scores are undetermined, or there is an open-ended distribution
Calculating the median when N is an odd number: make sure scores are in order; find the middle score
Calculating the median when N is an even number: make sure scores are in order; find the two middle scores; add the two scores & divide by 2

Mode: the most frequent score
used especially for nominal data
represented by the highest point in the frequency distribution
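The three measures of central tendency described above can be illustrated with Python's standard `statistics` module. The scores below are made up for illustration and do not come from the packet.

```python
import statistics

odd_scores = [2, 3, 3, 5, 7]     # hypothetical data, N is odd
even_scores = [2, 3, 5, 7]       # hypothetical data, N is even

mean_value = statistics.mean(odd_scores)         # sum of scores / number of scores
median_odd = statistics.median(odd_scores)       # the middle score
median_even = statistics.median(even_scores)     # average of the two middle scores
mode_value = statistics.mode(odd_scores)         # the most frequent score

print(mean_value, median_odd, median_even, mode_value)
```

Note that `statistics.median` sorts the scores itself, so the "make sure scores are in order" step is handled by the function.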
Page 11
[Figure: distribution curve with the positions of the mode, median, and mean marked]
Variability
Variability: a measure that describes how spread out or close together the scores are within the distribution
Range: distance between the highest score and the lowest score in the distribution; easiest measure of variability
[Figure: frequency histogram; x-axis: scores 1 to 11; y-axis: frequency 0 to 7]
Page 12
Standard Deviation from the Mean: most common measure of variability; average distance of scores from the mean
[Figure: frequency histograms used to compare variability, including one labeled Distribution #2; x-axis: scores 1 to 11; y-axis: frequency]
Page 13
Page 14
Steps
1. For distribution A, create a normal distribution like Dr. Vannatta's with your candy. Trace outline of distribution.
Now on your own, complete the following:
2. For distribution B, move candy around to create a distribution that has greater variability than A. Trace outline of distribution.
3. For distribution C, move candy around to create a distribution that has less variability than A. Trace outline of distribution.
4. For distribution D, move candy around to create a distribution that has the least possible amount of variability.
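Population standard deviation: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
Sample standard deviation: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$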
degrees of freedom (df = n - 1): an adjustment for sample bias; to calculate the sample standard deviation, we must know the sample mean. This places a restriction on sample variability, since only (n - 1) scores are free to vary once we know the sample mean.
Page 15
n1
n1
15
Sum of squares: sum of squared deviation scores or sum of squared differences
$SS = \sum (X - \bar{X})^2$; also $SS = s^2(n - 1)$
Variance: mean of squared deviation scores; sum of squares divided by the number of scores minus 1
$\text{variance} = s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

5. Divide SS by degrees of freedom (df = n - 1). This is the variance: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
6. Take the square root of the variance. This is the standard deviation (SD): $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
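The chain from sum of squares to variance to standard deviation can be sketched in a few lines. The function name `sample_variability` and the five scores are hypothetical and chosen only so that the arithmetic is easy to follow by hand.

```python
from math import sqrt

def sample_variability(scores):
    """Return (SS, variance, SD) for a sample, using df = n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)   # sum of squared deviations
    variance = ss / (n - 1)                     # SS divided by degrees of freedom
    sd = sqrt(variance)                         # square root of the variance
    return ss, variance, sd

scores = [1, 3, 5, 7, 9]                        # hypothetical sample, mean = 5
ss, variance, sd = sample_variability(scores)
print(ss, variance, round(sd, 2))
```

For these scores the deviations are -4, -2, 0, 2, 4, so SS = 40, the variance is 40 / 4 = 10, and the standard deviation is the square root of 10, about 3.16.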
Page 16
$X$     $X - \bar{X}$     $(X - \bar{X})^2$
2
2
8
8
10
SS =

$X$     $X - \bar{X}$     $(X - \bar{X})^2$
2
6
6
6
10
SS =
Characteristics of standard deviation
a small standard deviation indicates that scores are close together
a large standard deviation indicates that scores are spread out
adding a constant to each score will not change the standard deviation
multiplying each score by a constant causes the standard deviation to be multiplied by that same constant
research articles usually use (SD) to refer to the standard deviation
Standard deviation and the normal distribution
three standard deviations on each side of the mean
[Figure: normal curve with the x-axis marked from -3 to +3 standard deviations around the mean]
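Two of the characteristics of the standard deviation listed above (adding a constant, multiplying by a constant) are easy to confirm numerically. The scores and the constants 10 and 3 are arbitrary illustrations, not values from the packet.

```python
from math import isclose
from statistics import stdev   # sample standard deviation (divides by n - 1)

scores = [1, 3, 5, 7, 9]                 # hypothetical sample
shifted = [x + 10 for x in scores]       # add a constant to every score
scaled = [x * 3 for x in scores]         # multiply every score by a constant

sd_original = stdev(scores)
sd_shifted = stdev(shifted)
sd_scaled = stdev(scaled)

print(isclose(sd_shifted, sd_original))        # adding a constant: SD unchanged
print(isclose(sd_scaled, 3 * sd_original))     # multiplying by 3: SD is 3 times larger
```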
Page 17
This data is slightly different from what is presented in the video so that a cleaner mean would be calculated.
b. Calculate the following: mean = ____________________ median = ____________________ mode = ____________________ range = ____________________ degrees of freedom = ____________________ standard deviation = ____________________
c.
Page 18
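Probability: fraction, proportion or percent of selecting a specific outcome out of the total number of possible selections
$\text{probability of } A = \frac{\text{number of } A\text{'s}}{\text{total number of possible outcomes}}$

[Figure: normal curve showing the percent of scores in each region and the cumulative percent at each standard deviation]

Region (standard deviations)     Percent of scores
below -3                         .13%
-3 to -2                         2.14%
-2 to -1                         13.59%
-1 to mean                       34.13%
mean to +1                       34.13%
+1 to +2                         13.59%
+2 to +3                         2.14%
above +3                         .13%

Position     Cumulative percent
-3           0.13%
-2           2.28%
-1           15.87%
mean         50%
+1           84.13%
+2           97.72%
+3           99.87%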
Page 19
z Scores
Page 20
z score: measure of relative position; identifies position of a raw score in terms of the number of standard deviations it falls above or below the mean
Use z scores to convert a raw score into a percentile rank
$z = \frac{X - \mu}{\sigma}$

Example: Jill gets a raw score of 55 on a standardized math test ($\mu = 50$, $\sigma = 10$). What is Jill's z score?

$z = \frac{X - \mu}{\sigma} = \frac{55 - 50}{10} = \frac{5}{10} = .5$
So Jill is .5 standard deviation above the mean.
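Jill's example can be written as a one-line function. The name `z_score` is just an illustrative choice.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that raw score x falls above (+) or below (-) the mean."""
    return (x - mu) / sigma

jill_z = z_score(55, mu=50, sigma=10)
print(jill_z)
```

A negative result would mean the raw score falls below the mean; for example, a raw score of 45 on the same test gives z = -.5.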
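[Figure: normal curve in z-score units. 68% of scores fall within 1z of the mean, 95% within 2z, and 99.7% within 3z. Percent of scores in each region, from the left tail to the right tail: .13%, 2.14%, 13.59%, 34.13%, 34.13%, 13.59%, 2.14%, .13%]

z score     Cumulative percent
-3z         0.13%
-2z         2.28%
-1z         15.87%
mean        50%
1z          84.13%
2z          97.72%
3z          99.87%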
View area under the normal curve in terms of probability and percent:
What is the probability of selecting a score that falls beyond 1z? p = .1587
What is the probability of selecting a score that falls below -2z? p = .0228
What is the percentile rank of someone who has a z score of 2? 98th %tile
What is the percentile rank of someone who has a z score of 1? 84th %tile
What if we have a z-score of 1.2, how can we find the probability or percentile rank? We use the table of z scores provided in your course packet (see statistical tables on page 84).
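As a cross-check on the printed z table, the same lookups can be done with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is offered as a convenience sketch, not as a replacement for the table used in the course.

```python
from statistics import NormalDist

standard_normal = NormalDist()           # mean 0, standard deviation 1

def proportion_below(z):
    """Area under the normal curve to the left of z."""
    return standard_normal.cdf(z)

p_below_1_2 = proportion_below(1.2)      # percentile rank for z = 1.2
p_beyond_1 = 1 - proportion_below(1)     # probability of a score beyond 1z
p_below_neg2 = proportion_below(-2)      # probability of a score below -2z

print(round(p_below_1_2, 4), round(p_beyond_1, 4), round(p_below_neg2, 4))
```

A z-score of 1.2 has about .8849 of the distribution below it, which corresponds to roughly the 88th percentile.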
Page 21
$z = \frac{540 - 500}{100} = .4$

Raw score     z score
200           -3z
300           -2z
400           -1z
500           0z
600           +1z
700           +2z
800           +3z
Suppose an individual scored at the 70th percentile on a standardized test ($\mu = 100$, $\sigma = 10$), but for some reason we don't know his raw score and need to calculate it.

1. Use the equation: $\text{raw score} = \mu + z\sigma$
2. Use the percentile rank and convert it to a probability (example: 70% = .7000).
3. Use the z-table to identify the z-score associated with the probability. .7000 corresponds to a z-score of z = .52 (notice that we could not find a probability of exactly .7000 but had to find the probability that was closest to .7000, which was .6985).
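The three steps above can be sketched with `statistics.NormalDist`, whose `inv_cdf` method plays the role of reading the z table backwards. The function name `raw_score_from_percentile` is hypothetical. Because `inv_cdf` is not limited to two decimal places, it returns z of about .5244 rather than the table's nearest entry of .52, so the result differs slightly from a hand calculation.

```python
from statistics import NormalDist

def raw_score_from_percentile(percentile, mu, sigma):
    """Convert a percentile rank (0-100) to a raw score: raw score = mu + z * sigma."""
    probability = percentile / 100                 # step 2: 70% becomes .7000
    z = NormalDist().inv_cdf(probability)          # step 3: z for that probability
    return mu + z * sigma                          # step 1: the raw score equation

exact_raw = raw_score_from_percentile(70, mu=100, sigma=10)
table_raw = 100 + 0.52 * 10                        # using the table value z = .52

print(round(exact_raw, 2), round(table_raw, 2))
```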
Page 22
2. Bebe scored 48. Place Bebe's score on the distribution. What is her z-score and percentile rank?
3. Kenny scored 63. Place Kenny's score on the distribution. What is his z-score and percentile rank?
4. Sally is at the 71st percentile. Place Sally on the distribution. What is her z-score and raw score?
Page 23
For problems 5-7, use the following parameters from the GRE ($\mu = 500$, $\sigma = 100$). 5. Mary scored 570. What is her z-score and percentile rank?
For problems 8-10, use the parameters from an IQ test ($\mu = 100$, $\sigma = 15$). 8. Wendy scored at the 90th percentile. What is her raw score?
10. Jack scored 80. What is his z score and percentile rank?
Page 24
With statistics, we are usually trying to make conclusions/inferences about the population from the studied sample. Consequently, we want to compare the sample to the population of similar samples. But in doing so, two issues arise:
How do we know if a sample is representative of the population when every sample is different?
How can we transform a population distribution of individuals to a population distribution of sample means?
Every sample is different from the population; this is known as sampling error, or the discrepancy/error between the sample and the population. Random sampling is used to minimize sampling error, which can occur randomly.

If we were to take a population distribution of individuals, randomly group the individuals into similar-sized samples, calculate the means of these samples, and place the means into a frequency distribution, a normal curve would form. This distribution is known as the distribution of sample means.
any distribution that is of sample statistics and NOT individual scores is referred to as a sampling distribution.

as sample size increases, the standard error will decrease -----> which means that the samples are more representative of the population
Page 25
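$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{400}} = \frac{100}{20} = 5$

[Figure: population of individuals and population of samples (n=400), both centered at 500 (0z)]

A sample mean of 515 corresponds to +3z. Using the z table, +3z corresponds to a probability of .0013 (.13%).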
Page 26
Example: What is the probability of getting a sample mean of 104 or higher on an IQ test ($\mu = 100$, $\sigma = 15$) with a random sample of n = 36?
Calculate the standard error for samples of n = 36.
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{36}} = \frac{15}{6} = 2.5$

z score     pop of individuals     pop of samples (n=36)
-3z         55                     92.5
-2z         70                     95
-1z         85                     97.5
0z          100                    100

A sample mean of 104 corresponds to +1.6z
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{104 - 100}{2.5} = \frac{4}{2.5} = 1.6$
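The IQ example can be carried through to the probability with a short sketch. The function names are hypothetical, and `statistics.NormalDist` stands in for the z table, so the final figure is the area beyond z = 1.6 as computed by the library.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

def prob_sample_mean_at_least(sample_mean, mu, sigma, n):
    """Probability of drawing a sample mean this high or higher."""
    z = (sample_mean - mu) / standard_error(sigma, n)
    return z, 1 - NormalDist().cdf(z)

se = standard_error(15, 36)
z, p = prob_sample_mean_at_least(104, mu=100, sigma=15, n=36)
print(se, round(z, 2), round(p, 4))
```

With a standard error of 2.5, the sample mean of 104 sits at z = 1.6, and about .0548 of all sample means of this size would be that high or higher.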
Page 27
1. A normal population has $\mu = 70$ and $\sigma = 12$.
a. Sketch the population distribution. What proportion of the scores have values greater than a score of X = 73?
b. Sketch the distribution of sample means for samples of size n = 16. What proportion of the means have values greater than a mean of $\bar{X} = 73$?
[Blank axis from -3z to +3z, with rows for the population of individuals and the population of samples]

2. For a normal population with $\mu = 70$ and $\sigma = 20$, what is the probability of obtaining a sample mean greater than $\bar{X} = 75$
a. For a random sample of n = 4?
b. For a random sample of n = 16?
c. For a random sample of n = 100?
[Blank axis from -3z to +3z]
Page 28
Hypothesis Testing: using sample data to evaluate a hypothesis (prediction) about the population so conclusions/inferences can be made about the population from the sample
We are testing a hypothesis to determine if the treatment has caused a significant change in the population
the majority of sample means are in the middle of the distribution; so for a sample to be significantly different, it should be with the extreme means in the tails of the distribution, where the probability is very low

hypotheses can also be directional or non-directional
non-directional: just a prediction of a change/effect
Key words: effect, impact, difference, cause
directional: a prediction of increase or decrease
Key words: increase, decrease, higher, lower, positive, negative
[Table: alternative hypothesis notation for one-tailed (directional) and two-tailed (non-directional) tests, applying the example value $\mu = 60$]
Page 29
Example: Suppose that a local school district implemented an experimental program for science education. After one year, 100 children in the special program obtained a mean score of $\bar{X} = 63$ on a national science achievement test ($\mu = 60$, $\sigma = 12$). Did the program have an impact on the participants' science achievement?
alternative: The science program will significantly affect science achievement among program participants. This is an example of a non-directional hypothesis.
$H_1: \mu_{sprog} \neq 60$
null: The science program will NOT significantly affect science achievement among program participants.
$H_0: \mu_{sprog} = 60$

By setting a benchmark or criterion that requires the change in the population mean to be quite large and the probability of this change being due to chance to be very low, we decrease our chance of a Type I error
this criterion is known as the level of significance or alpha level ($\alpha$)
most commonly used alpha levels are .05 (5%) and .01 (1%)
these levels of significance correspond with specific z scores, but the z scores depend upon whether the hypothesis is directional or non-directional
non-directional hypothesis ---> 2-tailed test
.05 level -------> $z_{critical} = \pm 1.96$
.01 level -------> $z_{critical} = \pm 2.58$
[Figure: normal curve from -3z to +3z; 95% of sample means fall between -1.96z and +1.96z, and 99% fall between -2.58z and +2.58z]
directional hypothesis ---> 1-tailed test
.05 level -------> $z_{critical}$ = + or - 1.65
.01 level -------> $z_{critical}$ = + or - 2.33
[Figure: normal curve from -3z to +3z; for a test of an increase, 95% of sample means fall below +1.65z and 99% fall below +2.33z]
when the sample mean exceeds the limit, then it differs significantly, so we would reject the null
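The critical z values quoted above can be recovered from the alpha level with `statistics.NormalDist.inv_cdf`. The function name `z_critical` is hypothetical. One small difference to expect: the one-tailed .05 value comes out as 1.645, which the packet rounds to 1.65.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Positive critical z for a given alpha level.

    A two-tailed test splits alpha between both tails;
    a one-tailed test puts all of alpha in one tail.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed_05 = z_critical(0.05)                      # about 1.96
two_tailed_01 = z_critical(0.01)                      # about 2.58
one_tailed_05 = z_critical(0.05, two_tailed=False)    # about 1.645
one_tailed_01 = z_critical(0.01, two_tailed=False)    # about 2.33

print(round(two_tailed_05, 2), round(two_tailed_01, 2),
      round(one_tailed_05, 3), round(one_tailed_01, 2))
```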
Page 30
Step 3: Collect & analyze sample data. Random selection is highly recommended so that the sample is representative of the population.

Recall that when a test statistic is calculated by hand, you need to identify the critical value ($z_{critical}$), which is then compared to the test statistic ($z_{calculated}$) to determine significance. A computer automatically determines the probability of obtaining a test statistic due to chance. Consequently, when determining significance from computer output you do NOT compare $z_{calculated}$ to $z_{critical}$; rather, you examine the p-value or level of significance.

If p (or sig) is less than the alpha level (.05 or .01): significant, reject the null.
If p (or sig) is greater than the alpha level (.05 or .01): not significant, fail to reject the null.
Decision-making Table

Comparison            Significance?     Decision?               Conclusion
Hand Calculations     Significance!     Reject Null             Restate Alternative
Hand Calculations     Not!              Fail to Reject Null     Restate Null
Computer              Significance!     Reject Null             Restate Alternative
Computer              Not!              Fail to Reject Null     Restate Null
Errors in Hypothesis Testing: Two types of errors are possible when testing a hypothesis:

Type I Error: we could make the mistake of rejecting the null when $H_0$ really is true, that is, when there really isn't a significant change due to the treatment
this kind of error may be due to sampling error (the sample was above the population mean even before the treatment)
minimize a Type I error by setting a low alpha ($\alpha$) level (low probability of making an error)
Type I error is more serious!

Type II Error: we could make the mistake of not rejecting the null when we should have, that is, when there really is a significant change due to the treatment
the treatment effect was not big enough
most likely due to sampling error (the sample was below the population mean even before the treatment)
Page 31
Step 2: Establish significance criteria
Computer: $\alpha = .05$
Hand calculations: identify z scores used for the alpha level and the appropriate test.
two-tailed test at .05 corresponds to $z_{critical} = \pm 1.96$

Step 3: Collect and analyze sample data
Computer: enter and analyze data
Hand calculations: Calculate standard error
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{100}} = \frac{12}{10} = 1.2$
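Draw distribution of sample means and shade in critical region
[Figure: normal curve with 95% of sample means between -1.96z and +1.96z; the critical region lies beyond those limits]

z score     pop of individuals     pop of sample means (n=100)
-3z         24                     56.4
-2z         36                     57.6
-1z         48                     58.8
0z          60                     60
1z          72                     61.2
2z          84                     62.4
3z          96                     63.6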
Page 32
Computer
Identify test statistic and level of significance (p-value) in output: z = 2.49, p = .0064
Compare level of significance with alpha level: p-value of .0064 is less than .05, so it is significant; reject null
Hand calculations
Calculate test statistic: convert sample mean into z score to determine if it falls in the critical region.
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{63 - 60}{1.2} = \frac{3}{1.2} = 2.5$
it exceeds +1.96z, so it is significant; reject the null

Step 5: Draw conclusion
Null is rejected, so the alternative hypothesis is restated as the conclusion: Participation in the science program did significantly affect science achievement scores among program participants.
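The hand calculation for the science program example can be sketched end to end as follows. The function name `z_test` is hypothetical, and the p-value is computed from `statistics.NormalDist` rather than taken from the software output quoted in the packet, so the printed figures belong to this sketch only. For a one-tailed test the code assumes the sample mean falls on the predicted side of the population mean.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu, sigma, n, alpha=0.05, two_tailed=True):
    """Single-sample z test; returns (z, p_value, reject_null)."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu) / standard_error
    tail_area = 1 - NormalDist().cdf(abs(z))      # area beyond |z| in one tail
    # Assumption: for a one-tailed test the effect is in the predicted direction
    p_value = 2 * tail_area if two_tailed else tail_area
    return z, p_value, p_value < alpha

# Science program example: n = 100, sample mean = 63, mu = 60, sigma = 12
z, p_value, reject = z_test(63, mu=60, sigma=12, n=100, alpha=0.05)
print(round(z, 2), round(p_value, 4), reject)
```

The function returns the same decision as the critical-value comparison: z = 2.5 exceeds 1.96, and equivalently the two-tailed p-value is below .05.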
Example of a one-tailed test: Suppose we took the same example, but hypothesized that the program would cause a significant increase in achievement scores. This would be a directional hypothesis. In addition, let's change the level of significance to .01.
Recall: n = 100, $\bar{X} = 63$, $\mu = 60$, $\sigma = 12$

Hand calculations: Identify z scores used for the alpha level and the appropriate test.
one-tailed test at .01 corresponds to z = +2.33; since we are looking for an increase, we are focusing on the positive end of the distribution

Step 3: Collect and analyze sample data
Computer: enter and analyze data
Hand calculations: Calculate standard error
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{100}} = \frac{12}{10} = 1.2$
Page 33
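[Figure: distribution of sample means (n=100) centered at 60 (0z), where 1z equals 12 points for individuals and 1.2 points for sample means; 99% of sample means fall below +2.33z, and the critical region lies beyond it]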
Computer
Identify test statistic and level of significance (p-value) in output: z = 2.49, p = .0032
Compare level of significance with alpha level: p-value of .0032 is less than .01, so it is significant; reject null
Hand calculations
Calculate test statistic: convert sample mean to z score to determine if it falls into the critical region.
Null is rejected, so the alternative hypothesis is restated as the conclusion: Participation in the science program did significantly increase achievement scores among program participants.
Page 34
Complete the process of hypothesis testing for each of the scenarios.
1. A high school counselor created a preparation course for the SAT-verbal ($\mu = 500$, $\sigma = 100$). A random sample of n = 16 students completes the course and then takes the SAT. The sample had a mean score of $\bar{X} = 554$. Does the course have a significant effect on SAT scores? Test at the .01 level.
Page 35
Z-test results:
$\mu$ : mean of Variable (Std. Dev. = 100)
$H_0: \mu = 500$
$H_A: \mu$ not equal 500
Variable     n      Sample Mean     Std. Err.     Z-Stat     P-value
var1         16     554             25            2.16       0.0308
b. Circle:
one-tailed
or
two-tailed
c. Write the alternative and null hypotheses using correct notation. H 1: H0:
d. zcalculated =
f. Circle:
reject null
or
Page 36
2. A researcher believes that children who grow up as an only child develop vocabulary skills at a faster rate than children in large families. To test this, a sample of n = 25 four-year-old only children is tested on a standardized vocabulary test ($\mu = 60$, $\sigma = 10$). The sample obtains a mean of $\bar{X} = 63.8$. Test at the .05 level.

Z-test results:
$\mu$ : mean of Variable (Std. Dev. = 10)
$H_0: \mu = 10$
$H_A: \mu > 10$
Variable     n      Sample Mean     Std. Err.     Z-Stat     P-value
var1         25     63.8            2             26.9       <0.0001

There was an error when conducting this test. The population mean is NOT 10 but rather 60. The result is still significant, but the z-statistic would have been 1.93 with p = .03.
b. Circle:
one-tailed
or
two-tailed
c. Write the alternative and null hypotheses using correct notation. H 1: H0:
d. zcalculated =
f. Circle:
reject null
or
Page 37
3. A psychologist investigates IQ among autistic children to determine if their IQ is significantly different from the norm. Using a standardized IQ test ($\mu = 100$, $\sigma = 10$), he tests 10 autistic children, all age 12. The following output was generated using StatCrunch. Test at $\alpha = .05$.
Sample data are: 105, 110, 130, 150, 185, 100, 125, 95, 85, 120
Z-test results:
$\mu$ : mean of Variable (Std. Dev. = 10)
$H_0: \mu = 100$
$H_A: \mu$ not equal 100
Variable     n      Sample Mean     Std. Err.     Z-Stat     P-value
var1         10
b. Circle:
one-tailed
or
two-tailed
c. Write the alternative and null hypotheses using correct notation. H 1: H0:
d. zcalculated =
f. Circle:
reject null
or
Page 38
To use the z score as a test statistic, we must know the population standard deviation in order to calculate the standard error of sample means. Unfortunately, most of the time we do not know $\sigma$, so what do we do?

The t statistic, commonly known as a t test, allows us to compare the sample to the null by using the sample standard deviation to estimate the standard error of sample means.
estimated standard error: $s_{\bar{X}} = \frac{s}{\sqrt{n}}$

The t statistic uses a formula very similar to z but instead utilizes the estimated standard error.
$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$     $z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$

Tip on when to use which:
if you know $\sigma$, then use z
if you don't know $\sigma$, use t
Since we are comparing a single sample mean to a population mean, this t test is called the Single Sample t Test or One Sample t Test.
The t Distribution
Since the t statistic utilizes the estimated standard error ($s_{\bar{X}}$), the t distribution only approximates the normal distribution and is based on degrees of freedom (df = n - 1), not the total sample size.
as df and sample size increase, the more closely s represents $\sigma$, and the better the t distribution approximates the normal (z) distribution
since the t distribution has more variability, it is more spread out and flatter
we use the t statistic in a very similar way as we used z, in that we use a t distribution table to find the probability of a t statistic
note: since the t statistic is dependent on degrees of freedom, the critical t statistics corresponding to levels of significance ($\alpha$) vary with the degrees of freedom, unlike the critical z scores (where a two-tailed test at .05 will always correspond to $z_{critical} = \pm 1.96$)
Page 39
Reporting of Results of the t Test
A t test results statement includes the following parts:
results with sample mean and standard deviation (M = 24.58, SD = 3.48)
t calculated with the degrees of freedom in parentheses (t(11) = -2.40)
alpha level or p-value (p < .05)
two-tailed or one-tailed

Example:
Subjects (M = 24.58, SD = 3.48) spent significantly less time talking to parents than the therapist claims; t(11) = -2.40, p < .05, two-tailed.

Assumptions of the t test: independent observations, normal population

Putting it all together
Example of a two-tailed t test
A family therapist states that parents talk to their teens an average of 27 minutes per week. Surprised by this claim, a counselor collects data on 12 teens and finds the following ($\bar{X} = 24.58$, $s = 3.48$). Does the amount of parent talk for the sample significantly differ from the therapist's claim? Test at the .05 level.

Step 1: Develop hypotheses
State alternative: Amount of parent talk for sample will significantly differ from the norm.
Determine if it is a one-tailed or two-tailed test. It is a non-directional hypothesis ------> two-tailed
$H_1: \mu \neq 27$ (samples will be different)
$H_0: \mu = 27$ (samples will NOT be different)

Step 2: Establish significance criteria
Computer: $\alpha = .05$
Hand calculations: Identify $t_{critical}$ used for the alpha level, the appropriate test, & df
two-tailed test at .05 (df = 11) corresponds to $t_{critical} = \pm 2.201$

Step 3: Collect and analyze sample data
Computer: enter and analyze data
Hand calculations: Calculate estimated standard error
$s_{\bar{X}} = \frac{s}{\sqrt{n}}$
Page 40
Step 4: Compare sample data to null ------> calculate test statistic
Computer: Identify test statistic and p-value in output
t(11) = -2.396, p = .019
p-value (.019) is less than alpha (.05), so it is significant; reject null
Hand calculations: Convert the sample mean into a t statistic to determine if it falls into the critical region.
$t_{calculated} = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{24.58 - 27}{1.01} = \frac{-2.42}{1.01} = -2.396$
it exceeds -2.201, so it is significant; reject null

Step 5: Draw conclusion
Amount of parent talk for sample (M = 24.58, SD = 3.48) significantly differs
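The hand calculation for the parent talk example can be sketched from the summary statistics alone. The function name `one_sample_t` is hypothetical, and the critical value 2.201 is simply copied from the example above because the Python standard library has no t distribution table. Working from the rounded summary values gives t of about -2.41, slightly different from the -2.396 reported above, which uses a standard error of 1.01.

```python
from math import sqrt

def one_sample_t(sample_mean, mu, s, n):
    """Single sample t statistic and its degrees of freedom."""
    estimated_standard_error = s / sqrt(n)     # s divided by the square root of n
    t = (sample_mean - mu) / estimated_standard_error
    return t, n - 1

T_CRITICAL = 2.201        # two-tailed, alpha = .05, df = 11 (value taken from the example)

t, df = one_sample_t(24.58, mu=27, s=3.48, n=12)
reject_null = abs(t) > T_CRITICAL              # two-tailed comparison with the critical value

print(round(t, 3), df, reject_null)
```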
Page 41
1. On a standardized spatial skills task, normative data reveal that people typically get $\mu = 15$ correct solutions. A psychologist tests n = 7 individuals who have brain injuries in the right cerebral hemisphere. For the following data, determine whether or not right-hemisphere damage results in reduced performance on the spatial skills task. Test at the .05 level.
Data: 12, 16, 9, 8, 10, 17, 10
Variable var1
DF 6
T-Stat -2.4849744
P-value 0.0237
Quantitative Quantitative
e. Write the alternative and null hypotheses using correct notation. H1: f. tcalculated = H0: g. Level of significance (p) =
h. Circle: i.
reject null
or
Page 42
2. A researcher would like to examine the effects of humidity on eating behavior. It is known that laboratory rats normally eat an average of $\mu = 21$ grams of food each day. The researcher selects a random sample of n = 25 rats and places them in a controlled-atmosphere room where the relative humidity is maintained at 90%. On the basis of this sample, can the researcher conclude that humidity affects eating behavior? Test at the .05 level.
Std. Err.
DF
T-Stat
P-value
16.12 0.79229623
24 -6.1593122 <0.0001
Quantitative Quantitative
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i.
Page 43
3. Does the average age of students enrolled in EDFI 641 differ significantly from the average age of BGSU grad students (24 years)? Test at the .01 level.

T-test results:
μ - mean of Variable
H0 : μ = 24
HA : μ not equal 24

Variable   Sample Mean   Std. Err.    DF   T-Stat       P-value
var1       27.125        1.4314183    15   2.1831493    0.0453
Quantitative Quantitative
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i.
Page 44
So far, we have only used one sample to draw inferences about one population. What if we want to compare two different groups, such as male vs. female or Treatment A students vs. Treatment B students?

t Test of Independent Samples: draws conclusions about two populations by comparing two samples. Since we are looking at differences between the two samples and the two populations, the t statistic reflects both comparisons:

$t_{single\ sample} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$

$t_{ind\ samples} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$, where (using each sample's own variance) $s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Recall that for the single-sample t test, we calculated the estimated standard error. Since we are now comparing two samples to two populations, we calculate the standard error of sample mean differences.

Standard error of sample mean differences: total amount of error involved in using two sample means to approximate two population means (averages the error of the two sources).

However, the preceding formula for $s_{\bar{X}_1 - \bar{X}_2}$ is only appropriate when the two samples are the same size. To correct for the bias in sample variances, we need to combine the two sample variances into a single value called pooled variance.

Pooled variance: averages the two sample variances in a way that allows the bigger sample to carry more weight.

$s_p^2 = \dfrac{SS_1 + SS_2}{df_1 + df_2}$

Using the pooled variance, we can now calculate an unbiased measure of the standard error of sample mean differences:

$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$
Hypothesis Testing with t Test of Independent Samples

t Test of Independent Samples: used to test a hypothesis about the mean difference between two populations
  null hypothesis reflects no difference
  alternative hypothesis reflects a difference

One-tailed                    Two-tailed
$H_0: \mu_1 \le \mu_2$        $H_0: \mu_1 = \mu_2$

rejection of null ------> data indicate a significant difference between the two populations
failure to reject null ------> data indicate NO significant difference between the two populations

Assumptions about t test of independent samples: independent observations; each population must be normal and have equal variances (homogeneity of variance).
Page 45
Putting it all together
Example of a one-tailed t test

A psychologist would like to examine the effects of fatigue on mental alertness. An attention test is prepared that requires subjects to sit in front of a blank TV screen and press a response button each time a dot appears on the screen. A total of 110 dots are presented during a 90-minute period, and the psychologist records the number of errors for each subject. Two groups of subjects are selected. The first group (n = 5) is tested after they have been awake for 24 hours ($\bar{X} = 34$, SS = 63). The second group (n = 10) is tested in the morning after a full night's sleep ($\bar{X} = 24$, SS = 100). Can the psychologist conclude that fatigue significantly increases errors on an attention task? Test at the .05 level.

Step 1: Develop hypotheses
State alternative: Fatigue will significantly increase the number of errors on an attention task.
It is a directional hypothesis ------> one-tailed
$H_1: \mu_{fatigue} > \mu_{rested}$
$H_0: \mu_{fatigue} \le \mu_{rested}$

Step 2: Establish significance criteria
Computer: $\alpha = .05$
Hand calculations: Identify $t_{critical}$ for the alpha level, the appropriate test, and df. A one-tailed test at .05 (df = 13) corresponds to $t_{critical} = +1.771$.
Step 3: Collect and analyze sample data
Computer:

Two Sample T-test results (with pooled variances):
μ1 - mean of var2 where var1=1
μ2 - mean of var2 where var1=2
H0 : μ1 - μ2 = 0
HA : μ1 - μ2 > 0

Difference   Sample Mean   Std. Err.    DF   T-Stat       P-value
μ1 - μ2      10            1.9360149    13   5.1652493    <0.0001

Hand calculations:
Calculate pooled variance
$s_p^2 = \dfrac{SS_1 + SS_2}{df_1 + df_2} = \dfrac{63 + 100}{4 + 9} = \dfrac{163}{13} = 12.54$
Calculate standard error of sample mean differences
$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}} = \sqrt{\dfrac{12.54}{5} + \dfrac{12.54}{10}} = 1.94$

Step 4: Compare sample data to null ------> calculate test statistic
Computer: Identify test statistic and p-value in output: t(13) = 5.17, p < .0001
Compare p-value to alpha level: p is less than .05, so reject null
Hand calculations: Calculate t
$t_{ind\ samples} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} = \dfrac{(34 - 24) - 0}{1.94} = 5.15$
$t_{calculated} > t_{critical}$, so reject null

Step 5: Draw conclusion
The null is rejected, so the alternative hypothesis is restated as the conclusion: Fatigue significantly increased the number of errors on the attention task; t(13) = 5.17, p < .0001, one-tailed.
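The hand calculations for the fatigue example can be sketched in a few lines of Python. This is only an illustration under the example's own numbers (means, SS, and n for each group); `independent_t` is a hypothetical helper name, and the null hypothesis is assumed to specify a population mean difference of zero.

```python
import math


def independent_t(mean1, ss1, n1, mean2, ss2, n2):
    """Pooled-variance t for two independent samples, from summary values."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = (ss1 + ss2) / df                            # s_p^2
    std_err = math.sqrt(pooled_var / n1 + pooled_var / n2)   # SE of mean difference
    t = (mean1 - mean2) / std_err                            # H0: mu1 - mu2 = 0
    return pooled_var, std_err, t, df


# Fatigued group: mean 34, SS 63, n 5. Rested group: mean 24, SS 100, n 10.
pooled_var, std_err, t, df = independent_t(34, 63, 5, 24, 100, 10)
print(f"pooled variance = {pooled_var:.2f}, SE = {std_err:.2f}, t({df}) = {t:.2f}")

t_critical = 1.771   # one-tailed, alpha = .05, df = 13 (from the t table)
print("reject null" if t > t_critical else "fail to reject null")
```

Carrying full precision gives a t of about 5.16, which sits between the hand result (5.15, from rounded intermediate values) and the software result (5.17, computed from the raw scores). All three lead to the same decision.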
A t test will not calculate effect size. You must calculate it by hand.

  o A common index of effect size ($r^2$): percentage of variance accounted for

    effect size: $r^2 = \dfrac{t^2}{t^2 + df}$

Typically an effect size of 0.50 (50%) or larger signifies an important difference.

Use inferential statistics very cautiously, especially when dealing with non-random samples; be very careful in generalizing your results to the population.
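As a quick illustration of the effect size formula, the sketch below applies it to the fatigue example's reported result, t(13) = 5.17. The function name `effect_size_r2` is hypothetical and is used here only for illustration.

```python
def effect_size_r2(t, df):
    """Percentage of variance accounted for: r^2 = t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)


r2 = effect_size_r2(5.17, 13)
print(f"r^2 = {r2:.3f}")
```

For this example r² comes out at about 0.67, so roughly two thirds of the variability in errors is accounted for by fatigue, which is above the 0.50 guideline stated above.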
Page 47
Summary statistics for var2 grouped by var1

var1   n    Mean   Variance    Std. Dev.   Std. Err.   Median   Range   Min   Max   Q1   Q3
1      10   43.1   17.211111   4.1486278   1.3119112   43.5     14      36    50    40   46
2      20   36.8   25.010527   5.0010524   1.1182693   36.5     18      30    48    33   40

Two Sample T-test results (with pooled variances):
μ1 - mean of var2 where var1=1
μ2 - mean of var2 where var1=2
H0 : μ1 - μ2 = 0
HA : μ1 - μ2 not equal 0

Difference   DF   T-Stat       P-value
μ1 - μ2      28   3.4290135    0.0019
Quantitative Quantitative
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i.
j. effect size r2 =
Page 48
2. Does level of anxiety (measured on a scale from 1 to 10) when enrolling in a statistics class differ by gender? Test at the .05 level.

Two Sample T-test results (with pooled variances):
μ1 - mean of var2 where var1=1
μ2 - mean of var2 where var1=2
H0 : μ1 - μ2 = 0
HA : μ1 - μ2 not equal 0

Difference   Sample Mean   DF   T-Stat       P-value
μ1 - μ2      1.5           18   1.2325299    0.2336
Quantitative Quantitative
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
Page 49
t
3.76* 4.02* 5.89* 4.56* 3.80* 5.41* 5.63*
* indicates p<.001
Source: Newell, C.E., Rosenfeld, P., & Culbertson, A. L. (1995). Sexual harassment experiences and equal opportunity perceptions of Navy women. Sex Roles, 32, 159-168.
1. Which group of Navy women is more likely to recommend the Navy to others? In other words, which group has the higher mean for item one?
5. In general, what can we conclude about sexual harassment and navy satisfaction?
Answers: 1) Those who have NOT been sexually harassed have the higher mean and are more likely to recommend the Navy to others; 2) Yes, it is significant at the p<.001 level. 3) Yes, the t result is significant at p<.001.; 4) all items were significant; 5) Navy women who have NOT been sexually harassed are more satisfied with the Navy than those who have been sexually harassed.
Page 50
Many times research evaluates the effect of a treatment by using a pretreatment and posttreatment design with a single sample; this is called a repeated measures study.
  Since the test uses the same sample, there is no risk that one group is different from another even before the treatment begins.
  Researchers try to build upon this concept when studying two samples by matching subjects from the two groups; this helps to eliminate pretreatment differences.

t test of related samples: compares the differences between the pre and post treatment scores of the sample to pre-post differences in the population.

difference score: $D = X_2 - X_1$
mean of differences: $\bar{D} = \dfrac{\sum D}{n}$

Computing the t of related samples
Recall $t_{single\ sample} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$. For t of related samples, the sample data are the difference scores (D), and the population value we are interested in is NOT the population mean but the population mean difference ($\mu_D$); therefore,

$t_{related\ samples} = \dfrac{\bar{D} - \mu_D}{s_{\bar{D}}}$, where $s_{\bar{D}} = \dfrac{s}{\sqrt{n}}$

We are not comparing the means of the pre and post scores; rather, the pre and post scores for each individual are compared!

Developing the hypotheses:
Null:   $H_0: \mu_D \le 0$ (one-tailed)      $H_0: \mu_D = 0$ (two-tailed)

Assumptions of the related samples t test: independent observations, normal distribution of the population of differences.
Page 51
Putting it all together
Example of a one-tailed t test

A researcher is interested in studying the effects of endorphins (the feeling-good chemical that is released in the brain at the end of aerobic exercise) on pain tolerance. A sample of 16 subjects is obtained; each person's tolerance for pain is tested before and after a 50-minute session of aerobic exercise. On average, the pain tolerance for the sample was $\bar{D} = 10.5$ higher after exercise than it was before. The SS for the sample difference scores was SS = 960. Do these data indicate a significant increase in pain tolerance following exercise? Test at the .01 level.

Step 1: Develop hypotheses
State alternative: Exercise will significantly increase pain tolerance.
It is a directional hypothesis ------> one-tailed
$H_1: \mu_D > 0$
$H_0: \mu_D \le 0$

Step 2: Establish significance criteria
Computer: $\alpha = .01$
Hand calculations: Identify $t_{critical}$ for the alpha level, the appropriate test, and df. A one-tailed test at .01 (df = 15) corresponds to $t_{critical} = +2.602$.

Step 3: Collect and analyze sample data
Computer
Hand calculations:
Calculate sample mean of D: $\bar{D} = 10.5$
Calculate standard deviation of D scores: $s = \sqrt{\dfrac{SS}{n - 1}} = \sqrt{\dfrac{960}{15}} = \sqrt{64} = 8$
Calculate standard error: $s_{\bar{D}} = \dfrac{s}{\sqrt{n}} = \dfrac{8}{\sqrt{16}} = 2$

Step 4: Compare sample data to null ------> calculate test statistic
Computer: Identify test statistic and p-value: t(15) = 5.25, p < .001
Compare p-value with alpha level: .001 is less than .01, so reject null
Hand calculations: Calculate t
$t_{related\ samples} = \dfrac{\bar{D} - \mu_D}{s_{\bar{D}}} = \dfrac{10.5 - 0}{2} = 5.25$
It exceeds $t_{critical}$, so reject null.

Step 5: Draw conclusion
Aerobic exercise significantly increased pain tolerance; t(15) = 5.25, p < .001, one-tailed.
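The endorphin example's hand calculations can be reproduced with a short Python sketch. It assumes only the summary values given in the example (mean difference, SS of the difference scores, and n); `related_samples_t` is a hypothetical helper name, and the null-hypothesis mean difference defaults to zero.

```python
import math


def related_samples_t(mean_d, ss_d, n, mu_d=0.0):
    """t for related samples from the mean and SS of the difference scores."""
    s = math.sqrt(ss_d / (n - 1))     # standard deviation of the D scores
    std_err = s / math.sqrt(n)        # standard error of the mean difference
    t = (mean_d - mu_d) / std_err
    return s, std_err, t, n - 1


s, std_err, t, df = related_samples_t(mean_d=10.5, ss_d=960, n=16)
print(f"s = {s:.1f}, SE = {std_err:.1f}, t({df}) = {t:.2f}")

t_critical = 2.602   # one-tailed, alpha = .01, df = 15 (from the t table)
print("reject null" if t > t_critical else "fail to reject null")
```

Because the test works on difference scores, the same function applies whether the two scores come from one person measured twice or from a matched pair.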
1. An investigator for NASA examines the effect of cabin temperature on reaction time. A random sample of 10 astronauts and pilots is selected. Each person's reaction time to an emergency light is measured in a simulator where the cabin temperature is maintained at 70 degrees F and again the next day at 95 degrees. Using the results of this experiment, can the investigator conclude that temperature has a significant effect on reaction time? Test at the .01 level.

Summary statistics

Column   n    Mean   Variance    Std. Dev.   Std. Err.   Median   Range   Min   Max   Q1    Q3
var1     10   203    381.55554   19.533447   6.177018    205.5    55      176   231   183   216
var2     10   223    417.1111    20.423298   6.458414    224      65      190

Paired T-test results:
μD - mean of the differences between var1 and var2
H0 : μD = 0
HA : μD not equal 0

Difference    Sample Diff.   Std. Err.   DF   T-Stat        P-value
var1 - var2   -20            1.67332     9    -11.952286    <0.0001
Quantitative Quantitative
One-tailed
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
Page 53
2. Does eating oatmeal decrease cholesterol levels? A researcher implements a 30-day treatment that consists of eating a bowl of oatmeal every day for breakfast. Cholesterol is measured before (var1) and after (var2) the treatment for the 10 participants. An $\alpha = .05$ was utilized.

Summary statistics

Column   n    Mean    Variance    Std. Dev.   Std. Err.   Median   Range   Min   Max   Q1    Q3
var1     10   258.2   192.4       13.870832   4.3863425   257.5    40      240   280   245   270
var2     10   222     269.33334   16.411379   5.1897335   221      56      190   246   210   230
var1 - var2
Quantitative Quantitative
One-tailed
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
Page 54
of supportive versus behavioral therapy for illicit drug use. Behavior Research and Therapy, 34, 41-46.
1. As is customary in journal articles, the researchers did not state the null hypothesis. Write the appropriate null hypothesis for the first t-test result reported in the excerpt.
3. Should the null hypothesis be rejected for the second t test reported in the excerpt? Explain.
4. The last difference in the excerpt was statistically significant at the .001 level. Was it also significant at the .05 level?
Answers: 1)The treatment of Behavioral Counseling Program will NOT significantly reduce drug use among participants. 2) Yes, since the p-value is less than .05. 3) No, the p-value is greater than .05. 4)Yes, If it is significant at p<.001 then it is also significant at p<.05.
Page 55
1) Are diet drinkers (when compared to regular drinkers) more accurate in tasting the
This question will utilize a t-test of independent samples, which you can complete for 5 points of extra credit (Extra Credit #1).
2) When tasting the difference between Coke and Pepsi, is one's prediction of accuracy significantly different from one's actual ability/accuracy? This question will utilize a t-test of related samples, which you will complete for 5 points of extra credit (Extra Credit #2). In order to complete this experiment, you need at least one other person (who has the same pop preference as you) to participate. It would be great if you can find 2-4 more individuals.
Directions:
1. Identify your pop preference (Diet or Regular). If you prefer diet pop, purchase one can/bottle of Diet Coke and one of Diet Pepsi. If you prefer regular, purchase one can/bottle of Coke and one of Pepsi.
2. In addition to the pop, you will need the following supplies to complete this experiment:
   5 small paper cups for each participant
   Pen or pencil
   Napkins in case you spill
   Pretzels or chips for cleansing one's palate
3. Once you have your supplies and participants together, record each participant's name in the first column of the data grid below and that person's preference (diet=1, regular=2) in the second column.
Data Grid
Name Preference Prediction % Actual %
4. Have each participant predict how accurate they will be in identifying the pop as Coke or Pepsi. Since each person will be given 5 cups of pop, predict how many times out of 5 chances you will be correct in the identification process (e.g., 3/5). Then, convert that fraction into a percent (e.g., 3/5 = 60%). Record this percent in the third column of the grid.
5. Determine who will complete the taste test first. Have that person turn away while another participant fills 5 cups with pop (make sure that some cups have Pepsi and other cups have Coke and that you know which cups have which pop). Hint: Don't write the name of the pop on the bottom of the cup; it will show through as the person drinks the pop.
6. Have the taste tester proceed in identifying the pop in each cup, while another participant records the accuracy. Don't tell the results to the taster until all 5 cups have been tasted. Calculate the number of correct tastes out of five. Convert that fraction into a percent and record the percent in column 4 of the grid.
7. Once you and your fellow participants have finished the taste test, add your results to the spreadsheet below.
8. Go to StatCrunch and enter ALL the data from the spreadsheet (including the data provided for 15 individuals). You should have a minimum of n = 17 for your sample. Proceed with the t-test directions.
Page 57
Analysis of Variance (ANOVA) is a hypothesis testing procedure that evaluates mean differences between two or more treatments or groups; a t test can only compare two groups.

Single Factor Design: studies the effect that one factor (independent variable) has on the dependent variable. Note that although there is only one factor, this factor can have more than two categories, so we are comparing two or more groups/treatments.

Hypothesis Testing for ANOVA
Null hypothesis: states that there is no difference among the groups or treatments
$H_0: \mu_1 = \mu_2 = \mu_3$
Alternative hypothesis: states that at least one mean is different from the others
$H_1$: At least one mean will differ
ANOVA Test Statistic
ANOVA creates a test statistic called an F-ratio that is similar to the t statistic.

Recall that

$t = \dfrac{\text{obtained difference between sample means}}{\text{difference expected by chance (error)}}$, for example $t_{single} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$

F is similar to t, but since there are more than two means to compare, variance will be used to represent the differences between all the means being compared.

$F = \dfrac{\text{variance (differences) between sample means}}{\text{variance (differences) expected by chance (error)}}$

Like t, a large F value indicates a treatment effect (mean differences) that is unlikely to be due to chance.
When the treatment had no effect so that the means are the same (H0 is true), the F-ratio will be close to 1.00.
Distribution of F-ratios
Like t, F has its own sampling distribution. But the F distribution is not normal; it is positively skewed, the degree of which depends upon the degrees of freedom from the two variances.
  large df -------> nearly all F-ratios are clustered around 1.00
  small df -------> the F-ratios are more spread out
Since the F distribution is positively skewed, we are only looking in one tail for the difference. As a result, we don't need to indicate if the test is one- or two-tailed.
Recall: we expect F near 1.00 if the null is true and expect a large F if the null is false; therefore, significant F-ratios will be in the tail of the F distribution.
Page 58
$F = \dfrac{\text{variance (differences) between group means}}{\text{variance (differences) expected by chance/error (within groups)}}$

Variance (differences) between groups can be due to:
  treatment effect
  individual differences (subjects within the various groups are different even before the treatment begins)
  experimental error (caused by poor equipment, lack of attention/knowledge on the researcher's part, unpredictable change of events)
Variance within groups can be due to:
  individual differences (subjects within the various groups are different even before the treatment begins)
  experimental error (caused by poor equipment, lack of attention/knowledge on the researcher's part, unpredictable change of events)
Consequently, if we divide the variance between treatments by the variance within treatments, individual differences and error cancel out, so we can determine the treatment effect.

The last few steps of ANOVA require the following calculations:
df between groups = k - 1, where k is the number of groups
df within groups = N - k, where N is the total number of individuals in the groups

$MS_{between}$ = variance between treatments = $\dfrac{SS_{between}}{df_{between}}$
$MS_{within}$ = variance within treatments = $\dfrac{SS_{within}}{df_{within}}$
F-ratio = $\dfrac{MS_{between}}{MS_{within}}$
Page 59
Putting it all together
Example: A number of studies on jetlag have found that jetlag seems to be worse when people are traveling east. A researcher examines how many days it takes a person to adjust after taking a long flight. One group flies west across time zones (NY to CA); a second group flies east (CA to NY); and a third group takes a long flight within one time zone (San Francisco to Seattle). Perform an analysis of variance to determine if jetlag varies with the direction of travel. Use the .05 level of significance.
Computer Results

Analysis of Variance results for var2 grouped by var1

Sample means:
Group   n
1       6
2       6
3       6

ANOVA table:
Source           df   SS    MS          F-Stat      P-value
Between groups   2    93    46.5        41.02941    <0.0001
Within groups    15   17    1.1333333
Total            17   110
Step 1: Develop hypotheses
State alternative: Direction of travel will significantly affect jetlag.
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one mean will differ

Step 2: Establish significance criteria
Computer: $\alpha = .05$

Step 3: Collect and analyze sample data
Computer: enter data

Step 4: Compare sample data to null ------> calculate test statistic
Computer: Identify test statistic and p-value: F(2, 15) = 41.03, p < .0001
Compare p-value with alpha level: .0001 is less than .05, so reject null

Step 5: Draw conclusion
Direction of travel significantly affected jetlag.
Page 60
M SE
MS 46.5 1.13
F = 41.02
When space is an issue, the results should include the F-ratio with both degrees of freedom in parentheses and the p-value. Do NOT indicate one-tailed or two-tailed!
Travel direction does affect jetlag; F(2, 15) = 41.03, p < .05.
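The last few steps of the jetlag ANOVA (df, MS, and the F-ratio) can be sketched in Python from the sums of squares in the computer output. This is only an illustration: `f_ratio` is a hypothetical helper name, and the SS values are taken as given rather than computed from raw scores, which the packet does not list.

```python
def f_ratio(ss_between, ss_within, k, n_total):
    """Return (df_between, df_within, MS_between, MS_within, F) for a one-way ANOVA."""
    df_between = k - 1                  # k = number of groups
    df_within = n_total - k             # N = total number of individuals
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return df_between, df_within, ms_between, ms_within, ms_between / ms_within


# Jetlag example: 3 groups of 6 travelers, SS between = 93, SS within = 17.
df_b, df_w, ms_b, ms_w, f = f_ratio(ss_between=93, ss_within=17, k=3, n_total=18)
print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.2f}, F({df_b}, {df_w}) = {f:.2f}")
```

An F this far above 1.00 falls well into the tail of the F distribution, which is consistent with the very small p-value in the output.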
Assumptions of ANOVA: independent observations, samples are selected from normal populations that also have equal variances.
ANOVA
Page 61
1. The extent to which a person's attitude can be changed depends on how big a change you are trying to produce. In a classic study on persuasion, Aronson, et al. (1985) obtained three groups of subjects. One group listened to a persuasive message that differed only slightly from the subjects' original attitudes. For the second group, there was a moderate discrepancy between the message and the original attitudes. For the third group, there was a large discrepancy between the message and the original attitudes. For each subject, the amount of attitude change was measured. Data were entered for the three groups (small, moderate, large discrepancy) and an ANOVA was utilized to determine if the amount of discrepancy between the original attitude and the persuasive argument has a significant effect on the amount of attitude change. Test at the .05 level.
d. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
e. Fcalculated =
f. Level of significance (p) =
g. Circle:   reject null   or   fail to reject null
Page 62
2. A psychologist would like to examine the relative effectiveness of three therapy techniques for treating mild phobias. A sample of N=15 individuals who display a moderate fear of spiders is obtained. These individuals are randomly assigned to the three therapies. After a certain amount of therapy, the psychologist measures the degree of fear reported by each individual. ANOVA was conducted to determine if there are any significant differences among the three therapies. Test at the .05 level.
Quantitative Quantitative
d. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
e. Fcalculated =
f. Level of significance (p) =
g. Circle:   reject null   or   fail to reject null
Page 63
1. Which type of technology use is the highest among all levels of self-efficacy?
2. Which group of teachers (low, moderate, or high self-efficacy) report the highest technology use among their students?
3. Write the null hypothesis for self-efficacy and overall technology use, where the ANOVA results indicate: F(2,98)=4.71, p<.05.
4. Considering the null hypothesis that you wrote for item 3, should the null hypothesis be rejected? Explain.
Answers: 1) teacher technology use; 2) teachers with high self-efficacy (M=1.81); 3) Self-efficacy will NOT significantly impact overall technology use among teachers; 4) Reject the null, F(2,98)=4.71, p<.05.
Page 64
Correlation: statistical technique used to measure and describe a relationship between two quantitative variables. Correlation measures 3 characteristics:

direction of relationship
  positive: as one variable increases, so does the other (food intake & weight)
  negative (inverse): as one variable increases, the other decreases (exercise & weight)
  [Scatterplots: Positive (r = +.90); Negative (r = -.90)]

form of relationship
  linear: the relationship between x and y falls in a straight line
  curvilinear: the relationship between x and y curves (age across the lifespan is a variable that often creates a curvilinear relationship)

degree (strength) of relationship
  degree of relationship is reflected in a correlation coefficient (usually r)
  r ranges from -1 to +1: 0 indicates no relationship, +1 indicates a perfect positive relationship, and -1 indicates a perfect negative relationship
Since we will be computing variability for each variable as well as their variability together, we will be using SS and a new concept, SP (sum of products). The sum of products is used to compute the amount of covariability of two variables:

$SP = \sum (X - \bar{X})(Y - \bar{Y})$
Page 65
Correlation
  does NOT measure cause and effect
  when data have a limited range of scores, the value of the correlation can be exaggerated
  interpreting strength of coefficient (practical significance), ignoring the sign of r:
    r of .80 or above is very strong
    r = .60 - .79 is strong
    r = .40 - .59 is fair
    r below .40 is weak
  to describe how accurately one variable predicts the other, square r. For example, if r = .60, then $r^2$ = .36, which can be interpreted as: 36% of the variability in Y scores can be predicted from the relationship with X. $r^2$ is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from the relationship with the other variable.
Hypothesis Testing (hypotheses use the Greek letter rho, $\rho$, to signify r)

              One-tailed            Two-tailed
Alternative   $H_1: \rho > 0$       $H_1: \rho \neq 0$
Null          $H_0: \rho \le 0$     $H_0: \rho = 0$
Putting it all together
Example: To measure the relationship between anxiety level and test performance, a psychologist obtains a sample of n = 6 college students from an intro stats course. Students arrive fifteen minutes prior to the exam and complete physiological measures of anxiety (heart rate, skin resistance, blood pressure, etc.). Anxiety ratings and exam scores are listed below. Compute the Pearson correlation to determine if a negative relationship exists between anxiety and test performance. Test at the .05 level.

Step 1: Develop hypotheses
State alternative: Anxiety and test performance will negatively relate.
It is a directional hypothesis ------> one-tailed
$H_1: \rho < 0$ (population shows negative correlation)
$H_0: \rho \ge 0$ (population does not show negative correlation)

Step 2: Establish significance criteria
Computer: StatCrunch does not calculate the p-value for the correlation coefficient. As a result, we must identify $r_{critical}$ for $\alpha$, tails, and df.
df = n - 2 = 6 - 2 = 4, $r_{critical}$ = -.729
Notice that df is n - 2 for correlation, since we need two points to create a line.
Hand calculations: Identify $r_{critical}$ for $\alpha$, tails, and df
Page 66
Step 3: Utilize sample data to calculate r
Computer
Hand calculations: Calculate SP, SSX, SSY, r

Exam Score (Y)   (X - X̄)   (Y - Ȳ)   (X - X̄)(Y - Ȳ)   (X - X̄)²   (Y - Ȳ)²
80               0          -3         0                 0           9
88               -3         5          -15               9           25
80               2          -3         -6                4           9
79               2          -4         -8                4           16
86               -1         3          -3                1           9
85               0          2          0                 0           4
Ȳ = 83                                 SP = -32          SSX = 18    SSY = 72

Step 4: Compare sample data to null ------> calculate test statistic
Computer: Identify test statistic and compare $r_{calculated}$ to $r_{critical}$
  Correlation between var2 and var1 is: -0.8888889
  r falls into the critical region, so it is significant: reject null
Hand calculations: Calculate r
  $r = \dfrac{SP}{\sqrt{SS_X \, SS_Y}} = \dfrac{-32}{\sqrt{18(72)}} = \dfrac{-32}{36} = -.889$
  Compare $r_{calculated}$ to $r_{critical}$: r falls into the critical region, so reject null
Step 5: Draw conclusion A negative relationship exists between anxiety and test performance, r(4)=-.889, p<.05, one-tailed.
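The SP, SSX, SSY, and r calculations for the anxiety example can be sketched in Python as follows. One assumption is flagged loudly: the anxiety ratings (X) are not listed with the deviation scores here, so they are reconstructed as mean + deviation, using the mean of X = 5 quoted in the regression example. `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Return (SP, SS_X, SS_Y, r) using the sum-of-products formula."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp, ss_x, ss_y, sp / math.sqrt(ss_x * ss_y)


# ASSUMPTION: X values rebuilt from the (X - mean) column with mean of X = 5.
anxiety = [5 + d for d in (0, -3, 2, 2, -1, 0)]
exam = [80, 88, 80, 79, 86, 85]

sp, ss_x, ss_y, r = pearson_r(anxiety, exam)
print(f"SP = {sp:.0f}, SSX = {ss_x:.0f}, SSY = {ss_y:.0f}, r = {r:.3f}")

r_critical = -0.729   # one-tailed, alpha = .05, df = n - 2 = 4 (from the r table)
print("reject null" if r < r_critical else "fail to reject null")
```

Because the hypothesis predicts a negative relationship, the decision rule checks whether r falls below the negative critical value rather than comparing absolute values.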
Computer Output
Page 67
Regression
Regression: statistical technique for finding the best-fitting straight line for a set of data; used when wanting to determine the ability of one variable to predict another variable (e.g., using SAT score to predict freshman college GPA).

Regression line: line that represents the linear relationship; represented by a linear equation $\hat{Y} = a + bX$, where a = Y-intercept and b = slope.

Least-squares method: helps determine the best-fitting line by minimizing the error between the predicted and actual values of Y.

$\hat{Y} = a + bX$, where $b = \dfrac{SP}{SS_X}$ and $a = \bar{Y} - b\bar{X}$
Example: Using the correlation problem we just solved, let's calculate the regression line.

Step 1: Use $\bar{X}$, $\bar{Y}$, SSX, and SP to calculate b and a (previously calculated: $\bar{X} = 5$, $\bar{Y} = 83$, SP = -32, SSX = 18, SSY = 72)
$b = \dfrac{SP}{SS_X} = \dfrac{-32}{18} = -1.778$
$a = \bar{Y} - b\bar{X} = 83 - (-1.778)(5) = 83 + 8.889 = 91.889$

Step 2: Calculate regression equation
$\hat{Y} = a + bX$
$\hat{Y} = 91.89 - 1.78X$

We can now use the regression equation to predict Y for a given value of X. If X = 7, what is the predicted value of Y?
$\hat{Y} = 91.89 - 1.78(7) = 79.43$
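The regression example can be checked with a short Python sketch that starts from the summary values already calculated (mean of X = 5, mean of Y = 83, SP = -32, SSX = 18). The function name `regression_line` is a hypothetical helper used only for illustration.

```python
def regression_line(mean_x, mean_y, sp, ss_x):
    """Return (a, b) for the least-squares line: predicted Y = a + b * X."""
    b = sp / ss_x                 # slope
    a = mean_y - b * mean_x       # Y-intercept
    return a, b


a, b = regression_line(mean_x=5, mean_y=83, sp=-32, ss_x=18)
predicted = a + b * 7             # predicted exam score when anxiety X = 7
print(f"Y-hat = {a:.2f} + ({b:.2f})X; predicted Y at X = 7 is {predicted:.2f}")
```

With unrounded coefficients the prediction is about 79.44; the hand calculation gives 79.43 because it uses the coefficients rounded to two decimals.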
Page 68
Computer Output: The output in the video will appear different, since a different version of StatCrunch was used.

Simple linear regression results:
Dependent Variable: var2
Independent Variable: var1
var2 = 91.888885 - 1.7777778 var1                 (Regression equation)
Sample size: 6
R (correlation coefficient) = -0.8889             (Correlation coefficient)
R-sq = 0.79012346
Estimate of error standard deviation: 1.9436506

Parameter estimates:
Parameter   Estimate     Std. Err.     DF   T-Stat      P-Value
Intercept   91.888885    2.4241583     4    37.905483   <0.0001
Slope       -1.7777778   0.45812285    4    -3.88057    0.0178

Ignore these p-values since they are NOT for the correlation coefficient (r).

Analysis of variance table for regression model:
Source   DF   SS          MS          F-stat      P-value
Model    1    56.88889    56.88889    15.058824   0.0178
Error    4    15.111111   3.7777777
Total    5    72

Predicted value for Y when X=7
95% C.I. (76.07917, 82.809715)
95% P.I. (73.08468, 85.80421)
Page 69
1. You probably have read about the relationship between years of education and salary potential. The following hypothetical data represent a sample of n = 10 men who have been employed for five years. Do these data indicate a significant relationship between years of higher education and salary? Test at the .05 level. Also find the regression equation for predicting salary from education.
(X) Years of Higher Education: 4, 4, 2, 8, 0, 5, 10, 4, 12, 0
(Y) Salary (in $1000s): 31, 29, 28, 42, 23, 35, 45, 27, 44, 24

Simple linear regression results:
Dependent Variable: salary
Independent Variable: education
salary = 23.135265 + 1.9723947 education
Sample size: 10
R (correlation coefficient) = 0.9601
R-sq = 0.92169785
Estimate of error standard deviation: 2.4466708

Parameter estimates:
Parameter   Estimate    Std. Err.     DF   T-Stat     P-Value
Intercept   23.135265   1.2611643     8    18.34437   <0.0001
Slope       1.9723947   0.20325504    8    9.704039   <0.0001

Analysis of variance table for regression model:
Source   DF   SS          MS          F-stat      P-value
Model    1    563.71045   563.71045   94.168365   <0.0001
Error    8    47.88958    5.9861975
Total    9    611.6

Predicted values:
X value   Pred. Y    s.e.(Pred. y)   95% C.I.                 95% P.I.
5         32.99724   0.77397215      (31.212456, 34.78202)    (27.07964, 38.91484)
Page 70
Quantitative Quantitative
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. rcritical =
g. rcalculated =
h. Circle:   reject null   or   fail to reject null
j. Regression equation:
k. If one has 5 years of education, what is the predicted salary?
Page 71
2. Research has shown that similarity in attitudes, beliefs, and interests plays an important role in interpersonal attraction. A therapist examines the correlation in attitudes between husbands (X) and wives (Y). She administers a questionnaire that measures how liberal or conservative one's attitudes are. Low scores indicate that the person has liberal attitudes while high scores indicate conservatism (scale 1-10). Ten couples participate. Test at the .01 level.

Simple linear regression results:
Dependent Variable: wife att
Independent Variable: hus att
wife att = 0.7785714 + 0.8035714 hus att
Sample size: 10
R (correlation coefficient) = 0.7869
R-sq = 0.61919034
Estimate of error standard deviation: 1.6673064

Parameter estimates:
Parameter   Estimate    Std. Err.     DF   T-Stat       P-Value
Intercept   0.7785714   1.4370375     8    0.54178923   0.6027
Slope       0.8035714   0.22280319    8    3.6066422    0.0069

Analysis of variance table for regression model:
Source   DF   SS          MS          F-stat      P-value
Model    1    36.160713   36.160713   13.007869   0.0069
Error    8    22.239286   2.7799108
Total    9    58.4

Predicted values:
X value   Pred. Y     s.e.(Pred. y)   95% C.I.                 95% P.I.
5         4.7964287   0.57239175      (3.4764907, 6.1163664)   (0.73135275, 8.861505)
Page 72
Quantitative Quantitative
e. Write the alternative and null hypotheses using correct notation.
   H1:                    H0:
f. rcritical =
g. rcalculated =
h. Circle:   reject null   or   fail to reject null
j. Regression equation:
k. If the husband has a moderate attitude of 5, what is the predicted value of the wife's attitude?
Page 73
--.69 -.35
-.39
--
Source: Boivine, M. & Hymel, S. (1997). Peer experiences and social self-perceptions: A sequential model. Developmental Psychology, 33, 135-143.
Notice that the correlation coefficients are presented in a matrix. The column headers represent the same variables as the row headers; however, each column header uses only a number to identify its variable. For example, the circled coefficient of .39 represents the correlation between Perceived Social Acceptance and Perceived Behavior Conduct.
1. What is the value of the Pearson r for the relationship between withdrawal and loneliness? Describe this value in terms of strength and direction.
2. What is the value of the Pearson r for the relationship between social preference and victimization by peers? Describe this value in terms of strength and direction.
5. The Pearson r for the relationship between withdrawal and loneliness indicates that those who tend to be more lonely tend to be: A. more withdrawn B. less withdrawn
6. Which of the following pairs has the strongest relationship between them? A. Perceived social acceptance and loneliness B. Withdrawal and victimization by peers C. Number of affiliate links and aggression
7. Which of the following pairs has the weakest relationship between them? A. Withdrawal and social preference B. Withdrawal and perceived social acceptance C. Withdrawal and perceived behavior-conduct
Answers: 1) .29, weak and positive; 2) -.68, strong and negative; 3) Victimization by peers, r=.42; 4) Perceived behavior-conduct, r=.06; 5) A, more withdrawn; 6) A; 7) C.
So far we have used parametric tests to evaluate a hypothesis about the population. Parametric tests require certain assumptions about the population parameters, such as a normal distribution, homogeneity of variance, and a quantitative (interval/ratio) dependent variable. When these assumptions for parametric tests cannot be fulfilled, nonparametric tests can be used.
Nonparametric tests
- usually do not state a hypothesis in terms of the population distribution, so they are often called distribution-free tests
- are suited for data that utilize a nominal or ordinal scale
- are not as sensitive as parametric tests, so they are more likely to fail in detecting a real difference between two treatments
One commonly used nonparametric test is the Chi Square Test for Independence.
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

where $f_o$ = observed frequency and $f_e$ = expected frequency.

Building on our example of females and males with respect to learning styles, the table below presents the data observed for a sample of 125 males and 75 females.

          Audio   Visual   Kinesthetic   Total
Males     30      30       65            125
Females   30      25       20            75
Total     60      55       85
If the distribution for gender is predicted to be the same for each learning style category, then the same proportion/percent of males and females would be expected in each category.
To calculate the expected frequency for each category, this formula is used:
$f_e = \frac{f_c f_r}{n}$, where $f_c$ = column total, $f_r$ = row total, n = sample size
The table of expected frequencies (rounded to whole numbers) would look something like this:

          Audio               Visual              Kinesthetic         Total
Males     60(125)/200 ≈ 38    55(125)/200 ≈ 34    85(125)/200 ≈ 53    125
Females   60(75)/200 ≈ 22     55(75)/200 ≈ 21     85(75)/200 ≈ 32     75
Total     60                  55                  85
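The expected-frequency formula can be sketched in a few lines of Python. This is only an illustration under the packet's definitions (column total times row total, divided by n); the dictionary layout and the function name `expected_frequencies` are assumptions made for the example.

```python
# Observed counts from the learning-styles table (rows: gender, columns: style).
observed = {
    "Males": {"Audio": 30, "Visual": 30, "Kinesthetic": 65},
    "Females": {"Audio": 30, "Visual": 25, "Kinesthetic": 20},
}

def expected_frequencies(table):
    """Return fe = (column total * row total) / n for every cell."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    n = sum(row_totals.values())
    return {
        row: {col: col_totals[col] * row_totals[row] / n for col in col_totals}
        for row in table
    }

expected = expected_frequencies(observed)
for row, cells in expected.items():
    print(row, cells)
```

The unrounded values (for example 37.5 for male-audio) are what the packet rounds to whole numbers in its table.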
Degrees of freedom are calculated a bit differently:
df = (R - 1)(C - 1), where R = number of rows, C = number of columns
In our example, df = (2-1)(3-1) = 1(2) = 2
Using this and $\alpha$ = .05, our $\chi^2_{critical}$ = 5.99
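For readers who want to see where 5.99 comes from, here is a small Python sketch. It relies on a shortcut that holds only when df = 2: the upper-tail probability of the chi-square distribution is exp(-x/2), so the critical value is -2 ln(alpha). For other df a printed table or statistical software is needed. The function names are illustrative.

```python
import math

def degrees_of_freedom(rows, cols):
    """df = (R - 1)(C - 1) for a contingency table."""
    return (rows - 1) * (cols - 1)

def chi_square_critical_df2(alpha):
    """Critical value for df = 2 ONLY, where P(X > x) = exp(-x / 2)."""
    return -2 * math.log(alpha)

df = degrees_of_freedom(2, 3)
critical = chi_square_critical_df2(0.05)
print(df, round(critical, 2))
```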
Putting it all together
Example: Based upon the observed frequencies presented in the table below, can a researcher conclude that learning styles differ by gender? Test at the .05 level.

          Audio   Visual   Kinesthetic   Total
Males     30      30       65            125
Females   30      25       20            75
Total     60      55       85

Step 1: Develop hypotheses.
  State Alternative: Learning style will significantly differ by gender.

Step 2: Establish significance criteria
  Computer: $\alpha$ = .05
  Hand calculations: Identify $\chi^2_{critical}$ for $\alpha$ and df
    df = (2-1)(3-1) = 2
    $\chi^2_{critical}$ = 5.99

Step 3: Utilize sample data to calculate $\chi^2$
  Computer: enter data
  Hand calculations: Calculate expected frequencies ($f_e$), $f_o - f_e$, $(f_o - f_e)^2$

  Cell                 fo   fe   fo-fe   (fo-fe)^2   (fo-fe)^2/fe
  male-audio           30   38   -8      64
  female-audio         30   22   8       64
  male-visual          30   34   -4      16
  female-visual        25   21   4       16
  male-kinesthetic     65   53   12      144
  female-kinesthetic   20   32   -12     144

Step 4: Compare sample data to null: calculate test statistic
  Computer: Identify test statistic and compare p-value to $\alpha$ level
    Statistic    DF   Value    P-value
    Chi-square   2    13.042   0.0019
    p-value is less than .05, so reject null
  Hand calculations: Calculate $\chi^2$ = 13.04
    Compare $\chi^2_{calculated}$ to $\chi^2_{critical}$
    Since $\chi^2$ = 13.04 exceeds $\chi^2_{critical}$ = 5.99, the null is rejected

Step 5: Draw conclusion
  Males and females differ in learning styles; $\chi^2$(2, n=200) = 13.04, p < .05.
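The hand calculation in Steps 3 and 4 can be sketched in Python as follows. This is an illustration, not the software the packet used: the function name and the `round_expected` flag are assumptions. The flag mimics the packet's whole-number expected frequencies (Python's `round` happens to reproduce 38 and 22 for the two .5 cases here), and the p-value line uses a shortcut valid only for df = 2.

```python
import math

# Observed counts: columns are audio, visual, kinesthetic.
observed = [
    [30, 30, 65],  # males
    [30, 25, 20],  # females
]

def chi_square(obs, round_expected=False):
    """Sum of (fo - fe)^2 / fe over all cells of a contingency table."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(obs):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n
            if round_expected:
                fe = round(fe)  # whole-number fe, as in the hand calculation
            total += (fo - fe) ** 2 / fe
    return total

chi2_rounded = chi_square(observed, round_expected=True)
chi2_exact = chi_square(observed)
p_exact = math.exp(-chi2_exact / 2)  # upper-tail p-value, valid for df = 2 only
print(round(chi2_rounded, 2), round(chi2_exact, 2), round(p_exact, 4))
```

One thing the sketch makes visible: whole-number expected frequencies give the 13.04 used in the hand calculation, while unrounded expected frequencies give a statistic of about 12.56. Both exceed the critical value of 5.99, so the decision to reject the null is the same either way.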
Computer Output
Contingency table results:
Rows: var1 (1=male, 2=female)
Columns: var2 (1=audio, 2=visual, 3=kinesthetic)
Cell format: Count, Row percent, Column percent, Total percent

Total column:
Row     Count   Row percent   Column percent   Total percent
1       125     100.00%       62.5%            62.5%
2       75      100.00%       37.5%            37.5%
Total   200     100.00%       100.00%          100.00%

Statistic    DF   Value    P-value
Chi-square   2    13.042   0.0019
Assumptions of Chi Square Tests
- Random sampling
- Independence of observations
- Expected frequency for any cell MUST be greater than 5

Reporting Chi Square Results
The statement should include the chi-square value with df and n in parentheses, and the p-value, as in the learning styles example: $\chi^2$(2, n=200) = 13.04, p < .05.
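The expected-frequency assumption is easy to check before running the test. The sketch below finds the smallest expected frequency in the learning-styles table and compares it with 5; the function name `smallest_expected` is made up for this illustration.

```python
# Observed counts from the learning-styles example (males, females).
observed = [
    [30, 30, 65],
    [30, 25, 20],
]

def smallest_expected(obs):
    """Smallest expected frequency, fe = (row total * column total) / n."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)

min_fe = smallest_expected(observed)
assumption_met = min_fe > 5
print(min_fe, assumption_met)
```

Here the smallest expected frequency belongs to the female-visual cell, which is well above 5.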
1. The US Senate recently considered a controversial amendment for school prayer. The amendment did not get the required two-thirds majority, but the results of the vote are interesting when viewed in terms of the party affiliation of the senators. Does the vote on the prayer amendment (var2: 1=yes, 2=no) differ by political party (var1: 1=demo, 2=rep)? Test at the .05 level.
Contingency table results:
Rows: var1
Columns: var2
Total column:
Row     Count   Row percent   Column percent   Total percent
1       45      100.00%       45%              45%
2       55      100.00%       55%              55%
Total   100     100.00%       100.00%          100.00%

Statistic    DF   Value       P-value
Chi-square   1    6.3032928   0.0121
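To see how the printed p-value relates to the chi-square value when df = 1, the sketch below uses the fact that a chi-square variable with 1 degree of freedom is a squared z score, so its upper-tail probability can be written with the complementary error function. The function name is illustrative, and the shortcut applies to df = 1 only.

```python
import math

def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with df = 1 ONLY."""
    return math.erfc(math.sqrt(chi2 / 2))

# Chi-square value copied from the computer output above.
p = p_value_df1(6.3032928)
print(round(p, 4))
```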
Quantitative Quantitative
d.
calculated
2. A stats instructor would like to know whether it is worthwhile to require students to do weekly homework assignments. For one section of the course, homework is assigned, collected and graded each week. For the second section, the same problems are recommended but not required. At the end of the semester, all students complete the same final exam. Letter grades (A, B, C, D, F) are tabulated for each student by section. Do these data indicate significant grade differences for students with homework versus no homework? Test at the .05 level.
Contingency table results: Rows: var1 Columns: var2
Total
2 20 10% 100.00% 40% 47.62% 4.762% 47.62% 3 22 13.64% 100.00% 60% 52.38% 7.143% 52.38%
Total
9 10 11 7 5 42 21.43% 23.81% 26.19% 16.67% 11.9% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 21.43% 23.81% 26.19% 16.67% 11.9% 100.00%
Statistic    DF   Value       P-value
Chi-square   4    2.4870248   0.647

Scale (circle): Categorical Quantitative
Scale (circle): Categorical Quantitative
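The p-value in this output can be reproduced by hand when df is even, because the upper-tail probability then has a closed form: exp(-x/2) times the sum of (x/2)^k / k! for k from 0 to df/2 - 1. The Python sketch below applies it to the df = 4 output above; the function name is an assumption made for the example.

```python
import math

def p_value_even_df(chi2, df):
    """Upper-tail p-value for a chi-square statistic when df is a positive even number."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = chi2 / 2
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series

# Chi-square value and df copied from the computer output above.
p = p_value_even_df(2.4870248, 4)
print(round(p, 3))
```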
Additional Practice: Interpreting Research Articles
Read the following excerpt to complete the questions on the next page:
Researchers surveyed 120 college sophomores and juniors enrolled in general education psychology courses. Participants were between the ages of 18 and 23 and completed a survey that measured class absenteeism (cutting class for no valid reason) in the past month, along with seven negative behaviors and two positive behaviors, all measured using yes/no responses. Negative behaviors included: speeding, slapping/hitting someone, getting drunk, breaking the law, telling a significant lie, thinking about dropping out of school, feeling depressed, getting a tattoo, and piercing the body. Positive behaviors were reading a book that wasn't required for class and visiting family.
Table 1. Number and percentage of students answering yes to behaviors by groups of students who have cut class (n=68) and not cut class (n=52)

                              Cutting      Not Cutting
Behavior                      N     %      N     %      $\chi^2$
Getting drunk                 59    87     24    46     22.79**
Speeding                      63    93     39    75     7.19*
Breaking law                  35    51     10    19     13.07**
Telling significant lie       14    21     8     15     0.53
Thoughts of dropping out      8     12     3     6      0.79
Feeling depressed             7     10     5     10     0.02
Hitting/slapping              8     12     11    21     1.95
Getting tattoo                12    19     4     8      3.16
Piercing body                 18    26     7     13     3.17
Reading a non-required book   25    37     15    29     0.83
Visiting family               62    91     40    77     4.61*
Note: * p<.05, ** p<.002
Source: Trice, A.D. , Holland, S. A., & Gagne, P.E. (2000). Voluntary class absences and other behaviors in college students: An exploratory analysis. Psychological Reports, 87, 179-182.
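Each chi-square value in Table 1 comes from a 2 x 2 table (cut class or not, by yes or no on the behavior). As an illustration, the sketch below rebuilds the "getting drunk" table from the reported counts and applies the common shortcut formula for 2 x 2 tables, n(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)], with no continuity correction. Whether the article's authors computed it exactly this way is an assumption; the "no" counts are obtained by subtracting the "yes" counts from the group sizes.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Getting drunk: 59 of 68 students who cut class said yes,
# and 24 of 52 students who did not cut class said yes.
chi2_drunk = chi_square_2x2(59, 68 - 59, 24, 52 - 24)
print(round(chi2_drunk, 2))
```

The result can be compared with the first row of Table 1.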
1. What percentage of students who did not cut class report reading a non-required book?
2. Is the difference in frequencies for speeding significant for the two groups? Explain.
4. Should the null hypothesis you wrote for item 3 be rejected? Explain.
5. What can you conclude about students who cut class and get drunk?
Answers: 1) 29%; 2) yes, $\chi^2$ = 7.19, p<.05; 3) Students who cut class will NOT significantly differ in the behavior of getting drunk from students who do not cut class; 4) The null should be rejected since $\chi^2$ = 22.79, p<.002; 5) Students who cut class are more likely to get drunk and vice versa.
Dependent Variable
Categorical
1
t test (2) Single Sample Independent Samples Related Samples ANOVA (3+) Pearson Correlation (relate) Regression (predict)
Quantitative
Overview Items
1. Does disability category (LD, EBD, none, etc.) differ by gender?
4. Does SES (low, middle, high) affect reading preparedness (as measured by a test) among preschoolers?
5. Does a seminar on self-esteem increase self-esteem scores? (Self-esteem was measured before and after the seminar.)
8. Do BGSU's GRE scores for entering graduate students significantly differ from the population norm?
9. Does a reading intervention significantly increase 4th grade reading proficiency scores? Note: one group receives intervention, while another group receives traditional instruction.