linear pattern of relationship between one variable (x) and another variable (y)
an association between two variables
relative position of one variable correlates with relative distribution of another variable
graphical representation of the relationship between two variables
Warning:
No proof of causality
Cannot assume x causes y
[Figure: scatter plot of daughter size (y-axis) against parent size (x-axis)]
Correlation Coefficient r
A measure of the strength and direction of a linear relationship between two variables
r ranges from -1 to +1. If r is close to 0, there is little or no linear correlation.
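As a reference for the definition above, the usual textbook form of the Pearson correlation coefficient can be written as follows. The symbols \bar{x} and \bar{y} (the sample means) and the index i over the n data pairs are notation assumed here, not taken from the slides.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}},
\qquad -1 \le r \le 1
```

The sign of r gives the direction of the linear relationship and its magnitude gives the strength.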
Examples of positive and negative relationships. (a) Beer sales are positively related to temperature. (b) Coffee sales are negatively related to temperature.
Examples of relationships that are not linear: (a) relationship between reaction time and age; (b) relationship between mood and drug dose.
Correlations:
[Figure: two scatter plots of C2 against C1, one labelled Positive and one labelled Negative]
Positive: large values of X are associated with large values of Y, and small values of X with small values of Y (e.g. IQ and SAT).
Negative: large values of X are associated with small values of Y, and vice versa (e.g. speed and accuracy).
Application
Absences (x) and final grade (y)

Absences, x:      8    2    5   12   15    9    6
Final grade, y:  78   92   90   58   43   74   81

[Figure: scatter plot of final grade (y-axis) against absences (x-axis)]
Computation of r

  #     x     y    x²
  1     8    78    64
  2     2    92     4
  3     5    90    25
  4    12    58   144
  5    15    43   225
  6     9    74    81
  7     6    81    36
Sum    57   516   579
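The computation of r for the absences and final grade data can be sketched in Python as below. This is a minimal illustration that assumes the standard computational formula for the Pearson coefficient, r = (nΣxy - ΣxΣy) / (sqrt(nΣx² - (Σx)²) · sqrt(nΣy² - (Σy)²)); the function name `pearson_r` is hypothetical.

```python
import math

# Data from the table: x = absences, y = final grade.
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]


def pearson_r(xs, ys):
    """Pearson correlation coefficient via the computational (sums) formula."""
    n = len(xs)
    sum_x = sum(xs)                                # 57
    sum_y = sum(ys)                                # 516
    sum_xy = sum(a * b for a, b in zip(xs, ys))    # 3751
    sum_x2 = sum(a * a for a in xs)                # 579
    sum_y2 = sum(b * b for b in ys)                # 39898
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, y)
print(round(r, 3))  # -0.975
```

A value this close to -1 indicates a strong negative linear relationship: more absences go with lower final grades.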
Coefficient of Determination, r²
To understand the strength of the relationship between two variables, the correlation coefficient, r, is squared.
r² shows how much of the variation in one measure (say, fecal energy) is accounted for by knowing the value of the other measure (fecal lipid loss).
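As a small illustration of r², the sketch below reuses the absences and final grade data from the Application example (not the fecal energy example, for which no data are given). It assumes the standard computational formula for r; the variable names are illustrative only.

```python
# Data from the Application example: x = absences, y = final grade.
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]

n = len(x)
# Building blocks of the computational formula for r.
s_xy = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
s_xx = n * sum(a * a for a in x) - sum(x) ** 2
s_yy = n * sum(b * b for b in y) - sum(y) ** 2

# r squared, computed directly so that no square root is needed.
r_squared = s_xy ** 2 / (s_xx * s_yy)
print(round(r_squared, 3))  # 0.95
```

Under these assumptions, about 95% of the variation in final grade is accounted for by the number of absences.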
Linear Regression
Used when the goal is to predict the value of one characteristic from knowledge of another
Assumes a straight-line, or linear, relationship between two variables
When the term "simple" is used with regression, it refers to the situation where one explanatory (independent) variable is used to predict another
Regression
Regression: Correlation + Prediction
predicting y based on x
Regression equation
formula that specifies a line: ŷ = bx + a
plug in an x value (distance from target) and predict y (points)
note: y = actual value of a score; ŷ = predicted value
The regression line is the line of best fit through the data. Best fit means the line that minimises the sum of the squared vertical distances between the data points and the line (the least-squares criterion). It is the best linear description of the data.
The equation of a line may be written as y = mx + b where m is the slope of the line and b is the y-intercept.
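For reference, the standard least-squares formulas for the slope m and the y-intercept b of the line y = mx + b are sketched below. They are the usual textbook form, assumed here rather than taken from the slides; n is the number of data pairs and \bar{x}, \bar{y} are the sample means.

```latex
m = \frac{n\sum xy - \sum x \sum y}{n\sum x^{2} - \left(\sum x\right)^{2}},
\qquad
b = \bar{y} - m\,\bar{x} = \frac{\sum y}{n} - m\,\frac{\sum x}{n}
```

Note that the numerator of m is the same as the numerator of r, so the slope always has the same sign as the correlation coefficient.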
Write the equation of the line of regression with x = number of absences and y = final grade.
Sums for the seven data pairs: Σx = 57, Σy = 516, Σxy = 3751, Σx² = 579, Σy² = 39898
Calculate m and b.
ŷ = -3.924x + 105.667
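The slope and intercept can be checked with a short Python sketch. It assumes the standard least-squares formulas m = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and b = ȳ - m·x̄; the helper names `fit_line` and `predict`, and the value of 10 absences used for the prediction, are illustrative and not from the slides.

```python
# Data: x = number of absences, y = final grade.
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]


def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * sum_x / n
    return m, b


m, b = fit_line(x, y)


def predict(absences):
    """Predicted final grade for a given number of absences."""
    return m * absences + b


print(round(m, 2), round(b, 2))  # -3.92 105.67
print(round(predict(10), 1))     # 66.4
```

The slope is negative, in line with the negative correlation between absences and final grade. The third decimal of the intercept can differ slightly from a hand calculation, depending on how intermediate values are rounded.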
[Figure: plot of final grade (y-axis) against absences (x-axis)]