
Overview

In this lecture we will learn about:

• Correlation
• Correlation coefficient
• Properties of the correlation coefficient
• Independent variable
• Dependent variable
• Regression
• Regression coefficient
• Properties of the regression coefficient

Correlation

• Example: What happens to sweater sales as temperature increases? What is the strength of the association between them? What about ice-cream sales versus temperature?

• Is there any relationship between ice-cream sales and temperature?

This relationship is correlation.

• All sciences, whether natural, social or biological, are largely concerned with the study of interrelationships among variables.

• If a change in one variable is accompanied by a change in another variable, the variables are said to be correlated.

• Examples are the relationship between examination results and study, and the relationship between sales and advertisements.

• Thus the relationship between two variables is known as correlation.

Types of correlation

• Positive correlation
• Negative correlation
• Zero correlation
• Simple correlation
• Multiple correlation
• Partial correlation

• Positive correlation: If an increase (or decrease) in one variable results in a corresponding increase (or decrease) in the other, i.e. if the changes are in the same direction, the variables are said to be positively correlated.

• For example, the heights and weights of a group of persons are positively correlated.

• Negative correlation: If an increase (or decrease) in one variable results in a corresponding decrease (or increase) in the other, i.e. if the changes are in opposite directions, the variables are said to be negatively correlated.

• For example, the volume and pressure of a perfect gas are negatively correlated. The price of a commodity and the demand for it are negatively correlated.

• Zero correlation: If there is no relationship between two variables, it is called zero correlation.

• Simple correlation: There may be relationships among a number of variables in the study. When the relation between only two variables is considered, it is called simple correlation.

Methods of studying correlation

• The following are the important methods for ascertaining whether two variables are correlated or not:

• i. Scatter diagram method

• ii. Karl Pearson's correlation coefficient

Scatter Diagram Method

• In this method the given data are plotted on graph paper in the form of dots, i.e. for each pair of X and Y values we put a dot, and thus obtain as many points as there are observations.

• By looking at the scatter of the points we can form an idea as to whether the variables are related or not.
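The plotting step can be sketched in a few lines of Python. This is only a rough illustration: `text_scatter` is a hypothetical helper, not a library function. It prints a coarse character grid instead of drawing on graph paper, and the grid size is an arbitrary choice. The data are the capital and profit figures from Example-1 of this lecture.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Return a rough text scatter diagram: one '*' per (x, y) observation.

    Assumes xs and ys each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Map each observation to a column (from x) and a row (from y).
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"   # row 0 is the top of the diagram
    return "\n".join("".join(line) for line in grid)

capital = [10, 20, 30, 40, 50, 60, 70]   # x, in Tk. crores
profit = [2, 4, 8, 9, 10, 15, 12]        # y, in Tk. crores
plot = text_scatter(capital, profit)
print(plot)
```

If the points drift upward from left to right, as they do here, the diagram suggests a positive relationship. A band drifting downward would suggest a negative one, and a shapeless cloud little or no relationship.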

Karl Pearson's correlation coefficient:

Karl Pearson (1896) developed an index, or coefficient, to measure the relationship between two variables. The correlation coefficient measures the intensity (degree) of the linear relationship between the variables. Karl Pearson's correlation coefficient is denoted by r and is defined as

$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$
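As a minimal sketch, r can be computed from the deviations of each variable about its mean: the sum of the products of deviations divided by the square root of the product of the sums of squared deviations. The function name `pearson_r` and the two small data sets are illustrative assumptions, not part of the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data: exact straight-line relationships.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])     # y rises with x
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # y falls as x rises
print(r_up, r_down)
```

The two calls give the extreme cases: an exact increasing line and an exact decreasing line.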

Properties of correlation coefficient:

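• The correlation coefficient is independent of change of origin and scale.

• The correlation coefficient lies between -1 and +1. Symbolically, $-1 \le r \le +1$.

• The coefficient of correlation is the geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx}\, b_{xy}}$, where r takes the sign common to both coefficients.

• If X and Y are independent variables, then the coefficient of correlation is zero. However, the converse is not true: a zero coefficient does not imply that the variables are independent.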
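Two of these properties can be checked numerically. The sketch below is an illustration, not a proof. It uses the Example-1 data of this lecture for the origin-and-scale check, with arbitrarily chosen origins (40 and 9) and positive scale divisors (10 and 2). For the last property it uses a made-up pair in which w is completely determined by t, yet r is zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Data from Example-1 of this lecture.
x = [10, 20, 30, 40, 50, 60, 70]
y = [2, 4, 8, 9, 10, 15, 12]

# Change of origin and scale: u = (x - 40) / 10, v = (y - 9) / 2.
u = [(xi - 40) / 10 for xi in x]
v = [(yi - 9) / 2 for yi in y]
r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)
print(round(r_xy, 4), round(r_uv, 4))   # the two values agree

# Zero correlation without independence: w = t squared.
t = [-2, -1, 0, 1, 2]
w = [ti ** 2 for ti in t]
r_tw = pearson_r(t, w)
print(r_tw)
```

The second case shows why r should be read as a measure of linear association only.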

Interpretation of correlation coefficient:

• The correlation coefficient measures the degree of relationship between two sets of variables. The following guidelines help in interpreting the value of r.

• When r = +1, there is perfect positive correlation between the variables.

• When r = -1, there is perfect negative correlation between the variables.

• When r = 0, there is no correlation between the variables, i.e. the variables are uncorrelated.

• The closer the value of r is to +1 or -1, the closer the relationship between the variables; the closer the value of r is to 0, the weaker the relationship.

Example-1: The following data represent the capital employed and the profit of 7 firms in an industrial area of a country:

Capital employed (in Tk. crores)   10   20   30   40   50   60   70
Profit obtained (in Tk. crores)     2    4    8    9   10   15   12

i. Make a scatter diagram.
ii. Do you think there is any correlation between profit and capital employed?
iii. Find Karl Pearson's coefficient of correlation between capital employed and profit.
iv. Interpret the result.

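Solution:
Let x denote the capital employed and y the profit. We know that Karl Pearson's coefficient of correlation between x and y is given by

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$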

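Calculation table:

x      y      x²       y²      xy
10     2      100      4       20
20     4      400      16      80
30     8      900      64      240
40     9      1600     81      360
50     10     2500     100     500
60     15     3600     225     900
70     12     4900     144     840
Σx = 280   Σy = 60   Σx² = 14000   Σy² = 634   Σxy = 2940

Thus, with n = 7, the correlation coefficient (r) is given by

$$ r = \frac{7(2940) - (280)(60)}{\sqrt{\left[7(14000) - 280^2\right]\left[7(634) - 60^2\right]}} = \frac{3780}{\sqrt{19600 \times 838}} \approx 0.93 $$

Interpretation: r ≈ 0.93 is close to +1, so there is a strong positive correlation between capital employed and profit.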
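The hand calculation can be cross-checked with a short script. This sketch rebuilds the column totals from the Example-1 data and applies the sums-based form of Pearson's formula, r = (nΣxy − ΣxΣy) / √([nΣx² − (Σx)²][nΣy² − (Σy)²]). The variable names are illustrative.

```python
from math import sqrt

capital = [10, 20, 30, 40, 50, 60, 70]   # x, in Tk. crores
profit = [2, 4, 8, 9, 10, 15, 12]        # y, in Tk. crores

n = len(capital)
sum_x = sum(capital)
sum_y = sum(profit)
sum_x2 = sum(x * x for x in capital)
sum_y2 = sum(y * y for y in profit)
sum_xy = sum(x * y for x, y in zip(capital, profit))

# Sums-based (computational) form of Pearson's r.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)   # 280 60 14000 634 2940
print(round(r, 4))                            # 0.9327
```

Recomputing the totals this way is a useful check on a hand-built calculation table, since a single slip in one squared value carries through to r.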

Homework (HW)

The following table gives the age of cars of a certain make and their annual maintenance costs. Find (i) the coefficient of correlation between the variables and (ii) the regression coefficient of maintenance cost on age of cars. (iii) Also estimate the maintenance cost if the age of a car is 15 years.

Age of cars (in years)                    2    4    6    8
Maintenance costs (in hundreds of Tk.)    0   20   25   30

Regression

• Regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data.

• In regression analysis there are two types of variables.

• The variable whose value is influenced, or is to be predicted or explained, is called the dependent variable, and the variable which influences the values, or is used for prediction, is called the independent variable.

• In regression analysis the independent variable is also known as the regressor, predictor or explanatory variable, while the dependent variable is also known as the regressed or explained variable.

Objective of regression analysis:

• The usual purpose of regression analysis is to explain and predict the changes in the magnitude of a given variable in terms of one or more other variables.

• In regression analysis, we express the relationship between the dependent variable and the independent variables by means of an equation.

• For example, if Y denotes the yield of a crop, and x1 and x2 the associated rainfall and temperature respectively, we might like to explain the variation in Y in terms of the variations in x1 and x2, and perhaps predict the value of Y when x1 and x2 have specified values.

Regression coefficients:

• The quantity which is used to measure the average change in the dependent variable per unit change in the independent variable is called the regression coefficient.

• The quantity b in the regression equations (y = a + bx or x = a + by) is called the regression coefficient or slope coefficient.

• Since there are two regression equations, there are two regression coefficients: the regression coefficient of Y on X and the regression coefficient of X on Y.

The regression coefficient of Y on X is denoted by $b_{yx}$ and is defined as

$$ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} $$
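A small sketch of this definition, applied to the Example-1 data of this lecture (x = capital employed, y = profit). The function name `regression_coefficient_y_on_x` is a hypothetical choice for illustration.

```python
def regression_coefficient_y_on_x(xs, ys):
    """b_yx: sum of products of deviations over sum of squared x-deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

capital = [10, 20, 30, 40, 50, 60, 70]   # x, in Tk. crores
profit = [2, 4, 8, 9, 10, 15, 12]        # y, in Tk. crores

b_yx = regression_coefficient_y_on_x(capital, profit)
print(round(b_yx, 4))   # 540 / 2800
```

Read as a slope, the result says that in these data each additional crore of capital employed goes with roughly 0.19 crore of additional profit on average.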

Simple linear regression model:

• A model in which we consider only one dependent variable and one independent variable, and in which the relationship between these variables is a straight line, is called a simple linear regression model.

• The simple linear regression model can be written as follows:

• $Y = a + bX + e$, where $e$ follows $N(0, \sigma^2)$
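To see what the model says, one can simulate data from it and check that a fitted line recovers a and b approximately. Everything numeric below is an assumption chosen for illustration (a = 2, b = 0.5, σ = 1, X taking the fixed values 0 to 10). The helper `fit_line` uses the slope formula for the regression coefficient of Y on X together with the standard least-squares intercept a = ȳ − b·x̄.

```python
import random

def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

random.seed(0)                            # reproducible illustration
true_a, true_b, sigma = 2.0, 0.5, 1.0     # assumed, hypothetical values

# Fixed X values 0..10, each used 200 times; e is drawn from N(0, sigma^2).
xs = [i % 11 for i in range(2200)]
ys = [true_a + true_b * x + random.gauss(0.0, sigma) for x in xs]

a_hat, b_hat = fit_line(xs, ys)
print(round(a_hat, 2), round(b_hat, 2))
```

With this many observations the estimates should land close to the assumed values; with only a handful of points they would scatter more widely around them.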

Assumptions of simple linear regression model:

• The following are some of the assumptions involved in the simple linear regression model.

• The X's are fixed.



• The regression of Y on X is linear, and E(Y) = a + bX.

• The conditional distribution of Y for each value of X is normal.

• These conditional distributions have the same variance.

The regression coefficient of X on Y is denoted by $b_{xy}$ and is defined as

$$ b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} $$
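With both regression coefficients defined, the property that r is the geometric mean of the two can be checked on the Example-1 data of this lecture. This is a numerical illustration only. The square root is positive, so in general r takes the sign shared by the two coefficients; here both are positive.

```python
from math import sqrt

x = [10, 20, 30, 40, 50, 60, 70]   # capital employed, Tk. crores
y = [2, 4, 8, 9, 10, 15, 12]       # profit, Tk. crores

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

b_yx = sxy / sxx                 # regression coefficient of Y on X
b_xy = sxy / syy                 # regression coefficient of X on Y
r = sxy / sqrt(sxx * syy)        # Pearson's correlation coefficient
geometric_mean = sqrt(b_yx * b_xy)

print(round(b_yx, 4), round(b_xy, 4))
print(round(r, 4), round(geometric_mean, 4))   # the two values agree
```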

Difference between correlation and regression:

• There are three important points of difference between correlation and regression analysis:

1. The correlation coefficient measures the degree of relationship between X and Y. The regression coefficient, on the other hand, measures the average change in the dependent variable per unit change in the independent variable.

2. The cause and effect relation is more clearly indicated through regression analysis than by correlation. Correlation is only a tool for ascertaining the degree of relationship between variables, and therefore we cannot say that one variable is the cause and the other the effect.

3. The correlation coefficient is independent of change of origin and scale. The regression coefficient, on the other hand, is independent of change of origin but not of scale.
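The third point can be illustrated numerically. The sketch below takes the Example-1 data of this lecture, shifts the origin of x by an arbitrary 40, and then re-expresses x in lakhs instead of crores (1 crore = 100 lakh). The helper name `b_y_on_x` is hypothetical.

```python
def b_y_on_x(xs, ys):
    """Regression coefficient of y on x from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

x = [10, 20, 30, 40, 50, 60, 70]   # capital employed, Tk. crores
y = [2, 4, 8, 9, 10, 15, 12]       # profit, Tk. crores

x_shifted = [xi - 40 for xi in x]      # change of origin only
x_in_lakhs = [xi * 100 for xi in x]    # change of scale only

b_original = b_y_on_x(x, y)
b_shifted = b_y_on_x(x_shifted, y)
b_rescaled = b_y_on_x(x_in_lakhs, y)
print(b_original, b_shifted, b_rescaled)
```

Shifting the origin leaves the coefficient unchanged, while measuring x in units 100 times smaller divides it by 100. The correlation coefficient would be the same in all three cases.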

Example-2: The following table represents the advertising expenditure and the corresponding sales of 5 companies:

Advertising expenditure (in lakhs of Tk.)    5    7   10   12   15
Sales (in crores of Tk.)                     2    5    4    8   10

Estimate the sales (y) corresponding to an advertising expenditure (x) of 30 lakhs of Tk.
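One way to approach this example, sketched under the assumption that a simple linear regression of sales on advertising expenditure is intended: estimate b from the formula for the regression coefficient of Y on X, take the standard least-squares intercept a = ȳ − b·x̄, and evaluate the fitted line at x = 30. The variable names are illustrative.

```python
advertising = [5, 7, 10, 12, 15]   # x, in lakhs of Tk.
sales = [2, 5, 4, 8, 10]           # y, in crores of Tk.

n = len(advertising)
mean_x = sum(advertising) / n      # 9.8
mean_y = sum(sales) / n            # 5.8

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
sxx = sum((x - mean_x) ** 2 for x in advertising)

b = sxy / sxx                      # regression coefficient of y on x
a = mean_y - b * mean_x            # intercept of the fitted line
estimated_sales = a + b * 30       # fitted value at x = 30 lakhs

print(round(a, 4), round(b, 4))
print(round(estimated_sales, 2))   # about 20.85 crores of Tk.
```

Since 30 lakhs lies well outside the observed range of 5 to 15 lakhs, the estimate is an extrapolation and should be treated with caution.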
