
Summation properties: Σk = nk; ΣkXi = kΣXi; Σ(Xi + Yi) = ΣXi + ΣYi; Σ(a + bXi) = na + bΣXi (k, a, b constants).

W = aX + bY means W ~ N(μ_W, σ²_W), where μ_W = aμ_X + bμ_Y and σ²_W = a²σ²_X + b²σ²_Y + 2ab·cov(X,Y).

Coefficient of variation = σ_X/μ_X (the standard deviation relative to the mean).

For the normal distribution: skewness = 0, kurtosis = 3.

A, B, C are statistically independent if P(ABC) = P(A)P(B)P(C) (together with the pairwise conditions P(AB) = P(A)P(B), etc.). If A, B, C are mutually exclusive then P(A+B+C) = P(A) + P(B) + P(C). If A, B are not mutually exclusive then P(A+B) = P(A) + P(B) − P(AB).
Bayes' theorem: P(A|B) = P(B|A)·P(A)/P(B), provided P(B) > 0.
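A minimal numeric sketch of Bayes' theorem (all probabilities below are made-up illustrative values):

```python
# Hypothetical inputs: P(A) = 0.01, P(B|A) = 0.95, P(B|not A) = 0.10.
p_a = 0.01
p_b_given_a = 0.95
p_b_given_not_a = 0.10

# Total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A); must be > 0.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A)P(A) / P(B).
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 4))   # ~0.0876
```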

cov(a + bX, c + dY) = bd·cov(X,Y), where a, b, c, d are constants. Correlation coefficient properties: same sign as the covariance; a measure of the linear relation between 2 variables; −1 ≤ ρ ≤ 1; has no unit; ρ = 0 when cov(X,Y) = 0; correlation ≠ causality. cov(X,Y) = ρ·σ_X·σ_Y.
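A quick simulation sketch (made-up data, numpy assumed) of cov(a + bX, c + dY) = bd·cov(X,Y) and of the correlation being unit-free and bounded:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)        # built to be correlated with x

a, b, c, d = 2.0, 3.0, -1.0, 4.0
cov_xy = np.cov(x, y)[0, 1]
cov_transformed = np.cov(a + b * x, c + d * y)[0, 1]
print(cov_transformed, b * d * cov_xy)        # approximately equal

r = np.corrcoef(x, y)[0, 1]
r_transformed = np.corrcoef(a + b * x, c + d * y)[0, 1]
print(r, r_transformed)                       # unchanged (b, d > 0) and within [-1, 1]
```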

Z ~ N(0,1) is the standard normal. X1, X2, …, Xn constitute a random sample of size n if all the Xi are drawn independently from the same probability distribution (an iid sample). If X1, X2, …, Xn is a random sample from N(μ_X, σ²_X), then the sample mean X̄ ~ N(μ_X, σ²_X/n).

Nature of the error term u: it may represent the influence of variables not present in the model; randomness in behaviour that is bound to occur; and errors of measurement. Assume u to be the sum of all these errors for ease of calculation. Sample Regression Function (SRF), used to represent the sample regression line: Ŷ_i = b1 + b2·X_i, or in stochastic form Y_i = b1 + b2·X_i + e_i.


Conditional probability: P(A|B) = P(AB)/P(B); P(B) > 0.

Standard error vs. standard deviation: the standard error is the standard deviation of the sampling distribution of an estimator. To standardize the sample mean so that it is ~N(0,1), use the se of X̄, σ_X/√n, not the sd of X.
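A small simulation sketch of the se vs. sd distinction (an assumed population N(5, 2²) and n = 25): the spread of the sample means matches σ/√n, not σ.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 5.0, 2.0, 25

# Draw many samples of size n and record each sample mean.
means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)

print(means.std(ddof=1), sigma / np.sqrt(n))   # both ~0.4: the standard error
print(sigma)                                   # 2.0: the sd of X itself

# Standardizing the sample mean with the se gives ~N(0, 1).
z = (means - mu) / (sigma / np.sqrt(n))
print(z.mean(), z.std(ddof=1))                 # ~0 and ~1
```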

Ŷ_i is the estimator of E(Y|X_i), i.e. an estimator of the population conditional mean; b1 = estimator of B1; b2 = estimator of B2; e_i (aka the residual) = estimator of u_i.
Econometrics' goal is to estimate the stochastic PRF on the basis of the stochastic SRF. We do not observe B1, B2, u; what we observe are their proxies b1, b2, e.


d.r.v. PMF: f(x) = P(X = x), with Σ_x f(x) = 1.
c.r.v. PDF: f(x) ≥ 0, with ∫ f(x) dx = 1.
CDF: F(x) = P(X ≤ x); P(x1 < X < x2) = F(x2) − F(x1). The CDF is merely an accumulation of the PMF/PDF.
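A short sketch of the CDF as an accumulation, using the fair-die PMF from these notes for the discrete case and a standard normal (via scipy, an assumption) for the continuous case:

```python
import numpy as np
from scipy import stats

# Discrete: the CDF of a fair die is the running sum of its PMF.
pmf = np.full(6, 1 / 6)
cdf = np.cumsum(pmf)
print(cdf)                                      # running totals: 1/6, 2/6, ..., 6/6

# Continuous: P(x1 < X < x2) = F(x2) - F(x1), here for X ~ N(0, 1).
x1, x2 = -1.0, 1.0
print(stats.norm.cdf(x2) - stats.norm.cdf(x1))  # ~0.6827
```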

var(X+Y) = var(X) + var(Y) + 2·cov(X,Y); var(X−Y) = var(X) + var(Y) − 2·cov(X,Y).


E(X|Y=y) = Σ_x x·f(x|Y=y), where X is a d.r.v. and f(x|Y=y) is the conditional PMF of X. Skewness is a measure of asymmetry and is based on the 3rd moment; kurtosis is a measure of tallness or flatness and is based on the 4th moment.

Central Limit Theorem: as the sample size n increases, the distribution of the sample mean of iid draws approaches the normal distribution, whatever the shape of the parent population. t distribution: when σ is unknown and replaced by the sample sd s, t = (X̄ − μ_X)/(s/√n) ~ t with (n − 1) dof.
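A simulation sketch of both results, assuming a uniform (non-normal) parent population and n = 30:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
mu = 0.5                                    # true mean of Uniform(0, 1)

# CLT: sample means of a uniform parent are approximately normal for n = 30.
samples = rng.uniform(0, 1, size=(20_000, n))
means = samples.mean(axis=1)
print(means.mean(), means.std(ddof=1))      # ~0.5 and ~sqrt(1/12)/sqrt(30)

# t statistic: replace the unknown sigma by the sample sd s -> t with k = n-1 dof.
k = n - 1
t = (means - mu) / (samples.std(axis=1, ddof=1) / np.sqrt(n))
print(np.var(t, ddof=1), k / (k - 2))       # simulated var(t) vs theoretical k/(k-2)
```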

Joint PMF: f(X,Y) = P(X = x, Y = y). Marginal PMF/PDF: sum the joint probabilities corresponding to a given value of X regardless of the values of Y. Conditional PMF: f(Y|X) = P(Y = y | X = x) = f(X,Y)/f(X). Statistically independent: f(X,Y) = f(X)·f(Y) for all X, Y.
Expected value of a d.r.v.: E(X) = Σ X·f(X), aka the average or population mean value. E.g. for a fair die, E(X) = 21/6 = 3.5:

X   f(X)   X·f(X)
1   1/6    1/6
2   1/6    2/6
3   1/6    3/6
4   1/6    4/6
5   1/6    5/6
6   1/6    6/6
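The same expected-value calculation as a one-line check:

```python
# E(X) = sum of X * f(X) for a fair die.
values = [1, 2, 3, 4, 5, 6]
print(sum(x * (1 / 6) for x in values))   # 21/6 = 3.5
```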

S > 0 means the PDF is right-skewed and S < 0 means it is left-skewed. PDFs with K < 3 are platykurtic and with K > 3 are leptokurtic; K = 3 is the normal distribution, whose PDF is mesokurtic. Sample mean X̄ = ΣXi/n and sample variance S² = Σ(Xi − X̄)²/(n − 1); n − 1 is the degrees of freedom.
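A minimal sketch of the sample mean and the n − 1 divisor, with made-up data (numpy's ddof=1 is the same n − 1 convention):

```python
import numpy as np

x = np.array([4.0, 7.0, 6.0, 5.0, 8.0])            # hypothetical sample
n = len(x)

mean = x.sum() / n
var_unbiased = ((x - mean) ** 2).sum() / (n - 1)    # divide by n-1, the dof

print(mean, var_unbiased)                           # 6.0, 2.5
print(x.mean(), x.var(ddof=1))                      # numpy equivalents
```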

The variance of t is defined as k/(k − 2), where k is the dof and k > 2.
Chi-square: Z² ~ χ² with 1 degree of freedom. Properties: takes only values > 0; a skewed distribution; the EV of a χ² r.v. is k and its variance is 2k, where k is the dof; if Z1, Z2 are two independent χ² variables with k1, k2 dof, then (Z1 + Z2) is also a χ² variable with dof = (k1 + k2).
F distribution: X1, X2, …, Xm and Y1, Y2, …, Yn are samples of sizes m and n, both from normal populations; F = S²_X/S²_Y ~ F(k1, k2), where k1 = numerator dof, k2 = denominator dof. Properties: skewed to the right; 0 < F < ∞; F approaches the normal distribution as k1, k2 → ∞.
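A simulation sketch of the χ² and F facts above (numpy/scipy assumed; the dof values are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Sum of independent chi-squares: dof add, EV ~ k, variance ~ 2k.
z1 = rng.chisquare(df=4, size=200_000)
z2 = rng.chisquare(df=6, size=200_000)
s = z1 + z2                                    # ~ chi-square with k = 10 dof
print(s.mean(), s.var(ddof=1))                 # ~10 and ~20

# F = ratio of two sample variances from normal populations ~ F(m-1, n-1).
m, n = 8, 12
xs = rng.normal(0, 1, size=(100_000, m))
ys = rng.normal(0, 1, size=(100_000, n))
f = xs.var(axis=1, ddof=1) / ys.var(axis=1, ddof=1)
print(f.mean(), stats.f.mean(m - 1, n - 1))    # simulated vs theoretical mean
```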

Linear regression refers to linearity in the parameters, not in the variables. Estimation of the linear regression:
A) Pick a line whose average residual is minimized. Drawback: large positive and negative residuals can cancel out, so a bad fit can still look good.
B) Ordinary Least Squares (OLS): minimize the Residual Sum of Squares Σe_i². Squaring gives more weight to large residuals, therefore the method is sensitive to outliers.
C) Weighted least squares: minimize Σw_i·e_i², e.g. with w_i = 1 for i = 1, …, n−1 and w_i = 0 for i = n.

Prove: Σe_i = 0 (the residuals sum to zero; it follows from the first normal equation).
Prove: Σx_i = Σy_i = 0, where x_i = X_i − X̄ and y_i = Y_i − Ȳ (deviations from the sample mean sum to zero; prove it for x, then repeat for y).
Normal eqns.: ΣY_i = n·b1 + b2·ΣX_i and ΣX_iY_i = b1·ΣX_i + b2·ΣX_i². Substituting b1 = Ȳ − b2·X̄ and solving gives b2 = Σx_iy_i/Σx_i² and b1 = Ȳ − b2·X̄.
Sample cov(X,Y) = Σx_iy_i/(n − 1); sample correlation r = Σx_iy_i/√(Σx_i²·Σy_i²).
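A sketch of these formulas on made-up data: b2 = Σx_iy_i/Σx_i², b1 = Ȳ − b2·X̄, with checks that Σe_i = 0 and that the fitted line passes through (X̄, Ȳ); np.polyfit is used only as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.uniform(0, 10, size=50)
Y = 3.0 + 0.5 * X + rng.normal(0, 1, size=50)    # hypothetical data-generating line

x = X - X.mean()                  # small letters: deviations from the sample means
y = Y - Y.mean()

b2 = (x * y).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)             # residuals, the estimators of u

print(b1, b2)
print(e.sum())                               # ~0: residuals sum to zero
print(Y.mean(), b1 + b2 * X.mean())          # SRF passes through the sample means
print(np.polyfit(X, Y, 1))                   # [slope, intercept] agrees
```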

E(b) = b for a constant b. E(X+Y) = E(X) + E(Y). E(X/Y) ≠ E(X)/E(Y). E(XY) ≠ E(X)·E(Y) unless X, Y are independent. E(X²) ≠ [E(X)]². E(aX) = a·E(X). E(aX + b) = a·E(X) + b.
var(X) = E(X − μ_X)² = E(X²) − μ_X², and for a c.r.v. var(X) = ∫(x − μ_X)²·f(x) dx; std dev σ_X = +√var(X). var(k) = 0 for a constant k. var(X+Y) = var(X) + var(Y) when cov(X,Y) = 0. var(X + k) = var(X). If X, Y are independent r.v.s and a, b are constants, then var(aX + bY) = a²·var(X) + b²·var(Y).
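A quick simulation sketch of a few of these rules (X, Y drawn independently; all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(2, 1, size=500_000)
Y = rng.normal(3, 2, size=500_000)        # independent of X by construction
k = 7.0

print(np.mean(X * Y), np.mean(X) * np.mean(Y))     # ~equal only because X, Y are independent
print(np.mean(X ** 2), np.mean(X) ** 2)            # E(X^2) != [E(X)]^2; they differ by var(X)
print(np.var(X + k, ddof=1), np.var(X, ddof=1))    # adding a constant leaves the variance unchanged
print(np.var(2 * X + 3 * Y, ddof=1),
      4 * np.var(X, ddof=1) + 9 * np.var(Y, ddof=1))   # var(aX+bY) = a^2 var(X) + b^2 var(Y)
```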

Note that as k1, k2 → ∞ the F distribution approaches the normal, and as the denominator dof n → ∞, m·F(m,n) → χ²_m.
Skewness uses the 3rd moment: S = E(X − μ_X)³/σ³_X; the sample version is built from m3 = Σ(X_i − X̄)³/n. Sample kurtosis K is the 4th moment over the square of the 2nd moment, K = m4/m2², where m_r = Σ(X_i − X̄)^r/n.
X ~ N(μ_X, σ²_X) means X is normally distributed with (EV, var) = (μ_X, σ²_X). Properties: symmetrical around μ_X; the PDF is highest at the mean value; 68% of the area under the curve lies within (μ_X ± σ_X), 95% within (μ_X ± 2σ_X), and 99.7% within (μ_X ± 3σ_X); fully described by μ_X and σ²_X.
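A scipy sketch of the 68/95/99.7 areas and of the S = 0, K = 3 benchmark (the particular μ, σ are arbitrary; scipy reports excess kurtosis, i.e. K − 3):

```python
from scipy import stats

mu, sigma = 10.0, 3.0                       # arbitrary normal parameters
X = stats.norm(mu, sigma)

for k in (1, 2, 3):                         # area within mu +/- k*sigma
    print(k, X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma))
# 1 -> ~0.6827, 2 -> ~0.9545, 3 -> ~0.9973

print(stats.norm.stats(moments='sk'))       # skewness 0, excess kurtosis (K - 3) = 0
```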

Regression does not imply causation. Objectives: estimate the mean of the dependent variable; test hypotheses about the nature of the dependence; predict the mean beyond the sample range. The Population Regression Line passes through the conditional means of Y: the PRL gives the average or mean value of the dependent variable (Y) corresponding to each value of the independent variable (X). Population Regression Function (PRF): E(Y|X_i) = B1 + B2·X_i, where B1, B2 are the regression coefficients; in stochastic form Y_i = B1 + B2·X_i + u_i, where u_i is the stochastic, or random, error term. Nature of the error: it may represent omitted variables, randomness in behaviour, and errors of measurement (listed above).

Small letters x_i = X_i − X̄ and y_i = Y_i − Ȳ denote deviations from the sample mean values. Features of OLS estimators: the SRF obtained by OLS passes through the sample mean values of X and Y, so Ȳ = b1 + b2·X̄.

Notation: Σ = summation; | = given; ∀ = for all; ~ = distributed as.
Example (joint PMF of X and Y):

        X=0   X=1   X=2   X=3   X=4   f(Y)
Y=0     .03   .03   .02   .02   .01   .11
Y=1     .02   .05   .06   .02   .01   .16
Y=2     .01   .02   .10   .05   .05   .23
Y=3     .01   .01   .05   .10   .10   .27
Y=4     .01   .01   .01   .05   .15   .23
f(X)    .08   .12   .24   .24   .32   1

If we want to find f(Y=4|X=4) = f(Y=4 and X=4)/f(X=4) = 0.15/0.32 ≈ 0.47. We know from the table that P(Y=4) = 0.23, but given X=4 this increases to 0.47.
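The same table computation as a numpy sketch: the marginals are row/column sums of the joint PMF, and the conditional divides the joint by the marginal.

```python
import numpy as np

# Joint PMF f(X, Y); rows are Y = 0..4, columns are X = 0..4 (values from the table).
joint = np.array([
    [.03, .03, .02, .02, .01],
    [.02, .05, .06, .02, .01],
    [.01, .02, .10, .05, .05],
    [.01, .01, .05, .10, .10],
    [.01, .01, .01, .05, .15],
])

f_Y = joint.sum(axis=1)        # marginal of Y: sum over X
f_X = joint.sum(axis=0)        # marginal of X: sum over Y

print(f_Y[4])                  # P(Y=4) = 0.23
print(joint[4, 4] / f_X[4])    # f(Y=4 | X=4) = 0.15 / 0.32 ~ 0.47
```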

var(aX + bY) = a²·var(X) + b²·var(Y) for independent X, Y (a, b constants); the same rule holds for a c.r.v.


Chebyshev's inequality: for any constant c > 0, P(|X − μ_X| ≥ c) ≤ var(X)/c².
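A simulation sketch checking the bound; an exponential population is assumed just to show it holds for a non-normal X:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.exponential(scale=2.0, size=1_000_000)   # mean ~2, variance ~4
mu, var = x.mean(), x.var(ddof=1)

for c in (2.0, 3.0, 4.0):
    lhs = np.mean(np.abs(x - mu) >= c)           # estimated P(|X - mu| >= c)
    print(c, lhs, var / c ** 2)                  # lhs stays below var / c^2
```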

A linear combination of 2 or more normally distributed random variables is itself normally distributed. If X ~ N(μ_X, σ²_X) and Y ~ N(μ_Y, σ²_Y) and X, Y are independent, then aX + bY ~ N(a·μ_X + b·μ_Y, a²·σ²_X + b²·σ²_Y).
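A worked instance of this rule, with arbitrary illustrative numbers:

```latex
% If X ~ N(10, 4), Y ~ N(5, 9), X and Y independent, and W = 2X + 3Y, then
W \sim N\left(2 \cdot 10 + 3 \cdot 5,\; 2^2 \cdot 4 + 3^2 \cdot 9\right) = N(35,\; 97).
```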

B1 and B2 are the parameters, aka the regression coefficients.

Ŷ is an estimator of the true population mean corresponding to a given X. Interpretation: a slope coefficient of 0.13 means that, other things remaining the same, if X goes up by 1 unit the mean of Y goes up by 0.13; and an intercept of 432 means that if X = 0, the mean of Y will be 432.
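A tiny worked prediction with these coefficients (the X value is arbitrary):

```latex
% With the SRF \hat{Y} = 432 + 0.13 X, at X = 1000:
\hat{Y} = 432 + 0.13 \times 1000 = 562,
% and raising X by one unit (to 1001) raises the estimated mean of Y by 0.13, to 562.13.
```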
