
SIMULTANEOUS EQUATIONS REGRESSION MODEL

INTRODUCTION

The classical linear regression model, general linear regression model, and seemingly unrelated regressions model all make the following assumption: the error term is uncorrelated with each explanatory variable. An explanatory variable that is uncorrelated with the error term is called an exogenous variable. An explanatory variable that is correlated with the error term is called an endogenous variable. The 3 most important sources of correlation between the error term and an explanatory variable are the following.

1. Omission of an important explanatory variable.
2. Measurement error in an explanatory variable.
3. Reverse causation.

We will focus on reverse causation. One or more of the explanatory variables will be endogenous if the data are generated by a process that is described by a simultaneous equations system. This occurs when a change in a right-hand side variable causes a change in the left-hand side variable, and a change in the left-hand side variable causes a change in the right-hand side variable.

Example

Consider the following simple Keynesian model of income determination, comprised of two equations: a consumption function and an equilibrium condition,

C = a + bY + ε
Y = C + I

where C is aggregate consumption, Y is aggregate income, I is exogenous investment, a and b are parameters, and ε is an error term that summarizes all factors other than Y that influence C (e.g., wealth, the interest rate). Now, suppose that ε increases. This will directly increase C in the consumption function. However, the equilibrium condition tells us that the increase in C will increase Y. Therefore, ε and Y are positively correlated.

INTRODUCTION TO THE SIMULTANEOUS EQUATIONS REGRESSION MODEL

When a single equation is embedded in a system of simultaneous equations, at least one of the right-hand side variables will be endogenous, and therefore the error term will be correlated with at least one of the right-hand side variables. In this case, the true data generation process is not described by the classical linear regression model, general linear regression model, or seemingly unrelated regressions model; rather, it is described by a simultaneous equations regression model. If you use the OLS estimator, FGLS estimator, SUR estimator, or ISUR estimator, you will get biased and inconsistent estimates of the population parameters.
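The reverse-causation argument in the Keynesian example above can be checked with a short simulation. This is an illustrative sketch, not part of the original notes: the parameter values (a = 10, b = 0.8) and the distributions of I and ε are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000
a, b = 10.0, 0.8                      # hypothetical structural parameters

I = rng.uniform(50, 100, T)           # exogenous investment
eps = rng.normal(0, 5, T)             # error term in the consumption function

# Solving C = a + b*Y + eps and Y = C + I for Y gives the reduced form
# Y = (a + I + eps) / (1 - b), so an increase in eps raises Y directly.
Y = (a + I + eps) / (1 - b)
C = a + b * Y + eps

# eps and Y are positively correlated, so Y is endogenous in the
# consumption function ...
corr = np.corrcoef(eps, Y)[0, 1]
print("corr(eps, Y):", round(corr, 3))

# ... and OLS applied to the consumption function is biased (upward here,
# because the correlation is positive).
X = np.column_stack([np.ones(T), Y])
b_ols = np.linalg.solve(X.T @ X, X.T @ C)[1]
print("OLS estimate of b:", round(b_ols, 3), "(true b = 0.8)")
```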

Definitions and Basic Concepts

Endogenous variable: a variable whose value is determined within an equation system. The values of the endogenous variables are the solution of the equation system.
Exogenous variable: a variable whose value is determined outside an equation system.
Structural equation: an equation that has one or more endogenous right-hand side variables.
Reduced form equation: an equation for which all right-hand side variables are exogenous.
Structural parameters: the parameters of a structural equation.
Reduced form parameters: the parameters of a reduced form equation.

Specifying a Simultaneous Equations System

A simultaneous equations system is one of 4 important types of equation systems that are used to specify statistical models in economics. The others are the seemingly unrelated equations system, the recursive equations system, and the block recursive equations system. It is important to know the difference between these 4 types of equation systems when specifying statistical models of data generation processes.

The Identification Problem

Before you estimate a structural equation that is part of a simultaneous equations system, you must first determine whether the equation is identified. If the equation is not identified, then estimating its parameters is meaningless: the estimates you obtain will have no interpretation, and therefore will not provide any useful information.

Classifying Structural Equations

Every structural equation can be placed in one of the following three categories.

1. Unidentified equation: The parameters of an unidentified equation have no interpretation, because you do not have enough information to obtain meaningful estimates.
2. Exactly identified equation: The parameters of an exactly identified equation have an interpretation, because you have just enough information to obtain meaningful estimates.
3. Overidentified equation: The parameters of an overidentified equation have an interpretation, because you have more than enough information to obtain meaningful estimates.

Exclusion Restrictions

The most often used way to identify a structural equation is to use prior information provided by economic theory to exclude from the equation certain variables that appear elsewhere in the model. This is called obtaining identification through exclusion restrictions. To exclude a variable from a structural equation, you restrict the value of its coefficient to zero. This type of zero fixed-value restriction is called an exclusion restriction because it has the effect of omitting a variable from the equation to obtain identification.

Rank and Order Conditions for Identification

Exclusion restrictions are most often used to identify a structural equation in a simultaneous equations model. When using exclusion restrictions, you can use two general rules to check whether identification is achieved: the rank condition and the order condition. The order condition is a necessary but not sufficient condition for identification. The rank condition is both a necessary and sufficient condition. Because the rank condition is more difficult to apply, many economists only check the order condition and gamble that the rank condition is satisfied. This is usually, but not always, the case.

Order Condition

The order condition is a simple counting rule that you can use to determine whether one structural equation in a system of linear simultaneous equations is identified. Define the following:

G = total number of endogenous variables in the model (i.e., in all equations that comprise the model).
K = total number of variables (endogenous and exogenous) excluded from the equation being checked for identification.

The order condition is as follows.

If K = G − 1, the equation is exactly identified.
If K > G − 1, the equation is overidentified.
If K < G − 1, the equation is unidentified.
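The counting rule above can be sketched as a small function. The demand-and-supply example and the variable names used here are hypothetical, chosen only to illustrate the rule.

```python
def order_condition(equations, endogenous, eq_index):
    """Classify one structural equation using the order condition.

    equations  -- list of sets, each the variable names appearing in one equation
    endogenous -- set of all endogenous variable names in the model
    eq_index   -- index of the equation being checked
    """
    all_vars = set().union(*equations)
    G = len(endogenous)                       # endogenous variables in the model
    K = len(all_vars - equations[eq_index])   # variables excluded from this equation
    if K == G - 1:
        return "exactly identified"
    return "overidentified" if K > G - 1 else "unidentified"

# Hypothetical model: quantity Q and price P are endogenous; income Y
# appears only in the demand equation, weather W only in supply.
demand = {"Q", "P", "Y"}
supply = {"Q", "P", "W"}
print(order_condition([demand, supply], {"Q", "P"}, 0))  # demand equation
print(order_condition([demand, supply], {"Q", "P"}, 1))  # supply equation
```

Here G = 2 and each equation excludes exactly one variable, so K = G − 1 = 1 for both: each equation is exactly identified.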

Rank Condition

The rank condition tells you whether the structural equation you are checking for identification can be distinguished from a linear combination of all structural equations in the simultaneous equations system. The procedure is as follows.

1. Construct a matrix in which each row represents one equation and each column represents one variable in the simultaneous equations model.
2. If a variable occurs in an equation, mark it with an X. If a variable does not occur in an equation, mark it with a 0.
3. Delete the row for the equation you are checking for identification.
4. Form a new matrix from the columns that correspond to the elements that have zeros in the row that you deleted.
5. For this new matrix, if you can find at least (G − 1) rows and columns that are not all zeros, then the equation is identified. If you cannot, the equation is unidentified.

SPECIFICATION

A simultaneous equations regression model has two alternative specifications:

1. Reduced form
2. Structural form

The reduced-form specification is comprised of M reduced-form equations and a set of assumptions about the error terms in the reduced-form equations. The reduced-form specification of the model is usually not estimated, because it provides limited information about the economic process in which you are interested. The structural-form specification is comprised of M structural equations and a set of assumptions about the error terms in the structural equations. The structural-form specification of the model is the specification most often estimated, because it provides more information about the economic process in which you are interested.

Specification of the Structural Form of the Model

A set of assumptions defines the specification of the structural form of a simultaneous equations regression model. The key assumption is that the error term is correlated with one or more explanatory variables. There are several alternative specifications of the structural form of the model, depending on the remaining assumptions we make about the error term. For example, if we assume that the error term has non-constant variance, then we have a simultaneous equations regression model with heteroscedasticity. If we assume the errors in one or more equations are correlated, then we have a simultaneous equations regression model with autocorrelation.

We will assume that the error term has constant variance and that the errors are not correlated within equations. However, we will allow errors to be contemporaneously correlated across equations.

ESTIMATION

Single Equation vs. System Estimation

Two alternative approaches can be used to estimate a simultaneous equations regression model.

1. Single equation estimation

". 4ystem estimation Single Equation Estimation 4ingle e&uation estimation involves estimating either one e&uation in the model, or two or more e&uations in the model separately. 5or example, suppose you have a simultaneous e&uation regression model that consists of two e&uations: a demand e&uation and a supply e&uation. 4uppose your ob>ective is to obtain an estimate of the price elasticity of demand. In this case, you might estimate the demand e&uation only. 4uppose your ob>ective is to obtain estimates of price elasticity of demand and price elasticity of supply. In this case, you might estimate the demand e&uation by itself and the supply e&uation by itself. System Estimation 4ystem estimation involves estimating two or more e&uations in the model >ointly. 5or instance, in the above example you might estimate the demand and supply e&uations together. -ou might do this even if your ob>ective is to obtain an estimate of the price elasticity of demand only. Advantages and Disadvantages of the Two Approaches The ma>or advantage of system estimation is that it uses more information, and therefore results in more precise parameter estimates. The ma>or disadvantages are that it re&uires more data and is sensitive to model specification errors. The opposite is true for single e&uation estimation. 4ingle (&uation (stimators %e will consider 3 single e&uation estimators. 1. !rdinary least s&uares 0!34 estimator ". Instrumental variables 0ID estimator 3. Two'stage least s&uares 0"434 estimator (ach of these estimators is biased in small samples. Therefore, if the sample data are generated by a simultaneous e&uation regression model you cannot find an estimator that has desirable small sample properties. This means that you must look for an estimator that has desirable asymptotic properties. !rdinary 3east 4&uares 0!34 estimator The !34 estimator is given by the rule: $ % 0&T& '1&Ty Properties of the O S Estimator

If the sample data are generated by a simultaneous equations regression model, then the OLS estimator is biased in small samples and inconsistent in large samples. It does not produce maximum likelihood estimates. Thus, it has undesirable small and large sample properties.

Role of the OLS Estimator

The OLS estimator should be used as a preliminary estimator. You should initially estimate the equation using the OLS estimator. You should then estimate the equation using a consistent estimator. You can then compare the OLS estimate and the consistent estimate of a parameter to determine the direction of the bias.

Instrumental Variables (IV) Estimator

The IV estimator is given by the following two-step procedure.

1. Find one instrumental variable for each right-hand side variable in the equation to be estimated. A valid instrumental variable has two properties:
   1. Instrument relevance: it is correlated with the variable for which it is to serve as an instrument.
   2. Instrument exogeneity: it is not correlated with the error term in the equation to be estimated.
2. Apply the following formula to the sample data:

b_IV = (Z'X)^(-1) Z'y

where X is the T×k data matrix for the original right-hand side variables, Z is the T×k data matrix for the instrumental variables, and y is the T×1 column vector of observations on the dependent variable in the equation to be estimated.

Comments

Each exogenous right-hand side variable in the equation to be estimated can serve as its own instrumental variable. This is because an exogenous variable is perfectly correlated with itself and, by the assumption of exogeneity, is not correlated with the error term in any equation. The best candidates to be an instrumental variable for an endogenous right-hand side variable in the equation to be estimated are exogenous variables that appear in other equations in the model. This is because they are correlated with the endogenous variables in the model via the reduced-form equations, but they are not correlated with the error term in any equation. Often there will exist more than one exogenous variable that can serve as an instrumental variable for an endogenous variable. In this case, you can do one of two things.

1. Use as your instrumental variable the exogenous variable that is most highly correlated with the endogenous variable.
2. Use as your instrumental variable the linear combination of candidate exogenous variables most highly correlated with the endogenous variable.

Relationship Between the IV Estimator and Identification

The following relationship exists between the IV estimator and identification.

1. If the equation to be estimated is exactly identified, then there are exactly enough exogenous variables excluded from the equation to serve as instrumental variables for the endogenous right-hand side variable(s).
2. If the equation to be estimated is overidentified, then there are more than enough exogenous variables excluded from the equation to serve as instrumental variables for the endogenous right-hand side variable(s).
3. If the equation to be estimated is unidentified, then there are not enough exogenous variables excluded from the equation to serve as instrumental variables for the endogenous right-hand side variable(s). In this case, the IV estimator cannot be used.

Properties of the IV Estimator

If the sample data are generated by a simultaneous equations regression model, then the IV estimator is biased in small samples but consistent in large samples. Thus, it has desirable large sample properties. However, in the class of single equation estimators, the IV estimator is not necessarily asymptotically efficient. This is because a given endogenous right-hand side variable can have a number of possible instrumental variables, and each instrumental variable results in a different IV estimator. While all such IV estimators are consistent, not all are asymptotically efficient. The greater the correlation between the endogenous right-hand side variable and its instrumental variable, the more efficient the IV estimator. Note that the IV estimator does not produce maximum likelihood estimates.

Two-Stage Least Squares (2SLS) Estimator

The 2SLS estimator is a special type of IV estimator. It involves two successive applications of the OLS estimator, and is given by the following two-stage procedure.

1. Regress each endogenous right-hand side variable in the equation to be estimated on all exogenous variables in the simultaneous equations model using the OLS estimator. Calculate the fitted values for each of these endogenous variables.
2. In the equation to be estimated, replace each endogenous right-hand side variable by its fitted-value variable. Estimate the equation using the OLS estimator.

Comments

1. Stage 1 is identical to estimating the reduced-form equation for each endogenous right-hand side variable in the equation to be estimated.
2. The estimated standard errors obtained from the stage 2 regression are incorrect and must be corrected. Statistical programs that have a 2SLS procedure make this correction automatically and report the correct standard errors.
3. The 2SLS estimator is the most popular single equation estimator, and one of the most often used estimators in economics.
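The two-stage procedure can be carried out by hand with two OLS regressions. The sketch below uses simulated data from the Keynesian model introduced earlier (parameter values are hypothetical); exogenous investment I serves as the instrument for income Y in the consumption function.

```python
import numpy as np

def ols(X, y):
    # OLS rule: b = (X'X)^(-1) X'y
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(1)
T = 100_000
a, b = 10.0, 0.8                      # hypothetical structural parameters
I = rng.uniform(50, 100, T)           # exogenous investment (the instrument)
eps = rng.normal(0, 5, T)
Y = (a + I + eps) / (1 - b)           # reduced form for income
C = a + b * Y + eps                   # consumption function to be estimated

# Stage 1: regress the endogenous RHS variable Y on all exogenous
# variables (constant and I) and form the fitted values.
Z = np.column_stack([np.ones(T), I])
Y_hat = Z @ ols(Z, Y)

# Stage 2: replace Y with Y_hat and apply OLS again.
b_2sls = ols(np.column_stack([np.ones(T), Y_hat]), C)[1]
b_ols = ols(np.column_stack([np.ones(T), Y]), C)[1]
print("OLS:", round(b_ols, 3), " 2SLS:", round(b_2sls, 3), " true b: 0.8")
```

Note that the stage 2 standard errors computed this naive way would be wrong (comment 2 above); a canned 2SLS routine corrects them automatically.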

Properties of the 2SLS Estimator

If the error term is correlated with one or more explanatory variables, then the 2SLS estimator is biased in small samples. However, it is consistent and, in the class of single equation estimators, asymptotically efficient. Thus, it has desirable large sample properties. Note that the 2SLS estimator does not produce maximum likelihood estimates. Monte Carlo studies also suggest that under most conditions the 2SLS estimator has better small sample properties than alternative single equation estimators.

Checking the Validity of the Instruments

If the instrumental variable(s) are uncorrelated with the endogenous explanatory variable, then they are not relevant. If the instrumental variable(s) have a relatively low correlation with the endogenous variable, then they are said to be weak instruments. If the instruments are irrelevant or weak, then 2SLS will be inconsistent in large samples. Also, it will not have an asymptotic normal distribution, so hypothesis tests will not be valid. If any instrumental variable is correlated with the error term, then it is not exogenous. If any instrumental variable is not exogenous, then 2SLS will be inconsistent in large samples. If 2SLS is inconsistent, then it will not produce an estimate that is close to the true value of the population parameter, even if the sample size is large. Therefore, it is important to check the validity of your instrumental variable(s).

Checking Instrument Relevance

To check for instrument relevance (strength of instruments), you calculate the F-statistic for the null hypothesis that the coefficients of the variables used as instruments are all zero in the first-stage regression. The bigger (smaller) the F-statistic, the stronger (weaker) the instrument(s). An often used rule of thumb is that an F-statistic of less than 10 indicates possible weak instruments. Why? It can be shown that the mean of the sampling distribution of the 2SLS estimator in large samples is approximately

E(β̂_2SLS) − β ≈ [E(β̂_OLS) − β] / [E(F) − 1]

where β̂_OLS is the OLS estimator, E(β̂_OLS) − β is the bias in the OLS estimator, and E(F) is the expected value of the first-stage F-statistic. Note that the expression 1 / [E(F) − 1] is the asymptotic bias in β̂_2SLS relative to β̂_OLS. The larger (smaller) the F-statistic, the smaller (larger) the bias in β̂_2SLS relative to β̂_OLS. For example, if F = 2, then 1 / [E(F) − 1] = 1 / (2 − 1) = 1. In this case, the bias in β̂_2SLS is the same as the bias in β̂_OLS. If F = 3, then 1 / [E(F) − 1] = 1 / (3 − 1) = 1/2. In this case, the bias in β̂_2SLS is one-half (50%) of the bias in β̂_OLS. If F = 10, then 1 / [E(F) − 1] = 1 / (10 − 1) = 1/9. In this case, the bias in β̂_2SLS is one-ninth (just over 10%) of the bias in β̂_OLS. Many econometricians believe that a bias of about 10% or less is small enough to be acceptable in most applications.
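The arithmetic in the rule-of-thumb examples above reduces to a one-line formula:

```python
def relative_bias(F):
    """Approximate bias of 2SLS relative to OLS: 1 / (E(F) - 1)."""
    return 1.0 / (F - 1.0)

for F in (2, 3, 10):
    print(f"E(F) = {F:2d}: 2SLS bias is about {relative_bias(F):.0%} of the OLS bias")
```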

Checking Instrument Exogeneity

To test whether one or more instruments are correlated with the error term, you can test the overidentifying restrictions (if the equation is overidentified). This test is discussed below.

System Estimators

A system estimator can be used to estimate two or more identified equations in a simultaneous equations model together. Thus, a system estimator uses more information than a single equation estimator (e.g., contemporaneous correlation among the error terms across equations, cross-equation restrictions, etc.), and therefore will produce more precise estimates. We will consider 2 system estimators.

1. Three-stage least squares (3SLS) estimator
2. Iterated three-stage least squares (I3SLS) estimator

Three-Stage Least Squares (3SLS) Estimator

The 3SLS estimator involves the following 3-stage procedure.

1. Same as 2SLS.
2. Same as 2SLS.
3. Apply the SUR estimator.

Properties of the 3SLS Estimator

If the error term is correlated with one or more explanatory variables, then the 3SLS estimator is biased in small samples. However, it is consistent and asymptotically more efficient than single equation estimators. Thus, it has desirable large sample properties.

Iterated Three-Stage Least Squares (I3SLS) Estimator

The I3SLS estimator involves the following 3-stage procedure.

1. Same as 2SLS.
2. Same as 2SLS.
3. Apply the ISUR estimator.

Properties of the I3SLS Estimator

The I3SLS estimator has the same asymptotic properties as the 3SLS estimator. However, there is an ongoing debate about whether I3SLS or 3SLS produces better estimates when using small samples. A major advantage of the I3SLS estimator is that for a singular equation system it produces parameter estimates that are invariant to the equation dropped. (See the material on the ISUR estimator.) Econometricians seem to prefer the I3SLS estimator.

SPECIFICATION TESTING

Two important specification tests for simultaneous equations regression models are:

1. Test of exogeneity
2. Test of overidentifying restrictions

We will implement these tests using a single equation estimation procedure.

Test of Exogeneity

If you believe one or more right-hand side variables appearing in an equation may or may not be exogenous, then you can perform a formal test of exogeneity.

Notation

Designate the equation to be estimated and the identifying instruments as

Y = a + bX* + cX + ε,    Z = identifying instruments

where Y is the dependent (left-hand side) variable; X* is a vector of one or more right-hand side variables that you believe may or may not be endogenous; X is a vector of right-hand side variables you believe are exogenous; a is the intercept; b and c are vectors of slope coefficients attached to the variables in X* and X, respectively; Z is a vector of exogenous variables that are excluded from this equation, and therefore used as identifying instruments for the endogenous variable(s) in X*; and ε is the error term.

Hausman Test

The most often used test of exogeneity is the Hausman test. The Hausman test is based on the following methodology. Let X* be interpreted more generally as a vector that contains one or more variables that you believe may be correlated with the error term ε. The null and alternative hypotheses are as follows:

H0: X* and ε are not correlated
H1: X* and ε are correlated

To test the null hypothesis that X* and ε are not correlated, you proceed as follows.

1. Compare two estimators. One estimator should be consistent if the null hypothesis is true but inconsistent if the null hypothesis is false (e.g., the OLS estimator). The second estimator should be consistent regardless of whether the null hypothesis is true or false (e.g., the 2SLS estimator).
2. If the null hypothesis is true, then both estimators should produce similar estimates of the parameters of the equation. If the null hypothesis is false, then the two estimators should produce significantly different estimates of the parameters of the equation. Thus, to test the null hypothesis you test the equality of the estimates produced by the two estimators.
3. If the parameter estimates produced by the two estimators are significantly different, then you reject the null hypothesis and conclude that the sample provides evidence that X* is correlated with ε in the population. If the parameter estimates produced by the two estimators are not significantly different, then you accept the null hypothesis and conclude that X* is not correlated with ε in the population.

If you are testing whether the variable(s) in X* are endogenous, then you interpret the null and alternative hypotheses as follows.

H0: X* is exogenous (X* and ε are not correlated)
H1: X* is endogenous (X* and ε are correlated)

If the vector X* contains one variable, then you are testing whether a single right-hand side variable is exogenous. If the vector X* contains two or more variables, then you are testing whether two or more right-hand side variables are jointly exogenous.

Interpretation of the Hausman Test

If you reject the null hypothesis, then you have found evidence that X* is correlated with ε. You interpret this as evidence that X* is endogenous. However, there are other reasons why X* might be correlated with ε (e.g., X* is measured with error). Thus, you cannot conclude with certainty what causes the correlation between X* and ε. However, your interpretation is that you have found evidence that X* is endogenous.

Implementation of the Hausman Test

Implementing the Hausman test involves the following steps.

1. Regress each variable in X* on all variables in X and Z (all exogenous variables in the model) using the OLS estimator.
2. Save the residuals from each of these regressions. Denote this vector of residuals û. The residuals from each regression in step 1 form a "residual variable".
3. Estimate the following regression equation using the OLS estimator:

Y = a + bX* + cX + dû + v

where d denotes the vector of coefficients attached to the residual variables.

4. Test the following null and alternative hypotheses:

H0: d = 0 (X* is exogenous)
H1: d ≠ 0 (X* is endogenous)

5. If there is one variable in X*, and therefore one residual variable in û and one coefficient in d, then this hypothesis can be tested using a t-test. If there is more than one variable in X*, and therefore more than one residual variable in û and more than one coefficient in d, then this hypothesis can be tested using an F-test.

Test of Overidentifying Restrictions

It is possible to test the overidentifying restrictions for a single equation in a system of equations. Strictly speaking, it is not possible to test the identifying restrictions for an equation, because the equation must be identified for estimation of its parameters to be meaningful. When you test the overidentifying restrictions for an equation, you are testing whether the variables that you have excluded from the equation to overidentify it can be validly excluded, or whether at least one of these variables should be included in the equation. Therefore, you are testing the following null and alternative hypotheses:

H0: The variables excluded to overidentify the equation do not belong in the equation
H1: At least one variable excluded to overidentify the equation does belong in the equation

Notation

Designate the equation to be estimated, before the identifying instruments are excluded, as

Y = a + bX* + cX + dZ + ε

where all variables and parameters have been defined previously. Note that this is the equation before it is identified, and therefore the variables in the vector Z have not been excluded. The null and alternative hypotheses can be expressed as follows.

H0: d = 0 (The variables excluded to overidentify the equation do not belong in the equation)
H1: d ≠ 0 (At least one variable excluded to overidentify the equation belongs in the equation)

Lagrange Multiplier Test

The easiest way to test the null hypothesis that the overidentifying restrictions are correct is to use a Lagrange multiplier (LM) test. The test statistic and sampling distribution for this test are

LM = TR² ~ χ²(Z − X*)

where T is the sample size, R² is the unadjusted R-squared statistic from the second of two auxiliary regressions, and χ²(Z − X*) is the chi-square distribution with Z − X* degrees of freedom, where Z is the number of variables excluded from the equation and X* is the number of endogenous right-hand side variables in the equation (this difference is equal to the number of overidentifying restrictions).

Calculating the LM Test Statistic

To calculate the LM test statistic, you need to obtain the R² statistic from the second of two auxiliary regressions. This involves the following steps.

1. Estimate the following model using the 2SLS estimator:

Y = a + bX* + cX + ε

Use as instruments for X* all variables in the vectors X and Z.

2. Save the residuals from this regression. Denote the residual variable û.

3. Regress the residual variable, û, on all the variables in X and Z using the OLS estimator; that is, estimate the following equation using the OLS estimator:

û = a0 + b0X + c0Z + v

4. Use the unadjusted R² statistic from this regression to calculate the LM test statistic.

Notes about the Test of Overidentifying Restrictions

1. If you reject the null hypothesis, then you are rejecting the overidentifying restrictions. This casts doubt on the identifying restrictions, because the overidentifying restrictions cannot be separated from the identifying restrictions.
2. If you reject the overidentifying restrictions, the test gives you no guidance about what to do next. A test does not exist that allows you to determine which variable or variables in Z should not be excluded from the equation being estimated.

HYPOTHESIS TESTING

The small sample t-test and F-test cannot be used in the context of a simultaneous equations regression model. This is because, if the error term is correlated with one or more explanatory variables, we don't know the sampling distributions of the t-statistic and F-statistic. The following large sample tests can be used:

1. Asymptotic t-test

". ;pproximate 5'test 3. %ald test <. 3agrange multiplier test 1ote that because the ID, "434, 3434, and I3434 estimators do not produce maximum likelihood estimates, the likelihood ratio test cannot be used to test hypotheses.
