

0.5 Assignment 5: Regression


Complete this assignment using R. Due Thursday 30 January.
The .csv file for this assignment, barro.csv, is available at
http://vincentarelbundock.github.io/Rdatasets/datasets.html. There is also a documentation file.
This is the version of the Barro Growth Data used in Koenker and Machado (1999).
This is a regression data set consisting of 161 observations on determinants of cross-country
GDP growth rates. There are 13 covariates with dimnames corresponding to the original
Barro and Lee source. See http://www.nber.org/pub/barro.lee/. The first 71 observations
are on the period 1965-75, the remainder on 1975-85.
(The file is also available on D2L, but I would prefer that you go to the online source for practice.)

A useful command once you have downloaded a file is

> f <- file.choose()

This opens the file manager on your computer so you can search for the file you
want, and assigns its path to f. Typing f at the prompt will show you the path.

You can then open the file using

> data <- as.data.frame(read.csv(f))

It is a good idea to use summary(data) and head(data) to check to see what you
have.
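
As a minimal sketch of these checks, the snippet below writes a tiny made-up table to a temporary .csv file and reads it back, so it runs without downloading anything. The values and the names toy, tmp and loaded are illustrative assumptions, not part of the Barro data. It also shows why a column called X can appear after loading: when a .csv stores row names in an unnamed first column, read.csv names that column X.

```r
# Made-up values for illustration only; this is NOT the Barro data.
toy <- data.frame(y.net = c(0.01, 0.02, -0.01),
                  lgdp2 = c(7.1, 8.3, 6.9))

# Write to a temporary file; by default write.csv also stores the row names
# in an unnamed first column.
tmp <- tempfile(fileext = ".csv")
write.csv(toy, tmp)

# read.csv already returns a data frame.
loaded <- read.csv(tmp)
is.data.frame(loaded)

# Quick checks on what was loaded.
dim(loaded)        # number of rows and columns
colnames(loaded)   # the unnamed row-name column comes back as "X"
head(loaded)       # first few rows
summary(loaded)    # per-column summaries
```

With the real file, the same calls on data should report 161 rows if the download is complete.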

1. Setting up: First we want to give you a personalized dataset.

(a) Get a seed, s, for the sampling process


> s<- (your student number)%%42

(b) Using the seed, s, generate a list of observations to keep:


> set.seed(s); list<- sample(1:71, 55, replace=FALSE)

(c) If you want the first panel, with observations on 1975, select the first
71 observations:
> data <- data[1:71,]
Then check to see how many observations you have:
> length(data[,1])
To select the second panel (1985) instead, select observations 72 to 161:
> data <- data[72:161,]
(d) Now select the observations from your list:
> data <- data[list,]
Check the length. There should be 55 observations. No one else will have the
same set of observations.
2. Regress Annual Change Per Capita GDP on variables of your own choice. You can
use the first or the second panel of data, or both. You can cut the sample to low or
high income, or any other partition that you like. The key is that you are going to
come up with a theory and test it.
3. Write up your results. As usual, this is the main part of the exercise. The instruc-
tions are the same as for previous exercises. In this case you are choosing variables
to test the idea you have about what will affect growth. You have to specify one
or more hypotheses to test. (The Barro paper this is based on is available on D2L.)
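
Steps 1 and 2 can be sketched end to end as follows. This is only an illustration under stated assumptions: the student number is hypothetical, the data frame toy is simulated and stands in for the real barro data, and the choice of lgdp2 and Iy2 as regressors (with the idea that a higher investment share raises growth) is just one example of a theory you might test. The names student_number, toy, panel1, keep, my_data and fit are introduced here for the example.

```r
# Hypothetical student number; replace with your own.
student_number <- 1234567
s <- student_number %% 42          # seed between 0 and 41

# Stand-in for the barro data: 161 rows of SIMULATED values, not the real data.
set.seed(1)
toy <- data.frame(
  lgdp2 = rnorm(161, mean = 7.5, sd = 1),
  Iy2   = runif(161, min = 0.05, max = 0.35)
)
toy$y.net <- 0.02 - 0.005 * (toy$lgdp2 - 7.5) + 0.1 * toy$Iy2 +
  rnorm(161, sd = 0.01)

# Step 1: keep the first panel, then draw the personalized 55 rows.
panel1 <- toy[1:71, ]
set.seed(s)
keep <- sample(1:71, 55, replace = FALSE)
my_data <- panel1[keep, ]
nrow(my_data)                      # 55

# Step 2: regress growth on the chosen variables.
fit <- lm(y.net ~ lgdp2 + Iy2, data = my_data)
summary(fit)                       # coefficient table, R-squared, F statistic

# Row of the coefficient table used to judge the example hypothesis on Iy2:
# estimate, standard error, t value and p-value.
coef(summary(fit))["Iy2", ]
```

Using a separate name such as keep for the sampled indices avoids masking R's built-in list function, and nrow(my_data) gives the same count as length(my_data[,1]).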

Whenever I do a project, I start by setting up a script file. I always start the file with a
set of comments like this:

###############################
#Regression exercise #3 Economics 2466
#David Robinson
#
# Variables dataset barro
# [1] "X" "y.net" "lgdp2" "mse2"
# [5] "fse2" "fhe2" "mhe2" "lexp2"
# [9] "lintr2" "gedy2" "Iy2" "gcony2"
# [13] "lblakp2" "pol2" "ttrad2"
#
#################################

To get the list of variables, use the command colnames(data). I might add the full variable
names from the documentation:

A data frame containing 161 observations on 14 variables:


[,1] "Annual Change Per Capita GDP"
[,2] "Initial Per Capita GDP"
[,3] "Male Secondary Education"
[,4] "Female Secondary Education"
[,5] "Female Higher Education"
[,6] "Male Higher Education"
[,7] "Life Expectancy"
[,8] "Human Capital"
[,9] "Education/GDP"
[,10] "Investment/GDP"
[,11] "Public Consumption/GDP"
[,12] "Black Market Premium"
[,13] "Political Instability"
[,14] "Growth Rate Terms Trade"
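
One way to keep these labels handy in a script is a named character vector, sketched below. The pairing assumes that the 14 labels line up, in order, with the column names from "y.net" to "ttrad2", and that "X" is only a row index added when the .csv was written; check both against the documentation file before relying on them. The names var_names, var_labels and labels are introduced here for illustration.

```r
# Column names as printed by colnames(data), without the leading "X".
var_names <- c("y.net", "lgdp2", "mse2", "fse2", "fhe2", "mhe2", "lexp2",
               "lintr2", "gedy2", "Iy2", "gcony2", "lblakp2", "pol2", "ttrad2")

# Labels from the documentation, in the same order (assumed positional match).
var_labels <- c("Annual Change Per Capita GDP", "Initial Per Capita GDP",
                "Male Secondary Education", "Female Secondary Education",
                "Female Higher Education", "Male Higher Education",
                "Life Expectancy", "Human Capital", "Education/GDP",
                "Investment/GDP", "Public Consumption/GDP",
                "Black Market Premium", "Political Instability",
                "Growth Rate Terms Trade")

# Named vector: look up a label by its column name.
labels <- setNames(var_labels, var_names)
labels[["Iy2"]]
```

A lookup like labels[["Iy2"]] is convenient for axis titles and table headings in the write-up.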

Assignment 5: Marking
General structure of a good answer
INTRODUCTION SECTION
5% Introducing topic and approach of paper. Growth is a big issue....
(There could be a literature review section, but it is not asked for. Bonus 5%.)
THEORY SECTION
15% Stating a theory in words. Growth is caused by....
DATA SECTION
20% Describing data, correlations, means, bivariate plots
MODEL SECTION
10% Writing the model in equation form. What variables are controls?
10% Stating the hypothesis (and null hypothesis) in terms of the formal model
RESULTS SECTION
10% Presenting the regression results in a clear table.
20% Discussing the results, including whether they confirm or disconfirm your hypothesis.
CONCLUSIONS SECTION
10% Drawing conclusions about your hypotheses and suggesting further work
REFERENCES If any. Bonus 5%
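
For the MODEL SECTION, the equation and hypotheses might be written along the following lines. The regressors (initial GDP as a control, investment share as the variable of interest) and the one-sided alternative are illustrative assumptions only; your own model should follow from your own theory.

```latex
% Illustrative model: growth regressed on initial GDP (control) and investment share.
\[
  \text{y.net}_i = \beta_0 + \beta_1\,\text{lgdp2}_i + \beta_2\,\text{Iy2}_i + \varepsilon_i
\]
% Hypotheses stated in terms of the model's coefficients.
\[
  H_0 : \beta_2 = 0
  \qquad \text{against} \qquad
  H_1 : \beta_2 > 0
\]
```

Here i indexes countries, and the null hypothesis says that the investment share has no effect on growth once initial GDP is held fixed.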
