
Regression Project

Pick something you would like to "predict" (Y - the dependent variable)


It must not be time series data (measured over time). Why? Too many external factors affect the data (e.g., Hurricane Katrina and 9/11).
You will need exactly 30 entries (teams, people, companies, cities, countries, states, etc.).
Pick two or more "predictors" (independent variables, the x's).
You will walk through your analysis with ONLY ONE independent variable. If you gather information on several x's,
you will not have to start over if one or two x variables are not significant.

Examples:
Is there a relationship between:

Crime per 10,000 and
    Education level (percentage of pop.)
    Housing cost (average)
    Single parents (percent of pop.)

Sales (dollars are a measurement - OK) and
    Advertising
    Price

Baseball winning percent per season and
    RBIs (avg. per season)
    Team batting average
    On-base percentage

Poverty per 10,000 and
    Single parent status
    Family size
    Education level

Football winning percent per season and
    Rushing yards
    Passing yards
    Turnovers

Unemployment and
    Job satisfaction
    GNP

Begin your spreadsheet somewhat like this example (the first column can be cities, people, states, teams, countries, etc.):

Cities    Y    X1    X2
1
2
…
30
Never use a subset of Y for X. For example, if Y is fatal car accidents, X must not be alcohol-related fatal car accidents.
Provide a copy of the slides prior to presenting (4 slides per page is fine). ALL DATA MUST BE LEGIBLE.
Please staple your printed slides together; do not hand them in loose or in a folder.
The project is to educate the class on regression analysis as if they had no prior understanding of it.
What does each term (each Excel cell in the output) tell us?
What does your specific data tell us?

Your final regression x variable must be significant, with a p-value < alpha.
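
The significance check can be illustrated outside Excel. The following is a minimal Python sketch, not the method required for the project: it assumes a two-tailed test at alpha = 0.05 and compares the slope's t statistic to a critical t value, which is equivalent to checking p-value < alpha. The function name slope_t_statistic and the six data points are made up for illustration. With 30 entries the degrees of freedom would be 28 and the critical value about 2.048.

```python
import math

def slope_t_statistic(x, y):
    """Return (slope, t statistic) for the simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares; the residuals have n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return slope, slope / std_err

# Made-up illustrative numbers, not real data.
x = [1, 2, 3, 4, 5, 6]
y = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
slope, t_stat = slope_t_statistic(x, y)

# Two-tailed critical t for alpha = 0.05 and n - 2 = 4 degrees of freedom.
T_CRITICAL = 2.776
significant = abs(t_stat) > T_CRITICAL
print(f"slope = {slope:.2f}, t = {t_stat:.1f}, significant: {significant}")
```

In the Excel regression output, the same decision is made by reading the p-value for the x variable and comparing it to alpha.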
Provide an introductory slide with your group number and "who did what" to prepare the presentation and who
is presenting what portion of the presentation. Everyone must present.
Next, what is your "story"? Why did you pick the data you picked?
Regress this data in Excel.
Show the y hat equation and explain its meaning (interpret it).
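
As a rough illustration of what the y hat equation is, here is a minimal Python sketch of a least-squares fit with one x variable. The function name fit_line is hypothetical, and the numbers are made up and chosen to lie exactly on a line so the result is easy to check by hand. Excel's regression output should report the same intercept and slope for the same data.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y hat = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up illustrative numbers that satisfy y = 1 + 2x exactly.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
b0, b1 = fit_line(x, y)
print(f"y hat = {b0:.2f} + {b1:.2f} x")  # prints "y hat = 1.00 + 2.00 x"

# Predicted y for a new x value of 6.
y_hat_at_6 = b0 + b1 * 6
```

Interpreting the result: the slope b1 is the change in predicted y for each one-unit increase in x, and the intercept b0 is the predicted y when x is 0.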
Make sure to include your data source(s) as references
Only use the number of decimal points that are necessary
The web sites for the Census Bureau, Bureau of Labor Statistics and The Statistical Abstract of the United States
all contain a wealth of information
With your presentation slides, please turn in a score sheet for any group member you score less than 20. If you score
every group member a 20, you do not need to turn in group score sheets.
Ten points are assigned to originality and creativity
Credit will be given for covering things over and above what I specifically ask for
There is no dress code for the presentations
Please put all information in the slides that you wish to receive credit for, not merely verbalize it.
Presentations done thoroughly are normally in the 12 - 15 minute range.
Do not use big numbers. For example, make 3,456,725 into 3.46 mil; do the same with 5- and 6-digit numbers.
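
The rounding can be sketched as a small Python helper. This is only an illustration: the function name shorten, the "thousand" label for 5- and 6-digit numbers, and the choice of two decimals are assumptions, not something the assignment specifies.

```python
def shorten(value):
    """Express a large number in thousands or millions, to two decimals."""
    if abs(value) >= 1_000_000:
        return f"{value / 1_000_000:.2f} mil"
    if abs(value) >= 10_000:
        # Assumed convention for 5- and 6-digit numbers.
        return f"{value / 1_000:.2f} thousand"
    return str(value)

print(shorten(3456725))  # prints "3.46 mil"
```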
You must show HOW to receive credit: HOW you went from 3 x's to 1 x, HOW you used rand() to get a sample of 30, etc.
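
One common way to use rand() for sampling is to put a random number next to every candidate entry, sort by that number, and keep the first 30. The Python sketch below mirrors that idea under stated assumptions: the function name sample_by_random_key, the list of 100 placeholder cities, and the seed parameter are all hypothetical (Excel's RAND() takes no seed; the seed here only makes the example repeatable).

```python
import random

def sample_by_random_key(items, k, seed=None):
    """Attach a random number to each item, sort by it, keep the first k."""
    rng = random.Random(seed)
    keyed = [(rng.random(), item) for item in items]  # like a helper column of =RAND()
    keyed.sort(key=lambda pair: pair[0])              # like sorting by that column
    return [item for _, item in keyed[:k]]

# Hypothetical list of 100 candidate entries.
population = [f"City {i}" for i in range(1, 101)]
sample = sample_by_random_key(population, 30, seed=1)
print(len(sample), "entries selected")
```

Because every entry gets its own random number, each one has the same chance of landing in the first 30 rows, and no entry can be picked twice.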
The details of how to explain each term, what to compare the numbers to, etc. will be given in class, so please attend.
