
Abstract

This report is not meant for scientific purposes; it is purely a tool to showcase what I learned in my Stat 1510 class. As a class project, our instructor asked us to take a sample from a class survey with a population of 2627 and, once we had a good representation of that population, to analyze the data, display it graphically, and answer various questions about it. The data in this report suggest several things. Gender does not appear to have much of an effect on income, but it does seem to have an impact on sizes and weights. Handedness does not appear to be related to how much water a person consumes in a 48-hour period. As for political party, all parties largely agree on President Obama being re-elected and on the death penalty, but there is a considerable difference between the parties on change in affiliation and the health care bill. For hair and eye color, brown eyes with black or brown hair dominate, and blue eyes have the widest range of hair colors.

Introduction

As a Stat 1510 project, we were asked to take a sample from a class survey that has a population of 2627. This report is based on a simple random sample of 25 sets of responses from the class survey data. After obtaining a good representation of the population, we were asked to analyze different portions of the data to show our comprehension of the material covered in class. In doing this we were asked to create appropriate graphical displays of the data and then use the data to answer various questions posed by the instructor. This report is not meant for scientific purposes; it is purely a tool to showcase what I learned this semester.

Materials and Methods

The graphs were created from the class survey data using a TI-84 Plus calculator. The first set of data comes from the survey questions about gender and annual income. Just by looking at the graphs, it appears that males make more money per year than females. The female data is not completely representative of the females because two female respondents did not report an income. In order to graph the data I have, I chose to record the missing income values as zero.

Histogram of Male Income Data

This histogram of the male income data shows that the data is skewed right.

Histogram of the Female Income Data

This histogram shows that the data for the annual female income is skewed right.

5 Number Summary for Male and Female Income Data

          Min.   Q1     M       Q3      Max.
Males     0      8446   57500   75500   11500
Females                 20800   62500   110000

Box and Whisker Plot of Male and Female Income Data

The box and whisker plot on top is for the male income data and the bottom one is for the female income data. We can see that both sets of data appear to be skewed right.

The next set of data relates the survey question asking how many ounces of water were drunk in the two days prior to taking the survey to the question asking which hand is the respondent's dominant hand. This data should give us an idea of which handedness group drinks more water in a 48-hour period.

Histogram of Right Handed and Water Consumed Data

This histogram shows that the data for the amount of water a right-handed person consumes is skewed right.

Histogram of Left Handed and Water Consumed

This histogram shows that the data for the amount of water consumed by a left handed person is uniform.

Histogram of Ambidextrous and Amount of Water Consumed

This histogram shows that the data for ambidextrous respondents and the amount of water consumed is uniform.

5 Number Summary for Handedness and Amount of Water Consumed

               Min.   Q1   M    Q3    Max.
Right          8      29   48   72    210
Left           52     52   64   98    98
Ambidextrous   26     26   64   100   100
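As a rough cross-check on the calculator, a 5 number summary can be computed with a few lines of Python. This is only a sketch: it assumes the quartiles are the medians of the lower and upper halves of the sorted data, with the middle value left out of both halves when the count is odd, which is how I understand the calculator's 1-Var Stats to work. The function name and the sample values are made up for illustration and are not the survey data.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = data[:half]          # values below the median position
    upper = data[half + n % 2:]  # skips the middle value when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Hypothetical water intake values in ounces (not the survey responses).
ounces = [8, 16, 32, 48, 64, 64, 72, 100]
print(five_number_summary(ounces))
```

Other software may use a different quartile rule, so small differences in Q1 and Q3 from the calculator's output would not necessarily mean an error.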

Box and Whisker Plot of Handedness and Amount of Water Consumed

The top box and whisker plot shows that the data for right-handed respondents is skewed right. The middle plot is for the left-handed data and the bottom plot is for the ambidextrous data; both show their data as relatively uniform.

The next set of data looks at the non-registered voters and whether or not they think Obama will not be re-elected. This data should give us an idea of whether the fact that these people do not vote has any impact on whether Obama gets re-elected.

Histogram of Non Registered Voters and Obama Not being Re-elected

This histogram shows that the data for those who believe Obama will not be re-elected is relatively uniform.

Non Registered Voters and Obama Not Being Re-elected

Categories   Frequency
Yes          8 (53.33%)
No           7 (46.67%)

This data represents the political parties that are in favor of re-electing Obama. It can give us an idea of which party favors Obama and his term in office.

Bar Graph of Political Party and Favor of Obama Re-election

This bar graph shows that the Democrats are more in favor of Obama being re-elected.

Political Parties and Favor of Obama Being Re-elected

Categories     Frequency
Democrats      4 (66.67%)
Republicans    1 (16.67%)
Independents   1 (16.67%)
Others         0 (0%)
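The percentages in a frequency table like this one are each count divided by the total, times 100. A minimal Python sketch of that arithmetic, using the counts from the table above (the function name is my own, not something from the class materials):

```python
def relative_frequencies(counts):
    """Convert category counts to percentages rounded to two decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# Counts of respondents in favor of re-election, by party.
favor_reelection = {"Democrats": 4, "Republicans": 1, "Independents": 1, "Others": 0}
print(relative_frequencies(favor_reelection))
```

Rounding each percentage separately means the column may not add up to exactly 100.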

This next set of data represents the political parties that are in favor of the health care bill. By looking at this data we can get a general idea of which political parties favor the health care bill.

Bar Graph of Political Party and In Favor of Health Care Bill

This bar graph shows that the Democrats favor the health care bill.

Political Parties and Favor of The Health Care Bill

Categories     Frequency
Democrats      7 (77.78%)
Republicans    1 (11.11%)
Independents   1 (11.11%)
Other          0 (0%)

This next data represents the political parties and how they compare in their support for the death penalty. By looking at this data we should be able to decide which political party favors the death penalty more.

Bar Graph of Political Party and Favor of Death Penalty

The data in this bar graph appears to be skewed right and somewhat uniform.

Political Party and Favor of The Death Penalty

Categories     Frequency
Democrats      7 (38.89%)
Republicans    7 (38.89%)
Independents   2 (11.11%)
Others         2 (11.11%)

This next data represents the political parties and the respondents who answered "not applicable" to the survey question about change in affiliation. This data is meant to show that those who belong to the various political parties do not feel the need to change party affiliation.

Bar Graph of Political Party and Not Applicable for Change in Affiliation

This bar graph shows that the data for political party and not applicable for change in affiliation is skewed right.

Political Party and Not Applicable for Change in Party Affiliation

Categories     Frequency
Democrats      9 (40.91%)
Republicans    8 (36.36%)
Independents   3 (13.64%)
Others         2 (9.09%)

The next few sets of data have to do with hair and eye color. By looking at and comparing these sets of data we should be able to figure out which hair color predominantly goes with which eye color.

Bar Graph of Brown Eyes and Hair Color

This bar graph shows that for brown eyes the data is uniform.

Brown Eyes and Hair Color

Categories   Frequency
Brown Hair   6 (50%)
Black Hair   6 (50%)
Red Hair     0 (0%)
Grey Hair    0 (0%)
Blond Hair   0 (0%)

Bar Graph of Black Eyes and Hair Color

The bar graph for black eyes and hair color shows the data is skewed right, with black being the only hair color.

Black Eyes and Hair Color

Categories   Frequency
Brown Hair   0 (0%)
Black Hair   2 (100%)
Red Hair     0 (0%)
Grey Hair    0 (0%)
Blond Hair   0 (0%)

Bar Graph of Blue Eyes and Hair Color

The bar graph for blue eyes and hair color shows the data is approximately uniform.

Blue Eyes and Hair Color

Categories   Frequency
Brown Hair   2 (28.57%)
Black Hair   0 (0%)
Red Hair     1 (14.29%)
Grey Hair    1 (14.29%)
Blond Hair   3 (42.86%)

Bar Graph of Hazel Eyes and Hair Color

The bar graph for hazel eyes and hair color shows that the data is skewed right.

Hazel Eyes and Hair Color

Categories   Frequency
Brown Hair   2 (66.67%)
Black Hair   1 (33.33%)
Red Hair     0 (0%)
Grey Hair    0 (0%)
Blond Hair   0 (0%)

Bar Graph of Other Eyes and Hair Color

This bar graph shows that the data for other eye colors and hair color is skewed right, with brown being the only hair color.

Other Eyes and Hair Color

Categories   Frequency
Brown Hair   2 (100%)
Black Hair   0 (0%)
Red Hair     0 (0%)
Grey Hair    0 (0%)
Blond Hair   0 (0%)

This next data compares male weights and female weights. By looking at the appropriate graphs we should get an idea of who weighs more on average.

Histogram of Male Weight

This histogram shows that the male weight data is approximately bell shaped.

Histogram of Female Weight

This histogram shows the female weight data is bimodal and symmetric.

5 Number Summary for Male and Female Weight

         Min.   Q1      M       Q3      Max.
Male     138    162.5   187.5   207.5   270
Female   115    130     140     160     190

Box and Whisker Plot of Male and Female Weight

The box and whisker plot on top is for the male weights and the one on the bottom is for the female weights. By the looks of these plots, both sets of data appear to be skewed right.

This next data lets us compare male height with female height and get an idea of which is taller on average.

Histogram of Male Height

This histogram shows the data for male height to be bell shaped and symmetric.

5 Number Summary for Male and Female Height

         Min.   Q1     M    Q3      Max.
Male     65     67.5   69   70.25   74
Female   60     62     64   66      70

Box and Whisker Plot of Male and Female Height

The top box and whisker plot shows the male data and the bottom shows the female data. Both sets of data appear to be skewed right.

This next set of data compares male and female ring sizes. It may not accurately represent either group, because four male respondents and three female respondents answered "unknown" for their ring size. Since "unknown" is not a numerical value, I used a zero in its place.

Histogram of Male Ring Size

This histogram shows the data for male ring size is fairly uniform.

Histogram for Female Ring Size

This histogram shows the data for female ring size is skewed left.

5 Number Summary for Male and Female Ring Sizes

         Min.   Q1    M     Q3    Max.
Male     0      0     7.5   8     9
Female   0      1.5   6     6.5   7

Box and Whisker Plot for Male and Female Ring Size

The top box and whisker plot shows the male ring size data is fairly uniform. The bottom box and whisker plot shows the female data appears to be skewed left.

This last bit of data compares male and female shoe sizes.

Histogram of Male Shoe Size

This histogram shows that the data for male shoe size is approximately uniform.

Histogram of Female Shoe Size

This histogram shows that the female shoe size data is bell shaped symmetric.

5 Number Summary for Male and Female Shoe Size

         Min.   Q1     M    Q3     Max.
Male     9      9.25   10   11.5   12
Female   6.5    7.75   8    9      13

Box and Whisker Plot for Male and Female Shoe Size

The top box and whisker plot shows the data for male shoe size is approximately uniform. The bottom box and whisker plot shows the female shoe size data is skewed right.

Results

By looking at the data for income based on gender, it appears that males make a little more money each year than females. Both sets of data are skewed right, but the 5 number summary, with a median of 57500 for the males compared to a median of 20800 for the females, suggests males make more money on average. However, we did not find statistical evidence (p-value 0.44589) that one gender makes more money than the other.

In this sample the right-handed respondents outnumber the left-handed and ambidextrous respondents. Based on the medians for the amount of water drunk in the two days prior to the survey, it appears that the ambidextrous and left-handed respondents drank more, with a median of 64, while the right-handed respondents had a median of 48. We did not find enough evidence (p-value 0.6103) to suggest that respondents of one handedness consume more water than the others.

Since the data for the non-registered voters and President Obama not being re-elected is nominal, there is no need to do any graphs other than the histogram. In this instance it appears that President Obama's chance of re-election is not being hindered by the lack of voters.

The data for political party and favor of President Obama's re-election appears to be clearly dominated by the Democratic party. Since it is nominal data, we do not need to do anything other than create the bar graph to see that the data is skewed toward the Democrats being in favor of President Obama. We did not find any statistical evidence (p-value 0.5315), however, to suggest that one party favors President Obama's re-election any more than the others.

We can tell just by looking at the bar graph for political party and favor of the health care bill that the Democrats favor the bill more than the other political parties. Just by looking at the data, it seems as if the Democrats are pretty much the only party in favor of the health care bill. We did find enough statistical evidence (p-value 0.0326) to suggest one political party may favor the health care bill over the others.

Based on the bar graph for political party and favor of the death penalty, the data appears to be both skewed right and relatively uniform. Both the Democrats and Republicans seem to be highly in favor of the death penalty, whereas very few of the Independents and Others seem to be in favor of it. We did not find enough statistical evidence (p-value 0.7630) to suggest one party favors the death penalty any more than the others.

By looking at the bar graph for political party and not applicable for change in affiliation, we can see that the data is skewed right. The appearance of this data suggests that mainly the Independents and Others bother with changing party affiliation. The Democrats and Republicans tend to stay affiliated with the Democratic and Republican parties. We found enough statistical evidence (p-value 0.001) to suggest that certain parties change their affiliation while others do not.

By looking at all the data for hair color we can see a couple of things. First, brown eyes with brown or black hair dominate the other combinations of eye color and hair color. Second, blue eyes appear to have a wider range of hair colors associated with them.

By looking at all the data comparing the males and females we can draw quite a few conclusions. We see that on average males are larger than females in pretty much everything.

Discussion/Conclusion

Appendices

To obtain my sample of 25 I took a simple random sample using my calculator. The command I used was math>prb>randint(2, 2628, 35). I started at the number 2 and stopped at the number 2628 because on the spreadsheet position 1 held the category names and not actual data. Here are the samples I used: 257, 382, 403, 715, 916, 985, 1021, 1117, 1212, 1215, 1370, 1688, 1792, 1826, 2083, 2084, 2185, 2210, 2234, 2239, 2366, 2385, 2402, 2489.

To obtain the 5 number summaries I used the calculator commands stat>calc>1-var stats>input list.

Here are the questions we were to address and the methods I used to address them:

1. Is there a relationship between an appropriate combination of a person's height, weight, ring size, and shoe size? Is there enough evidence to suggest men are typically taller than women?

By looking at the normal plots we see that both plots are normal, so a 2-sample F-test needs to be done to determine whether or not to pool. The p-value for the 2-sample F-test is 0.6624, so we fail to reject the hypothesis of equal variances, which means we pool the variances.

Normal Plot for Male Height

This normal plot shows the male height data is relatively normal.

Normal Plot for Female Height

This normal plot shows the female height data is relatively normal. To decide whether or not to pool, the 2-sample F-test was used. Since the p-value of 0.6624 is larger than the alpha value of 0.05, we will pool the variances.

Hypothesis Statement for Male and Female Height

This is the hypothesis statement and calculator commands I used for the male and female height data. We will reject the null hypothesis. The p-value of 0.0000 gives us enough evidence to suggest males are typically taller than females.
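To show what pooling the variances means for this kind of test, here is a minimal Python sketch of the pooled two-sample t statistic. It is not the calculator's routine; the function name is my own and the heights below are invented for illustration, not taken from the survey sample.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical heights in inches (not the survey data).
male_heights = [65, 68, 69, 70, 74]
female_heights = [60, 62, 64, 66, 70]
t_stat, df = pooled_t(male_heights, female_heights)
print(t_stat, df)
```

The p-value would then come from a t distribution with the returned degrees of freedom, which the calculator looks up automatically.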

2. Is there a difference in gross annual income based on gender?

By looking at the normal plots for the male and female gross annual income we can clearly see that this data is not normal. Because it is not normal, the Wilcoxon Rank-Sum test was used.

Normal Plot for the Male Gross Annual Income

This is the normal plot for the male gross annual income. It is clearly not normal.

Normal Plot for the Female Gross Annual Income

This is the normal plot for the female gross annual income. We can clearly see the data is not normal.

Hypothesis Statement for Gross Annual Income Based on Gender

Here are the hypothesis statement and the calculator commands I used to get the p-value. We will fail to reject the null hypothesis (p-value 0.44589). There is not enough evidence to suggest a higher annual gross income based on gender.

3. Is there a relationship between political party and:

a.) If the respondent feels President Obama will be re-elected?

To determine the dependency of political party and President Obama being re-elected, the test for independence was used. This test was done by putting the observed data values for the political parties and how their responses correspond to whether or not they think President Obama will be re-elected into matrix A. The calculator automatically calculated the expected values upon completion of the Chi-squared test and placed them in matrix B.

Matrix A (Observed) and Hypothesis Statement for Political Party and President Obama Re-elected

Here we see on top the values I put into matrix A in my calculator and on the bottom the hypothesis statement along with the calculator command I used to get the p-value.

Matrix B (Expected) for Political Party and President Obama Re-elected

These are the values we would have expected to see from the different political parties for the question asked. We will fail to reject the null hypothesis (p-value 0.5315). There is not enough evidence to suggest political party and President Obama being re-elected are dependent.

b.) If the respondent is in favor of the health care bill passed?

To determine dependency for political party and the health care bill being passed, the test for independence was used. This test was done by putting the data values for the political parties and how they responded about the health care bill into matrix A. The expected values were automatically calculated and placed into matrix B upon completion of the Chi-squared test.

Matrix A (Observed) and Hypothesis Statement for Political Party and Health Care Bill

This shows the observed data values for political party and their responses to the question regarding the health care bill being passed. Also pictured here is the hypothesis statement along with the calculator command I used to calculate the p-value.

Matrix B (Expected) for Political Party and Health Care Bill

These are the expected values that my calculator gave me for the question about political party and the health care bill. We will reject the null hypothesis (p-value 0.0326). There is enough evidence to suggest political party and being in favor of the health care bill are dependent.

c.) If the respondent is in favor of the death penalty?

To determine dependency for political party and the death penalty, the test for independence was used. This test was done by putting the data values for the political parties and how they responded concerning the death penalty into matrix A. The expected values were automatically calculated upon completion of the Chi-squared test and placed in matrix B.

Matrix A (Observed) and Hypothesis Statement for Political Party and The Death Penalty

This shows the data values for how the political parties responded regarding the issue of the death penalty. Also shown here are the hypothesis statement and the calculator commands I used to calculate the p-value.

Matrix B (Expected) for Political Party and The Death Penalty

Here we see the expected values the calculator gave me concerning political parties and the death penalty.
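For readers without the calculator, the expected values in matrix B follow a simple rule: each expected count is the row total times the column total divided by the grand total. The Python sketch below illustrates that rule and the Chi-squared statistic built from it. The function names are my own, and the observed counts are invented for illustration; they are not the matrix A values from this report.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell.

    Assumes every row and column total is positive, so no expected count is zero.
    """
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts: rows are two parties, columns are "in favor" and "not in favor".
matrix_a = [[10, 20], [30, 40]]
print(expected_counts(matrix_a))
print(chi_square_statistic(matrix_a))
```

The p-value comes from comparing this statistic to a Chi-squared distribution with (rows - 1) times (columns - 1) degrees of freedom, which the calculator handles on its own.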

We will fail to reject the null hypothesis (p-value 0.7630). There is not enough evidence to suggest political party and favor of the death penalty are dependent.

4.) Is there a relationship between handedness and:

a.) In favor of the death penalty?

To determine the dependency between handedness and being in favor of the death penalty, the test of independence was used. The test was done by putting the data values for handedness and how each group responded regarding the death penalty into matrix A. The calculator automatically calculated the expected values and put them into matrix B upon completion of the Chi-squared test.

Matrix A (Observed) and Hypothesis Statement for Handedness and The Death Penalty

The top portion shows the observed data values for handedness and the death penalty. The bottom shows the hypothesis statement and the calculator commands I used to obtain the p-value.

Matrix B (Expected) for Handedness and The Death Penalty

Here we see the expected values the calculator gave for handedness and the death penalty. We will fail to reject the null hypothesis (p-value 0.9218). There is not enough evidence to suggest handedness and favor of the death penalty are dependent.

b.) Amount of water consumed?

To determine a relationship between handedness and the amount of water consumed I used the Kruskal-Wallis test. I used the Kruskal-Wallis test instead of ANOVA for this data because the normal plot for the right-handed respondents showed data points that were not normal. The left-handed and ambidextrous data appeared to be at least approximately normal, but all three groups need to be normal for ANOVA to be appropriate.

Normal Plot for Right Handedness and Amount of Water Consumed

By looking at this normal plot for right handedness and amount of water consumed it is plain to see the data is not normal.

Normal Plot of Left Handedness and Amount of Water Consumed

This normal plot shows the data for left handedness and amount of water consumed is approximately normal. The point in the middle might be of some concern if there had been more data points to consider.

Normal Plot for Ambidextrous Handedness and Amount of Water Consumed

This normal plot shows the data for ambidextrous handedness and amount of water consumed is clearly normal.

Hypothesis Statement for Handedness and Amount of Water Consumed

Here is the hypothesis statement for handedness and amount of water consumed along with the calculator commands I used to obtain the p-value. We will fail to reject the null hypothesis (p-value 0.6103). There is not enough evidence to suggest a relationship between handedness and the amount of water consumed.

5.) Form your own question, which encompasses the question regarding change in affiliation with political parties.

Is there a relationship between political party and not applicable for change in affiliation? To determine if there is a relationship between political party and change in affiliation I used the goodness of fit test.

Grid and Hypothesis Statement for Political Party and Change in Affiliation

This grid shows the data for political party and change in affiliation that was used to perform the goodness of fit test. Here you can also see the hypothesis statement and the calculator commands I used to get the p-value. We will reject the null hypothesis (p-value 0.001). There is enough evidence to suggest a relationship between political party and change in affiliation.

6.) Form a minimum of at least one other question based on the data that was collected.

Is there enough evidence to suggest a relationship between brown eyes and blue eyes? To determine the relationship between brown eyes and blue eyes I used the Wilcoxon Rank-Sum test because neither normal plot was normal.

Normal Plot for Brown Eyes

This normal plot of brown eyes clearly shows a violation of normality in the data.

Normal Plot of Blue Eyes

This normal plot of blue eyes shows only slight abnormality.

Hypothesis Statement for Brown Eyes and Blue Eyes

This is the hypothesis statement for the brown eye and blue eye data, along with the calculator commands used to get the p-value. We will fail to reject the null hypothesis. There is not enough evidence to suggest a relationship between brown eyes and blue eyes.
