Jacob Campbell

Math 1040: Final Project

Term Project

The following is a compilation of the graphs and computations I have learned to produce in Math 1040 (Statistics). There were five data sets to choose from, and the one I will use is body measurements. The project has several steps. First, I gather my categorical data and build charts for it. Next, I compile charts and summary information for my quantitative data. I then choose a confidence level and determine the margin of error, so I can create a confidence interval for the population proportion from each of my two categorical samples. The quantitative data are then used to create confidence intervals for the population mean and standard deviation. The last step is to select a level of significance and complete the hypothesis tests using both samples of each kind of data. These are the male and female totals for the population data set.

What I have to do now is draw two different samples from the categorical variable I have selected (male and female), and each has to be random. The first sampling method I am going to use is stratified sampling. First I had to decide my sample size; the lower limit was 35 and I chose 50. In stratified sampling you must choose some values from every stratum (the strata being male and female). So, with a sample size of 50, I needed to take 25 from each stratum. To choose the 25 males and 25 females I fell back on simple random sampling within each stratum, using randomizer.org, which gave me 25 males and 25 females.
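The stratified draw described above can be sketched in a few lines of Python. This is only an illustration: the row labels and stratum sizes are hypothetical placeholders (not the project's actual counts), and the seeded generator stands in for randomizer.org.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of `per_stratum` rows from every stratum."""
    rng = random.Random(seed)  # seeded so the draw can be repeated
    return {name: rng.sample(rows, per_stratum) for name, rows in strata.items()}

# Hypothetical row labels; the stratum sizes are placeholders only.
strata = {
    "male": [f"M{i}" for i in range(1, 201)],
    "female": [f"F{i}" for i in range(1, 214)],
}

picked = stratified_sample(strata, per_stratum=25, seed=1040)
print({name: len(rows) for name, rows in picked.items()})
```

Because `random.sample` draws without replacement, no row can be chosen twice within a stratum.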

The next sampling method I used for my categorical data was systematic sampling. Once again I had to choose a sample size, and this time I chose 38. I then went to randomizer.org and had it pick a random number from the whole data set. The number it came up with was 288, so I took the 288th data value (which was male) and from there took every 5th value, going from 288 to 283 to 278 and so on, until I had a sample of 38. I then graphed my results in a pie chart and a Pareto chart.
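The systematic selection above amounts to generating positions at a fixed step from the random starting point. The sketch below assumes the rows are numbered from 1 and that the data set has 508 positions (the range used later in the project); the function name is my own.

```python
def systematic_positions(start, step, size, population_size):
    """Return `size` 1-based row positions, moving `step` rows at a time from `start`."""
    positions = [start + i * step for i in range(size)]
    # Guard against walking off either end of the data set.
    if min(positions) < 1 or max(positions) > population_size:
        raise ValueError("selection runs outside the data set")
    return positions

# Start at row 288 and count down by 5, as described in the text.
positions = systematic_positions(start=288, step=-5, size=38, population_size=508)
print(positions[:3], "...", positions[-1])
```

With a start of 288 and a step of -5, the 38th position is 103, so the selection stays inside the data set without wrapping around.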

I was then asked to compare the two categorical samples to each other and then compare those results to the entire population. As the graphs show, there is an obvious difference between the two sampling methods. The systematic sample has 9 females and 29 males, whereas the stratified sample is a 50/50 split with 25 males and 25 females. The reason is that the stratified method draws a fixed number from each stratum, whereas the systematic method draws from the population as a whole. Neither method gave an accurate picture of the entire population, because in the original data there are 13 more women than men.

The next step in this project moves on to quantitative data. I must choose one quantitative variable from the body measurements data. The variable I chose was weight.
Summary statistics for population data:

Column   Mean        Std. dev.   Median   Min   Max     Q1     Q3
Weight   69.147535   13.345762   68.2     42    116.4   58.4   78.9

Then I was asked to take two samples from the population and make a five-number summary, a frequency histogram, and a box plot for each sample. To do this I had to choose two more sampling methods. The first I chose was simple random sampling. Before choosing the data I had to determine a sample size, and I chose 40. I then went to randomizer.org and entered the information it asked for (how many sets of numbers, how many numbers in each set, and the range). The website then returned 40 random numbers between 1 and 508, which gave me my first sample.
Summary statistics for Sample #1:

Column             Mean    Std. dev.   Median   Min    Max     Q1      Q3
Weight Sample #1   68.88   13.665194   67.3     45.9   104.1   59.05   76.4
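A five-number summary like the one above can be computed with Python's standard library. This is a minimal sketch: the weights are made-up illustrative values (not the project's sample), and the "inclusive" quartile method is an assumption, since statistics packages differ in how they interpolate quartiles and may give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values, method="inclusive"):
    """Return the min, Q1, median, Q3 and max of a list of numbers."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method=method)
    return {"min": ordered[0], "q1": q1, "median": median,
            "q3": q3, "max": ordered[-1]}

# Illustrative values only; not taken from the body measurements data.
weights = [50.0, 55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0]
summary = five_number_summary(weights)
print(summary)
```

These five values are exactly what a box plot draws: the whisker ends, the two edges of the box, and the line inside it.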

The histogram is slightly skewed right but for the most part is approximately normally distributed, as its bell shape shows. This means the frequencies start low, rise to a peak near the middle of the data, and then fall off again. The box plot shows the minimum and maximum values, represented by the ends of the lines. The box spans the first quartile (Q1) to the third quartile (Q3), with the median represented by the line in the middle of the box.

The second sampling method I used for the weight data was systematic sampling. First I decided my sample size was going to be 36. I then called my mother and asked her to pick a number off the top of her head between 1 and 508. She chose 2. So, to sample systematically, I took the 2nd value and every 5th value after it until I had a sample of 36. This gave me my second sample of weight data.
Summary statistics for Sample #2:

Column             Mean        Std. dev.   Median   Min    Max    Q1   Q3
Weight Sample #2   78.022222   8.0823657   78.2     61.4   93.8   72   84.75

The shape of the histogram is now skewed left, which means most of the data sit toward the right side of the x-axis, with a tail of shorter bars on the left. The box plot reads the same way as the last one. The minimum and maximum of the data are represented by the ends of the lines on the outside. The box spans the first quartile (Q1) to the third quartile (Q3), with the median represented by the line in the middle.

All three of the histograms are different. The population histogram is skewed right, the systematic histogram is skewed left, and the simple random histogram is approximately normally distributed. This shows that a good sample size and an effective sampling method are necessary for a sample to represent the data accurately. All the box plots are similar in style. They have different numbers, but the structure is the same: Q1 and Q3 sit in the lower half of the range between the minimum and maximum, and the median is about in the center of the box.

The next objective consisted of three parts. The first was to choose one value of my categorical variable (I chose males) and create a confidence interval for the population proportion from each of my categorical samples. To do this I had to choose a confidence level, and I chose 99%. A confidence interval is an interval estimate of a parameter (in this case a population parameter), and the confidence level indicates the reliability of that estimate. I will later show handwritten work to better explain. The confidence interval for the population proportion from my stratified categorical data was .3179 < p < .6821 and the margin of error was .1821. The confidence interval for the population proportion from my systematic categorical data was .5816 < p < .9384 and the margin of error was .1784.

The second part was to create confidence intervals for the population mean from each of my quantitative samples and show the margins of error. These intervals estimate the population mean. The confidence interval for the population mean from my simple random quantitative data was 63.027 < μ < 74.733 and the margin of error was 5.853. The confidence interval for the population mean from my systematic quantitative data was 74.352 < μ < 81.688 and the margin of error was 3.668.

The third part was to create confidence intervals for the population standard deviation from each of my quantitative samples and show the computation for the left and right endpoints. These intervals estimate the population standard deviation. The confidence interval for the population standard deviation from my simple random quantitative data was 10.448 < σ < 18.760. The confidence interval for the population standard deviation from my systematic quantitative data was 6.525 < σ < 12.874. Here is the handwritten work:
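As a cross-check on the proportion interval for the stratified sample, the arithmetic can be sketched in Python. This assumes the usual normal-approximation formula, margin = z * sqrt(p(1 - p) / n), and takes the critical value from the standard library rather than from a printed table, so the last digit can differ slightly from hand work.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

# Stratified sample: 25 males out of 50, at the 99% confidence level.
low, high, margin = proportion_interval(p_hat=0.50, n=50, confidence=0.99)
print(round(low, 4), round(high, 4), round(margin, 4))
```

Rounded to four places, this agrees with the stratified interval and margin of error reported above.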

My last step to finish the project was to select a level of significance (I chose .05) and use it to complete a hypothesis test for the population proportion for one value of my categorical variable in both samples. For my stratified categorical data, the original claim was that 50% of the sample (25 of 50) are male. My null hypothesis then is H0: p = .50 and my alternative hypothesis is H1: p ≠ .50. To test these hypotheses I have to find the P-value, which is done by first finding the z-score, which turned out to be 0. The test is two-tailed, so I must sum the areas in both tails beyond the z-score, and they total 1. So my P-value = 1, which is greater than my level of significance, and thus I fail to reject the claim that 50% of the stratified data are male.

For my systematic categorical data, the original claim was that 60% of the sample are male. My null hypothesis then is H0: p = .60 and my alternative hypothesis is H1: p ≠ .60. The z-score for this sample was 2.60, which made the P-value = .0094. That is less than my level of significance, and thus I reject the claim that 60% of the systematic data are male. To better understand, I have provided handwritten notes on the hypothesis tests:
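The step from a z-score to a two-tailed P-value can also be sketched in Python. This is only an illustration using the standard normal distribution from the standard library. The function names are my own, and the z-score of 2.60 is taken as given from the text rather than recomputed.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p_hat, p0, n):
    """Test statistic for H0: p = p0 under the normal approximation."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def two_tailed_p_value(z):
    """Area in both tails beyond |z| under the standard normal curve."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Stratified sample: 25 of 50 are male, tested against p0 = .50.
z_stratified = proportion_z(p_hat=0.50, p0=0.50, n=50)
p_stratified = two_tailed_p_value(z_stratified)

# Systematic sample: z-score of 2.60 as reported in the text.
p_systematic = two_tailed_p_value(2.60)
print(z_stratified, p_stratified, round(p_systematic, 4))
```

The value for z = 2.60 comes out near .0093, which matches the table-based .0094 up to rounding. Either way it is below the .05 level of significance.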

I then had to use the level of significance to complete a hypothesis test for the population mean, using both quantitative samples. For my simple random quantitative data, the original claim was that the mean weight is > .05. To determine this I needed the test value, which was a t-score of -.13. The p-value was 1.865, which was > the level of significance, so I must reject H0 (H0: μ = .05) and I fail to reject the original claim. The computation of the test statistic is shown in the handwritten notes.

For my systematic quantitative data, the original claim was that the mean weight was > .05. The test value was t = 6.59. The p-value was 1.690, which was > the level of significance, so I must reject H0 (H0: μ = .05) and I fail to reject the original claim. The computation of the test statistic is shown in the handwritten notes. Handwritten notes for the hypothesis test for the population mean:
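For reference, the t test statistic for a mean is t = (sample mean - hypothesized mean) / (s / sqrt(n)). The sketch below applies it to the two sample summaries from earlier. The hypothesized mean is an assumption on my part: I use the population mean of 69.147535 from the summary table purely for illustration, since the text above does not state the value that was tested.

```python
from math import sqrt

def t_statistic(sample_mean, sample_sd, n, mu0):
    """One-sample t test statistic for H0: mu = mu0."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))

# ASSUMPTION: hypothesized mean set to the population mean from the
# summary table; the tested value is not given in the text.
MU0 = 69.147535

t_simple_random = t_statistic(68.88, 13.665194, 40, MU0)    # Sample #1
t_systematic = t_statistic(78.022222, 8.0823657, 36, MU0)   # Sample #2
print(round(t_simple_random, 2), round(t_systematic, 2))
```

Under that assumption the two statistics land close to the test values of -.13 and 6.59 reported above.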

The whole paper has been summarizing what I was doing throughout the project, so I am not going to bother with a summary. A reflection, though, I can do. This assignment has affected me in many ways (besides keeping me up for nights at a time stressing about it). For one, I have come to appreciate the subject of statistics more than I previously had. For another, I have developed skills I never knew I had. If I had looked at the instructions for this project in January 2014, they would have looked like pure gibberish. Now I can understand and work with the information.

The math skills I have acquired through this project are definitely going to help me in other aspects of my life. I say this because my plan (who really has a plan?) is to major in Psychology with an emphasis on statistics. So, in that sense, this project is like the first stepping stone across a very large lake that is my life. It also helped with life in general. For example, if I want to perform a survey in the future, I now know the proper ways to compile a sample that will be both big enough and effective.

Everyone thinks, "Unless I'm going into that subject, I don't really need it." This is especially said about math. I was one of those people. I still studied other kinds of math because I like it, but I never thought I would be using what I learned outside the classroom. Then this class came along and showed me that there are some subjects I am going to use for the rest of my life, even if I weren't going into statistics.
