
Caelei Rosenvall

Fall 2017 Semester


Math 1040 Skittles Term Project
Introduction:
In this project I demonstrate the statistical strategies and tools I have learned this semester by analyzing
the colors and quantities of Skittles candy. The data were collected from each statistics class taught by the
professor of my class and are used to construct the results found below.
Organizing and Displaying Categorical Data: Colors

From these charts I observed that red was the most common Skittle color across all the bags and yellow was
the least common. I expected this because red is one of the most favored colors/flavors and yellow is one of
the most disliked. Personally, red is my favorite and yellow is my least favorite. These graphs do a good job of
showing what I expected to see. In the data gathered from my own bag of Skittles, shown below, the most
common color was purple and the least common was green. This does not exactly agree with the class data,
but red was still my second most common color and yellow was tied for second fewest, which is roughly in line
with it.

My Skittles Data:

Number of candies by color:

Red    Orange    Yellow    Green    Purple
14     11        11        9        17

All Classes Skittles Data:

Number of candies by color:

Red    Orange    Yellow    Green    Purple
642    623       581       622      608
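
As a rough check on the color comparison, the class totals above can be turned into proportions. This is only a sketch: the dictionary name `counts` and the layout are my own choices, and the total is computed from the table rather than reported in the original.

```python
# Class totals by color, copied from the "All Classes Skittles Data" table.
counts = {"red": 642, "orange": 623, "yellow": 581, "green": 622, "purple": 608}

total = sum(counts.values())  # total candies across all bags

# Relative frequency (sample proportion) of each color.
proportions = {color: n / total for color, n in counts.items()}

most_common = max(counts, key=counts.get)
least_common = min(counts, key=counts.get)

for color, p in proportions.items():
    print(f"{color:<8}{counts[color]:>5}{p:>8.3f}")
print("most common:", most_common, "| least common:", least_common)
```

These proportions are the values a pie chart or relative-frequency bar chart of the class data would display.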

Organizing and Displaying Quantitative Data: The Number of Candies per Bag
Summary statistics: 5-Number Summary
Column                           Mean    Std. Dev.    Median    Min    Max    Q1    Q3
Proportions of Skittle Colors    60.3    3.44         61        50     65     59    62

From these charts I observed that the mean number of Skittles per bag is 60.3 (median 61), and that the least
common bag sizes fall in the range from 50 to 53. I expected a total in the range of 60-65 based on the data
collected and the 62 Skittles in my own bag. Therefore, the graphs made sense and reflected what I expected
to see. The frequency histogram is skewed to the left because the tail is on the left side. The overall data agree
with my own data because the 62 Skittles in my bag is close to the overall mean and equals the third quartile.
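
One way to back up the left-skew reading is the 1.5 × IQR outlier rule applied to the five-number summary. This is a sketch under the assumption that this common rule is an acceptable check here; the original does not say which rule, if any, was used.

```python
# Five-number summary values from the summary statistics table.
minimum, q1, median, q3, maximum = 50, 59, 61, 62, 65

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # values below this are flagged as low outliers
upper_fence = q3 + 1.5 * iqr    # values above this are flagged as high outliers

# Candies in my own bag, summed from the "My Skittles Data" table.
my_bag = 14 + 11 + 11 + 9 + 17

print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("minimum is a low outlier:", minimum < lower_fence)
print("maximum is a high outlier:", maximum > upper_fence)
print("my bag:", my_bag)
```

The minimum falls below the lower fence while the maximum stays inside the upper fence, which fits a tail on the left side.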
Reflection:
Quantitative data is numerical information that can be measured or counted. It is shown through dot
plots, stem plots, and histograms. These charts and graphs show the numerical values and how frequently
they occur. In the graphs given in part two, the quantitative data is used to see the number of Skittles in
individual bags. Categorical data is information that sorts observations into categories or qualities and is not
measured numerically. It is shown with pie charts and bar charts. These charts do a good job of showing how
often each category, such as a color, size, or other physical feature, occurs. In the graphs given in part one,
the categorical data is used to see the amounts of specific colors found in the Skittles bags. These pairings
make sense because dot plots, stem plots, and histograms need a numerical scale, while pie charts and bar
charts only need category labels and counts. The types of calculations that make sense for quantitative data
are totals, averages, maximums, minimums, etc. The types of calculations that make sense for categorical data
are counts and proportions of certain sizes, colors, shapes, etc. Each calculation belongs to its type of data
because it summarizes either a number or a category.

Confidence Interval Estimates:


A confidence interval is a range of values with a lower and an upper bound, built around a sample statistic. It
is used to estimate an unknown population parameter, such as the true proportion or the true mean, with a
stated level of confidence. The interval is the point estimate plus or minus a margin of error, and a claimed
value for the parameter is plausible only if it falls inside the interval.

99% Confidence Interval Estimate for True Proportion of Yellow Candies:

We are 99% confident that the true proportion of yellow candies is between 0.171 and 0.207.
The value .018 is the margin of error, not the true proportion: the interval is the sample proportion, about
0.189, plus or minus .018. A claimed proportion of yellow candies outside this range is not supported by the
data.
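
The 99% interval can be reproduced from the class counts. This is a minimal sketch that assumes the usual one-proportion z-interval was used; the total of 3076 candies is computed from the class data table, and the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

yellow, total = 581, 3076   # yellow count and total candies from the class data
confidence = 0.99

p_hat = yellow / total                                   # sample proportion
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value, about 2.576
margin = z_star * sqrt(p_hat * (1 - p_hat) / total)      # margin of error

lower, upper = p_hat - margin, p_hat + margin
print(f"p_hat = {p_hat:.3f}, margin of error = {margin:.3f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

The rounded bounds agree with the reported interval, and the margin of error rounds to the .018 mentioned above.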

95% Confidence Interval Estimate for True Mean Number of Candies Per Bag:

We are 95% confident that the true mean number of candies per bag is between 59.331 and 61.269.
The value 60.3 is the sample mean, not the true mean. It sits at the center of the interval by construction, so
its falling inside the range is expected rather than a separate finding.
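
The 95% interval for the mean can be approximated from the summary statistics. This sketch rests on two assumptions not stated in the original: the number of bags is taken as n = 51 (the 3076 total candies divided by the mean of 60.3), and the critical value 2.009 is read from a t-table for 50 degrees of freedom.

```python
from math import sqrt

mean, sd = 60.3, 3.44   # sample mean and standard deviation from the summary table
n = 51                  # ASSUMPTION: number of bags, inferred as 3076 / 60.3
t_star = 2.009          # ASSUMPTION: t-table value for 95% confidence, df = 50

margin = t_star * sd / sqrt(n)          # margin of error for the mean
lower, upper = mean - margin, mean + margin

print(f"margin of error = {margin:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The result matches the reported bounds to within rounding of the standard deviation.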

Hypothesis Tests:
Hypothesis testing is used to test a claim or statement regarding a characteristic of one or more populations.
Based on a sample's evidence we can determine whether to reject or fail to reject a claim. We can determine
this by comparing a found p-value (probability) to the level of significance, or by comparing the test statistic
(z-naught) to the critical value z alpha/2, which marks the rejection region set by the level of significance.

0.05 Significance Level- Claim that 20% of all Skittles Candies are Red:

Fail to reject: there is not sufficient evidence to reject the claim that 20% of the Skittles candies are red. If
the p-value had been less than the significance level, we would have had enough evidence to reject the
claim. Since the p-value is larger, we fail to reject. The data are consistent with the claim, although this does
not prove it.
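
The decision for the red-candy claim can be checked with a one-proportion z-test on the class counts. This sketch assumes a two-tailed test (alternative: the proportion is not 20%); the original does not state the alternative hypothesis, and the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

red, total = 642, 3076    # red count and total candies from the class data
p0, alpha = 0.20, 0.05    # claimed proportion and significance level

p_hat = red / total                                   # sample proportion of red
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / total)        # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))          # two-tailed p-value (assumed)

decision = "reject" if p_value < alpha else "fail to reject"
print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.3f}")
print("decision:", decision)
```

Under these assumptions the p-value is well above 0.05, which agrees with the fail-to-reject decision.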

0.01 Significance Level- Claim that the Mean Number of Candies in a Bag of Skittles is 55:

Reject: there is sufficient evidence to reject the claim that the mean number of candies in a bag of Skittles
is 55.
The t-value is larger than the critical value for the sample, so it falls in the rejection region and there is enough
evidence to reject the claim. Because the sample mean found earlier in this project was 60.3, it was not
surprising that the claim was rejected.
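
The test of the claimed mean of 55 can be sketched from the summary statistics. As assumptions not stated in the original, the number of bags is taken as n = 51 (3076 total candies divided by the mean of 60.3), the test is two-tailed, and the critical value 2.678 is read from a t-table for 50 degrees of freedom at the 0.01 level.

```python
from math import sqrt

mean, sd = 60.3, 3.44   # sample mean and standard deviation from the summary table
n = 51                  # ASSUMPTION: number of bags, inferred as 3076 / 60.3
mu0 = 55                # claimed mean number of candies per bag
t_crit = 2.678          # ASSUMPTION: two-tailed t-table value, alpha = 0.01, df = 50

t = (mean - mu0) / (sd / sqrt(n))   # test statistic
decision = "reject" if abs(t) > t_crit else "fail to reject"

print(f"t = {t:.2f}, critical value = {t_crit}")
print("decision:", decision)
```

The test statistic is far beyond the critical value, so the conclusion does not depend on the exact value assumed for n.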

Reflection:
The conditions that must be met to correctly perform interval estimates and hypothesis tests involve random
sampling, normality, and independence.
The first condition checked for either procedure is random sampling. For the data to represent a population
well without any bias, the sample must be a simple random sample. This is determined by reading the given
information about how the data were collected. The sampling distribution must also be approximately
normal, which is the second condition.
The next condition checked for either procedure is normality. This is determined through the sample size.
For a hypothesis test about a proportion, np(1-p) must be at least 10, where n is the sample size and p is the
claimed proportion. If the result is at least 10, normality is satisfied. For inference about a mean, the sample
size, n, must be larger than 30, or the population itself must be normal. It would be difficult to obtain accurate
information with a sample size that is too small.
The last condition checked for either procedure is independence. To determine this, the sample size n must
be less than or equal to 5% of the population size N (n is at most 0.05N). Populations are usually very large, so
a sample this small relative to the population lets the observations be treated as independent.
Our Skittles data pass all three conditions. The bags are treated as a random sample, np(1-p) is greater than
10 for the proportion test, n is greater than 30 for the test of the mean, and n is less than or equal to 0.05N.
This can be confirmed from the information above. Because of this, the tests could all be performed and their
results trusted.
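
The sample-size conditions can be checked numerically. This is a sketch with labeled assumptions: the 3076 total is computed from the class data table, the 51 bags are inferred from that total and the mean of 60.3, and the 5% condition is read as n being at most 0.05N. Because the population size N is not given, the code only reports how large N would need to be.

```python
n_candies = 3076   # total candies, computed from the class data table
n_bags = 51        # ASSUMPTION: number of bags, inferred as 3076 / 60.3
p0 = 0.20          # claimed proportion of red candies

# Normality for the proportion test: n * p * (1 - p) should be at least 10.
np_check = n_candies * p0 * (1 - p0)
proportion_ok = np_check >= 10

# Normality for the mean: sample size larger than 30.
mean_ok = n_bags > 30

# Independence: n <= 0.05 * N, so N must be at least n / 0.05.
min_population_candies = n_candies / 0.05
min_population_bags = n_bags / 0.05

print(f"np(1-p) = {np_check:.2f}, condition met: {proportion_ok}")
print(f"n = {n_bags} bags, larger than 30: {mean_ok}")
print(f"population needed: at least {min_population_candies:.0f} candies "
      f"or {min_population_bags:.0f} bags")
```

Since far more Skittles than this are produced, the independence condition is reasonable to accept.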
The only errors I can think of would involve math miscalculations or typos, and Type I or Type II errors in
the hypothesis testing. Math errors and typos would produce an incorrect final calculation, which would
result in a wrong statistic and interpretation. A Type I error means rejecting a claim that is actually true, and a
Type II error means failing to reject a claim that is actually false. Either one means we reached the wrong
conclusion. In my opinion, the sampling method works well as it is. For people who aren't amazing at math,
like me, it is understandable and learnable. Nothing is perfect; all that matters is that it works, and this does.
The conclusions from my statistical research concern how common each color is and how many candies are
in each bag. From the overall data and tests, I observed that some colors appear in the bags more often than
others. These colors had higher counts in the data collected. Red was the most common color. The test of
whether 20% of the candies are red did not reject that claim, so the data are consistent with 20%, although
they do not prove it. There was also a common range for the number of candies in each bag. The counts were
close together and centered near the mean. This suggests that the amount in ounces is similar and the
number of candies is about the same for each bag. This evidence was cool to see because it suggests the
company controls how much candy is packaged, and possibly which colors are placed in the packages more
frequently.

Reflective Writing and ePortfolio:


Statistics 1040 was an interesting class and a fun environment to learn in this semester. I feel that I was
able to understand the information more than in the other math classes I have taken. Although the math itself
was a challenge, our instructor made it understandable and learnable. I found it interesting how well
everything tied together. We learned how to do different types of tests for different types of data, but they
still all went together.

The Skittles project we conducted and completed has taught me how to use statistics to analyze data.
We analyzed this data by comparing and contrasting different information. It was interesting to see how
well the different formulas and tests applied to Skittles. With a large sample from each statistics class taught
by our instructor, we were able to find an average and different confidence intervals. It was cool and made
the math more fun.

This project has also taught me how the information and problem solving I've learned through
statistics is applicable to so many things, including candy. I never thought that math could be used for
something like Skittles. It leads a person to imagine all the other things that math could be used for in the real
world. Math has always been a class that I have taken because it was required, but now it's better than that
because it is applicable to my life.

The information I have learned will impact other math classes I plan to take because now I will
approach them with a different mindset. I will understand that they will also be applicable to my life, and
therefore I will care more about what I am learning. I think it would be interesting to see how statistics
works alongside other types of math. We saw some algebra tie into statistics, so I am sure it could
work the other way around. Overall, statistics is very relevant, and this project has helped to show how
frequently math is used.
