
Kristen Booth

Professor Alia Maw


Math 1040
April 27, 2016
This semester in my online statistics class, we completed a group project on the
colors and numbers of candies in a single 2.17-ounce bag of Skittles. The following
charts and ideas are based on individual results as well as total class results for
the number and colors of candies in our own Skittles bags. The project includes both
group portions and individual portions.

Term Project Part 2: Group 2 Assignment


1. We expected the proportions of each color to be about the same: given the way
Skittles are made in a factory, we assumed that each color would be produced in the
same quantity. Also, the class data we are looking at is a bigger sample than just our
own individual bags.

2. Orange: 20.9%
Purple: 20.7%
Red: 20.6%
Yellow: 18.9%
Green: 18.9%





3. Our group decided that the class data represented a convenience sample. Everyone just
went out to the nearest store and bought whatever bag of Skittles they wanted. We
didn't use systematic sampling (selecting every "kth" individual) to obtain our candies.
The students were not chosen at random, and the bags of Skittles were chosen by
convenience: each student had a choice of where to buy the bag of candy and which bag
to choose. The population for this project could be all 2.17-ounce bags of Skittles.
The sample is the 49 bags that were counted by our classmates.

Term Project Part 2: Individual Assignment


              Count Red  Count Orange  Count Yellow  Count Green  Count Purple  Total
My Bag               18            15            11           12             6     62
Class Counts        617           627           566          567           620   2997

Over the course of the last week, we have been really focused on frequencies and relative frequencies, so I
thought it appropriate to apply that principle to this project.
Relative frequencies for my bag of Skittles:
Red: 0.290
Orange: 0.242
Yellow: 0.177
Green: 0.194
Purple: 0.097
Relative frequencies for class totals of Skittles:
Red: 0.206
Orange: 0.209
Yellow: 0.189
Green: 0.189
Purple: 0.207
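
As a quick check on the arithmetic, the relative frequencies can be recomputed directly from the counts in the table. This is only a minimal Python sketch: the function name `relative_frequencies` and the dictionary layout are my own illustrative choices, not something from the assignment.

```python
def relative_frequencies(counts):
    """Return each color's share of the total, rounded to three decimals."""
    total = sum(counts.values())
    return {color: round(n / total, 3) for color, n in counts.items()}

# Counts taken from the "My Bag" and "Class Counts" rows of the table.
my_bag = {"Red": 18, "Orange": 15, "Yellow": 11, "Green": 12, "Purple": 6}
class_counts = {"Red": 617, "Orange": 627, "Yellow": 566, "Green": 567, "Purple": 620}

for label, counts in (("My bag", my_bag), ("Class", class_counts)):
    print(label, relative_frequencies(counts))
```

Each set of relative frequencies should add up to about 1, which is an easy way to catch a miscalculated value.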

Based on this information, I was expecting similar frequencies of colors. However, the purple Skittles in my
bag seemed pretty low compared to the class totals, so that was kind of a surprise to me. It also seems a little
funny to me that there are more of the red and orange Skittles in everyone's bags! I would have thought that there
was a pretty even amount of colors across the board. According to this chart, I think my amount of purple Skittles
would be an outlier, but I don't really think it has a huge impact on this summary. If this information were
organized into a bar graph or a Pareto chart, it would be a lot more obvious. I do believe that my count of
Skittles compares well to the class, with the exception of the purple Skittles. However, there were a couple of bags
in the class count that I thought had a different amount compared to the rest of the class. Maybe this was human
error, or they just got lucky with their Skittles!

Term Project Part 3 summary stats - Group Portion


Summary statistics:
Column  Mean  Std. dev.  Min  Q1  Median  Q3  Max
var1    61.2  8.9        52   58  60      62  110

Term Project Part 3 Individual


In order to prepare for a graph and boxplot of the total number of candies in each bag of Skittles
for our class, I calculated a five-number summary (also including the mean and standard deviation) as
follows:
Mean: 61.2
Standard Deviation: 8.87
Min: 52
Q1: 58
Med: 60
Q3: 62
Max: 110
IQR: 4
Lower Fence: 52
Upper Fence: 68
Therefore, with the max being 110 and the upper fence only 68, 110 is an outlier. Without
even graphing, you can assume that the shape of the distribution would be skewed right. After seeing a
histogram and a boxplot of the information, the graphs did reflect my assumption, and I was a little
surprised to see more than one outlier. (This would have been more noticeable had I gone back through
the information a couple of times.) The total number of bags for the class was 49, and after calculating a
five-number summary for my own bag, I found the standard deviation to be 8.5, pretty close to the class.
I will be honest, I'm not sure yet how to compare my bag to the whole entire class because the numbers
are on such different scales. I think I need to calculate five-number summaries with means, standard
deviations, and upper and lower fences for each color to really get down and compare my bag to the rest of
the class.
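
The fence calculation can be sketched in a few lines of Python, using Q1 = 58 and Q3 = 62 from the group summary statistics. The function name `tukey_fences` is just an illustrative label of mine, and because the individual bag totals are not listed in this report, only the minimum and maximum can be checked here.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles from the class summary statistics (total candies per bag).
q1, q3 = 58, 62
lower, upper = tukey_fences(q1, q3)
print("Fences:", lower, upper)

# Only the min and max are given in the report, so those are the
# only two values that can be tested against the fences.
for value in (52, 110):
    flag = "outlier" if value < lower or value > upper else "not an outlier"
    print(value, flag)
```

A value is flagged only when it falls strictly outside the fences, so the minimum of 52 sits exactly on the lower fence and is not counted as an outlier.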
I literally just had an "aha!" moment while typing this! I think this is where categorical data and
quantitative data come into play. For example, my small bag of candies cannot be compared to the whole
entire class without having some sort of categorical data to break down the information. From what I
have learned so far, I would think categorical data could be displayed using a bar graph or Pareto chart, and
quantitative data could be displayed using histograms or box plots. Using a boxplot to
display several categories of data just does not make sense. I don't see the information being clearly
displayed that way.

Term Project Part 4 confidence intervals - Group Portion


1. 99% confidence interval estimate for the population proportion of yellow candies

p̂       x̄       s         n     Lower Limit   Upper Limit
.1889   599.4   30.2539   566   .17044        .20727


We used StatCrunch to find the lower and upper bounds.

The margin of error is (.20727 - .17044)/2 = .03683/2 = .01842.
The critical value: 1 - 0.99 = 0.01; 0.01/2 = 0.005; z0.005 = 2.5758 (used invNorm).
2. 95% confidence interval estimate for the population mean number of candies per bag

x̄       s        n    Lower Limit   Upper Limit
61.16   8.8679   49   58.616        63.71


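We used the calculator to find the mean and standard deviation, then plugged those
numbers into the TInterval setting on the calculator to get the confidence interval.
The critical value: 1 - 0.95 = 0.05; 0.05/2 = 0.025. Because TInterval uses the t
distribution with n - 1 = 48 degrees of freedom, the matching critical value is
t0.025 ≈ 2.011 (invT); invNorm gives the z value 1.960, which would make the interval
slightly too narrow.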
3. 98% confidence interval estimate for the population standard deviation of the number of
candies per bag

x̄       s        n    Lower Limit   Upper Limit
61.16   8.8679   49   7.0404        11.272


We used the calculator to find the mean and standard deviation, then found the
interval using the Chi-Square distribution.

The critical values: 1 - 0.98 = 0.02; 0.02/2 = 0.01. Using the Chi-Square table, we
find the left and right critical values to be 29.707 and 76.154. We plug those values
into the confidence interval formula for the population standard deviation.
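
The three intervals can be reproduced with a short Python sketch that uses only the standard library. Several things here are assumptions on my part rather than the group's actual calculator steps: the function names are illustrative, the proportion uses 566 yellow candies out of the 2997 total from the class counts, the t critical value 2.0106 (48 degrees of freedom) is a table value I supplied, and the Chi-Square values are the two quoted in the text (they appear to match the table row for 50 degrees of freedom, the nearest row to 48).

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, level):
    """Normal-approximation interval for a population proportion."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def mean_ci(x_bar, s, n, t_crit):
    """t interval for a population mean, given the critical value."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def std_dev_ci(s, n, chi2_left, chi2_right):
    """Chi-square interval for a population standard deviation."""
    ss = (n - 1) * s ** 2
    return math.sqrt(ss / chi2_right), math.sqrt(ss / chi2_left)

# 1. Yellow proportion, 99% level: 566 yellow out of 2997 candies.
print(proportion_ci(566, 2997, 0.99))
# 2. Mean candies per bag, 95% level. t_crit is an assumed table value (48 df).
print(mean_ci(61.16, 8.8679, 49, t_crit=2.0106))
# 3. Standard deviation, 98% level, with the chi-square values quoted in the text.
print(std_dev_ci(8.8679, 49, chi2_left=29.707, chi2_right=76.154))
```

The printed limits should agree with the tables to within rounding; small differences in the second interval come from using the rounded mean 61.16.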

Each of the three interval estimates gives a range of plausible values for a population
parameter. The first means that we are 99% confident that the population proportion of yellow
candies falls between .17044 and .20727. The second means that we are 95% confident that the
population mean number of candies per bag falls between 58.616 and 63.71. The third means that
we are 98% confident that the population standard deviation of candies per bag falls between
7.0404 and 11.272. For this project there were a few outliers, which inflated the standard
deviation and so widened the confidence intervals, making our estimates less precise. The
largest outlier could have been removed from the data set because it was obviously the wrong
size bag, but we left it in the data that we used because in the real world there will always
be mistakes that need to be accounted for.

Term Project Part 4 confidence intervals - Individual Portion


I am actually really grateful for this assignment, because I have been really struggling
with chapter 9. Confidence intervals and their formulas really confuse me for some reason!
From what I have gathered, a confidence interval uses a statistic from a smaller sample to
estimate a parameter of the larger population. The confidence level determines how frequently
intervals built this way contain the parameter. So you can test a smaller group and, based on
the level of confidence, estimate results for the larger population for whatever statistic
you are looking for. (I hope I am right! I'm not sure why this confuses me so much.) In other
words, the higher the level of confidence, the more likely the interval is to capture the
true population value, although the interval also gets wider. This can also be used as a
basis for reliability.

Term Project Part 5 reflection

I have learned that there are statistical principles all around me! When we
first started this project, I honestly thought to myself, "What do Skittles have to do
with statistics?" As we got deeper into the project, I saw that Skittles have a lot to do
with statistics. I started to think about ideas such as maybe there is a person who
works at the Skittles factory and it is their job to make sure there are certain
proportions of colors in each bag of candy. I'm sure they don't just throw all the
colors in a bag! There has to be some mathematics and statistical thinking behind
even the smallest things, such as a small bag of candy, and there are questions
behind it, like "Why are there more red Skittles than purple Skittles?" Do they do
simple random surveys to see what flavors/colors are more popular? There is so
much in everyday life having to do with statistics that goes unnoticed on a day-to-day basis.

This course and applying the principles of statistics have already
impacted my current college experiences with other courses. In my Physiology lab
we were asked to graph and extrapolate certain things such as respiration and
cardiac rate, as well as find the mean, median, mode, range, and standard deviations
for simulated groups of patients. It was really beneficial to have a background in
statistics to help me collect and analyze data given to me in lab.