
INTRODUCTION:

Throughout Math 1040, taken in Summer 2016, we evaluated a bag of Skittles using the statistical analysis techniques
learned in the course. We began with a simple analysis of the mean, median and mode of both individual and class
proportions. We then developed histograms, box plots and various other displays to show our data graphically. As we progressed
further into statistics, we began to perform analyses that included margin of error, lower and upper bounds, as well as
standard deviation and more in-depth analysis. This assignment was a real-world application of statistics to something as simple as a bag
of candy.

Here you will see an analysis of a single bag of Skittles candy as well as the class totals, followed by graphs depicting the data. These
graphs include Pareto charts, pie charts, histograms and box plots. To perform these analyses and create the graphs we used
StatCrunch and Excel.

PART 1: Individual

              Count Red   Count Orange   Count Yellow   Count Green   Count Purple   Total
My Bag               14             11             15             9              8      57
Class Counts       1164           1117           1189          1087           1093    5650

The graph shows what I expected to see. I suspected that there would be more yellow and red candies per package than the
other colors. I suspected this based on consumerism and the scientific backing that red and yellow make you hungrier, therefore
making you buy more Skittles. Although the excess of yellows and reds is small, it aligned with my guesses. I was very
surprised to see that a few bags had over 80 Skittles and a few bags barely had 40. These could be outliers, but given the size
of the data set they could not significantly shift or skew the data. I was also very surprised by the consistency among the bags that
contained 54 Skittles; several bags contained exactly the same number of each color. If these numbers are truly random, then our
data should not be affected by it. The mean count of each color would not be significantly affected by the outliers because of the size of
the sample and the distribution of colors. The median values are less sensitive to extreme values than the means, and I did not see any extremes
throughout my data and the class's. The distribution of color totals in the class matched my distribution closely. The variance was
not large. There seems to be reasonable consistency throughout all Skittles bags. The color distribution
PART 2 Team/Individual
1. Guess! What do you expect the proportions to be? Why? As a group, we guess that the proportions of yellow and
red candies will be slightly higher in the class sample, because we believe that, as a marketing tactic, consumers are more attracted to
those colors. We therefore expect slightly higher counts for those candies: not a substantial amount, just a little higher
than for the other colors.
2. Create a Pie Graph and Pareto Chart to display data.

Valya P, Melissa T, Samantha B, Jamie G, Jordan G
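
The proportions behind a pie graph and the ordering behind a Pareto chart can be recomputed from the Part 1 counts. The following is a minimal sketch only: the counts come from the table in Part 1, while the variable and function names are illustrative and are not part of the original StatCrunch or Excel work.

```python
# Counts copied from the Part 1 table; names are illustrative assumptions.
class_counts = {"Red": 1164, "Orange": 1117, "Yellow": 1189, "Green": 1087, "Purple": 1093}
my_bag = {"Red": 14, "Orange": 11, "Yellow": 15, "Green": 9, "Purple": 8}


def proportions(counts):
    """Return each color's share of the total count (the pie graph slices)."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


def pareto_order(counts):
    """Return (color, count) pairs sorted from most to least frequent."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


class_props = proportions(class_counts)
bag_props = proportions(my_bag)

# One row per color, in Pareto order: class count, class share, single-bag share.
for color, n in pareto_order(class_counts):
    print(f"{color:<7}{n:>6}{class_props[color]:>8.3f}{bag_props[color]:>8.3f}")
```

Sorting by count puts yellow and red first, which is the pattern the group guessed in question 1.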


After creating these charts, we answer some statistical questions about our graphs and numerical summary.
3. Does the class represent a random sample? What would the population be? Collaborate to discuss sampling and our data in a
paragraph or two.

The class data does not represent a random sample, but rather a convenience sample. In statistics, random
specifically means that all members of a population have an equal and independent chance of being selected. The sample is
the 94 bags of 2.17 oz Original Skittles selected by the students in our Statistics class. The population is all 2.17 oz Original
Skittles bags. This population is not specific to the state of Utah or the US alone, as we are sure they are sold in other parts of the
world. We believe that we each bought a bag of Skittles from the most convenient place we could, such as the closest supermarket
or gas station. Therefore, a bag of Skittles sold in another state or country is part of the population, but it does not have an equal
chance of being selected as one down the street from us, so this is not a random sample.

After learning about the shapes of distributions and types of variables, we give the following explanations of the shapes
and types of variables seen.
PART 3 Individual:
The shape of the distribution is bell-shaped. The graph does not surprise me because, looking at the numbers, most of the bags
seemed to have around 55-65 candies. Calculating the mean and the 5-number summary confirmed my predictions. There are a few
surprises again with the outliers and the effect they had on the data, but all data will have outliers. As seen in the box plot, there are
a few outliers, which I discussed in Part 2. In my bag I had 67 Skittles, which was just above the mean (60.1). I believe the overall
data collected by the entire class matches my own single bag even though I was above the class average.

2. Categorical variables are variables that can be put into categories. They are also known as qualitative variables. Their values can
be put into a countable number of categories or different groups. This type of data may not have a logical order. Some examples of
categorical variables are types of cars, gender, hair color or zip code, which categorizes location. Quantitative variables are
those that can be ordered and counted or measured. Quantitative variables are numerical measures of individuals. Some examples of
quantitative variables are temperature, the number of days a student studies per week or the number of compliments per week.

Qualitative data can be put into a frequency distribution, which can then be put into a bar graph. Bar graphs are great because they
make the category and the frequency of X in that category clear. Pareto charts and pie charts can also be used. Pie charts are
great when comparing one category to the whole. For quantitative data one could use a histogram, stem-and-leaf plot or dot plot and
use a frequency distribution to analyze the data. These types of graphs have distributions, which add another level of analysis. One
could also use frequency polygons and ogives.

Calculating the mean, median and mode works well for quantitative data and is less effective for qualitative data. The mode is still
useful for qualitative data because it tells you which category or sub-category occurs most often in a sample or population. For qualitative
data you can also determine whether the data are nominal or ordinal. This type of classification would not be as useful for
quantitative data because it does not give good meaning to the data.
Part 3 Group:
Term Group Project Part 3: Organizing and Displaying Quantitative Data: The Number of Candies per Bag

1. What is the mean number of candies per bag: 60.1
2. What is the standard deviation of candies per bag: 5.6
3. What is the five number summary of candies per bag: Min: 37, Q1: 58, Med: 60, Q3: 62, Max: 82
4. Histogram: Class Skittles Data
5. Boxplot: Class Skittles Data
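
The five-number summary above is enough to check which bag totals a box plot would flag as outliers. The sketch below assumes the common 1.5 x IQR fence rule; the source does not say which rule the plotting software applied, so treat the fences as an illustration rather than a reproduction of the original box plot.

```python
# Five-number summary reported by the group in Part 3.
minimum, q1, median, q3, maximum = 37, 58, 60, 62, 82

# Assumption: the usual 1.5 * IQR rule for flagging outliers.
iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # totals below this are flagged as outliers
upper_fence = q3 + 1.5 * iqr    # totals above this are flagged as outliers

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print("minimum flagged as outlier:", minimum < lower_fence)
print("maximum flagged as outlier:", maximum > upper_fence)
```

Under this rule both the smallest bag (37) and the largest bag (82) fall outside the fences, which agrees with the outliers mentioned in the individual write-up.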



After learning much about statistics, we are now able to calculate more specific statistical summaries. You will see the group's
calculations step by step below. (Pardon the formatting, as my computer does not handle transferring from a PDF to a Word
document well.)

Part 4 individual:
Confidence intervals give meaning to the statistics calculated to estimate a population parameter. For example, a 95% confidence interval
is built by a method that captures the true population parameter in approximately 95% of repeated samples. A confidence interval expresses the degree of uncertainty or
certainty attached to a calculated statistic. Confidence intervals can also give the precision of the estimate, as stated by Stat Trek.
Confidence intervals give meaning to the numbers and make them even more applicable throughout life.
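
One way to see what a confidence level refers to is a small simulation. The sketch below is purely illustrative and uses simulated data, not the class data: it assumes a normally distributed population whose mean and standard deviation equal the class statistics (60.1 and 5.6), draws many samples of 94 bags, and counts how often a 95% t-interval contains the assumed true mean. The critical value is the one the group reports for the mean in the group portion; all names are hypothetical.

```python
import math
import random

# Illustrative assumptions: a normal population with the class mean and
# standard deviation, and the t critical value the group reports for n = 94.
TRUE_MEAN, TRUE_SD, N_BAGS, T_CRIT = 60.1, 5.6, 94, 1.985801768


def interval_for_sample(rng):
    """Draw one simulated sample of bag totals and return its 95% t-interval."""
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N_BAGS)]
    mean = sum(sample) / N_BAGS
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (N_BAGS - 1))
    margin = T_CRIT * sd / math.sqrt(N_BAGS)
    return mean - margin, mean + margin


def coverage(trials=2000, seed=1):
    """Return the fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        low, high = interval_for_sample(rng)
        if low <= TRUE_MEAN <= high:
            hits += 1
    return hits / trials


print(f"share of intervals containing the true mean: {coverage():.3f}")
```

The printed share should land close to 0.95, which is the sense in which the interval is "95% confident".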


Term Project 4 - Group Portion

Valya P, Melissa T, Jamie G, Samantha B, Jordan G

Given values:
n = 5650 (total candies in the whole sample)
n = 94 (total number of bags in the sample)
$\hat{p} = 1189/5650$ (proportion of yellow candies)
$\bar{x} = 60.1$
s = 5.6

Conditions:
$n\hat{p}(1-\hat{p}) \ge 10$: $5650\,(1189/5650)(1 - 1189/5650) = 938.78$
$n \le 0.05N$: $5650 \cdot \frac{100}{5} = 113{,}000 < N$, the number of all candies in the population. Therefore, $5650 \le 0.05N$.
n = 94 is a large sample that is also from a normally distributed population.

1. 99% confidence interval estimate for the population proportion of yellow candies.

Confidence level = $(1-\alpha)\cdot 100\%$, so $(1-\alpha) = .99$, $\alpha = .01$, $\alpha/2 = .005$, $z_{.005} = 2.575$

Lower and upper bounds = $\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

$$\frac{\hat{p}(1-\hat{p})}{n} = \frac{(1189/5650)(1 - 1189/5650)}{5650} = 2.940821972 \times 10^{-5}$$

$$E \text{ (margin of error)} = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

$$\hat{p} \pm E = \frac{1189}{5650} \pm 2.575\sqrt{2.940821972 \times 10^{-5}}$$

( .196 , .224 )

We are 99% confident that the population proportion of yellow candies is between .196 and .224.
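
The proportion interval can be checked numerically. This is a minimal sketch using only the counts and confidence level given in the source; the variable names are illustrative. It computes the z critical value with the standard library instead of a table, so it uses about 2.576 where the group used the table value 2.575, and the bounds agree to three decimals.

```python
import math
from statistics import NormalDist

yellow, total = 1189, 5650     # class counts from Part 1
confidence = 0.99

p_hat = yellow / total
alpha = 1 - confidence
# Standard normal critical value z_{alpha/2}; about 2.576 (table value 2.575).
z = NormalDist().inv_cdf(1 - alpha / 2)

# Condition from the group portion: n * p_hat * (1 - p_hat) must be at least 10.
condition_value = total * p_hat * (1 - p_hat)

std_error = math.sqrt(p_hat * (1 - p_hat) / total)
margin = z * std_error
low, high = p_hat - margin, p_hat + margin

print(f"n*p*(1-p) = {condition_value:.2f}")
print(f"99% CI for the yellow proportion: ({low:.3f}, {high:.3f})")
```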

2. 95% confidence interval estimate for the population mean number of candies per bag.

Confidence level = $(1-\alpha)\cdot 100\%$, so $(1-\alpha) = .95$, $\alpha = .05$, $\alpha/2 = .025$, $t_{\alpha/2} = 1.985801768$

Lower and upper bounds = $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}}$

$$E \text{ (margin of error)} = t_{\alpha/2}\frac{s}{\sqrt{n}}$$

$$\bar{x} \pm E = 60.1 \pm 1.985801768 \cdot \frac{5.6}{\sqrt{94}}$$

( 58.95 , 61.25 )

We are 95% confident that the population mean number of candies per bag falls between 58.95 and 61.25.
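
The same arithmetic for the mean can be sketched in a few lines. The inputs are the sample statistics and the t critical value exactly as the group reports them; the critical value is taken as given rather than recomputed, and the variable names are illustrative.

```python
import math

x_bar, s, n = 60.1, 5.6, 94     # sample mean, standard deviation, number of bags
t_crit = 1.985801768            # t critical value as reported by the group

margin = t_crit * s / math.sqrt(n)          # E = t * s / sqrt(n)
low, high = x_bar - margin, x_bar + margin  # x_bar - E, x_bar + E

print(f"margin of error: {margin:.3f}")
print(f"95% CI for the mean candies per bag: ({low:.2f}, {high:.2f})")
```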

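3. 98% confidence interval estimate for the population standard deviation of the number of candies per bag.

Confidence level = $(1-\alpha)\cdot 100\%$, so $(1-\alpha) = .98$, $\alpha = .02$, $\alpha/2 = .01$

$\chi^2_R = \chi^2_{\alpha/2} = \chi^2_{.01} = 124.116$
$\chi^2_L = \chi^2_{1-\alpha/2} = \chi^2_{1-.01} = \chi^2_{.99} = 61.754$

Lower and upper bounds:

$$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$$

$$\sqrt{\frac{93(31.36)}{124.116}} < \sigma < \sqrt{\frac{93(31.36)}{61.754}}$$

$$\sqrt{\frac{2916.48}{124.116}} < \sigma < \sqrt{\frac{2916.48}{61.754}}$$

( 4.847, 6.872 )

We are 98% confident that the population standard deviation of the number of candies per bag is between 4.847 and 6.872.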
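
The bounds for the standard deviation can be verified the same way. This sketch takes the two chi-square critical values as reported by the group, rather than recomputing them, and applies the square root that turns the variance bounds into standard deviation bounds; the variable names are illustrative.

```python
import math

n, s = 94, 5.6                   # number of bags and sample standard deviation
chi2_right = 124.116             # chi-square critical value for alpha/2 = .01, as reported
chi2_left = 61.754               # chi-square critical value for 1 - alpha/2 = .99, as reported

numerator = (n - 1) * s ** 2     # (n - 1) * s^2 = 93 * 31.36

# Dividing by the larger critical value gives the lower bound, and vice versa.
low = math.sqrt(numerator / chi2_right)
high = math.sqrt(numerator / chi2_left)

print(f"(n - 1) * s^2 = {numerator:.2f}")
print(f"98% CI for the standard deviation: ({low:.3f}, {high:.3f})")
```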

After concluding the analysis of a Skittles bag, both individually and as a group, we were able to effectively apply
the statistical analysis learned throughout the summer course. Although a Skittles bag may seem like a very simple thing to
analyze, many companies use statistics to provide the best products and get the best profit. Through this assignment I have learned
how to use statistics in a real-world situation. I can comfortably calculate 5-number summaries, create meaningful graphs and work on
detailed statistics. I feel that this class has benefited me more than I would have expected. Thank you.
