
Ibrahim Mohamed

Professor Ping Yu
Math 1040 Statistics
Skittles term Project
Introduction:
For this project, our professor asked each student to buy a bag of Skittles and email him the
number of candies of each color in the bag. After everyone submitted their results, we combined
the data and made several charts to describe it. The first chart is a pie chart showing how many
candies of each color appear in the class data. The second chart is a pie chart of my own bag,
for comparison with the class. The third chart is a Pareto chart comparing the amount and
cumulative amount of Skittles per color. The fourth chart is a bar graph of candies per bag,
showing how the bag totals are distributed. The fifth chart is a box-and-whisker plot of Skittles
per bag, together with a five-number summary. The last two charts are frequency graphs: one
groups the class's bags by ranges of Skittles per bag, and the other shows my own data for
comparison.
Chart 1: Class Data

Colors    Amount
Yellow    213
Orange    195
Green     211
Purple    210
Red       191
Total     1020

In the pie chart above, we can see that each color makes up nearly the same share of all the
bags combined. I found this quite interesting and did not expect a result like this. I think the
result is consistent with the law of large numbers, which says that as an experiment is repeated
a large number of times, the observed proportions settle toward their true values. If the five
colors are produced in equal amounts, each share should approach 20%, and the pie chart shows
shares between about 19% and 21%. We only did this with 17 bags of Skittles and the shares are
already quite similar, so I would predict that with even more bags the colors would be even more
evenly distributed.
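A small simulation can illustrate the law of large numbers described here. This is only a minimal sketch: it assumes each of the five colors is equally likely, which the class data do not prove, and the function names and the seed are illustrative choices.

```python
import random
from collections import Counter

COLORS = ["Yellow", "Orange", "Green", "Purple", "Red"]

def color_shares(n_candies, seed=0):
    """Draw n_candies colors uniformly at random and return each color's share."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.choice(COLORS) for _ in range(n_candies))
    return {color: counts[color] / n_candies for color in COLORS}

def max_gap_from_even(shares):
    """Largest distance between any color's share and the even share of 1/5."""
    return max(abs(share - 1 / len(COLORS)) for share in shares.values())

small = color_shares(58)        # roughly one bag (assumed equal color odds)
large = color_shares(100_000)   # many bags' worth of candies

print("one bag, largest gap from 20%:   ", max_gap_from_even(small))
print("many bags, largest gap from 20%: ", max_gap_from_even(large))
```

With a large number of simulated candies, every share should sit very close to 20%, while a single simulated bag can stray much further.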
Chart 2: My Data

Colors    Amount
Yellow    13
Orange    14
Green     13
Purple    13
Red       5
Total     58

Comparing the class data to my single bag, it can clearly be seen that my bag is not as evenly
distributed as the class data: red makes up only 5 of my 58 candies. This is a good example of
the law of large numbers, because a single bag varies widely, but as more bags are combined the
color counts even out and become more evenly distributed.
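The comparison between my bag and the class can be put into numbers. The sketch below uses the counts from Chart 1 and Chart 2; the `spread` helper is a hypothetical name, and measuring unevenness as the gap between the largest and smallest share is just one simple choice.

```python
# Counts taken from Chart 1 (class) and Chart 2 (my bag).
class_counts = {"Yellow": 213, "Orange": 195, "Green": 211, "Purple": 210, "Red": 191}
my_counts = {"Yellow": 13, "Orange": 14, "Green": 13, "Purple": 13, "Red": 5}

def shares(counts):
    """Convert raw counts to each color's share of the total."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}

def spread(counts):
    """Gap between the largest and smallest share, in percentage points."""
    s = shares(counts).values()
    return (max(s) - min(s)) * 100

print("class spread:", round(spread(class_counts), 1))   # 2.2
print("my bag spread:", round(spread(my_counts), 1))     # 15.5
```

The class shares differ by only about 2 percentage points, while the shares in my bag differ by more than 15.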
Chart 3: Class Data

Colors    Amount    Cumul. amount    Percent
Yellow    213       213              21%
Orange    195       408              19%
Green     211       619              21%
Purple    210       829              20%
Red       191       1020             19%
Total     1020      1020             100%

Here is the Pareto chart for the class data. The table above shows the percentage of Skittles
of each color. In the graph, the cumulative amount climbs with each color added, while the
individual amounts dip slightly, from 213 for yellow to 191 for red.
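The cumulative column of the Pareto table can be reproduced with a running total. This is a minimal sketch using the class amounts in the order the table lists them; the variable names are illustrative.

```python
from itertools import accumulate

colors = ["Yellow", "Orange", "Green", "Purple", "Red"]
amounts = [213, 195, 211, 210, 191]

total = sum(amounts)
cumulative = list(accumulate(amounts))                          # running total
cumulative_pct = [round(100 * c / total, 1) for c in cumulative]

for color, amount, cum, pct in zip(colors, amounts, cumulative, cumulative_pct):
    print(f"{color:<7}{amount:>5}{cum:>6}{pct:>7}%")
```

A Pareto chart conventionally orders the bars from largest to smallest; sorting `amounts` in descending order before accumulating would give that ordering.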

Chart 4: Class Data
Bag               1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Candies per bag  56 56 57 58 59 59 60 60 60 60 60 61 61 62 63 64 64

Here we have the bar graph of the number of candies in each bag. The bag with the most candies
has 64, while the bag with the least has 56. The middle bag, bag 9, contains 60 Skittles, which
is the median. The first quartile is the median of the eight bags below bag 9; it falls halfway
between bag 4 (58) and bag 5 (59), giving 58.5. The third quartile is the median of the eight
bags above bag 9; it falls halfway between bag 13 (61) and bag 14 (62), giving 61.5.
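The median and quartiles can be checked with a short calculation. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" method, leaving out the middle value when the number of bags is odd, and the function name is hypothetical.

```python
from statistics import median

# Candies per bag for the 17 class bags, from Chart 4.
candies = [56, 56, 57, 58, 59, 59, 60, 60, 60, 60, 60, 61, 61, 62, 63, 64, 64]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]         # values below the median position
    upper = data[(n + 1) // 2:]    # values above the median position
    return min(data), median(lower), median(data), median(upper), max(data)

low, q1, med, q3, high = five_number_summary(candies)
print(low, q1, med, q3, high)   # 56 58.5 60 61.5 64
print("IQR:", q3 - q1)          # IQR: 3.0
```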

Chart 5: Class Data
Mean                  60
Median                60
Mode                  60
Minimum               56
Maximum               64
Standard deviation    2.42
IQR                   3

In this box-and-whisker plot, we can see that the largest value is 64, which is the maximum,
and the smallest value is 56, which is the minimum. The median is 60, since it is the middle
value once the numbers of Skittles in each bag are put in order from least to greatest. Between
the minimum and the median is 58.5, which is the first quartile, and between the median and the
maximum is 61.5, which is the third quartile.
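The mean, mode, and standard deviation in the Chart 5 table can be checked with Python's standard `statistics` module. This is a minimal sketch; it shows both the sample and the population standard deviation, and the reported 2.42 matches the sample version.

```python
from statistics import mean, mode, pstdev, stdev

# Candies per bag for the 17 class bags, from Chart 4.
candies = [56, 56, 57, 58, 59, 59, 60, 60, 60, 60, 60, 61, 61, 62, 63, 64, 64]

print("mean:", mean(candies))                       # 60
print("mode:", mode(candies))                       # 60
print("sample std dev:", round(stdev(candies), 2))  # 2.42 (divides by n - 1)
print("population std dev:", round(pstdev(candies), 2))  # 2.35 (divides by n)
```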

Chart 6: Class Data


Candies per bag    Frequency
56 to 59           6
60 to 63           9
64 to 67           2


The frequency chart for the class data looks roughly like a normal distribution, though slightly
skewed to the right. The graphs show the results I was expecting before I made them. The class
data does not resemble the data from my own single bag, and I did not expect it to: the class
frequencies are larger, and a larger sample usually means the amounts per bag cluster more
closely together.
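The class frequency table can be rebuilt by counting how many bags fall in each range. This is a small sketch using the three ranges from Chart 6; the variable names are illustrative.

```python
# Candies per bag for the 17 class bags, from Chart 4.
candies = [56, 56, 57, 58, 59, 59, 60, 60, 60, 60, 60, 61, 61, 62, 63, 64, 64]

# Inclusive ranges used in Chart 6.
bins = [(56, 59), (60, 63), (64, 67)]

frequency = {
    f"{lo} to {hi}": sum(lo <= c <= hi for c in candies)
    for lo, hi in bins
}

for label, count in frequency.items():
    print(f"{label}: {count}")
```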
Chart 7: My Data
Frequency graph of my bag's amounts, in ranges 5 to 8, 9 to 11, 12 to 14, and 15 to 17.


This is the data from my single bag of Skittles. It does not match or resemble the class data
whatsoever. Mine is skewed to the left, while the class data is skewed to the right and has a
much greater sample size. My data alone cannot support statistical conclusions because the
sample is too small: a single bag compared to 17 bags is minuscule.


Reflection:
Categorical data is data that can be grouped into categories, such as colors or shapes. The
categories themselves are not numbers, but we can count how many items fall into each one.
Quantitative data is data that has a numerical value attached to it, such as the number of
Skittles in a bag or the number of students in a class. In this project, our categorical data
are the colors of the Skittles in each bag, and our quantitative data are the numbers of
Skittles counted, such as the number of candies per bag. The pie chart and Pareto chart make
sense for categorical data because we group the Skittles by color and then display the count or
percentage in each group. It would not make sense to use a histogram or box plot for
categorical data, because those graphs show how numerical values are distributed along a number
line, and colors have no numerical order. For the same reason, the box plot and histogram make
sense for quantitative data but not for categorical data.

Confidence Interval
A confidence interval (CI) is a range (or an interval) of values used to estimate the true
value of a population parameter. The confidence level describes how reliable the method is: if
we repeated the sampling many times, about that percentage of the resulting intervals would
contain the parameter. Any confidence level can be chosen, with 95% and 99% being the most
common. A confidence interval gives us a much better sense of how good an estimate is than a
single point estimate does.
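As an illustration, a 95% confidence interval for the proportion of yellow Skittles can be computed from the class totals (213 yellow out of 1020). This is a sketch under stated assumptions: it treats the class's candies as a simple random sample, uses the normal approximation, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Critical value z for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Yellow Skittles in the class data: 213 of 1020 (Chart 1).
low, high = proportion_ci(213, 1020)
print(f"95% CI for the yellow proportion: {low:.3f} to {high:.3f}")
```

With these counts the interval runs from roughly 18% to 23%, which includes the 20% expected if the colors were evenly mixed.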

Hypothesis Test
Hypothesis testing refers to the formal procedures used in statistical analysis to reject or
fail to reject statistical hypotheses. A statistical hypothesis is a claim about a population
parameter. This claim may or may not be true. A hypothesis test is a statistical test that uses
a sample of data to determine whether there is enough evidence against the null hypothesis to
support the alternative for the entire population. A hypothesis test examines two opposing
hypotheses about a population: the null hypothesis and the alternative hypothesis.
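A one-proportion z-test shows how the null and alternative hypotheses are used. The claim tested here, that 20% of all Skittles are red, is a hypothetical example chosen for illustration and is not taken from the project; the function name and the 0.05 significance level are also assumptions.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 against H1: p != p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)             # standard error assuming H0 is true
    z = (p_hat - p0) / se
    p_value = 2 * NormalDist().cdf(-abs(z))  # area in both tails
    return z, p_value

# Red Skittles in the class data: 191 of 1020 (Chart 1).
z, p_value = one_proportion_z_test(191, 1020, p0=0.20)

alpha = 0.05  # assumed significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.3f}, decision: {decision}")
```

Here the p-value is well above 0.05, so the sample does not give enough evidence to reject the claim that 20% of Skittles are red.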

Reflection

There are three conditions for a confidence interval for estimating a population proportion:
The sample is a simple random sample.
The conditions for a binomial distribution are satisfied: a fixed number of independent
trials, two categories of outcomes, and a constant probability for each trial.
There are at least 5 successes and at least 5 failures.
Conditions for a confidence interval for estimating a population mean with σ not known:
The sample is a simple random sample.
Either or both of these conditions are satisfied: the population is normally distributed or
n > 30.
Conditions for a confidence interval for estimating a population standard deviation or
variance:
The sample is a simple random sample.
The population must have normally distributed values (even if the sample is large). The
requirement of a normal distribution is much stricter here than in earlier sections, so
departures from normal distributions can result in large errors.
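As an example of the second case, a 95% confidence interval for the mean number of candies per bag can be computed from the 17 class bags. This is a sketch under stated assumptions: because n = 17 is below 30, it relies on candies per bag being roughly normally distributed, and the critical value 2.120 (t distribution, 16 degrees of freedom) is entered by hand from a t table.

```python
from math import sqrt
from statistics import mean, stdev

# Candies per bag for the 17 class bags, from Chart 4.
candies = [56, 56, 57, 58, 59, 59, 60, 60, 60, 60, 60, 61, 61, 62, 63, 64, 64]

# Two-sided 95% critical value of the t distribution with n - 1 = 16 degrees
# of freedom, taken from a standard t table (assumed, not computed here).
T_CRIT_95_DF16 = 2.120

n = len(candies)
x_bar = mean(candies)
s = stdev(candies)                    # sample standard deviation
margin = T_CRIT_95_DF16 * s / sqrt(n)
low, high = x_bar - margin, x_bar + margin

print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```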

What possible errors could have been made by using this data? How could the sampling
method be improved?
Mistakes could have been made in gathering this data. One type of error is recording incorrect
data, which could happen if a person miscounted or wrote down the wrong quantity for a color.
Non-response error could also affect the data. Each person in the class was assigned to buy a
bag of Skittles, but if someone never bought a bag and recorded the data, then we are missing
part of our intended sample.
Reflective writing and E-Portfolio
This class has taught me many things, from how charts can be deceiving if they are not
presented properly to how to solve complicated statistics problems. I found this class very
interesting. Every day we learned new concepts, and I got more and more excited after every
class. The workload wasn't too bad; it was just the right amount to keep you from getting bored.
The professor was very good at teaching us how things worked and made sure we understood the
concepts and how to solve problems. If someone did not understand how he got an answer, he
would do example after example and slowly go through it step by step until that person
understood how to do the work.
This project showed me how statistics can apply to unusual things such as Skittles. I never
thought that something as simple as Skittles could turn into our term project. This project has
shown me how we can apply statistics to everyday things in life. I always used to ask, "When
are we going to use this in real life? Do I really need to learn this?" All of my questions
about using this in real life have been answered by this project. Overall, this project had me
thinking and wondering what else statistics could be applied to in real life. It was truly an
eye-opening experience.
