
Color               My Bag   Class Count
Number of Red           17           234
Number of Orange        15           243
Number of Yellow        12           235
Number of Green          9           258
Number of Purple        16           228
Total                   69          1198

The graphs mostly reflect what I expected to see. I expected one of two things to
happen. Either there would be roughly 20% of each color, since five colors distributed
evenly would give 20% each, or one color would be a lot higher than the rest, because
every time I get a bag of Skittles I feel like I always get all purple. Even though in
theory I thought the colors might be evenly distributed, it was a pleasant surprise to
find that they are. I did not really notice any outliers in the class data. The
distribution from the class does not quite match up with my individual bag of candies,
though. In my bag about 24.64% were red, whereas the class had about 19.53%. For orange
I had 21.74% and the class had 20.28%, which was pretty close. For yellow I had about
17.39% and the class had 19.62%. For green I had about 13.04% and the class had a much
higher percent at 21.54%. For purple I was all the way up to about 23.19%, while the
class was at a lower percent of about 19.03%. I had more reds, oranges, and purples,
and fewer yellows and greens, than the rest of the class.
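
The percentages above can be re-derived from the counts in the table. This is a minimal Python sketch for checking the arithmetic, not the method the class used (we worked in StatCrunch); the names `percentages`, `my_pct`, and `class_pct` are my own.

```python
# Counts copied from the table at the top of this report.
my_bag = {"Red": 17, "Orange": 15, "Yellow": 12, "Green": 9, "Purple": 16}
class_counts = {"Red": 234, "Orange": 243, "Yellow": 235, "Green": 258, "Purple": 228}


def percentages(counts):
    """Return each color's share of the total as a percentage, rounded to 2 places."""
    total = sum(counts.values())
    return {color: round(100 * n / total, 2) for color, n in counts.items()}


my_pct = percentages(my_bag)
class_pct = percentages(class_counts)

# Print bag vs. class side by side, with the difference in percentage points.
for color in my_bag:
    diff = my_pct[color] - class_pct[color]
    print(f"{color:<7} bag {my_pct[color]:6.2f}%  class {class_pct[color]:6.2f}%  diff {diff:+6.2f}")
```

A positive difference means my bag had a larger share of that color than the class sample did.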

1. Determine the proportion of each color within the overall sample gathered by the class.

FIRST: Guess! What do you expect the proportions to be? Why?


Now open the data set and compute the proportions of Red, Orange, Yellow, Green, and Purple
candies in the class data set. Note that the sample size is the total number of candies collected
by the class.
We all agreed that the colors would be evenly distributed: because there are five colors, we think
each color will make up about 20% of the sample. We calculated it, and the actual
proportions are: Purple 19.03%; Red 19.53%; Yellow 19.62%; Orange 20.28%; and Green 21.54%.

2. In StatCrunch, create a pie chart and a Pareto chart for the total number of candies of each
color in our class data set. Submit copies of your graphs in this report.

Pie Chart:

Pareto Chart:

3. Does the class data represent a random sample? What would the population be? Collaborate
to discuss sampling and our data in a paragraph or two.


Yes, the class data represents a random sample because each bag of Skittles was chosen randomly; all
of the bags chosen by the class make up the sample. With that, the population would be all of the
2.17 ounce bags of Skittles. Our data was as we expected it to be, which was roughly uniformly
distributed: all five colors were around 20%. However, most bags of Skittles had proportions of one or
two colors that were either very low or very high, or occasionally both, but the total proportions of the
sample evened out.

1. The distribution of the total candies in each bag was skewed left. I was expecting it
to be symmetrical before seeing the numbers, but after seeing the numbers I
guessed it would be skewed left because of the low value of 38, where the
rest of the bags were closer to the 60s. The data from the class is similar
to my own bag of candies, although I was on the higher end of the data: I had
69 in my bag. There were 20 bags collected from the class, and they ranged
from 38 to 69. If this were a box plot, my bag would definitely have been
one of the outliers. The median of the class was 61, so my data was on the
high end for the class.
2. The difference between categorical and quantitative data is that quantitative
data are things that can be measured or counted, such as length in inches, number of
questions on an exam, weight in lbs, or time in minutes. Categorical data, on the
other hand, are not numerical measurements; they place each observation into a
group, such as gender, model of car, or pass or fail. Since categorical data
is grouped into categories, bar graphs, line graphs, and pie charts are great at
displaying the data. They are easy to understand because of the clearly
indicated groups and the titles on the x- and y-axes. They also make it easy for
the reader to understand the proportions of the different groups that are being
compared.
Graphs best used for quantitative data are stem and leaf plots, histograms, and
box plots. Stem and leaf plots are useful because they help show the shape of the
distribution and organize the numbers, which gives a good overall impression
of the data. Histograms are a great way to show
quantitative data because they also show the distribution of the observations,
based on frequencies and intervals. Box plots are good for quantitative data
because they show how the data is spread out by using the 5-number
summary: the min, Q1, the median, Q3, and the max. In a box plot there is a
rectangular box which represents the middle half of the data, between Q1 and
Q3. Whiskers extend from the box toward the min and the max, and points that
fall beyond the whiskers are the outliers.
The data you would collect for categorical data would be much different: you
could group cars by color, make, year, or type (i.e. SUV,
truck, etc.). The data you would collect for quantitative data would be
numbers, such as the number of people in a movie theater on a Friday night or the
number of cars in a parking lot. These are meaningful measures that can be
counted.
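
The box plot description and the remark about my bag being an outlier can be checked with the common 1.5 × IQR rule. This is a minimal sketch, assuming that rule (the report does not say which outlier rule was used) and using the 5-number summary reported in Term Project Part 3; the names `lower_fence`, `upper_fence`, and `is_outlier` are my own.

```python
# 5-number summary of candies per bag, as reported in Term Project Part 3.
minimum, q1, median, q3, maximum = 38, 59, 61, 62, 69

iqr = q3 - q1                    # interquartile range: width of the box
lower_fence = q1 - 1.5 * iqr     # values below this are flagged as outliers
upper_fence = q3 + 1.5 * iqr     # values above this are flagged as outliers


def is_outlier(value):
    """True if value falls outside the 1.5 x IQR fences (assumed rule)."""
    return value < lower_fence or value > upper_fence


for bag_total in (minimum, median, maximum):
    print(bag_total, "outlier" if is_outlier(bag_total) else "not an outlier")
```

Under this rule both the smallest bag (38) and my bag (69) fall outside the fences, while the median bag does not.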

Term Project Part 3


Group 3: Michelle Alvey, Ashley Donnelly, Andrew Gibbons
1. Computations for Total Candies in Each Bag:
(a) mean number of candies per bag: 59.9
(b) standard deviation of the number of candies per bag: 6
(c) 5-number summary for the number of candies per bag: min 38, Q1 59, median 61, Q3 62, max 69
2. Frequency histogram:

3. Boxplot:

Confidence Interval
A confidence interval is a range of values, calculated from sample data, that is
likely to contain an unknown population parameter. It is an observed interval, so
it changes from sample to sample. The margin of error is the amount of error that
is allowed on either side of the sample estimate. The confidence level, given as a
percentage, describes how confident we are that the true value of the parameter is
in the interval.

1. p = (0.167, 0.226)
2. μ = (58, 63)
3. σ = (4.35 [5 candies], 9.48 [10 candies])

4.
For #1, the 99% confidence interval for the population proportion, we took the
proportion of yellow candies among all of the candies our class sampled and plugged
those numbers (235/1198) into my calculator in the 1-proportion Z-Interval function. The
interval that came out, to me, means that I have 99% confidence that the
proportion of yellow candies in the parent population is between 16.7% and 22.6%.
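
The interval for #1 can be reproduced by hand. This is a minimal sketch, assuming the calculator's 1-proportion Z-Interval uses the standard normal-approximation formula p-hat ± z* × sqrt(p-hat(1 − p-hat)/n); the function name `one_proportion_z_interval` is my own.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 235 yellow candies out of 1198 in the class sample, 99% confidence.
prop_low, prop_high = one_proportion_z_interval(235, 1198, 0.99)
print(f"99% CI for the proportion of yellow: ({prop_low:.3f}, {prop_high:.3f})")
```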
For #2, the 95% confidence interval for the population mean, we took the totals
from each bag and plugged those totals into the T-Interval function in the
calculator. The results illustrate that I have 95% confidence that the mean number
of candies in each 2.17 ounce bag sold (the parent population) is between 58
and 63 candies.
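
As a rough check on #2, the t-interval formula can be applied to the summary values reported in the computations section (mean 59.9, standard deviation 6, 20 bags). This is only a sketch: the critical value 2.093 is a t-table lookup for 95% confidence with 19 degrees of freedom that I entered by hand, and because the inputs are rounded summaries rather than the raw bag totals, the endpoints (the lower one in particular) do not land exactly on the rounded interval reported above.

```python
from math import sqrt

n = 20           # bags collected by the class
mean = 59.9      # reported sample mean (rounded)
s = 6.0          # reported sample standard deviation (rounded)
t_star = 2.093   # t-table value for 95% confidence, df = n - 1 = 19 (hand-entered)

# Margin of error for a one-sample t-interval.
margin = t_star * s / sqrt(n)
mean_low, mean_high = mean - margin, mean + margin
print(f"95% CI for the mean candies per bag: ({mean_low:.1f}, {mean_high:.1f})")
```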
For #3, the 98% confidence interval for the population standard deviation of candies
per bag, we took the candies-per-bag totals and, using the right-skewed chi-square
distribution that applies to standard deviations, plugged the sample size, the sample
standard deviation, and the confidence percentage into the equation. From the results, I
can say I am 98% confident that the parent population standard deviation of
candies from each 2.17 ounce bag is between 4.35 (5) and 9.48 (10) candies.
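
Intervals for a standard deviation are usually built from the chi-square distribution, which is right skewed. The sketch below applies that formula to the rounded summary values from the computations section (20 bags, standard deviation 6). The two critical values are chi-square table lookups for 19 degrees of freedom that I entered by hand, so the result should be close to, but not necessarily identical with, the interval reported above.

```python
from math import sqrt

n = 20               # bags collected by the class
s = 6.0              # reported sample standard deviation (rounded)
df = n - 1           # degrees of freedom

# Chi-square table values for df = 19 and 98% confidence (hand-entered):
chi2_right = 36.191  # leaves 1% of the area in the right tail
chi2_left = 7.633    # leaves 1% of the area in the left tail

# The larger critical value gives the lower endpoint, and vice versa.
sd_low = sqrt(df * s**2 / chi2_right)
sd_high = sqrt(df * s**2 / chi2_left)
print(f"98% CI for the standard deviation: ({sd_low:.2f}, {sd_high:.2f})")
```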
We started this exercise by checking whether our data are approximately normally
distributed. On a graph on my calculator, it looked normal, and np(1 - p) ≥ 10; also,
our sample is certainly less than 5% of the parent population.
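
The condition np(1 − p) ≥ 10 for the proportion interval can be checked directly from the yellow count. This is a small sketch using the class totals (235 yellow out of 1198); the variable names are my own.

```python
n = 1198             # candies in the class sample
p_hat = 235 / n      # sample proportion of yellow candies

# Requirement for using the normal approximation with a proportion.
condition_value = n * p_hat * (1 - p_hat)
normal_ok = condition_value >= 10
print(f"n * p * (1 - p) = {condition_value:.1f}, condition met: {normal_ok}")
```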

Math 1040 Reflection


Michelle Alvey
In this course we were required to do a statistical team project based on Skittles obtained
by the class. In this project I learned how to apply statistics to everyday problems. In part
one we mostly gathered the data and learned the different ways to collect data. This was a hard
group project for me because I had never done a group project online, so it took some time to
adjust to communicating and making sure that all parts got done on time without sitting next to
the person every day in class. I feel like this helped my communication and time management
skills. In order to work with a group online, it was very important not to procrastinate so that
everyone had a fair chance to participate.
In part two we made a pie chart and a Pareto chart of the data obtained. These graphs showed us
potential surprises and outliers in the class's data. They also showed how the data were
distributed. This will help me in future classes to have a better understanding
of how outliers affect the data, and of how to make and understand graphs in general. It also
showed how the class's data as a whole varied from our individual data, which taught me about
sample sizes and how samples can vary.
In part three we learned about the shapes of distributions and how they can change. We also
learned how to make graphs for the different types of data and what to calculate for categorical
and quantitative data.
After doing this project I feel more confident in my problem-solving skills and my ability
to use math in real-life applications. I now know how to compute my GPA instead of looking it
up all the time, I know how to group gathered data more effectively, and my
graphing skills have improved. I have also been able to make more sense of the medical journals I
have read.
