
 

PART 2 
 
1. What proportion (or percentage) of the Skittles do you expect to see of each 
color? Why? 
 
I expected the proportions to be equal (20% for each color), because it
seems reasonable that every bag would contain roughly equal amounts of each
color. When I looked at the class proportions, most colors were close to 20%,
but green came in at 18.64%, more than a percentage point below each of the
others.
 
                      Red      Orange   Yellow   Green    Purple
Expected Proportion   20%      20%      20%      20%      20%
Observed Proportion   20.06%   20.30%   21.11%   18.64%   19.90%
 
 
2. In StatCrunch, create a pie chart and a Pareto chart for the total 
number of candies of each color in our class data set. Submit copies of 
your graphs in this report. 

 
 
 
 
   
3. Does the class data represent a random sample? What would the 
population be? Collaborate to discuss sampling and our data in a 
paragraph or two. Think carefully about the definition of random sample 
when you work on your response. 
I think that the class data does represent a random sample. The
population could be defined as narrowly as the Math 1040 students at SLCC,
more broadly as all students at SLCC, or even as the whole world. Whatever
the population is, the class data is a random sample because chance
determined which bags of Skittles we picked from across the Salt Lake area.
 
4. Create a table that displays the proportions by color and the total count 
from your own bag of candies together with the proportions by color and 
total count for the entire class sample 
 
               Proportion     Proportion     Proportion     Proportion     Proportion     Total
               Red            Orange         Yellow         Green          Purple         Count
My Bag         12, 20%        13, 21.67%     12, 20%        13, 21.67%     10, 16.67%     60
Class Counts   1340, 20.06%   1356, 20.30%   1410, 21.11%   1245, 18.64%   1329, 19.90%   6,680
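The proportions in this table can be recomputed directly from the counts. The following is a minimal sketch, not part of the original StatCrunch work; the variable and function names are made up for illustration, and the counts are the ones reported in the table.

```python
# Color counts as reported in the table above.
my_bag = {"Red": 12, "Orange": 13, "Yellow": 12, "Green": 13, "Purple": 10}
class_counts = {"Red": 1340, "Orange": 1356, "Yellow": 1410,
                "Green": 1245, "Purple": 1329}


def proportions(counts):
    """Return each color's share of the total as a percentage."""
    total = sum(counts.values())
    return {color: 100 * n / total for color, n in counts.items()}


bag_pct = proportions(my_bag)
class_pct = proportions(class_counts)

# Difference in percentage points (my bag minus class) for each color.
diff = {color: bag_pct[color] - class_pct[color] for color in my_bag}

for color in my_bag:
    print(f"{color:<7} bag {bag_pct[color]:6.2f}%  "
          f"class {class_pct[color]:6.2f}%  diff {diff[color]:+6.2f}")
```

Looking at the differences in percentage points, rather than at the raw counts, makes a 60-candy bag directly comparable with the 6,680-candy class sample.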
 
 
5. Write a well-thought-out paragraph discussing your observations of this
data. Respond to the following prompts: 
● Do the graphs reflect what you expected to see? Are there any 
surprises? 
● Are there any observations that appear to be outliers? If so, what 
impact might they have on graphics and summary statistics? 
● Does the distribution of colors in the total class data match with 
your own data from your single bag of candies or are they different? 
There are definite differences between my bag of Skittles and the
class’s. The surprises, I would say, were the differences in the greens
and the purples. My bag had a noticeably smaller share of purple than the
class’s combined data, and purple was also the least common color in my bag.
Most colors in my bag had counts of 12-13, while purple had only 10, a
difference of about 3-5 percentage points.
There were a few outliers in our combined list, but I don’t think they
would have made much of a difference in such a large data set.
My data and the class data did not match exactly, but the differences
are not very big, only about 0-3 percentage points for each color.
 
PART 3 
 
1. Using the total number of candies in each bag in our class sample, 
compute the following measures for the variable “Total candies in each 
bag”: 
A. mean number of candies per bag 

Mean = 60.2

B. standard deviation of the number of candies per bag 

Standard Deviation = 7 

C. 5-number summary for the number of candies per bag 

Min = 35 Q1 = 58 Q2 (Median) = 59 Q3 = 61 Max = 97 
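The same measures can be computed outside StatCrunch. The sketch below is only an illustration: the list `totals` is hypothetical (the real class data set is not reproduced in this report), and the quartile method chosen here is an assumption that may not match the one StatCrunch uses.

```python
import statistics

# Hypothetical bag totals for illustration only; the real class data set
# is not reproduced in this report.
totals = [35, 57, 58, 58, 59, 59, 59, 60, 60, 61, 61, 62, 97]

mean = statistics.mean(totals)
sd = statistics.stdev(totals)  # sample standard deviation (divides by n - 1)

# Quartiles by the "inclusive" method; other software may use other rules.
q1, median, q3 = statistics.quantiles(totals, n=4, method="inclusive")

five_number = (min(totals), q1, median, q3, max(totals))
print(f"mean = {mean:.1f}, sd = {sd:.1f}")
print("five-number summary:", five_number)
```

Different quartile conventions can shift Q1 and Q3 slightly, so small disagreements between tools are expected.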

2. Create a frequency histogram for the variable “Total candies in each
bag”.

 
 

3. Create a boxplot for the variable “Total candies in each bag”. 

 
 

4. Write a well written and thoughtful paragraph discussing your findings
about the variable “Total candies in each bag”. Address the following in
your writing: What is the shape of the distribution? Do the graphs 
reflect what you expected to see? Does the overall data collected by 
the whole class agree with your own data from a single bag of candies?  

At first glance the distribution looks bell shaped, but the box plot
shows that it is skewed somewhat to the right. The graphs do reflect what I
expected to see, in that most of the class data is similar, so most values
fall within a narrow range. We did have some outliers, though, as the box
plot shows, and those fall outside the boundaries.
I think that the data of the whole class agrees with my single bag of
candies, because my bag contained 60 candies in total, which is very close
to the class median of 59.
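The outlier boundaries mentioned above can be checked against the five-number summary from question 1. This is a rough sketch using the common 1.5 × IQR rule; that the box plot was drawn with this rule is an assumption, not something stated in the report.

```python
# Five-number summary reported for "Total candies in each bag".
minimum, q1, median, q3, maximum = 35, 58, 59, 61, 97

iqr = q3 - q1                  # interquartile range
lower_fence = q1 - 1.5 * iqr   # values below this are flagged as outliers
upper_fence = q3 + 1.5 * iqr   # values above this are flagged as outliers

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print("minimum is an outlier:", minimum < lower_fence)
print("maximum is an outlier:", maximum > upper_fence)
```

Under this rule both the smallest bag (35) and the largest bag (97) fall outside the fences.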

5. In a half page, explain the difference between categorical and
quantitative data. Address the following in your writing: What types of
graphs make sense and what types of graphs do not make sense for 
categorical data? For quantitative data? Explain why. What types of 
calculations make sense and what types of calculations do not make 
sense for categorical data? For quantitative data? Explain why. 

The difference between quantitative and categorical (also known as
qualitative) data is that quantitative data are numerical values, such as
counts or measurements. Categorical data are descriptions: things that can
be put in categories, placed in an order, or given an attribute. Zip codes
are categorical variables even though they are written with digits, but if
you are counting how many zip codes there are, you are using quantitative
data.

The graphs that go well with categorical data are bar graphs and pie
charts. These types of graphs give an understandable representation of how
the data are split among categories. Graphs such as stem-and-leaf plots,
histograms, and box plots work well with quantitative data, because they
show how the numerical values are distributed.

The calculations that work well with quantitative data are those that
produce the mean, median, mode, standard deviation, range, IQR, etc. The
calculations that work well with categorical data are counts and
proportions, which help us see how large a category is in comparison to the
population or to the other categories being studied at the time.
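To make the contrast concrete, here is a small sketch of which calculations fit each kind of data. Both lists are hypothetical and exist only for illustration.

```python
from collections import Counter
import statistics

# Hypothetical data for illustration only.
colors = ["Red", "Green", "Red", "Purple", "Yellow", "Red", "Orange", "Green"]
bag_totals = [58, 59, 60, 61, 62]

# Categorical data: counting and proportions are meaningful.
counts = Counter(colors)
shares = {color: n / len(colors) for color, n in counts.items()}
print(counts.most_common(1), shares)

# Quantitative data: arithmetic summaries are meaningful.
print(statistics.mean(bag_totals), statistics.median(bag_totals))

# Averaging color labels has no meaning, and Python refuses to do it.
try:
    statistics.mean(colors)
except TypeError as err:
    print("cannot average categories:", err)
```

The failed average at the end mirrors the point in the text: a mean of color names is not just unhelpful, it is undefined.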

PART 4 

 
1. Construct a 99% confidence interval estimate for the population 
proportion of yellow candies. 

n = 6680   x = 1410   p̂ = 0.211

1-PropZInt => ( 0.1982, 0.2239 )

Based on calculations from this data, we can be 99% confident that
the interval between 0.1982 and 0.2239 actually contains the true value of
the population proportion p of yellow candies.
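The interval can be reproduced by hand with the normal-approximation formula p̂ ± z·sqrt(p̂(1 − p̂)/n). The sketch below assumes this is the formula behind 1-PropZInt; the function name `prop_z_interval` is made up for illustration. It evaluates both the 95% and 99% levels so the two widths can be compared.

```python
from math import sqrt
from statistics import NormalDist

n, x = 6680, 1410   # class totals: all candies, yellow candies
p_hat = x / n       # sample proportion of yellow candies


def prop_z_interval(p_hat, n, level):
    """Normal-approximation interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


for level in (0.95, 0.99):
    low, high = prop_z_interval(p_hat, n, level)
    print(f"{level:.0%} interval: ({low:.4f}, {high:.4f})")
```

A higher confidence level uses a larger critical value, so the 99% interval is always wider than the 95% interval built from the same data.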

2. Construct a 95% confidence interval estimate for the population mean
number of candies per bag.

n = 111   x̄ = 60.2   s = 7

TInterval => ( 58.883, 61.517 )  

Based on calculations from this data, we can be 95% confident that
the interval from 58.883 to 61.517 actually contains the true value of μ (the
mean number of candies per bag).
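The TInterval result can be checked with the formula x̄ ± t·s/√n. This is a minimal sketch: the critical value 1.9818 for 110 degrees of freedom is an assumption taken from a t-table, since the Python standard library has no t-distribution.

```python
from math import sqrt

n = 111        # number of bags in the class sample
x_bar = 60.2   # sample mean number of candies per bag
s = 7          # sample standard deviation

# Assumption: t critical value for 95% confidence with n - 1 = 110 degrees
# of freedom, read from a t-table (approximately 1.9818).
t_crit = 1.9818

margin = t_crit * s / sqrt(n)
low, high = x_bar - margin, x_bar + margin
print(f"95% t-interval: ({low:.3f}, {high:.3f})")
```

With this many bags the t critical value is already close to the normal value of 1.96, so a z-interval would give nearly the same answer.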

3. In a well written and thoughtful paragraph, explain in general the
purpose and meaning of a confidence interval.

A confidence interval is a range of values, derived from sample
statistics, that is likely to contain the value of an unknown population
parameter. Because samples are random, it is unlikely that two samples
from a given population will produce identical confidence intervals. But if
you repeated the sampling many times, a certain percentage of the resulting
confidence intervals would contain the unknown population parameter. That
percentage is the confidence level of the interval.
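The repeated-sampling idea can be illustrated with a short simulation. Everything in this sketch is hypothetical: the "true" proportion of 0.20, the sample size, and the number of trials are chosen only for illustration and are not taken from the class data.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1040)   # fixed seed so the run is repeatable
TRUE_P = 0.20       # hypothetical "true" proportion, for illustration only
N, TRIALS, LEVEL = 400, 1000, 0.95
z = NormalDist().inv_cdf(1 - (1 - LEVEL) / 2)

hits = 0
for _ in range(TRIALS):
    # Draw one sample of N candies and count the color of interest.
    x = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = x / N
    margin = z * sqrt(p_hat * (1 - p_hat) / N)
    # Does this sample's interval capture the true proportion?
    if p_hat - margin <= TRUE_P <= p_hat + margin:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the {TRIALS} intervals contain the true proportion")
```

Each simulated sample gives a different interval, but the share of intervals that capture the true proportion should land near the chosen confidence level.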
