Angelica Langford

Skittles Term Project

Question: How many Skittles are in each bag of Skittles in the world?

In this project we are trying to show the benefit of using Excel to create various charts and tables for statistics. We have a class of 25 people, and each of us bought a bag of Skittles. We counted each of the colors and made a pie chart, a Pareto chart, a five-number summary, a boxplot, and a histogram from the class's data on the colors and numbers of Skittles. We will end with multiple confidence intervals and hypothesis tests.

Data for the class

Each color column gives the number of candies of that color in the bag.

 #  Name       Red  Orange  Yellow  Green  Purple  Total
 1  Karen       10      14       8     10      18     60
 2  Shalon      19      11       9      9      12     60
 3  Samuel      11      14      10     15      10     60
 4  Leslie      15      17       7     15       8     62
 5  Dialma      14       5       9     18      15     61
 6  Maria       13      11       8      5      21     58
 7  Margrethe   10       9      19     11      12     61
 8  Haley       14      19      12      8       9     62
 9  Rupa        12      11      15      9       8     55
10  Heather     10      16      12     11      10     59
11  Allie       13       8      12     12      13     58
12  Brad        19      14       5     13       9     60
13  Bridgette   13      12      18     11       7     61
14  Milene      10       9      12     13      18     62
15  Jameson      6      14      11     17      14     62
16  Jessica      9      19      11      9      15     63
17  Marie        9      11      18     16       5     59
18  Emmoly      10      13      16     11      10     60
19  Cole        12      12       8     14      15     61
20  Angelica     8      13      13     15      13     62
21  Jessica     10      12       9     11      20     62
22  Eli         14      10      18      5      15     62
23  Dallin       5       9      20     11      15     60
24  Adam        12      15       7     14      14     62
25  Nate        14       9      12     14      11     60
    TOTAL      292     307     299    297     317   1512

Data for me alone

 #  Name       Red  Orange  Yellow  Green  Purple  Total
20  Angelica     8      13      13     15      13     62

Pie Charts

[Figure: pie chart "Class Skittles Data", with one slice each for the number of red, orange, yellow, green, and purple candies]

Pareto Charts

[Figure: Pareto chart "Class Skittles Data", with one bar each for the number of red, orange, yellow, green, and purple candies; vertical axis 0 to 350]
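The ordering behind a Pareto chart can be rechecked from the TOTAL row of the class data. The following is a minimal Python sketch rather than the Excel workbook used for the project; the variable names are illustrative, and the cumulative share column is an addition that Pareto charts often include but that the chart above may not have shown.

```python
# Class totals per color, copied from the TOTAL row of the class data table.
color_totals = {
    "red": 292,
    "orange": 307,
    "yellow": 299,
    "green": 297,
    "purple": 317,
}

grand_total = sum(color_totals.values())

# Pareto order: categories sorted from most frequent to least frequent.
pareto_order = sorted(color_totals.items(), key=lambda item: item[1], reverse=True)

# Build (color, count, share, cumulative share) rows in Pareto order.
pareto_rows = []
cumulative = 0
for color, count in pareto_order:
    cumulative += count
    pareto_rows.append((color, count, count / grand_total, cumulative / grand_total))

for color, count, share, cumulative_share in pareto_rows:
    print(f"{color:<7}{count:>5}{share:>8.1%}{cumulative_share:>8.1%}")
```

The same shares are what the pie chart displays; only the ordering is specific to the Pareto chart.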

[Figure: bar chart "Class Skittles Data for Every Student", showing the number of red, orange, yellow, green, and purple candies for each of the 25 students; vertical axis 0 to 25]

Organizing and Displaying Quantitative Data: the Number of Candies per Bag

Mean: 60.48

Std. Dev: 1.76

5 NUMBER SUMMARY

Min: 55

Q1: 60.5

Median: 61

Q3: 62.5

Maximum: 63

[Figure: Box and Whisker Plot of 5 Number Summary, marking 55, 60.5, 61, 62.5, and 63]

[Figure: 3D Histogram of 5 Number Summary, with bars for Minimum, Q1, Median, Q3, and Maximum; vertical axis 50 to 64]
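The summary statistics can be rechecked directly from the Total column of the class data. The following is a minimal Python sketch, not the Excel workbook used for the project. The variable names are illustrative, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is only one of several common conventions, so its output need not match the reported figures exactly.

```python
import statistics

# Total candies per bag for the 25 students, in the order of the class table.
bag_totals = [60, 60, 60, 62, 61, 58, 61, 62, 55, 59, 58, 60, 61,
              62, 62, 63, 59, 60, 61, 62, 62, 62, 60, 62, 60]

mean = statistics.mean(bag_totals)
std_dev = statistics.stdev(bag_totals)  # sample standard deviation (divides by n - 1)

ordered = sorted(bag_totals)
half = len(ordered) // 2

# Assumed quartile convention: medians of the lower and upper halves,
# leaving out the middle value when the number of bags is odd.
lower_half = ordered[:half]
upper_half = ordered[-half:]

five_number = (
    ordered[0],
    statistics.median(lower_half),
    statistics.median(ordered),
    statistics.median(upper_half),
    ordered[-1],
)

print(f"mean = {mean:.2f}, std dev = {std_dev:.2f}")
print("min, Q1, median, Q3, max =", five_number)
```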

Graph Questions:

What is the shape of the distribution? The shape is approximately normal.

Do the graphs reflect what you expected to see? Yes, I expected to see a normal distribution, with everyone having about the same number of candies in their bags.

Does the overall data collected by the whole class agree with your own data from a single bag of candies? Yes. The class data was quite similar to mine.

In addition to the summary statistics and boxplot, include the number of candies from your own bag and the total number of bags in the sample. I had 62 candies in my bag, and there were 25 bags in the class sample.

Explain the difference between categorical and quantitative data. Categorical data is not based on measurements or numbers but on categories. Quantitative data is based on numbers and things that can be calculated with numbers.

What types of graphs make sense and what types of graphs do not make sense for categorical data? It would make sense to make a categorical graph, such as a pie chart or Pareto chart, of the hair colors in your school. It would not make sense to make one of the number of lockers in your school.

For quantitative data? It would make sense to make a quantitative graph, such as a histogram or boxplot, of the number of cars on your street. It would not make sense to make one of the colors of all of the cars on your street.

Explain why. Categorical data sorts observations into groups, such as colors, whereas quantitative data consists of numbers that can be counted or measured.

What types of calculations make sense and what types of calculations do not make sense for categorical data? It would make sense to calculate how many races there are in America, to put them into columns. It would not make sense to calculate how many times a day people die.

For quantitative data? It would make sense to calculate the number of weeks in a seventy year period. It would not make sense to calculate the holidays and what days they land on.

Explain why. Because categorical data relies more on categories, not calculations of numbers, and quantitative data is the opposite.

Confidence Interval Estimates

Statisticians use a confidence interval to describe the amount of uncertainty associated with a sample estimate of a population parameter. Basically, it tells you how confident you can be that a sample value is close to the population value. We are trying to figure out if the sample of Skittles we got looks like the population.

Construct a 95% confidence interval estimate for the true proportion of purple candies.

n=1512=sample size

p^= total # purple candies/ total # candies= (317/1512)=.21

Confidence Interval: 95%

α = .05

z = 1.96 (z score associated with α/2 = .025)

Estimating population parameter: (p^ - E < p < p^ + E)

E = Margin of Error = z_(α/2) · sqrt(p^(1 - p^)/n) = .021

Population parameter: (.189 < p < .23)

This means that we are 95% confident that the population proportion of purple skittles falls between these two numbers.
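The interval can be rechecked with a few lines of Python. This is a minimal sketch, not the method used in the project, which relied on Excel and a z table. The variable names are illustrative, and the critical value is computed with the standard library's NormalDist rather than looked up, so it carries more decimal places than 1.96.

```python
from math import sqrt
from statistics import NormalDist

purple = 317        # total purple candies in the class sample
n_candies = 1512    # total candies in the class sample
confidence = 0.95

p_hat = purple / n_candies
alpha = 1 - confidence

# Two-sided critical value z_(alpha/2) from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Margin of error E = z_(alpha/2) * sqrt(p^(1 - p^)/n).
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n_candies)
ci_low, ci_high = p_hat - margin, p_hat + margin

print(f"p^ = {p_hat:.4f}, E = {margin:.4f}")
print(f"{ci_low:.3f} < p < {ci_high:.3f}")
```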

Construct a 99% confidence interval estimate for the true mean number of candies per bag.

n=25

Mean=60.48

α = .01

Confidence interval: 99%

Estimating population parameter: (mean - E < μ < mean + E)

E = margin of error = t_(α/2) · s/sqrt(n) = .985 (s = std. dev)

Population parameter: (59.5 < μ < 61.5)

This means that we are 99% confident that the population mean for the number of Skittles in a bag falls between these two numbers.
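The margin of error can be rechecked in Python. This is a minimal sketch under stated assumptions: the sample standard deviation s = 1.76 is the value the report uses in its later sections, and the critical value 2.797 is the standard t-table entry for 24 degrees of freedom with .005 in each tail, which the report does not state explicitly. The variable names are illustrative.

```python
from math import sqrt

n_bags = 25
sample_mean = 60.48
s = 1.76          # sample standard deviation, as used later in the report
t_crit = 2.797    # assumed t-table value: 24 degrees of freedom, .005 in each tail

# Margin of error E = t_(alpha/2) * s / sqrt(n).
margin = t_crit * s / sqrt(n_bags)
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"E = {margin:.3f}")
print(f"{ci_low:.1f} < mu < {ci_high:.1f}")
```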

Construct a 98% confidence interval estimate for the standard deviation of the number of candies per bag.

DF= 24

n=25

s=1.76

Confidence interval=98%

α = .02

α/2 = .01

Estimating population parameter: sqrt((n - 1)s²/χ²_R) < σ < sqrt((n - 1)s²/χ²_L)

χ²_R = 42.980

χ²_L = 10.856

Population parameter: Plugging n = 25, s = 1.76, and the two chi-square critical values (42.980 and 10.856) into the formula gives this confidence interval:

1.32 < σ < 2.62

This means that we are 98% confident that the population standard deviation of candies per bag falls between these two numbers.
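The arithmetic for this interval can be rechecked in Python. This is a minimal sketch with illustrative variable names; it takes n, s, and the two chi-square critical values for 24 degrees of freedom exactly as given in the report and does not compute the critical values itself.

```python
from math import sqrt

n_bags = 25
s = 1.76              # sample standard deviation
chi2_right = 42.980   # critical value with .01 in the right tail (24 degrees of freedom)
chi2_left = 10.856    # critical value with .01 in the left tail (24 degrees of freedom)

df = n_bags - 1

# sqrt((n - 1)s^2 / chi2_R) < sigma < sqrt((n - 1)s^2 / chi2_L)
sd_low = sqrt(df * s ** 2 / chi2_right)
sd_high = sqrt(df * s ** 2 / chi2_left)

print(f"{sd_low:.2f} < sigma < {sd_high:.2f}")
```

A quick sanity check on any interval of this kind is that the sample standard deviation s should fall between the two limits.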

Hypothesis Tests

Explain in general the purpose and meaning of a hypothesis test. A hypothesis is a claim or statement about a property of a population. A hypothesis test is a procedure for testing a claim about a property of a population.

Use a 0.01 significance level to test the claim that 20% of all Skittles candies are green.

α = .01

α/2 = .005

n = 1512

p^ = # of green candies / total # of candies = 297/1512 = .1964

Null Hypothesis: H0: p = .20

Alternative Hypothesis: H1: p ≠ .20

This will be a two-tailed test.

z = (p^ - p)/sqrt(p(1 - p)/n) = (.1964 - .20)/sqrt(.20(1 - .20)/1512) = -.350

Critical z values: ±2.575

The graph shows that our test statistic isn't even close to being outside our critical values, so the sample is consistent with the claim. Fail to reject H0.

Conclusion: There is not sufficient evidence to reject the claim that 20% of all Skittles candies are green.

[Figure: two-tailed standard normal curve with an area of .005 in each tail, critical values -2.575 and +2.575, and the test statistic z = -.350]
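The test can be rechecked in Python. This is a minimal sketch with illustrative variable names, not the method used in the project. It computes the critical value and a p-value with the standard library's NormalDist instead of a z table, and it uses the unrounded sample proportion, so the test statistic comes out close to, but not exactly, -.350.

```python
from math import sqrt
from statistics import NormalDist

green = 297         # total green candies in the class sample
n_candies = 1512    # total candies in the class sample
p0 = 0.20           # claimed proportion under H0
alpha = 0.01

p_hat = green / n_candies

# Test statistic z = (p^ - p) / sqrt(p(1 - p)/n), using the claimed p.
z_stat = (p_hat - p0) / sqrt(p0 * (1 - p0) / n_candies)

# Two-tailed critical value and p-value from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * NormalDist().cdf(-abs(z_stat))

reject_h0 = abs(z_stat) > z_crit

print(f"z = {z_stat:.3f}, critical values = +/-{z_crit:.3f}, p-value = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```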

Use a 0.05 significance level to test the claim that the mean number of candies in a bag of Skittles is 56.

DF = 24

n = 25

mean = 60.5

s = 1.76

μ = 56

α = .05

Null Hypothesis: H0: μ = 56

Alternative Hypothesis: H1: μ ≠ 56

t = (mean - μ)/(s/sqrt(n)) = (60.5 - 56)/(1.76/sqrt(25)) = 12.78 (the sample mean is already very far from the claimed mean)

Critical t values: ±2.064

[Figure: two-tailed t distribution curve with critical values -2.064 and +2.064 and the test statistic t = 12.78 far in the right tail]

According to our graph, our test statistic is outside of our critical range, so there is sufficient evidence to reject the hypothesis that the mean number of candies in a Skittles bag is 56. Reject H0.
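The test statistic can be rechecked in Python. This is a minimal sketch with illustrative variable names; the critical value 2.064 is taken from the report (24 degrees of freedom, two tails, .05 significance level) and is not computed here.

```python
from math import sqrt

n_bags = 25
sample_mean = 60.5
s = 1.76          # sample standard deviation
mu0 = 56          # claimed mean under H0
t_crit = 2.064    # two-tailed critical value from the report (24 degrees of freedom)

# Test statistic t = (mean - mu0) / (s / sqrt(n)).
t_stat = (sample_mean - mu0) / (s / sqrt(n_bags))

reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, critical values = +/-{t_crit}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```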

Reflection

State the conditions for doing interval estimates and hypothesis tests for population proportions and discuss whether or not your samples met these conditions.

- Interval estimates: To design one, you must describe the amount of uncertainty associated with a sample estimate of a population proportion parameter. You have to choose the formula (p^ - E < p < p^ + E). My sample met these conditions.

- Hypothesis tests: You need a claim about a population proportion that you want to test. You have three formulas to choose from, and you choose one. My sample met these conditions.

State the conditions for doing interval estimates and hypothesis tests for population means and discuss whether or not your samples met these conditions.

- Interval estimates: To design one, you must describe the amount of uncertainty associated with a sample estimate of a population mean parameter. You have to choose the formula (mean - E < μ < mean + E). My sample met these conditions.

- Hypothesis tests: You need a claim about a population mean that you want to test. You have three formulas to choose from, and you choose one. My sample met these conditions.

State the conditions for doing interval estimates for population standard deviations and discuss whether or not your samples met these conditions.

- Interval estimates: To design one, you must describe the amount of uncertainty associated with a sample estimate of a population standard deviation parameter. You have to choose the formula (sqrt((n - 1)s²/χ²_R) < σ < sqrt((n - 1)s²/χ²_L)). My sample met these conditions.

- Hypothesis tests: You need a claim about a population standard deviation that you want to test. You have three formulas to choose from, and you choose one. My sample met these conditions.

What possible errors could have been made by using this data?

Our class could have had bags of Skittles that were above or below average. Our sample of 25 students is pretty small to represent the population. Counting or calculation errors also could have been made in the data, and not counting the broken pieces of Skittles at the bottoms of the bags could make a slight difference in the totals.

How could the sampling method be improved? By obtaining a larger sample, we could do the same statistical experiment, and it would give more accurate estimates for the population.

State what conclusions you have drawn from your statistical research. I have drawn conclusions that our data wasn’t flawed, but the sample size is too small. The data we did collect was approximately normally distributed and had pretty average numbers of Skittles. The pie chart and Pareto chart show the numbers and colors of candies well. The confidence intervals suggest that my sample is close to the population, but with such a small sample, who’s to know. The hypothesis testing worked well for deciding whether or not to reject the hypotheses that were given. Overall this statistical research method is very important and applicable to everyday life.

PROJECT REFLECTION:

What have you learned as a result of this project? This project has helped me, while in my statistics course, to learn how to apply the concepts we have been learning to the real world.

Discuss how the math skills that you applied in this project will impact other classes you will take in your school career. The math skills I have applied in this class will help me in other classes that I take. Take entropy, for example. Calculating entropy without Excel and without knowing how to make charts is miserable. But with the concepts and skills I have learned in this class, calculating entropy for my physics class was a breeze.

Identify specific parts of the project and your own process in completing the project that may have applications for other classes. This project can have an impact on a lot of other classes, especially other math classes, and it sharpens your thinking skills. Distributions appear in physics a ton.

Discuss how the project helped to develop your problem solving skills. This project helped me develop problem solving skills through learning to follow a pattern of thinking that leads to an answer, not just plugging numbers into a formula. It helps when there are real world concepts to apply them to, like how many people drink alcohol in the state of Utah.

Discuss how this project changed the way you think about real-world math applications. If your thinking was not changed, then discuss how the project supported your views about real-world math applications. I have actually enjoyed this math class, and I can’t say that about a lot of other math classes. Statistical skills are very useful, especially when you’re not sure if you can believe what someone says. Politics plays a huge role in this. You shouldn’t believe everything a politician says; you should use your statistics skills first.