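| Bag (StatCrunch row) | Red Skittles | Orange Skittles | Yellow Skittles | Green Skittles | Purple Skittles | Total Number of Skittles |
|---|---|---|---|---|---|---|
| Bag 1 (Row 10) | 13 | 12 | 16 | 13 | 6 | 60 |
| Bag 2 (Row 20) | 18 | 13 | 7 | 7 | 17 | 62 |
| Bag 3 (Row 17) | 10 | 14 | 14 | 16 | 7 | 61 |
| Sample Totals | 41 | 39 | 37 | 36 | 30 | 183 |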

The StatCrunch rows our group used are 10, 20, and 17. The count of each candy color in the three bags of Skittles is displayed in the table above.

To help the students organize the collected data, our instructor created a spreadsheet that lists each student's bag of Skittles and its count of each color. There are 23 sets of numbers (n = 23), one for each of the 23 students. To randomly pick three bags, my partner, Jose, printed the data collection the instructor provided on Canvas. Next, I went online to http://www.mathgoodies.com/calculators/random_no_custom.html to access a custom random number generator. I entered the lower limit (1) and the upper limit (23) and clicked ENTER. The numbers selected were, in order, 10, 20, and 17. As each number was picked, Jose read across the corresponding row of the data collection while I recorded the values on my paper. We verified that the numbers he read were in the same color order as the columns of my table. After I finished recording Row 17, I took out a calculator and added the numbers in each row to get the total for each bag, then added each column to get the sample total for each color. I then asked him to go home and recalculate using the numbers I wrote down to make sure the totals matched my table. We used a cluster sampling method: the class data (the population) was gathered by the students, with each student's bag representing a cluster. Because we randomly selected three clusters and used all of the data in each cluster, our project used cluster sampling. The step in which we selected the bags with a random number generator was simple random sampling of the clusters.
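The selection step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `pick_clusters` and the seed are hypothetical, and the rows are assumed to be drawn without replacement, which the online generator may or may not guarantee.

```python
import random


def pick_clusters(n_bags, k, seed=None):
    """Pick k distinct bag (row) numbers from 1..n_bags by simple random sampling.

    Assumption: draws are made without replacement, so no bag is picked twice.
    """
    rng = random.Random(seed)
    return rng.sample(range(1, n_bags + 1), k)


# 23 bags in the class data, 3 clusters selected.
# The seed is an arbitrary illustrative value, not the one behind rows 10, 20, 17.
rows = pick_clusters(23, 3, seed=1)
print(rows)
```

Every candy in each selected row is then used, which is what makes the design a cluster sample rather than a simple random sample of candies.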

The sample totals for each color are red 41, orange 39, yellow 37, green 36, and purple 30, and the sample size n is 183.
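As a quick arithmetic check, the color totals can be added up and turned into sample proportions. The sketch below uses only the totals stated above; the variable names are illustrative.

```python
# Sample totals for each color, as reported for the three selected bags.
sample_totals = {"red": 41, "orange": 39, "yellow": 37, "green": 36, "purple": 30}

# The sample size is the sum of the color totals.
n = sum(sample_totals.values())

# Share of the sample made up by each color.
proportions = {color: count / n for color, count in sample_totals.items()}

print(n)
for color, p in proportions.items():
    print(f"{color}: {p:.3f}")
```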

There are a few possible sources of error in these data. The first is the source of the bags of Skittles. Since all the students live and attend class in Utah, I can only conclude that these bags are representative of Skittles sold in Utah. Moreover, since the students attend class at the Salt Lake Community College Taylorsville Campus, the bags used in the study may only be regionally consistent, say within 20 miles of Taylorsville, because all the nearby retailers may be receiving the same batch of Skittles. One way to control for this would be to require all students to buy their bag of Skittles from one specific retailer, for example the Walmart on 54th. A better method might be to divide the students into several groups, require each group to purchase its bags from a specific retailer, and include that information in our data collection. A second possible error could have occurred in our group during the transcribing process, specifically when Jose was reading numbers to me while I recorded them. A third possible error comes from using the online random number generator. These random number generators are carefully programmed, but the machine is always at the mercy of its programming. The results may be sufficiently complex to make the pattern difficult to identify, but because they are produced by a carefully defined and consistently repeated algorithm, the numbers are not truly random. In addition to all these errors, there is also human error, such as students accidentally dropping Skittles while counting.

I do believe my sample is representative of the class data set. The total number of Skittles in each bag is relatively close to the totals in the other sets of data. The number of Skittles of each color is relatively consistent when comparing the three bags to each other. There isn't a number that I have found surprising. At first glance, I feel our sample is representative of the class data set. Further evaluation of outliers can determine whether or not these numbers are typical.

Part 3

Candy colors in the Skittles Project are categorical data. I know this because categorical data deal with descriptions: they can be observed but not measured. In this case, the candies differ only in their colors; there isn't a measurable difference between two Skittles of different colors. A candy is either red or not, which means color can be measured on a nominal scale. Nominal scales are used for labeling variables, without any quantitative value.

It is not appropriate to discuss the shape of the distribution for candy color because color is categorical data. In a Pareto chart or pie chart, the categories have no natural order and the differences between categories are not meaningful, so looking at the overall shape of the distribution is not necessary.

The number of candies per bag is quantitative data because it consists of numbers representing counts or measurements. Quantitative data are information about quantities: that is, information that can be measured and written down with numbers. The differences between those numbers are meaningful.

[Figure: box plot of the number of candies per bag; the green line marks the mean.]

Five-number summary: 53, 58, 61, 61, 65

It is appropriate to discuss the shape of the distribution for the number of candies per bag. Since the data are quantitative, the differences between the numbers are meaningful. Looking at the box plot, we see that the distribution is not symmetric; it is left skewed. The mean (60) falls below the median (61), and the mass of the distribution is concentrated on the right of the figure, with a tail reaching toward the low values. The outlier, 53 (a value that lies an abnormal distance from the other values in a random sample from a population), is not a usual number of Skittles for a 2.17-ounce bag.

Part 4

Proportion of Each Color

- Red Skittles: 291/1380 ≈ 0.2109
- Orange Skittles: 301/1380 ≈ 0.2181
- Yellow Skittles: 295/1380 ≈ 0.2138
- Green Skittles: 243/1380 ≈ 0.1761

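Summary statistics (column: Frequency):

| Statistic | Value |
|---|---|
| n | 23 |
| Mean | 60 |
| Std. dev. | 2.5584086 |
| Min | 53 |
| Max | 65 |
| Q1 | 58 |
| Q3 | 61 |
| Mode | 61 |
| Median | 61 |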

Outlier Range (IQR = 61 - 58 = 3)

Lower fence: 58 - (1.5 × 3) = 53.5

Upper fence: 61 + (1.5 × 3) = 65.5

Outliers: 53

The total number of candies in my bag, 61, is not an outlier.
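The fence calculation can be reproduced with a short Python sketch. The function name `outlier_fences` is illustrative, and the values checked are only the minimum, my own bag, and the maximum, since those are the counts quoted in this part.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences based on the interquartile range."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Q1 = 58 and Q3 = 61 come from the five-number summary.
lower, upper = outlier_fences(58, 61)

# Minimum, my bag, and maximum number of candies per bag.
values = [53, 61, 65]
outliers = [v for v in values if v < lower or v > upper]

print(lower, upper, outliers)
```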

Part 5

The purpose of taking a random sample from a lot or population and computing a statistic, such as the mean, is to approximate the corresponding value for the population. How well the sample statistic estimates the underlying population value is always an issue. A confidence interval addresses this issue by providing a range of values that is likely to contain the population parameter of interest.

Confidence intervals are constructed at a confidence level, such as 90%, selected by the user. This means that if the same population is sampled on numerous occasions and an interval estimate is made on each occasion, the resulting intervals would bracket the true population parameter in approximately 90% of the cases.

Interpretation:

A confidence interval was constructed to estimate the true proportion of Skittles that are yellow. Based on these calculations, we are 99% confident that the interval from 0.1856 to 0.2424 actually contains the true value of the population proportion (p). This means that if we were to randomly select different samples of the same size (1380 candies) and construct the corresponding confidence intervals, about 99% of them would actually contain the true value of the population proportion p. The proportion of yellow candies in the single bag of candy that I purchased is a likely value for the true population proportion because it lies between the two limits of the interval (13 yellow candies out of 61 candies gives a proportion of 0.2131).
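A minimal sketch of this interval, assuming the usual normal-approximation formula p̂ ± z·sqrt(p̂q̂/n) and the class count of 295 yellow candies out of 1380. The function name `proportion_ci` is illustrative; the unrounded result agrees with the interval reported above to about three decimal places, with the small difference coming from rounding in the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 295 yellow candies out of 1380 in the class data, 99% confidence.
low, high = proportion_ci(295, 1380, 0.99)
print(f"{low:.4f} to {high:.4f}")

# Proportion of yellow candies in my own bag: 13 out of 61.
my_bag = 13 / 61
print(low < my_bag < high)
```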

A confidence interval was constructed to estimate the true mean number of Skittles per bag. Based on these calculations, I am 95% confident that the interval from 58.89 to 61.11 actually does contain the true value of the mean number of candies per bag in the population (μ). This means that if we were to randomly select different samples of the same size (23 bags of Skittles) and construct the corresponding confidence intervals, about 95% of them would actually contain the true value of the population mean μ. The bag of candy I purchased contains 61 Skittles, and that number lies within the calculated interval; therefore it is a likely value for the population mean.
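The interval can be reproduced with a short sketch using the t distribution with n - 1 = 22 degrees of freedom. The critical value 2.074 is taken from a standard t table rather than computed, and the variable names are illustrative.

```python
from math import sqrt

n = 23           # number of bags in the class data
mean = 60        # sample mean number of candies per bag
s = 2.5584086    # sample standard deviation

# Assumption: two-sided 95% critical value for 22 degrees of freedom,
# read from a t table instead of being computed here.
t_crit = 2.074

margin = t_crit * s / sqrt(n)
low, high = mean - margin, mean + margin
print(f"{low:.2f} to {high:.2f}")

# My own bag held 61 candies.
print(low < 61 < high)
```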

A confidence interval was constructed to estimate the true standard deviation of the number of Skittles per bag. Based on the results, I have 98% confidence that the interval from 1.8905 to 3.8847 actually contains the true value of the standard deviation of the number of candies per bag in the population (σ). This means that if we were to randomly select different samples of the same size (23 bags of Skittles) and construct the corresponding confidence intervals, about 98% of them would actually contain the true value of the population standard deviation σ.

Based on my interval for the true standard deviation of the number of candies per bag (1.8905 < σ < 3.8847), it does appear that the manufacturing process does a consistent job of putting candies into 2.17-ounce bags. The standard deviation I drew from our class samples is 2.5584, and it is a possible value for the true standard deviation of the Skittles population.
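A sketch of the calculation, assuming the standard chi-square interval for a standard deviation, sqrt((n-1)s²/χ²_R) < σ < sqrt((n-1)s²/χ²_L). The two critical values for 22 degrees of freedom are taken from a chi-square table rather than computed, and the variable names are illustrative.

```python
from math import sqrt

n = 23
s = 2.5584086    # sample standard deviation

# Assumption: chi-square critical values for 22 degrees of freedom and
# 98% confidence (1% in each tail), read from a table.
chi2_right = 40.289
chi2_left = 9.542

low = sqrt((n - 1) * s**2 / chi2_right)
high = sqrt((n - 1) * s**2 / chi2_left)
print(f"{low:.4f} < sigma < {high:.4f}")
```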

Part 6

Hypothesis testing refers to the formal procedures used in statistical analysis to reject or fail to reject statistical hypotheses. A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true. The usual process of hypothesis testing consists of several steps. A basic outline is as follows:

- Formulate the null hypothesis (H0) and the alternative hypothesis (H1).
- Identify a test statistic that can be used to assess the truth of the null hypothesis.
- Draw a graph to include the test statistic, critical values, and critical region (if using the critical value method).
- Reject the null hypothesis (H0) if the test statistic is in the critical region. Fail to reject the null hypothesis if the test statistic is not in the critical region.
- Restate this decision in simple, non-technical terms, and address the original claim.

1. The hypothesis test of the claim that 20% of all Skittles candies are red (see attached work):

Conclusion: Since the test statistic, 1.0123, is not in the rejection region, we fail to reject the null hypothesis. There is not sufficient evidence to warrant rejection of the claim that 20% of Skittles candies are red.
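The test statistic can be sketched in Python as below, assuming the usual one-proportion z statistic and the class count of 291 red candies out of 1380. The function name is illustrative. Using the unrounded sample proportion gives z of about 1.01, which matches the 1.0123 above up to rounding of p̂. The two-sided p-value is an addition for illustration, since the attached work used the critical value method.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic for testing H0: p = p0 against the sample proportion."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# 291 red candies out of 1380, claimed proportion 0.20.
z = one_proportion_z(291, 1380, 0.20)

# Two-sided p-value, because the claim is that p equals 0.20.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

A p-value this large leads to the same decision as the critical value method at any common significance level: fail to reject the null hypothesis.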

2. The hypothesis test of the claim that the mean number of candies in a bag of Skittles is more than 55:

Conclusion: The critical value is 2.508 and the test statistic is 9.3727. Since the test statistic falls in the critical region to the right of the critical value of 2.508, we reject the null hypothesis. Because we reject the null hypothesis, there is sufficient evidence to support the claim that the mean number of candies in a bag of Skittles is greater than 55.
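A short sketch shows where the test statistic falls relative to the critical value. It assumes a right-tailed t test of H0: μ = 55 against H1: μ > 55 with 22 degrees of freedom; the critical value 2.508 is the one quoted above, and the variable names are illustrative.

```python
from math import sqrt

n = 23
mean = 60        # sample mean number of candies per bag
s = 2.5584086    # sample standard deviation
mu0 = 55         # value of the mean under the null hypothesis
t_crit = 2.508   # right-tail critical value quoted in the conclusion

# One-sample t statistic.
t = (mean - mu0) / (s / sqrt(n))

# In a right-tailed test, H0 is rejected when t exceeds the critical value.
reject_null = t > t_crit

print(f"t = {t:.4f}, reject H0: {reject_null}")
```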

The requirements for each hypothesis test:

Requirements for a claim about a population proportion p:

- The sample observations are a simple random sample.
- The conditions for a binomial distribution are satisfied. (There is a fixed number of independent trials having constant probabilities, and each trial has two outcome categories of success and failure.)
- The conditions that np is greater than or equal to 5 and nq is greater than or equal to 5 are both satisfied.

Requirements for a claim about a population mean μ (with σ not known):

- The sample is a simple random sample.
- The population is normally distributed or n is greater than 30.

For the proportion, our sample did not meet the requirement of a simple random sample. Each student in the class was asked to go to a store and purchase a bag of Skittles, so the bags were not chosen randomly but bought wherever the students could find them (out of convenience rather than with a random number generator, etc.), and all the candies in each bag (cluster) were then added to our class data. There was a fixed number of independent trials, since the color of one candy didn't affect the color of the next candy. In this hypothesis test, each candy was either red or not red, so each trial had only two outcomes. The values np (1380 × 0.20 = 276) and nq (1380 × 0.80 = 1104) are both greater than 5, which satisfies the last condition.
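The last condition is simple arithmetic, and a small sketch makes the check explicit. The function name `large_sample_counts` and the default threshold argument are illustrative.

```python
def large_sample_counts(n, p0, minimum=5):
    """Return (ok, np, nq), where ok is True when both np and nq reach the minimum."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    ok = expected_successes >= minimum and expected_failures >= minimum
    return ok, expected_successes, expected_failures


# 1380 candies in the class data, claimed proportion of red candies 0.20.
ok, np_count, nq_count = large_sample_counts(1380, 0.20)
print(ok, round(np_count), round(nq_count))
```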

For μ, we used a cluster sampling method: the class data (the population) was gathered by the students, with each student's bag representing a cluster. Because we randomly selected three clusters and used all of the data in each cluster, our project used cluster sampling. The step in which we selected the bags with a random number generator was simple random sampling of the clusters. Judging by the graph from Part 3, the population appears to be approximately normally distributed.

Part 7

I do not believe the height of the person who purchased a bag of candies has any relationship to the bag other than that person being the one who bought it. Height cannot be used to predict the number of candies in a bag of Skittles because the two variables are independent of each other.

The explanatory variable is the height of each person who purchased a bag of candies. The response variable is the number of candies in the bag of Skittles.

r = -0.192

There is not a significant relationship between the two variables: the critical value is 0.396, and |r| = |-0.192| = 0.192, which is not greater than the critical value.

The regression equation is ŷ = -0.14x + 69.5.

ŷ = -0.14(63.5) + 69.5 = 60.61

It is not appropriate to use height to predict the number of candies per bag because the two variables show no significant relationship.
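The significance check and the regression prediction can be sketched together. The numbers are the ones reported in this part; the function name `predict_candies` is illustrative, and the prediction is shown only to reproduce the arithmetic, not as a recommended estimate.

```python
r = -0.192       # linear correlation coefficient
r_crit = 0.396   # critical value used in this part

# The correlation is significant only if |r| exceeds the critical value.
significant = abs(r) > r_crit


def predict_candies(height):
    """Predicted number of candies from the regression line y = -0.14x + 69.5."""
    return -0.14 * height + 69.5


y_hat = predict_candies(63.5)
print(f"significant: {significant}, prediction at 63.5: {y_hat:.2f}")
```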

Even assuming there were a significant relationship between height and number of candies per bag, it would not be appropriate to use Yao Ming's height to predict the number of candies per bag because his height is an outlier.

Part 8
