Uploaded by garrettherr

A description of the ANOVA process

This document is written for Statistics majors at Penn State University who will be taking STAT 461, the university's ANOVA course. The course is fundamental for any student in the Applied Statistics option of the statistics major. The purpose of this document is to help students understand the statistics behind what the ANOVA process does and why, so that they learn not only which steps lead to an answer but also why those steps are taken. Because the purpose is not to teach students how to carry out the test, I won't explain how to perform each step, which would make this more of an instruction set. Instead, I will explain what each step contributes toward reaching a statistical conclusion.

Scope

This document describes how the ANOVA process works, why each step is taken, and how each step helps us reach a sound statistical conclusion. Students need to understand what each step does and why it is taken so that they are better able to identify which situations call for an ANOVA test.

Introduction

Statisticians use the ANOVA process to analyze data when the means of several groups are being compared. The process has its roots in agricultural research, mainly in testing the effects of different variables on crops. An ANOVA is a statistical procedure used to compare the means of multiple treatment levels. It rests on three assumptions: the variation within each treatment is the same, the variable being measured is normally distributed within each treatment, and the data points are gathered randomly. What is being tested is whether the means of all the treatment levels are the same. The conclusion drawn from the test is whether or not at least one treatment differs from the others.

Process

The main idea behind the ANOVA test is the difference between within-group variation and between-group variation. Within-group variation describes how much the data points within each treatment level vary around that treatment's mean. Between-group variation describes how much the treatment means vary from one treatment level to another. This distinction is very important because the ratio of between-group variation to within-group variation is what drives the final conclusion.
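The two kinds of variation can be seen at the level of a single data point. The sketch below is only an illustration: the data set and the names `groups` and `split_deviations` are made up for this example. It splits each observation's distance from the overall mean into a within-group part and a between-group part.

```python
# Hypothetical example data: three treatment levels, five observations each.
groups = {
    "A": [4, 5, 6, 5, 5],
    "B": [6, 7, 8, 7, 7],
    "C": [8, 9, 10, 9, 9],
}


def mean(values):
    return sum(values) / len(values)


def split_deviations(groups):
    """Split each observation's deviation from the grand mean into two parts."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    rows = []
    for label, values in groups.items():
        group_mean = mean(values)
        for x in values:
            within = x - group_mean            # data point vs. its own treatment mean
            between = group_mean - grand_mean  # treatment mean vs. overall mean
            rows.append((label, x, within, between))
    return grand_mean, rows


grand_mean, rows = split_deviations(groups)
for label, x, within, between in rows:
    # The two parts always add up to the total deviation x - grand_mean.
    print(label, x, within, between, x - grand_mean)
```

In this made-up data set the within-group parts are small (at most 1 in size) while the between-group parts are larger (2 in size for treatments A and C), which is the pattern that points toward treatment means that differ.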

Step 1

The first step of an ANOVA test is to state the null and alternative hypotheses. These hypotheses are important because they show what is being tested. The null hypothesis is the same for every ANOVA test: all of the treatment means are equal. The null hypothesis is what is assumed to be true before the test, so it is natural to start by supposing that all the means are equal. The alternative hypothesis is that at least one of the means is different from another mean. It is the opposite of the null hypothesis and is what the test is trying to support: if the data are inconsistent with the null hypothesis, the null hypothesis is rejected in favor of the alternative. The alternative only requires one mean to differ because the reason for running an ANOVA is to find out whether any treatment is different, so a single differing mean is enough for a significant result. With n treatment levels, this step is normally written as:

H0: μ1 = μ2 = ... = μn

Ha: at least one mean is different from the others

Step 2

The second step of an ANOVA test is to calculate the F statistic for the data, which is used to make the decision about the null hypothesis. Several quantities need to be calculated before the F statistic can be.

The first is the Sum of Squares of the Treatment (SST), which measures the between-group variation. This is the quantity of main interest because it shows how much variability there is between the group means, which helps us decide whether or not the means are different. The SST measures between-group variation because it compares each group mean to the mean of the entire data set, weighting each group by its number of data points.

The next is the Sum of Squares of the Error (SSE), which measures the within-group variation. The SSE measures within-group variation because it compares each data point within a treatment to the mean of that treatment, which shows how much each data point varies from its own treatment mean.

After the SSE and SST, the Sum of Squares Total (SSTotal) is calculated. This is easy because it is just the sum of the SSE and SST. This makes sense because the total variation in a data set consists only of the within-group variation and the between-group variation; there is no other type of variation present.

Finally, the degrees of freedom for the SST and SSE need to be calculated. Degrees of freedom should be a familiar concept for you. For the SST, the degrees of freedom are the number of treatments (n) minus 1, because the SST is built from n values (one sample mean per treatment level) measured around one overall mean. For the SSE, the degrees of freedom are the total number of data points in the data set minus the number of treatment levels. This is because each treatment has degrees of freedom equal to one less than its sample size, and adding these together across treatments gives that formula.
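The quantities described in this step can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the data set and the function name `sums_of_squares` are assumptions made for the example. It computes SSTotal directly from the data so that the claim SSTotal = SST + SSE can be checked instead of assumed.

```python
def mean(values):
    return sum(values) / len(values)


def sums_of_squares(groups):
    """Return SST, SSE, SSTotal and the degrees of freedom for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-group variation: each group mean vs. the mean of the whole data set,
    # counted once for every data point in that group.
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-group variation: each data point vs. the mean of its own treatment.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)

    # Total variation, computed directly: each data point vs. the overall mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    return {
        "SST": sst,
        "SSE": sse,
        "SSTotal": ss_total,
        "df_treatment": len(groups) - 1,           # number of treatments minus 1
        "df_error": len(all_values) - len(groups)  # data points minus treatments
    }


# Hypothetical data: three treatment levels with five data points each.
example = [[4, 5, 6, 5, 5], [6, 7, 8, 7, 7], [8, 9, 10, 9, 9]]
result = sums_of_squares(example)
print(result)
```

For this made-up data set the treatment means are 5, 7, and 9 and the overall mean is 7, so SST is 40, SSE is 6, and SSTotal is 46, with 2 and 12 degrees of freedom.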

The next calculations are the Mean Squared Error (MSE) and the Mean Squared Treatment (MST). The SST and SSE are sums over different numbers of terms, so they need to be put on a comparable scale before they are compared in the F statistic. This is done by dividing the SST and SSE by their respective degrees of freedom. This is important because the SSE may be inflated simply because it is built from more data points. Dividing by the degrees of freedom removes this inflation because it takes into account the number of data points in the data set and the number of treatments. Once the MSE and MST are calculated, the F statistic can be calculated by dividing the MST by the MSE. The F statistic is what will be used to determine whether or not the differences in means are significant. It is the ratio of between-group variation to within-group variation in the data set. It will then be compared to an F critical value, which is explained in the next section. All of these values are normally summarized in an ANOVA table with one row each for the treatment, the error, and the total, and with columns for the sum of squares, the degrees of freedom, the mean square, and F.
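To see why the ratio MST / MSE is informative, the following sketch computes F for two made-up data sets that share the same treatment means (5, 7, and 9) but differ in how spread out the data points are within each treatment. The data and the function name `f_statistic` are illustrative assumptions.

```python
def f_statistic(groups):
    """Return (MST, MSE, F) for a list of groups, each a list of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    # Divide each sum of squares by its own degrees of freedom.
    mst = sst / (len(groups) - 1)
    mse = sse / (len(all_values) - len(groups))
    return mst, mse, mst / mse


# Same treatment means in both data sets; only the within-group spread differs.
tight = [[4, 5, 6, 5, 5], [6, 7, 8, 7, 7], [8, 9, 10, 9, 9]]
spread = [[2, 5, 8, 5, 5], [4, 7, 10, 7, 7], [6, 9, 12, 9, 9]]

mst_tight, mse_tight, f_tight = f_statistic(tight)
mst_spread, mse_spread, f_spread = f_statistic(spread)
print(mst_tight, mse_tight, f_tight)
print(mst_spread, mse_spread, f_spread)
```

Both data sets have an MST of 20, but the MSE rises from 0.5 to 4.5, so F falls from 40 to about 4.44. The same differences between means look much less convincing once the data within each treatment are noisier.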

Step 3

The next step is to find an F critical value. This is done by using an F table. The F table takes into account the degrees of freedom of the SST and SSE as well as the significance level of the test. The significance level of a test should be familiar to you. It is chosen by the statistician and reflects how much risk of wrongly rejecting a true null hypothesis the researcher is willing to accept. Commonly used significance levels are .05 and .01. Once the degrees of freedom for both the SST and SSE are known along with the significance level, an F table is used to look up the F critical value. This F critical value is the value of F at which the ratio of the MST to the MSE becomes significant for those specific degrees of freedom and that significance level. The significance level matters because the smaller it is, the more confident the test must be before rejecting the null hypothesis, and so the higher the F critical value will be: it takes more to make the null inconsistent with the data. The degrees of freedom matter because they take into account the size of the data set and the number of treatments, which also affect the point at which the calculated F becomes significant. The larger a sample is, the more precisely the variances are estimated, so the F ratio strays less from what the null hypothesis predicts. So, in general, the larger the degrees of freedom are, the smaller the F critical value becomes.
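An F table is the usual tool here, and statistical libraries expose the same values (for example `scipy.stats.f.ppf`). To show where such a value comes from, the sketch below approximates it by simulation instead, using only the Python standard library. It is a rough illustration under stated assumptions: the function names, the number of repetitions, and the seed are all choices made for this example, and the simulated values are approximate.

```python
import random


def f_statistic(groups):
    """Return MST / MSE for a list of groups, each a list of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (sst / (len(groups) - 1)) / (sse / (len(all_values) - len(groups)))


def simulate_null_f(n_treatments, n_per_group, reps=20000, seed=0):
    """Simulate sorted F statistics for data where the null hypothesis is true."""
    rng = random.Random(seed)
    f_values = []
    for _ in range(reps):
        # Every group comes from the same normal distribution,
        # so all treatment means really are equal.
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
                  for _ in range(n_treatments)]
        f_values.append(f_statistic(groups))
    f_values.sort()
    return f_values


def critical_value(sorted_f, alpha):
    """Approximate the F value that only a fraction alpha of null results exceed."""
    index = min(int((1 - alpha) * len(sorted_f)), len(sorted_f) - 1)
    return sorted_f[index]


small = simulate_null_f(n_treatments=3, n_per_group=5)   # df = (2, 12)
large = simulate_null_f(n_treatments=3, n_per_group=20)  # df = (2, 57)

crit_05 = critical_value(small, 0.05)
crit_01 = critical_value(small, 0.01)
crit_05_large = critical_value(large, 0.05)
print(crit_05, crit_01, crit_05_large)
```

For 2 and 12 degrees of freedom the simulated values should land close to the tabled values of about 3.89 at the .05 level and 6.93 at the .01 level. The run also illustrates the two effects described above: the stricter significance level gives the larger critical value, and the larger data set gives the smaller one.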

Step 4

The last step is to determine whether or not the test has given a significant result. This is the easy part of the analysis. All that has to be done is to compare the calculated F value to the critical F value. If the calculated F is greater than the critical F, the result is significant and the null hypothesis can be rejected. This means that at least one mean is different from the others. If the calculated F is less than the critical F, the result is not significant and the null hypothesis is not rejected. This is because the F critical value marks the point beyond which the calculated results are no longer consistent with the null hypothesis. So any calculated F that is greater than the F critical value is not consistent with the null hypothesis, and the difference in means is therefore significant.
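The comparison itself is a single inequality, as the short sketch below shows. The function name `reject_null` and the two calculated F values are hypothetical; 3.89 is the F table value for 2 and 12 degrees of freedom at the .05 significance level.

```python
def reject_null(f_calculated, f_critical):
    """Return True when the calculated F exceeds the critical F."""
    return f_calculated > f_critical


f_critical = 3.89  # F table value for df = (2, 12) at significance level .05

decisions = {}
for f_calculated in (40.0, 2.0):  # hypothetical calculated F values
    decisions[f_calculated] = reject_null(f_calculated, f_critical)
    if decisions[f_calculated]:
        print(f_calculated, "reject the null: at least one mean differs")
    else:
        print(f_calculated, "do not reject the null")
```

Note that rejecting the null hypothesis only says that at least one mean differs. It does not say which treatment is the different one.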

Conclusion

The main idea of a one-way ANOVA test is to test whether or not different treatments produce different means in a set of data. It is called an Analysis of Variance because it compares the variance within the treatment levels to the variance between the treatment levels to see whether there is a significant result. Finding whether or not there is a difference in treatment means is very important in many industries, such as agriculture, production, and medicine. Hopefully, with this new understanding of how the ANOVA process works, you will be better able to use this testing process and understand the results that it gives.

Works Cited

http://www.scribd.com/doc/6006499/ANOVA-Introduction

http://www.gs.washington.edu/academics/courses/akey/56008/lecture/lecture7.pdf

http://people.richland.edu/james/lecture/m170/ch13-1wy.html
