George A. Milliken, PhD, Department of Statistics, Kansas State University, Manhattan, Kansas
September 2000 Department of Statistics Kansas State University 1
Statistics: A collection of procedures and processes that support researchers in the unbiased pursuit of Knowledge
A HYPOTHESIS OR SET OF HYPOTHESES should state exactly what you want to DO or LEARN or STUDY.
It SHOULD ANSWER: What are the factors to be studied and what relationships are to be investigated? What is the experimental material? Etc.
The area of STATISTICS would not be needed if each time you measured an experimental unit you obtained the same response or value.
BUT THE RESPONSES ARE NOT THE SAME, SINCE THERE IS VARIABILITY or NOISE IN THE SYSTEM.
STATISTICAL METHODS EXTRACT THE SIGNAL FROM THE NOISE TO PROVIDE INFORMATION.
One of the Statistician's JOBS is to make sense of DATA in the presence of VARIABILITY or noise by using DATA ANALYSIS TOOLS.
DESIGN VS. ANALYSIS
The PURPOSE OF DATA COLLECTION is to GAIN INFORMATION OR KNOWLEDGE!!
Collecting Data does not guarantee that information is obtained.
At best: INFORMATION = DATA + ANALYSIS
DATA
If data are collected such that they contain NO information in the first place, then the analysis phase cannot find it!!!
The best way to ensure that appropriate information is contained in the collected data is to DESIGN (plan) and Carefully Control the DATA COLLECTION PROCESS.
The measured variables must relate to the stated OBJECTIVES of the study.
If you have a good design and process for data collection, it is quite often straightforward to construct an analysis that extracts all of the available information from the data.
The ROLE of a STATISTICIAN is to work with the RESEARCH TEAM (or researcher) from the START of the study.
The MOST IMPORTANT TIME for the statistician to become involved with a research study is in the very BEGINNING
A STATISTICIAN CAN HELP OBTAIN THE MAXIMUM AMOUNT OF INFORMATION FROM AVAILABLE RESOURCES
HOW???
HELP WITH THE DESIGN OF THE EXPERIMENT
DETERMINE SAMPLE SIZE NEEDED
DEVELOP PROCESS OF COLLECTING DATA
DISCUSS VARIABLES TO BE MEASURED AND HOW THEY RELATE TO THE OBJECTIVES OF THE STUDY
PROVIDE METHODS OF ANALYZING THE DATA
HELP TRANSLATE STATISTICAL CONCLUSIONS INTO SUBJECT MATTER CONCLUSIONS
THE CORE HELP FROM THE STATISTICIAN IS IN THE DESIGN OF THE EXPERIMENT
Selecting conditions that relate to the objectives of the study
Selecting the Experimental Units
Deciding when REPLICATIONS exist
Determining the ORDER in which the experiment is to be carried out
THE DESIGN OF THE EXPERIMENT IS CRITICAL
COMPONENTS OF DESIGNED EXPERIMENTS
TREATMENT STRUCTURE: Factors or Populations or Treatments related to the objectives of the experiment.
Examples: Brands of Product, Types of Uses of Product
DESIGN STRUCTURE OR EXPERIMENTAL UNITS: Factors used in blocking the experimental units, as well as characteristics of the experimental units.
Examples: Washing Machine, Person Using Machine, Products evaluated in a Session by a Taste Panelist
Treatment Structure
Design Structure
RANDOMIZATION IS THE INSURANCE POLICY AGAINST INTRODUCING BIAS INTO THE STUDY
Selecting an appropriate Treatment Structure, the necessary Design Structure, and the required Randomization Process gives the Statistician the information needed to construct an appropriate model.
APPROPRIATE MODEL = BEST ANALYSIS
Key to the Design of the Experiment is the Concept of REPLICATION.
REPLICATION: The independent observation of a treatment.
An Experimental Unit provides a Replication of the level of a Factor if the level is randomly assigned to the Experimental Unit and observed independently of the other Experimental Units.
Must make sure that Sub-samples are not considered to be Replications.
The Variability among Experimental Units treated independently alike provides the estimate of the variance (or Standard Error) to be used as the measuring stick for comparing the levels of treatments randomly assigned to those Experimental Units.
Between Sub-sample variance is generally much less than between Replication variance.
It is critical that the Replications are appropriately Identified.
Treatment Structure, Design Structure (with experimental units and replication) and Randomization describe the total Design.
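The distinction between sub-samples and replications can be sketched numerically. The measurements below are made up purely for illustration (four experimental units treated alike, three sub-samples each); the function names are hypothetical as well.

```python
from statistics import mean, variance

# Hypothetical data: 4 experimental units treated alike,
# each measured 3 times (sub-samples). Values are invented.
units = {
    "unit 1": [10.1, 10.3, 10.2],
    "unit 2": [12.0, 11.8, 12.1],
    "unit 3": [9.0, 9.2, 9.1],
    "unit 4": [11.1, 10.9, 11.0],
}


def between_unit_variance(data):
    """Variance among the experimental-unit means (replication level)."""
    unit_means = [mean(values) for values in data.values()]
    return variance(unit_means)


def within_unit_variance(data):
    """Average variance among sub-samples inside each unit."""
    return mean([variance(values) for values in data.values()])


if __name__ == "__main__":
    print("between units :", round(between_unit_variance(units), 4))
    print("within units  :", round(within_unit_variance(units), 4))
```

With these invented numbers the sub-sample variance is far smaller than the variance among unit means, so treating sub-samples as replications would make the measuring stick look much shorter than it really is.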
ANALYZE THE DATA:
Use the COMPLETED DESIGNED EXPERIMENT and the data type to construct an appropriate analysis.
Use Statistical Software: SAS, RS/1, JMP
A software package you know will provide valid results.
STATISTICAL ANALYSES CONCLUSIONS
The Statistician will provide the STATISTICAL interpretation of the results from the analyses.
The Statistician will help the Researcher TRANSLATE the statistical analyses conclusions into subject matter conclusions.
Discuss how the statistical analyses provide results that relate to the STATED OBJECTIVES of the study.
The expected results should be written along with the objectives.
Results that are not expected should be looked at carefully.
Washing Machine Example:
4 brands or models -- one machine each
3 types of laundry: Whites, Wash/wear, Denim
3 persons to operate the Machines
For each person: Randomly assign the order of Brands.
For each Brand: Randomly assign the order of Types.
Random Order of Brands for Person 1, with the Random Order of Types within each Machine:

Machine    Order of Types
Brand D    Denim, White, W/W
Brand B    W/W, Denim, White
Brand A    White, W/W, Denim
Brand C    White, Denim, W/W

Re-Randomize for each Person
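The two-stage randomization for the washing machine example can be sketched as follows. This is a minimal illustration, not the procedure actually used for the study: the function name and the seed are assumptions, and the seed is there only so a printed plan can be reproduced.

```python
import random

BRANDS = ["A", "B", "C", "D"]
TYPES = ["White", "Wash/wear", "Denim"]
PERSONS = [1, 2, 3]


def randomize_plan(seed=None):
    """Return {person: [(brand, [types in run order]), ...]}."""
    rng = random.Random(seed)  # seed is optional; used only for reproducibility
    plan = {}
    for person in PERSONS:
        # Stage 1: a fresh random order of Brands (machines) for this person.
        brand_order = rng.sample(BRANDS, k=len(BRANDS))
        # Stage 2: a fresh random order of Types within each machine.
        plan[person] = [
            (brand, rng.sample(TYPES, k=len(TYPES))) for brand in brand_order
        ]
    return plan


if __name__ == "__main__":
    for person, runs in randomize_plan(seed=2000).items():
        print(f"Person {person}")
        for brand, types in runs:
            print(f"  Brand {brand}: {', '.join(types)}")
```

Each person gets an independent permutation of the four Brands, and each machine gets its own permutation of the three Types, which is the re-randomization described above.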
Persons are Blocks of Machines.
Machines are the Experimental Units for Brands, and the error Variance for comparing Brands is computed from Person*Brand.
The Machines within a Person are Blocks for Types.
Three Loads per Machine.
The Loads within a machine are the Experimental Units for Type and Brand*Type.
Variability among Loads treated alike provides the measuring stick for comparing the levels of Type and Brand*Type.
This Design Involves Persons as Blocks and Two Sizes of Experimental Unit: Machine and Load.
If you ignore that this design involves TWO sizes of Experimental Units, and therefore Two Error Terms, the resulting error term is a combination of these two error terms.
The combined error term is Too Large for making comparisons involving Type and Brand*Type: you won't find things that are there.
The combined error term is Too Small for making comparisons involving Brand: you will declare things to be different when they are not.
Statistical Conclusions can be very misleading.
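The two error terms can be made concrete by counting degrees of freedom. The sketch below uses the usual split-plot accounting for 3 Persons, 4 Brands and 3 Types, assuming one load of each Type per Machine per Person (36 loads in total); the function name and row labels are illustrative, not taken from the slides.

```python
def split_plot_df(persons=3, brands=4, types=3):
    """Degrees of freedom for a split-plot with Persons as blocks.

    Assumes one load per Type in every Person-by-Brand machine run.
    """
    return {
        "Person (blocks)": persons - 1,
        "Brand": brands - 1,
        # Machine-level error: used for comparing Brands.
        "Error(machine) = Person*Brand": (persons - 1) * (brands - 1),
        "Type": types - 1,
        "Brand*Type": (brands - 1) * (types - 1),
        # Load-level error: used for Type and Brand*Type.
        "Error(load)": brands * (persons - 1) * (types - 1),
    }


if __name__ == "__main__":
    table = split_plot_df()
    for source, df in table.items():
        print(f"{source:32s} {df:3d}")
    print(f"{'Total':32s} {sum(table.values()):3d}")
```

Pooling the two error lines would give a single error term with 6 + 16 = 22 degrees of freedom that mixes machine-to-machine and load-to-load variability, which is the combined error term the slide warns about.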
It is the STATISTICIAN'S JOB to figure out how the study is being run and to help identify the type of design that is being used, which includes determining whether more than one size of experimental unit is involved.
This is accomplished BEST when the Statistician is involved at the Beginning of the Study
NINE TYPES OR BRANDS OF SALSA
A PERSON CAN TASTE ONLY THREE SALSAS DURING THE SESSION
TWELVE PERSONS WILL BE USED IN THE STUDY
         Person
Order    1   2   3   4   5   6
  1      …   …   …   …   …   …
  2      A   I   B   B   E   G
  3      B   G   H   F   A   E

         Person
Order    7   8   9   10  11  12
  1      F   A   C   D   F   B
  2      D   G   I   H   A   D
  3      E   D   F   C   H   I
Each Product is Tasted 4 times, so there are Four Replications of each Product.
Since each Person tastes only Three of the Products, how do we compare the Products?
Want to compare the Products as if each Person had tasted all of the Products.
The Analysis obtains predicted values for each Product for each Person.
The Product Means of these Predicted Values are the ADJUSTED MEANS for each Product.
Called LEAST SQUARES MEANS by SAS.
The LSMEANS are the Predicted Means as if Each of the Persons had Tasted and evaluated all of the Products.
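A small sketch can show how adjusted means differ from raw means when each person tastes only some of the products. The scores are invented (3 persons, 3 products, 2 products per person), the model assumed is score = person effect + product effect, and it is fitted by simple alternating averaging rather than by the routines a package such as SAS would use.

```python
from collections import defaultdict

# Hypothetical scores: (person, product, score). Invented for illustration.
scores = [
    (1, "A", 5.0), (1, "B", 6.0),
    (2, "A", 7.0), (2, "C", 9.0),
    (3, "B", 10.0), (3, "C", 11.0),
]


def raw_means(data):
    """Plain average of the observed scores for each product."""
    groups = defaultdict(list)
    for _, product, y in data:
        groups[product].append(y)
    return {p: sum(v) / len(v) for p, v in groups.items()}


def lsmeans(data, sweeps=200):
    """Fit score = person + product by alternating least squares, then
    average each product's predicted score over ALL persons."""
    person = defaultdict(float)
    product = defaultdict(float)
    for _ in range(sweeps):
        # Update product effects after removing current person effects.
        resid = defaultdict(list)
        for i, j, y in data:
            resid[j].append(y - person[i])
        for j, r in resid.items():
            product[j] = sum(r) / len(r)
        # Update person effects after removing current product effects.
        resid = defaultdict(list)
        for i, j, y in data:
            resid[i].append(y - product[j])
        for i, r in resid.items():
            person[i] = sum(r) / len(r)
    person_avg = sum(person.values()) / len(person)
    return {j: person_avg + effect for j, effect in product.items()}


if __name__ == "__main__":
    print("raw means:", raw_means(scores))
    print("LSMEANS  :", {p: round(m, 3) for p, m in lsmeans(scores).items()})
```

In this invented data person 3 scores everything higher than person 1, so the raw means (6, 8, 10) exaggerate the product differences; the adjusted means settle near 7, 8 and 9, the values expected if every person had tasted every product.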
Sometimes characteristics of experimental units are measured to be used as possible covariates.
Study the effect of three types of Drugs on a person's heart rate.
Randomly Assign 12 persons to each of the Drugs -- person is the experimental unit.
Dose the person with the assigned drug and measure the heart rate after 15 minutes.
Persons do not have identical heart rates before being given the respective drug.
Measure the initial heart rate: the heart rate before giving the drug.
We want to compare the Drugs as if all experimental units (persons) had the same initial heart rate.
Analysis of Covariance uses a regression model to obtain predicted after-drug heart rate values as if all persons had initial heart rates of, say, 74 beats per minute.
The Drug Means of these predicted heart rates are used to compare the Drugs.
These means of Predicted Values are called LSMEANS.
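The covariate adjustment can be sketched with a small invented data set. The sketch assumes a common regression slope for all drugs (an assumption; the slides do not say which covariance model was fitted), uses only 3 persons per drug instead of 12, and all names and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical (initial rate, after-drug rate) pairs, invented for illustration.
data = {
    "Drug 1": [(70, 75), (74, 77), (78, 79)],
    "Drug 2": [(60, 75), (64, 77), (68, 79)],
    "Drug 3": [(80, 70), (84, 72), (88, 74)],
}


def common_slope(groups):
    """Pooled within-drug slope of after-drug rate on initial rate."""
    sxy = sxx = 0.0
    for pairs in groups.values():
        xbar = mean([x for x, _ in pairs])
        ybar = mean([y for _, y in pairs])
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
    return sxy / sxx


def adjusted_means(groups, x0=74.0):
    """Predicted after-drug mean for each drug at initial rate x0."""
    slope = common_slope(groups)
    result = {}
    for drug, pairs in groups.items():
        xbar = mean([x for x, _ in pairs])
        ybar = mean([y for _, y in pairs])
        result[drug] = ybar + slope * (x0 - xbar)
    return result


if __name__ == "__main__":
    for drug, pairs in data.items():
        print(drug, "raw mean:", mean([y for _, y in pairs]))
    print("adjusted to 74 bpm:", adjusted_means(data))
```

In this invented data Drug 1 and Drug 2 have the same raw after-drug mean, but the persons on Drug 2 started with lower heart rates; once both are evaluated at 74 beats per minute the adjusted means separate.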
1. Obtain treatment means when not all treatments are observed the same number of times by each person
2. Obtain treatment means when the experimental units do not have identical values of the covariates
Another Role of the Statistician is to provide appropriate models for the analysis of the data from a given study.
These models take the Design Structure and covariates into account to provide estimates of the treatment effects as if all experimental units had observed all treatments, or as if all experimental units had the same value of the covariate -- that is, they provide appropriate LSMEANS.
Involving the Statistician in the Beginning of the Study will
1. Improve the chance of conducting a successful experiment
2. Speed up the turnaround of the analyses, since the Statistician was involved with the design
3. Reduce the costs associated with the experiment -- making sure the sample size is adequate to provide the needed detectable differences
THE END
THANK YOU FOR LISTENING