Uploaded by jomwankunda

Attribution Non-Commercial (BY-NC)

I. Introduction
II. Brief Review & Discussion of Logic
III. Independent Groups
   1. Formula
   2. Formal Example
      1. Research Question
      2. Hypotheses
      3. Assumptions
      4. Decision Rules
      5. Computation
      6. Decision
IV. Dependent Groups
   1. Discussion
   2. Formula
   3. Formal Example
      1. Research Question
      2. Hypotheses
      3. Assumptions
      4. Decision Rules
      5. Computation
      6. Decision
Homework

I. Introduction

Designs that involve two samples are used far more often than the one-sample designs discussed previously, for two reasons:

1. It is rare that µ or σ is known. When using two samples, neither of these parameters is required.

2. Since two groups (or measurements) are included, one serves as a concurrent control. In other words, the two groups (or measurements) occur closely together in time and space. Thus, the treatment and testing circumstances (which introduce lots of potential extraneous variables) can be better controlled. For example, in terms of our IQ/"Bad Kids" example, perhaps the IQ of the population was measured 2 years previously and the IQ in that area was increasing at the rate of 2-4 points per year (for whatever reason; you can be creative here). A comparison against the old population value would then be misleading, whereas a comparison against a concurrent control group would not.

Recall our first example of the experimental method at the very beginning of the semester, involving the effects of marijuana on memory. The ability to analyze such an experiment has been one of the major goals of this course. In this experiment there were two groups, and we need to be able to compare their means and see whether the difference is worth paying attention to (i.e., did marijuana have an effect on memory performance?).

II. Brief Review & Discussion of Logic

Let's take a step back and review what we have covered thus far about inferential statistics. Actually, it goes back a little further than that, to where we learned about standard scores and the normal distribution. The key point was that area under the curve implies probability. To determine these probabilities, we computed standard or z scores. That is:

$$z = \frac{X - \mu}{\sigma}$$ for populations and $$z = \frac{X - \bar{X}}{s}$$ for samples

We also saw that the sampling distribution of the mean was a normal distribution with:

$$\mu_{\bar{X}} = \mu$$ and $$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$$ respectively.

So we were able to use z scores to determine the probability that a particular sample mean was drawn from a given population. That is:

Case I (1 sample, µ & σ known)

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$

Then we went on to a more realistic situation in which the population standard deviation was not known. We estimated it from the sample standard deviation. This complicated things a bit in that the shape of the sampling distribution, while still symmetric and bell-shaped, was no longer exactly normal: its kurtosis (the heaviness of its tails) varied as a function of the sample size (or more accurately, the df). This family of distributions was called Student's t and the formula became:

Case II (1 sample, σ unknown)

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}, \qquad s_{\bar{X}} = \frac{s}{\sqrt{N}}, \qquad df = N - 1$$
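As a rough illustration of Case I and Case II, the following Python sketch computes both statistics for a single sample. The scores, the values µ = 100 and σ = 15, and the function names are invented for this illustration; they are not taken from the course data.

```python
import math
import statistics


def z_one_sample(sample, mu, sigma):
    """Case I: z for a sample mean when mu and sigma are both known."""
    n = len(sample)
    standard_error = sigma / math.sqrt(n)
    return (statistics.mean(sample) - mu) / standard_error


def t_one_sample(sample, mu):
    """Case II: t when sigma is estimated by the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)          # divides by N - 1
    standard_error = s / math.sqrt(n)
    return (statistics.mean(sample) - mu) / standard_error, n - 1


# Hypothetical IQ scores (made up for this sketch).
scores = [92, 88, 101, 95, 90, 85, 97, 94, 89]

z = z_one_sample(scores, mu=100, sigma=15)
t, df = t_one_sample(scores, mu=100)
print(f"z = {z:.3f}")
print(f"t = {t:.3f} with df = {df}")
```

The only difference between the two functions is where the standard deviation comes from: Case I uses the known σ, while Case II estimates it from the sample and therefore refers the result to the t distribution with N-1 degrees of freedom.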

However, as was noted earlier, rarely do we know any of the population parameters, and it is

desirable to have a concurrent control. So we need another sampling distribution to help us

compute the relevant probabilities.

The sampling distribution involves two means, so it is called the sampling distribution of the difference between means. Note that if the two means are the same (when there is no effect of the IV), the difference between them will be zero. So the value that we are interested in here (in terms of the general formula for z given above) is the difference between the means, that is:

$$\bar{X}_1 - \bar{X}_2$$

The mean and standard deviation of the sampling distribution of the difference between means are given by:

$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$$ and $$\sigma_{\bar{X}_1 - \bar{X}_2}$$ respectively.

The latter is called the standard error of the difference between means. Since the sample standard deviations are again used to estimate the population value, the sampling distribution of the difference between means will also be distributed as t (the family of bell-shaped distributions that differ in kurtosis as a function of the df). So the formula becomes:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$$

Under the null hypothesis:

HO: µ1 = µ2

That is, the two samples come from the same population; there is no difference between the population means (i.e., µ1 - µ2 = 0). Thus, the formula reduces to:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$$

All we need to do now is determine the formula for the standard error. However, this formula differs depending on whether we are dealing with independent or dependent groups. Once we understand this distinction, we can move on to Case III (i.e., independent groups) and Case IV (i.e., dependent groups) of hypothesis testing with continuous variables.
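The idea of a sampling distribution of the difference between means can be made concrete with a small simulation. The sketch below is only an illustration under assumed values: it draws two independent samples of 10 from the same hypothetical normal population (µ = 100, σ = 15) many times and records the difference between the two sample means each time.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable


def simulate_differences(mu1, mu2, sigma, n1, n2, reps=5000):
    """Return `reps` simulated values of (mean of group 1 - mean of group 2)."""
    diffs = []
    for _ in range(reps):
        g1 = [random.gauss(mu1, sigma) for _ in range(n1)]
        g2 = [random.gauss(mu2, sigma) for _ in range(n2)]
        diffs.append(statistics.mean(g1) - statistics.mean(g2))
    return diffs


# Assumed population values; under the null hypothesis mu1 equals mu2.
diffs = simulate_differences(mu1=100, mu2=100, sigma=15, n1=10, n2=10)

mean_diff = statistics.mean(diffs)     # should be close to mu1 - mu2 = 0
sd_diff = statistics.stdev(diffs)      # empirical standard error
expected_sd = (15**2 / 10 + 15**2 / 10) ** 0.5

print(f"mean of differences: {mean_diff:.3f}")
print(f"SD of differences:   {sd_diff:.3f} (theory: {expected_sd:.3f})")
```

When the null hypothesis is true, the simulated differences center on zero, and their standard deviation approximates the standard error of the difference between means.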

With the independent groups design, the subjects in each of the two groups are different and

unrelated in any way. The most common type of dependent groups design is also called a

within subjects or repeated measures design, because the same subjects (thus actually only one

group) are tested twice.

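III. Independent Groups

1. Formula

The defining formula (when the sample sizes are equal) for the standard error of the difference between means is:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_{\bar{X}_1}^2 + s_{\bar{X}_2}^2} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$$

And thus the formula for the t value is:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}$$

The computational formulas (which will also handle unequal sample sizes) are given by:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{SS_1 + SS_2}{N_1 + N_2 - 2}\right)\left(\frac{1}{N_1} + \frac{1}{N_2}\right)}, \qquad SS = \sum X^2 - \frac{(\sum X)^2}{N}$$

And:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$$

Since two variances are used in estimating the standard error of the difference between means, the degrees of freedom will equal the sum of the degrees of freedom for each of the variance estimates, that is:

$$df = (N_1 - 1) + (N_2 - 1) = N_1 + N_2 - 2$$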
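The relationship between the defining and computational forms of the standard error can be checked numerically. The following Python sketch assumes the usual pooled-variance form for the computational formula, with df = N1 + N2 - 2; the function names and the two small data sets are made up for illustration.

```python
import math
import statistics


def se_defining(g1, g2):
    """sqrt(s1^2/N1 + s2^2/N2): the defining formula (intended for equal N)."""
    return math.sqrt(statistics.variance(g1) / len(g1)
                     + statistics.variance(g2) / len(g2))


def se_pooled(g1, g2):
    """Pooled (computational) formula; also handles unequal sample sizes."""
    n1, n2 = len(g1), len(g2)
    # Sum of squares via the raw-score form: sum(X^2) - (sum X)^2 / N
    ss1 = sum(x * x for x in g1) - sum(g1) ** 2 / n1
    ss2 = sum(x * x for x in g2) - sum(g2) ** 2 / n2
    df = n1 + n2 - 2
    return math.sqrt((ss1 + ss2) / df * (1 / n1 + 1 / n2)), df


# Hypothetical scores with equal group sizes.
group1 = [3, 5, 7, 9]
group2 = [4, 4, 6, 10]

se_def = se_defining(group1, group2)
se_pool, df = se_pooled(group1, group2)
print(f"defining: {se_def:.4f}  pooled: {se_pool:.4f}  df: {df}")
```

With equal sample sizes the two functions return the same value; with unequal sizes they generally differ, and the pooled form is the one that carries N1 + N2 - 2 degrees of freedom.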

2. Formal Example

Suppose you are a researcher interested in the factors influencing paper grading by professors. A hunch (and/or previous research) might lead you to predict that papers that are typed are rated higher than papers that are handwritten. Research to date, though, has only been correlational, and thus little can be said in terms of a cause-and-effect relationship.

So you have 10 freshman students, each currently taking English as well as an introductory psychology course, write one paper apiece. Each student provides two copies of the paper (one typed and one handwritten). Next, you enlist the aid of 20 English instructors and randomly assign 10 instructors to each of two groups. Each instructor in one group (the control group) grades the 10 handwritten papers, while each instructor in the second group (the experimental group) grades the same 10 papers in typed form.

1. Research Question

Does typing a paper influence the grade it receives?

2. Hypotheses

      In Symbols    In Words
HO    µ1 = µ2       Typing has no effect on the grade a paper receives (as compared to a handwritten paper).
HA    µ1 ≠ µ2       Typing influences the grade for better or worse.

3. Assumptions

1. Our subjects were chosen randomly from the population.

2. The groups are independent.

3. There is homogeneity of variance. That is, the amount of variability in the DV is about equal in each of the groups. When the sample sizes are reasonably large and the number of subjects in each group is about equal, we do not have to worry about this too much because the t test is robust. This means that it is strong and can tolerate some violations of its assumptions.

4. The sampling distribution of the difference between means is normal in shape. In other words, the DV should be normally distributed in the population.

5. The null hypothesis.

4. Decision Rules

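Using alpha of .05 with a two-tailed test and df = N1 + N2 - 2 = 9 + 10 - 2 = 17 (one instructor's data were lost, as explained under Computation), we determine from the t table that the critical value is 2.110. Thus:

If tobs > -2.110 and tobs < 2.110, then do not reject HO.

If tobs ≤ -2.110 or tobs ≥ 2.110, then reject HO.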

5. Computation

Since we are not interested in the differences between the scores of the 10 papers graded by an instructor, we simply calculate the mean grade given by each instructor. Note that one of the instructors in the Written Group had to be excluded because their dog ate the papers they were supposed to grade. Thus, we have 19 means. To describe the data, we present the means and standard deviations for each of the two groups, that is:

        Written   Typed
Data    81        84
        81        89
        79        89
        80        81
        84        87
        87        82
        75        87
        83        85
        88        89
                  83
Mean    82.0      85.6
s       4.03      3.03
N       9         10

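Now the inferential question is whether this difference between means is worth paying attention to. Thus, we will use a between groups t test to answer this question.

Substituting the appropriate values (the sums of squared deviations are 130 for the Written group and 82.4 for the Typed group) gives:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{130 + 82.4}{9 + 10 - 2}\right)\left(\frac{1}{9} + \frac{1}{10}\right)} = 1.62$$

$$t = \frac{82.0 - 85.6}{1.62} = -2.222$$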
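The computation for this example can be reproduced with a short Python sketch. It assumes the pooled-variance form of the standard error (consistent with df = N1 + N2 - 2 = 17) and takes the critical value 2.110 from the t table as stated in the decision rules; the variable names are illustrative.

```python
import math
import statistics

# Mean grade given by each instructor (data from the table above).
written = [81, 81, 79, 80, 84, 87, 75, 83, 88]       # N = 9
typed = [84, 89, 89, 81, 87, 82, 87, 85, 89, 83]     # N = 10

n1, n2 = len(written), len(typed)
mean1, mean2 = statistics.mean(written), statistics.mean(typed)

# Sums of squared deviations for each group.
ss1 = sum((x - mean1) ** 2 for x in written)
ss2 = sum((x - mean2) ** 2 for x in typed)

df = n1 + n2 - 2
se = math.sqrt((ss1 + ss2) / df * (1 / n1 + 1 / n2))
t_obs = (mean1 - mean2) / se

T_CRIT = 2.110  # two-tailed, alpha = .05, df = 17 (from the t table)
reject = abs(t_obs) >= T_CRIT

print(f"means: {mean1:.1f} vs {mean2:.1f}")
print(f"SE = {se:.3f}, t = {t_obs:.3f}, df = {df}, reject H0: {reject}")
```

Carried at full precision the statistic comes out near -2.217; the -2.222 reported in the text results from rounding the standard error to 1.62 before dividing. Either way the value falls beyond -2.110.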

6. Decision

Since -2.222 (tobs) < -2.110 (tcrit) we reject HO and assert the alternative. In other

words, we conclude that typing a paper improves the grade it receives. Notice that

we have actually gone beyond the alternative hypothesis by specifying that the

effect has a direction (typing is good).

IV. Dependent Groups

1. Discussion

As noted earlier, the most common type of Dependent Groups Design is also called a

Within Subjects or Repeated Measures Design, because the same subjects (thus, actually

only one group) are tested twice. There is another situation, though, in which this analysis

is sometimes used. It is called the Matched Groups Design. In this case, there are two

groups, but they are matched on some variable that is highly and positively correlated

with the DV. The procedures involved in matching will be presented more clearly below

in the formal example.

2. Formula

In this case, the standard error of the difference between means is given by:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_{\bar{X}_1}^2 + s_{\bar{X}_2}^2 - 2 r s_{\bar{X}_1} s_{\bar{X}_2}}$$

Notice that the formula requires the computation of the correlation (r) between the two sets of scores. It is here that we see the potential advantage to this design. That is, the error term (the standard error of the difference between means) shrinks as this correlation grows, which results in a potentially more powerful or sensitive test. The disadvantage, though, is the loss of degrees of freedom. The N here refers to the number of pairs of scores (for an individual or matched pair of individuals). Thus, the degrees of freedom are half what we would have if we had used a between groups approach (i.e., N-1 is 1/2 of N1+N2-2). The trick is to make sure the correlation is large enough to offset the loss of df.
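To see how much the correlation term can reduce the error term, the sketch below applies the formula to the red/green reaction-time scores used in the formal example of this section. The standard form of the dependent-groups standard error (with the -2r term) is assumed, and the helper pearson_r is written out by hand for illustration.

```python
import math
import statistics

# Reaction-time scores from the formal example in this section.
red = [18, 16, 23, 30, 32, 30, 31, 25, 27, 21]
green = [22, 20, 29, 35, 27, 29, 33, 29, 31, 24]
n = len(red)  # number of pairs


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Standard error of each mean.
se_x = statistics.stdev(red) / math.sqrt(n)
se_y = statistics.stdev(green) / math.sqrt(n)
r = pearson_r(red, green)

se_dependent = math.sqrt(se_x**2 + se_y**2 - 2 * r * se_x * se_y)
se_ignoring_r = math.sqrt(se_x**2 + se_y**2)  # what r = 0 would give

print(f"r = {r:.3f}")
print(f"SE with correlation term:    {se_dependent:.3f}")
print(f"SE without correlation term: {se_ignoring_r:.3f}")
```

For these data the correlation is high and positive, so the error term is less than half of what it would be if the correlation were ignored. That gain has to be weighed against the df dropping from 18 to 9.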

The formula above would be very cumbersome to use. Fortunately, there is another technique available for obtaining the t value, called the Direct Difference Method. If the difference between the X and Y scores is designated as D (i.e., D = X - Y), then we may restate the null and alternative hypotheses as:

      In Symbols
HO    µD = 0
HA    µD ≠ 0

And with:

$$\bar{D} = \frac{\sum D}{N}, \qquad s_D = \sqrt{\frac{\sum D^2 - (\sum D)^2 / N}{N - 1}}, \qquad s_{\bar{D}} = \frac{s_D}{\sqrt{N}}$$

the formula for t becomes:

$$t = \frac{\bar{D}}{s_{\bar{D}}}$$

where df = N-1 and N refers to the number of pairs of scores.

3. Formal Example

Suppose you are interested in reaction times to different colored lights (especially green and red). We could use either:

• Repeated measures design - test each subject for a number of trials, such as GGRRGRRG, etc. Then compute the average speed to each color light for each subject.

• Matched groups design - test all subjects' reaction times to white light for a given number of trials. Using these data, create two matched groups; that is, take the two quickest subjects and randomly assign one to each of the groups. Then take the next two quickest subjects and randomly assign one of them to each of the groups, etc. For example, with subjects identified by their rank on the white-light test:

Ranked Data    Red    Green
1, 2           2      1
3, 4           3      4
5, 6           6      5
7, 8           8      7
...            ...    ...

Note that the number of subjects must be divisible by the number of groups.
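The matching procedure described above (rank, pair off, randomly split each pair) can be sketched in Python as follows. The function name matched_groups, the subject labels, and the baseline white-light times are all hypothetical.

```python
import random


def matched_groups(baseline, seed=None):
    """Split subjects into two matched groups.

    baseline maps each subject to a pretest score (here, mean reaction
    time to white light). Subjects are ranked, taken two at a time, and
    one member of each pair is randomly assigned to each group.
    """
    if len(baseline) % 2 != 0:
        raise ValueError("number of subjects must be divisible by 2")
    rng = random.Random(seed)
    ranked = sorted(baseline, key=baseline.get)   # quickest first
    group_a, group_b = [], []
    for i in range(0, len(ranked), 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                          # random assignment within the pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b


# Hypothetical white-light reaction times in milliseconds.
baseline = {"s1": 312, "s2": 287, "s3": 305, "s4": 298,
            "s5": 331, "s6": 290, "s7": 320, "s8": 301}

red_group, green_group = matched_groups(baseline, seed=1)
print("Red:  ", red_group)
print("Green:", green_group)
```

Position i in each returned list holds the two members of the same matched pair, which is the pairing the dependent groups t test relies on.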

1. Research Question

Does reaction to red and green lights differ?

2. Hypotheses

      In Symbols    In Words
HO    µ1 = µ2       There is no difference in reaction times between red and green lights.
HA    µ1 ≠ µ2       There is a difference in reaction times between red and green lights.

3. Assumptions

1. Our subjects were chosen randomly from the population.

2. The scores of the two conditions are correlated (i.e., the groups are

dependent).

3. The sampling distribution of the difference between means is normal in

shape. In other words, the DV should be normally distributed in the

population.

4. The null hypothesis.

4. Decision Rules

We will test 10 (or 20 if matched) subjects. Using alpha of .05 with a two-tailed test and df = N-1 = 9, we determine from the t table that the critical value is 2.262. Thus:

If tobs > -2.262 and tobs < 2.262, then do not reject HO.

If tobs ≤ -2.262 or tobs ≥ 2.262, then reject HO.

5. Computation

First we describe the data by computing the means for each condition/group.

While we are at it, we might as well compute the difference scores and their

squares (since we will need them for the analysis).

Subject (or pair)   X (red)   Y (green)   D      D²
1                   18        22          -4     16
2                   16        20          -4     16
3                   23        29          -6     36
4                   30        35          -5     25
5                   32        27           5     25
6                   30        29           1      1
7                   31        33          -2      4
8                   25        29          -4     16
9                   27        31          -4     16
10                  21        24          -3      9
Sum                 253       279         -26    164
Mean                25.3      27.9        -2.6

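Then, for the inferential test, we will use a within groups t test (the direct difference method). With ΣD = -26, ΣD² = 164, and N = 10 pairs, we have:

$$s_D = \sqrt{\frac{164 - (-26)^2 / 10}{10 - 1}} = 3.273, \qquad s_{\bar{D}} = \frac{3.273}{\sqrt{10}} = 1.035$$

$$t = \frac{\bar{D}}{s_{\bar{D}}} = \frac{-2.6}{1.035} = -2.512$$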
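The direct difference computation for these data can be reproduced with the Python sketch below. The critical value 2.262 is taken from the t table as stated in the decision rules; the variable names are illustrative.

```python
import math

# Scores from the table above.
red = [18, 16, 23, 30, 32, 30, 31, 25, 27, 21]       # X
green = [22, 20, 29, 35, 27, 29, 33, 29, 31, 24]     # Y

d = [x - y for x, y in zip(red, green)]               # D = X - Y
n = len(d)                                            # number of pairs

sum_d = sum(d)
sum_d2 = sum(v * v for v in d)
mean_d = sum_d / n

# Standard deviation of the difference scores, then its standard error.
s_d = math.sqrt((sum_d2 - sum_d**2 / n) / (n - 1))
se_d = s_d / math.sqrt(n)

t_obs = mean_d / se_d
df = n - 1

T_CRIT = 2.262  # two-tailed, alpha = .05, df = 9 (from the t table)
reject = abs(t_obs) >= T_CRIT

print(f"sum D = {sum_d}, sum D^2 = {sum_d2}, mean D = {mean_d:.1f}")
print(f"t = {t_obs:.3f}, df = {df}, reject H0: {reject}")
```

Only the difference scores enter the calculation, which is why this method avoids computing the correlation explicitly.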

6. Decision

Since -2.512 (tobs) < -2.262 (tcrit), we reject HO and assert the alternative. In other words, we conclude that reaction time is quicker to red light than to green light.

TWO-SAMPLE TEST OF A HYPOTHESIS

A. Overview of Two-Sample Hypothesis Testing
B. Step-By-Step Instructions for Performing a Two-Sample Hypothesis Test in Excel
C. Interpreting the Results of the Test

A. Overview of Two-Sample Hypothesis Testing

Two-sample hypothesis testing is a statistical analysis designed to test whether

there is a difference between two means from two different

populations. For example, a two-sample hypothesis could be used to

test if there is a difference in the mean salary between male and

female doctors in the New York City area. A two-sample hypothesis

test could also be used to test if the mean number of defective parts

produced using assembly line A is greater than the mean number of

defective parts produced using assembly line B. Similar to one-sample

hypothesis tests, a one-tailed or two-tailed test of the null hypothesis

can be performed in two-sample hypothesis testing as well. The two-

sample hypothesis test of no difference between the mean salaries of

male and female doctors in the New York City area is an example of a

two-tailed test. The test of whether or not the mean number of

defective parts produced on assembly line A is greater than the mean

number of defective parts produced on assembly line B is an example

of a one-tailed test. The following section provides step-by-step

instructions for performing a two-sample test of a hypothesis in Excel.

B. Step-By-Step Instructions for Performing a Two-Sample Hypothesis Test in Excel

Big Foods Grocery has two grocery stores located in Johnston City. One

store is located on First Street and the other on Main Street and each is

run by a different manager. Each manager claims that her store's

layout maximizes the amounts customers will purchase on impulse.

Both managers surveyed a sample of their customers and asked them

how much more they spent than they had planned to, in other words,

how much did they spend on impulse? The following table shows the

sample data collected from the two stores.

First Street    Main Street
15.78           15.19
17.73           18.22
10.61           15.38
15.79           15.96
14.22           21.92
13.82           12.87
13.45           12.47
12.86           13.96
10.82           13.79
12.85           13.74
                18.40
                18.57
                17.79
                10.83

Upper-level management wants to know whether there is a difference in the mean amounts purchased on impulse at the two stores and has hired you to perform the statistical analysis. This question can be addressed by performing a two-sample test of a hypothesis. The following describes the steps to perform the test in Excel.

Step 1. The first step is to state the hypothesis to be tested, called

the null hypothesis, and the alternative hypothesis. In this example,

upper-level management wants to know if there is a difference in the

mean amounts purchased on impulse at the two stores. An alternative

way to state this question is "Is the mean amount purchased on

impulse at the First Street store equal to the mean amount purchased

at the Main street store?" Recall that the "equality" part of the

hypothesis is always stated in the null hypothesis. Therefore, the null

and alternative hypotheses for this example are:

$$H_0: \mu_f = \mu_m \qquad H_A: \mu_f \neq \mu_m$$

where μf is the mean amount spent on impulse in the First Street store

and μm is the mean amount spent on impulse in the Main Street store.

Note, this is a two-tailed test of a hypothesis.

Step 2. Select the level of significance to be used in the test. The

level of significance is the probability of rejecting the null hypothesis

when it is true. Common significance levels are .10, .05, and .01.

Suppose you chose a .05 level of significance, meaning there is a 5%

chance that you will reject the null hypothesis when it is true.

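Step 3. Select the test statistic that is appropriate for this test. In general, you will need to decide between using a z test statistic or a t test statistic. If one or more of the sample sizes is less than 30 (as in this problem), a t statistic is appropriate. The test statistic for this example, which assumes equal population variances, is:

$$t = \frac{\bar{x}_f - \bar{x}_m}{\sqrt{s_p^2\left(\dfrac{1}{n_f} + \dfrac{1}{n_m}\right)}}, \qquad s_p^2 = \frac{(n_f - 1)s_f^2 + (n_m - 1)s_m^2}{n_f + n_m - 2}$$

where $\bar{x}$, $s^2$, and $n$ are the sample mean, sample variance, and sample size for each store, and $s_p^2$ is the pooled variance.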

Step 4. Determine the rejection region. The rejection region defines the conditions under which the null hypothesis is rejected. (See the section One-Sample Test of a Hypothesis for more details on the rejection region.) The critical values for this test are based on degrees of freedom, and in this problem the degrees of freedom are equal to 22 (10 + 14 - 2). At the .05 level of significance, the critical t values for a two-tailed test are -2.074 and 2.074. Therefore, if the test statistic is less than -2.074 or greater than 2.074, we will reject the null hypothesis in favor of the alternative.

Step 5. Perform the hypothesis test. The above calculations are easily computed in Excel. First, input the data into an Excel spreadsheet, with the First Street values in cells A2:A11 and the Main Street values in cells B2:B15.

From the Tools pull-down menu, select Data Analysis, and then select t-Test: Two-Sample Assuming Equal Variances.

Click OK in the Data Analysis window and the t-Test: Two-Sample Assuming Equal Variances window opens.

In the Variable 1 Range field, type A2:A11, or click the worksheet icon to the right of the Variable 1 Range field and click and drag the cursor over the data in column A. In the Variable 2 Range field, type B2:B15, or click the worksheet icon to the right of the Variable 2 Range field and click and drag the cursor over the data in column B. In the Hypothesized Mean Difference field, type 0, and in the Output Options box, type D1 in the Output Range field.

Click OK in the t-Test: Two-Sample Assuming Equal Variances window and the results of the hypothesis test appear in the worksheet, starting at cell D1.

C. Interpreting the Results of the Test

Excel's output reports the test statistic and the critical values for the test. Recall that if the test statistic is less than -2.074 or greater than 2.074, we reject the null hypothesis in favor of the alternative. The test statistic is -1.649, which does not fall into the rejection region, so we fail to reject the null hypothesis of no difference between the two population means. In other words, at the .05 level of significance, the samples do not provide sufficient evidence that the mean amount spent on impulse at the First Street grocery store differs from the mean amount spent on impulse at the Main Street grocery store.
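For readers without Excel, the same equal-variance test can be sketched in Python using only the standard library. This is an illustration of the pooled-variance calculation, not a reproduction of Excel's output; the critical value 2.074 is taken from the text rather than computed.

```python
import math
import statistics

# Impulse-purchase amounts from the table in section B.
first_street = [15.78, 17.73, 10.61, 15.79, 14.22,
                13.82, 13.45, 12.86, 10.82, 12.85]
main_street = [15.19, 18.22, 15.38, 15.96, 21.92, 12.87, 12.47,
               13.96, 13.79, 13.74, 18.40, 18.57, 17.79, 10.83]

n_f, n_m = len(first_street), len(main_street)
mean_f = statistics.mean(first_street)
mean_m = statistics.mean(main_street)
var_f = statistics.variance(first_street)   # sample variance (n - 1)
var_m = statistics.variance(main_street)

# Pooled variance and standard error, assuming equal population variances.
df = n_f + n_m - 2
pooled_var = ((n_f - 1) * var_f + (n_m - 1) * var_m) / df
se = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))

t_stat = (mean_f - mean_m) / se

T_CRIT = 2.074  # two-tailed, alpha = .05, df = 22 (value given in the text)
reject = abs(t_stat) > T_CRIT

print(f"means: {mean_f:.3f} vs {mean_m:.3f}")
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

The statistic comes out at about -1.65, in line with the value reported from Excel, and it stays inside the interval from -2.074 to 2.074.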

Elon University, Campus Box 2700, Elon, NC 27244, (800) 334-8448
E-mail: web@elon.edu
Copyright © Elon University
