a student project


BINOMIAL AND RELATED DISTRIBUTIONS

SUBMITTED BY

OSAMA BIN AJAZ

(std_18154@iobm.edu.pk)

CONTENTS

Abstract
Bernoulli distribution
Binomial distribution
Multinomial distribution
Beta binomial distribution
Correlated binomial distribution
Neyman C(α) test
Testing goodness of fit of binomial distribution
The C(α) test for correlated binomial alternatives
The C(α) test for beta binomial alternatives
The C(α) test for Altham's multiplicative alternatives
References

ABSTRACT

R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derived tests for the goodness of fit of the binomial distribution using the C(α) procedure of Neyman (1959). These tests are asymptotically optimal against the generalized binomial alternatives proposed by Altham (1978) and Kupper & Haseman (1978).

Before turning to the article, I explain the binomial and related distributions. I have reproduced key parts of the article; readers interested in its details are advised to consult the references at the end of the report.

Bernoulli trial

A Bernoulli trial (named after James Bernoulli, one of the founding fathers of probability theory) is an experiment with two, and only two, possible outcomes [2]. Examples are female or male, life or death, head or tail, and success or failure. A sequence of Bernoulli trials occurs when a Bernoulli experiment is performed several independent times so that the probability of success, say p, remains the same from trial to trial.

Bernoulli distribution

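A random variable X is defined to have a Bernoulli distribution if the discrete density function of X is given by

$$f(x) = \begin{cases} p^x (1-p)^{1-x} & \text{for } x = 0 \text{ or } 1 \\ 0 & \text{otherwise} \end{cases}$$

where the parameter p satisfies 0 ≤ p ≤ 1 and q = 1 − p.

If X has a Bernoulli distribution, then

$$E[X] = p, \qquad \operatorname{var}[X] = pq, \qquad M_X(t) = pe^t + q.$$

Proof

$$E[X] = \sum_{x=0}^{1} x\, p^x (1-p)^{1-x} = 0 \cdot q + 1 \cdot p = p$$

$$\operatorname{var}[X] = E[X^2] - (E[X])^2 = p - p^2 = pq$$

$$M_X(t) = E[e^{tX}] = \sum_{x=0}^{1} e^{tx} p^x (1-p)^{1-x} = q + pe^t$$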

Example 1: Out of millions of instant lottery tickets, suppose that 20% are winners. If five such tickets are purchased, (0, 0, 0, 1, 0) is a possible observed sequence in which the fourth ticket is a winner and the other four are losers. Assuming independence among winning and losing tickets, the probability of this outcome is (0.8)(0.8)(0.8)(0.2)(0.8) = (0.2)(0.8)^4 [5].
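The Bernoulli density and the lottery sequence of Example 1 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `bernoulli_pmf` is hypothetical, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

def bernoulli_pmf(x, p):
    """f(x) = p^x (1-p)^(1-x) for x in {0, 1}, and 0 otherwise."""
    if x not in (0, 1):
        return Fraction(0)
    return p**x * (1 - p)**(1 - x)

p = Fraction(1, 5)  # 20% of the tickets are winners

# Mean and variance computed directly from the density
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
var = sum(x**2 * bernoulli_pmf(x, p) for x in (0, 1)) - mean**2

# Probability of the observed sequence (0, 0, 0, 1, 0) under independence
sequence = (0, 0, 0, 1, 0)
prob = Fraction(1)
for x in sequence:
    prob *= bernoulli_pmf(x, p)

print(mean, var, prob, float(prob))
```

Here `mean` and `var` should agree with p and pq, and `prob` with (0.2)(0.8)^4.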

Binomial distribution

In a sequence of Bernoulli trials, we are often interested in the total number of successes and not in the order of their occurrence. If we let the random variable X equal the number of observed successes in n Bernoulli trials, the possible values of X are 0, 1, 2, ..., n. If x successes occur, where x = 0, 1, 2, ..., n, then n − x failures occur. The number of ways of selecting x positions for the x successes in the n trials is

$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}.$$

Since the trials are independent and since the probabilities of success and failure on each trial are, respectively, p and q = 1 − p, the probability of each of these ways is p^x (1 − p)^(n−x). Thus f(x), the p.m.f. of X, is the sum of the probabilities of these mutually exclusive events, that is,

$$f(x) = \binom{n}{x} p^x (1-p)^{n-x}, \qquad x = 0, 1, 2, \ldots, n.$$

These probabilities are called binomial probabilities, and a random variable X with this p.m.f. is said to have a binomial distribution.

A binomial experiment satisfies the following properties:

1. A Bernoulli experiment is performed n times.

2. The trials are independent.

3. The probability of success on each trial is a constant p; the probability

of failure is q=1-p.

4. The random variable X equals the number of successes in the n trials.

A binomial distribution is denoted by the symbol b(n, p), and we say that the distribution of X is b(n, p). The constants n and p are called the parameters of the binomial distribution. Thus if we say that the distribution of X is b(10, 1/5), we mean that X is the number of successes in n = 10 Bernoulli trials with p = 1/5.

The binomial distribution derives its name from the fact that the (n + 1) terms in the binomial expansion of (q + p)^n correspond to the various values of b(x; n, p) for x = 0, 1, 2, ..., n. That is,

$$(q+p)^n = \binom{n}{0} q^n + \binom{n}{1} p\,q^{n-1} + \binom{n}{2} p^2 q^{n-2} + \cdots + \binom{n}{n} p^n = \sum_{x=0}^{n} b(x; n, p) = 1.$$

Example 2: To find the probability of obtaining exactly three 2s when an ordinary die is tossed 4 times, we compute

$$b\left(3; 4, \tfrac{1}{6}\right) = \binom{4}{3} \left(\tfrac{1}{6}\right)^{3} \left(\tfrac{5}{6}\right)^{1} = \frac{5}{324} \approx 0.0154.$$
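As a quick check of the binomial p.m.f. and of the die example, the following Python sketch evaluates b(x; n, p) with exact fractions. The helper name `binomial_pmf` is an assumption made for illustration only.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1-p)^(n-x) for x = 0, 1, ..., n."""
    if not 0 <= x <= n:
        return Fraction(0)
    return comb(n, x) * p**x * (1 - p)**(n - x)

p = Fraction(1, 6)                      # probability of a 2 on one toss
three_twos = binomial_pmf(3, 4, p)      # exactly three 2s in four tosses
total = sum(binomial_pmf(x, 4, p) for x in range(5))

print(three_twos, float(three_twos), total)
```

The variable `total` adds the n + 1 terms of the expansion of (q + p)^n, so it should equal 1.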

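The mean, variance and moment generating function of b(x; n, p) are

$$\mu = np, \qquad \sigma^2 = npq, \qquad M_X(t) = (q + pe^t)^n$$

respectively.

Proof

$$M_X(t) = E[e^{tX}] = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x q^{n-x} = \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x q^{n-x} = (pe^t + q)^n$$

The first derivative is

$$M_X'(t) = n\,pe^t\,(pe^t + q)^{n-1}$$

and the second derivative is

$$M_X''(t) = n(n-1)(pe^t)^2 (pe^t + q)^{n-2} + n\,pe^t\,(pe^t + q)^{n-1}.$$

Hence E[X] = M_X'(0) = np, and

$$\operatorname{Var}[X] = E[X^2] - \{E[X]\}^2 = M_X''(0) - (np)^2 = n(n-1)p^2 + np - (np)^2 = np(1-p).$$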

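Example 3: If the moment generating function of X is

$$M_X(t) = \left(\tfrac{2}{3} + \tfrac{1}{3} e^t\right)^{5},$$

then X has a binomial distribution with n = 5 and p = 1/3; that is, the pmf of X is

$$f(x) = \binom{5}{x} \left(\tfrac{1}{3}\right)^{x} \left(\tfrac{2}{3}\right)^{5-x}, \qquad x = 0, 1, \ldots, 5.$$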
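The moment results for the binomial distribution can be verified numerically for the case n = 5, p = 1/3. The sketch below computes the mean and variance directly from the pmf and compares the sum defining the moment generating function with the closed form (q + pe^t)^n; the evaluation point t = 0.7 is an arbitrary choice.

```python
from fractions import Fraction
from math import comb, exp, isclose

n, p = 5, Fraction(1, 3)
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Moments computed directly from the pmf (exact arithmetic)
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum(x**2 * f for x, f in enumerate(pmf)) - mean**2

# Moment generating function: defining sum versus closed form
t = 0.7  # arbitrary point at which to compare the two expressions
mgf_from_pmf = sum(exp(t * x) * float(f) for x, f in enumerate(pmf))
mgf_closed_form = (float(q) + float(p) * exp(t)) ** n

print(mean, variance, mgf_from_pmf, mgf_closed_form)
```

The exact values `mean` and `variance` should match np and npq, and the two mgf values should agree up to floating-point error.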

Note: Binomial distribution reduces to the Bernoulli distribution when n=1.

Sometimes the Bernoulli distribution is called the point binomial.

Example 4: Let the random variable Y be equal to the number of successes throughout n independent repetitions of a random experiment with probability p of success. That is, Y is b(n, p). The ratio Y/n is called the relative frequency of success. Now recall Chebyshev's inequality, i.e. P(|X − μ| ≥ ε) ≤ σ²/ε² for all ε > 0. Applied to Y/n, which has mean p and variance p(1 − p)/n, it gives

$$P\left(\left|\frac{Y}{n} - p\right| \ge \varepsilon\right) \le \frac{\operatorname{Var}(Y/n)}{\varepsilon^2} = \frac{p(1-p)}{n\varepsilon^2}.$$

Now, for every fixed ε > 0, the right-hand member of the preceding inequality is close to zero for sufficiently large n. That is,

$$\lim_{n \to \infty} P\left(\left|\frac{Y}{n} - p\right| \ge \varepsilon\right) = 0.$$

Since this is true for every fixed ε > 0, we see, in a certain sense, that the relative frequency of success is, for large values of n, close to the probability p of success [3].
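To see how loose or tight the Chebyshev bound is, the exact tail probability P(|Y/n − p| ≥ ε) can be computed from the binomial pmf and set beside p(1 − p)/(nε²). The values p = 0.3 and ε = 0.1 and the function names are illustrative assumptions; fractions are used only to make the boundary comparison exact.

```python
from fractions import Fraction
from math import comb

def tail_probability(n, p, eps):
    """Exact P(|Y/n - p| >= eps) for Y distributed b(n, p)."""
    pf = float(p)
    return sum(
        comb(n, y) * pf**y * (1 - pf)**(n - y)
        for y in range(n + 1)
        if abs(Fraction(y, n) - p) >= eps  # exact comparison on the boundary
    )

def chebyshev_bound(n, p, eps):
    """Upper bound p(1-p) / (n eps^2) from Chebyshev's inequality."""
    return float(p * (1 - p) / (n * eps**2))

p, eps = Fraction(3, 10), Fraction(1, 10)
results = [(n, tail_probability(n, p, eps), chebyshev_bound(n, p, eps))
           for n in (10, 100, 1000)]
for n, exact, bound in results:
    print(f"n={n:5d}  exact={exact:.6g}  bound={bound:.6g}")
```

Both columns shrink as n grows, and the exact probability stays below the bound, which is all the argument in Example 4 requires.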

Example 5: Let the independent random variables X1, X2, X3 have the same cdf F(x). Let Y be the middle value of X1, X2, X3. To determine the cdf of Y, say F_Y(y) = P(Y ≤ y), we note that Y ≤ y if and only if at least two of the random variables X1, X2, X3 are less than or equal to y. Let us say that the ith trial is a success if X_i ≤ y, i = 1, 2, 3; here each trial has the probability of success F(y). In this terminology, F_Y(y) = P(Y ≤ y) is then the probability of at least two successes in three independent trials. Thus

$$F_Y(y) = \binom{3}{2} [F(y)]^2 [1 - F(y)] + [F(y)]^3.$$

If F(x) is a continuous cdf so that the pdf of X is F'(x) = f(x), then the pdf of Y is

$$f_Y(y) = F_Y'(y) = 6\,F(y)\,[1 - F(y)]\,f(y). \quad [4]$$
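The cdf formula for the middle value holds for any common cdf F, so it can be checked by brute force on a small discrete case. The sketch below assumes, purely for illustration, that each X_i is the outcome of a fair die, and compares the formula with a full enumeration of the 6^3 equally likely triples.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # X uniform on {1, ..., 6}; an illustrative choice

def cdf_x(y):
    """F(y) = P(X <= y) for one fair die."""
    return Fraction(sum(1 for v in faces if v <= y), 6)

def cdf_median_formula(y):
    """F_Y(y) = C(3,2) F(y)^2 [1 - F(y)] + F(y)^3."""
    F = cdf_x(y)
    return 3 * F**2 * (1 - F) + F**3

def cdf_median_enumerated(y):
    """P(middle value <= y) by listing all 6^3 equally likely outcomes."""
    hits = sum(1 for triple in product(faces, repeat=3)
               if sorted(triple)[1] <= y)
    return Fraction(hits, 6**3)

for y in faces:
    print(y, cdf_median_formula(y), cdf_median_enumerated(y))
```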

MULTINOMIAL DISTRIBUTION

Recall that for an experiment to be binomial, each trial must have exactly two outcomes. If each trial in an experiment has more than two outcomes, a distribution called the multinomial distribution must be used. For example, a survey might require the responses "approve", "disapprove" or "no opinion". In another situation, a person may have a choice of one of five activities for Friday night, such as a movie, dinner, baseball game, play or party. Since these situations have more than two possible outcomes for each trial, the binomial distribution cannot be used to compute probabilities.

If X consists of events E1, E2, E3, ..., Ek, which have corresponding probabilities p1, p2, p3, ..., pk of occurring, and X1 is the number of times E1 will occur, X2 is the number of times E2 will occur, X3 is the number of times E3 will occur, etc., then the probability that X will occur is

$$P(X) = \frac{n!}{X_1!\, X_2!\, X_3! \cdots X_k!}\; p_1^{X_1} p_2^{X_2} \cdots p_k^{X_k},$$

where X1 + X2 + ... + Xk = n and p1 + p2 + ... + pk = 1.

For illustration, suppose a box contains four white balls, three red balls and three blue balls. A ball is selected at random and its color is written down; it is replaced each time. We want the probability that, if five balls are selected, two are white, two are red and one is blue. Here n = 5 and the probabilities are 0.4, 0.3 and 0.3, so

$$P(X) = \frac{5!}{2!\,2!\,1!}\,(0.4)^2 (0.3)^2 (0.3)^1 = 0.1296.$$
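The box example can be worked through in Python as a sketch of the multinomial formula. The function name `multinomial_probability` is hypothetical; the colour probabilities 4/10, 3/10 and 3/10 follow from the composition of the box and sampling with replacement.

```python
from fractions import Fraction
from math import factorial

def multinomial_probability(counts, probs):
    """n! / (x1! ... xk!) * p1^x1 ... pk^xk, with n = sum of the counts."""
    coefficient = factorial(sum(counts))
    for x in counts:
        coefficient //= factorial(x)   # stays an integer at every step
    probability = Fraction(coefficient)
    for x, p in zip(counts, probs):
        probability *= p**x
    return probability

# Four white, three red and three blue balls, drawn with replacement
probs = [Fraction(4, 10), Fraction(3, 10), Fraction(3, 10)]
answer = multinomial_probability([2, 2, 1], probs)  # two white, two red, one blue

print(answer, float(answer))
```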

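Beta binomial distribution

The distribution with discrete density function

$$f(x) = f(x; n, \alpha, \beta) = \binom{n}{x} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \cdot \frac{\Gamma(x+\alpha)\,\Gamma(n+\beta-x)}{\Gamma(n+\alpha+\beta)}\; I_{\{0,1,\ldots,n\}}(x),$$

where α > 0 and β > 0, is defined as the beta binomial distribution.

The beta binomial distribution has

$$\text{mean} = \frac{n\alpha}{\alpha+\beta} \qquad \text{and} \qquad \text{variance} = \frac{n\alpha\beta\,(n+\alpha+\beta)}{(\alpha+\beta)^2(\alpha+\beta+1)}.$$

If α = β = 1, then the beta binomial distribution reduces to a discrete uniform distribution over the integers 0, 1, ..., n. [2]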
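A short numerical check of the beta binomial density is sketched below, assuming the standard gamma-function form f(x; n, α, β) = C(n, x) Γ(α+β)/(Γ(α)Γ(β)) · Γ(x+α)Γ(n+β−x)/Γ(n+α+β). The parameter values n = 8, α = 2, β = 3 are arbitrary, and `lgamma` is used to avoid overflow in the gamma functions.

```python
from math import comb, exp, isclose, lgamma

def beta_binomial_pmf(x, n, alpha, beta):
    """Beta binomial density evaluated through log-gamma functions."""
    if not 0 <= x <= n:
        return 0.0
    log_term = (lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta)
                + lgamma(x + alpha) + lgamma(n + beta - x)
                - lgamma(n + alpha + beta))
    return comb(n, x) * exp(log_term)

n, alpha, beta = 8, 2.0, 3.0
pmf = [beta_binomial_pmf(x, n, alpha, beta) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum(x**2 * f for x, f in enumerate(pmf)) - mean**2

# With alpha = beta = 1 every value 0, 1, ..., n should have mass 1/(n+1)
uniform = [beta_binomial_pmf(x, n, 1.0, 1.0) for x in range(n + 1)]

print(sum(pmf), mean, variance, uniform[0])
```

The computed `mean` and `variance` can be compared with nα/(α+β) and nαβ(n+α+β)/((α+β)²(α+β+1)).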

Correlated binomial distribution

The correlated binomial (CB) distribution allows for the possibility that the binary responses of the fetuses in a litter are not mutually independent. This idea is due to Bahadur (1961). Retaining only the first order correlation between the responses and denoting by θ the covariance between the binary responses of any two fetuses, the random variable X is such that

$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \left[ 1 + \frac{\theta}{2p^2(1-p)^2} \left\{ (x - np)^2 + x(2p - 1) - np^2 \right\} \right], \qquad x = 0, 1, \ldots, n,$$

where p is the probability that the fetus is abnormal. Note that for the above equation to be a valid probability distribution, a data-dependent bound on the parameters has to be imposed; see Kupper and Haseman (1978). It can be shown that the expectation and variance of the correlated binomial distribution are np and np(1 − p) + n(n − 1)θ, respectively. Thus the correlated binomial distribution is a generalization of the binomial distribution: the CB distribution becomes the binomial distribution when θ = 0. Altham (1978) derived a further two-parameter generalized binomial distribution, namely the multiplicative generalized binomial (MB) distribution.
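The stated mean and variance of the correlated binomial distribution can be checked numerically. The sketch below assumes the Kupper and Haseman form of the pmf, C(n, x) p^x q^(n−x) [1 + θ/(2p²q²) {(x − np)² + x(2p − 1) − np²}], and uses illustrative values n = 5, p = 0.3 and a small θ = 0.01 so that every probability stays positive.

```python
from math import comb, isclose

def correlated_binomial_pmf(x, n, p, theta):
    """Binomial pmf times the first-order correlation correction."""
    q = 1 - p
    binom = comb(n, x) * p**x * q**(n - x)
    correction = (x - n * p)**2 + x * (2 * p - 1) - n * p**2
    return binom * (1 + theta / (2 * p**2 * q**2) * correction)

# theta is chosen small enough to keep every term positive (assumed values)
n, p, theta = 5, 0.3, 0.01
pmf = [correlated_binomial_pmf(x, n, p, theta) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum(x**2 * f for x, f in enumerate(pmf)) - mean**2

print(sum(pmf), mean, variance)
```

For these values the variance should exceed the binomial variance np(1 − p) by n(n − 1)θ, while the mean stays at np.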

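The probability mass function of the Altham-multiplicative binomial distribution is

$$P(X = x) = \frac{\binom{n}{x}\, p^x (1-p)^{n-x}\, a^{x(n-x)}}{F(n)}, \qquad x = 0, 1, 2, \ldots, n, \quad a \ge 0, \quad 0 \le p \le 1,$$

where F(n) is the normalizing constant that makes the probabilities sum to one.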
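A minimal sketch of the multiplicative binomial pmf follows, assuming that F(n) is simply the sum of the unnormalized terms C(n, x) p^x (1 − p)^(n−x) a^(x(n−x)). The values n = 6, p = 0.4 and a = 0.8 are arbitrary choices for illustration.

```python
from math import comb, isclose

def multiplicative_binomial_pmf(n, p, a):
    """Return the list [P(X=0), ..., P(X=n)] for the multiplicative model."""
    weights = [comb(n, x) * p**x * (1 - p)**(n - x) * a**(x * (n - x))
               for x in range(n + 1)]
    normalizer = sum(weights)  # plays the role of F(n)
    return [w / normalizer for w in weights]

n, p = 6, 0.4
plain = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
at_one = multiplicative_binomial_pmf(n, p, 1.0)     # a = 1: ordinary binomial
below_one = multiplicative_binomial_pmf(n, p, 0.8)  # a < 1: mass moves to the ends

print(at_one[0], plain[0], below_one[0], below_one[n])
```

With a = 1 the factor a^(x(n−x)) disappears and the binomial pmf is recovered; with a < 1 the middle values are down-weighted relative to x = 0 and x = n.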

Neyman C(α) test

Hypothesis testing problems in applied research often involve several nuisance parameters. In these composite testing problems, most powerful tests do not exist, which motivates the search for an optimal test procedure that yields the highest power among the class of tests of the same size. Neyman's locally asymptotically optimal result for the C(α) test employs regularity conditions inherited from the conditions used by Cramér (1946) for showing consistency of the MLE, together with some further restrictions on the testing function that allow the unknown nuisance parameters to be replaced by their root-n consistent estimators. It is the confluence of these Cramér conditions and the maintained significance level α that gives the C(α) test its name.

TESTING GOODNESS OF FIT OF BINOMIAL DISTRIBUTION

R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derived tests for the goodness of fit of the binomial distribution using the C(α) procedure of Neyman (1959), which are asymptotically optimal against the generalized binomial alternatives proposed by Altham (1978) and Kupper & Haseman (1978) [5].

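The C(α) test for correlated binomial alternatives

Consider an experiment in which the responses take the form of proportions, and let the ith response be given by p_i = x_i/n_i for i = 1, ..., M. Under the correlated binomial model the log likelihood function is, apart from an additive constant,

$$L(p, \theta) = \sum_{i=1}^{M} \left\{ x_i \log p + (n_i - x_i) \log q \right\} + \sum_{i=1}^{M} \log\left[ 1 + \frac{\theta}{2p^2q^2} \left\{ (x_i - n_i p)^2 + x_i(2p - 1) - n_i p^2 \right\} \right],$$

where q = 1 − p.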

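A test of the goodness of fit of the binomial distribution is obtained by testing the null hypothesis H0: θ = 0 in the presence of the nuisance parameter p. Moran (1970) demonstrated that for such problems the C(α) tests proposed by Neyman (1959) are asymptotically equivalent to tests using maximum likelihood estimators. The C(α) test is built from the following partial derivatives of L evaluated at θ = 0:

$$S_1(p) = \left.\frac{\partial L}{\partial \theta}\right|_{\theta=0}, \qquad S_2(p) = \left.\frac{\partial^2 L}{\partial \theta\, \partial p}\right|_{\theta=0}, \qquad S_3(p) = -\left.\frac{\partial^2 L}{\partial \theta^2}\right|_{\theta=0}.$$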

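Under the null hypothesis, the x_i are independent binomial random variables, and hence it follows from (2) that E{S2(p)} = 0. Neyman (1959) has shown that when E{S2(p)} = 0 the null hypothesis H0: θ = 0 can be tested using the statistic S1(p̂), where p̂ is a root-n consistent estimator of p (Moran, 1970). Substituting the consistent estimator

$$\hat{p} = \frac{\sum x_i}{\sum n_i}, \qquad \hat{q} = 1 - \hat{p},$$

we find that S1(p̂) = S/(2p̂q̂), so that the C(α) test statistic is given by

$$S = \sum_{i=1}^{M} \frac{(x_i - n_i \hat{p})^2}{\hat{p}\hat{q}} - \sum_{i=1}^{M} n_i.$$

Since E{S2(p)} = 0, the variance of S1(p̂) is given by E{S3(p)}, where the expectation is taken under H0: θ = 0. From (3) it follows that

$$E\{S_3(p)\} = \frac{\sum n_i (n_i - 1)}{2p^2q^2}.$$

Substituting p̂ for p, we obtain the statistic

$$X_c^2 = \frac{S^2}{2 \sum n_i (n_i - 1)},$$

which has an asymptotic chi-squared distribution with one degree of freedom when θ = 0.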

The statistic X²c is the C(α) test statistic for homogeneity of proportions which is asymptotically optimal against correlated binomial alternatives. The binomial variance test for homogeneity is based on the statistic

$$X_v^2 = \sum_{i=1}^{M} \frac{(x_i - n_i \hat{p})^2}{n_i\, \hat{p}\hat{q}},$$

which has an asymptotic chi-squared distribution with M − 1 degrees of freedom when θ = 0. It is clear from the above expressions that, for the case in which n_i = n for all i, the C(α) test statistic S is equivalent to the variance test statistic X²v.

The C(α) test for beta binomial alternatives

The beta-binomial distribution is a mixture of binomial distributions which has often been utilized as an alternative to the binomial distribution. Under the beta-binomial model, a test of the goodness of fit of the binomial distribution is again obtained by testing the null hypothesis H0: θ = 0. The derivation of the C(α) test statistic using the beta-binomial model is similar to the derivation for the correlated binomial model, and the optimal statistic again is found to be the statistic S derived in the last section. Note, however, that in the beta-binomial model the parameter θ cannot take negative values. The alternative hypothesis is necessarily one sided, and hence the C(α) test is the one-sided test based on the statistic

$$Z = \frac{S}{\left\{ 2 \sum n_i (n_i - 1) \right\}^{1/2}}.$$

Under the null hypothesis H0: θ = 0, the statistic Z will have an asymptotic standard normal distribution.

The C(α) test for Altham's multiplicative alternatives

The multiplicative generalization of the binomial distribution provides an alternative for which the correlated binomial C(α) test is not asymptotically optimal. The C(α) test for H0: θ = 1, where θ here denotes the multiplicative parameter (written a in the probability mass function above), is derived from the log likelihood function of the multiplicative generalization of the binomial model and is based on the statistic

$$R = \sum_{i=1}^{M} x_i (n_i - x_i).$$

Note that, unlike the correlated binomial C(α) statistic, R is not equivalent to the variance test statistic in the case n_i = n for all i. The resulting test statistic X²m will have an asymptotic chi-squared distribution with one degree of freedom. The test based on X²m is asymptotically optimal against alternatives given by the multiplicative generalization of the binomial model.

In order to compare the different tests of the goodness of fit of the binomial distribution, we consider the treatment group data of Kupper & Haseman (1978, p. 75). The observed proportions were 0/5, 2/5, 1/7, 0/8, 2/8, 3/8, 0/9, 4/9, 1/10 and 6/10. The variance test gives X²v = 19.03 and P = 0.025; the correlated binomial C(α) test gives X²c = 6.63 and P = 0.01. Thus for this example, the correlated binomial C(α) test is more sensitive to the departure of the observed proportions from a binomial distribution than the other tests considered.
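The two reported statistics can be recomputed from the ten proportions. The sketch below assumes the forms X²v = Σ(x_i − n_i p̂)²/(n_i p̂q̂) for the variance test and S = Σ(x_i − n_i p̂)²/(p̂q̂) − Σn_i with X²c = S²/{2Σn_i(n_i − 1)} for the correlated binomial C(α) test, with p̂ = Σx_i/Σn_i; the function name is hypothetical. The one-degree-of-freedom chi-squared tail is obtained from the complementary error function.

```python
from math import erfc, sqrt

# Treatment-group data quoted in the text, as (x_i, n_i) pairs
data = [(0, 5), (2, 5), (1, 7), (0, 8), (2, 8),
        (3, 8), (0, 9), (4, 9), (1, 10), (6, 10)]

def goodness_of_fit_statistics(data):
    total_x = sum(x for x, _ in data)
    total_n = sum(n for _, n in data)
    p_hat = total_x / total_n
    pq = p_hat * (1 - p_hat)
    # Variance test statistic, referred to chi-squared with M - 1 d.f.
    x2v = sum((x - n * p_hat)**2 / (n * pq) for x, n in data)
    # C(alpha) statistic S and its standardized forms Z and X2c = Z^2
    s = sum((x - n * p_hat)**2 for x, n in data) / pq - total_n
    z = s / sqrt(2 * sum(n * (n - 1) for _, n in data))
    return x2v, z**2, z

x2v, x2c, z = goodness_of_fit_statistics(data)
p_x2c = erfc(sqrt(x2c / 2))    # upper tail of chi-squared with 1 d.f.
p_z = 0.5 * erfc(z / sqrt(2))  # one-sided standard normal tail for Z

print(round(x2v, 2), round(x2c, 2), round(p_x2c, 3), round(z, 3), round(p_z, 4))
```

Under these assumed formulas the values of X²v and X²c agree with the figures 19.03 and 6.63 quoted in the example.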

To study the behaviour of the tests under the null hypothesis, a Monte Carlo experiment was performed. Ten binomial proportions were randomly generated using the unequal sample sizes from the above example. For each pseudorandom sample of 10 proportions the C(α) statistics X²c and X²m and the variance test statistic X²v were calculated and compared to the 10%, 5% and 1% points of their asymptotic null distributions. The empirical significance levels based on 1500 replications are shown in Table 1 for underlying binomial probabilities of 0.10, 0.25 and 0.50. For the cases considered, the empirical significance levels for the correlated binomial C(α) statistic are significantly lower than the nominal level for the 5% and 10% critical values. The empirical significance levels for the 1% critical value show no consistent pattern.
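A Monte Carlo experiment of this kind can be sketched as follows for the correlated binomial statistic alone. This is not the original simulation: the random number generator, the seed, the handling of degenerate samples and the form X²c = S²/{2Σn_i(n_i − 1)} with S = Σ(x_i − n_i p̂)²/(p̂q̂) − Σn_i are all assumptions, and 3.841 is the usual 5% point of chi-squared with one degree of freedom.

```python
import random

SIZES = [5, 5, 7, 8, 8, 8, 9, 9, 10, 10]  # sample sizes from the example
CHI2_1DF_5PCT = 3.841                      # 5% point, chi-squared with 1 d.f.

def x2c_statistic(xs, ns):
    total_x, total_n = sum(xs), sum(ns)
    p_hat = total_x / total_n
    pq = p_hat * (1 - p_hat)
    if pq == 0:  # all successes or all failures: statistic undefined
        return None
    s = sum((x - n * p_hat)**2 for x, n in zip(xs, ns)) / pq - total_n
    return s**2 / (2 * sum(n * (n - 1) for n in ns))

def empirical_level(p, replications=1500, seed=0):
    """Fraction of simulated null samples whose X2c exceeds the 5% point."""
    rng = random.Random(seed)
    rejections = used = 0
    for _ in range(replications):
        xs = [sum(rng.random() < p for _ in range(n)) for n in SIZES]
        stat = x2c_statistic(xs, SIZES)
        if stat is None:
            continue  # skip degenerate samples
        used += 1
        rejections += stat > CHI2_1DF_5PCT
    return rejections / used

levels = {p: empirical_level(p) for p in (0.10, 0.25, 0.50)}
print(levels)
```

The resulting rejection rates depend on the seed and will not reproduce Table 1 digit for digit; they are meant only to show how an empirical significance level is estimated.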

Table 1. Empirical significance levels of the C(α) tests optimal against correlated binomial and multiplicative alternatives and of the variance test, based on 1500 replications for underlying binomial probabilities of 0.10, 0.25 and 0.50

Binomial probability   Nominal level   X²c     X²m     X²v
P = 0.10               0.01            0.007   0.010   0.003
P = 0.10               0.05            0.019   0.043   0.042
P = 0.10               0.10            0.048   0.100   0.082
P = 0.25               0.01            0.013   0.012   0.012
P = 0.25               0.05            0.035   0.037   0.042
P = 0.25               0.10            0.073   0.085   0.097
P = 0.50               0.01            0.009   0.009   0.007
P = 0.50               0.05            0.034   0.031   0.049
P = 0.50               0.10            0.077   0.075   0.108

Table 2. Efficiency of the variance test and the generalized binomial C(α) tests for correlated binomial and multiplicative alternatives

Test statistic   Correlated binomial   Multiplicative generalized binomial
X²v              0.95                  0.71
X²c              1.00                  0.82
X²m              0.79                  1.00

Table 2 compares the variance test and the generalized binomial C(α) tests for correlated binomial and multiplicative alternatives; it shows that the correlated binomial C(α) test is more efficient than the variance test for multiplicative alternatives as well as for correlated binomial alternatives.

REFERENCES

1. Alexander M. Mood, Franklin A. Graybill and Duane C. Boes, Introduction to the Theory of Statistics, third edition, McGraw-Hill Series in Probability and Statistics.

2. ... second edition, page 89, Duxbury Advanced Series.

3. Hogg, McKean and Craig, Introduction to Mathematical Statistics (2013), seventh edition, Pearson Education, Inc.

4. Paul, S. R., A three parameter generalization of binomial distribution, Windsor Mathematics Report, February 1984.

5. Robert V. Hogg, Elliot A. Tanis and Jagan Mohan Rao, Probability and Statistical Inference, seventh edition, Pearson Education.

6. Tarone, R. E. (1979), Testing the goodness of fit of the binomial distribution, Biometrika 66, 585-590.
