Author(s): P. Holgate
Source: Journal of Applied Probability, Vol. 3, No. 1, (Jun., 1966), pp. 115-128
Published by: Applied Probability Trust
Stable URL: http://www.jstor.org/stable/3212041
Accessed: 09/06/2008 07:54
J. Appl. Prob. 3, 115-128 (1966)
Printed in Israel
Summary
Some comparisons are made between various characteristics of the genetic
structures of populations of the same size and age, which have (i) evolved
from a small founder population, and (ii) evolved from a population which
has been of constant size throughout the period considered.
1. Introduction
Consider a gene having two alleles A and a. In the absence of selection, and if
there is no mutation, it is possible that one of these will become fixed in a popula-
tion as a result of the random fluctuations in its frequency from generation to
generation. This is generally known as the Sewall Wright effect, and its possible
contribution to evolution has been the subject of considerable discussion. A sum-
mary of Wright's own views and a survey of his mathematical work on the subject
are given in Wright (1964).
Mathematical work on the tendency to homozygosity of a finite population
has been surveyed by Moran (1962), and Kimura (1964). Roughly speaking, for
given initial gene frequencies, the probability that the populations will become
homozygous within a given number of generations may be appreciable for small
populations, but becomes negligible as the population size increases.
Suppose however that a small founder population colonises an area, and in
the course of several generations multiplies to a much greater size. Fixation may
occur at a given locus when the population is small, and the effect of population
growth will then lead to a homozygous population of such a size that fixation
as a result of random fluctuation would be inconceivable in a population which
had always had that same, large size. This phenomenon is known as the founder
principle, associated with Mayr (1942, 1963). Experimental work in relation to it
is reported by Dobzhansky and Pavlovsky (1957), and it has recently been invoked
in the controversy about the origins of geographical variations between snail
colonies in Southern England, (Goodhart 1962, 1963; Cain and Currey 1963a, b).
Despite its interest, little mathematical work has appeared on genetic fluctuations
(1) $f(s) = \left(\tfrac{1}{2} + \tfrac{1}{2}s\right)^2.$
Since the mean of the distribution given by (1) is 1, the ultimate elimination of
heterozygotes is certain. The p.g.f. of $Z_t$ is obtained by functional iteration, which
may be carried out explicitly for small t. For instance, the probability that $F_3$
will contain j heterozygotes is the coefficient of $s^j$ in the third iterate
$f(f(f(s)))$.
(5) $\Pr(Z_t = 0) \approx 1 - \dfrac{2}{t f''(1)} = 1 - \dfrac{4}{t}.$
In fact, the p.g.f. (1) and result (5) are mentioned by Kolmogorov as an example
at the end of Section 4 of his paper, where they arise in the solution of a different
problem in population genetics. It may be noted that t has to be quite large for (5)
to give good numerical results. For t = 40, when (5) gives 0.9, the correct value
obtained by iteration is 0.9142. However, the derivation given by Harris (1963, p. 20
ff.) suggests t/(4 + t) as a better approximation, and for t = 40 it gives 0.9091. For
comparison consider an $F_t$ which had arisen from an $F_0$ of $2^t$ heterozygotes,
evolving through t generations of selfing at constant population size, and let
$Z'_t$ be the number of heterozygotes in $F_t$ in this population. (It should be borne
in mind that the sequence $\{Z'_t\}$, although a stochastic process in the mathematical
sense, does not represent the evolution of a definite population.) By reasoning
similar to that given in the case t = 3, the distribution of $Z'_t$ is binomial with
$n = 2^t$, $p = (1/2)^t$, and hence
(6) $\operatorname{var}(Z'_t) = 1 - (1/2)^t,$
(7) $\Pr(Z'_t = 0) = \left(1 - (1/2)^t\right)^{2^t}.$
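The figures quoted for t = 40 can be checked numerically. The following is a minimal sketch, assuming the p.g.f. (1) is f(s) = ((1 + s)/2)^2 (two offspring per selfed heterozygote, each heterozygous with probability 1/2) and that the growing population starts from a single heterozygote. The function names are illustrative and do not come from the paper.

```python
def f(s):
    """P.g.f. of the number of heterozygous offspring of one selfed heterozygote.

    Assumption: two offspring per individual, each heterozygous with
    probability 1/2, so the count is Binomial(2, 1/2).
    """
    return ((1.0 + s) / 2.0) ** 2


def prob_no_heterozygotes(t):
    """Pr(Z_t = 0) for the growing population: t-th iterate of f at s = 0."""
    q = 0.0  # F0 is a single heterozygote, so Pr(Z_0 = 0) = 0
    for _ in range(t):
        q = f(q)
    return q


def prob_none_constant_size(t):
    """Pr(Z'_t = 0) for a constant population of 2**t selfing lines.

    Each line is still heterozygous after t generations with probability
    (1/2)**t, independently of the others.
    """
    return (1.0 - 0.5 ** t) ** (2 ** t)


if __name__ == "__main__":
    t = 40
    print("exact, by iteration :", round(prob_no_heterozygotes(t), 4))
    print("1 - 4/t             :", round(1 - 4 / t, 4))
    print("t / (4 + t)         :", round(t / (4 + t), 4))
    print("constant size       :", round(prob_none_constant_size(t), 4))
```

Iterating f directly is cheap, so the asymptotic forms are mainly useful for seeing how slowly the probability approaches 1.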
Comparison of the variances given in (4) and (6), and the probabilities that
heterozygotes will be absent, given in (5) and (7), together with the numerical results
of Table I, illustrates the fact that selfing populations which have reached a large
size by repeated doubling will be more variable in respect to proportions of
heterozygotes, and in particular will be more likely to contain none, than those
of the same size and age that have always been that size.
A mathematical study of the founder principle of evolutionary genetics
The stochastic process describing the number of individuals in $F_t$ which are
homozygous for a given allele, say A, is not a branching process, nor even a
Markov process. However, the probability $P_t$ that $F_t$ will consist entirely of AA's
has a simple recurrence relation. The probabilities that the offspring of a
heterozygote are (AA,AA), (AA,Aa) or (Aa,Aa) are 1/16, 1/4, 1/4 respectively. It is
then required that the offspring should consist entirely of AA's after (t - 1)
generations, which leads to
$P_t = \tfrac{1}{16} + \tfrac{1}{4} P_{t-1} + \tfrac{1}{4} P_{t-1}^2.$
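The recurrence for the probability that the population consists entirely of AA's is easy to iterate. The sketch below assumes the form P_t = 1/16 + P_{t-1}/4 + P_{t-1}^2/4 with P_0 = 0 (a single heterozygous founder), and compares the iterates with the smaller root of the corresponding fixed-point equation. The function names are hypothetical.

```python
import math


def prob_all_AA(t):
    """P_t: probability that F_t consists entirely of AA's, with P_0 = 0."""
    p = 0.0
    for _ in range(t):
        # Offspring pair of a heterozygote: (AA,AA) w.p. 1/16,
        # (AA,Aa) w.p. 1/4, (Aa,Aa) w.p. 1/4; every Aa line must then
        # itself end up entirely AA.
        p = 1 / 16 + p / 4 + p * p / 4
    return p


def limiting_prob_all_AA():
    """Smaller root of P = 1/16 + P/4 + P**2/4, i.e. of P**2 - 3P + 1/4 = 0."""
    return (3 - 2 * math.sqrt(2)) / 2


if __name__ == "__main__":
    for t in (1, 2, 3, 10):
        print(t, round(prob_all_AA(t), 6))
    print("limit", round(limiting_prob_all_AA(), 6))
```

The map has slope below 1/3 near the fixed point, so the iterates settle within a few generations.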
2fYt, j=1,2,-.,
t= I
is such that almost every realisation is monotonically increasing, the form of the
'remainder' term in the submartingale decomposition theorem (Loeve, 1963,
p. 389). Some properties of Y may be obtained by enumerating the possible
structures of F1, and equating the unconditional expectations of certain variables,
to the appropriate expectations obtained after conditioning by the outcome of F1.
The possible F1 's are listed in column 2 of Table II, preceded by their probabilities.
Column 3 gives the conditional expectation of Y, columns 4 and 5 give the
conditional variance of Y and the probability that Y = 0, where v and P denote the
unconditional values of these quantities. Column 6 gives the conditional
expectation of $e^{zY}$, where m(z) is the unconditional value.
TABLE II
Probabilities of possible F1's, and values of various characteristics of population
structure conditional on these, for the growing, selfing population

Probability   F1       E(Y)   var(Y)   Pr(Y = 0)   E(e^{zY})
1/16          AA,AA    1      0        0           e^z
1/4           AA,Aa    3/4    v/4      0           e^{z/2} m(z/2)
1/8           AA,aa    1/2    0        0           e^{z/2}
1/4           Aa,Aa    1/2    v/2      P^2         m(z/2)^2
1/4           Aa,aa    1/4    v/4      P           m(z/2)
1/16          aa,aa    0      0        1           1
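Equating the unconditional second moment of Y to its expectation over the rows
of Table II gives
(8) $v + \tfrac{1}{4} = \tfrac{1}{16} + \tfrac{1}{4}\left(\tfrac{9}{16} + \tfrac{1}{4}v\right) + \tfrac{1}{8}\cdot\tfrac{1}{4} + \tfrac{1}{4}\left(\tfrac{1}{4} + \tfrac{1}{2}v\right) + \tfrac{1}{4}\left(\tfrac{1}{16} + \tfrac{1}{4}v\right),$
which gives v = 1/12. For the probability of the elimination of A,
$P = \tfrac{1}{16} + \tfrac{1}{4}P + \tfrac{1}{4}P^2.$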
Since Y is bounded, all its moments exist, and expansion of the m.g.f. in (9)
provides a recurrence relation for the even order central moments. The first few are
$\mu_2 = \dfrac{1}{12}, \quad \mu_4 = \dfrac{11}{720}, \quad \mu_6 = \dfrac{121}{36288}, \quad \mu_8 = \dfrac{61663}{79315200},$
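of which the first value confirms the solution of (8). Since the probabilities of
the end points are known exactly, the moments of the distribution conditional
on fixation not occurring can be calculated from the formula
$\mu_k^{*} = \dfrac{\mu_k - 2P\,(1/2)^k}{1 - 2P} \quad (k \text{ even}),$
where P is the probability of elimination of A, equal by symmetry to the
probability of its fixation.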
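The even central moments quoted above can be reproduced with exact rational arithmetic. The sketch below does not expand the m.g.f. as the paper does; it assumes instead the equivalent conditioning on F1 (pairs AA,AA / AA,Aa / AA,aa / Aa,Aa / Aa,aa / aa,aa with probabilities 1/16, 1/4, 1/8, 1/4, 1/4, 1/16) and works with W = Y - 1/2, whose odd moments vanish by symmetry. The function name is illustrative.

```python
from fractions import Fraction
from math import comb


def even_central_moments(max_order):
    """Return {k: mu_k} for even k up to max_order, as exact fractions."""
    half, quarter = Fraction(1, 2), Fraction(1, 4)
    mu = {0: Fraction(1)}
    for k in range(2, max_order + 1, 2):
        # F1 = (AA,AA) or (aa,aa), probability 1/16 each: W = +1/2 or -1/2.
        rhs = Fraction(1, 8) * half ** k
        # F1 = (AA,Aa) or (Aa,aa), probability 1/4 each: W = +-1/4 + W'/2.
        # Only even powers of W' contribute; the j = k term goes to the left.
        rhs += half * sum(
            comb(k, j) * quarter ** (k - j) * half ** j * mu[j]
            for j in range(0, k, 2)
        )
        # F1 = (Aa,Aa), probability 1/4: W = (W' + W'')/2, W' and W'' independent.
        # The j = 0 and j = k terms go to the left.
        rhs += quarter * half ** k * sum(
            comb(k, j) * mu[j] * mu[k - j] for j in range(2, k, 2)
        )
        # F1 = (AA,aa) gives W = 0 and contributes nothing.
        # The mu_k terms moved to the left add up to 2**-k * mu_k.
        mu[k] = rhs / (1 - half ** k)
    return mu


if __name__ == "__main__":
    for order, value in even_central_moments(8).items():
        if order:
            print(f"mu_{order} = {value}")
```

Using `Fraction` keeps every step exact, so the output can be compared digit for digit with the printed fractions.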
3. Random mating
Notation used in this section is not everywhere comparable with that of the last.
Since a random mating population cannot be initiated by a single individual,
in order to make the situations in this section and the last as similar as possible,
I have assumed that the population begins with an F1 consisting of 2 animals,
each of whose 4 genes are A or a with probability 1/2. This means that if the
mathematical definition were extrapolated backwards, independently of its
interpretation, we could imagine an Fo consisting of a single heterozygote. As in
studies of random mating populations of constant size, only the number of genes
in each generation is considered, and they are supposed to be determined inde-
pendently, being A or a with probabilities equal to the proportions in the previous
generation. Moran (1962, pp. 12-20) has discussed the implications for the form
of the offspring distribution of the constant population model and analogous
considerations apply here. Let $Z_t$ be the number of A genes in $F_t$. The distribution
of $Z_t$ conditional on $Z_{t-1} = z_{t-1}$ is binomial with $n = 2^{t+1}$, $p = (1/2)^t z_{t-1}$.
The distribution may be computed explicitly for small t, and for t = 3 it is given
in the first column of Table III. For a comparable population of constant size,
suppose that Fo had consisted of 8 heterozygotes, which had then evolved through
3 generations of random mating. The distribution of Z3, the number of A's in
F3 is given by post-multiplying the 1 x 17 row vector with 1 in its ninth column
and zeros elsewhere, three times by the matrix $(a_{ij})$ with elements
$a_{ij} = \binom{16}{j}\left(\dfrac{i}{16}\right)^{j}\left(1 - \dfrac{i}{16}\right)^{16-j}, \quad i, j = 0, 1, \ldots, 16$
(Feller, 1951). The resulting distribution is given in the second column of Table III.
In this case the probability of elimination of A can be seen to be greater for the
growing population. The variances of the proportion of A genes are 0.0962 and
0.0440 for the growing and stationary populations respectively.
TABLE III
Distribution of number of A's after three generations of random mating
(The distribution is symmetric about 8.)

             Probability
 j    Growing       Stationary
      population    population
 0    .0996         .0068
 1    .0284         .0155
 2    .0423         .0290
 3    .0505         .0457
 4    .0562         .0639
 5    .0607         .0816
 6    .0638         .0964
 7    .0655         .1062
 8    .0661         .1096
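Both columns of Table III can be recomputed by propagating the distribution of the number of A genes one generation at a time. This is a minimal sketch, assuming binomial sampling of 2^(t+1) genes in generation t of the growing population (starting from one A among two genes) and of 16 genes per generation in the stationary population (starting from 8 A's). All function names are hypothetical.

```python
from math import comb


def binomial_row(n, p):
    """Binomial(n, p) probabilities for j = 0, ..., n."""
    return [comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]


def step(dist, n_next):
    """Advance one generation: dist is over 0..n_prev A genes, result over 0..n_next."""
    n_prev = len(dist) - 1
    new = [0.0] * (n_next + 1)
    for i, weight in enumerate(dist):
        if weight == 0.0:
            continue
        for j, prob in enumerate(binomial_row(n_next, i / n_prev)):
            new[j] += weight * prob
    return new


def growing_distribution(t):
    """Distribution of Z_t when the gene number doubles every generation."""
    dist = [0.0, 1.0, 0.0]  # notional F0: one heterozygote, 1 A among 2 genes
    for g in range(1, t + 1):
        dist = step(dist, 2 ** (g + 1))
    return dist


def stationary_distribution(t, genes=16):
    """Distribution after t generations at constant size, starting from genes/2 A's."""
    dist = [0.0] * (genes + 1)
    dist[genes // 2] = 1.0
    for _ in range(t):
        dist = step(dist, genes)
    return dist


def variance_of_proportion(dist):
    """Variance of the proportion of A genes under the given distribution."""
    n = len(dist) - 1
    mean = sum(w * j / n for j, w in enumerate(dist))
    return sum(w * (j / n - mean) ** 2 for j, w in enumerate(dist))


if __name__ == "__main__":
    grow, stat = growing_distribution(3), stationary_distribution(3)
    for j in range(9):
        print(j, round(grow[j], 4), round(stat[j], 4))
    print("variances:", round(variance_of_proportion(grow), 4),
          round(variance_of_proportion(stat), 4))
```

Repeated calls to `step` with a fixed gene number are equivalent to post-multiplying the starting row vector by the transition matrix.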
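$\mu_2'(t) = \sum_x x^2 P(x,t) = \sum_y P(y,t-1)\left[y^2 + \dfrac{y(1-y)}{2^{t+1}}\right] = \left(1 - \dfrac{1}{2^{t+1}}\right)\mu_2'(t-1) + \dfrac{1}{2^{t+2}},$
so that, since the mean remains 1/2,
(11) $v_2(t) = \left(1 - \dfrac{1}{2^{t+1}}\right)v_2(t-1) + \dfrac{1}{2^{t+3}}.$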
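Since the variances of all distributions on [0,1] are bounded above by 1/4, (11)
shows that the sequence tends monotonically, as $t \to \infty$, to a limit not exceeding 1/4.
On writing (11) in the form
$\tfrac{1}{4} - v_2(t) = \left(1 - \dfrac{1}{2^{t+1}}\right)\left(\tfrac{1}{4} - v_2(t-1)\right),$
an explicit solution can be obtained for any t,
$v_2(t) = \tfrac{1}{4} - \left(\tfrac{1}{4} - v_2(0)\right)\prod_{j=1}^{t}\left(1 - \dfrac{1}{2^{j+1}}\right).$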
With initial values $v_2(0) = v_4(0) = 0$, computation using (11) and (12) leads to the
limits
(13) $v_2 = 0.105606, \quad v_4 = 0.020182.$
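The limits in (13) can be reproduced by direct iteration. In the sketch below the second-moment recurrence is the one obtained from binomial sampling of n = 2^(t+1) genes. The fourth-moment recurrence is an assumption on my part: it is derived here from the standard central moments of the binomial distribution and may be arranged differently from the paper's (12). Function names are illustrative.

```python
def central_moment_limits(generations=60):
    """Iterate the 2nd and 4th central moments of the proportion of A genes.

    Starts from v2(0) = v4(0) = 0 (a single heterozygote, proportion exactly 1/2).
    """
    v2, v4 = 0.0, 0.0
    for t in range(1, generations + 1):
        n = 2.0 ** (t + 1)  # number of genes in generation t
        # Fourth moment: uses the previous generation's v2 and v4.
        new_v4 = (
            (1 - 1 / n) * (1 - 2 / n) * (1 - 3 / n) * v4
            + (3 / (2 * n) - 7 / (2 * n ** 2) + 2 / n ** 3) * v2
            + (3 * n - 2) / (16 * n ** 3)
        )
        # Second moment: v2(t) = (1 - 1/n) v2(t-1) + 1/(4n).
        v2 = (1 - 1 / n) * v2 + 1 / (4 * n)
        v4 = new_v4
    return v2, v4


def v2_closed_form(t):
    """Explicit solution 1/4 - (1/4) * prod_{j=1..t} (1 - 2**-(j+1))."""
    product = 1.0
    for j in range(1, t + 1):
        product *= 1 - 0.5 ** (j + 1)
    return 0.25 * (1 - product)


if __name__ == "__main__":
    v2, v4 = central_moment_limits()
    print("v2 limit:", round(v2, 6))
    print("v4 limit:", round(v4, 6))
    print("v2(3)   :", round(v2_closed_form(3), 4))
```

Sixty generations are far more than needed, since the increments shrink roughly by half each generation.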
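For every k,
$v_{2k}(t) = E\left(Y_t - \tfrac{1}{2}\right)^{2k} = \sum_q \Pr(Y_{t-1} = q)\, E\left[\left(Y_t - \tfrac{1}{2}\right)^{2k} \,\middle|\, Y_{t-1} = q\right] \ge v_{2k}(t-1),$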
Pr Pry( | - 1.5385997)
The probability that a given allele, say A, will be fixed is less than half this, i.e.
0.1511.
Let $\gamma_t$ be the probability that the population becomes homozygous at exactly
the tth generation conditional on it being heterozygous at the (t - 1)th, $\Gamma_t$ the
probability that it becomes homozygous at or before the tth generation, and
$\Gamma = \lim_{t \to \infty} \Gamma_t$ that it ultimately becomes homozygous. The chance that fixation
will occur precisely at time t can be shown to be greatest when $F_{t-1}$ contains
only one representative of one of the alleles, and least when they are present in
equal numbers. Hence
$2\left(\tfrac{1}{2}\right)^{2^{t+1}} \le \gamma_t \le \left(1 - \left(\tfrac{1}{2}\right)^{t}\right)^{2^{t+1}} + \left(\tfrac{1}{2}\right)^{t\,2^{t+1}}.$
k+
>=Y + -)
[1
+(l-
(1 )k+i
_- 1 )\ (
e-)
(14) ( 1 k+2
Y-k + (Y) [ ) 1 1 -
([2eI)}]
TABLE IV
Probabilities of the fixation of A after t generations of random mating, for the
growing population
k 1 2 3 4 5
1 y > 0.105598,
4. A fully stochastic model
A simple fully stochastic model is obtained by letting the number of offspring
of each individual be, not a constant, but a Poisson variable. Bartlett (1937,
Formula (7b)) obtained the variances of the numbers of heterozygotes in successive
generations (without conditioning on the population as a whole remaining
non-extinct). For the case where (i) the Poisson mean is unity and hence the mean
population size is constant, (ii) the number of generations is sufficiently large for
the probability that the heterozygous offspring of a given individual will not have
been eliminated to be given by the asymptotic formula derived e.g. by Harris
(1963, p. 18, Formula 9.5), and (iii) the founder population is so large that the
chance of complete extinction is negligible, Bartlett also showed how the
probability that the population should consist entirely of homozygotes could be
calculated.
In the remainder of this section I present some numerical values relating to the
results of three generations of mating, for a population initiated by a single
individual and having a mean rate of increase of 2, and for a population initiated
by 8 individuals and having a mean rate of increase of unity. If the number of
offspring of each individual is a Poisson variable with mean 2, and the offspring
of a heterozygote are AA, Aa, aa with probabilities 1/4, 1/2, 1/4, the trivariate
stochastic process giving the numbers of the three types in successive generations is
a multi-type Galton-Watson process (Harris, 1963, Ch. 2) with generating
functions
$f_2(s_1, s_2, s_3) = \exp\left\{2\left(\tfrac{1}{4}s_1 + \tfrac{1}{2}s_2 + \tfrac{1}{4}s_3 - 1\right)\right\}.$
Acknowledgements
I am grateful to Mr. J. G. Skellam for several helpful suggestions made in the
course of this work, and to Mr. K. Lakhani and Mr. D. Spalding for assistance
in computation.
TABLE V
Probability that F3 consists entirely of:

                                  (i) AA's    (ii) homozygotes
Growing population
  Deterministic reproduction      0.0838      0.4835
  Stochastic reproduction         0.1580      0.4555
Stationary population
  Deterministic reproduction      0.0013      0.3436
  Stochastic reproduction         0.0645      0.3222
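The four "deterministic reproduction" entries of Table V follow from the selfing model with exactly two offspring per individual (growing) or one offspring per individual (stationary). The sketch below recomputes them under those assumptions. It does not attempt the stochastic rows, whose exact conditioning is not recoverable from the text here. Function names are hypothetical.

```python
def growing_all_AA(t):
    """Growing population from one heterozygote: Pr(F_t is entirely AA)."""
    p = 0.0
    for _ in range(t):
        p = 1 / 16 + p / 4 + p * p / 4
    return p


def growing_all_homozygous(t):
    """Growing population: Pr(no heterozygotes in F_t), by iterating the p.g.f."""
    q = 0.0
    for _ in range(t):
        q = ((1 + q) / 2) ** 2
    return q


def stationary_all_AA(t):
    """Constant population of 2**t selfing lines, all heterozygous in F0.

    A line is AA after t generations with probability (1 - 2**-t) / 2.
    """
    return ((1 - 0.5 ** t) / 2) ** (2 ** t)


def stationary_all_homozygous(t):
    """Constant population: every one of the 2**t lines has lost heterozygosity."""
    return (1 - 0.5 ** t) ** (2 ** t)


if __name__ == "__main__":
    t = 3
    print("growing   :", round(growing_all_AA(t), 4),
          round(growing_all_homozygous(t), 4))
    print("stationary:", round(stationary_all_AA(t), 4),
          round(stationary_all_homozygous(t), 4))
```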
References
AITKEN, A. C. (1926) On Bernoulli's numerical solution of algebraic equations. Proc.
Roy. Soc. Edinburgh 46, 289-305.
BARTLETT, M. S. (1937) Deviations from expected frequency in the theory of inbreeding.
J. Genet. 35, 83-87.
BARTLETT, M. S. (1960) Stochastic Population Models in Biology and Epidemiology. Methuen,
London.
CAIN, A. J. AND CURREY, J. D. (1963a) Area effects in Cepaea. Philos. Trans. Roy. Soc. B
246, 1-81.
CAIN, A. J. AND CURREY, J. D. (1963b) The causes of area effects. Heredity 18, 466-471.
CRAMER, H. (1946) Mathematical Methods of Statistics. Princeton Univ. Press.
DOBZHANSKY, T. AND PAVLOVSKY, O. (1957) An experimental study of the interaction
between genetic drift and natural selection. Evolution 11, 311-319.
DOOB, J. L. (1953) Stochastic Processes. Wiley, New York.
FELLER, W. (1951) Diffusion processes in genetics. Proc. Second Berkeley Symposium on
Mathematical Statistics and Probability, 227-246.
GOODHART, C. B. (1962) Variation in a colony of the snail Cepaea Nemoralis (L.). J.
Anim. Ecol. 31, 207-237.
GOODHART, C. B. (1963) "Area effects" and non-adaptive variation between populations
of Cepaea (mollusca). Heredity 18, 459-465.
HARRIS, T. (1963) Branching Processes. Springer, Berlin.