Readings in Mathematical Psychology
Volume I

EDITED BY
R. Duncan Luce, University of Pennsylvania
Robert R. Bush, University of Pennsylvania
Eugene Galanter, University of Washington

John Wiley and Sons, Inc., New York and London
Copyright © 1963 by John Wiley & Sons, Inc.
All rights reserved. This book or any part thereof must not be reproduced in any form without the written permission of the publisher.
Library of Congress Catalog Card Number: 63-14066
Printed in the United States of America
Preface
"
The
first,
of
volumes
two
designed
are
Mathematical
as
Readings
The
Psychology.
references
that
articles
suggestions
and
limitations
our
to
appearing
authors
the
evaluations,
own
the
asked
were
Readings
took
we
which
this
three-volume
particularlyimportant
in
of
Psychology,
accompany
Handbook
considered
they
the
Mathematical
in
materials
source
in
to
their
fields;
considerable
from
Because
liberty in
of
journal
suggest
selected.
were
is the
Handbook
of
these
space
the
selection
and
learning.
process.
focuses
volume
This
Part
consists
and
Part
II
of
topics.
These
Volume
II of the
and
mathematical
after
the
Of
the
from
35
I of
and
of
papers
National
from
of Symbolic Logic,
of
the
papers
The
35
interest
to
U.S.
The
volume,
Journal
11
time,
statistical
the
Handbook.
chapters.
as
Decision
esses
Proc-
Theory (Stanford,
It
is
view
our
1959),
that
They
of
Transactions
of
are
every
listed
are
Psychometrika,
PacificJournal
the
10
are
3 from
the
of Mathematics,
Biophysics, the Proceedings of
the
Radio
the
the
of
Institute
Mathematical
is
from
of Experimental Psychology,
2 from
Gratitude
Force.
of
Handbook
his bookshelf.
on
Mathematical
Annals
the
8-10
and
Statistics, and
expressed
for
Engineers,
private
permissions
ment
docu-
reproduce
to
here.
represent
papers
note
statisticians,
published
Air
this
of
of Sciences,
and
other
Readings.
books
such
the
Bulletin
mathematical
Learning
present
reaction
Handbook.
in
3 from
and
publications, such
Society of America,
the
Academy
Journal
these
the Acoustical
each
one
the
reproduced
Psychological Review,
Journal
the
have
1-6
to
Mathematical
in
from
Volume
preface to
relevant
hard-cover
psychologist should
related
Chapters
papers
Studies
intentionallyexcluded
were
in
in
appeared
(Wiley, 1954)
or
referenced
psychophysics
psychophysics,
learning and
on
contains
Readings
have
psychology:
measurement,
papers
are
papers
that
Papers
21
of
areas
on
papers
of
consists
main
two
on
14
that
3
in 1947,
and
compilation
For
of
17
are
of
the
these
are
engineers,
the
a
handling
others
book
this
of
work
are
of
and
30
different
professional psychologists,8
and
are
rather
this
other
philosophers.
uniformly spread
sort
requires
details, the
It
contributors.
are
of
One
over
the
wish
to
1950-1962.
of
thank
spondence.
corre-
Miss
Katz.
R.
Luce
Duncan
Philadelphia,Pennsylvania
Robert
R.
March,
Eugene
Galanter
1963
of
was
papers
the years
surprising amount
editors
be
may
mathematicians
Bush
Ada
Contents
PART
MEASUREMENT,
An
TIME
Formulation
Axiomatic
Intervals
Successive
by
Ernest
Decision
Adams
and
and
Psychoacoustics
by
David
Some
M.
Generalization
Time
3
Messick
Relations
Duncan
Detection
Luce
Theory
and
by
M.
Correction
R.
Duncan
William
Random
by
67
Laws
Psychophysical
69
Luce
Multivariate
by
of
Theory''
Green
the Possible
On
41
Green
Comments
David
in
17
R.
and Detection
'''Psychoacoustics
by
of
Behavior
S. Christie
Lee
Samuel
and
Structure
and
Scaling
and
Simple Choice
by
AND
PSYCHOPHYSICS,
REACTION
InformationTransmission
84
J. McGill
Fluctuations
William
of Response
104
Rate
J. McGill
to Changes in the Intensity

Sensitivity
of White
and
Loudness
to Masking
Noise
and
Its Relation
by George
A.
Miller
The
Magical
Number
Some
Limits
by George
Remarks
I. The
Least
Frederick
on
Our
or
Minus
Two:
Capacityfor ProcessingInformation
135
Miller
the Method
on
Deviations
by
A.
Seven, Plus
119
Squares
and
of Paired
Solution
Assuming Equal Standard
Equal Correlations
Mosteller
Comparisons:
152
Theoretical
Some
Relationships
among
Measures
of
Conditioning
by Conrad G. Mueller
The
by
159
Theoryof SignalDetectability
W,
and
Peterson, T. G. Birdsall,
W.
167
W.
C. Fox
Aspectsof Theories of Measurement

Scott and Patrick Suppes
Foundational
by
Dana
for Choice-Reaction
Models
by Mervyn
PART
II
by T.
AND
about
Inference
Stochastic
by
STOCHASTIC
Anderson
W.
R. J.
and
Model
by
Robert
Model
R. Bush
Two-Choice
by
R. Bush
K.
Choice
Behavior
263
K.
for SimpleLearning
Frederick
Frederick
Thurlow
278
Mosteller
and
Discrimination
Fish
300
R. Wilson
TheoryofLearning
308
Estes
Theoryof SpontaneousRecoveryand Regression
Estes and
332
in Learning
Variability
C. J. Burke
Verbal
Situation
Conditioning
in Terms
of
Statistical LearningTheory
An
by
A
K. Estes and
J. H.
Investigation
of Some
Curt
F.
by
343
Straughan
Mathematical
Models
for Learning
353
Fey
Functional
Laveen
322
Estes
Analysis
of a
by W.
289
Mosteller
ofParadise
and
Theoryof Stimulus
W.
241
A. Goodman
Generalization
and
Statistical
K.
Statistical
by W.
and
Behavior
Robert
W.
Chains
for Individual
for Stimulus
Toward
PROCESSES
Markov
Leo
Model
R. Bush
by Robert
by
228
Audley
Mathematical
Time
Stone
LEARNING
Statistical
212
EquationAnalysis
of Two
Kanal
LearningModels
360
The
Distribution
Asymptotic
Samuel
Walks
Learning Theory
by John Lamperti and
Markov
A.
by George
Statistical
A.
by George
Ultimate
by
Miller
Patrick
in
Frank
Role
Suppes
429
Psychology
of the
Shannon-
Wiener
448
William
and
and
Miller
between
a
G.
Two
Madow
Verbal
William
470
Learning
J. McGill
Attractive
Goals:
498
Model
Mosteller
and
Maurice
Tatsuoka
515
Learning
Restle
of Observing Responses
Benjamin Wyckoff,
in
Discrimination
Learning,
524
404
Applicationto
Estimate
Theory of Discrimination
L.
Model
413
Descriptionof
from
Frederick
Part
Their
Likelihood
Choice
Predictions
The
Learning
of Information
Measure
by
Beta
Miller
the Maximum
381
Suppes
and
Processes
A.
by George
by
Patrick
Order
of Infinite
Finite
Models
Karlin
Lamperti and
John
Chains
Arising in Learning
Asymptotic Propertiesof Luce's
Some
On
Two-Absorbing-Barrier
Kanal
Random
Some
by
the
376
Laveen
by
for
Model
Beta
by
IX
Jr.
PART I
MEASUREMENT, PSYCHOPHYSICS, AND REACTION TIME

AN AXIOMATIC FORMULATION AND GENERALIZATION OF SUCCESSIVE INTERVALS SCALING*

Ernest Adams, University of California, Berkeley
AND
Samuel Messick, Educational Testing Service
A formal set of axioms is presented for the method of successive intervals, and directly testable consequences of the scaling assumptions are derived. Then by a systematic modification of basic axioms the scaling model is generalized to non-normal stimulus distributions of both specified and unspecified form.
Thurstone's scaling models of successive intervals [7, 21] and paired comparisons [17, 24] have been severely criticized because of their dependence upon an apparently untestable assumption of normality. This objection was recently summarized by Stevens [22], who insisted that the procedure of using the variability of a psychological measure to equalize scale units "smacks of a kind of magic - a rope trick for climbing the hierarchy of scales. The rope in this case is the assumption that in the sample of individuals tested the trait in question has a canonical distribution, (e.g., 'normal'). There are those who believe that the psychologists who make assumptions whose validity is beyond test are hoist with their own petard." Luce [13] has also viewed these models as part of an "extensive and unsightly literature which has been largely ignored by outsiders, who have correctly condemned the ad hoc nature of the assumptions."
Gulliksen [11], on the other hand, has explicitly discussed the testability of these models and has suggested alternative procedures for handling data which do not satisfy the checks. Empirical tests of the scaling theory have also been mentioned or implied in several other accounts of the methods [e.g., 8, 9, 12, 15, 21, 25]. Criteria of goodness of fit have been presented [8, 18], which, if met by the data, would indicate satisfactory scaling within an acceptable error. Random errors and sampling fluctuations, as well as systematic deviations from the scaling assumptions, are thereby evaluated by these over-all internal consistency checks. However, tests of the scaling assumptions, and in particular the normality hypothesis, have not yet been explicitly derived in terms of the necessary and sufficient conditions required to satisfy the model. Recently Rozeboom and Jones [20] and Mosteller [16] have investigated the sensitivity of successive intervals and paired comparisons, respectively, to a normality requirement, indicating that departures from normality in the data are not too disruptive of scale values with respect to goodness of fit, but direct empirical consequences of the assumptions of the model were not specified as such.

*This paper was written while the authors were attending the 1957 Social Science Research Council Summer Institute on Applications of Mathematics in Social Science. The research was supported in part by Stanford University under Contract NR 171-034 with Group Psychology Branch, Office of Naval Research, by Social Science Research Council, and by Educational Testing Service. The authors wish to thank Dr. Patrick Suppes for his interest and encouragement throughout the writing of the report and Dr. Harold Gulliksen for his helpful and instructive comments on the manuscript.
This article appeared in Psychometrika, 1958, 23, 355-368. Reprinted with permission.
The present axiomatic characterization of a well-established scaling model was attempted because of certain advantages which might accrue: (a) an ease of generalization that follows from a precise knowledge of formal properties by systematically modifying axioms, and (b) an ease in making comparisons between the properties of different models. The next section deals with the axioms for successive intervals and serves as the basis for the ensuing section, in which the model is generalized to non-normal stimulus distributions. One outcome of the following formalization which should again be highlighted is that the assumption of normality has directly verifiable consequences and should not be characterized as an untestable supposition.
Thurstone's Successive Intervals Scaling Model

The Experimental Method

In the method of successive intervals subjects are presented with a set of stimuli and asked to sort them into $k$ ordered categories with respect to some attribute. The proportion of times $f_{s,i}$ that a given stimulus $s$ is placed in category $i$ is determined from the responses. If it is assumed that a category actually represents a certain interval of stimulus values for a given subject, then the relative frequency with which a given stimulus is placed in a particular category should represent the probability that the subject estimates the stimulus value to lie within the interval corresponding to that category. This probability is in turn simply the area under the distribution curve inside the interval. So far scale values for the end points of the intervals are unknown, but if the observed probabilities for a given stimulus are taken to represent areas under a normal curve, then scale values may be obtained for both the category boundaries and the stimulus. Scale values for interval boundaries are determined by this model, and interval widths are not assumed equal, as in the method of equal appearing intervals.

Essentially equivalent procedures for obtaining successive intervals scale values have been presented by Saffir [21], Guilford [10], Mosier [15], Bishop [3], Attneave [2], Garner and Hake [9], Edwards [7], Burros [5], and Rimoldi [19]. The basic rationale of the method had been previously outlined by Thurstone in his absolute scaling of educational tests [23, 26]. Gulliksen [12], Diederich, Messick, and Tucker [6], and Bock [4] have described least square solutions for successive intervals, and Rozeboom and Jones [20] presented a derivation for scale values which utilized weights to minimize sampling errors. Most of these papers contain the notion that the assumption of normality can be checked by considering more than one stimulus. Although one distribution of relative frequencies can always be converted to a normal curve, it is by no means always possible to normalize simultaneously all of the stimulus distributions, allowing unequal means and variances, on the same base line. The specification of exact conditions under which this is possible will now be attempted. In all that follows, the problem of sampling fluctuations is largely ignored, and the model is presented for the errorless case.
The Formal Model

The set of stimuli, denoted $S$, has elements $r, s, u, v, \ldots$. There is no limit upon the number of admissible stimuli, although for the purpose of testing the model, $S$ must have at least two members. For each stimulus $s$ in $S$, and each category $i = 1, 2, \ldots, k$, the relative frequency $f_{s,i}$ with which stimulus $s$ is placed in category $i$ is given. Formally $f$ is a function from the Cartesian product of $S \times \{1, 2, \ldots, k\}$ to the real numbers. More specifically, it will be the case that for each $s$ in $S$, $f_s$ will be a probability distribution over the set $\{1, 2, \ldots, k\}$. For the sake of an explicit statement of the assumptions of the model, this fact will appear as an axiom, although it must be satisfied by virtue of the method of determining the values of $f_{s,i}$.

Axiom 1. $f$ is a function mapping $S \times \{1, \ldots, k\}$ into the real numbers such that for each $s$ in $S$, $f_s$ is a probability distribution over $\{1, \ldots, k\}$; i.e., for each $s$ in $S$ and $i = 1, \ldots, k$, $0 \le f_{s,i} \le 1$ and $\sum_{i=1}^{k} f_{s,i} = 1$.

The set $S$ and the function $f$ constitute the observables of the model. Two more concepts which are not directly observed remain to be introduced. The first of these is the set of numbers $t_1, \ldots, t_{k-1}$ which are the end points of the intervals corresponding to the categories. It will simply be assumed that these intervals are adjacent and that they cover the entire real line. Formally, it is assumed that $t_1, \ldots, t_{k-1}$ are an increasing series of real numbers.

Axiom 2. Interval boundaries $t_1, \ldots, t_{k-1}$ are real numbers, and for $i = 2, \ldots, (k-1)$, $t_{i-1} \le t_i$.

Finally, the distribution corresponding to each stimulus $s$ in $S$ is represented by a normal distribution function $N_s$.

Axiom 3. $N$ is a function mapping $S$ into normal distribution functions over the real line.

Axioms 1-3 do not state fully the mathematical properties required for the set $S$, the numbers $t_1, \ldots, t_{k-1}$ and the functions $N_s$. In the interests of completeness, these will be stated in the following Axiom 0, which for formal purposes should be referred to instead of Axioms 1-3.

Axiom 0. $S$ is a non-empty set. $k$ is a positive integer. $f$ is a function mapping $S \times \{1, \ldots, k\}$ into the closed interval $[0, 1]$, such that for each $s$ in $S$, $\sum_{i=1}^{k} f_{s,i} = 1$. For $i = 1, \ldots, (k-1)$, $t_i$ is a real number, and for $i = 1, \ldots, (k-2)$, $t_i \le t_{i+1}$. $N$ is a function mapping $S$ into the set of normal distribution functions over the real numbers.
Axioms 2 and 3 state only the set-theoretical character of the elements $t_i$ and $N_s$, and have no intuitive empirical content. The central hypothesis of the theory states the connection between the observed relative frequencies $f_{s,i}$ and the assumed underlying distributions $N_s$.

Axiom 4. (Fundamental hypothesis) For each $s$ in $S$ and $i = 1, \ldots, k$,

$$f_{s,i} = \int_{t_{i-1}}^{t_i} N_s(\alpha)\, d\alpha.$$

(Note that if $i = 1$, $t_{i-1}$ is set equal to $-\infty$, and if $i = k$, $t_i = \infty$.)

Axioms 1-4 state the formal assumptions of the theory although, because the fundamental hypothesis (Axiom 4) involves the unobservables $N_s$ and $t_i$, it is not directly testable in these terms. The question of testing the model will be discussed in the next section. Scale values for the stimuli have not yet been introduced. These are defined to be equal to the means of the distributions $N_s$, and hence are easily derived. The function $v$ will represent the scale values of the stimuli.

Definition 1. $v$ is the function mapping $S$ into the real numbers such that for each $s$ in $S$, $v_s$ is the mean of $N_s$; i.e.,

$$v_s = \int_{-\infty}^{\infty} \alpha N_s(\alpha)\, d\alpha.$$
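The fundamental hypothesis (Axiom 4) can be illustrated numerically. The following is a minimal sketch, not part of the original paper: it assumes hypothetical boundary values and a hypothetical stimulus mean and standard deviation, and computes the category frequencies $f_{s,i}$ as areas under a normal curve between successive boundaries. The function name `category_frequencies` is introduced here for illustration only.

```python
from statistics import NormalDist


def category_frequencies(mean, sd, boundaries):
    """Areas under a normal curve between successive boundaries.

    `boundaries` holds t_1 <= ... <= t_(k-1); the outer limits are taken
    as -infinity and +infinity, as in the note to Axiom 4.
    """
    dist = NormalDist(mean, sd)
    # Cumulative areas at -inf, t_1, ..., t_(k-1), +inf.
    cumulative = [0.0] + [dist.cdf(t) for t in boundaries] + [1.0]
    return [cumulative[i + 1] - cumulative[i] for i in range(len(cumulative) - 1)]


# Hypothetical values chosen only for illustration: k = 4 categories.
boundaries = [-1.0, 0.0, 1.5]
freqs = category_frequencies(mean=0.5, sd=2.0, boundaries=boundaries)
```

By construction the resulting frequencies lie in [0, 1] and sum to one, so they satisfy Axiom 1 as well.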
Testing the Model

The model will be said to fit exactly if all of the testable consequences of Axioms 1-4 are verified. Testable consequences of these axioms will be those consequences which are formulated solely in terms of the observable concepts $S$ and $f$, or of concepts which are definable in terms of $S$ and $f$. If no further assumptions are made about an independent determination of $t_1, \ldots, t_{k-1}$ and $N_s$, then the testable consequences are just those which follow about $f$ and $S$ from the assumption that there exist numbers $t_1, \ldots, t_{k-1}$ and functions $N_s$ which satisfy Axioms 1-4. In this model, it is possible to give an exhaustive description of the testable consequences; hence this theory is axiomatizable in the sense that it is possible to formulate observable conditions which are necessary and sufficient to insure the existence
(7), is that all $z_{s,i}$ be linear functions of each other in the following sense. For all $r, s$ in $S$, there exist real numbers $a_{r,s}$ and $b_{r,s}$ such that for each $i = 1, 2, \ldots, k-1$,

$$z_{s,i} = a_{r,s} z_{r,i} + b_{r,s}. \qquad (8)$$

The required numbers $a_{r,s}$ and $b_{r,s}$ exist if and only if for each $r$ and $s$, the ratio

$$\frac{z_{s,i} - z_{s,j}}{z_{r,i} - z_{r,j}} = a_{r,s} \qquad (9)$$

is independent of $i$ and $j$.

If constants $a_{r,s}$ and $b_{r,s}$ satisfying (8) exist, then they are related to the scale values $v_r$ and the standard deviations $\sigma_r$ in a simple way. For each $r, s$ in $S$,

$$a_{r,s} = \sigma_r / \sigma_s, \qquad (10)$$

and

$$b_{r,s} = (v_r - v_s)/\sigma_s. \qquad (11)$$
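The ratio condition (9) and the relations (10) and (11) can be checked on synthetic errorless data. The sketch below is illustrative only. It assumes, since the defining equation is not reproduced in this text, that $z_{s,i}$ is the standard normal deviate corresponding to the cumulative proportion of stimulus $s$ up to category $i$; all numerical values and function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def frequencies(mean, sd, boundaries):
    """Errorless category frequencies for a normal stimulus distribution."""
    dist = NormalDist(mean, sd)
    cum = [0.0] + [dist.cdf(t) for t in boundaries] + [1.0]
    return [cum[i + 1] - cum[i] for i in range(len(cum) - 1)]


def normal_deviates(freqs):
    """Assumed definition: z_i is the normal deviate of the cumulative
    proportion through category i, for i = 1, ..., k-1."""
    deviates, running = [], 0.0
    for f in freqs[:-1]:
        running += f
        deviates.append(STD_NORMAL.inv_cdf(running))
    return deviates


def difference_ratios(z_s, z_r):
    """All ratios (z_s,i - z_s,j) / (z_r,i - z_r,j) with i < j, as in (9)."""
    n = len(z_s)
    return [(z_s[i] - z_s[j]) / (z_r[i] - z_r[j])
            for i in range(n) for j in range(i + 1, n)]


# Hypothetical boundaries and stimulus parameters.
t = [-1.0, 0.0, 0.8, 2.0]
z_r = normal_deviates(frequencies(0.0, 1.0, t))   # v_r = 0.0, sigma_r = 1.0
z_s = normal_deviates(frequencies(0.5, 2.0, t))   # v_s = 0.5, sigma_s = 2.0

ratios = difference_ratios(z_s, z_r)
a_rs = ratios[0]                    # should equal sigma_r / sigma_s
b_rs = z_s[0] - a_rs * z_r[0]       # should equal (v_r - v_s) / sigma_s
```

With these values every ratio agrees with $\sigma_r/\sigma_s = 0.5$ and the intercept agrees with $(v_r - v_s)/\sigma_s = -0.25$, up to floating-point error.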
"
Clearly the arbitrary choice of the constants $v_r$ and $\sigma_r$ in (5), (6), and (7) represents the arbitrary choice of origin and unit in the scale. Since scale values of $t_i$ and $v_s$ are uniquely determined once $v_r$ and $\sigma_r$ are chosen, the scale values are unique up to a linear transformation; i.e., an interval scale of measurement has been determined. It should be noted that this model does not require equality of standard deviations (or what Thurstone has called discriminal dispersions [25]) but provides for their determination from the data by equation (6). This adds powerful flexibility in its possible applications.

It remains only to make a remark about the necessary and sufficient condition which a set of observed relative frequencies $f_{s,i}$ must fulfill in order to satisfy the model. This necessary and sufficient condition is simply that the numbers $z_{s,i}$, which are defined in terms of the observed relative frequencies, be linearly related as expressed in (8). This can be determined by seeing if the ratios computed from (9) are independent of $i$ and $j$, or by evaluating for all $s, r$ the linearity of the plots of $z_{s,i}$ against $z_{r,i}$. Hence for this model there is a simple decision procedure for determining whether or not a given set of errorless data fits.

If $z_{s,i}$ and $z_{r,i}$ are found to be linearly related for all $s, r$ in $S$, the assumptions of the scaling model are verified for that data. If the $z$'s are not linearly related, then assumptions have been violated. For example, the normal curve may not be an appropriate distribution function for the stimuli and some other function might yield a better fit [cf. 11, 12]. Or perhaps the responses cannot be summarized unidimensionally in terms of projections on the real line representing the attribute [11]. If the stimuli are actually distributed in a multidimensional space, then judgments of projections on one of the attributes may be differentially distorted by the presence of variations in other dimensions. This does not mean that stimuli varying in several dimensions may never be scaled satisfactorily by the method of successive intervals, but rather that if the model does not fit, such distortion effects might be operating. A multidimensional scaling model [14] might prove more appropriate in such cases.

In practice the points $(z_{r,i}, z_{s,i})$ for $i = 1, 2, \ldots, (k-1)$ will not exactly fit the straight line of (8) but will fluctuate about it. It remains to be decided whether this fluctuation represents systematic departure from the model or error variance. One approach is to fit the obtained set of points to a straight line by the method of least squares and then evaluate the size of the obtained minimum error [4, 6, 12]. In the absence of a statistical test for linearity, the decision is not precise, although the linearity of the plots may still be evaluated, even if only by eye. In any event, the test of the model is exact in the errorless case, and the incorporation of a suitable sampling theory would provide decision criteria for direct experimental applications.
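The least-squares approach mentioned for fallible data can be sketched as follows. This is an illustrative sketch only, not the procedure of any of the cited papers: it fits a straight line to hypothetical points $(z_{r,i}, z_{s,i})$ by ordinary least squares and reports the root-mean-square residual as a purely descriptive index of error. The data values and the function name `fit_line` are assumptions made for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept.

    Returns (slope, intercept, rms_residual). The rms residual is only a
    descriptive summary; it is not a statistical test of linearity.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    squared = [(yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y)]
    rms = (sum(squared) / n) ** 0.5
    return slope, intercept, rms


# Hypothetical deviates: an exact line z_s = 0.5 * z_r - 0.25 plus small
# perturbations standing in for sampling fluctuation.
z_r = [-1.0, 0.0, 0.8, 2.0]
perturbation = [0.03, -0.02, 0.01, -0.02]
z_s = [0.5 * x - 0.25 + e for x, e in zip(z_r, perturbation)]

slope, intercept, rms = fit_line(z_r, z_s)
```

Under the normal model the fitted slope and intercept estimate $a_{r,s}$ and $b_{r,s}$ of (8); whether a given residual size is acceptable remains a judgment in the absence of a sampling theory.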
Generalization of the Successive Intervals Model

The successive intervals model discussed in the previous section can be generalized in a number of ways. One generalization, treated in detail by Torgerson [27], considers each interval boundary $t_i$ to be the mean of a subjective distribution with positive variance. Another approach toward generalizing the model is to weaken the requirement of normal distributions of stimulus scale values. Formally, this generalization amounts to enlarging the class of admissible distribution functions. Instead of specifying exactly which distribution functions are allowed in the generalization, assume an arbitrary set $\psi$ of distributions over the real line, to which it is required that the stimulus distributions belong. In formalizing the model, $\psi$ is characterized simply as a set of distribution functions over the real line. Axiom 3 may be replaced by a new axiom specifying the nature of the class $\psi$ and stating that $C$ is a function mapping $S$ into elements of $\psi$; i.e., for each $s$ in $S$, $C_s$ (interpreted as the distribution of the stimulus $s$) is a member of $\psi$.

One final assumption about the class $\psi$ needs to be added: namely, if $\psi$ contains $C$ then it must also contain all linear transformations of $C$. A linear transformation of $C$ is defined as any other function which can be obtained from $C$ by a shift of origin and a scale transformation of the horizontal axis. A stretch along the horizontal axis must be compensated for by a contraction on the vertical axis in order that the transformed function have a probability density form. Algebraically, these transformations are defined as follows. Let $D$ and $D'$ be distribution functions, then $D'$ is a linear transformation of $D$ if there exists a positive real number $a$ and real number $b$ such that for all $x$,

$$D'(x) = aD(ax + b).$$

This is not truly a linear transformation because of multiplication by $a$ on the ordinate, but for lack of a better term this phrase is used. The reason for requiring that the class $\psi$ of distribution functions be closed under linear transformations is to insure that in any determination of stimulus scale values it will be possible to convert them by linear transformation into another admissible set of scale values; i.e., the stimulus values obtained are to form an interval scale. If the set $\psi$ is not closed under linear transformations, it will not in general be possible to alter the scale by an arbitrary linear transformation.

Axiom 3'. $\psi$ is a set of distribution functions over the real numbers, and $C$ is a function mapping $S$ into $\psi$. For all $D$ in $\psi$, if $a$ is a positive real number and $b$ is a real number, then the function $D'$ such that for all $x$,

$$D'(x) = aD(ax + b)$$

is a member of $\psi$.
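As a numerical illustration of the closure property required by Axiom 3', the sketch below applies the transformation $D'(x) = aD(ax + b)$ to the standard normal density and compares the result with the normal density having mean $-b/a$ and standard deviation $1/a$. This is an illustrative check under hypothetical values of $a$ and $b$; the function names are introduced here and are not from the paper.

```python
import math


def normal_density(x, mean=0.0, sd=1.0):
    """Density of a normal distribution with the given mean and s.d."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def linear_transform(density, a, b):
    """Return the function D'(x) = a * D(a * x + b), with a > 0."""
    if a <= 0:
        raise ValueError("a must be a positive real number")
    return lambda x: a * density(a * x + b)


# Hypothetical constants chosen for illustration.
a, b = 0.5, -0.25
transformed = linear_transform(normal_density, a, b)

# The transformed standard normal should again be normal, with
# mean -b/a and standard deviation 1/a.
grid = [i / 10.0 for i in range(-50, 51)]
max_gap = max(abs(transformed(x) - normal_density(x, mean=-b / a, sd=1.0 / a))
              for x in grid)
```

The two densities agree on the grid up to rounding error, which is the sense in which every normal density can be reached from a single one by such transformations.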
that the set of normal
It is to be observed
distributions has the
required
property of being closed under linear transformations. This set is in fact a

that aU normal distribution functions
minimal class of this type, in the sense
can
be
generatedfrom
the
points ti
4'. For each
Axiom
in *S and
before to be the
for each
1'.
s
in
and
f*
0"
1,
"
S,
is the function
v, is
the
problem
\p.Each
now
is to
"
k,
stimulus
.) The
mean
C,
are
defined
as
such
specifythe
distributions. If the
values
mapping S into the real numbers

of C,
i.e..
dx.
class of admissible distribution functions
of this class amounts

specification
stimulus
C,
C.ix) dx.
=/:xC,{^)
The
"
of the distribution functions
means
Definition
that
""
"
/",." the distribution functions
formations.
trans-
f,.i
(Here again ô
by linear
which specifies
generalization
obvious
an
the observed
between
interval end
singlenormal
4 is replacedby
Axiom
Finally,
the connection
and
to
theory about
the underlying
hypothesis of normality is altered
or
weakened,
what
of all distribution functions

is made
of C,
C.{x)
Non-normal
It is
be
to
of f,-can
scale values
1 "
Every
set of
be determined
only
members
transformations,has
the
the
when
restrictions
to
linear transformation.
fits. For
model
minimal
class of distribution functions.

1. There
exists
an
h, such
that
in
S,
k,
It will next
of
sense
that
assume
\p is
such that for all

and
exists
h).
scale values
3' and
positivereal number
obtained
are
Assumption
a, and
imply
real number
for all x,
CXx)
the function
(13)
on
the
/.,."
=
is the cumulative
distributions p,
(14)
follows. Axiom
aMa.x
,"
are
defined
p,,i
as
1,
aMa.x
distribution
(12) is
functions
the
=
5J,
+
of
rightside
form
linearlyrelated to all
specified
Axiom
4',then, for each s in S, and 2
present
1 is satisfied the
proceed as
there
(12)
where
positivereal number
aD{ax +
Assumption
interval scale,we
that for all
"
that for all x,
if
that
on
distribution function D
D'{x)
show
the
D' in \}/there exists
To
"
are
values
h such
"
if the scale values

i/-
some
stimulus
real number
1,
by linear
generatedfrom a singlemember
desired property of generatinga linear scale of
class all of whose
Assumption
class of distribution functions,in the
minimal
any
z,
Form
oj Specified
clearlynecessary to make
determined
uniquely up
that
"
a;
othenvise.
Distributions
shown
If
weak.
then the theory is very
on
no
construct
to
necessary
If
always possibleto determine distribution functions

this it is only
for arbitrarily
ti To show
specified
with the followingdefinition.
them
in accordance
4'
Axiom
C, satisfying
be
amount
ordinal scale. It is
an
are
would
real numbers.
over
theory, and the
will fit the
data
on
the forms
about
assumption about
to letting
\^ be the set
assumption whatever
replaceit? Omitting any
assumptions can
of the distribution functions
the form
"
"
"
D'
Tr(aJ,+
5,
some
According
to
6,)dx.
correspondingto D,
a,D(a,x +
in ^.
of
,k,
before,then
fixed function
b,)dx
and
the
cumulative
Assuming
that
the function
of function
form
z,
such that for each
,,"
is
is
Ps.i
(16)
z,,i
that
to assume
based
on
Vs
tt
generalbe
in
ti ,
in S and
and
a,
1,
is
"
Trfe.i).
"
k. It is clear from
"
"
be
not
in the normal
If it were
increasing.
Zs,i determined
not, there would
by (15);hence the scale values
unique. It is also seen that (4),relatingZs,i to

distribution model, is simply a particular
of
case
between
connection
0-,
(15)why it is necessary
unique
h,
aji +
monotone
strictly
z,,i would
(16)here. The
"
(15) imply immediately that
Equations (14) and
for all
"
(15)
not
strictlymonotone
increasing,
then,knowing
possibleto determine uniquely the numbers
in aS and i
1,
k,
x
Z), it
the
a^
1/a,
b, and
Vs
and
a-,
"hja,
is
v,
(15),as in the correspondingset of equations obtained from the

the left are known, and the numbers
on
normalityassumption,the numbers
As before,if two numbers
the rightare unknown.
arbitrarily
on
a^ and 6^ are
determined for a fixed stimulus r, then the ti are uniquely determined by the
followingequation.
In
(17)
U
scale values
The
the
from
for the
coefficients Zs,i
the
mean
of C^
is determined
,
(18)
Vs
Both
the
tts and
is the
cannot
h^ without
and
a^
I,
stimuli,however,
of the basic distribution D. If

as
{zr,i ".)M
mean
of
,k.
"""
then v^
D,
v^
which
,
was
mean
defined
by
(m
6,)/a,
"
(17)can be determined in terms

in
is immediately determinable
the 6, in
(19) and (20);hence

by (18).
quantities
directlydetermined
be
first specifyingthe
of z,,i a^ and 6, ,
of just these
terms
,
''"'""'''"'
(19)
a,
a.
^"\zr,"
(^'''
h).
~
(20)
hs=Zs.i-
It is clear then
that
transformation.
that
set of data
the
related.
rightin
ti and
y, are
determined
up
to
sufficient conditions
and
Furthermore, necessary
of differences in
the
that
ratios
model
are
simply
that the z's be linearly
(19)be independentof i and j;i.e.,
linear
g's on
the scale values
fit the
ERNEST
The
final generalizationto
holds, but where

it is assumed
i.e.,
the form
that the
that the
class,but
class
obtain
than
more
assumed
that
it is
still possibleto
Since
will
D
is
tt
obtain
never
This
zero.
if and
only
in S and
\i
i,j
(21)
1,
"
ps.i
Therefore
from
"
"
about
all x,
minimal
one
family
"
through
Now,
by solving (15),but
it
increasing,
numbers
{a,ti+
the
case
is
6 J,
however,
increasing;
explicitin
is made
D{x)
up
is also unknown.
the
D.
and
values. If it is
strictlymonotone
increasing in
assumption
it
Assumption
2.
0.
it follows that
then
increasingj
(14) holds, then
it will be
the
wix)
that
case
"
ir{y)
for all r,
k,
"
only if aJi +
if and
Pr,j
the numbers
ordering on
an
scale
z^,,-
distribution it is monotone
y. If
"
tt
numbers
is
information
if TT is strictly
monotone
Now,
the function
case
function
some
2. For
Assumption
the
about
the model
test
is unknown, all of the deductions
in this
only be strictlymonotone
is
Assumption
any
distributions all belong to

but D
the
cumulative
information
impossible to discover the
postulatedthat
in which
one
it is still possibleto
case
ordinal
generated by a
D,
(14) go through, although
of course,
generated by
be
stimulus
the
is
considered
be
generatingfunction D is not specified;

underlyingdistributions all belong to one minimal
can
function
if it is
13
MESSICK
of the
in this
Interestinglyenough,
to
SAMUEL
of the Distributions Unspecified
Forms
A
AND
ADAMS
63
p^.i
"
Qrt, +
one
can
6,
obtain
system of
h^ and i, If it is further specified

involving the constants
inequalities
a^
(as is requiredfor the conditions of the problem) that a, " 0 for all S, then
this set of inequalities
solution.
will not in general have
a
,
However,
The
a,
ti and
this set of
taken
to be
whether
and
necessary
not
or
set of data
fits the model
the data. This
stillbe determined.
for fitis that there exist numbers
sufficient condition
a,
"
to
construct
is done
in the
function
A
6J
which
can
represent
differentiable monotone
following way.
is constructed
7r(.T)
by connecting the
T(aJi +
discrete set of
creasing
in-
points
ps,i
increasingcurve.
smooth, strictlymonotone
of stimuli,then such a curve
can
only a finite number
D
defined
is
the
function
distribution
by
Finally,
(22)
may
the system of inequalities

(21).If
0) satisfying
be
inequalitieshas a solution,then the interval boundaries
may
the ti satisfying(21).To determine
the scale values of the stimuli
h, (where
it is first necessary
with
any
D{x)=-~Tix).
If,as is usual, there is

always be constructed.
14
READINGS
Then, if the
are
determined
the
V, is
of the
mean
determined
arbitrary
h and
and
can
on
in the
constant
PSYCHOLOGY
distribution D
by (18),v,
concerned,it
MATHEMATICAL
IN
(m
be
seen
the
mean
is m,
the values
of the
v,
stimuli
6J/a, As far as the determination of

that they depend solelyon the previously
"
which
m,
determination
be regarded
can
of the
v,
as
an
additional
remaining point of discussion for this model is the determination

of the degree of uniqueness of the scale values. Finding the set of all possible
solutions to the inequalities
(21) presents, in general,extreme
difficulty.
One thing that can be simply determined is the class of what might be called
the universal transformations of the solutions of the system of inequalities.
A universal transformation is one which, applied to a solution of any set of
inequalities, yields another solution to the same set of inequalities.
By noting a close connection between the theory of the inequalities (21) and a
two-dimensional affine geometry with a distinguished set of horizontal and
vertical lines, it can be shown [1] that the class of universal transformations
for this model is a subset of the affine transformations.
The
of the interval boundaries
by
and
hence
arbitraryconstant
is also
so
universal transformations
the linear ones, and of the

The
6, also are determined
up
tf are
positiveconstant.
The
the
are
also enters
scale values
into their
a,
to
are
a
tions
multiplicalinear transformation,
(although the additional

determination).
y,
interesting special case in which, even though there is only a finite number of
observations, the scale values of the t_i are determined up to a linear
transformation. This might be called the special case of equal intervals, in
which differences in successive t_i are all the same.
If, for example,
there exist stimuli with such relations among
correspondingp's as p^,,
etc., it is possible to determine
Vx,i+i
Pv,i+2
Py,i
Pz.i+i
Vz.i-^2
Vy.i+i
that successive intervals are
equal [1].
The fact that scale values obtained in this model, at least under certain
circumstances, are unique up to a linear transformation has two interesting
consequences for the original successive intervals model based on the normality
hypothesis. (i) If in the errorless case the original model fits, then no other
successive intervals model which assumes a different form for the distribution
functions will fit. The reason for this is that the forms of the distribution
functions (or the cumulative distributions) are determined by the values of
p_{s,i} lying above the point t_i. Hence, if the t_i are determined up to linear
transformation, so are the curves p_s. (ii) Where the normality assumption does
not fit the data it is theoretically possible to use the present generalization
to obtain a scale. Then the deviation of the scale values from those obtained
under the normality requirement can be evaluated. This, at least in principle,
provides a second kind of goodness of fit besides the usual least squares
regression methods employed where the data do not exactly fit the Thurstone
model.
[24]
Thurstone,
L.
[25]
Thurstone,
L.
L.
[26]
Thurstone,
L.
L.
The
1927,
[27]
Manuscript
Psychophysical
of
law
Amer.
analysis.
comparative
of
unit
judgment.
in
measurement
J.
Psychol,
1927,
Psychol.
educational
38,
Rev.,
1927,
scales.
J.
368-389.
34,
educ.
424-432.
Psychol.,
505-524.
Torgerson,
behavior.
Revised
18,
L.
W.
New
received
manuscript
S.
York:
law
New
of
York
2/12/58
received
4/21/58
categorical
Univ.
judgment.
Press,
1954.
In
L.
S.
Clark
(Ed.),
Consumer
STRUCTURE
DECISION
AND
CHOICE
SIMPLE
S. Christie^
Lee
Group
The
It
Laplace
transform
attempts
to
problems
the
of
fit
further
in
considered
In
this
beings organize
the
I. Introduction.
into
such
that
describe
of
Control
Systems
of
carried
was
Laboratory,
that
laboratory
should
the
for
empirical
for
University
of
April
issued
the
authors
the
it
were
Illinois.
1954,
tions
situathesis
our
be
may
studies,
model
sible
pos-
is
must
we
not
firmly
consultants
It is
supported
the
in
distribution.
reaction-time
the
from
when
that
man
hu-
way
reflected
be
must
and, therefore,
from
It is
decisions.
speculative,
out
model
propose
decisions
derives
as
analysis
the
which
required by simple choice
organization
proposal
work
R-53
of
decision
made.
are
component
times
thinking
our
this
This
of
reaction
the
infer
Although
we
organization
an
distribution
to
paper
of
data
plex
com-
csises
evaluating
of
treating
features
provide
to
model,
decisions
collection
essential
the
on
work
is
proposals,
the
that
to
special
problem
shown
earlier
leads
Two
the
of
use
negated
model
solved.
statistical
It
leaves
which
model
decisions.
the
by
general
not
the
discussed.
is
experimental
Two
but
and
out,
time-discrete
as
unchanged.
be
model
the
processing
worked
are
The
formulated
are
elementary
difficulties
overcome
of
terms
treated
be
can
reactions.
choice
which
model
the
of
as
so
analyze
in
hypothetical
data
to
considered
is
from
decisions
reaction-time
that
Technology
of
decisions
simple
such
composes
argued
is
Electronics,
of
Institute
of
Luce^
Duncan
Laboratory,
Laboratory
structure
which
R.
and
Massachusetts
IN
BEHAVIOR*
Networks
Research
RELATIONS
TIME
under
the
to
of
revision
port
re-
contract
DA-36-039-SC-56695.
¹Present address: Operations Research Office, The Johns Hopkins University, Chevy Chase, Maryland.
²Present address: Center for Advanced Study in the Behavioral Sciences, Stanford, California.
This article appeared in Bull. math. Biophysics, 1956, 18, 89-112. Reprinted with permission.
leased
has
such
on
led
to
us
to
determine
to
decide
experiments which
two
suggest
These
merit it has.
what
it is desirable
model
modify the
However, the development of the model
studies.
whether
to
better with
details
particular
little hope that the
further work
pursue
help
help
may
will also
experiments
to
to accord
believe
we
in
for
reality,
of the present model
attempt
an
have
we
have
any
lasting value.
U.
a
fixed
at
type
time
known
is
that
clear
the
as
mean
reaction
unwanted
stimuli
and
response,
of
and
that the
no
the stimulus
If the
time.
and
choice
type of
and
the
subject is
response
presented
of response
In either
time.
than
more
interaction
time
fixed
of
contingent
it is
case,
readily analyzable time distributions
stimulus
be simple enough so that
a
between
intervene
may
stimulus
required the corresponding time interval
stable
time is
t with
time
at
disjunctive reaction
the
the
of stimuli
is
it is necessary
the
responds
set
obtain
to
subject receives
simple reaction
stimulus
the
on
of
one
interval, t, between
the
is called
with
0 and
time
The
response.
that
Suppose
Times.
Reaction
will
which
distribution
stimulus
will
stimuli
be
Otherwise
two.
or
the test
the
among
second
and
a
cause
the
distortion
difficult to
very
analyze.
The study of reaction times, including disjunctive reaction times, has a
long history in the literature of psychology (cf. Woodworth, 1938, chap. xiv). In

has
this loss
a
been
recent
evident
of interest
to
in reaction-time
related
two
this separation involved
to make
to
stimulus
this
time
with
the
when
from
If the
reason.
motor
"
from
times
We
use
stimulus
of
the
motor
same
technique has
his
lags involved
been
no
time
reaction
One
to
was
the
subject's response
to
decision
was
subtracting
and
made
be
attempt
the
stimulus
same
involved.
This
unsatisfactory for the following

he is able to bring
to make
decision
considered
no
for the
time
presentation
distinguished
required to respond
subject has
readiness
First, there has been

decision (decision latency*)
measuring
when
attribute
may
in the total process.
decision
action
We
studies.
causes.
failure to separate the time to make
from the other time
however, relatively little interest
years,
when
to
specified
referring
motor
parts of such
response
to
response;
a
process.
the
time
to
of
latency when
much
process
higher
timed
referring
to
he
than
pitch
determined,
time
if at
make
distributions
if these
(base times) from
one
been
obtained
would
We
these
latencies
human
based
being
At
the
our
its
of
the
be
when
the
decision
that
in
that
of
elementary
latencies
in itself.
end
an
for it in
primitive
more
next
section
with
familiar
to
we
Let
Laplace Transform.
t such
that
F{t)
L{F) of the real variable
that
that
be
the
of information
Laplace transform
about
in
certain
mathematical
and
to
of
one
every
we
have
of
list of those
shall need.
be
real-valued
for ^
defined
of F.
making
concerned.
are
we
it
or
way
"
0.
by
The
function
real-valued
this
There
of
function
equation
the
(1)
$L(F) = \int_0^\infty e^{-st} F(t)\,dt$
the
entirely
an
the
Laplace transform,
the
decomposition
from
unlikely that
it is
decompose
employed usefully in
be
may
its definition
elementary properties which
may
decision
model
the idea
Since
times.
to
describe
to
observed
finally
a
non-choice
empirical correlate
no
purports
such
unreasonable
Such
distribution
with
not
adapted,
or
primitive terms.
Laplace transform
be
is called
If,
of
were
the
out
tease
to
used,
proposal is
our
will
variable
be
more
It is with
readers
III. The
real
the
reaction
the
will
It is true
non-choice
of
might
used
which
model
study of
devoted
to
time
pure
class
to account
process
composes
heart
technique of
the
on
elementary ones.
more
the
what?
some
reaction-time
observed
mathematical
formal
be
taken
for it is
difficulties,
also
into
latencies
of the
may
base
latency distribution
method
(base times) can
decision
the
equated
extremely simple,
be
to
separation
related
as
that the
suppose
latencies
the
situation
remain.
describe
to
that
then
"
approximated by
complex character, the challenge

terms
be
cannot
"
another
or
way
found
decision
choice
measurements
resulting decision
the
however,
itself
from
in
to react
disjunctive reaction;
decision.
functions, the
mathematical
make
to
conclude
may
were
well
be
they could
We
all, only
has
latency distribution
time
decision
that in
suppose
required
the
"
is
the
time.
subject is required to
Second,
he
for
simple reaction
any
be
base
time
the
excluding
when
can
the
thus,
is
essentially no
transformation
loss
[see equation
(4)],but
there
because
is
of
We
for
they
which
we
known
well
are
shall
L(F)
with
F and
shall
working with transformed
elementary properties
to
of the
few
need
later; no proofs will

(cf. Churchill, 1944).
be
given
(2)
L(G), then
the
sL(F)
property
(3)
N is
where
N,
f^ N(t)dt
continuous, the
are
F(0)
0 for all r"
A' is continuous
(4)
some
0. If it is known
and
so
0,i.e.,
G.
iv.
If
and
constants,
are
L(aF
If
V.
F(i)
\e~
bG)
where
The
aL(F)
A is
bL(G)
(5)
constant, then
-^"
UF)
IV.
f^y
iii. If
List
transform
ii.
that
advantage
the
$L\left[\int_0^t F_1(\tau)\, F_2(t - \tau)\, d\tau\right] = L(F_1)\, L(F_2).$
l\J
1.
function
special properties of
distinct
functions.
of the
some
sometimes
of the transform
(6)
=
.
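The convolution property in the list above can be checked numerically. The following is a minimal sketch, not part of the original paper: the constant `LAM`, the integration limits, and the helper `laplace` are all assumptions chosen for illustration. It uses the standard closed forms for an exponential function and for its convolution with itself.

```python
import math

def laplace(f, s, upper=40.0, steps=40000):
    """Trapezoidal estimate of the integral of exp(-s*t) * f(t) over [0, upper]."""
    h = upper / steps
    total = 0.5 * (f(0.0) + math.exp(-s * upper) * f(upper))
    for k in range(1, steps):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

LAM = 2.0  # hypothetical constant in F(t) = LAM * exp(-LAM * t)

def expo(t):
    return LAM * math.exp(-LAM * t)

def expo_conv_expo(t):
    # Closed form of the convolution of expo with itself.
    return LAM * LAM * t * math.exp(-LAM * t)

s = 1.5
lhs = laplace(expo_conv_expo, s)   # transform of the convolution
rhs = laplace(expo, s) ** 2        # product of the two transforms
closed = LAM / (s + LAM)           # closed-form transform of the exponential
print(lhs, rhs, closed ** 2)
```

The three printed numbers should agree to several decimal places, which is the sense in which convolution in the time variable becomes multiplication in the transform variable.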
IV. The Model. Our proposal is based on assumptions which are intuitively
acceptable, but which at the moment do not appear to be susceptible of direct
verification. It is our impression that any empirical verification of the model
must deal with the full set of assumptions rather than with each in isolation.
Assumption I. It is possible, for a given experimental situation,
to divide the observed reaction time t into two latency components
t_b and t_d, called base time and choice time respectively, such that:
1. t = t_b + t_d.
2. The value of t_b depends only on the mode of stimulus presentation and on
the motor actions required of the subject. Specifically, it is not directly
dependent on the character of the choice demanded.
3. The value of t_d depends only on the choice demanded. Specifically, it is
not directly dependent on the mode of stimulus presentation or the motor
actions required.
Let the distributions of t, t_b, and t_d be denoted by f, f_b, and f_d
respectively. Since conditions 2 and 3 imply that the two component latencies
are independent for fixed experimental situation, it follows from condition 1
that
$$f(t) = \int_0^t f_b(\tau)\, f_d(t - \tau)\, d\tau. \qquad (7)$$
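Assumption I says that the observed latency is the sum of two independent components, so its distribution is the convolution of the component distributions. The sketch below illustrates this on a grid. Both component densities (a uniform base time and an exponential choice time) and the grid constants `H` and `N` are hypothetical choices made only for the illustration.

```python
import math

H = 0.01      # grid spacing (hypothetical)
N = 1500      # number of grid cells, covering [0, 15)

def midpoint(k):
    return (k + 0.5) * H

# Hypothetical component densities sampled at cell midpoints.
f_base = [5.0 if 0.2 < midpoint(k) < 0.4 else 0.0 for k in range(N)]  # uniform on (0.2, 0.4)
f_choice = [2.0 * math.exp(-2.0 * midpoint(k)) for k in range(N)]     # exponential, mean 0.5

def convolve(a, b, h):
    """Discrete version of the convolution integral."""
    out = []
    for i in range(len(a)):
        acc = 0.0
        for j in range(i + 1):
            acc += a[j] * b[i - j]
        out.append(acc * h)
    return out

f_obs = convolve(f_base, f_choice, H)

# Entry i of f_obs sits at time (i + 1) * H, the sum of two cell midpoints.
mass = sum(f_obs) * H
mean = sum((i + 1) * H * v for i, v in enumerate(f_obs)) * H
print(mass, mean)
```

The total mass stays close to 1 and the mean of the convolved density is close to 0.3 + 0.5, the sum of the two component means, as independence requires.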
Our
and
second
major assumption
requires the distribution

The
distributions.
by
in
such
exists
structure
the
of
structure
which
be
may
which
set
of
simpler decisions
of
decision
to that
organization of these
complex
is
one
of the nature
of
the
In
addition, the
and
restricted
shall
we
breakdown
to
of
decision
of the
machine,
both
analogous
of
set
by
decisions
engineer. The
the
This
at least
is true
of human
complex decision
more
machine
or
man
it is true
where
form
to
individual
the
serial process
from
being made.
If
machine, i.e., the elementary
suppose
of
are,
computing machine,
machine
the
which
it is
elementary decisions
function
in
made
built into him.
making,
composed
as
elementary
the final decision
elementary decisions
process
latencies
more
that
capabilities built into
actual
from
composed
is
decision
thought
choice
idea
in human
be
to
elementary relative
are
decision
and
into
organized
appropriate sense,
some
to
is
person
basic
only the
concerns
beings.
is not, in general,
elementary decision
one
portions may be
simultaneously employed on different parts of the problem. There
seems every reason to suppose this is also true in a human being.
by another, for in
is followed
We
shall
describe
graph, (The
in the
used
Lines
of
of
finite set
between
It is
line to connect
in
more
that
there
In this
may
paper,
that
shall
the
directed
Lines
when
we
Line.
that
in
of
with
use
these
between
in the
as
in
term
for
nodes^
we
and
directions
as
directed
pair of
nodes
employed
directed
in
directed
one
have
may
in
in
is
consists
shown
are
graph, we
allowed,
there
graph
with
than
Figure 2a,
possibilities is
any
directed
that
been
flow diagram
term
more
directed
also
examples
sense
opposite
the
the
Several
by
have
called
are
general,
direction
be
decisions
coding.) A
points which
points, both
same
neither
suppose
computer
possible,
two
and
literature
pairs of them.
some
different
graph and network
mathematical
with
machine
organization of
oriented
terms
in connection
Figure 1.
or
the
the
two
sense
Figure 2b.
shall
that
is at
suppose
is,
most
we
one
Figure
We shall employ a directed graph to represent the organization of decisions
in the following way: At each node we shall assume that an "elementary
decision" will take place, the latency distribution governing the decision at
node i being denoted by f_i. The decision process is initiated at node i when,
and only when, decisions have been made at each of those nodes j such that
there is a directed line from j to i. We may think of the "demon" at node i
waiting to begin making his decision until he has received the decisions of
all the demons who precede him in the directed graph.

For the directed graphs we shall consider, there will be at least one node,
possibly more, which is the terminal point of no directed line; these will be
the decision points which are activated by the experimental stimulus at time 0.
There will also be at least one node, again possibly more, which initiates no
directed line.
Even
if
we
for certain
able
were
wide
into
process
hope is
relatively small
But
practically identical.
II.3
formal
that
the
the
he
others
and
We should
want
When
is
as
For
the set
with
is
required to signal
is
but
of the
is
not
colored
analysis
an
that Assumption
elementary
one
the
if
part of the
situations
the
will
naturally
we
subject is
differently from
location
the
set
of
that
one.
situations
of these
integers. We should
smaller
the
over
applied
be
put in the
find
to
the model
set
same
other
less
intuitively
held.
to
following sections
to the
fM)
form of
shall
we
make
the
following
/:
$$f(t) = \begin{cases} \lambda e^{-\lambda t}, & t \ge 0, \\ 0, & t < 0, \end{cases}$$
^5"
There
grounds for supposing

this might be an appropriate assumption. First, let us suppose
that when
reached
been
has
decision
no
by time t following
A
stimulation
reached
are
ones.
explicit assumption as
where
to
probable
suppose
"similar"
as
ranges
model
of "similar"
some
the
experimental data we anticipate
of the directed
graph being a single point will be
the intuitively
"simplest" choice situation within
case
identified
is
stimulus
example,
of which
consider
n
model
the
the
forced
demands
which
they could not

if by great ingenuity we
able
were
simple sets of situations for which
even
positive constant.
at
time
between
proportional
case,
to be
want
experimental situations
points, one
probably reject the
In
those
to
generated
that
subprocesses which
S of "similar"
being similar.
as
presented with
S,
of
thinking, although
our
sets
subsets
as
of
Equally well, if
firings. It
situation
implicit in
model,
think
an
for its response.
also
include
not
neurone
stimulus
decision
It is
as
effectively prevents this extremity by requiring the
of
existence
are
accept the model
process.
set
do
we
of individual
in terms
doing
extremely
so
excessively complex we should reject

that it is possible to subdivide
the total
The
model.
/ which
should
we
decision
adequate description of the

directed graphs required are
the
distributions
that
is doubtful
complicated, it
be met
assumptions can
experimental data, but that in
of
classes
that these
show
to
elementary decision
obtain
we
to
it is not
0 then
t and
A^, with
+
a
are
the
that
probability
A^,
where
constant
difficult to show
that
two
the decision
small, is approximately
A^
is
of
A.
proportionality
the
will be
distribution
In this
of decisions
is
correct
the
an
virtue
of
made
better
observation
and
more
that
certain
as
simple, the
more
is
assumption
observed
decision
situations
better
latency is
and
exponential distribution slightly displaced
origin (Christie, 1952b; Luce, 1953). The main
approximated by
from
this
that it has
empirical problem, but it must be admitted
simplicity. Second, and probably more
relevant, it is
relatively common
are
Whether
exponential (Christie, 1952a, b).
is
the
an
is
generally on the rising limb. If this change toward simplicity
is actually toward a directed graph consisting of one point,
and if our other assumptions hold, then it seems plausible that the
elementary decision latency is actually exponential but that the
error
observed
distribution
time distribution
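The first ground offered for the exponential form is that, while no decision has yet been reached, the chance of reaching one in the next small interval is proportional to the length of that interval. A minimal sketch of why this leads to an exponential distribution follows. The rate `lam`, the time `t`, and the step sizes are hypothetical values, and the function name is introduced only for this illustration.

```python
import math

def survival_discrete(lam, t, dt):
    """Probability of no decision by time t when each step of length dt
    ends the wait with probability lam * dt."""
    steps = round(t / dt)
    return (1.0 - lam * dt) ** steps

lam = 2.0   # hypothetical constant of proportionality
t = 1.0
for dt in (0.1, 0.01, 0.0001):
    print(dt, survival_discrete(lam, t, dt), math.exp(-lam * t))
```

As the step shrinks, the probability of still waiting at time t approaches exp(-lam * t), which is the survival function of the exponential distribution.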
VI.
presumed
to
satisfy the
then
to
graphs /V^,where
when
f ^cr)
(fiy
solutions
choice
that
appear
where
fând /âre
which
compose
general, there
/ and
/' are
from the
same
triples with
stimulus
These
one
are
two
set
the
situations
distributions
appear
/. and
should
to
be
years,
of
be
no,
an
appropriate
or
one,
in any
degree
indirectly.
plausible to
It may
present
suppose
and
that,
if
problem. However,
this
to
with
associated
to
in
situations
choice
accept only those
both
to restrict
attacked, let alone
Further
cases.
the
possibilities.
solved, in this
difficulty. We
direction, but
electrical
of Section
satisfy the assumptions
considerable
in this
triples
following problem: Given a

of all triples (/j^,
set
f^,A/),
further
be
by
The
S.
the
of
solved
is to be
very
serve
not
important lead
it. In recent
that
set
of directed
set
of course,
S, then it will be necessary

same
of
each
somewhat
solutions
many
and
is
are
solution.
problem
seems
which
assumptions
hopes
one
first the
/. It
problems will
they
paper;
form
the
to
continuous, which
to
base-
the reaction-
member
that
may,
/, find the
distribution
/^ denote
/ând /
attacked
solve
appropriate to
continuous
only
be
must
the
situations
typical
exactly one
if the
of
model, i.e.,S
II. Let
S, such
over
problem, but
will be
generality, it
prove
in
with
ranges
to the
of S there
It would
of
Assumption
find distributions
a
of the
assumptions
in
convolution
of choice
set
yields the distribution /^. There
many
the
distribution.
composed according
"
IV
by
associated
distribution
problem is
"S be
Let
of the type described

time
smeared
the decision-time
and
Problem.
The
is
engineers
we
have
have
been
not
know
of
investigated
concerned
with
the
problem
networks
to have
with
there
systematic
manner
functions.
If
with
distribution
the
electrical
identify the
functions, the
/ with component
the two
analogy between
an
we
transfer
and
network,
electrical
is
transfer
preassigned
the
synthesizing in
of
given reaction-time
graph
characteristic
problems. This is probably worth investigation, but it is almost certain that
solving our problem will prove to be a major research undertaking.
To
of
some
assumptions
our
then
by Assumption
and
another
consists
of
one
point. Let
in the latter
the
and
includes
times
pose
II
times
stimulus
From
case.
Let
exists
whose
situation
set
Assumption
we
the
situation
S which
directed
distribution
the
f^be
given stimulus
there
/j denote
simplified by using
transform.
for
know
we
be
may
Laplace
of reaction
distribution
observed
a^
problem we
the
extent
some
graph
of reaction
write
may
(8)
f ft,(r)fjt-r)dr.
/i(^)=
Taking
(2),
the
Laplace
in each
transform
L(0
If
we
divide
the
first
and
case
applying equation
L(4)L(4)
the
equation by
^g^
second
in
equation (9),we
obtain
Up
L{A)
This
is
is
is
seen
fairly crucial
that all mention
(10)
UL)
of
consequence
of the base
this
point
Empirically, one
rather
we
should
does
not
approximations
raise
obtain
an
to
eliminated.
been
/ and
it
It
A'^.
important practical problem.
estimates
to the cumulative
F(t)
assumptions, for
our
has
time
equation relating the empirical data
an
At
but
L{Q
_
of
the
distribution
distribution
f f(r)dr
/,
(Throughout
while
Now,
well
whether
into
about
statements
(3) we
to
denote
their
be
reasonably accurate,
results,
our
cumulative
the
of data
So
cumulatives.)
tends
magnify
to
question
the
it is
arises
particular equation (10),
in
From
distributions.
equation
have
LU)
sL{F)
are
we
so
F{Qi)
speaking of empirical
equation (10) becomes
Since
and
to
avoided.
be
to
translate
can
may
distributions
denote
letters
differentiation
is, therefore,
we
to
numerical
that
and
errors
corresponding capitals
approximations
known
Latin
small
use
shall
we
the
and
data
we
F(0)
assume
may
H41.
^^
eliminated
/^ from
Since
it remains.
/,
is
the
from
any
consists
in the
same
of
several
LiQ
As
an
example
exponential with
of
constant
suffice
is the
course,
case
to
assumes
determine
the
where
it
graph
case
L(F,)
UU)
^
.
_
^llilA
may
be
used, suppose
by equation (6)
Then
(12)
equation (12)
how
time
it will
cases,
point, in which
one
equation (11)
in
division
simplest, of
The
one.
discussion, the problem of determining
our
our
0,
(11)
Having
/îs
s
1
and
If
so
we
equation (12) becomes
make
the
equations (3) and
reasonable
(5) we
find
assumption
that
/j(0) 0,
=
then
from
Assuming
that
that
and
continuous
f^ is
f^ has
derivative
continuous
equation (4) implies
1
df^
or
integrating from
0 to ^
Since
be
f^ must
determined
from
equation (13) that considerable
VII. Serial
the
of
estimates
accurate
Decision
F^
graph
An
discussed
in Section
but
they
may
such
extra
assumptions
with
hope that they
this
the
case
VI
is
choose
We
shall examine
extreme
forms
section, is
second, which
shown
in
be
may
two
cases
of the directed
the
general serial
will be discussed
the
about
have
relevant
which
for
The
shown
in Section
"
""
^-"-
Stimulus
FIGURE
directed
solution
of
heuristic
sense,
grounds,
the two
first,the topic of
in Figure 3a, and
VIII,is
-"
the
experimental
some
are, in
graph N,
case
intuitive
on
solving
this alternative
considerable
Figure 3b,
Stimulus
obtain
discover
to
may
most
to
program
explicit assumptions
value. We
data.
alternative
the
general problem,
the
to
necessary
The results of
elementary latency /
will, unfortunately, be much weaker than
A' and
program
the
be
clear from
is
certain
of
will
data
Process.
general problem
consequences
empirical data, it
"^
the
parallel
It follows immediately from Assumptions I and II.2.b that the observed
distribution f_n of a serial process having n nodes is given by
$$f_n(t) = \int_0^t \int_0^{t_n} \cdots \int_0^{t_2} f_b(t_1)\, f(t_2 - t_1) \cdots f(t - t_n)\, dt_1\, dt_2 \cdots dt_n. \qquad (14)$$
Applying the Laplace transform to equation (14) and using equation (2) we have
$$L(f_n) = L(f_b)\,[L(f)]^n, \qquad (15)$$
or dividing by the case n = 1,
$$\frac{L(F_n)}{L(F_1)} = [L(f)]^{n-1}. \qquad (16)$$
Equation (16) is the explicit form of equation (11) for the serial case.
Clearly, if we have given numerical data we may determine (possibly
numerically) f for each value of n.

As an example of how this might be done when we know the general form of f,
suppose f is exponential with the time constant λ. In that case, equation (16)
becomes
$$\frac{L(F_n)}{L(F_1)} = \left(\frac{\lambda}{s + \lambda}\right)^{n-1}. \qquad (17)$$
In Figure 4 we have presented plots of $[\lambda/(s + \lambda)]^{n-1}$ vs.
$s/\lambda$ for small values of n.

A second equation may be obtained by observing that the mean, $\mu_1(n)$, of the
serial process with exponential elementary decisions is given by
$$\mu_1(n) = \mu_1(b) + \frac{n}{\lambda}, \qquad (18)$$
where $\mu_1(b)$ is the mean base time. Thus,
$$\mu_1(n) - \mu_1(1) = \frac{n - 1}{\lambda}. \qquad (19)$$
We may now use equations (17) and (19) to attempt to decide whether a
given set of data is adequately fit by the assumptions of the model, plus the
added assumptions of a serial directed graph and exponential elementary
latencies. There are serious statistical questions as to how this may best be
done, but the following
FIGURE
method
ready
s; this
we
assume
may
For
plot A"
5
We
know
find in
is in the form of
plot, which
of
and
We
thus enter
left side
for
of
function
shall call
we
value
some
of
corresponding value
if the
plot A
at this
selected
we
will be
of
the
and
correct
are
Since
the value
Figure 4
equation (17) that this
from
(19) presents
which
as
formulated
are
of
y*
^,
chosen.
compute
we
value
(reasonable)
each
assumptions
s
the data
say
of
From
solved.
and
problems
statistical
until the
suffice
may
relation
satisfied
such
2s
our
that the
be
equation (19)] and (n
of
value
correct
has
our
observed
between
1)/A is
are
valid.
the observed
a
But
means,
been
the value
point and determine
the
if
to
HP,)
assumptions
error
(x^r
equal
this determines
between
if
must
of
equation
A, and
We
means
choose
[the
minimum; this yields
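The two checks proposed for the serial case, the transform ratio of equation (17) and the difference of means of equation (19), can be tried on simulated data. The sketch below is only an illustration under stated assumptions: the uniform base-time distribution, the value of `lam_true`, the sample sizes, and the function names are hypothetical, and the sample mean of exp(-sT) is used as an estimate of the Laplace transform of the latency distribution. Because L(f) = sL(F) when F(0) = 0, the ratio of these estimates plays the role of L(F_n)/L(F_1).

```python
import math
import random

def simulate_serial(n, lam, trials, rng):
    """Reaction times: hypothetical uniform base time plus n exponential stages in series."""
    times = []
    for _ in range(trials):
        base = rng.uniform(0.15, 0.25)
        times.append(base + sum(rng.expovariate(lam) for _ in range(n)))
    return times

def empirical_transform(times, s):
    """Sample estimate of E[exp(-s T)], the transform of the latency distribution."""
    return sum(math.exp(-s * t) for t in times) / len(times)

rng = random.Random(0)
lam_true, n = 5.0, 3
t1 = simulate_serial(1, lam_true, 20000, rng)
tn = simulate_serial(n, lam_true, 20000, rng)

s = 2.0
ratio = empirical_transform(tn, s) / empirical_transform(t1, s)
predicted = (lam_true / (s + lam_true)) ** (n - 1)

mean_gap = sum(tn) / len(tn) - sum(t1) / len(t1)
lam_from_means = (n - 1) / mean_gap
print(ratio, predicted, lam_from_means)
```

The base time cancels in the ratio, so the estimate depends only on the elementary latencies, and the difference of means gives an independent estimate of the same constant.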
To
the
evaluate
above
the
consider
sum,
function
n~\
t"(x)
-x'^-(i
-xy-'
2] C'-k')(~^)^cc^-x^
k=0
Observe
that
'
i\^k)
'
dx
^0
^0
=0
fc=o
A;
=0
T-
/^ +
and
L{nfFr').
that
^(x)dx
nl
x^^"^(1
dx
x)''-'^
i)r(n)
nr(|+
where B(m, n) is the Beta function and Γ(n) is the Gamma function.
From these results we easily obtain
nBjY
1, n\
(22)
r(^
(l
+
^
n
^
+
-^
^)
i
n\r(j2J
+
In
Figure 4
for small
have
we
values
of
n.
also
presented
plots of
"
-.
iA
r-
vs.
The mean of the parallel process can be shown to be given by
$$\mu_1(n) = \mu_1(b) + \frac{1}{\lambda}\sum_{i=1}^{n}\frac{1}{i},$$
and thus we have, as in the serial case, a second relation which must be met:
$$\mu_1(n) - \mu_1(1) = \frac{1}{\lambda}\sum_{i=2}^{n}\frac{1}{i}. \qquad (23)$$
The
procedure
for
fitting is
curve
s
serial
that
except
case
y
X
the
enter
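The harmonic sum in equation (23) can be checked numerically. The sketch below assumes, as the parallel graph suggests, that the final decision waits for the slowest of n independent exponential elementary decisions; the rate, the integration limits, and the function names are hypothetical and serve only the illustration.

```python
import math

def mean_of_slowest(n, lam, upper=40.0, steps=40000):
    """Mean of the largest of n independent exponential latencies, computed as the
    integral of 1 - (1 - exp(-lam*t))**n over t >= 0 by the trapezoid rule."""
    h = upper / steps

    def g(t):
        return 1.0 - (1.0 - math.exp(-lam * t)) ** n

    total = 0.5 * (g(0.0) + g(upper))
    for k in range(1, steps):
        total += g(k * h)
    return total * h

def harmonic_form(n, lam):
    """(1/lam) * (1 + 1/2 + ... + 1/n)."""
    return sum(1.0 / i for i in range(1, n + 1)) / lam

lam = 2.0  # hypothetical
for n in range(1, 6):
    print(n, mean_of_slowest(n, lam), harmonic_form(n, lam))
```

Subtracting the n = 1 value from the n-node value leaves the sum from i = 2 to n divided by the rate, with the base time playing no part.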
IX.
graph
assumptions
there
there
is
but
it is
is
that
other
such
other
directed
between
to
use
first to do
to
The
that
but
another
to choose
will
be
such
selection
value
in
ad
an
be
there
Thus,
cases.
the
are
from
is
constants,
an
of
is
assumptions
Figure
in the
Presumably, any
some
shape of
the
extremely revealing of
indicated
the
sets
seen
which, in
curves
have
n,
procedure
different
can
one
hoc
fairly similar.
produce
We
determine
to
among
for it
of
are
unfortunate
there
VII)
and
as
of
set
one
any
them.
graph
of similar
set
such
constants,
that
curves
extreme
not
an
directed
serial
addition, within
to how
as
lie
sense,
the
proper
empirical
directed
situation.
a
number
in all likelihood
of difficult
it will
prove
to
statistical
be
problems
efficient
more
experimental exploring using subjective judgments

and
the
to solve
goodness-of-fit before trying to formulate
statistical
X.
two
"
here,
two
place
general problem
problems
as
whether
best
certain
small
any
will
It is clear
as
In
how
as
almost
graph
these
the
to
such
undetermined
are
that the
curves
graph
and
evidently quite serious
for almost
favorable
more
arise statistical
data
optimal.
The difficulty of making
data
solution
procedure (end of Section
one
for the
not.
or
question
be
better
is
there
fit the
exponential /
assumptions
not
particular set of assumptions,
and
VI,
described
as
Without
in Section
described
and
is
Selection,
Model
well
than
same
to
seems
to
the
some
problems.
Perceptual Moment.
reaction-time
studies
the
mean
In
Section
reaction
II
time
we
should
remarked
be
that
in
of the order
of
if unwanted
second
one
This
avoided.
have
peculiar phenomena
very
such
one
termed
with
interactions
that the data
means
rapidly at certain discrete

The
them.
period between
and
times
who
to
adapted to cope
simple hypothesis as to
belief
Let
the
that while
and
that
is
timing of the moment,

to
assumption
only
be
may
able
to
shall return
we
The
we
where
took
it to
moment
formal
vestigat
in-
are
exist,
probably
analysis
with
not
any
general
8
duration, say
stimulus
at
features
seconds,
time
any
the
at
Since
we
end
(elementary)
is
during
according
This
5.
to
that
happen
may
the
presented
0
in the interval
information
that
assume
may
the stimulus
h
during
of the period.
presentation and
the stimulus
person
part of the moment;
point later.
form
exponential, and
that if
no
so
shall
we
decision
has
decision
in the ith moment
by
response
the
ith moment
$(1 - \lambda\delta)P_{i-1} + \lambda\delta$
reached
time
X5
iS
If
we
assume
case
discrete
by
then
the
the
call the
then
P.
,
î =î-,+ [l-P"-.]^S
=
the
use
been
is
should
we
continuous
the
following the presentation, i.e., at
probability of a
of a
probability
this
shall make
that the
information
inappropriate,for it
assume
the
we
of the moment,
assume
may
assimilate
be
it most
that all intermediate
assume
distribution
to this
give
to
that it does
arises
to the discrete
as
question now
In
for the elementary decision
process.
analogue. We
ith
multiples of
we
uniform
as
between
correlation
no
refractory
on
Indeed, there
to indicate
receive
may
at
occur
time
In this section
is of fixed
moment
person
unchanged.
only serve
Furthermore, we will
there
is
only
period it will
decisions
conducted
case
situations
to
the nature
analysis remain
assume
us
In the
with it.
that it is correct, but

of the
is in
is, therefore, of interest whether
be
can
It
effect.
an
information
that he
been
not
applied
servatio
ob-
period from the beginning of

the beginning of the next has been
(Stroud, 1949a, b). Unfortunately,
its existence.
will be
analysis
will have
doubt
certain
explain these
and
possible at this
properties of the moment.
of the
statement
our
it is
so
where
range
to be
are
subject processes
relatively little direct experimentation has
problem,
stimuli
To
observed.
been
hypothetical event
the perceptual moment
other
will be in
proposed that
been
it has
(24)
With the initial condition $P_0 = 0$ the difference equation (24) is solved by
$$P_i = 1 - (1 - \lambda\delta)^i.$$
The probability of decision in the ith moment is obviously
$[1 - P_{i-1}]\lambda\delta$; hence, we have
$$\lambda\delta(1 - \lambda\delta)^{i-1}$$
as our discrete distribution.
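The difference equation and its solution can be verified by direct iteration. In this minimal sketch the values of `lam` and `delta` are hypothetical, and `P[i]` stands for the probability that a decision has been reached by the end of the ith moment.

```python
def moment_probabilities(lam, delta, moments):
    """Iterate P_i = (1 - lam*delta) * P_{i-1} + lam*delta starting from P_0 = 0."""
    p = [0.0]
    for _ in range(moments):
        p.append((1.0 - lam * delta) * p[-1] + lam * delta)
    return p

lam, delta = 4.0, 0.1      # hypothetical values with lam * delta < 1
P = moment_probabilities(lam, delta, 10)
closed = [1.0 - (1.0 - lam * delta) ** i for i in range(11)]
per_moment = [P[i] - P[i - 1] for i in range(1, 11)]
print(P[:4])
print(per_moment[:3])
```

The iterated values match the closed form, and the successive differences are the probabilities of deciding in each individual moment.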
If
replace this
we
continuous
^
the limit
as
the observed
f^(t)fUrn
=
-"o
data
point i8
the
in the discrete
then
,
width
and
distribution.
be
by /^ as
denoted
serial
case
is
height
it is clear
the discrete
distribution
base-time
the
rectangles of
this becomes
"
"
has
about
centered
"
distribution,equation (25), by
discrete
"^^which
one
"
Let
(25)
that in
before, then
given by
/ f,(t,)h(t2-î)%(h-i2)-"
/'"""/'/
...
(26)
Applying
L
(fj
the
=
Laplace
transform
L{h) L {^^^
(/j)
Urn L
using equation (2)
and
=
lim L
L(h)\
(f^)
($,)]" (27)
.
Observe,
.2.
_i
f
""
8"
2] X8(l-X8y-'e'
A5(l-A5)-'
se
"
f^l(l-X8)e-'^'
But,
lim
so,
Lim
(0^)
^
5-
-(l_A5)e-^^
Substituting in equation (27) and dividing by the case n = 1, we have
(28)
=[l_(l-.A5)e-^sJ
Z(^)
'
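which is the crucial equation for the discrete serial case. The mean of the
discrete distribution is given by
$$\sum_{i=1}^{\infty} i\delta\,\lambda\delta\,(1 - \lambda\delta)^{i-1} = \frac{1}{\lambda}. \qquad (29)$$
Thus, the relation between observed means is
$$\mu_1(n) - \mu_1(1) = \frac{n - 1}{\lambda}.$$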
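The mean of the discrete distribution does not depend on the length of the moment, which is why the relation between means has the same form as in the continuous serial case. A minimal numerical check follows; the value of `lam`, the three moment lengths, and the truncation point `terms` are hypothetical choices.

```python
def discrete_mean(lam, delta, terms=20000):
    """Sum of i*delta * lam*delta * (1 - lam*delta)**(i - 1) for i = 1..terms."""
    q = 1.0 - lam * delta
    weight = lam * delta      # probability of deciding in the current moment
    total = 0.0
    for i in range(1, terms + 1):
        total += i * delta * weight
        weight *= q
    return total

lam = 4.0  # hypothetical
means = {delta: discrete_mean(lam, delta) for delta in (0.1, 0.01, 0.001)}
print(means)
```

Each entry is 1/lam to within rounding, whatever the moment length.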
Now, if we know the value δ, i.e., the length of the moment, then these two
sets of equations may be used in exactly the same fashion as were
equations (17) and (19) of Section VII. We have no theoretical
value
of 5
is
real
of
measurements
moment
of
it will be
so
phenomenon
it will be
71
further comment
1, the
such
as
with
f^
convolution
that shown
,
of h and
in
for reasonable
than
1, the
amount
0^, when
Figure 5.
/. will
,
The
serve
Smearing
depending on the
If
"
0, is
convolution
to
the
smear
will also
value
of
ascertain
its
ignore /^ and let
we
""
perceptual
time.
reaction
on
to
dependen
perform in-
of this function
steps but it will
result
n.
step function
if
is
Thus, if
larger
our
assumption
roughly correct, we should expect,

least for comparatively simple situations, to find the observed
as
to
the moment
that if the
important
interest:
some
utterly destroy them.
not
at
of
to
necessary
It is clear
it.
properties prior to analyzing experiments

One
(30)
is
latency distribution somewhat
lumpy. Indeed, in the literature (cf.
Woodworth, 1938) it has been remarked not only that the data are
lumpy but that there is an oscillation superimposed on the distribu-
3"
28
to
were
to
This
curve.
effect could
h uniform
assume
55
4f
FIGURE
tion
easily be obtained
only
over
is
truly a refractory period during which
information.
need
for
These
we
assume
considerations
bringout
comprehensive experiments
we
portion of the interval

the vast majority of the moment
in other
if
analyticallyif
small
5,
words,
there is
even
of
strongly the
more
determine
to
intake
no
propertiesof
the
the moment.
We
shall not
reasons
with
are
attempt,
that the
worthwhile
of the
opinion
that
carry
it is
explanationof
dimensions*'
an
effect
nature
the
out
with
other
of
than
complex
it
moment
the
is
moment
one
for
hint
number
of
may
and
hardly
accepted in
serially. It
possible
changing
information
of the
information
accepted in
is
rather
analysis. Furthermore, we
unlikelythat
latter remark
the
in
the
on
that the information
parallel. The
problem is
mathematical
is dealt
moments
however,
to
The
before, to study the parallel case.
little information
so
seems
in
as
are
ferent
dif-
happen,
processed
developing
an
"psychological
display.
Experimental Proposals. The key assumption in our analysis is that elementary decision processes can be found of such a sort that complex decisions can be built up from them in a way which
XI,
leaves
present
their characteristic
A value
experimentalsubjects
with
invariant.
stimuli
One
which
should
vary
in
like to
several
but
dimensions
identical
several
we
values.
different
in every
other
respect,
If
possibly introducing
several
objects with the
with
and
the
have
we
have
conceptually different
uses
use
we
for each
relevant
dimension
same
into
one
of the dimensions
difficultyof
the
run
each
on
If
characteristics.
time
dimensions,
decisions
for which
identical characteristics
that
difficulty
the
reception of
unitary, but broken down into several parts separated by receptor orienting acts such as eye movements.
The
first of the two
following proposals suffers from the latter
the
stimulus
1st
of
row
3"
of
with
cards
pairs per card
position. Cards
n
of
triple-spacedtyped, horizontal
The
digits, 0 and 1, on each.
from
sixteen.
to
one
On
each
card
no
series
the
pair in
to
to vary
pairs will be unlike digits, i.e., (0,1) or (1,0);

like pairs, i.e., (1,1) or (0,0). The place of the unlike
pair or
one
the remainder
one
5"
verticallyaligned pairs
number
either
the former,
Experiment: Digit Difference Perception
White
Stimuli:
from
second
the
difficulty;
be
not
may
with
be
will
of
pairs
the
unlike
in
included
to
the initial to the final
from
vary
pair in each
the
with
set
of the
positions from
equal frequency, and
frequency.
pair will be included with the same
The assignment of (1,1) or (0,0) to the remaining places will be made on an equiprobable random basis, and the choice of (0,1) or (1,0) for the unlike pair will be made on the same basis.
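The stimulus construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `make_card` and `sample_card` are hypothetical, and the reading that "no unlike pair" occurs as often as each single position is our interpretation of the garbled passage, not a confirmed detail of the authors' design.

```python
import random

LIKE_PAIRS = [(0, 0), (1, 1)]
UNLIKE_PAIRS = [(0, 1), (1, 0)]

def make_card(n_pairs, unlike_position=None, rng=random):
    """Build one card as a list of n_pairs digit pairs.

    unlike_position is 1-based; None means the card bears no unlike pair.
    Like and unlike pairs are each chosen on an equiprobable basis.
    """
    card = []
    for position in range(1, n_pairs + 1):
        if position == unlike_position:
            card.append(rng.choice(UNLIKE_PAIRS))
        else:
            card.append(rng.choice(LIKE_PAIRS))
    return card

def sample_card(n_pairs, rng=random):
    """Draw a card whose unlike pair is absent or in any position,
    all n_pairs + 1 events being equally likely (an assumption)."""
    choice = rng.randrange(n_pairs + 1)  # 0 stands for "no unlike pair"
    position = choice if choice > 0 else None
    return make_card(n_pairs, position, rng), position
```

Passing a seeded `random.Random` instance as `rng` makes a stimulus deck reproducible from session to session.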
cards
with
no
unlike
Responses'. Experimenter
presentation how
will
respond
not
bear
or
announce
pairs the card
subject will be
told
to
no
will be instructed
and
be
each
shown
bears.
depending on whether the card does

pair, by pressing the appropriate one
that
possible positions, including in

events,
to
prior to
no,
unlike
an
The
keys.
yes
many
will
to
Subject
or
does
of two
pair in each of the

equally likely
position, are
unlike
an
no
stimulus
read
the lines
of
pairs from left
to right. The data of primary interest will be the latencies of the no response to the cards which bear no unlike pair and the latencies of the yes response to the cards which bear an unlike pair in the nth position.
Apparatus: 1.
2.
Stimulus
cards
as
described
Light projectorwith
fast
above,
shutter,
data
of
primary
set-stimulus
Same
Apparatus'.
for
calling
pairs
for
as
be
will
interest
the
latencies
the
the
to
response
response.
yes
first
of
experiment
for
except
the
stimulus
cards.
LITERATURE

Christie, L. S. 1952a. "The Measurement of Discriminative Behavior." Psychol. Rev., 59, 443-52.
Christie, L. S., R. D. Luce, and J. Macy, Jr. 1952b. "Communication and Learning in Task-Oriented Groups." Cambridge: Research Laboratory of Electronics, Massachusetts Institute of Technology. (Technical Report No. 231.)
Churchill, R. V. 1944. Modern Operational Mathematics in Engineering. New York: McGraw-Hill.
Luce, R. D., J. Macy, Jr., L. S. Christie, and H. Hay. 1953. "Information Flow in Task-Oriented Groups." Cambridge: Research Laboratory of Electronics, Massachusetts Institute of Technology. (Technical Report No. 264.)
Stroud, J. B. 1949a. "The Psychological Moment in Perception." Sixth Conference on Cybernetics, Ed. H. von Foerster. Josiah Macy, Jr. Foundation, 27-63.
Stroud, J. B. 1949b. "The Moment Function Hypothesis." M. S. Thesis. Stanford University.
Woodworth, R. S. 1938. Experimental Psychology. New York: Holt.

RECEIVED 5-5-55
PSYCHOACOUSTICS
DETECTION
AND
David
massachusetts
This
institute
how
process.
By
statistical
to
to be
appears
of signal and
with
theory
and
the
theory
has
been
threshold
noise
of the
held
are
on
as
process
independent
special emphasis
of
concept
used
theory
as
ideal
the
applied
of
two
The
observer.
analyze
to
it is
as
combination
paper
threshold
auditory
of
the
instance
an
psychophysical
The
constant.
of
assumptions
when
procedure
of
concept
derivation.
the
the
ideal
physical
is
observer
viewed
re-
usefulness
The
of
the
by considering the shape of the psychophysical function
relating the detectability of the signal to its intensity. A rather general model
is illustrated
this concept
function
based
of detection
is treated
theory
hypothesis testing, two
of the
(2)
are
(1) the detectability of the signal and
recognized:
process
of analysis which
The
level of the observer.
theory provides a technic
The
tectability
factors.
of signal deof both
obtain
measure
a quantitative estimate
criterion
parameters
technology,
review
Detection
decision
treating the
determinants
one
of
massachusetts
fairly complete
decision
structures:
discusses
allows
data.
psychoacoustic
theoretical
the
presents
paper
certain
Green
M.
cambridge,
to
THEORY*!
the
on
"
of
concept
is
signal uncertainty
which
presented
to
attempts
this
explain
relationship.
Introduction
There
is the
two
are
breadth
from
techniques range
of vowel
that
which
to
second
view
deficit
consensus
is the
This
recent
interest.
t This
by
the
Hanscom
Note.
Its
Other
paper
data
have
not
of
Electronics,
M.I.T.
is
"
any
to
in J. Acoust.
This
is the
Soc.
first of
papers
was
will
This
in
is Tech.
Rept.
No.
part by
follow
the
Force
in
If
basic
some
AFCCDD
41
papers
U.S.
Air
on
grant from
under
Research
by
TR-60-20.
Psycho-
reflection
a
of
general
complete
of the
Reprinted
latter
with
mission.
per-
aspects of acoustics
the
National
of
Science
issues.
Force
Cambridge
force
example
1189-1203.
theoretical
old.
the
may
recent
subsequent
administered
and
Massachusetts,
32,
from
integrative
paper
A
series of tutorial
supported by
partially
Bedford,
new
1960,
Amer.,
publication is supported
of this kind
field,a
procedure.
measurement
difficult.
structure
complete comprehensive theory.

where
on
methodology. Often, even
of the
tion
percep-
it reduces
most
areas
integratedwith
easily be
The
being overlooked.
is
these
of any
since
one
area
some
entire
analysisof the
system
sensory
lack
study hearing.
to
fortunate
integrationof
makes
Operational Applications Office, Air

Field,
of the
field is the
might
consensus
exist in
appeared
Editor's
Foundation.
lack
of the
article
of the
new
does
to
seems
re-examination
cochlea
rapidly expanding experimental literature.
the
acoustics, however,
this
diversitywhich
characteristic
existed, these
structure
of the
studies
psychoacoustics. One
techniques used
approach
multidisciplinary
reallysignificant
aspect
any
it creates
However,
A
This
field of
of the
skills and
hydrodynamic
forms.
the chances
strikingcharacteristics
very
varietyof research
and
the
contract,
Center,
Research
monitored
Laurance
Laboratory
G.
of
exchanges of Garner¹ and Stevens² on the quantitative scale of loudness. Such a situation compounds the problem of integration.

This paper, therefore, makes no attempt at broad coverage. The author hopes that by concentrating on one rather limited topic some positive contribution can be made. This topic is the detection of signals in noise. In recent years a general theoretical structure (detection theory) has been used to analyze such experiments. Unfortunately, there appears to be some confusion both about the theory itself and the manner of its application. The main objective of this paper will be to clarify these two questions.

Part of the confusion about the theory arises from the fact that detection theory is a combination of two distinct theoretical structures: decision theory and the theory of ideal observers. Before we begin a detailed discussion of these two aspects of detection theory, we will briefly outline them and relate them to psychoacoustic problems.
Decision theory provides an analysis of the process which generates the dichotomy between stimuli the subject reports he does and does not hear. The theory recognizes that a priori probabilities, values, and costs of correct and incorrect decisions, as well as the physical parameters of the signal, play a decisive role in establishing this dichotomy. We will find that this dichotomy is determined by an adjustable criterion. The theory shows how a quantitative estimate of the criterion can be obtained from the
be found
may
in the
data.
There
are
many
whose
psychoacousticians
parameter from
constant
which
obtain
to
parameters, for example,the absolute

the
only interest
substantive
threshold
in this criterion is as
relations between
energy
as
function
two
of
physical
or
frequency,
just detectable change in power as a function of power ($\Delta I$ vs $I$). To them this theory will be of methodological interest only. Yet clearly, if factors such as a priori probability, values, and costs do play a role in determining the threshold, their control in substantive experiments is imperative.
The second part of detection theory is more directly related to substantive matters; it is the theory of ideal observers. Briefly, the theory provides a collection of ideal mathematical models which relate the detectability of the signal to definite physical characteristics of the stimulus. There is a collection of such models because
aspect of detection
"
different restrictions
make
one
may
theoretical observers
are
rarelyused
Most
often,theyare
the ideal observer
comparison,in
used
of the
suggestseither
mechanism,
hearing
This
discrepancy. will be
or
new
models
and
and
detection
of the
comparing human
specifythe nature
to
of the
nature
actual
as
for the sake of
in order
turn,
the
on
amount
new
illustrated in
These
hearingmechanism.
performancewith that of
of discrepancy.
This
accurate
representation
more
hopefully
further
to clarify
experiments
a
device.
the exact
nature
of the
later section of the paper.
Decision Theory
We shall demonstrate, under quite general assumptions, how a transformation of the subject's responses can be utilized to determine both the subject's criterion and the detectability of the signal. This analysis requires an understanding of several basic concepts which are rather complex. We might skip over these fundamentals and start,
the
¹ W. R. Garner, J. Acoust. Soc. Am., 1958, 30, 1005.
² S. S. Stevens, J. Acoust. Soc. Am., 1959, 31, 995.
as some previous expositions have, with some assumptions about Gaussian distributions and parameters of these distributions. Such a procedure would be unfortunate because it robs the analysis of its generality and implies that strong assumptions are needed to justify its applicability. Such is not the case.
Typically, psychoacousticians try to analyze the subject's responses by making some assumptions about the way in which the sound is processed by the hearing mechanism. One assumes, for example, that the cochlea either makes a frequency analysis of the waveform or that it does not, etc. We wish to postpone temporarily such substantive issues. Let us, for the present, merely assume that each sound may be represented by a series of numbers. These numbers might be the values of a series of attributes, or various states of the nervous system. Whatever the representation, let us call this abstraction an observation.

The problem we wish to consider is this: Given an observation, what response alternative should be chosen? What is a good choice, and how can we analyze these choices? We shall attempt to answer these questions by considering a single example. The example is obviously specific; the generality rests in the concepts. The single motive in presenting this example is to enable us to discuss these concepts (likelihood ratio, decision rule, and criterion) with some precision and yet avoid formalism.³ After this theoretical discussion, we shall investigate the applicability of these concepts to a psychoacoustic experiment.
An example of decision theory. Let us assume we have 10 observations, each observation ($X_i$) represented by three numbers [$X_i = (x_1, x_2, x_3)$], and two hypotheses, $H_1$, $H_2$, about the observations. Given an observation, we wish to decide whether the observation is an instance of $H_1$ or $H_2$.⁴ We shall assume that we have complete information about the probability of each observation given each hypothesis. By limiting the example to 10 observations, we can work with the probabilities directly. The reader should note that the three numbers $(x_1, x_2, x_3)$ could have been extended to three hundred. Everything that follows is independent of the dimensionality of the observation. The variables ($x$) of the observation could be quantitative (integers or real numbers) or qualitative (red, blue, or green). They are simply a description of the observation.
Likelihood
numbers
In
ratio.
corresponding
These
inference.
concepts
Most
which
principle
*
A.
J.
"
I,
have
we
observation.
from
come
of the
Table
each
to
theorems
the
topicof
key
with Neyman
originated
listed the
The
statistical decision
and
Neyman
For
a concrete
theoryand
E. S. Pearson, Phil. Trans.

of the
interpretation
the
extended
who
data
theoryof
the
basic
Pearson.
Wiley,1950.
York:
New
Wald, Statistical decision functions.
and
the three
providethe
columns
first presented
by Wald,
were
and
observations
two
next
Roy.
Sac.
1933. A231, 289.
London,
⁴ For a concrete interpretation of the example, the reader might think of the observation as a sealed package, the three numbers as the length, width, and depth of the package, and the hypothesis as whether the package contains a toy car or animal. The problem, then, is this: Given the measurements of the package, guess whether it contains a car or an animal. Alternatively, one might think of the observation as a sound which can be specified by three numbers or attributes. The problem is: Decide from the three numbers whether the sound is a consonant or vowel.
on
the
of each
probabilities
observation
the ratio of the fifth column

likelihood
H^
divided
that
H^
that
by the probability
is correct.
that this number

observation
variable
to
on
each
If
Note
is
which
is
we
have
representsthe
a
itresulted from
{X^) we
should
that the likelihood
function
The
hypothesis.
the sixth and
that
ratio,then, is the probability
call the "odds."
some
specified
by
three
values
likelihood
observation
particular
H^.
be
The
likelihood
to
willing
ratio is
of three variables
final column
number,
ratio.
The
resulted from
ratio
giveswhat
nine cents
wager
not
is simply
to one
and
probability,
have taken an
{x-^^,
x^, x^. Thus we
related
and
it to a single
(x^,x^, x^,
l{x-^,
x^, x^.
The reason we have performed this transformation is simply stated: We can make optimum decisions if we use the likelihood ratio. We have not stated what we mean by optimum, but let us take up this point a little later. First, let us show how we might use the likelihood ratio in making decisions.
Decision rule. If someone asks us to make a decision about a particular observation, whether it is an instance of $H_1$ or $H_2$, we would probably guess it was $H_1$ if the probability of that observation was greater on $H_1$ than on $H_2$. Such a statement is called a decision rule. In terms of likelihood ratio this decision can be expressed as follows: Choose $H_1$ if $l(X) \geq 1$. In effect, we have specified our decision rule by choosing one number; in this case, the number "one." This number is called a criterion or, more precisely, a likelihood-ratio criterion.
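The decision rule stated above can be written out as a short Python sketch. The function names are hypothetical, and the probabilities in the usage note are illustrative values rather than entries recovered from Table I.

```python
def likelihood_ratio(p_given_h1, p_given_h2):
    """Probability of the observation under H1 divided by that under H2."""
    if p_given_h2 == 0.0:
        return float("inf")  # observation impossible under H2
    return p_given_h1 / p_given_h2

def choose(p_given_h1, p_given_h2, criterion=1.0):
    """Choose H1 when the likelihood ratio reaches the criterion, else H2."""
    if likelihood_ratio(p_given_h1, p_given_h2) >= criterion:
        return "H1"
    return "H2"
```

For instance, an observation with probability 0.14 under $H_1$ and 0.01 under $H_2$ leads to the choice of $H_1$ at a criterion of one, while raising the criterion to 20 reverses that choice.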
Suppose that, independent of any specific observation, $H_2$ was ten times as likely as $H_1$. Clearly, we would not maintain our previous criterion; even without knowing the characteristics of the observation, the odds are ten to one in favor of $H_2$. It turns out that in this case we should choose $H_1$ only if $l(X) \geq 10$. That is, we should choose $H_1$ only if, in our example, the specific observation is $X = (4, 3, 3)$. Similarly, if we place asymmetrical values and costs on the various correct and incorrect decisions, we should change our criterion or likelihood ratio accordingly.
Monotonic functions of likelihood ratio. While we can state our decision procedure in terms of likelihood ratio, there are many exactly equivalent ways of stating the decision rules. In the example, it so happens that the product $x_1$ times $x_2$, minus $x_3$ is also an optimum decision quantity. This is true because this quantity is monotonic with the likelihood ratio. The criterion number which we use is not the same as on the likelihood-ratio scale, but there is always some number on this monotonic scale such that it corresponds to the criterion number on likelihood ratio. For example, suppose we would select the $H_1$ alternative if $l(x_1, x_2, x_3) \geq 1.25$; then we would make identical decisions using the other decision rule, select $H_1$ if $(x_1 x_2 - x_3) \geq 5.00$.

In the application of this theory to psychoacoustics, the decision axis is unobservable, and hence we are interested only in equivalent decision procedures. To say the observer uses an optimum decision procedure means only that he is using a monotonic transformation of likelihood ratio.
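The equivalence of decision rules under a monotonic transformation can be checked numerically. The sketch below is illustrative only: the likelihood-ratio values are hypothetical, and the logarithm stands in for whatever increasing transformation an observer might use; it is not the $x_1 x_2 - x_3$ quantity of the example, whose table is not reproduced here.

```python
import math

def decisions(values, criterion):
    """Return True (choose H1) wherever the value reaches the criterion."""
    return [value >= criterion for value in values]

# Hypothetical likelihood ratios for a handful of observations.
ratios = [14.0, 5.0, 2.0, 1.25, 1.0, 0.5, 0.2]
k = 1.25

on_ratio_scale = decisions(ratios, k)
# Apply the same increasing transformation to the values and the criterion.
on_log_scale = decisions([math.log(r) for r in ratios], math.log(k))

identical = on_ratio_scale == on_log_scale
```

The criterion number changes from 1.25 to log 1.25, but every individual decision is the same, which is all that "equivalent decision procedure" requires.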
Optimum nature of likelihood ratio. We turn now to the very important question of the optimum nature of likelihood ratio. Clearly a decision procedure based on likelihood ratio is only optimum if it best attains some specific objective. Let us list some of these objectives to indicate their generality: (1) maximize the expected value of the decisions,⁵ (2) minimize risk,⁶ (3) estimate a posteriori probability,⁷ (4) maximize the percentage of correct decisions,⁸ and (5) set the error rate on some decision alternative at some constant and maximize the number of correct decisions for the other alternative.⁹ The
impressive fact is that a decision criterion based on likelihood ratio is optimum under all the above objectives. Naturally this criterion may be different for different objectives. The references listed with the objectives contain a more detailed explanation of each objective and prove how a decision rule based on likelihood ratio, or some monotonic transformation of that quantity, may be used to make the best decisions.¹⁰
Distribution of likelihood ratio. We have seen how each observation, independent of the number of attributes included in the observation, can be reduced to a single quantity, likelihood ratio. Likelihood ratio is simply a function of several variables, and for any single observation is simply a number. We may then properly consider a probability defined on the variable likelihood ratio. Let us consider, in particular, the probability that we shall obtain a particular value of likelihood ratio under $H_1$ and $H_2$
of the preceding example. Table II shows these probabilities and the corresponding cumulative distributions for both hypotheses of our example. The likelihood ratio is ranked from largest to smallest to facilitate the explanation of the ROC curve.¹¹
"
W. W. Peterson, T. G. Birdsall, and W. C. Fox, Trans. IRE, 1954, PGIT-4, 171.
T. W. Anderson, An introduction to multivariate statistical analysis, New York: Wiley, 1958.
P. M. Woodward, Probability and information theory with applications to radar. New York: McGraw-Hill, 1955.
To estimate a posteriori probability no criterion is involved. In this case the best estimate of a posteriori probability is a simple monotonic transformation of likelihood ratio.
Note that since two observations yield a likelihood ratio of 0.50, we have added the probabilities under both hypotheses to obtain the probability of that likelihood ratio.

ROC curves and their properties. We shall use Table II to construct an ROC (Receiver Operating Characteristic) curve. To do this, let us assume the decision rule is to accept $H_1$ if $l(x_1, x_2, x_3) \geq k$. If $k = 14$ we find that the probability of accepting
$H_1$ when it is true [$P_{H_1}(H_1)$] is 0.14 and the probability of accepting $H_1$ when it is false [$P_{H_2}(H_1)$] is 0.01. By decreasing $k$, we change both probabilities. The upper curve shown in Fig. 1 shows how the probabilities change as a function of $k$, and is called an ROC curve. The two probabilities completely represent the stimulus-response matrix in a two-alternative detection task since the complements of $P_{H_1}(H_1)$ and $P_{H_2}(H_1)$ are the two remaining cells in the stimulus-response matrix.

TABLE II. Probability under Each Hypothesis that $l(X)$ Will Have Certain Value.
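The construction of an ROC curve from ranked likelihood ratios can be sketched as follows. The table in this sketch is hypothetical (the body of Table II did not survive extraction); only its first row, with probabilities 0.14 and 0.01 at a likelihood ratio of 14, is taken from the text. The function name `roc_points` is likewise an assumption made for illustration.

```python
# Each row: (likelihood ratio, probability under H1, probability under H2).
# Hypothetical values; both probability columns sum to one.
TABLE = [
    (14.0, 0.14, 0.01),
    (4.0, 0.24, 0.06),
    (2.0, 0.26, 0.13),
    (1.0, 0.20, 0.20),
    (0.5, 0.10, 0.20),
    (0.15, 0.06, 0.40),
]

def roc_points(table):
    """Return (k, P_H2(H1), P_H1(H1)) for each criterion k, largest k first.

    With the rule "accept H1 if l(X) >= k", both probabilities are the
    cumulative sums of the rows whose likelihood ratio is at least k.
    """
    ranked = sorted(table, key=lambda row: row[0], reverse=True)
    hit = 0.0
    false_alarm = 0.0
    points = []
    for k, p_h1, p_h2 in ranked:
        hit += p_h1
        false_alarm += p_h2
        points.append((k, false_alarm, hit))
    return points

points = roc_points(TABLE)
```

Lowering the criterion moves along the curve from the lower left toward the upper right-hand corner, where $H_1$ is always accepted.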
Figure 1. The receiver operating characteristic (ROC) curve of the example. The axes are $P_{H_1}(H_1)$, which is the probability of responding $H_1$ if the observation was from $H_1$, and $P_{H_2}(H_1)$, the error rate, which is the probability of responding $H_1$ if the observation was from $H_2$. The points were plotted from Table II.
TABLE III. Calculation of the Probability of Correct Response in Forced-Choice Test.

was produced by $H_1$, we shall be correct if the larger likelihood ratio was in fact produced by $H_1$ and the smaller was in fact produced by $H_2$. The probability of this occurrence is $P_{H_1}[l(X_1)] \cdot P_{H_2}[l(X_2)]$ where $l(X_1) > l(X_2)$. In fact, if the larger likelihood ratio is equal to $k$, the probability of a correct choice is simply: $P_{H_1}[l(X) = k] \cdot \{P_{H_2}[l(X) < k]\}$.¹⁵ To obtain the final result we need only summate over all the values of $k$, since any of these values might be the largest, except the lowest value of likelihood ratio itself. Table III gives these calculations and the final answer (0.8042).

While the method of calculating this probability is straightforward, often, especially in psychoacoustic experiments, one does not have numerical distributions on a likelihood-ratio scale. Two approaches could be used in these situations. The first, and the safest, since it makes no additional assumptions, would be to compute the probability from an experimentally determined ROC curve. If you look at Table III closely, you will see that the quantities used in the calculation are simply $\Delta P_{H_1}(H_1)$ times $[1 - P_{H_2}(H_1)]$ for each successive point on the ROC curve (Fig. 1). Obviously, the accuracy of such a procedure is heavily determined by the accuracy of the experimental estimate of the ROC curve. The merit of the technique is that no assumptions beyond that of the decision rule are necessary to predict forced-choice behavior from the ROC data.
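The summation described above, and its generalization to $M$ alternatives, can be sketched in Python. The distribution below is hypothetical (Table III is not reproduced in this extraction, so the sketch does not attempt to recover the value 0.8042), and the function name is an assumption. Ties between likelihood ratios are counted as errors, following the strict inequality in the text.

```python
# Each row: (likelihood ratio, probability under H1, probability under H2).
# Hypothetical values; both probability columns sum to one.
TABLE = [
    (14.0, 0.14, 0.01),
    (4.0, 0.24, 0.06),
    (2.0, 0.26, 0.13),
    (1.0, 0.20, 0.20),
    (0.5, 0.10, 0.20),
    (0.15, 0.06, 0.40),
]

def forced_choice_correct(table, alternatives=2):
    """Sum over k of P_H1[l = k] * (P_H2[l < k]) ** (alternatives - 1)."""
    ranked = sorted(table, key=lambda row: row[0], reverse=True)
    cumulative_h2 = 0.0
    total = 0.0
    for k, p_h1, p_h2 in ranked:
        cumulative_h2 += p_h2            # P_H2[l >= k], the ROC abscissa
        below = max(1.0 - cumulative_h2, 0.0)  # P_H2[l < k]
        total += p_h1 * below ** (alternatives - 1)
    return total

p_two = forced_choice_correct(TABLE)
p_four = forced_choice_correct(TABLE, alternatives=4)
```

Each term is the increment in $P_{H_1}(H_1)$ at one ROC point times one minus the error rate at that point, which is the quantity the text identifies in Table III; adding alternatives can only lower the probability of a correct choice.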
A second procedure, one which has often been used, is to make some assumptions about the distributions which generated the ROC curve and then use these assumptions in predicting behavior in the forced-choice experiment. The most popular set of assumptions is that the distribution of observations on the likelihood-ratio axis, or some monotonic function of that axis, is normal or Gaussian under both hypotheses. The distributions are assumed to differ only in their means and, sometimes, in their standard deviations. Let us assume, for simplicity, that standard deviations are equal under both hypotheses, then the ROC curve can be characterized by one parameter;
the difference in the means divided by the standard deviation ($\Delta M/\sigma$). This parameter is usually denoted by $d'$ ($d' = \Delta M/\sigma$).

The calculations of the probability of a correct detection in a two-alternative forced-choice situation if these assumptions are made are quite simple. The probability that one likelihood is larger than another is the probability that the difference is greater than zero. Since, by assumption, some transformation of $l(X)$ is normal, the difference distribution is normal with mean of $\Delta M$ and variance equal to the sum of the original variances. Hence

$$P(\text{correct, 2 alternative}) = \Phi\left[\Delta M/(\sigma_1^2 + \sigma_2^2)^{1/2}\right] = \Phi\left[d'/(2)^{1/2}\right].$$

The probability of being correct for any number of alternatives is given in footnote 15.

We have now reviewed all the essential aspects of how detection theory uses decision theory in analyzing the process of detection. Let us now turn to some experimental results and see to what extent these notions are supported. Following this review of the experimental studies, we shall conclude this section with a discussion of the implications of these studies for psychoacoustic procedures in general.

¹⁵ If more than two, say $M$, alternatives are used in the forced-choice test, the equation becomes
$$P(\text{correct}) = \sum_k P_{H_1}[l(X) = k]\,\{P_{H_2}[l(X) < k]\}^{M-1}.$$
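Under the equal-variance normal assumptions discussed in this section, the two-alternative forced-choice probability reduces to $\Phi(d'/\sqrt{2})$, which is easy to evaluate. The sketch below uses only the Python standard library; the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the unit normal, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct_two_alternative(d_prime):
    """P(correct) in two-alternative forced choice under equal variances.

    The difference of two unit-variance normal variables has mean d'
    and variance 2, so the chance it exceeds zero is Phi(d' / sqrt(2)).
    """
    return normal_cdf(d_prime / math.sqrt(2.0))
```

With $d' = 0$ the function returns 0.5, pure chance, and a normalized difference of 0.92, the value reported below for the yes-no data of Fig. 2, corresponds to roughly 0.74.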
Experimental results

ROC curve. One of the earlier studies¹⁶ simply sought to determine the shape of the ROC curve experimentally in a simple psychoacoustic task. The signal was a 1/10-second, 1000-cps sinusoid. White noise, the masking stimulus, was present continuously throughout the experimental session. A light occurred to mark the observation interval. During this interval either the signal was added to the noise (SN) or simply the noise was presented (N): these were the two hypotheses of the detection task. The subject gave one of two possible responses; he pressed one button if he believed the signal was present ("yes") or pressed a second button if he believed no signal was present ("no"). The physical parameters of the situation, including the noise and signal levels, were held constant. The independent variable was the probability (a priori) of a signal being present. Five levels of a priori probability were selected (0.1, 0.3, 0.5, 0.7, 0.9) and the one used for a given session of 300 observations was announced to the subject. After the subject responded, he was given immediate information as to whether or not the signal had in fact been presented. The subject was awarded some fraction of a cent for each correct answer and fined an equal amount for each incorrect answer. He was instructed to make as much money as possible.

The results for one of the subjects are presented in Fig. 2. [$P_N(A)$ is the probability of saying "yes" when noise alone was presented.] The general trend of the data supports the decision-theory analysis. The curve drawn is generated by assuming the distributions on likelihood ratio are normal under both hypotheses. The normalized difference between the means is 0.92.

Threshold model and the ROC curve. Before considering whether or not the subjects actually adopted the proper criterion so as to maximize their payoff, let us consider one alternative explanation of the data. This is the so-called threshold model.

P. B. Elliott, Electronic Defense Group, University of Michigan, Technical Report No. 97, 1959.
¹⁶ W. P. Tanner, J. A. Swets, and D. M. Green, Electronic Defense Group, University of Michigan, Technical Report No. 30, 1956.
Figure 2. A sample of the ROC curve from the auditory detection experiment. See footnote 16. $P_N(A)$ is the probability of responding "yes" when noise alone was presented. $P_{SN}(A)$ is the probability of saying "yes" when signal-plus-noise was presented. These probabilities were estimated from the stimulus-response matrix. See text for details of the experiment.

The essentials of this model are that the signal, when added to the noise, augments some process within an organism, such that if the increment reaches a critical level called the threshold, the signal is heard and can be correctly detected.
So far, we note no great difference with the decision-theory analysis except in semantics. If one calls the decision-theory criterion a threshold and the hypothetical process likelihood ratio, the correspondence is complete. The differences between the models appear when one considers "subthreshold" events and the procedures used to deal with these events.

The threshold model assumes that should the signal increment fail to reach the threshold, the subject can only make a pure guess as to whether or not the signal is present. This is surely true since anything below the threshold is just that. If ordering is preserved below the threshold, the word has no meaning. The difference in terminology between criterion and threshold is important, for to say the subject adopts a criterion is to simply say an arbitrary cut point on a continuum is used as the decision rule.
Given that the subject guesses about events which are "subthreshold," he may, if blanks are ever employed, report the signal is present when it is not (false positive response). Two techniques, both consistent with the threshold assumption, might be employed if this occurs. One procedure widely used is to instruct the subject to be more careful; this can be interpreted as an attempt to instruct the subject to respond negatively to all "subthreshold" events. The implication of this procedure will be discussed in a later section. Another procedure, equally valid from the assumptions of this model, would be to employ a correction for guessing. This correction procedure assumes the guessing mechanism and the sensory mechanisms are independent. The excellent experiments of Smith and Wilson¹⁷ were the first, I believe, to show the inadequacy of this second procedure. This fact led them to reconsider the entire notion

¹⁷ M. Smith and E. A. Wilson, Psychol. Monogr., 1953, 67, Whole No. 359.
of the threshold and they presented, as an alternative model, one very similar to that suggested by decision-theory analysis. (See especially Sec. IV, footnote 17.) Munson and Karlin,¹⁸ using an information-theory analysis, investigated the detection process under "absolute threshold conditions." In order to deal with false positive responses, they proposed a "discriminant level model." This model is also very similar to that suggested by decision-theory analysis.
The threshold model could still attempt to account for the data shown in Fig. 2. The argument would run as follows: Suppose the subject achieves some hit and false-alarm rate. If the situation is changed in some way, he can modify his behavior by simply giving more "yes" responses. Since this guessing rate is independent of the stimulus conditions (both noise and signal-plus-noise events are below the threshold) this will increase, by the same relative amounts, both the hit and false-alarm rates. In short, a linear function will result. In the extreme, the subject says "yes" all the time, hence this linear function must go through the point in the upper right-hand corner [$P_N(A) = 1.00$, $P_{SN}(A) = 1.00$]. Thus the threshold prediction for the data is a collection of lines having the upper right-hand corner as the common intercept, and a slope depending upon the detectability of the signal. No linear function which has this value as intercept can fit more than a few of the data points for any one value of the slope. The results of this first experiment, then, seriously conflict with this version of the threshold model and give some measure of support to the decision-theory analysis.
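The linear prediction attributed to the threshold model can be made concrete with a small sketch. It assumes the conventional guessing form, in which a proportion of signal trials exceeds the threshold and every other trial is answered "yes" at a fixed guessing rate; the parameter names `p_detect` and `guess_rate` are hypothetical and the exact parameterization is an assumption, not something stated in the text.

```python
def threshold_roc_point(p_detect, guess_rate):
    """Return (P_N(A), P_SN(A)) for the guessing version of the model.

    Noise trials never exceed the threshold, so "yes" occurs only by
    guessing. Signal trials exceed it with probability p_detect and are
    otherwise guessed at the same rate.
    """
    false_alarm = guess_rate
    hit = p_detect + (1.0 - p_detect) * guess_rate
    return false_alarm, hit

def slope(p_detect):
    """Slope of the predicted line, which depends only on detectability."""
    fa_low, hit_low = threshold_roc_point(p_detect, 0.0)
    fa_high, hit_high = threshold_roc_point(p_detect, 1.0)
    return (hit_high - hit_low) / (fa_high - fa_low)
```

Every such line passes through the upper right-hand corner when the guessing rate is one, and its slope is one minus the detection probability, which is the family of straight lines the curved data of Fig. 2 fail to follow.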
The conflict between some version of the threshold model and the decision analysis has been the subject of considerable experimental effort. There are other experimental results more damaging to the threshold position. These experiments attack the threshold concept directly because they suggest that ordering below the threshold value is indeed possible.¹⁹ We shall drop this conflict and proceed to other questions.
Actual and optimum criterion. Let us now return to the results displayed in Fig. 2 and discuss the question of the optimum criterion. It turns out that if one wishes to select an optimum criterion on likelihood ratio, it is equal to $\beta = P(N)/P(SN)$, where $\beta$ is the criterion value on likelihood ratio and $P(N)$ and $P(SN)$ are the a priori probabilities of noise alone and signal-plus-noise, respectively. We can, of course, obtain a rough measure of the subject's criterion by measuring the slope of the ROC curve at the point nearest the experimental data point. This rough comparison of the optimum and obtained criterion is displayed in Fig. 3. Note that while there is a strong relation between estimated and optimal criterion values, there is also a consistent departure from an exact correspondence. The general trend might be summarized by saying the subjects are conservative; they tend to adopt criteria which are not as different from $\beta = 1$ as they should be. This result is almost an inevitable consequence of the procedure. The way in which expected values change for various criterion levels is the crux of the problem. This topic is discussed in more detail in Appendix A.

Since these earlier investigations, other procedures have been utilized to vary the subject's criterion. One which seems more straightforward and is certainly successful is simply to instruct the subject verbally to adopt different criteria such as lax or very strict, or even to instruct the subject to maintain a certain value for $P_N(A)$.¹⁹
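The relation between the a priori probabilities and the optimum criterion can be sketched in a few lines of Python. The sketch uses the expression $\beta = P(N)/P(SN)$ from the text; the mapping from $\beta$ to a cut-off on the decision axis additionally assumes unit-variance normal distributions with means 0 and `d_prime`, and all function names are hypothetical.

```python
import math


def optimum_beta(p_signal):
    """Optimum likelihood-ratio criterion, beta = P(N) / P(SN)."""
    return (1.0 - p_signal) / p_signal


def likelihood_ratio(x, d_prime):
    """Likelihood ratio at x for noise ~ N(0, 1), signal ~ N(d_prime, 1).

    Under these assumptions it also equals the slope of the ROC curve
    at the point generated by a cut-off placed at x.
    """
    return math.exp(d_prime * x - d_prime ** 2 / 2.0)


def criterion_location(beta, d_prime):
    """Cut-off on the decision axis whose likelihood ratio equals beta."""
    return math.log(beta) / d_prime + d_prime / 2.0


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        b = optimum_beta(p)
        print(f"P(SN)={p:.1f}  beta={b:.3f}  cut-off={criterion_location(b, 1.0):.3f}")
```

A subject who is "conservative" in the sense described above would show estimated values of $\beta$ closer to 1 than `optimum_beta` returns for the same a priori probability.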
¹⁸ W. A. Munson and J. E. Karlin, J. Acoust. Soc. Am., 1956, 26, 542.
¹⁹ J. P. Egan, A. I. Schulman, and G. Z. Greenberg, J. Acoust. Soc. Am., 1959, 31, 768.
Figure 3. Comparison of the optimum and obtained criterion levels (abscissa: optimum $\beta$; ordinate: obtained $\beta$). The obtained criterion level, $\beta$, is the equivalent criterion on likelihood ratio, obtained by assuming normal statistics for both hypotheses. The optimum criterion is equal to $[1 - P(SN)]/P(SN)$, where $P(SN)$ is the a priori probability of the signal.
Let us turn now from the question of the criterion adjustment to another aspect of detection-theory analysis, the measure of detectability, and more specifically, whether or not this measure remains relatively invariant over different experimental procedures. How one can compare different measurements obtained using different experimental procedures is an important question, not only for psychoacousticians but for any scientific enterprise. Let us review the evidence on the extent to which detection-theory analysis has permitted such a comparison. If we make the usual assumption that the distribution of likelihood is normal with equal variance on both hypotheses, as in the situation outlined in the first experiment, then the measure of detectability is d'.

A paper by Swets²⁰ has considered the applicability of this detectability index for yes-no and forced-choice procedures; he has also compared predicted and obtained results using two, three, four, six, and eight alternatives in the forced-choice procedure. In general, these predictions based on d' hold up remarkably well. The worst failure reported seems to be about 1 db; no consistent error trend is evident in the data.
Another method of generating ROC curves, first suggested by Swets et al.,²¹ has been employed. Egan et al.¹⁹ tested and compared this method with the standard yes-no procedure. In the single observation or yes-no procedure, the decision-theory analysis claims that the subject adopts a single criterion and this criterion determines a "yes" or "no" response. The experimenter, then, is employing the subject as a threshold device. Alternatively, the experimenter could have the subject report a number such as likelihood ratio after each observation; from these numbers, the experimenter could construct an ROC curve by placing various criteria on the likelihood ratios reported. The rating procedure is a compromise between these two extremes. The subject in the rating procedure is asked to place each observation in one of several categories;
Measure
and
of
its
²⁰ J. A. Swets, J. Acoust. Soc. Am., 1959, 31, 511.
²¹ J. A. Swets, W. P. Tanner, and T. G. Birdsall, Electronic Defense Group, University of Michigan, Technical Report No. 40, 1955.
the top one being used for sureness of the signal's presence, the next for a lesser degree of sureness, and so forth. ROC curves are subsequently constructed. One can then compare the measure of signal detectability obtained from these two procedures, yes-no and rating. Egan et al.¹⁹ found these two measures differed for his three subjects by 0.3, 0.4, and 0.1 db, differences probably well within the experimental error.
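How ROC curves are "subsequently constructed" from ratings can be shown with a short Python sketch. It assumes the usual cumulative procedure, in which each successive criterion treats one more rating category as a "yes"; the function name and the example counts are hypothetical and are not data from the experiments cited.

```python
def rating_roc(signal_counts, noise_counts):
    """Build ROC points from rating-category counts.

    Both lists are ordered from the "surest signal" category downward.
    Each successive criterion treats one more category as a "yes".
    Returns a list of (false-alarm rate, hit rate) pairs.
    """
    n_signal = sum(signal_counts)
    n_noise = sum(noise_counts)
    points = []
    hits = false_alarms = 0
    # The last category is skipped: including it gives the trivial point (1, 1).
    for s, n in zip(signal_counts[:-1], noise_counts[:-1]):
        hits += s
        false_alarms += n
        points.append((false_alarms / n_noise, hits / n_signal))
    return points


if __name__ == "__main__":
    # Hypothetical counts for four rating categories.
    print(rating_roc([50, 25, 15, 10], [5, 15, 30, 50]))
```

With four categories the subject supplies three ROC points from a single block of trials, whereas the yes-no procedure supplies one point per criterion condition.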
In summary then, we have seen how decision analysis allows one to predict within a fairly wide range of psychoacoustic procedures. The forced-choice procedures using two to eight alternatives and a single-interval procedure using two to four categories of response can be summarized by a single measure of detectability, a measure which, for practical purposes, is invariant.
Implications for psychoacoustic methods. The more traditional methods of psychoacoustics utilize some parameter of the signal such as the threshold energy. This value is obtained by an analysis of the subject's responses. Many of these methods do not allow one to determine directly the subject's criterion and in most methods it is presumed to be constant.

Let us investigate how variation in the subject's criterion, if it occurs, will affect the estimate of the threshold energy. Variation of the subject's criterion affects the false-alarm rate $P_N(A)$. Figure 4 shows how the probability distribution for signal-plus-noise must be varied as the false-alarm rate $P_N(A)$ is changed to maintain a constant value of signal detection $P_{SN}(A)$. We have assumed Gaussian distribution and equal variance to construct the solid line of the figure. The insert displays the essentials of the calculations and shows how a change in $P_N(A)$ from 0.10 to 0.01 necessitates a change in the mean of the signal distribution from 1.3 to 3.1 in order to maintain $P_{SN}(A) = 0.50$. This value of $P_{SN}(A)$ is a reasonable one since it is often used as the estimate of "threshold." Very small values of false-alarm rate were used because most methods control this parameter to the extent of keeping it very low.

Figure 4. Evaluation of how a change in criterion will influence the size of the "threshold" signal. $P_N(A)$ is the false-alarm rate, a "yes" response to no signal. The hit rate, $P_{SN}(A)$, was held constant at 0.5. The mean of the signal distribution was varied (see insert) to achieve this hit rate for various values of $P_N(A)$. The constant, $C$, was chosen so that $10 \log 1.3 + C = 0$.
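The calculation summarized in Fig. 4 can be reproduced with a minimal Python sketch, assuming noise distributed as N(0, 1) and a signal-plus-noise distribution with the same unit variance. The function names are hypothetical. Under these assumptions the required mean comes out at about 1.28 for a false-alarm rate of 0.10, about 2.33 for 0.01, and about 3.09 for 0.001, so the value 3.1 quoted in the text appears to correspond to a false-alarm rate nearer 0.001.

```python
import math
from statistics import NormalDist


def signal_mean_for_hit_rate(false_alarm_rate, hit_rate=0.5):
    """Mean of the signal-plus-noise distribution needed for `hit_rate`.

    Assumes noise ~ N(0, 1) and unit variance on both hypotheses. The
    criterion is first placed to give `false_alarm_rate`; the signal mean
    is then chosen so that `hit_rate` of its area lies above the criterion.
    """
    std = NormalDist()
    criterion = std.inv_cdf(1.0 - false_alarm_rate)
    return criterion + std.inv_cdf(hit_rate)


def shift_in_db(mean, reference_mean=1.3):
    """10 log10 of the ratio of two means (energy-like scaling assumed)."""
    return 10.0 * math.log10(mean / reference_mean)


if __name__ == "__main__":
    for fa in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5):
        m = signal_mean_for_hit_rate(fa)
        print(f"P_N(A)={fa:.0e}  mean={m:.2f}  shift={shift_in_db(m):+.1f} db")
```

The default reference of 1.3 mirrors the caption's choice of the constant $C$, so that the shift is zero db at a false-alarm rate of about 0.10.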
We cannot say generally how this change in the mean of the signal distribution is related to any signal parameter. However, for sinusoidal signals in noise, d' is roughly proportional to signal energy; thus the "estimated threshold" may vary over a 6-db range depending on the criterion of the subject. (In other experiments d' varies with signal voltage hence the range might be 12 db. See Fig. 7 and the discussion.)

This change in the estimated threshold, of say 6 db, will only occur if the subject's criterion changes. One may be willing to assume that it is approximately constant over the course of the experiment.²² Then this number, 6 db, could be interpreted as a tolerable difference in comparing two sets of different measurements. The theory, then, is consistent with the rather wide-spread view in psychoacoustics; namely, that results obtained using different methods should not be expected to show exact congruence. Whether these differences are large enough to warrant concern depends both on the nature of the particular problem and the precision desired.
Decision analysis and speech research. The use of ROC curves and the measure d' has not been limited to detection experiments. Since some confusion has been generated by the multiplicity of d' measures, this issue deserves some attention.

Figure 5 displays an ROC curve taken from a report by Egan.²³ The similarity between this figure and Fig. 2 is apparent, even though measures employed to construct this graph differ greatly. The procedure here is as follows: A word is presented in noise to a listener who writes down the word he thinks was presented. He then checks whether or not he believes this identification response is correct. The conditional probabilities of the receiver saying he was correct on those words where he in fact was, and was not correct, respectively define the ordinate and abscissa of Fig. 5. Egan's ROC curve, then, is constructed from a table of response-response contingencies, rather than from stimulus-response contingencies, as was the ROC curve presented earlier. This difference, from the standpoint of analysis, is by no means trivial. The decision discussed by Egan is really a two-stage process. First, the observer has to select (from several possibilities) the most likely word; second, he must evaluate this decision with respect to all other possibilities. Such a process produces mathematical expressions virtually impossible to evaluate except under the most doubtful set of simplifying assumptions.

This difficulty does not, of course, prevent one from summarizing the data presented in Fig. 5 by a single parameter. The line drawn to the data points is that generated by moving a criterion along two normal deviates of the same variance which differ only in means. This measure was, unfortunately, initially labeled d' because of its analogy to the detection measure. It is unfortunate because the detection measure d' has often been specifically related to physical measurements of signal and noise. No
^^
Obviouslyone
of the order
probabilities
false-alarm
discussed
d'
rate
in the
1, could
-^
Report
J. P.
under
to
is
used
method
trivial. The
difference,from
This
earlier.
presented
10"^. If one
measurable
be used
as
The
because
constant
willing
to
make
10 \
signalenergy
or
this
use
Communication
measure
directly
raise the
must
assumption,
of
other
the
one
techniques
one
cannot
one
necessary
the counterpart of the threshold
Egan,Hearingand
contract, 1957.
is not
value, Py(A) "
previoussection.
then
it is
only assume
can
to
obtain
certain
d', say
energy.
Indiana
Laboratory,
Technical
University,
Theory of Ideal Observers

In the most general sense, an ideal observer is simply a function relating an observation to the likelihood of that observation. Thus we have already specified an ideal observer for our simple example, since Table I accomplishes this task. This is not an interesting example, however, because the observations were already specified in terms of the probabilities under each hypothesis. A more interesting example of an ideal observer arises where the observations are waveforms and where the characteristics of the waveform differ under each hypothesis. The task of the ideal observer is, then, given a waveform, to calculate the likelihood ratio or some monotonic transformation of that quantity.

The ideal observer, strictly speaking, need not make any decisions. If likelihood ratio is computed, the problem of what decision rule to employ is determined by the specific objective in making the decisions. Various possible objectives have been discussed in the previous sections, where it was pointed out that these objectives could be attained by using a decision rule based on likelihood ratio. Although the calculation of likelihood ratio specifies the ideal observer for a given problem, such information is of little value unless we can evaluate this observer's performance. One general method of evaluating the ideal observer's performance is to determine ROC curves, but to obtain an ROC curve we must calculate two probabilities. Thus to evaluate completely the ideal observer we actually have to specify not only how likelihood is calculated but the probability distribution of likelihood ratio on both hypotheses.

Having established the general background of this problem, let us consider a specific example: the ideal observer for conditions of a signal which is known exactly.

Ideal observer for the signal known exactly (SKE). Two hypotheses actually define this special case in which, given a waveform, one must select one of the following hypotheses:

$H_1$: the waveform is a sample of white Gaussian noise $n(t)$ with specified bandwidth ($W$) and noise power density ($N_0$).

$H_2$: the waveform is $n(t)$ plus some specified signal waveform $s(t)$. Everything is known about $s(t)$ if it occurs: its starting time, duration, and phase. It need not be a segment of sine wave, as long as it is specified, i.e., known exactly.

From these two hypotheses we wish to calculate likelihood ratio, and, if possible, derive the probability distribution of likelihood ratio on both hypotheses. Obviously such calculations will be of little use unless the final results can be fairly simply summarized in terms of some simple physical measurement of signal and noise. Happily, such is the case.

We shall not present the derivation here since it is not in itself particularly instructive and can be obtained elsewhere. One assumption of the derivation will, however, be discussed, since an objection to this assumption has been recently raised; an objection which seriously questions the legitimacy of applying this result to any psychoacoustic experiment which has yet been conducted. Unfortunately, the alternative assumption suggested has a different but equally serious flaw.
Representation of the waveform. The assumption concerns the representation of the waveform. In order to compute likelihood ratio, one must associate a probability with each waveform on each hypothesis. Since the waveform is simply a function of time, one must somehow find a set of measures to identify this waveform, and from these measures compute the probabilities. But what exactly is the nature of the representation of the waveform? In order to obtain these measures, we must, of course, make some very specific assumptions about the class of waveforms we will consider. Peterson, Birdsall, and Fox⁷ assumed that the waveforms were Fourier series-band limited. If the waveform is of this class it can be represented by $n = 2WT$ measures, where $W$ is the "bandwidth" of the noise and $T$ is the duration of the waveform. The representation might be in terms of a series of sine and cosine terms. There are many equivalent ways of writing this series to identify the $n$ parameters, but these are all unique, and if the original waveform is indeed Fourier series-band limited, they will reproduce exactly the waveform in the interval $(0, T)$. Accepting this assumption, we find that a monotonic transformation of likelihood ratio (the logarithm) is normal under both hypotheses.
$H_1$: $\log l(x)$ is normal with mean $-E/N_0$, variance $2E/N_0$,

$H_2$: $\log l(x)$ is normal with mean $+E/N_0$, variance $2E/N_0$,

where $d' = \Delta M/\sigma = (2E/N_0)^{1/2}$, $E$ is the signal energy, $\int_0^T [s(t)]^2\,dt$, and $N_0$ is the noise power density. Naturally, if this assumption about the waveform is not made the preceding result is invalid. Mathews and David³⁰ have considered a slightly different assumption. They assumed the waveforms are Fourier integral-band limited. The conclusion resulting from this assumption is that the signal is perfectly detectable in the noise independent of the ratio $E/N_0$ as long as it is not zero. In short, d' is infinite for any nonzero value of $E/N_0$. Which of these assumptions is the more reasonable or applicable to a psychoacoustic experiment?

Neither assumption can be completely justified. In almost all psychoacoustic experiments, the noise voltage is actually produced by a special noise tube. The voltage produced by this tube is amplified and filtered. Such noise is not Fourier series-band limited, for the noise is clearly not periodic.³¹ Although a Fourier series might serve
³⁰ M. V. Mathews and E. E. David, J. Acoust. Soc. Am., 1959, 31, 834(A).
³¹ It is somewhat unfair to imply that Peterson, Birdsall, and Fox assumed the noise was periodic. Their assumption, strictly speaking, was that each waveform could be represented by a finite set of numbers. The way they obtained these numbers is through a sampling plan, which we cannot discuss in detail. It was not a simple Fourier expansion in terms of sine and cosine. This is a difficult and complex topic; for a discussion of the details in this area see footnote 7; D. Slepian, "Some comments on the detection of Gaussian signals in Gaussian noise," Trans. IRE, PGIT-4, 65 (1958); and W. B. Davenport and W. L. Root, Random signals and noise. New York: McGraw-Hill, 1958. Precise analysis of the situation where the noise is filtered, i.e., where the power spectrum of the noise is a polynomial, can be worked out in principle. The analysis is complex and exact answers can be obtained only in certain simple cases. One can show in general, however, that for practical situations the detectability of the signal is finite. (See Davenport and Root.)
as an excellent approximation to these waveforms in the interval $(0, T)$, it would not be an exact representation of the waveform. Similarly, an assumption of a Fourier integral limitation of the bandwidth cannot be correct, because the waveform does not have a sharp cutoff in the Fourier integral sense. If it did, the waveform would be analytic. If it were analytic, the ideal observer could sample at one point in time, obtain all the derivatives at that point, and know the exact form of the wave for all time. Such a result leads to the conclusion that the ideal observer, by observing one sample of the waveform at any time can, in principle, immediately make his decision about all the waveforms the experimenter has presented in the past and all those he may ever decide to produce. This approach is therefore of little practical use.

The issue, while obviously only an academic one, has indicated one very important aspect of the problem. The ideal observer is, like all ideal concepts, only as good as the assumptions that generate it. Clearly, any such idealization of a practical situation is based on certain simplifying assumptions. It is always extremely important to understand what these assumptions are and even more important to realize the implications of a change in these assumptions. In short, there are many ideal observers, each generated by certain key assumptions about the essential nature of the detection task.
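For the Fourier series-band limited case discussed above, the signal-known-exactly result can be turned into numbers with a short Python sketch. It takes $d' = (2E/N_0)^{1/2}$ with $E = \int_0^T [s(t)]^2\,dt$ from the text; the conversion to proportion correct in two-alternative forced choice, $\Phi(d'/\sqrt{2})$, is the standard equal-variance normal result and is an added assumption, as are the function names and the example waveform.

```python
import math
from statistics import NormalDist


def signal_energy(samples, dt):
    """Approximate E = integral of s(t)**2 dt from equally spaced samples."""
    return sum(s * s for s in samples) * dt


def ske_d_prime(energy, noise_power_density):
    """d' = sqrt(2E / N0) for the signal-known-exactly observer."""
    return math.sqrt(2.0 * energy / noise_power_density)


def ske_two_alternative_pc(energy, noise_power_density):
    """Proportion correct in two-alternative forced choice, Phi(d' / sqrt(2))."""
    d = ske_d_prime(energy, noise_power_density)
    return NormalDist().cdf(d / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical unit-amplitude 1000-cps sinusoid lasting 100 ms.
    dt = 1e-5
    s = [math.sin(2 * math.pi * 1000 * i * dt) for i in range(10000)]
    e = signal_energy(s, dt)
    for n0 in (0.1, 0.05, 0.025):
        print(f"E/N0={e / n0:.2f}  d'={ske_d_prime(e, n0):.2f}  "
              f"P(C)={ske_two_alternative_pc(e, n0):.3f}")
```

Only the ratio $E/N_0$ enters the result, which is the sense in which the outcome is summarized by a simple physical measurement of signal and noise.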
For the discussion which follows, we shall assume that the waveforms can be completely represented by a finite number of measurements and use the Peterson, Birdsall, and Fox⁷ approach and their results. A similar treatment is given by Van Meter and Middleton.³²

As more progress is made with the theory of ideal observers we should be able to state quite precisely how detection will vary if certain definite restrictions are imposed on the manner in which the observer operates. Peterson, Birdsall, and Fox have, in fact, considered several such cases. Each case provides us with a framework from which we may evaluate and assess the performance of the subject. Such a comparison provides both qualitative and quantitative guides for further research.³³

There are several areas we might select to illustrate this approach. The one we have selected was chosen because it is a general topic and because it has been somewhat slighted in psychoacoustics.
Shape of the psychophysical function. The psychophysical function is generally defined as the curve relating the percentage of correct detections of the signal (the ordinate) to some physical measure of the signal (the abscissa). If some variant of the constant stimuli method is used, the curve rises monotonically from zero to one hundred percent as the signal level is increased. Generally, hypotheses about the form of this function arise from assumptions about the process of discrimination. Often these assumptions are sufficient to allow one to deduce the form of the psychophysical function to within two or three parameters which are then determined experimentally. Obviously, it is extremely important for the model to specify the exact transformation of the physical stimulus which is used as the abscissa of the psychophysical function; without such specification, the theory is incomplete.
In psychoacoustics, there has been comparatively little concern with the form of this function. Most theories of the auditory process have been content with attempting to predict only one parameter of the psychophysical curve, usually the mean or threshold. As a result, it is nearly impossible to obtain from the literature information on the actual form of the psychophysical function.

³² D. Van Meter and D. Middleton, Trans. IRE, 1954, PGIT-4, 119.
³³ W. P. Tanner and T. G. Birdsall, J. Acoust. Soc. Am., 1958, 30, 922.
The notable exception to the preceding statement is the neural-quantum hypothesis.³⁴ The authors of this theory say that it "enables us to predict the form and the slope of certain psychometric functions." It can be demonstrated from the model that the form of the function should be linear and this linear function is specified to within one parameter. The physical measure is never mentioned in the derivation of the theory and we find only after the data are presented that sound pressure and frequency are the appropriate physical measures. The authors remark in their paper that "strictly speaking, data yielding rectilinear psychometric functions when plotted against sound pressure do not show absolute rectilinearity when expressed in terms of sound energy, but calculation shows that the departure from rectilinearity is negligible." It is certainly true that pressure, pressure squared, and indeed pressure cubed, are all nearly linear for small values of pressure but that is not entirely the point.

It is the location of this function on the physical scale that plays a crucial role in the theory. If the subject employs a two-quantum criterion then, according to the theory, the psychophysical function must be zero up to one quantum unit, show a linear increase to one hundred percent at two quantum units, and maintain this level for more quantum units. Where the curve breaks from zero percent reports and where it reaches one hundred percent reports is precisely specified by the theory. In general, if the subject requires $n$ quanta to produce a positive report, the increasing linear function must extend from $n$ to $n + 1$ quantum units. Now clearly, what appears to be a two-quantum subject (0% at one pressure unit, 100% at two pressure units), when the data are plotted in pressure units, cannot be interpreted as a two-quantum subject in energy units. In fact, he cannot be interpreted as an any-number-of-quantum subject. This is true no matter how small the values of pressure.

This criticism of the rather post hoc treatment of the physical scale is by no means limited to the neural-quantum hypothesis. Many hypotheses about the shape of the psychophysical function, including some formulations of the Gaussian hypothesis, neglect this rather crucial factor.
Detection theory stands in marked contrast with these theories. Models based on the ideal observer concept predict the form of the psychophysical function exactly. The proper physical dimensions are completely specified and there are no free parameters. Obviously, one would not be surprised to find human observers somewhat less than optimum, but hopefully, the shape of the psychophysical function might at least be parallel to that obtained from the model. Often however, the obtained psychophysical function does not parallel that predicted by the model and this discrepancy deserves some discussion.
Signal uncertainty and ideal detectors.³⁵ In Fig. 6, we have plotted the percentage of correct detections in a two-alternative forced-choice procedure versus $E/N_0$ for a series of mathematical models. The problem in all cases is simply to detect a sinusoidal signal added to a background of white noise. The obtained data are from a typical subject. We say "typical subject" because the shape of this function is remarkably invariant both over subjects and a range of physical parameters. For signal durations of 10 to 1000 msec³⁶ and signal frequencies from 250 to 4000 cps,³⁷ there appears to be no great change in the shape of the function when plotted against the scale shown in Fig. 6. Naturally, the exact location of the curve depends on the exact physical parameters of the signal, but except for this constant, which is a simple additive constant in logarithmic form, the shape is remarkably stable. The striking aspect of this function is its slope. We notice the slope of the observed function is steeper than most of the theoretical functions depicted in Fig. 6.

Figure 6. The theoretical psychophysical functions for the ideal observer detecting 1 of M-orthogonal signals. The parameter is the number of possible orthogonal signals. The ideal detector need only detect the signal, not identify it. The abscissa is ten times the logarithm of signal energy to noise-power density. The ordinate is the percent correct in a two-alternative forced-choice test. The obtained data are compared with the theoretical function shifted about 10 db to the right. (Inset labels: 40 spl; frequency: 1000 cps; duration: 100 ms.)

³⁴ S. S. Stevens, C. T. Morgan, and J. Volkmann, Am. J. Psychol., 1941, 54, 315.
³⁵ The analysis of signal uncertainty data from the viewpoint of detection analysis is very similar to some ideas expressed by Dr. W. P. Tanner. Although several of the details differ, the essentials are the same. The author is indebted to Dr. Tanner for many long and lively conversations on this topic.
The class of theoretical functions is generated by assuming the detector has various uncertainties about the exact nature of the signal.³⁸ Each function is generated by assuming the detector knows only that the signal will be one of M-orthogonal signals. If the signal is known exactly ($M = 1$) there is no uncertainty. For sinusoidal signals, the nature of the uncertainty might be phase, time of occurrence of the signal, or signal frequency. The degree of uncertainty is reflected by this parameter $M$. As uncertainty increases, the psychophysical function increases in slope. It therefore appears that there may exist a model with sufficient uncertainty about the signal to generate a function which is very similar to that displayed by the human observer. Accepting for the moment the assumption that the extreme slope of the human observer's psychophysical function is due to some degree of uncertainty about the signal, we might try to manipulate this slope by various experimental procedures.

³⁶ D. M. Green, J. Acoust. Soc. Am., 1959, 31, 836(A).
³⁷ D. M. Green, M. J. McKey, and J. C. R. Licklider, J. Acoust. Soc. Am., 1959, 31, 1446.
³⁸ The details of this model may be found in footnote 7, p. 207. This particular model was selected because it has been presented in the literature. There are other models which assume signal uncertainty but which differ in details about the decision rule. The psychophysical functions produced by these models are similar to those displayed in Fig. 6, although the value of the parameter ($M$) would be changed somewhat.
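The effect of signal uncertainty on a detector can be explored with a small Monte Carlo sketch in Python. This is not the model cited in the text. It is a simplified stand-in that assumes each of the M orthogonal signals contributes one unit-variance normal observation, that exactly one of them has mean `d` when the signal is present, and that the observer picks the interval with the larger averaged likelihood ratio. All names and parameter values are hypothetical.

```python
import math
import random


def likelihood_ratio(observations, d):
    """Average likelihood ratio over M equally likely orthogonal signals.

    Each observation is assumed to be a unit-variance normal variable whose
    mean is d if it corresponds to the signal and 0 otherwise.
    """
    m = len(observations)
    return sum(math.exp(d * x - d * d / 2.0) for x in observations) / m


def simulate_two_alternative(d, m, trials=20000, seed=1):
    """Estimate proportion correct for detecting 1 of M orthogonal signals."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        noise_interval = [rng.gauss(0.0, 1.0) for _ in range(m)]
        signal_interval = [rng.gauss(0.0, 1.0) for _ in range(m)]
        signal_interval[rng.randrange(m)] += d  # one channel carries the signal
        if likelihood_ratio(signal_interval, d) > likelihood_ratio(noise_interval, d):
            correct += 1
    return correct / trials


if __name__ == "__main__":
    for m in (1, 4, 16, 64):
        row = [simulate_two_alternative(d, m, trials=5000) for d in (1.0, 2.0, 3.0, 4.0)]
        print(m, [round(p, 2) for p in row])
```

With $M = 1$ the sketch reduces to the signal-known-exactly case, and printing the table for larger `m` gives a rough way to see how the function changes with added uncertainty under these simplified assumptions.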
Preview technique. One general class of procedures would attempt to reduce the uncertainty by supplying the missing information through some form of cueing or preview technique. If, for example, the observer is uncertain about the frequency of the signal we might attempt to reduce this uncertainty by presenting the signal briefly at a high level just prior to the observation interval. Similarly, if the time of occurrence of the signal is uncertain we might increase the noise during the observation interval. If the noise was increased for all trials, whether or not the signal was presented, it would provide no information about the signal's presence but would convey direct information about the signal's starting time and duration. Both of these techniques have been utilized with only partial success. While it is impossible to assert that there was no change (the null hypothesis) the amount of change was very small, although in the proper direction.³⁹
Another class of procedures which has been utilized to attempt to reduce the subject's uncertainty about the signal parameters involves changing the detection task so that some information is directly supplied. The procedures are like the preceding but actually include the information in the observation interval. For example, to remove frequency uncertainty, we might add a continuous sine wave to the noise. The continuous sine wave is adjusted to a level such that it is clearly evident in the noise. The signal is an increment added to this sine wave and the task is to detect this increment. The procedure definitely changes the slope of the subject's psychophysical function; it becomes less steep and the signal is easier to detect.⁴⁰

This procedure of making the signal an increment to a continuous sine wave provides good frequency information but does not remove temporal uncertainty.
Another procedure which minimizes practically all uncertainty is in fact a modification of a standard procedure used to investigate the j.n.d. for intensity. A two-alternative forced-choice procedure is employed. Two gated sinusoids occur in noise, one at standard level, the other at this level plus an increment. The subject's task is to select the interval containing the increment. If the standard signal is adjusted to a power level about equal to the noise-power density, the psychophysical function actually parallels that expected for the signal-known-exactly case.⁴¹ It is from 3 to 6 db off optimum in absolute value, depending on the energy of the standard. (See Fig. 7. Note the change in scale between Figs. 6 and 7.)

Let us, at least tentatively, accept as the conclusion of these last results that the shape of the psychophysical function is in fact due primarily to various uncertainties about the signal parameter. If this is true, then we still have the problem of explaining the lack of success evidenced when the previous techniques were employed. Should not a preview of the signal, preceding an observation, serve to reduce frequency uncertainty? The answer might be that such procedures do reduce uncertainty, but not enough relative to the uncertainty still remaining. From Fig. 6 we note that, as we introduce signal uncertainty, the slope of the psychophysical function increases very rapidly for small changes in uncertainty; then, as the uncertainty increases, the slope approaches some asymptotic value. A change in uncertainty in $M$ from 256 to 64 may hardly affect the psychophysical function. This fact also probably explains why the psychophysical functions do not appear to change very much for a variety of signal parameters, such as signal duration and signal frequency. Undoubtedly, as the signal duration increases, the uncertainty about the time of occurrence of the signal is reduced. Due to the large initial uncertainty, this change is too small to be detected in the data.

³⁹ Unpublished work of the author. Also see T. Marill, Ph.D. thesis, Massachusetts Institute of Technology, 1956, and J. C. R. Licklider and G. H. Flanagan, "On a methodological problem in audiometry," unpublished.
⁴⁰ W. P. Tanner, J. Bigelow, and D. M. Green, unpublished.
⁴¹ W. P. Tanner, Electronic Defense Group, University of Michigan, Technical Report No. 47, 1958.
of checkingthis general
signalfrequency.Still another manner
model is to vary the uncertainty of the signal and determine how this affects the subject's performance. One might, for example, select several different sinusoidal signals and select one at random as the signal used on a particular trial. The subject is simply asked to detect a signal, not identify it. Depending on the frequency separation and the number of signals used, one can directly manipulate signal uncertainty.
[Figure: observed data in the ΔI versus I experiment. Ordinate: percent correct (0.50 to 1.00). Abscissa: level in db; the caption notes a change in abscissa scale relative to Figure 6. The plotted lines refer to the signal-known-exactly observer (M = 1).]
psychophysical functions obtained with this type of signal are also slightly steeper than those predicted by the model.^'' Either partial time uncertainty still remains or signal uncertainty alone is not a sufficient explanation. The author feels that a better model would assume that the human observer utilizes some nonlinear detection rule. This assumption, coupled with the uncertainty explanation, could probably explain most of the results obtained thus far. The mathematical analysis of such devices is, however, complex.
Internal noise. Before summarizing, one final point must be considered. Often it is a temptation to invoke the concept of internal or neural noise when discussing a discrepancy between an ideal model and the human observer. There are good reasons for avoiding this temptation. While it would take us too far afield to cover this point in detail, the following remarks will illustrate the point.
Only if the model is of a particularly simple form can one hope to evaluate the specific effects of the assumption of internal noise. The signal-known-exactly observer is of this type. Here one can show how a specific type of internal noise can simply be treated as adding noise at the input of the detection device. Thus one can evaluate the psychophysical function and it will be shifted to the right by some number of decibels due to the internal noise. But, of course, such an assumption can immediately be rejected (see Fig. 6) since no shift in the psychophysical function can account for the data displayed in the figure.

With more complicated models, it is usually difficult to say exactly what internal noise will do. While it will obviously lower discrimination, the specific effects of the assumption are often impossible to evaluate. Unless these specific effects can be evaluated, the assumption simply rephrases the original problem of the discrepancy.
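The rightward shift described for the signal-known-exactly case can be illustrated numerically. The sketch below is not taken from the paper: it assumes the internal noise is independent of the external noise and adds to its power density at the input, and it uses the standard ideal-observer relations d' = sqrt(2E/N0) and two-alternative forced-choice percent correct = Φ(d'/√2). The function names are hypothetical.

```python
import math
from statistics import NormalDist


def internal_noise_shift_db(external_noise_density, internal_noise_density):
    """Decibel shift of the psychophysical function when internal noise
    (assumed independent) is added to the external noise at the input."""
    if external_noise_density <= 0 or internal_noise_density < 0:
        raise ValueError("noise densities must be positive / non-negative")
    total = external_noise_density + internal_noise_density
    return 10.0 * math.log10(total / external_noise_density)


def percent_correct_2afc(signal_energy, noise_density):
    """Percent correct of a signal-known-exactly ideal observer in
    two-alternative forced choice (assumed model, see lead-in)."""
    d_prime = math.sqrt(2.0 * signal_energy / noise_density)
    return NormalDist().cdf(d_prime / math.sqrt(2.0))


# Internal noise equal in density to the external noise: a shift of about 3 db.
shift = internal_noise_shift_db(1.0, 1.0)
energy = 4.0
with_internal = percent_correct_2afc(energy, 2.0)
# The same performance is reached without internal noise at an energy
# lowered by the shift, i.e. the whole curve is merely displaced.
displaced = percent_correct_2afc(energy / 10 ** (shift / 10.0), 1.0)
```

Under these assumptions internal noise only translates the curve along the decibel axis and leaves its shape unchanged, which is why a pure shift cannot account for a difference in slope.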
I am not suggesting that the human observer is perfect in any sense, nor attempting to minimize the importance of the concept of internal noise. What I am emphasizing is that the concept must be used with great care. If the concept is to have any importance it must be made specific. This implies that we have to (1) state exactly what this noise is, i.e., we have to characterize it mathematically, (2) specify in what way it interacts with the detection or discrimination process, and (3) evaluate specifically what effect it will have on performance. Unless these steps can be carried out the ad hoc nature of the assumption vitiates its usefulness.

Summary and Conclusion

The main emphasis in this paper has been to explain detection theory and to illustrate how such a theory has been applied to certain areas of psychoacoustics. This method of analysis is simply one of many that are currently being used in an attempt to understand the process of hearing.
Two main aspects of this approach have been distinguished. The first, decision theory, emphasizes that the subject's criterion as well as the physical properties of the stimulus play a major role in determining the subject's responses. The theory indicates both the class of variables which determines the level of the criterion, and, more importantly, suggests an analytic technique for removing this source of variation. This technique leaves a relatively pure measure of the detectability of the signal. The invariance of this measure over several psychophysical procedures has already been demonstrated.

*' D. M. Green, J. Acoust. Soc. Am., 1960, 32, 121.
M.
DAVID
65
GREEN
1.00
0.80
Si 0.60
1 0.40
O0.20
0.00
0.20
0.40
0.60
Figure
The
normalized
curve
based
on
expectedvalue as a function
the data presented
in Figures
2
to
The
detail. The
second
aspect,the theoryof
usefulness of such
function.
psychophysical
model
even
The
No
of
discussed in
illustrated by considering
the form
some
of the
providesa completeor comprehensive
that
psychoacoustics
hypothesesand
we
have discussed in this
standard
which experimental
against
It is too early to attempt any complete evaluation of this approach. The mathematical models are relatively new and the application of these models to a sensory process began with Tanner and Swets'* only about five years ago. There remain many problems to be solved of both a mathematical and experimental nature. As more progress is made in both areas, the theory should become more specific and concrete, then perhaps it will be able to interact more directly with the research from several other areas in psychoacoustics.
paper.
providesa
source
of
curve.
ideal observers,has also been
analysis
areas
changesin criterion. This is a theoretical

3. The appendixliststhe assumptions
used
the
ideal observer
for the rather limited

model
and
was
an
1.00
of
construct
0.80
Appendix A

The inherent difficulty of comparing the optimum criterion value and that employed by the subject is the shape of the expected-value function. Let us investigate in detail a typical situation. We have assumed that the distribution on likelihood ratio is normal under both hypotheses, that the mean separation is one sigma unit, and that the values and costs of the various decision alternatives are all the same. From these assumptions we have constructed Fig. 9. This figure shows how the expected value varies with changes in a priori probability of signal P(SN) and false-alarm rate $P_N(A)$. We see immediately that for extreme values of a priori probability, e.g., P(SN) = 0.10, the difference between optimum expected-value behavior [$P_N(A)$ = 0.004] and
"
W.
P. Tanner
and
J. A. Swets,
Rev., 1954, 61,

Psychol.
401.
66
READINGS
tends
force
to
to
the
the
On
to
instructed
are
0.50
Thus
will
any
correspondence
June
23,
achieve
attempt
1960.
one
if
least
at
to
3 %.
strategies
moderate
see
we
90%
of
and
in
moderate
more
any
the
maximum
any
in
curves
the
of
of
criteria
than
Since
for
were
most
are
in
employed
a
this
tions.
condi-
extreme
within
P^(A)
expected
more
figure
experiments,
P^(A)
probabilities
value
the
maximum.
psychoacoustic
priori
optimum
the
of
values
that
in
fact,
In
location
the
see
investigate,
obtained
PSYCHOLOGY
than
to
pure
more
0.50],
less
adopt
to
hand,
between
is
allow
avoid
subject
other
0.000]
to
[e.g., P(SN)
experiment
Received
exaggerated
subjects
0.15
[P^(A)
strategy
pure
somewhat
MATHEMATICAL
IN
range
the
from
payoff.
correlational
appears
extremely
sense,
the
difficult.
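The flatness of the expected-value function discussed in this appendix can be explored with a short computation. This is only a sketch under the appendix's stated assumptions (normal distributions under both hypotheses, a mean separation of one sigma unit, all values and costs equal); the decision axis, the function names, and the choice of unit values and costs are illustrative assumptions rather than the author's own computation.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def rates(criterion, d_prime=1.0):
    """Hit and false-alarm rates for a cut at `criterion` on a decision
    axis where noise is N(0, 1) and signal-plus-noise is N(d_prime, 1)."""
    false_alarm = 1.0 - STD_NORMAL.cdf(criterion)
    hit = 1.0 - STD_NORMAL.cdf(criterion - d_prime)
    return hit, false_alarm


def expected_value(criterion, p_signal, d_prime=1.0, value=1.0, cost=1.0):
    """Expected payoff per trial when every correct decision earns `value`
    and every error loses `cost` (all equal, as the appendix assumes)."""
    hit, false_alarm = rates(criterion, d_prime)
    signal_part = p_signal * (value * hit - cost * (1.0 - hit))
    noise_part = (1.0 - p_signal) * (value * (1.0 - false_alarm) - cost * false_alarm)
    return signal_part + noise_part


def optimum_criterion(p_signal, d_prime=1.0):
    """Cut that maximizes expected value when values and costs are equal:
    the likelihood ratio at the cut equals P(N) / P(SN)."""
    return d_prime / 2.0 + log((1.0 - p_signal) / p_signal) / d_prime


best_cut = optimum_criterion(0.10)
best_hit, best_false_alarm = rates(best_cut)
```

With P(SN) = 0.10 the optimum false-alarm rate comes out at about 0.0035, in line with the value of 0.004 quoted in the appendix. Evaluating `expected_value` over a range of criteria for P(SN) = 0.50 shows how little the payoff changes near the optimum.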
SOME COMMENTS AND CORRECTION OF "PSYCHOACOUSTICS AND DETECTION THEORY"*

David M. Green
Department of Economics and Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts

Dr. S. S. Stevens has very kindly pointed out two items in my paper, "Psychoacoustics and Detection Theory,"¹ that require further comment in order to avoid misunderstanding.

I called the function relating the percentage of correct detection responses to the physical intensity of the stimulus the psychophysical function. It is true that this function is more often called the psychometric function, a term probably introduced by Urban in 1908.² Originally Fechner added up successive just-noticeable-differences (jnd's) to determine the relation between the magnitude of sensation and the physical intensity of the stimulus. The resulting relation is commonly called the psychophysical function. Since Fechner's time many other techniques for determining this relation have been devised and the results are also called psychophysical functions (e.g., Stevens' power law³). The newer methods do not involve determining jnd's and are not obtained by using any simple variant of the classical methods of psychophysics. We are therefore faced with the anomaly that psychometric functions are obtained by using psychophysical methods and psychophysical functions are now determined by other, different techniques. Personally I find the designation frequency-of-seeing curve used in vision even more distasteful than the term psychometric function. Some change in terminology would be most welcome. I am open for suggestions.

* From J. Acoust. Soc. Amer., 1961, 33, 965. Reprinted with permission. The preparation of this letter was supported by the U.S. Army Signal Corps, the Air Force (Operational Applications Office and Office of Scientific Research), and the Office of Naval Research. This is Technical Note No. ESD TN 61-56.
¹ D. M. Green, J. Acoust. Soc. Am., 1960, 32, 1189.
² F. M. Urban, The application of statistical methods to the problems of psychophysics, Philadelphia: Psychological Clinic Press, 1908, p. 107.
³ S. S. Stevens, Psychol. Rev., 1957, 64, 153.

The second item is more crucial and concerns my remarks about the neural-quantum theory. I asserted that data that appear to indicate a two-quantum observer when plotted against pressure units cannot be interpreted as any kind of quantum observer when plotted against energy units. There is, however, a very straightforward interpretation of the scales of pressure and energy that makes this assertion incorrect. Unfortunately, this interpretation had never occurred to me, and I thereby did injustice to the authors of the neural-quantum theory. Let me explain this interpretation and the scale of pressure and energy units that I had in mind when I made my remarks.

In the neural-quantum procedure we have a continuous sinusoidal stimulus (call it the standard). At specific times we briefly increase the amplitude of this sinusoid and the observer's task is to detect these increments. If we measure the pressure of the standard, call it $p$, and the pressure of the standard plus the increment, call it $p + \Delta p$, then by subtracting the former from the latter we obtain values of $\Delta p$ on the pressure scale. We may call this quantity $\Delta p$ the increment of pressure. Similarly, if we measure the power of the standard, a quantity proportional to $p^2$, and the power of the standard plus the increment, a quantity proportional to $(p + \Delta p)^2$, we might subtract the former from the latter, and obtain (since the constants of proportionality are exactly the same) the quantity $(p^2 + 2\Delta p\,p + \Delta p^2 - p^2)$, or $(2\Delta p\,p + \Delta p^2)$. The latter quantity is also proportional to energy, since the increment is of constant duration, and we may call this quantity the increment of energy. The important result is that these two quantities, the increment in pressure and the increment in energy, are nearly linear for values of $\Delta p$ much less than $p$. If some data are exactly consistent with the predictions of the neural quantum theory on one scale, they would very nearly be consistent on the other scale.
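The near-linearity of the two increment scales can be checked with a few lines of arithmetic. The sketch below takes the constants of proportionality to be 1 and uses hypothetical function names; it only restates the algebra above.

```python
def energy_increment(p, dp):
    """Increment of energy: power of standard plus increment minus power
    of the standard, i.e. (p + dp)**2 - p**2 = 2*p*dp + dp**2."""
    return (p + dp) ** 2 - p ** 2


def departure_from_linearity(p, dp):
    """Ratio of the energy increment to its linear part 2*p*dp.
    Algebraically this equals 1 + dp / (2*p), so it is close to 1
    whenever dp is much less than p."""
    return energy_increment(p, dp) / (2.0 * p * dp)


standard = 1.0
small = departure_from_linearity(standard, 0.01)   # dp much less than p
large = departure_from_linearity(standard, 1.0)    # dp comparable to p
```

For an increment of 1 percent of the standard pressure the energy increment departs from strict proportionality to $\Delta p$ by only half a percent, whereas for an increment equal to the standard the departure is 50 percent.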
When I made my remarks, I had in mind data plotted on a scale of signal pressure or signal energy. By signal I mean the waveform added to the standard that the observers are asked to detect. In this terminology, the pressure of the signal is proportional to $\Delta p$ and the energy of the signal is proportional to that quantity squared, $\Delta p^2$. Only data plotted on a scale of signal pressure as I have now defined it are in agreement with the predictions of neural quantum theory. Part of the reason for my oversight undoubtedly arose from the fact that this measure of signal energy $\Delta p^2$ is the quantity I used in presenting some of the data reported later in my paper. There is, however, no inherent reason for using my particular measure of the stimulus and I should have made my reference clear.

In some cases the two different scales of energy obtained from the pressure scale would be the same. This would happen if the standard and signal are incoherent; that is, if the middle term in the square of $(\Delta p + p)$ is zero. An example of this would be an increment in white noise. In the case at hand, this is not true and the quantity that I have called increment in energy and the quantity that I called signal energy are quite different.

The general point I was trying to make is that the neural-quantum theory does not specify in advance how the physical stimulus should be measured. It was my position that it is important for a theory of psychophysics to specify how the physical scale is related to the expected psychological results. This position is apparently not widely endorsed. I am particularly impressed with the number of theories that suggest that the psychometric function is Gaussian, log-Gaussian, Poisson, rectilinear, or logistic, but cannot specify in advance what particular transformation of the physical scale will yield these results. It is not hard to envision different circumstances in which all these assertions are true at least in the sense that deviations are within the range of experimental error. Somehow there never seems to be any resolution to these different findings.

One can, of course, simply ignore all this and go on measuring only one arbitrary parameter of the psychometric function such as the "threshold" value. While this position obviously has the merit of convenience, it would also appear important to demonstrate how all of these different results might come about from one single general theory. To accomplish the latter task one must have a theory which carefully specifies the physical part of the psychophysical theory.

Received April 14, 1961.
ON THE POSSIBLE PSYCHOPHYSICAL LAWS¹

R. DUNCAN LUCE
Harvard University

¹ This work has been supported in part by Grant M-2293 from the National Institute of Mental Health and in part by Grant NSF-G 5544 from the National Science Foundation. Ward Edwards, E. H. Galanter, Frederick Mosteller, Frank Restle, S. S. Stevens, and Warren Torgerson have kindly given me their thoughtful comments on drafts of this paper, many of which are incorporated into this version. I am particularly indebted to S. S. Stevens for his very detailed substantive and stylistic criticisms of the last two drafts. This article appeared in Psychol. Rev., 1959, 66, 81-95. Reprinted with permission.

This paper is concerned with the century-old effort to determine the functional relations that hold between subjective continua and the physical continua that are presumed to underlie them. The first, and easily the most influential, attempt to specify the possible relations was made by Fechner. It rests upon the empirical knowledge of how discrimination varies with intensity along the physical continuum and upon the assumption that jnd's are subjectively equal throughout the continuum. When, for example, discrimination is proportional to intensity (Weber's law), Fechner claimed that the equal-jnd assumption leads to a logarithmic relation (Fechner's law).

This idea has always been subject to controversy, but recent attacks upon it have been particularly severe. At the theoretical level, Luce and Edwards (1958) have pointed out that Fechner's mathematical reasoning was not sound. Among other things, his assumption is not sufficient to generate an interval scale. By recasting his problem somewhat, essentially by replacing the equal-jnd assumption with the somewhat stronger condition that "equally often noticed differences are equal, except when always or never noticed," they were able to show what interval scale results, and to present a mathematical expression for it. Their work has no practical import when Weber's law, or its linear generalization $\Delta x = ax + b$, is true, because the logarithm is still the solution, but their jnd scale differs from Fechner's integral when Weber's law is replaced by some other function relating stimulus jnd's to intensity.

At the empirical level, Stevens (1956, 1957) has argued that jnd's are unequal in subjective size on intensive, or what he calls prothetic, continua (a contention supported by considerable data) and that the relation between the subjective and physical continua is the power function $\alpha x^{\beta}$, not the logarithm. Using such "direct" methods as magnitude estimation and ratio production, he and others (Stevens: 1956, 1957; Stevens and Galanter, 1957) have accumulated considerable evidence to buttress the empirical generality of the power function. Were it not for the fact that some psychophysicists are uneasy about these methods, which seem to rest heavily upon our experience with the number system, the point would seem to be established. In an effort to bypass these objections, Stevens (1959) has recently had subjects match values between pairs of continua, and he finds that the resulting relations are power functions whose exponents can be predicted from the magnitude scales of the separate variables. Thus, although much remains to be learned about the "direct" methods of scaling, the resulting power functions appear to summarize an interesting body of data.

Given these empirical results, one is challenged to develop a suitable formal theory from which they can be shown to follow. There can be little doubt that, as a starting point, certain commonly made assumptions are inappropriate: equality of jnd's, equally often noticed differences, and Thurstone's equal variance assumption. Since, however, differences stand in the same logarithmic relation to ratios as Fechner's law does to the power function, a reasonable starting point might seem to be the assumption that the subjective ratio of stimuli one jnd apart is a constant independent of the stimulus intensity. Obvious as this may seem, in my opinion it will not do. Although generations of psychologists have managed to convince themselves that the equal jnd assumption is plausible, if not obvious, it is not and never has been particularly compelling; and in this respect, an equal-ratio assumption is not much different. This is not to deny that subjective continua may have the equal-ratio property (they must if the power law is correct and Weber's law holds) but rather to argue that such an assumption is too special to be acceptable as a basic axiom in a deductive theory.

Elsewhere (Luce, in press), I have suggested another approach. An axiom, or possible law, of wide applicability in the study of choice behavior, may be taken in conjunction with the linear generalization of Weber's law to demonstrate the existence of a scale that is a power function of the physical continuum. Although that theory leads to what appears to be the correct form, it is open to two criticisms. First, the exponent predicted from discrimination data is at least an order of magnitude larger than that obtained by direct scaling methods. Second, the theory is based upon assumptions about discriminability, and these are not obviously relevant to a scale determined by another method. Scales of apparent magnitude may be related to jnd scales, but it would be unwise to take it for granted that they are.

The purpose of this paper is to outline still another approach to the problem, one that is not subject to the last criticism. The results have applicability far beyond the bounds of psychophysics, for they concern the general question of the relation between measurement and substantive theories.

Types of Scales

Although familiarity may by now have dulled our sense of its importance, Stevens' (1946, 1951) stress upon the transformation groups that leave certain specified scale properties invariant must, I think, be considered one of the more striking contributions to the discussion of measurement in the past few decades. Prior to his work, most writers had put extreme emphasis upon the property of "additivity," which is a characteristic of much physical measurement (Cohen & Nagel, 1934). It was held that this property is fundamental to scientific measurement and,
involves relations among two or more variables. In practice, substantive theories are usually stated in terms of functional relations among the scales that result from the several measurement theories for the variables involved. For a number of purposes, the scale type is much more crucial than the details of the measurement theory from which the scale is derived.

For example, much attention has been paid to the limitations that the scale type places upon the statistics one may sensibly employ. If the interpretation of a particular statistic or statistical test is altered when admissible scale transformations are applied, then our substantive conclusions will depend upon which arbitrary representation of the scale we have used in making our calculations. Most scientists, when they understand the problem, feel that they should shun such statistics and rely only upon those that exhibit the appropriate invariances for the scale type at hand. Both the geometric and arithmetic means are legitimate in this sense for ratio scales (unit arbitrary), only the latter is legitimate for interval scales (unit and zero arbitrary), and neither for ordinal scales. For fuller discussions, see Stevens: 1946, 1951, 1955; for a somewhat less strict interpretation of the conclusions, see Mosteller, 1958.

A second place where the transformation group imposes limitations is in the construction of substantive theories. These limitations seem to have received far less attention than the statistical questions, even though they are undoubtedly more fundamental. The remainder of the paper will attempt to formulate the relation between scale types and functional laws, and to answer the question what psychophysical laws are possible. As already pointed out, these issues have scientific relevance beyond psychophysics.

A Principle of Theory Construction

In physics one finds at least two classes of basic assumptions: specific empirical laws, such as the universal law of gravitation or Ohm's law, and a priori principles of theory construction, such as the requirement that the laws of mechanics should be invariant under uniform translations and rotations of the coordinate system. Other laws, such as the conservation of energy, seem to have changed from the empirical to the a priori category during the development of physics. In psychology more stress has been put on the discovery of empirical laws than on the formulation of guiding principles, and the search for empirical relations tends to be pursued without the benefit of explicit statements about what is and is not an acceptable theory.²

Since such principles have been used effectively in physics to limit the possible physical laws, one wonders whether something similar may not be possible in psychology. Without such principles, practically any relation is a priori possible, and the correct one is difficult to pin down by empirical means because of the ever present errors of observation. The error problem is particularly acute in the behavioral sciences. On the other hand, if a priori considerations about what constitutes an acceptable theory limit us to some rather small set of possible laws, then fairly crude observations may sometimes suffice to decide which law actually obtains.

The principle to be suggested appears to be a generalization of one used in physics. It may be stated as follows.

² Two attempts to introduce and use such statements in behavioral problems are the combining of classes condition in stochastic learning theory (Bush, Mosteller, & Thompson, 1954) and some work on the form of the utility function for money which is based upon the demand that certain game theory solutions should remain unchanged when a constant sum of money is added to all the payoffs (Kemeny & Thompson, 1957). In neither case do the conditions seem particularly compelling.
A substantive theory relating two or more variables and the measurement theories for these variables should be such that:

1. (Consistency of substantive and measurement theories) Admissible transformations of one or more of the independent variables shall lead, via the substantive theory, only to admissible transformations of the dependent variables.

2. (Invariance of the substantive theory) Except for the numerical values of parameters that reflect the effect on the dependent variables of admissible transformations of the independent variables, the mathematical structure of the substantive theory shall be independent of admissible transformations of the independent variables.

In this principle, and in what follows, the terms independent and dependent variables are used only to distinguish the variables to which arbitrary, admissible transformations are imposed from those for which the transformations are determined by the substantive theory. As will be seen, in some cases the labeling is truly arbitrary in the sense that the substantive theory can be written so that any variable appears in either the dependent or independent role, but in other cases there is a true asymmetry in the sense that some variables must be dependent and others independent if any substantive theory relates them at all.

One can hardly question the consistency part of the principle. If an admissible transformation of an independent variable leads to an inadmissible transformation of a dependent variable, then one is simply saying that the strictures imposed by the measurement theories are incompatible with those imposed by the substantive theory. Such a logical inconsistency must, I think, be interpreted as meaning that something is amiss in the total theoretical structure.

The invariance part is more subtle and controversial. It asserts that we should be able to state the substantive laws of the field without reference to the particular scales that are used to measure the variables. For example, we want to be able to say that Ohm's law states that voltage is proportional to the product of resistance and current without specifying the units that are used to measure voltage, resistance, or current. Put another way, we do not want to have one law when one set of units is used and another when a different set of units is used. Although this seems plausible, there are examples from physics that can be viewed as a particular sort of violation of Part 2; however, let us postpone the discussion of these until some of the consequences of the principle as stated have been derived.

The meaning of the principle may be clarified by examples that violate it. Suppose it is claimed that two ratio scales are related by a logarithmic law. An admissible transformation of the independent variable is multiplication by a positive constant $k$, i.e., a change of unit. However, the fact that $\log kx = \log k + \log x$ means that an inadmissible transformation, namely, a change of zero, is effected on the dependent variable. Hence, the logarithm fails to meet the consistency requirement. Next, consider an exponential law, then the transformation leads to $e^{kx} = (e^{x})^{k}$. This can be viewed either as a violation of consistency or of invariance. If the law is exponential, then the dependent variable is raised to a power, which is inconsistent with its being a ratio scale. Alternatively, the dependent variable may be taken to be a ratio scale, but then the law is not invariant because it is an exponential raised to a power that depends upon the unit of the independent variable.
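The logarithmic example can be made concrete with a small numerical check. The sketch below is illustrative only: the helper names, the sample values, and the particular power law used for contrast are assumptions. It asks, for a candidate law, whether a change of unit $x \to kx$ acts on the dependent variable as a multiplication (admissible for a ratio scale) or as an addition (a change of zero, which is not).

```python
import math


def is_constant(values, rel_tol=1e-9):
    """True when all values agree with the first one up to rel_tol."""
    first = values[0]
    return all(math.isclose(v, first, rel_tol=rel_tol) for v in values)


def unit_change_effect(law, k, xs):
    """Effect on the dependent variable of replacing x by k*x,
    reported both as ratios and as differences at each sample x."""
    ratios = [law(k * x) / law(x) for x in xs]
    shifts = [law(k * x) - law(x) for x in xs]
    return ratios, shifts


def log_law(x):
    return math.log(x)


def power_law(x, a=2.0, beta=0.3):
    return a * x ** beta


xs = [2.0, 5.0, 10.0, 50.0]   # sample values of the independent variable
k = 3.0                        # change of unit

log_ratios, log_shifts = unit_change_effect(log_law, k, xs)
pow_ratios, pow_shifts = unit_change_effect(power_law, k, xs)
```

For the logarithmic law the differences are constant (equal to $\log k$) while the ratios vary with $x$, so the change of unit appears as a change of zero. For the power law the ratios are constant (equal to $k^{\beta}$), so the change of unit appears as a change of unit.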
APPLICA.TION
An
in
PRINCIPLE
THE
OF
of the
tering
enphysicalmeasures
into psychophysicsare idealized
physicaltheories in such a way that
Most
time
interval
or
durations
measured
are
on
i.e.,
kx", where
name
applied
logarithmicinterval scales.
Because
this topic is more
general
than psychophysics,I shall refer to
the variables as independent and dependent
rather than
assumed
Of course, differences and

of interval scale values
time
and
having
"
Although
on
attempted
scales that
either ratio
are
the
variable,where
often
assumption
and
the
has
estimation
Examples:
"law
lead
of
to
argued
methods
conditions
from
to
I have
should
us
the
data.
about
should
unit
of
the
sult
re-
law
is to say,
That
be unaffected.
dependent
depend upon k, but it

depend upon x, so we denote
matical
it by K{k). Casting this into mathemay
terms,
ratio
we
the
obtain
tional
func-
equation
Our
the relations
pendent
de-
of the functional
the form
changed
the
tion
multiplica-
shall not
ficient
given sufa
of
positiveconstant,
and
interval
that nitude
magresult in
derive
discrimination
tell
by
parative
com-
question here, however, is not how

in
have succeeded
well psychologists
other,
perfectingscales of one type or anbut what a knowledge of scale
among
In
unit of the
variable,namely
variable
belief); and
plausible
can
variables
both
transformation
difference
ory
theratio scales (but no measurement
has been offered in support of this
types
known
un-
have
closely related
of Thurstone's
scales; Stevens
or
noticed
judgment"
scale
is the
ratio scales. If the
form
pendent
de-
the
relatingthem.
law
Suppose, first,that
u{x)"
of
independent variable is changed by

multiplying all values by a positive
arrive at
constant
k, then according to the
interval,
missible
above
stated
only an adprinciple
to
former.
preferably
the equally
V
sidered
con-
of the
measurement
psychological
theories have
Case
be
ordinal,those who
to be
worked
best
at
and
corresponding value
stitute
con-
Let
point.
one
variable
functional
can
be
continua
typical value
independent
rivatives
dethe
psychologicalscales
most
use
chological.
psy-
will
numerical
than
more
denote
ratio scales.
in current
physical and
variables
form
to
interval
on
0. The
"
scale goes
scale,since the transformed
will consider
into c log X + log k. We
all combinations
of ratio,interval,and
measured
0 and
this scale type reflects

the fact that log x is an interval
ratio
physical time (not

ordinarytemperature,
durations),
are
^ "
to
Both
scales,and
entropy
scales.
by positive constants
multiplications
and raisingto positivepowers,
are
and
length, pressure,
Mass,
scales.
ratio
either
form
they
interval scales (Stevens,1957). In this

the admissible
transformations
case
not
depends upon
that
variable
ratio scale,but
invariant because
be
to
PSYCHOLOGY
MATHEMATICAL
IN
u{kx)
where
Q and
k"
cases
are
K{k)
arrived at in
They
The
K{k)u{x)
"
0.
equations for
Functional
scales.
are
summarized
questionis : What
the other
similar
ner.
man-
in Table
to these two
common
functional equations,each of which
is
interest
there
of
some
scales,
types
the principle,
in what have been called logarithmic embodies
imply about
addition
do these nine
TABLE
Functional
The
Equations
Principle
of m?
form
the
consideration
to
continuous,
of
X.
Theorem
shall limit
We
theories
for
our
where
u{k)u(x)/u{l).Lety
log["/"(!)],
then
function
nonconstant
v{kx)
log[_u{kx)/u{l)^
u(k)u{x)
If the independentand
1.
the
Construction
is
Satisfying
Laws
the
Theory
of
75
LUCE
DUNCAN
R.
"
"^m(1)"(1)
dependentcontinua are both ratio scales,

then u (x)
ax^,where /3is independent
of the units of both variables.^
=
Set:x;
Proof.
u{k)
1 in
constant
we
Equation
Because
K{k)u{l).
choose
may
non-
that
so
0, and because K(k)

follows that m(1) " 0,soK{k)
M (1) Thus, Equation 1 becomes
u{k)
"
0, it
"
u(k)/
u
In
the
in the
this and
statement
can
continuous, so is v, and it is
that the only continuous
well known
Since
is
solutions to the last functional
(kx)
tion
equa-
of the form
are
following theorems,
general*if
more
v{x)
/3log jc
log x^
be made
replacedby x + 7, where 7 is a constant

unit as
independent of x but having the same
of
The effect of this is to place the zero
X.
at some
u
point different from the zero of x.
be reIn psychophysics the constant
garded
7 may
log [m(x)/m(1)]
v(k) -{-v(x)
is
1, then
is
as
such
the
constant
The
threshold.
of
u(x)
where
of course, that a plot

will not in general be a
ae'^'^
ax^
u{l).
means,
of
log M vs. log X

straight line. If, however,
variable
presence
Thus,
is measured
in
terms
the
We
observe
that
since
independent
u{kx)
of deviations
become
the threshold, the plot may
straight. Such nonlinear plots have been observed, and in at least some instances the degree of curvature seems to be correlated with the magnitude of the threshold. Further empirical work is needed to see whether this is a correct explanation of the curvature.
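The effect described in this footnote can be seen in a small numerical sketch. It assumes a law of the form $u(x) = \alpha (x + \gamma)^{\beta}$ with arbitrarily chosen constants (the values of `alpha`, `beta`, `gamma`, and the sample points are illustrative assumptions). The local slope of $\log u$ against $\log x$ varies from point to point, whereas the slope against the logarithm of the deviation from the threshold, $\log(x + \gamma)$, is constant and equal to $\beta$.

```python
import math

def loglog_slopes(xs, us):
    """Slopes of log(u) against log(x) between successive sample points."""
    return [
        (math.log(u2) - math.log(u1)) / (math.log(x2) - math.log(x1))
        for x1, x2, u1, u2 in zip(xs, xs[1:], us, us[1:])
    ]

# Illustrative constants: gamma = -2 places the zero of u at x = 2 (a "threshold").
alpha, beta, gamma = 1.0, 0.5, -2.0
xs = [3.0, 4.0, 6.0, 10.0, 20.0, 50.0]
us = [alpha * (x + gamma) ** beta for x in xs]

# Plot against the raw independent variable: the slope changes, so the plot is curved.
raw_slopes = loglog_slopes(xs, us)

# Plot against deviations from the threshold: the slope is beta everywhere.
deviation_slopes = loglog_slopes([x + gamma for x in xs], us)

print([round(s, 3) for s in raw_slopes])
print([round(s, 3) for s in deviation_slopes])
```

The curvature in the first list is largest near the threshold and fades as $x$ grows, which is the qualitative pattern the footnote refers to.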
ak^x^
a'x^
from
/3is independent of the unit of

it is
of
clearly independent
of
x,
the
and
unit
u.
Theorem
is
tinuum
If the independent conratio scale and the depend-
2.
a
76
READINGS
continuum
ent
either
(x)
IN
MATHEMATICAL
interval scale, then

log x -\-^, where a is
an
PSYCHOLOGY
It is easy
to
that 5 is
see
of the unit of
and
jS is
independent
independent
pendent of both units.

independent of the unit of the indeax^ + 8,
A much
variable,or u(x)
rem
simpler proof of this theowhere /8 is independent of the units of
be
if
can
that u
given we assume
is dififerentiable in addition to being
both variables and 5 is independent of
the unit of the independentvariable.
continuous. Since the derivative of an
interval scale is a ratio scale,it follows
Proof.
In solvingEquation 2, there
=
immediately that
du/dx satisfies
to consider.
possibilities
1. li K{k)
e".
1, then define v
Equation 1 and so, by Theorem
1,
dx
2
becomes
v (^x)
D(k)v(x),
Equation
c^.
we
e^^''^" 0 and v is conIntegrating,
where D(k)
get
tinuous,
and nonconstant
cause
bepositive,
u is.
By Theorem
l,v(x) Sx",
x^+i + 5 if /8 5^
1
u{x)
where a is independent of the unit of
j |8+ 1
where 5 " 0 because, by definiI a logx+S
tion,
X and
if fl
1
0.
V "
Taking logarithms,u(x)
3. If the independentconTheorem
tinuum
a log X+/8, where
0= log 5.
two
are
K{k) ^ I, then
2. U
be two
let
different solutions to the
and
define
u*
lem,
probIt follows
u.
"
is
u*
and
immediately from Equation 2

must
satisfythe functional
equation w{kx)
K{k)w(x). Since
that
both
and
u*
K(k) ^ I,
be
it is clear
impossiblesince
to be different.
and
that
the
0, and
u*
Since
constant.
solution isw
constant
w(x)
continuous, so is w;
are
however, it may
only
the
ent
depend-
"
units of both variables and 8 is independent

of the unit of the independent
dependent
variable,or u(x)
ax^, where /8is inthe
both
variables.
units of
of
=
this is
Proof.
Take
3 and
Thus, by Theorem
Substitutingthis into the

functional equation for w, it follows
that K{k)
".
Then
0
settingx
in Equation 2, we obtain C{k)
w(0)
observe
that
k^). We
now
X(l
ax^ -\-8, where 8
u(x)
u(0), is a
solution to Equation 2 :
=
the
let v
1,
ax^.
is
interval
logarithmic
where a
Se'"^,
scale,then either u{x)
is independent of the unit of the dependent
is
variable,
^
independentofthe
chosen
were
ratio scale and
continuum
vikx)
logarithm of Equation
log u :
K*ik)
K*ik)
where
Cik)v(x)
logK(k).
rem
By Theo-
2, either
"
u(kx)
v(x)
ax^ -\-8*
v(x)
13log x -{-a*
either
Taking exponentials,
ak^x^-\-8
u(x)
a"x^+u(0)k^-\-uiO)-u(P)k^
kû(x)+u{0)il-k^)
where
8
a
8e'"^
or
u{x)
ax^
"** and, in the second
tion,
equa-
e"*.
K(k)uix)-\-C(k)
Theorem
Any
or
other solution is of the
same
because
u*{x)
u(x) + w(x)
ax^ -\-8 -{-ax^
(a-\-a)x^ +
form
interval scale,then it is
an
impossible
for the dependent continuum
to be
tinuum
If the independentcon-
4.
is
Proof.
by
ratio scale.
Let
Theorem
Equation 4, then
ax^.
know
u{x)
0 in
we
DUNCAN
R.
Now
set
1 and
in
5^ 0
tion
Equa-
77
LUCE
Proof.
Take
6 and
a(x + cy
K{l,c)axP
the
let v
v{kx + c)
tion
logarithm of Equalog u:
K*{k,c) +
C(k,c)v{x)
SO
-^
where
K(l,cy/^x
K*{k,c)
Theorem
which
impHes
to
have
Theorem
than
more
trary
con-
that
v(x)
both
point.
one
continua
scales,then u{x)
-\-^, where
ax
independent of the
interval
both
are
unit
of
the
u(x)
where
pendent
inde-
Theorem
5 reduces
7.
is
let
we
e"*.
0, then
tion
Equa-
Equation 2 and so
Theorem
2 applies. If "(x)
a log x
\ and c 7^ 0
-\-jS,then choosing k
in Equation 5 yields
it is
tinuum
If the independent coninterval
a
scale,
logarithmic
impossiblefor the dependent
continuum
to be
Proof.
Let
c) + (S
By taking the derivative with respect
log x)
X(l,c)/3+ C(l,c) Thus, log X

ratio
u{x), i.e.,
v{y)
7 becomes
K{k,c)u{\ogx)
K{\,c)a logx
to
u(logx)
w(e^),then Equation
Z'(log^ +
log (x +
ratio scale.
to
ae^^
/3is
then
If
/3x-fa*
so
variable.
Proof.
By
5,
If the independentand
5.
dependent
constant,
assumption
our
continua
is
log K{k,c).
it is easy to see that x must

be
which
is
impossible.
constant,
conclude
that u{x)
Thus, we must
ax^ -f (S. Again, set k
1 and
is an
interval scale and
scale,which
by Theorem
is
4 is
impossible.
X,
5^
0,
Theorem
tinuum
If the independent coninterval
scale
is a logarithmic
and the dependent continuum
is an
terval
in-f
scale, then u(x)
x
log
a
/3,
where a is independent of the unit of the
independent variable.
8.
a(x -f cY
i^(l,c)ax"
X(l,c)|3+
-f
C(1,C)
Proof.
If 5 ?^ 1, then
to
differentiate with
spect
re-
v(\og x)
Let
Equation
V
8 becomes
(logk +
log x)
K{l,c)a5x^-'
K(k,c)v {logx) +
must
that
is
impliesx
conclude
m(x)
ox
1.
constant,
It is easy
so
we
to
see
so
j8satisfiesEquation
5.
log X
and
By Theorem
is
6.
If the
an
interval
independent
scale
and
tinuum
con-
both
are
interval
C{k,c)
scales.
5,
u(x)
Theorem
then
aS(x + cy-'
which
u{x),
i'(logx)
log X -f /3
the
Theorem
9.
If the independent and
logarithmic
both logarithmic
continua
where
are
dependent
ae^",
u(x)
is
pendent interval scales,then u{x)
ax^, where
a
independent of the unit of the indevariable and
/S is independent /3is independent of the units of both the
of the unit of the dependent variable.
independent and dependent variables.
dependent
continuum
interval scale, then
is
78
READINGS
Proof.
Take
9 and
PSYCHOLOGY
MATHEMATICAL
logarithm of Equation
:
log u
the
let
IN
form
course,
As
v(kx')
K*{k,c)
where
Theorem
C{k,c)vix)
K*ik,c) +
log K(k,c).
By
v(x)
+
iSlog:*;
a"
example
an
u(x)
""(*'
ax^
used to
is
interval
an
added
constraint
with
of
source
No
of them.
some
examples
idealized
are
or
as
attempt
have
The
variables
law
gravitation
in each
are
no
lomb's
Cou-
law, and Newton's

all ratio scales,and
Additional
1.
1
of the law
can
is
of
of
on
square
scales,and
as
we
can
illustrate Theorem
constant
then
relation
mass
its energy
variables
If
is moving at
If-the temperature
constant, then as
of
as
they will
body of
velocityv,
is of the form
av^ +
discussed
from
matical
mathe-
not
are
8.
of sure
a
presp the entropy of the gas is of the
between
that it is
new
"
the
may
cases.
specific
to challenge
and
physics,such
of radioactive
law
ables
vari-
two
empirical,not
theorems, they have

from
seem
be in
this view
support
an
to ascertain what
matter
theoretical,
cited
as
amples
ex-
the
ponential
ex-
decay
or
sinusoidal function of time, which

stated
violate the theorems
to
We
above.
therefore,examine
these examples bypass
must,
in which
the ways
the rather
strong conclusions of
the present theory.
have
All physicalexamples which
been
suggested
perfect gas is
function
hold
can
and
are
anticipate
therefore
2.
such
I have
which
"
had
strong misgivings about
is that
interpretation
; the feeling
substantive
of
nature
a
something
have
been
must
smuggled into the
formulation
of the problem. They
functional
that practically
argue
any
To
interval
form
whom
have
ratio
its side
variables
entropy
dependent
with
point of view
some
important
and
energy
that
Some
the function
illustrations.
Other
one.
these theorems
since
the
area
rem
Theo-
dependency of the
sphere upon its radius or
scales; thus
of the
in geometry
volume
are
and
length, area,
volume
examples of
be found
is
Discussion
rem
function,as called for by Theo-
power
scale, as
their
seems
entering into
the form
case
to
scales, because
law. Ohm's
is the linear
concerninglogarithmic
of scales of this type

been made.
temperature
interval
an
tion
usuallyassumed, then the only rela5
possibleaccording to Theorem
interval
use
If the
any
given
freezing
a
con-
will be made
illustrate the results

interval
is classical
of water.
scale is also
of the fundamental
most
either ratio
tinua that form
to
choose
may
point
variables
actual
we
Illustrations
physics,where
scales.
(subjectto the
that the length is
initial length to correspond to

such
the
as
temperature,
that accord
best
The
we
temperature
measure
C*.
be useful,priorto discussing
It may
these results,to cite a few familiar
laws
scale
since
positive),
where
Theorem
ordinary temperature,
which is frequentlymeasured
in terms
of the length of a column
of mercury.
forms a
Although lengthas a measure
ratio scale,the length of a column
of
mercury
so
of
of
consider
may
8,
logp -\-/8. No examples,

4.
are
possiblefor Theorem
to
form
common
is
to
the
:
the
me
as
theorems
examples
counter-
have
able
independent vari-
ratio scale, but
it enters
into
80
READINGS
IN
PSYCHOLOGY
MATHEMATICAL
tion may
reflect the actual state
not
of affairs in the empiricalworld.
It
and
that
they
related
tion
by a functhat is nonnegative, nonconu
is certainly
true that, in detail,cal
but
physi- stant, and monotonic
increasing,
mathematical
continua
c
ontinuous.
not
We
not
are
necessarily
now
need
that u cannot
be
continua, and there is ample reason
only show
to
that
suspect
the
holds
same
psychologicalvariables.
But
that stimuli and

both
continua
form
are
the
for
bounded
idealizations
difficult to
are
but
it is doubtful
be of much
would
are
tinuity
disconvariable.
if this
helpby itself. The
solutions to, say, Equation

manifold and extremelywild
M
.
this
"(^) ^
They are so wild

that it is difficultto say anything precise
about
them
at all (see Hamel,
1905; Jones: 1942a, 1942b),and it is
in their behavior.
T^/Lx,
,.
implies w
0, con-
assumption. Thus, for all

1, which by Equation
u(kx)
1, means
u(x), for all x and
1.
This in turn
^"
implies " is a
which
again is contrary to
constant,
have
lished
estabassumption. Thus, we
to
trary
k"
1, K(k)
that such solutions represent

empiricallaws.
doubtful
Second, casual observation
the
the
discontinuous
1
that
exist in
Suppose, therefore,that it is bounded

and that the bound
is M.
By Equation
1, u{kx)
K{k)u(x)" M, so
For k"\, the monu (x)"
M/Kik).
otonicity of u implies that u{x)
" u {kx)
K(Ji)u{x) so choosingu {x)
"0
1.
that K{k)"
If for
we
see
"
^
some
1, K{k) " 1, then K can be
made
arbitrarily
large since, for any
integern, K^k")
Kik)"",but since
responses
give up; to do
would
mean
so
casting out much
tively,
of psychophysical theory. Alternacould drop the demand
that
we
them
be
the function
tinuous,
conrelating
that
show
to
must
sumptions
as-
are
suggests
that it might be
appropriateto assume
claim that some
our
that at least the dependent variable is
reside in the
must
bounded, e.g., that there is a psychologicallythe variables.
maximum
be
though
Al-
loudness.
boundedness
plausible,
two
absolute
scale
with
or
in the formulation
tinuity
discon-
some
of the
lem,
prob-
of the
possibly in the nature
function
variables or possiblyin the
tablish
Actually, one can esrelatingthem.
that it must
of
the
variables.
form
we
hold all but
in
our
quency
intensityand freUsually
loudness.
variable constant
one
but
empirical investigations,
the fact remains

that
their presence
difference in the
and
there
make
some
of
range
variables, x
and
y,
ratio scales and
form
total
For
two
are
are
may
example,
independent
possiblelaws.
there
suppose
the others
that
both
of which
that the
ent
depend-
ratio scale,
of
then the analogue
Equation 1 is
variable
is also
u(kx,hy)
K{k,h)u(x,y)
be in the nature
Suppose,
contrary, that the variables

scales that
determine
appropriatefunctional
if the functions
equations are unbounded
be
must
a
s
they
are
increasing,
It seems
clear
for empiricalreasons.
of the dependent
that boundedness
variable is intimatelytied up either
with introducinga reference level so
that the independent variable is an
more
or
theorems, all the continuous
solutions to the
situations,there are
independent variables;
example, both
for
of
nature
many
cannot
imposed by itselfsince,as is shown
in the
Third, in
tinuity
discon-
numerical
on
are
the
ratio
continua
where
We
hold
k "
know
one
0, h
"
0, and
by Theorem
variable,say
K(k,h)
1 that
y, fixed at
0.
"
if
we
some
R.
value
and
must
be of the form
leth=
DUNCAN
1, then the solution
81
LUCE
the
dimensions
variables.
of
This
the
independent
be parto
appears
ticularly
if the dependent
appropriate
u{x,y)
a(j)x^^"^
variable has
The
But
holding
1,
also
we
and
constant
know
that
letting
it must
be
tinua
they
u's
to
partialderivatives of both
having
the
solutions only of the form
{x,y)
Thus, the
admit
ax^y'-^"^
than
when
we
be
must
that
emphasized
in Footnote
If a function
here.
3 does
not
u{x,y)
that
+ tCv)]^^"^
a(y)\ix
theorems.
The
first
can
be
rejectionof Part
ratio scale
dimensional
numerical
this
In
constants.
of the form
functions
which
provides
use
it involves
constants
is
the
to
arrive
is either
for
theories exist
being obtained.
ratio
scale,
the
not
of
For
tend
interval
there
is
these
methods
Stevens
terval
logarithmicin-
or
interval
an
results
matchings
from
to
scale.
cross-modality
the logarithmic
as
a possibility,
eliminate
scale
presumptive evidence
has
that
yield ratio scales,
as
claimed.
Summary
The
at
following problem
What
of
result of the present work.
chophysical
psy-
methods
least indirect evidence
at
of
the presence
that cancel out
of this argument
of the
by
measurement
no
2 of the
factory
satismuch
more
u{x,y) seems
and
the heuristic
convincing than
development given in Section 2.C of Luce (in
press), and the empirical suggestions given
there should gain correspondingly in interest
a
may
of the psychophysical
is determined
e.xcept for
viewed
or as the creation of a dimenprinciple

sionless independent variable from a
form
we
of scales
the form
functions
some
Since
In sum,
there appear
to be two ways
around
the restrictions set forth in the
as
known,
physical
psycho-
that
types
Once
being obtained.
solution possibilities
wholly new
(see Section 2.C.3 of Luce pn
The
ment
measure-
example, the magnitude methods

seem
result in power
to
which
functions,
ure
suggests that the psychological meas-
rather
what
the type of scale
press])
the
laws, but
in order
for certain
apply
other, e.g.,
as
mine
deter-
empirically testable
the
depends upon
independent variable is added to
either
of the
the
to
not
meantime, however, experimental determinati
variable.*
remark
that
argue
independent
one
can
theories for the several

know
stricts
principleagain severely re-
more
one
methods
the possiblelaws, even
then
hand, if the theorems
the forms
'"8
that
to create
are
exist
other
important question is
variables,
this equation can

be shown
tion
(seeSec2.C.2 of Luce [inpress])to have
the
that
bounded.
are
limited
restrict ourselves
we
one
con-
assume
applicable,then the possible psychophysical

(and other) laws become
severely limited. Indeed, they are so
a(y)x^^''^ 8{x)y'^'^
It
numerical
as
are
Thus,
If
tion
reject the idealiza-
and, possibly, to
On
8(x)y''-^
is to
of the variables
of the form
u{x,y)
true, well-defined bound.
second
of
substantive
dependent
manner
Each
to
are
theory
variable
an
variable
the
in
was
sidered.
con-
possibleforms
that
a
relates
continuous
independent variable?
is idealized
as
nu-
82
IN
READINGS
TABLE
Possible Laws
The
""
The
a/x
notation
Satisfying
the
and is restricted
merical continuum
Principle
variable
that
are
to
admissible under
theory shall
measurement
Theory
Construction
REFERENCES
by
beingeither
logarithmic
a
ratio,an
a
of theory
interval scale. As a principle
construction,it is suggested that
of the independent
transformations
theory
interval,or
its measurement
of
of the unit of x.'
"a is independent
means
PSYCHOLOGY
MATHEMATICAL
its
F., " Thompson,

for multiplechoice situations. In R. M. Thrall, C. H.
Coombs, " R. L. Davis (Eds.), Decision
York:
Wiley, 1954. Pp.
processes. New
Bush,
formal
structure
99-126.
tion
R., " Nagel, E. An introducNew
method.
logic and scientific
Harcourt, Brace, 1934.
M.
Cohen,
to
result
not
in inadmissible transformations of the

and
dependent variable (consistency)
R., Mosteller,
R.
G. L.
York:
Hamel, G. Eine Basis aller Zahlen und die

unstetigen Losungen der Funktionalgleition
Math.
that the form of the functional rela/(x + y)=/(jr) +/(3;).
chung:
variables shall
Annalen. 1905, 60, 459-462.
the two
between
and disconnected
Connected
formation Jones, F. B.
be altered by admissible transnot
functional
the
equation f{x)
and
sets
plane
of the independent variable
Math.
Bull. Amer.
+ fiy) =fix + y).
limits sigThis principle
nificantly
(invariance).
Soc, 1942, 48, 115-120. (a)
and other properties
the possiblelaws relating Jones, F. B. Measure
Math. Soc,
Bttll.Amer.
basis.
of
Hamel
2.
Table
a
in
shown
the two continua, as
472-481.
(b)
48,
1942,
portant
These results do not hold in two imThe
G. L.
J. G., 8l Thompson,
Kemeny,
First, if the
circumstances.
comes
attitudes on the outpsychological
In M. Dresher, A. W.
of games.
Tucker, " P. Wolfe (Eds.), Contributions
Princeton:
III.
to the theory of games.
effect of
is
independent variable
that
is
at
variables
upon
the
by
having
pendent
those of the inde-
Princeton
Univer.
Press, 1957.
Pp. 273-
298.
how
one
wishes
to
Second, if the
matter.
tinuous,
discrete rather than conif the functional relation is
are
or
discontinuous, then
those
constant
prin- Luce, R. D. Individual choice behavior: A

variable,then either the ciple
it is violated,
York:
or
Wiley,
content
no
theoretical analysis. New
depending
look
ratio scale
dimensionless
rendered
multiplying it by
to
units reciprocal
has
given in Table
than
laws
other
possible.
are
in press.
Luce, R. D.,
of
"
Edwards,
differences.
The
W.
tion
deriva-
able
just noticePsychol. Rev., 1958, 65,
scales
subjective
from
222-237.
M0STEI.LER, F.
corpus.
The
mystery
of the missing
Psychometrika,1958, 23, 279-289.
R.
S.
S.
Stevens,
On
S.
S.
Stevens,
and
the
1946,
New
.sensory
S.
Stevens,
1956,
S.
Stevens,
subjective
S.
S.
S.
estimation
Thurstone,
loudness.
of
Anter.
J.
the
1957,
psychophysical
Galanter,
scales
loudness,
L.
vibration.
E.
for
exp.
L.
law
Psychol.
Rev.,
J.,
theory
of
(2nd
Univer.
validation
/.
Neumann,
VON
law.
153-181.
Cross-modality
for
"
H.
Ratio
dozen
Psychol.,
ceptual
per-
1957,
of
comparative
1927.
34.
273-
286.
The
64,
1959,
377-411.
54,
data.
1-25.
On
scales
1-49.
Psychol.,
exp.
category
continua.
"
69,
Rev.,
Psychol.
of
S.,
and
judgment.
direct
The
magnitudes
Psychol.,
scales
113-116.
121,
S.
Pp.
/.
shock.
S.
Stevens,
Stevens
averaging
the
electric
201-209.
57,
ogy.
psychol-
1951.
Wiley,
On
1955,
Science,
Stevens,
S.
and
677-680.
S.
experimental
of
York:
S.
S.
83
LUCE
of
scales
measurement
In
Handbook
Stevens,
103,
Mathematics,
psychophysics.
(Ed.),
of
theory
Science,
measurement.
DUNCAN
Press,
"
Morgenster.n-,
and
games
Princeton:
ed.)
1947.
of
(Received
December
2,
1958;
economic
Princeton
O.
havior.
be-
MULTIVARIATE INFORMATION TRANSMISSION*†

William J. McGill
MASSACHUSETTS INSTITUTE OF TECHNOLOGY

A multivariate analysis based on transmitted information is presented. It is shown that sample transmitted information provides a simple method for measuring and testing association in multi-dimensional contingency tables. Relations with analysis of variance are pointed out, and statistical tests are described.

Several recent articles in the psychological journals have shown how ideas derived from communication theory are being applied in psychology. It is not widely understood, however, that the tools made available by communication theory are useful for analyzing data whether or not we believe that the human organism is best described as a communications system. This paper will present an extension of Shannon's measure of transmitted information (10). It will be shown that transmitted information leads to a simple multivariate analysis of contingency data, and to appropriate statistical tests.

1. Basic Definitions

Let us consider a communication channel and its input and output. Transmitted information measures the amount of association between the input and output of the channel. If input and output are perfectly correlated, all the input information is transmitted. On the other hand, if input and output are independent, no information is transmitted. Naturally most cases of information transmission are found between these extremes. There is some uncertainty at the receiver about what was sent. Some information is transmitted and some does not get through. We are not interested in what the information is, but in the amount of information transmitted.

Suppose that we have a discrete input variable, $x$, and a discrete output variable, $y$. Since $x$ is discrete, it takes on discrete values or signals $k = 1, 2, 3, \ldots, X$ with probabilities indicated by $p(k)$. Similarly, $y$ assumes values $m = 1, 2, 3, \ldots, Y$ with probabilities $p(m)$. If it happens that $k$ is sent and $m$ is received, we can speak of the joint input-output event $(k,m)$. This joint event has probability $p(k,m)$.

*This work was supported in part by the Air Force Human Factors Operations Research Laboratories, and in part jointly by the Army, Navy, and Air Force under contract with the Massachusetts Institute of Technology.

†Several of the indices and tests discussed in this paper have been developed independently by J. E. Keith Smith (11) at the University of Michigan, and by W. R. Garner at Johns Hopkins University.

This article appeared in Psychometrika, 1954, 19, 97-116. Reprinted with permission.
The rules governing the selection of signals at either end of the channel must be constructed so that

$$\sum_{k=1}^{X} p(k) = \sum_{m=1}^{Y} p(m) = \sum_{k,m} p(k,m) = 1.$$

Under these conditions, assuming successive signals are independent, the amount of information transmitted in "bits" per signal is defined as

$$T(x;y) = H(x) + H(y) - H(x,y), \qquad (1)$$

where

$$H(x) = -\sum_{k} p(k) \log_2 p(k),$$

$$H(y) = -\sum_{m} p(m) \log_2 p(m),$$

$$H(x,y) = -\sum_{k,m} p(k,m) \log_2 p(k,m).$$
One "bit" is equal to $-\log_2(\tfrac{1}{2})$ and represents the information conveyed by a choice between two equally probable alternatives. Our development will use the bit as a unit, since this is the convention in information theory, but any convenient unit may be substituted by changing the base of the logarithm.

If there is a relation between $x$ and $y$, $H(x) + H(y) > H(x,y)$ and the size of the inequality is just $T(x;y)$. On the other hand, if $x$ and $y$ are independent, $H(x,y) = H(x) + H(y)$ and $T(x;y)$ is zero. It can be shown that $T(x;y)$ is never negative.

The presentation to this point has been an outline of the properties of the measure of transmitted information as set forth by Shannon (10). These properties may be summarized by stating that the amount of information transmitted is a bivariate, positive quantity that measures the association between input and output of a channel. There are, however, very few restrictions on how a channel may be defined. The input-output relations that occur in many psychological contexts are certainly possible channels. Consequently we can measure transmitted information in these contexts and anticipate that the results will be interesting.
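A minimal sketch of definition (1) follows, assuming the joint probabilities $p(k,m)$ are supplied as a nested list indexed by input and output. The function names and the two small example channels are illustrative assumptions, not part of the original paper. The examples reproduce the two limiting cases described above: perfectly correlated input and output, and independent input and output.

```python
import math

def entropy(probs):
    """H = -sum p*log2(p) in bits; zero-probability events contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def transmitted_information(joint):
    """T(x;y) = H(x) + H(y) - H(x,y) for a joint table joint[k][m] = p(k,m)."""
    p_x = [sum(row) for row in joint]          # input marginal p(k)
    p_y = [sum(col) for col in zip(*joint)]    # output marginal p(m)
    p_xy = [p for row in joint for p in row]   # joint probabilities p(k,m)
    return entropy(p_x) + entropy(p_y) - entropy(p_xy)

# Two equally likely signals, always received correctly.
perfect = [[0.5, 0.0],
           [0.0, 0.5]]

# Output unrelated to input.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

print(transmitted_information(perfect))      # 1.0 bit: all input information gets through
print(transmitted_information(independent))  # 0.0 bits: nothing is transmitted
```

Changing `math.log2` to another logarithm changes only the unit, as noted in the text.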

2. Sample Information

Our development will be based on sample measures of information, i.e., on measures of information constructed from relative frequencies. Suppose that we make $n$ observations of events $(k,m)$. We identify $n_{km}$ as the number of times that $k$ was sent and $m$ was received. This means that

$$n_k = \sum_{m} n_{km}, \qquad n_m = \sum_{k} n_{km}, \qquad n = \sum_{k,m} n_{km},$$

where $n_k$ is the number of times that $k$ was sent, $n_m$ is the number of times that $m$ was received, and $n$ is the total number of observations. A particular experiment can then be represented by a contingency table with $XY$ cells and entries $n_{km}$. We may estimate the probabilities, $p(k)$, $p(m)$, and $p(k,m)$ with $n_k/n$, $n_m/n$, and $n_{km}/n$, respectively. Sample transmitted information, $T'(x;y)$, is defined as

$$T'(x;y) = H'(x) + H'(y) - H'(x,y), \qquad (2)$$

where $H'(x)$, $H'(y)$ and $H'(x,y)$ are constructed from relative frequencies instead of from probabilities. [Throughout the paper a prime is used over a quantity to indicate the maximum likelihood estimator of the same quantity without the prime, e.g., $T'(x;y)$ is an estimator for $T(x;y)$.] As before, $T'(x;y)$ is the amount of transmitted information (in the sample) measured in "bits" per signal.

Since it is difficult to manipulate logs of relative frequencies, we will introduce an easier notation:

$$s_{km} = \frac{1}{n} \sum_{k,m} n_{km} \log_2 n_{km},$$

$$s_k = \frac{1}{n} \sum_{k} n_k \log_2 n_k,$$

$$s_m = \frac{1}{n} \sum_{m} n_m \log_2 n_m,$$

$$s = \log_2 n.$$

Expressions involving sample measures of information are easier to handle in this notation. For example, $T'(x;y)$ becomes

$$T'(x;y) = s - s_k - s_m + s_{km}. \qquad (3)$$

Equations (2) and (3) are equivalent expressions for $T'(x;y)$. When we write equations like (3), we shall say that these equations are written in s-notation. Thus (3) is (2) in s-notation.

3. Three-Dimensional Transmitted Information

Now let us extend the definition of transmitted information to include two sources, $u$ and $v$, that transmit to $y$. To accomplish this we replace $x$ in equation (2) with $u,v$ and find that

$$T'(u,v;y) = H'(u,v) + H'(y) - H'(u,v,y), \qquad (4)$$

where $x$ has been subdivided into two classes, $u$ and $v$. The possible values of $u$ are $i = 1, 2, 3, \ldots, U$, while $v$ assumes values $j = 1, 2, 3, \ldots, V$.
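The equivalence of equations (2) and (3) can be checked on a small table of counts. The sketch below is an illustration under stated assumptions: the 2 x 2 table `counts` is hypothetical data, and the helper names are not from the paper. It computes $T'(x;y)$ once from relative frequencies, as in (2), and once from the s-terms, as in (3).

```python
import math

def s_term(counts, n):
    """(1/n) * sum of c*log2(c) over the counts; empty cells contribute nothing."""
    return sum(c * math.log2(c) for c in counts if c > 0) / n

def sample_transmission_s_notation(table):
    """Equation (3): T'(x;y) = s - s_k - s_m + s_km, with table[k][m] = n_km."""
    n = sum(sum(row) for row in table)
    s = math.log2(n)
    s_k = s_term([sum(row) for row in table], n)         # row totals n_k
    s_m = s_term([sum(col) for col in zip(*table)], n)   # column totals n_m
    s_km = s_term([c for row in table for c in row], n)  # cell entries n_km
    return s - s_k - s_m + s_km

def sample_entropy(counts, n):
    """H' built from relative frequencies c/n."""
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def sample_transmission_entropy_form(table):
    """Equation (2): T'(x;y) = H'(x) + H'(y) - H'(x,y)."""
    n = sum(sum(row) for row in table)
    h_x = sample_entropy([sum(row) for row in table], n)
    h_y = sample_entropy([sum(col) for col in zip(*table)], n)
    h_xy = sample_entropy([c for row in table for c in row], n)
    return h_x + h_y - h_xy

# Hypothetical contingency table: rows are signals sent, columns signals received.
counts = [[20, 5],
          [5, 20]]

t_s = sample_transmission_s_notation(counts)
t_h = sample_transmission_entropy_form(counts)
print(t_s, t_h)  # the two forms agree up to floating-point rounding
```

Working with $n \log_2 n$ terms avoids forming the relative frequencies at all, which is the practical point of the s-notation.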
"
... transmitted information will be called $T'_v(u;y)$, where

$$T'_v(u;y) = \sum_{j} \frac{n_j}{n}\,[T'_j(u;y)], \qquad (7)$$

and $T'_j(u;y)$ is information transmitted between $u$ and $y$ for a single value of $v$, namely $j$. It is readily shown that

$$T'_v(u;y) = s_j - s_{ij} - s_{jm} + s_{ijm}. \qquad (8)$$

We see that $T'_v(u;y)$ is written in the same way as $T'(u;y)$ except that the subscript $j$ is added to each of the s-terms.

There are three different pairs of variables in a three-dimensional contingency table. For example, the two equations for transmission between $v$ and $y$ are written

$$T'(v;y) = s - s_j - s_m + s_{jm}, \qquad (9)$$

$$T'_u(v;y) = s_i - s_{ij} - s_{im} + s_{ijm}. \qquad (10)$$

Finally we may study transmission between $u$ and $v$, i.e.,

$$T'(u;v) = s - s_i - s_j + s_{ij}, \qquad (11)$$

$$T'_y(u;v) = s_m - s_{im} - s_{jm} + s_{ijm}. \qquad (12)$$

With these results in mind let us reconsider the information transmitted between $u$ and $y$. If $v$ has an effect on transmission between $u$ and $y$, then $T'_v(u;y) \neq T'(u;y)$. One way to measure the size of the effect is by

$$A'(uvy) = T'_v(u;y) - T'(u;y),$$

$$A'(uvy) = -s + s_i + s_j + s_m - s_{ij} - s_{im} - s_{jm} + s_{ijm}. \qquad (13)$$

A few more substitutions will show that

$$A'(uvy) = T'_v(u;y) - T'(u;y) = T'_u(v;y) - T'(v;y) = T'_y(u;v) - T'(u;v). \qquad (14)$$

In view of this symmetry, we may call $A'(uvy)$ the $u \cdot v \cdot y$ interaction information. We see that $A'(uvy)$ is the gain (or loss) in sample information transmitted between any two of the variables, due to additional knowledge of the third variable.

Now we can express the three-dimensional information transmitted from $u,v$ to $y$, i.e., $T'(u,v;y)$, as a function of its bivariate components:

$$T'(u,v;y) = T'(u;y) + T'(v;y) + A'(uvy), \qquad (15)$$

$$T'(u,v;y) = T'_v(u;y) + T'_u(v;y) - A'(uvy). \qquad (16)$$

Equations (15) and (16) taken together mean that $T'(u,v;y)$ can be represented by a diagram with overlapping circles as shown in Figure 1. The diagram assumes what we shall call "positive" interaction between $u$, $v$ and $y$. Interaction is positive when the effect of holding one of the interacting variables constant is to increase the amount of association between the other two. This means that $T'_v(u;y) > T'(u;y)$ and $T'_u(v;y) > T'(v;y)$. [Because of (14), if one of these inequalities holds, both must hold.] Later on, however, we shall show that interaction may be negative. When this happens, relations between the interacting variables are reversed, and the diagram in Figure 1 is no longer strictly correct.

FIGURE 1. Schematic diagram of the components of three-dimensional transmitted information. The diagram shows that three-dimensional transmission can be analyzed into a pair of bivariate transmissions plus an interaction term. The meanings of the symbols are explained in the text.

4. Components of Response Information

The multivariate model of information transmission is useful to us because the situations treated by communication theory are not the same as those we deal with in psychological applications. The engineer is usually able to restrict himself to transmission from a single information source. He knows the statistical properties of the source, and when he speaks of noise he means random noise. This kind of precision is seldom available to us. In our experiments we generally do not know in advance how many sources are transmitting information. We must therefore be careful not to confuse statistical noise with the experimenter's ignorance. The bivariate model of transmitted information provided by communication theory tells us to attribute to random noise whatever uncertainty there is in specifying the response when the stimulus is known. If several sources transmit information to responses, the bivariate model will certainly fail to discriminate effects due to uncontrolled sources from those due to random variability. On the other hand, the multivariate model can measure the effects due to the various transmitting sources. For example, in three-dimensional transmission we find that

$$H'(y) = H'_{uv}(y) + T'(u;y) + T'(v;y) + A'(uvy), \qquad (17)$$

where $H'_{uv}(y) = s_{ij} - s_{ijm}$ and, as we can see from (1), $H'(y) = s - s_m$. Consequently, $H'(y)$, the response information, has been analyzed into an error term plus a set of correlation terms due to the input variables. The error term, $H'_{uv}(y)$, is the residual or unexplained variability in the output, $y$, after the information due to the inputs, $u$ and $v$, has been removed.

In bivariate information transmission, the response information is analyzed less precisely. For example, we may have

$$H'(y) = H'_u(y) + T'(u;y). \qquad (18)$$
this
the
case
Shannon
(10) showed
In other words
if we
the
also control
Hi{y)
because
H'M
term, when
Equation (19)
we
have
error
inputs,namely,
term
seen
well
as
to be
noise.
is
u.
HUy)
information
Multivariate
can
on
kind
be increased
cannot
controlled,
(19)
n{v;y).
transmitted
sides in s-notation.
information
from
extractingthe association
transmitted
information
of
a
only
still smaller
and
y,
of the
one
Controlling v
v.
between
Thus
via responses,
error
is thus
y from
the
essentially information
is
the noise part of bivariate transmission.

5. An
The
is
variables that transmit
equivalent to
analyzed from
only
keep track
term; H^{y), provided we
contains
term
However, this error
the
as
HUy)-
proved by expanding both
stimulus
are
an
is recorded.
input, u,
one
In fact
HL{y)
if u and
only
(18)
that
error
v.
is
term
error
T\u;y).
5. An Example

The kind of analysis that multivariate information transmission yields can be illustrated by a set of data obtained from one subject in an experiment on frequency judgment.

Four equally loud tones, 890, 925, 970, and 1005 cycles per second, were presented to the subject one at a time in random order. Each tone was 1 second long and separated by about 3 seconds from the next tone. During preliminary training the subject learned to identify the tones by pairing them with four response keys. In experimental sessions, a loud masking noise was turned on and a random sequence of 250 tones was presented against the noise background. A flashing light told the subject when the stimulus occurred, and he was instructed to guess if in doubt about which one of the four tones it was.

One object of the experiment was to find weights for both the frequency stimulus and the immediately preceding response in determining which key the subject would press. Tests were run at several signal-to-noise ratios. The data presented here were obtained when the signal-to-noise ratio was close to the masked threshold.

In order to calculate the weights, we can consider the experiment as an example of three-dimensional transmission. Our analysis is based on the responses to the 125 even-numbered stimuli. The odd-numbered responses are considered as the context in which the subject judged the even-numbered stimuli. The odd-numbered stimuli are ignored in this analysis. The stimuli will be designated as the variable $u$. Last previous responses are called "presponses" and they will be indicated by the variable $v$. These are the inputs. Current responses are represented by $y$. This is the output variable. Thus we can identify the joint event $(i,j,m)$ as the occurrence of response $m$ to stimulus $i$, following presponse $j$. Failure to respond is considered as a possible response. Consequently there are four stimulus categories and five response categories.

The subject's responses to the 125 test stimuli were sorted into a 4 × 5 × 5 contingency table. Two of the reduced tables that were obtained from this master table are reproduced here in order to illustrate our computations.

TABLE 1. Stimulus-Response Frequency Table

TABLE 2. Presponse-Response Frequency Table
92
READINGS
The
for s,", goes
calculation
-^-[1 log2 1
s,"
374.05750/125,
sv.
2.99246.
same
s," is
way,
Response table,Table
^n
log21 +
s^^
372.38710/125,
s,^
2.97910.
The
for
log21 +
from
s,
[31 log2
yI^
31
Si
620.83188/125,
Si
4.96665.
is based
It
is evident
"
"
"
figuresfor
the
that
log2 2 +
logs30 +
these
"
"
33
entries
log^7+10
log^10],
in the
Presponse-
nj"
log29 +
log,3],
marginal of Table
log^33 +
the total number
on
log2125
"
the n. in the bottom
30
s,
computation for
log2 12 +
computed from
obtain the value
1 has
2:
sy"
We
Table
follows:
as
log25+12
s."
In the
PSYCHOLOGY
example, the Stimulus-Response plot in
putations. For
w,"
MATHEMATICAL
IN
31
1:
log^31],
of measurements:
6.96579.
calculations
are
wishes, the reader may
performed
also make
very
easily with
the
computations
(8),and Dolansky (3).
of p log,p tables for analyzing discrete data is not recommended,
The use
that the table of n log,n avoids.
however, because it leads to rounding errors
The complete set of s-terms in the experiment on frequencyjudgment worked
a
table of
with
out
log, n.
tables of p
as
If he
like those
log,p
follows:
=
1.45211
s,
4.96665
s.,
2.91389
s,
4.79269
Sir,
2.99246
4.93380
s,"
2.97910
6.96579
s.,"
In
Newman
preparedby
section
it
shown
was
that
s"
response
information,H'(y), can
be
analyzedinto components
H'{y)
HUy)
T\u;y) +
{v;y)+ A'{uvy).
(17)
WILLIAM
H'{y)
Since
had
2 bits. The
at most
not
respond.
and
2. The
This
extra
bits. This
for either
is the
Some
from
auditory
or
72
per
between
s.,-
part of the
by
cent
the
We
s,,"
"
did
the
or
of the
response
that
sequently,
Con-
presponses.
information
information
is
therefore
must
and the two
subject'sresponses
that
see
from
information
response
stimuli
of the response
28 per cent
to associations
be due
been
in Tables
the
the
1.46178/2.03199
unanalyzed error.
subject
right-handmarginals
equation (17) are easilycomputed
of the
accounted
bits. If the
2.03199
keys equally often, this figurewould have

shows
information
that the subject sometimes
be verified from
can
is 1.46178
HlXy)
H\y)
that
see
quantitiesin
example, Hl,{y) is computed
rest
For
s-terms.
we
the four response
used
is not
s"
93
MCGILL
J.
predicting
variables.
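The arithmetic behind the s-terms can be checked with a few lines of code. The sketch below is only an illustration, not part of the original analysis: it recomputes $s$ and $s_i$ from the stimulus marginals quoted in the text and then forms $H'(y)$ and $H'_{uv}(y)$ from the printed s-terms. The helper name `s_term` is hypothetical, and the pairing of the printed values with the subscripts $m$, $ij$ and $ijm$ is an inference from the totals quoted in the text.

```python
from math import log2

def s_term(counts, n):
    """(1/n) * sum of c*log2(c) over the non-zero counts c (hypothetical helper)."""
    return sum(c * log2(c) for c in counts if c > 0) / n

n = 125
n_i = [31, 33, 30, 31]          # stimulus marginals quoted in the text

s = log2(n)                     # s-term for the grand total
s_i = s_term(n_i, n)
H_u = s - s_i                   # sample entropy of the stimuli, in bits

# Printed s-terms; their subscript labels are inferred, see the lead-in.
s_m, s_ij, s_ijm = 4.93380, 2.91389, 1.45211
H_y = 6.96579 - s_m             # response information H'(y)
H_uv_y = s_ij - s_ijm           # unanalyzed error H'_uv(y)

print(f"s_i = {s_i:.5f}, s = {s:.5f}, H'(u) = {H_u:.5f}")
print(f"H'(y) = {H_y:.5f}, H'_uv(y) = {H_uv_y:.5f}, share = {H_uv_y / H_y:.2f}")
```

Small differences in the last decimal place against the printed figures are to be expected, since the original values came from a table of $n\log_2 n$.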
If we consider the association between auditory stimuli ($u$) and responses ($y$), we have

$$T'(u;y) = s - s_i - s_m + s_{im},$$
$$T'(u;y) = .05780.$$

Thus only .058 bits are transmitted from the frequency stimuli, accounting for less than 3 per cent of the response information. This is not surprising because the signal-to-noise ratio was set near the masked threshold and the stimuli were difficult to hear.

If we consider the association between presponses ($v$) and responses ($y$), we find

$$T'(v;y) = s - s_j - s_m + s_{jm},$$
$$T'(v;y) = .21840.$$

The value of .218 bits transmitted amounts to some 11 per cent of the response information.

The last element in equation (17) is the interaction, $A'(uvy)$. This is computed:

$$A'(uvy) = -s + s_i + s_j + s_m - s_{ij} - s_{im} - s_{jm} + s_{ijm},$$
$$A'(uvy) = .29401.$$

We see that about 14 per cent of the response information is due to the interaction. Knowledge of the interaction also permits us to hold one of the inputs constant while measuring transmission from the other input. For example, the transmission from stimuli to responses with presponses held constant is:

$$T'_v(u;y) = T'(u;y) + A'(uvy) = .35181.$$
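The decomposition in equation (17) can be reproduced directly from the eight s-terms. The following is a minimal sketch under the assumption that the printed values carry the subscripts shown in the dictionary (an assignment inferred from the totals quoted in the text); the variable names are illustrative only.

```python
# s-terms from the frequency-judgment example; subscript labels are inferred.
S = {"": 6.96579, "i": 4.96665, "j": 4.79269, "m": 4.93380,
     "ij": 2.91389, "im": 2.99246, "jm": 2.97910, "ijm": 1.45211}

H_y = S[""] - S["m"]                                  # response information
H_uv_y = S["ij"] - S["ijm"]                           # unanalyzed error (noise)
T_uy = S[""] - S["i"] - S["m"] + S["im"]              # stimuli -> responses
T_vy = S[""] - S["j"] - S["m"] + S["jm"]              # presponses -> responses
A_uvy = (-S[""] + S["i"] + S["j"] + S["m"]
         - S["ij"] - S["im"] - S["jm"] + S["ijm"])    # interaction
T_uv_y = S[""] - S["ij"] - S["m"] + S["ijm"]          # both inputs together
T_v_uy = T_uy + A_uvy                                 # stimuli, presponses held constant

for name, value in [("T'(u;y)", T_uy), ("T'(v;y)", T_vy),
                    ("A'(uvy)", A_uvy), ("T'(u,v;y)", T_uv_y)]:
    print(f"{name:10s} {value:.5f} bits  {100 * value / H_y:5.1f} per cent of H'(y)")

# Equation (17): the components plus the noise term add up to H'(y).
print(abs(H_uv_y + T_uy + T_vy + A_uvy - H_y) < 1e-9)
```

Because every quantity is a signed sum of the same s-terms, the additivity in (17) holds exactly, apart from floating-point rounding.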
Our calculations lead to weights of approximately 3, 11 and 14 per cent for stimuli, presponses and interaction respectively. These figures sum to 28 per cent, the amount of transmitted information predicted from the size of the noise term. These are the parts of the response information that we can analyze with the three-dimensional model. We can also obtain this total weight directly by computing the information transmitted from both inputs together. We have

$$T'(u,v;y) = s - s_{ij} - s_m + s_{ijm},$$
$$T'(u,v;y) = .57021.$$

If we now divide by the response information, we get back our figure of 28 per cent.

There are several points worth noting about our application of information theory to this experiment. The first is that the analysis is additive. The measures of association plus the measure of error (or noise) sum to the response information. No approximations are involved. The second point is that the s-notation in multivariate information transmission is very similar to the notation of analysis of variance. As a matter of fact, a partition of the sum of squares can be worked out for analysis of variance that is exactly parallel to the partition of transmitted information (4). Furthermore, the analysis is made with contingency tables. Measures of transmitted information are zero when variables are independent in the contingency-sense (as opposed to the restriction to linear independence in analysis of variance). In addition, the analysis is designed for frequency data in discrete categories, while methods based on analysis of variance are not. No assumptions about linearity are introduced. Furthermore, when statistical tests are developed in a later section, it will be shown that these tests are distribution-free in the sense that they are extensions of the familiar chi-square test of independence.

The measure of amount of information transmitted also has certain inherent advantages. Garner and Hake (2) and Miller (5) have pointed out that the amount of information transmitted is approximately the logarithm of the number of perfectly discriminated input-classes. In experiments on discrimination like the one we have discussed, the measure provides an immediate picture of the subject's discriminative ability. Miller has also discussed applications of this property in mental testing and in the general theory of measurement.
6. Independence in Three-Dimensional Transmission

It is evident from the definition of transmitted information that
Now
ri(w;y)
Si
Ti{u;y)
s,-
T'Xu',y)
rXu;y)
T'{u',y).
kinds of
Both
Sii
s^
"
Si
"
Si,^
Sj
Sii"
s^,,-\-s
s"
"
equation (8).
Sim
between
and
Uiim
of classes in
V is the number
studied
have
the
Si
"
that
is the
happens
we
is not
do
not
only input variable

and
v).As might
generated from a single
between
be
F'
v.
independent of y. We could have

independent of v. The results are analogous to
where
case
independent of y, or
have presented.
those we
had
this
y. When
(providedthat no information is transmitted

be expected,both kinds of independence can
restriction on the data, namely
We
Si^
transmission,since
three-dimensional
where
s^
independence,(21)and (22),togethermean
in transmission
involved
have
s,-,".in
substitute for s,". and
we
is
7. Correlated Sources of Information

Three-dimensional transmitted information, $T'(u,v;y)$, accounts for only part of the total amount of association in a three-dimensional contingency table. It does not exhaust all the association in the table because it neglects the association between the inputs. When this association is considered, i.e., when all the relations in the contingency table are represented, we are led to an equation that is very useful for generating the components of multivariate transmission. Consider

$$C'(u,v,y) = H'(u) + H'(v) + H'(y) - H'(u,v,y). \tag{23}$$

If we add and subtract $H'(u,v)$, we obtain

$$C'(u,v,y) = T'(u;v) + T'(u,v;y),$$
$$C'(u,v,y) = T'(u;v) + T'(u;y) + T'(v;y) + A'(uvy). \tag{24}$$

We see that $C'(u,v,y)$ generates all possible components of the three correlated information-sources, $u$, $v$, and $y$.

8. Four-Dimensional Transmitted Information

It will be instructive to extend our measures of transmitted information one step further, i.e., to three input variables, since results can be generalized easily from that point to an $N$-dimensional input. For simplicity we shall restrict our development to the case of a channel with multivariate inputs and univariate output. The more general case with $N$ inputs and $M$ outputs does not present any special problems, and can be constructed with no difficulty once the rules become clear.

Let us suppose that we add a new variable $w$ to the bivariate input, $u,v$. The joint input is now $u,v,w$, and $w$ sends signals $h = 1, 2, 3, \ldots, W$. This gives us four sources of information, $u$, $v$, $w$, and $y$. We can proceed to define four-dimensional interaction information, $A'(uvwy)$, as follows:

$$A'(uvwy) = A'_w(uvy) - A'(uvy).$$

We have already defined $A'(uvy)$. The definition of $A'_w(uvy)$ will be similar except that the subscript $w$ indicates that $A'(uvy)$ is to be averaged over $w$. As we have already noted, this is accomplished by adding the subscript $h$ to each of the s-terms that make up $A'(uvy)$. Consequently

$$A'_w(uvy) = -s_h + s_{hi} + s_{hj} + s_{hm} - s_{hij} - s_{him} - s_{hjm} + s_{hijm}. \tag{25}$$

It is readily shown that $A'(uvwy)$ is symmetrical in the sense that it does not matter which variable is chosen for averaging, i.e.,

$$A'(uvwy) = A'_w(uvy) - A'(uvy) = A'_v(uwy) - A'(uwy) = A'_u(vwy) - A'(vwy) = A'_y(uvw) - A'(uvw). \tag{26}$$

We see that $A'(uvwy)$ is the amount of information gained (or lost) in transmission by controlling a fourth variable when any three of the variables are already known.
If we examine all possible associations in a four-dimensional contingency table, we obtain

$$\begin{aligned} C'(u,v,w,y) = {} & T'(u;v) + T'(u;w) + T'(u;y) + T'(v;w) + T'(v;y) + T'(w;y) \\ & + A'(uvw) + A'(uvy) + A'(uwy) + A'(vwy) + A'(uvwy), \end{aligned} \tag{27}$$

where

$$C'(u,v,w,y) = H'(u) + H'(v) + H'(w) + H'(y) - H'(u,v,w,y).$$

Equation (27) can be proved by expanding both sides in s-notation. It turns out that in the general case, $C'(u,v,w,\ldots,y)$ is expanded by writing down T-terms for all possible pairs of variables, and A-terms for all possible combinations of three, four, $\ldots$ variables and so on.

Four-dimensional transmitted information from $u,v,w$ to $y$, i.e., $T'(u,v,w;y)$, can be written as follows:

$$T'(u,v,w;y) = H'(y) + H'(u,v,w) - H'(u,v,w,y). \tag{28}$$
The same arguments are used to justify (28) as were used in the case of (4) in three-dimensional transmission. To find the components of $T'(u,v,w;y)$, we note that

$$T'(u,v,w;y) = C'(u,v,w,y) - C'(u,v,w). \tag{29}$$

This means that $T'(u,v,w;y)$ contains all the components of $C'(u,v,w,y)$ except the correlations among the inputs. Consequently the components of $T'(u,v,w;y)$ are

$$T'(u,v,w;y) = T'(u;y) + T'(v;y) + T'(w;y) + A'(uvy) + A'(uwy) + A'(vwy) + A'(uvwy). \tag{30}$$

The components of $T'(u,v,w;y)$ are shown in schematic form in Figure 2.

Figure 2. Schematic diagram of the components of four-dimensional transmitted information, with three transmitters and a single receiver.
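The expansion of four-dimensional transmitted information into T-terms and A-terms can be checked numerically on any four-way table. The sketch below is an illustration only: the table is filled with arbitrary random counts, the class sizes are assumptions, and the helper names `H` and `interaction` are hypothetical. It assumes the reading of (28) and (30) as $T'(u,v,w;y) = H'(y) + H'(u,v,w) - H'(u,v,w,y)$ and the sum of the three T-terms and four A-terms involving $y$, with each T- or A-term written as an alternating sum of entropies.

```python
import itertools
import random
from math import log2

random.seed(0)
SHAPE = {"u": 3, "v": 2, "w": 2, "y": 3}   # arbitrary class counts (assumption)
NAMES = list(SHAPE)

# Random positive cell counts for a four-way contingency table.
cells = {idx: random.randint(1, 9)
         for idx in itertools.product(*(range(SHAPE[k]) for k in NAMES))}
n = sum(cells.values())

def H(variables):
    """Sample entropy (bits) of the marginal table over the named variables."""
    keep = [NAMES.index(k) for k in variables]
    marginal = {}
    for idx, count in cells.items():
        key = tuple(idx[p] for p in keep)
        marginal[key] = marginal.get(key, 0) + count
    return -sum(c / n * log2(c / n) for c in marginal.values())

def interaction(variables):
    """T-term (two variables) or A-term (three or more) as an alternating entropy sum."""
    total = 0.0
    for r in range(1, len(variables) + 1):
        for subset in itertools.combinations(variables, r):
            total += (-1) ** (r + 1) * H(subset)
    return (-1) ** len(variables) * total

lhs = H("y") + H("uvw") - H("uvwy")                       # equation (28)
rhs = sum(interaction(term)                               # equation (30)
          for term in ("uy", "vy", "wy", "uvy", "uwy", "vwy", "uvwy"))
print(abs(lhs - rhs) < 1e-9)
```

For two variables `interaction` reduces to $H'(u) + H'(y) - H'(u,y)$, and for three it reproduces the signs of the s-notation formula for $A'(uvy)$.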
If it happens that

$$n_{hijm} = n_{ijm}/W,$$

where $W$ is the number of classes in $w$, all the components of $C'(u,v,w,y)$ that are functions of $w$ drop out and $C'(u,v,w,y) = C'(u,v,y)$. In similar fashion, $C'(u,v,y)$ can be reduced to $C'(u,y)$. This is precisely what we did in the analysis of independence in three-dimensional transmitted information. Since $C'(u,y) = T'(u;y)$, we see that all cases of transmission with multivariate inputs can be related to the bivariate case.

With three inputs controlled, we are ready to extend the analysis of response information in section 4, a step further. We have

$$H'(y) = H'_{uvw}(y) + T'(u,v,w;y). \tag{31}$$

Equation (31) says that we can measure the effects in response information due to the three inputs. This is evident from the fact that (30) tells us how to expand $T'(u,v,w;y)$ in its components. In addition we know that

$$H'_{uv}(y) = H'_{uvw}(y) + T'_{uv}(w;y), \tag{32}$$

where

$$T'_{uv}(w;y) = T'(w;y) + A'(uwy) + A'(vwy) + A'(uvwy). \tag{33}$$

We see that controlling $w$ in addition to $u$ and $v$, enables us to rescue the information transmitted between $w$ and $y$ from the noise, and to replace $H'_{uv}(y)$ with a better estimate of noise information, namely $H'_{uvw}(y)$.

The transition to an $N$-dimensional input is now evident. In general, we have

$$H'(y) = H'_{uvw \ldots z}(y) + T'(u,v,w,\ldots,z;y). \tag{34}$$

The $(N + 1)$-dimensional transmitted information, $T'(u,v,w,\ldots,z;y)$ can then be expanded in its components in the manner that we have described.
9. Asymptotic Distributions

Miller and Madow (6) have shown that sample information is related to the likelihood ratio. Following Miller and Madow, we can show that the large sample distribution of the likelihood ratio may be used to find approximate distributions for the quantities involved in multivariate transmission.

Consider, for example, three-dimensional sample transmitted-information, $T'(u,v;y)$. We can test the hypothesis that $T(u,v;y)$ is equal to zero. This is equivalent to the hypothesis that

$$p(i,j,m) = p(i,j) \cdot p(m), \tag{35}$$

since $T(u,v;y)$ is zero when input and output are independent. This hypothesis leads to the likelihood ratio [see reference (7)],

$$\lambda = \frac{\prod_{i,j} n_{ij}^{\,n_{ij}} \prod_{m} n_{m}^{\,n_{m}}}{n^{n} \prod_{i,j,m} n_{ijm}^{\,n_{ijm}}}. \tag{36}$$

If we take logs, we obtain

$$-2\log_e \lambda = 1.3863\, n\left[s - s_{ij} - s_m + s_{ijm}\right],$$
$$-2\log_e \lambda = 1.3863\, n\, T'(u,v;y). \tag{37}$$

For large samples, $-2\log_e \lambda$ has approximately a $\chi^2$ distribution with $(UV - 1)(Y - 1)$ degrees of freedom when the null hypothesis (35) is true. Thus $1.3863\, n\, T'(u,v;y)$ is distributed approximately like $\chi^2$ if $T(u,v;y)$ is equal to zero.
A more important problem in testing suspected information sources involves three-dimensional independence. Suppose in our three-dimensional example, we assume that

$$p(i,j,m) = p(i) \cdot p(j) \cdot p(m). \tag{38}$$

This hypothesis leads to the likelihood ratio for complete independence in a three-dimensional contingency table,

$$\lambda = \frac{\prod_{i} n_{i}^{\,n_{i}} \prod_{j} n_{j}^{\,n_{j}} \prod_{m} n_{m}^{\,n_{m}}}{n^{2n} \prod_{i,j,m} n_{ijm}^{\,n_{ijm}}}. \tag{39}$$

After we take logs we find that

$$-2\log_e \lambda = 1.3863\, n\left[2s - s_i - s_j - s_m + s_{ijm}\right],$$
$$-2\log_e \lambda = 1.3863\, n\left[H'(u) + H'(v) + H'(y) - H'(u,v,y)\right],$$
$$-2\log_e \lambda = 1.3863\, n\, C'(u,v,y). \tag{40}$$

For large samples $-2\log_e \lambda$ has approximately a $\chi^2$ distribution with $(UVY - 1) - (U - 1) - (V - 1) - (Y - 1)$ degrees of freedom when the null hypothesis is true.

We also know that

$$C'(u,v,y) = T'(u;y) + T'(v;y) + T'_y(u;v). \tag{41}$$

The likelihood ratio can be used to show that $1.3863\, n\, T'(u;y)$ and $1.3863\, n\, T'(v;y)$ are asymptotically distributed like $\chi^2$ with $(U - 1)(Y - 1)$ and $(V - 1)(Y - 1)$ degrees of freedom, respectively, if $T(u;y)$ and $T(v;y)$ are zero. To find the asymptotic distribution of $T'_y(u;v)$, we make the following hypothesis:

$$p(i,j,m) = p(i,m) \cdot p_m(j), \tag{42}$$

where $p_m(j)$ is the conditional probability of $j$ given $m$. Now we have the ratio

$$\lambda = \frac{\prod_{i,m} n_{im}^{\,n_{im}} \prod_{j,m} n_{jm}^{\,n_{jm}}}{\prod_{m} n_{m}^{\,n_{m}} \prod_{i,j,m} n_{ijm}^{\,n_{ijm}}}, \tag{43}$$

$$-2\log_e \lambda = 1.3863\, n\left[s_m - s_{im} - s_{jm} + s_{ijm}\right],$$
$$-2\log_e \lambda = 1.3863\, n\, T'_y(u;v). \tag{44}$$
In this case $-2\log_e \lambda$ has $Y(U - 1)(V - 1)$ degrees of freedom.

In view of (41) we can write

$$1.3863\, n\, C'(u,v,y) = 1.3863\, n\left[T'(u;y) + T'(v;y) + T'_y(u;v)\right]. \tag{45}$$

The quantities on the right side of (45) have degrees of freedom that sum to $(UVY - U - V - Y + 2)$. Since this is the same number of degrees of freedom as on the left hand side of (45), the quantities on the right side of (45) are asymptotically independent, if the null hypothesis, (38), is true. This means that as an approximation we can test $T'(u;y)$, $T'(v;y)$ and $T'_y(u;v)$ simultaneously for significance under the null hypothesis we have stated. The test is very similar to an analysis of variance. We can see the similarity by applying the test to the data discussed in section 5.

First we need to compute the quantities in equation (45). To do this we also need $C'(u,v,y)$ and $T'_y(u;v)$, since these terms were not computed in section 5. We note that $C'(u,v,y)$ is the total amount of association in the stimulus-presponse-response table. We have

$$C'(u,v,y) = 2s - s_i - s_j - s_m + s_{ijm},$$
$$C'(u,v,y) = .69055.$$

$T'_y(u;v)$ measures how successfully the presponses can predict the auditory stimuli, with responses held constant. Since stimuli were chosen at random, we do not expect much transmitted information here. The computation goes as follows:

$$T'_y(u;v) = s_m - s_{im} - s_{jm} + s_{ijm},$$
$$T'_y(u;v) = .41435.$$

We may now put our computed values for $C'(u,v,y)$, $T'(u;y)$, $T'(v;y)$ and $T'_y(u;v)$ into equation (45) and perform the $\chi^2$ tests. The results are summarized in Table 3. We have not attempted to calculate the significance level of $C'(u,v,y)$ because we do not have enough data to sustain the 88 degrees of freedom. The same criticism can probably be leveled at our test for $T'_y(u;v)$. In any case Table 3 shows that the only significant effect in the experiment is the presponse-response association.

One interesting fact that the analysis brings out clearly, is that we cannot decide whether an amount of transmitted information is big or small without knowing its degrees of freedom. In our example we find that $T'_y(u;v) = .414$ bits, while $T'(v;y) = .218$ bits. Yet $T'(v;y)$ is significant and $T'_y(u;v)$ is not. The reason lies in the difference in degrees of freedom.
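The test statistics that go into equation (45) are easy to tabulate. The sketch below is illustrative only: it multiplies each quoted amount of information by $2 n \log_e 2$ (the constant printed as 1.3863) and attaches the degrees of freedom given in the text for $U = 4$ stimulus classes and $V = Y = 5$ presponse and response classes. Looking the statistics up in a $\chi^2$ table is left to the reader; no critical values are assumed here.

```python
from math import log

n = 125
U, V, Y = 4, 5, 5                 # stimulus, presponse and response classes
K = 2 * log(2)                    # about 1.3863: converts bits to -2 log_e(lambda)

# (information in bits, degrees of freedom) for each term on the right of (45)
terms = {
    "T'(u;y)":   (0.05780, (U - 1) * (Y - 1)),
    "T'(v;y)":   (0.21840, (V - 1) * (Y - 1)),
    "T'_y(u;v)": (0.41435, Y * (U - 1) * (V - 1)),
}
total_bits = 0.69055              # C'(u,v,y)
total_df = U * V * Y - U - V - Y + 2

for name, (bits, df) in terms.items():
    print(f"{name:10s} statistic = {K * n * bits:7.2f}   df = {df}")
print(f"{'C(u,v,y)':10s} statistic = {K * n * total_bits:7.2f}   df = {total_df}")

# Both the information values and the degrees of freedom are additive.
print(sum(df for _, df in terms.values()) == total_df)
print(abs(sum(bits for bits, _ in terms.values()) - total_bits) < 1e-9)
```

The tabulation makes the closing remark concrete: $T'_y(u;v)$ has the larger statistic, but it is spread over 60 degrees of freedom, whereas $T'(v;y)$ has only 16.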
Miller and Madow (6) have discussed the amount of statistical bias in information measures due to degrees of freedom, and have suggested corrections.

TABLE 3
Table of Transmitted Information

In Table 3, we tested $T'_y(u;v)$, the association between stimuli and presponses with responses held constant. This association is broken down further in Table 4.

TABLE 4
Table of Transmitted Information

No probability is estimated in Table 4 for the interaction term, $A'(uvy)$, because its asymptotic distribution is not chi-square. All A-terms are distributed like the difference of two variables each of which has the chi-square distribution. The distribution of this difference is evidently not chi-square because the difference can be negative. Its density function has been derived by Pearson, Stouffer, and David (9), but the writer has been unable to find a table of the integral. In some cases the problem can be circumvented by combining A-terms with T-terms to make new T-terms. [See, for example, equation (33).] However, in other cases, the interactions are genuinely interesting in their own right and should be tested directly. These cases can be treated when adequate tables become available.
RANDOM FLUCTUATIONS OF RESPONSE RATE*

William J. McGill
COLUMBIA UNIVERSITY

A simple model for fluctuating interresponse times is developed and studied. It involves a mechanism that generates regularly spaced excitations, each of which can trigger off a response after a random delay. The excitations are not observable, but their periodicity is reflected in a regular patterning of responses. The probability distribution of the time between responses is derived and its properties are analyzed. Several limiting cases are also examined.
number
of
that
behavioral
optic
trains of
horseshoe
preciselytimed
is illuminated
by
of
occurrence
found
in studies
The
is not
Under
close
the
example
for
is
is the
the
long
its visual receptor

with
comparable
conditioningwhen
the
rate
by reinforcingpaced responding;
point
bear
to
of intervals
ticking of
are
responses
in mind
in each
between
responses
watch
of these
is not
than
more
the
regular
ir-
random
components.
a
periodic system.
to be
wide
a
the
We
us.
for
is
so
article
of
in
intervals
distribution
model
the
with
model
but less stable than

the
capacity
these
to
found
it is
in
between
of these
both
periodic
will
then
take
be
completely
on
Since
extremes.
biologicalprocesses,
one
any
this
seems
physicalsystems
surprisingthat the
Httle attention.
to
examine
the writer
Psychometrika,
was
1962, 27,
104
properties of
the
producing noisy fluctuations
appeared
generated by
sequence,
*This paper
was
completed while
Lincoln Laboratory, Lexington, Mass.
This
to construct
periodic response
the
The
amounts.
of the type of randomness
attempt
an
of these
many
perfect, and
they will have

of possibilities
between
problem has received

mechanism
want
purely random
extension
paper
than
Intervals
orderly behavior
more
This
less
of
Moreover
range
natural
be
change in small
to
interests
stable than
of the
of electrons.
stream
itself to
seen
changes is what
more
of
scrutiny the timing
reveals
systems
to
resemble
heart
of the
tribution
Consequently the Poisson diswhich
is sometimes
proposed [11] to deal with rate fluctuations,
likelyto be very helpful.
fluctuations
and
essential
sponses
pulse-likere-
is famous
sequences
of operant
that the sequence
intervals
Another
limulus, which
is stabilized
response
is the fact
random.
mind.
potentialsit produces when
action
(see [5],pp. 498-502). The

examples
crab,
beating
constant
to
of
sequences
steady light [6].Response
also
periodicityare
of
springs immediately
the
of
nerve
generate
regularly in time. The
recur
illustration that
an
systems
in
a
3-17.
otherwise
an
elementary
constant
visitingsummer
Reprinted with
time
scientist at
the
permission.
intervals. Despite its simplicity, the mechanism can duplicate a variety of observed phenomena, ranging from sharply peaked and symmetrical distributions of interresponse times to highly skewed distributions, and even completely random responding. Moreover, all these behaviors can be elicited from the same mechanism by altering the rate at which it is excited.
Periodic Excitation

We begin by examining interresponse times that are nearly constant. The key to this regularity, we assume, is some sort of periodic excitatory process that triggers a response after a short random delay. Even when the excitations are not observable their effects are seen in the regular intervals they impose between responses. The periodic mechanism proposed here is diagrammed in Fig. 1, which also illustrates our notation.

Figure 1. Stochastic latency mechanism yielding variable interresponse times with a periodic component. Excitations (not observable) come at regular intervals $\tau$, but are subject to random delays before producing responses. Heavy line is the time axis.

The symbols $E$ and $R$ denote excitation and response respectively. The time interval between two successive responses is a random variable and is called $t$. The analogous interval (or period) between excitations is a fixed (unknown) constant $\tau$.

Excitation and response will almost never coincide in time. The response will almost always be located between two excitations, and its distance from each excitation can be expressed as two location coordinates. The first of these, $r$, is the delay from a response to the next following excitation. The second, $s$, is the corresponding interval between a response and the excitation that immediately precedes it.

The basic random quantity in Fig. 1 is $s$, and our problem is to deduce the distribution of $t$ when the distribution of $s$ is known.
Accordingly, suppose that $s$ has an exponential distribution as would be the case if interresponse times were completely random. Let

$$f(s) = \lambda e^{-\lambda s}, \tag{1}$$

where $f(s)$ is the frequency function of $s$, and $\lambda$ is a positive constant, i.e., the time constant. Equation (1) then describes a very simple delay process in which the probability of a response during any short interval of time $\Delta s$ following excitation is constant and equal to $\lambda \Delta s$ (see Feller [4], p. 220). This defines what we mean by "completely" random; the instantaneous probability of response is independent of time.

We are in trouble immediately, for (1) is not strictly legitimate in view of the requirements just set down. This may be seen from the fact that the exponential is distributed on the interval $0 \le s \le \infty$ whereas the maximum value of $s$ in Fig. 1 is $\tau$. If $\lambda\tau$ is sufficiently large, no real trouble is encountered because, in this circumstance, the average delay between excitation and response is small compared with the period between excitations. Hence the response to $E_1$ is practically certain to occur before $E_2$ comes along, and the tail of the distribution of $s$ never really gets tangled with the next following excitation. When it happens that $\lambda\tau$ is not large, a simple adjustment of (1) is required in order to bound $s$ between zero and $\tau$, without changing its characterization as a completely random interval.
Distribution of Interresponse Times With Periodic Component

Our main results are given in (2) and (3), which describe the probability distribution of the mechanism outlined in the first section and pictured in Fig. 1. The density function describing the distribution of the time interval $t$ between two successive responses is

$$f(t) = \begin{cases} \dfrac{\lambda\nu}{1 - \nu}\,\sinh \lambda t, & t \le \tau, \\[2ex] \dfrac{1 + \nu}{2\nu}\,\lambda e^{-\lambda t}, & t \ge \tau, \end{cases} \tag{2}$$

in which $\nu$ is a constant given by $\nu = e^{-\lambda\tau}$. The distribution is evidently skewed and has a well-defined maximum over $\tau$.

Whenever $\lambda\tau$ happens to be large enough so that $\nu$ is negligibly small, the distribution of interresponse time in (2) simplifies to

$$f(t) = \frac{\lambda}{2}\, e^{-\lambda |t - \tau|}. \tag{3}$$

Equation (3) is the well-known Laplace density function [1]. It is symmetrical and sharply peaked over $\tau$, and describes the behavior of the latency mechanism when the intervals between successive responses are dominated by the periodic component. "Noise" introduced by the random component must then be small in comparison with the periodicity generated by the excitatory process.

The approximation in (3) is easily rationalized if $1/\lambda$ is considered as measuring the magnitude of the random component. In that case $1/\lambda\tau$ measures the size of the noisy perturbation relative to the period between excitations. Hence the parameter $\nu$ will go toward zero whenever the ratio $1/\lambda\tau$ gets small, i.e., whenever the random component is effectively small. It is not obvious that (2) approaches the Laplace distribution as $\nu$ disappears, but a brief study of (2) shows that this is in fact what happens.
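A short numerical check can make the two claims about the density concrete. The sketch below assumes that equation (2) has the form $f(t) = \frac{\lambda\nu}{1-\nu}\sinh\lambda t$ for $t \le \tau$ and $f(t) = \frac{1+\nu}{2\nu}\lambda e^{-\lambda t}$ for $t \ge \tau$, with $\nu = e^{-\lambda\tau}$, and that (3) is the Laplace density $\frac{\lambda}{2}e^{-\lambda|t-\tau|}$. The function names and parameter values are illustrative choices, not taken from the paper.

```python
from math import exp, sinh

def density(t, lam, tau):
    """Equation (2) as read here: sinh branch up to tau, exponential tail after."""
    nu = exp(-lam * tau)
    if t < 0:
        return 0.0
    if t <= tau:
        return lam * nu / (1.0 - nu) * sinh(lam * t)
    # Same as lam*(1+nu)/(2*nu)*exp(-lam*t), rewritten to avoid dividing by a tiny nu.
    return 0.5 * lam * (1.0 + nu) * exp(-lam * (t - tau))

def laplace(t, lam, tau):
    """Equation (3): Laplace density centred on the excitation period tau."""
    return 0.5 * lam * exp(-lam * abs(t - tau))

def integrate(f, a, b, steps=20000):
    """Midpoint rule; adequate for a smooth density with one kink at tau."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

lam, tau = 2.0, 1.0
total = integrate(lambda t: density(t, lam, tau), 0.0, tau + 40.0 / lam)
print(f"area under (2): {total:.6f}")

lam_big = 10.0                    # large lam*tau, so nu is negligible
worst_gap = max(abs(density(0.01 * i, lam_big, tau) - laplace(0.01 * i, lam_big, tau))
                for i in range(301))
print(f"largest gap between (2) and (3) on [0, 3]: {worst_gap:.2e}")
```

With $\lambda\tau = 10$ the gap between the two curves is of the order of $\lambda\nu/2$, which is why the Laplace form serves as a good approximation whenever $\nu$ is negligible.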
Proof of the Distribution*

We shall now show that (2) is the correct form of the distribution of interresponse times when responses are triggered by periodic excitations as shown in Fig. 1.

First of all, (1) must be adjusted to hold $s$ between zero and $\tau$. This is easily handled. We begin with an excitation and simply cycle the exponential distribution back to the origin as soon as it reaches $\tau$, letting the distribution continue to run down until it reaches $\tau$ again, and repeating the process ad infinitum. The ordinate corresponding to any point $s$ between zero and $\tau$ will then be given by

$$\sum_{k=0}^{\infty} \lambda e^{-\lambda (s + k\tau)}.$$

Consequently the position of the response in the interval between two excitations will be distributed as

$$f(s) = \frac{\lambda e^{-\lambda s}}{1 - \nu}, \qquad 0 \le s \le \tau. \tag{1a}$$

The distribution of $r$, the interval from the response to the next following excitation, is now determined, since, from Fig. 1, $r = \tau - s$. Substituting in (1a) yields

$$f(r) = \frac{\lambda e^{-\lambda (\tau - r)}}{1 - \nu}, \qquad 0 \le r \le \tau. \tag{4}$$

Evidently $r$ and $s$ are perfectly (and inversely) correlated in the same excitation period. On the other hand only one response can occur between excitations. Hence, when intervals between responses are analyzed, $r$ will belong to one excitation period and $s$ will belong to a later one, thus making $r$ and $s$ independent for determining $t$.

*The writer is indebted to a referee for suggesting several excellent ways to simplify the original proof.

It should be clear that $t$ is not just the sum of $r$ and $s$, although a cursory examination of Fig. 1 leaves that impression. The trouble with the impression is that several excitation periods may separate $R_1$ from $R_2$. In other words, responses are not forced by excitations. A new excitation may come along before the response is emitted. We have drawn Fig. 1 as though response $R_2$ fell into the excitation period following $R_1$ but a moment's reflection suggests that things might not happen so neatly. To deal with this nasty eventuality, we shall define $t$ as

$$t = r + k\tau + s, \tag{5}$$

where $r$ is taken as the time interval between $R_1$ and the next following excitation, $s$ is the analogous interval between $R_2$ and the excitation immediately preceding it, and $k$ is the number of excitation periods in which no response occurs, i.e. the number of empty excitation periods between $R_1$ and $R_2$.

Equation (5) is now a unique specification of $t$ in terms of quantities whose distributions are known as soon as $\lambda$ and $\tau$ are fixed. The distributions of $r$ and $s$ have already been specified in (4) and (1a). Our next step is to find the distribution of $k\tau$.

An interval beginning with an excitation and terminating in a response may span several excitations before the response occurs. This latency is denoted by $k\tau + s$, where $k$ takes on values 0, 1, 2, 3, etc. It is evident from the arguments leading up to (1a) that $k\tau + s$ has the exponential distribution out to infinity. Accordingly, the delay from an excitation to the first subsequent response can be resolved into two independent components: (i) the number of excitation periods passed, and (ii) the location of response $R_2$ in the period between the last two excitations. In view of the independence of $k$ and $s$, we can write

$$\lambda e^{-\lambda (k\tau + s)} = P(k\tau) \cdot \frac{\lambda e^{-\lambda s}}{1 - \nu},$$

where $P(k\tau)$ is the probability of some particular value of $k\tau$. We find that

$$P(k\tau) = \nu^{k}(1 - \nu). \tag{6}$$

In other words, the distribution of $k\tau$ is geometric with ordinates spaced out at successive multiples of $\tau$.

All three components of (5) are independent. Moreover, the variables $r$ and $s$ can form a unit that is the same for each value of $k$. Consequently, (5) can be amended to read

$$t = k\tau + y, \tag{5a}$$

where $r + s = y$, $0 \le y \le 2\tau$, and $k\tau$ has the geometric distribution given by (6).

The distribution of $y$ is obtained from the convolution of $r$ and $s$. After simplification we find that

$$f(y) = \begin{cases} c \sinh \lambda y, & 0 \le y \le \tau, \\ c \sinh \lambda (2\tau - y), & \tau \le y \le 2\tau, \end{cases} \tag{7}$$

where

$$c = \frac{\lambda\nu}{(1 - \nu)^{2}}.$$
The distribution of $t$ will depend on the number of excitations between the pair of responses that bound each interval. This follows since each change in $k$ will define a new component of the distribution of $t$. It will be convenient to describe each component separately by linking it to the number of excitations in the interval. The density function of the $k$th harmonic component of $f(t)$ will be indicated as $f_k(y)$, since each component fixes $k$, and $k$ and $y$ together determine $t$. Equations (5a) and (6) yield

$$f_k(y) = \nu^{k}(1 - \nu) f(y). \tag{8}$$

For example, $f_0(y)$ is the density function of interresponse times with a single excitation between each pair of responses. This component of $f(t)$ has $k$ equal to zero and is defined over the interval $0 \le t \le 2\tau$. The average interresponse time in the interval is $\tau$.

The first harmonic component $f_1(y)$ refers to interresponse times with just two excitations between each pair of responses. Hence $k$ is unity and the function $f_1(y)$ spans values of $t$ between $\tau$ and $3\tau$. The average interresponse time is $2\tau$. Higher harmonic components are defined in the same way.

The foregoing makes it evident that for values of $t \ge \tau$ the density has contributions from two harmonic components in each interval corresponding to the length of an excitation period. The pair of contributors will change as we proceed away from the origin in multiples of $\tau$, but every element of density in $f(t)$ after $t = \tau$ will turn out to have two components. Specifically,

$$f(t) = f_k(y) + f_{k+1}(y - \tau), \qquad \tau \le y \le 2\tau. \tag{9}$$

If the densities on the right-hand side of (9) are replaced by equivalent expressions determined from (8), it is easily shown that

$$f(t) = \frac{1 + \nu}{2\nu}\, \lambda e^{-\lambda (k\tau + y)}. \tag{9a}$$

Now recall that $k\tau + y$ is simply another way of writing $t$, and it is apparent that the harmonic components of $f(t)$ interlace themselves in a way that produces a surprisingly simple expression for the distribution of $t$:

$$f(t) = \begin{cases} \dfrac{\lambda\nu}{1 - \nu}\,\sinh \lambda t, & t \le \tau, \\[2ex] \dfrac{1 + \nu}{2\nu}\,\lambda e^{-\lambda t}, & t \ge \tau. \end{cases}$$

This is (2) and the proof is complete.
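The mechanism itself is simple enough to simulate, which gives an independent check on the derived distribution. The sketch below makes one modelling assumption explicit: after each response the next excitation starts an exponential delay with rate $\lambda$, and excitations that arrive while that delay is running are ignored, which corresponds to the empty periods counted by $k$ in equation (5). The comparison values are consequences worked out here rather than figures quoted from the paper: integrating (2) up to $\tau$ gives $P(t \le \tau) = (1 - \nu)/2$, and taking expectations in (5) gives a mean of $\tau/(1 - \nu)$. The function name and parameters are hypothetical.

```python
import math
import random

def simulate_intervals(lam, tau, n_intervals, seed=1):
    """Interresponse times for periodic excitation with exponential delays."""
    rng = random.Random(seed)
    response = 0.0
    intervals = []
    for _ in range(n_intervals):
        # Next excitation strictly after the last response (period tau).
        excitation = (math.floor(response / tau) + 1) * tau
        # Exponential delay; excitations arriving in the meantime are ignored.
        new_response = excitation + rng.expovariate(lam)
        intervals.append(new_response - response)
        response = new_response
    return intervals

lam, tau = 2.0, 1.0
nu = math.exp(-lam * tau)
ts = simulate_intervals(lam, tau, 200_000)

share_short = sum(t <= tau for t in ts) / len(ts)
mean_t = sum(ts) / len(ts)
print(f"P(t <= tau): simulated {share_short:.3f}, from (2) {(1 - nu) / 2:.3f}")
print(f"mean t:      simulated {mean_t:.3f}, tau/(1 - nu) = {tau / (1 - nu):.3f}")
```

Changing `lam` while holding `tau` fixed moves the simulated histogram between the sharply peaked Laplace shape (large $\lambda\tau$) and the nearly exponential shape (small $\lambda\tau$).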
110
112
This
zero.
and
we
are
can
necessary
of
Refer
lapses
geometric distribution in (6) colof r and s,
t in Fig. 1 will be just preciselythe sum
0. Hence
of
excitation
Two
periods.
responses
ignore the possibihty empty
there must
also be two independent occurrences
to define t. Hence
Call them
R2
respectively.
S2 corresponding to Ri and
Si and
follows
when
s.
and
Fig. 1
and
S2
we
can
m.g.f.of
given in Table
for M ,{B),
we
the
write
the
as
"
S2
difference
Si
"
exponential
of two
generatingfunction
its moment
as
MXe)MX-d).
exponential distribution
1 for the variable
Si,
is distributed
"
that
observe
M,_.(^)
The
the
Consequently,
variables
the fact that
from
to
now
PSYCHOLOGY
MATHEMATICAL
IN
READINGS
kr +
s.
is,of course, very familiar and is

Substitutingthis exponentialm.g.f.
obtain
Mt-r{e)-'
(1 -Â)(i
^/x)
ieW~,
is,as we have already shown, the m.g.f.of the Laplace distribution.

Evidently the Laplace density function (3) is in fact simply the distribution of the difference between two exponential variables. This simple point is ignored in most texts on statistics because, perhaps, no one imagines why anyone else would be interested. Our argument establishes a very good reason for being interested.
provides a characterization
periodicexcitation.
under
of the
between
also behaves
latency mechanism
excitations
has
of the mechanism
another
gets
a
very
small.
in
error
the
Laplace
timing
device
in
We
an
now
fairlyslow response,
rapid succession.
in
interestingway
suppose
the
that
but is bombarded
The
Vim
fit)
the
as
that
is
period
delay part
by
excitations
restriction is achieved
following one
by fixing the delay time constant, $\lambda$, while allowing $\tau$
zero.
Equation (2) for $f(t)$ immediately leads to:
(14)
bution,
distri-
Excitation
Continuous
The
hence
difference,and
The
interested.
to
bolically
sym-
approach
Xe"".
T-.0
0 and t
t
t
portion of f{t)between
is
the
constant
approaching
v
must
disappear as $\tau$ approaches zero. Meanwhile,
of (2)
the
of
falls
out
the
Umit
for
portion
right
f{t)
unity. Consequently
defined for $t \geq \tau$. The same exponential limit can be obtained by studying the behavior of the m.g.f. for $f(t)$ in (12) as $\tau$ approaches zero with $\lambda$ fixed, or by analyzing the variance of $f(t)$ under these same limiting conditions. The result is almost obvious.
12
INTERRESPONSE
General
distribution
is
curve
of interresponse times
(standard
units)
with
plotof equation (2) in the text, with
and
arbitrary random
X
and
1. Dashed
periodic parts.
lines
are
harmonic
of the distribution.
components
given in Table 1. The

harmonics
the variance
(i.e.,
Variances
TIME
Figure
The
113
MCGILL
J.
WILLIAM
formulas
are
establish
that
the component
vanishes, and the

y) disappears as
harmonics.
b
etween
entire variance
becomes
differences
This implies that the probability distribution of $f(t)$ must congeal around its harmonic peaks (see Fig. 2) when $\tau$ goes to zero, and that each peak then contributes a "line" of density to the resulting exponential distribution.
be contributed
that no
by
delay can
Intuitively,the limit in (14) means
within
of
in the
concentrated
the
is
is
latency
between
and
response
the
next
excitation.
in Fig. 1 vanishes
r
instantly available. Hence
consumed
by the latency between excitation and
assumed
to be
and
the
response,
That
excitation
entire
which
interval
we
have
exponential.
Applications
Fig. 3 presents a frequency distribution compiled from a long series of action potentials recorded on a single fiber of the optic nerve of Limulus. The narrow distribution demonstrates that the data are periodic and the periodicity seems to originate in the refractory period of the nerve fiber. The mechanism, however, is not well understood. In this particular case, the regular sequence of action potentials was achieved by dissecting out a
0+20+40
-20
-40
FROM
DEVIATION
PERIOD
Figure
nerve
when
continuouslyto
the
milliseconds.
291
of 303
the eye
intervals
Measured
is
fiber of the
the
nerve
fiber
produced
into the control gate of
pairs of responses
were
deviations
Laplace
shining a
beam
of discrete
barrage
was
out
a
onto
1000
of
attached.
was
magnetic tape.
digitalcounter, and
read
from
linear
the
drift. Smooth
curve
distribution.
on
the gate
passed through
are
the fiber
to which
recorded
amplified and
and
optic nerve,
ommaiidium,
the
interresponsetimes observed in a single fiber of the optic

fiber adapted
illuminated
was
by a steady light.The nerve
linear
increase
in
slow
in
period from 261 to
illumination,
resulting a
Frequency distribution
of limulus
( milliseconds
cps
Later
time
permanent
sine
wave
light on
the
receptor, i.e.,
steady illumination,
Under
responses
on,
which
the
tape
was
intervals between
record. The
then
were
played
alternate
timing signal
generated by
calibrated
tuning fork oscillator. Over-all accuracy of the measurements is of the order of ±2 milliseconds, due to variations in the speed of the tape recorder.* The nerve fiber adapted continuously to steady illumination, resulting in a slow increase in the basic period from about 261 milliseconds to 291 milliseconds. This change was isolated by averaging the data in blocks of 25 intervals, and fitting a line to the averages, which fortunately were quite linear. Measured
from
is
between
this
responses
line,yieldingthe frequency
Laplace density function.
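The drift-removal step described above (average the intervals in blocks of 25, fit a line to the block averages, then measure each interval from that line) can be sketched as follows. This is an illustration under stated assumptions: the text does not say how the line was fitted, so ordinary least squares is assumed here, and the function names and the synthetic 261-to-291 millisecond series are hypothetical.

```python
def block_means(intervals, block_size=25):
    """Average successive non-overlapping blocks.

    Returns (center index, block mean) pairs, one per complete block."""
    points = []
    for start in range(0, len(intervals) - block_size + 1, block_size):
        block = intervals[start:start + block_size]
        center = start + (block_size - 1) / 2.0
        points.append((center, sum(block) / block_size))
    return points


def fit_line(points):
    """Least-squares slope and intercept through (x, y) points
    (assumed fitting method; the source only says a line was fitted)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def deviations_from_drift(intervals, block_size=25):
    """Express each interval as a deviation from the fitted drift line."""
    slope, intercept = fit_line(block_means(intervals, block_size))
    return [t - (intercept + slope * i) for i, t in enumerate(intervals)]


# Synthetic check: a purely linear drift from 261 to 291 ms leaves no residual.
drifting = [261.0 + 30.0 * i / 999 for i in range(1000)]
residuals = deviations_from_drift(drifting)
print(max(abs(r) for r in residuals))
```

With real data the residuals would carry the interval-to-interval variability, which is what the frequency distribution of deviations summarizes.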
*The
and
intervals
the writer
with
distribution
made
the assistance
preparation and recording were
analyzed by
were
in
Fig. 3.
by C. G. Mueller.
of Michael
into deviations
converted
S.
The
The
data
Kennedy.
fitted
were
curve
recovered
WILLIAM
normal
A
a
data
taken
in which
last
in
payoff
Fig.
normal
was
dashed
on
the
the
distribution
normal
Fig.
and
obtained
a
main
and
interrespbnsetime
in
was
by
approximation
class interval
immediately
effect. In
evident,and
white
to
as
is taken
by mea.suring
rat.
Hill's data
The
data
schedule
from
the
is shown
used for this purpose

because
after reinforcement
believed to be unare
related
the
event
any
not
wêre
leptokurticcharacter
of Hill's
it suggests that the
long regimen of training(184 hours)

Hill's rat into a fairly accurate
problem made
discrimination
time
of
high shoulders
frequency distribution in the background. This

and variance
by matching mean
sponses
to the data. Re-
fitted
of responses
to
is
distribution
bar-pressesmade
The
response.
curve
have
day of conditioning with a reinforcement

contingent on delaying at least 21 seconds
in the 0-3 second

bursts
the
would
93rd
was
by the
data
illustration is provided
successive
the
on
previous
data
defines
1 15
MCGILL
same
reported by Hill [7].The

between
intervals
were
fact then
leptokurtic.Another
being
from
fitted to the
curve
flat top. This
J.
Laplace-type clock. We are led naturally to conjecture about how the rat constructs $\tau$. Does it happen internally via some type of neurological clock or externally via a stereotyped sequence of movements?
Figure 4. Distribution of interresponse times produced by a bar-pressing rat after a long period of conditioning on a schedule in which reinforcement is contingent on delaying at least 21 seconds from the last previous response. Dotted curve is the best-fitting normal approximation and demonstrates the peaking of the empirical distribution. Abscissa: interresponse time (seconds). (Data from Hill.)
Skewed distributions of interresponse times with the appearance of (2) (see Fig. 2) are found often in the literature, usually in connection with high speed responding. Fig. 5 is taken from Brandauer [2] who studied response sequences generated by a pigeon pecking at a small illuminated target. Reinforcement was controlled by a high speed flip-flop and the bird was reinforced whenever a peck happened to coincide in time with a particular one of the two states of the flip-flop. Consequently, the probability of reinforcement
determined
was
in that
the net
state, and
per
1000
result
of the
tail which
In this
interval
in turn
it is
case
rhythmic
use
faster than
even
that every
between
reflects
likelythat
to
second.
This
flip-flop
spent
same
(low)
the
generate high
distribution
Intervals
of 1000
excitations
follows because
the
by
were
the average
exponential
excitation.
varying degrees
the period t is constructed
by a pre-programmed
that humans
head something like the mechanism
rates
of
20
(second!
TIME
.23 seconds
"
10"^
interresponsetimes
longer than
follow
tapping.
Figure
rate.
that
of failure to
INTERRESPONSE
high
the
had
is increased
responses
10
Frequency
time
response
conclude
would
we
5.3 times
oscillation of the
in order
was
of
The
reinforcement.
periodicexcitatorymechanism,
coming
length
proportion
rate of 5.3
pigeon generated an average
second
mately
approxiduring the run shown in Fig. 5 which covers
created
is
in
5
the
in
fact
If
by a
Fig.
sharp peak
responses.
probabilityof
responses
the
by
not
recorded
shown.
from
pigeon pecking
(Data from
Brandauer).
at
In a recent paper, Hunt and Kuno [9] present several distributions of interresponse times recorded during spontaneous activity of single fibers in the spinal cord of the cat. The data run the gamut from the Laplace to the exponential, including several examples of what appears to be our skewed distribution (Fig. 2). The effect is exactly what might be expected, if the same general response system were subjected to varying rates of periodic excitation.
Discussion
be
It would
fiber
activity and
times
When
by
study
be
can
neurons
give
organization may
provide
constructs
have
r, we
of the mechanism
The
into
as
delineation
of the
have
to
device
and
the
excitation
for
summating
For
one
in
the
is
the
response
are
for
or
closelyrelated to
required,and
number
[3].These
it is not
noise
how
earlier,
kind
with
in
an
thus
the animal
conception
our
of noise
our
presents
insights
of the
general
higher
whether
the
barely scratches the

has indistinguishable
there is
new
is
simply
parallelchanneling present
distributions
show
response
replaces
by
blocked
suggestions
themselves.
of interresponse
response
inter-
clusteringsof
period, and
components.
entirelyclear yet what
excitation
this is clear other
distributions
the
[1].
before
trigger,or
of harmonic
centers
interestingproblems
very
mechanism
appears
Once
similar
detecting periodicityin
in this paper
multiples of a fundamental
(2) which also has harmonic
else is
system looks
organization,and
its parameters
that
already working.
Hterature
at
out
difference
no
excitations
there
times
so, the
compatible
for
excitation
new
reactivates
that
example,
times
estimate
to
considered
it makes
earlier one,
old
did
what
suggests possible ways
It turns
possibilities.
the
that
we
is
indicates
mechanism
limited
the noise.
by
mechanism
excitations. Whenever
an
that
way
in the
of
attempting
latency
surface x"fthe
to
dictated
ask, as
we
single
interresponse
operating in overt
complicated systems of
Fig.
of the
nature
instance, the Laplace distribution
noise. For
to
coding
face, and
The
to find
and
affords
simple periodic mechanisms
fibers. Ejiowledge
coding in single nerve
information
form
responses,
than
the
to
of
neither
simple jobs. Even
very
it. When
study
to
way
clue
distributions
like
more
no
do
to
apart than
the other.
mechanisms
means
organized
between
organized than
stochastic
responding, it probably
the
further
applicable to both,
seem
of the time
better
find
we
responding. Yet
overt
complicated or
more
find levels of behavior
to
in this paper
presented
afforded
view
hard
that
hence
But
seem
to
something
something
else is.
REFERENCES
[1] Arley, N. and Buch, K. R. Introduction to the theory of probability and statistics. New York: Wiley, 1950.
118
[2]
of
rate
[3]
[5]
the
F.,
Bronk,
1946,
C.
Ferster,
and
H.
Hartline,
1940,
Amer.,
[7]
Vol.
[8]
[9]
Hoel,
Hunt,
K.
G.
P.
[11]
Mood,
[12]
Widder,
[13]
Wilks,
Manuscript
Revised
C.
Nat.
and
S.
received
manuscript
N.
Ann.
nerve.
Y.
of
ed.)
New
York:
Wiley,
1950.
York:
New
reinforcement.
Appleton-
fibers
the
of
the
visual
J.
pathway.
Soc.
opt.
Differential
mathematical
to
Physiol,
Mathematical
the
low
responding.
rate
N.
River,
Y.,
P.
theory
36,
calculus.
and
New
York:
New
evoked
York:
Wiley,
of
response
of
statistics.
some
among
McGraw-Hill,
E.
New
1954.
spinal
1950.
of
measures
conditioning.
123-130.
statistics.
8/28/61
ed.)
discharge
364-384.
relationships
1950,
12/14/59
received
147,
1959,
to
Sci.,
Pearl
Co.,
(2nd
statistics.
Background
M.
Theoretical
G.
of
reinforcement
Cyanamid
American
383-415.
Kuno,
Advanced
D.
in
behavior.
pp.
Acad.
S.
Schedules
messages
nerve
Introduction
Mueller,
Proc.
(1st
applications.
its
F.
Laboratories,
/.
A.
B.
Introduction
C.
C.
of
239-247.
30,
interneurones.
[10]
The
11/10/59,
4,
excitation
response
1958.
1957.
Lederle
of
Report
and
theory
Operant
T.
R.
Hill,
Chemical
the
upon
University,
457-485.
Skinner,
Century-Crofts,
[6]
reinforcement
Columbia
thesis.
M.
Larrabee,
and
D.,
47,
Probability
W.
Feller,
doctoral
Unpublished
of
probabilities
uniform
of
effects
The
pigeon.
Sci.,
Brink,
Acad.
[4]
C.
Brandauer,
York:
Princeton:
Prentice-Hall,
Princeton
1947.
Univ.
Press,
1943.
Figure 1. Schematic diagram of equipment (sound-treated room), with the equivalent circuit used in the computation of the size of the increment in intensity.

Figure 2. Nomogram to convert values of $\Delta P/P$ to $\Delta P$ when $P$ is known.
GEORGE
Resistance
Decade
The
attenuators
Rq and
Box.
used
were
schematic
keep
to
Rj^, surroundingthe shunt
into the
in
shown
Fig. 1.
whole
the
of the
equipment
values
of
constant
resistances,Ri and
system
which
can
this circuit,the
For
121
MILLER
diagram
of the increment
computation
resistance, 7?2- The
A,
source
by the insertion
produced
representedby
size of
AEl
the
increment
an
in
Fig. 1.
impedance,
load
R2, since these values
is
be
is shown
and
must
enter
of the variable
equivalentcircuit,also
voltageÊl is given by
in
R^R^Rl
'
RiiRoiRi
El
If
the
does
system
introduce
not
R2+
R3)
of its sensation
for the
level
noise.
the number
"
R2)]
after the
increments
amplitude distortion
be taken
pressure, expressed in decibels, can
in sound
produced, the increment
$20 \log_{10}(1 + \Delta E_L/E_L)$.
Throughout the followingdiscussion
terms
Rl(Ri
the
of
intensity
of decibels
above
the
noise
are
as
will be stated
the listener's absolute
in
old
thresh-
If the
sound-pressurelevel of the noise is taken to be the level

the
generatedby a moving-coilearphone (Permoflux PDR-10) when the voltageacross
is
the
the
for
soidal
sinusame
as
a
earphone (measured by a thermocouple)
voltagerequired
to generate the givensound
in
of
volume
6
wave
then
(1000 cycles)
a
cc,
pressure
for the noise correspondsto a sound
the absolute
threshold
of
approximately
pressure
10 db
level
pressure
sensation
150
by
level.
and
7000
c.
Once
p. s.
sensation
the
discussed
into
noise
and
by
value
passes
to
of the
spectrum
into
converted
the
adding
uniform
was
relatively
the
dynes/cm^will
be
can
10 db
noise
value
sound-
given
for
the
db) between
("5
transduced
the
by
Hawkins.^
relative
of the increment
line which
straight
scale,and
level
in detail
sound-pressurelevel
value
of
and
measurement
been
decibels
convenience.
of
spectrum
The
the absolute
convertingthe
the
simple procedure
has
the
known,
hand
the
The
PDR-10
earphone
are
dyne/cm^. Thus
0.0002
re
size of the increment
be
find the
through
in decibels
Those
computed.
of Fig.2
nomogram
can
value
of the sound
interested
a
in
considerable
of A/ in decibels
the left-
on
the middle
scale,will intersect
through a
pressure on
right-hand scale at the appropriate value of $\Delta P$ in dyne/cm². When the acoustic stimulus is a plane progressive sound wave, its intensity in watts/cm² is proportional to the square of the pressure: $I = kp^2$. The peak
amplitudesin the
of
wave
white
noise
expect, therefore, that the size of the justnoticeable
to
function
of the
aspect of the
distribution
stimulus,
passed through
increments
altered
were
by
uniform
the
square-wave
in
amplitudesin the
peak
experiment .was
second
difference
In
wave.
conducted.
generator (Hewlett Packard,
square-wave
introduced.
level. The
randomly
of
and
The
order
The
Model
wave
resulting
frequency.
might
might
to
vary
evaluate
noise
210-A)
be described
as
before
are
"squared oflF"
square-
wave
as
this
voltagewas
qualityof the noise

subjective
spectrum
generator, but the peak amplitudes are
form
It is reasonable
constant.
not
are
the
not
at
modulated
J. E. Hawkins, "The masking of pure tones and of speech by white noise," in a report entitled The masking of signals by noise, OSRD Report No. 5387 (Psycho-Acoustic Laboratory, Harvard University, October 1, 1945) (available through the Office of Technical Services, U.S. Department of Commerce, Washington, D.C.).
122
READINGS
The
as
PSYCHOLOGY
differential sensitivity
procedure for determining
was
experimental
and
of
increment.
an
noise
the
The
observer, seated
monaurally through
listener heard
continuous
the
percentage
heard
which
to
(1.5 sec.
Four
Decibels
Two
Time,
of
5 to
points on
threshold
two
obtained
was
series
as
Noise.
Listeners
Could
Function
of Sensation
Hear
used
were
to
used
The
A
periodically.
presented,
to determine
determine
each
Increments
50
each
Percent
in
of the
Level.
psychometricfunction, and from

Thus
500
by linear interpolation.
listeners
experienced
listened
room,
added
were
the
impendingpresentation
sound-treated
was
same
was
for Intensity
of
Sensitivity
Which
intervals of 4.5 sec.) was
at
such
TABLE
DiflFerential
in
difference
only
indicate the
increment
an
duration
tabulated.
was
alone
The
the
high quality,
dynamic earphone (PDR-10).
noise,
series of 25 identical increments

and
MATHEMATICAL
Volkmann.^
employed by Stevens, Morgan,
of a signallight
which
theysometimes used to
that
omission
to
IN
this function
to 800
the
differential
each
of
at the
16
judgmentsby
differential
threshold
different intensities.
Results
The
time
be
are
noted
increments
in decibels
presentedin Table
as
which
a
the
function
two
listeners
of the
could
sensation
hear
50
level of the
percent
noise.
of the
It will be noted that the differential sensitivity for "square-wave noise" is not significantly greater than that for random noise. Apparently the fluctuations in the peak amplitude of the wave do not influence the size of the just noticeable increment. The response of the ear is probably too sluggish to follow these brief fluctuations. And since the difference between the two wave forms is essentially a matter of the phase relations
*
that
S. S.
discrimination
the
differential
Stevens, C.
of loudness
T.
Morgan,
and
and
Am.
pitch.
J.
J.
Volkmann,
Theory of
Psychol,1941, 54,
the neural
315-335.
quantum
in the
Sensation-level
Figure
in decibels
of the noise
of
purposes
the
among
components,
differential
eflfecton
The
data
comparison.
we
value
0.048
or
energy,
the
to
is
in
curve
of
Fig.3. The
obtained
by
For
pressure.
Karlin^
with
of
The
range
is indicated
of the
intensity
presentedfor
are
have
important
no
group
At
which
for sound
of 0.099
is proportional
increment
the
portion of
horizontal
the
threshold,
highestintensities
the
Weber-fraction
over
by
the absolute
above
more
of intensities agree
this range
over
or
constant.
approximately
corresponds to
stimulation
values
purposes
Knudsen^
db, which
0.41
for sound
level
tones
phase relations
that these
that, for intensities 30 db
indicate
is about
function
sensitivity.
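The figures quoted in this passage (a differential threshold of 0.41 db, a Weber fraction of 0.099 for sound energy and 0.048 for sound pressure) are related by the usual decibel definitions, 10 log10 for energy ratios and 20 log10 for pressure ratios. A small sketch of the conversion follows; the function names are hypothetical and chosen only for illustration.

```python
import math


def db_to_energy_fraction(db):
    """Relative energy increment (delta I / I) for a step of `db` decibels."""
    return 10 ** (db / 10) - 1


def db_to_pressure_fraction(db):
    """Relative pressure increment (delta P / P) for a step of `db` decibels."""
    return 10 ** (db / 20) - 1


def pressure_fraction_to_db(fraction):
    """Decibel size of a relative pressure increment: 20 log10(1 + dP/P)."""
    return 20 * math.log10(1 + fraction)


step_db = 0.41  # value quoted in the text for levels well above threshold
print(round(db_to_energy_fraction(step_db), 3))
print(round(db_to_pressure_fraction(step_db), 3))
print(round(pressure_fraction_to_db(db_to_pressure_fraction(step_db)), 2))
```

Because intensity is proportional to the square of pressure, the energy fraction equals $(1 + \Delta P/P)^2 - 1$, which is why the two Weber fractions differ by roughly a factor of two for small increments.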
the relative differential threshold

the
are
solid line representsEq. (2).
The
conclude
may
100
plottedas a
hearing. Data for
of
threshold
the
above
90
50 percent of the time
heard
in intensity
Increments
70
in decibels
quitewell with
the
solid
the values
of 50 listeners.
comparison,Fig.3
includes
obtained
data
from
differ
by
Riesz^
those
and
obtained
by
for
markedly
intensities.
at low
Possibly
quite different, especially
introduced
the
data represent sensitivity
Knudsen's
to the "noise"
by
abrupt onset of
of
of his use
because
intensities
his tones, or possiblyRiesz's data at low
are
suspect
for
obtained
in intensity.
Data
beats to produce increments
by Stevens and Volkmann^
with
of
four
intensities
to
more
tone
seem
listener at
a
closely
a single
1000-cycle
agree
mine
the present results than with Riesz's, but their data are not complete enough to deter-
noise, but
800
Riesz's
function.
c.p.s. which
these
studies
R.
R.
results do
Knudsen's
for tones.
data
are
King, and
favorablywith
Churcher,
compare
indicate
not
that
the
Davies^
the
for
with
reported data
Riesz.
Taken
intensityis
of the
of
function
limen
difference
have
of the
sensitivity
Riesz, Differential intensity
ear
for pure
tone
of
together,all
order
same
tones,
of
P/iys.Rev.,
1928, 31, 867-875.

"
V. O. Knudsen, The sensibility of the ear to small differences in intensity and frequency, Phys. Rev., 1923, 21, 84-103.
S. S. Stevens and J. Volkmann, The quantum of sensory discrimination. Science, 1940, 92, 583-585.
B. G. Churcher, A. J. King, and H. Davies, The minimum perceptible change of intensity of a pure tone, Phil. Mag., 1934, 18, 927-939.
124
READINGS
magnitude for
noise
it is for tones, at least at the
as
for
intensities the discrimination
lower
PSYCHOLOGY
MATHEMATICAL
IN
higher levels
stimulus
noise
be
may
At the
intensity.^
of
somewhat
acute
more
for tones.
than
Implications
for a Quanta! Theoryof Discrimination
is not
units
neural
Only
It is
new.
neural
limen
difference
the
evidence
has
discreteness
to
of discrete
selves.
sensory cells themthe
support
discrimination
mediatinga
processes
of the
obtained
been
activation
the
depends upon
the
suggestedby
however,
recently,
basic
the
that
that
notion
The
of
are
assumption
all-or-none
an
character.
evidence
principal
The
Volkmann^"
Stevens, Morgan, and

We
sensory
excites
these
number
quanta
This
bring into
the
activity
added
left-over stimulation
fluctuation
evident
that
the
over
When
the
increments
added
are
neural
to
in effect, a
The
constant
error
of
that
"
with
the
increment,
to
expect?
an
individual
difference
of
If [the over-all
quantum,
it is
equally
occur
is added
taneously
instanthis fraction
itself.
the listener
stimulus,however,
stimulus
from
the
changes
In order to
sensitivity.
ignoreall one-quantum changes.
in his
conditions
these
to
fraction of the time, and
of fluctuations
forced
A/,
much
How
to
changesin the
must
activate
at least two
and reported.
Thus,
perceived
function.
the
to
psychometric
be described
line of reasoning
can
will be
is added
quantum
one
psychometricfunction
it will
little to spare
additional
some
excite
surplusstimulation
continuous
distinguish
one-quantum
units in order
which
stimulus
that, if the increment
the size of the increment
judgments,the listener is
under
Consequently,a stimulus increment
additional
we
of the
it follows
reliable
make
are
the size of
to
all values
constantlyoccurringbecause
are
with
a certain
perceived
stimulus, it will be
finds it difficult to
which
of time
considerations
The
.
so
perceptionof
in the
units.
for discrimination.
surplus excitation
to
directly
proportional
in
contribute,along
needed
to the
is
course
these
From
.
will
quantum
or
psychometricfunction.
the following
way :
surplusinsufficient to
is largecompared
sensitivity]
in
often.
ordinarilydo
will
small
the
involved
initially
distinct
functionally
quanta
leave
shape of
the
structures
into
surplusstimulation
quantum.
this
of
and
from
present the argument
neural
the
divided
are
certain
excite
that
assume
continuum
derives
The psychometric function predicted by this line of reasoning can be described in the following way. When increments added to a steady sound are less than some value $\Delta I_Q$, they are never reported, and over the range of increments from 0 to $\Delta I_Q$ the psychometric function remains at 0 percent. Between $\Delta I_Q$ and $2\Delta I_Q$ the proportion of the increments reported varies directly with the size of the increment, and reaches 100 percent at $2\Delta I_Q$. Such a function is illustrated by the solid line of Fig. 4. It will be noted that the difference which is reported 50 percent of the time is equivalent to 1.5 times the quantal increment. If we take this value as defining a unit increment in the stimulus, all the psychometric functions obtained for the two listeners
in the
can
When
into
be combined
scales
intensity
coincide
^
at the
Of
the
50
"
stimulus
singlefunction.
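The rectilinear psychometric function predicted by the quantal argument (0 percent up to one quantal increment, rising linearly to 100 percent at two quantal increments) can be written down directly. The sketch below is illustrative only; the function names are hypothetical, and the unit quantum in the usage lines is an arbitrary choice.

```python
def quantal_proportion(increment, quantum):
    """Predicted proportion of increments reported under the quantal
    hypothesis: 0 below one quantum, 1 above two quanta, linear between."""
    if increment <= quantum:
        return 0.0
    if increment >= 2 * quantum:
        return 1.0
    return (increment - quantum) / quantum


def fifty_percent_increment(quantum):
    """Increment reported half the time: 1.5 quantal increments."""
    return 1.5 * quantum


def quantum_from_threshold(increment_50):
    """Invert the relation: the quantum is two-thirds of the 50 percent
    increment (the 0.667 factor that appears in the symbol list)."""
    return increment_50 / 1.5


q = 1.0  # arbitrary unit quantum for illustration
for x in (0.5, 1.0, 1.25, 1.5, 1.75, 2.0, 2.5):
    print(x, quantal_proportion(x, q))
```

Rescaling each listener's increments by their own 50 percent value places every such function on a common axis, which is the adjustment the text describes for combining the individual functions.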
which
against
the functions
increments
In other
are
percent point.In Fig.4
the
See reference 4, p. 317.
Soc.
we
can
adjustthe
to make
Am., 1941, 12,
with
disagreesstrikingly
and
R. M.
517-525.
individual
all the functions
size of the relative increment
investigations,
only Dimmick's
higherintensities. F. L. Dimmick
in audition,/, acoust.
words,
plottedin order
modern
reportedhere for the

limen
the
Olson, The
in sound
the values
ence
intensive differ-
2Q
IQ
AP/P adjusted to 1.5Q
psychometricfunctions
32
of the time
to
this value.
pressure, AP/P, has been

the
time
is
Each
the
classical
difference
attributable
to
Is there
is, for
white
means
that
and
value
quantal increment.
quantal function was
of
described
by the phi-function
the
dashed
that
by
an
level
any
noise
the
is
amount
the
of the
the cumulative
When
chance.
time
increment
the
be
in the
The
is
hypothesis.Any
produce
the
S-shaped normal
It should
one
of the
"
function:
467.
source
G. A.
Miller and
Implicationsfor
the
W.
a
trolled
con-
reasoning
statistical
surprisingindeed
to
of the
nature
noise
obscure
in view
And
if the
of the fluctuating
rigorousexperimental
situation
predictedby
the
the
demonstrates
the
step-wiseresults
quantal
and
to
probability
integral.
shape of the psychometricfunction is only
the slope
quantalargument. According to the hypothesis,
be noted, however,
of
implications
If this
are
are
probable value,
merely the most
will depart from
this probable
increment
tends
variability
which
variables
quantalhypothesisshould
the listening
situation.
experiment? Certainlythere
fulfilled. This
requirementsof the quantal hypothesiswere
functions
in obtaining the rectilinear
practicaldifficulty
of
variables
these
the
sufficient to affect the discrimination.
stimulus, it would
function
probability
into
variability
of randomness.
of the
portionof the
in this
(the normal
gamma
is revealed."
Fig.4 from
of random
paradigm
value
pointsin
of randomness
source
calculated
certain
of
obvious
percent of
obtained
not
of small, indeterminate
according to
introduction
the
50
line.
applicationof
number
combine
the deviations
heard
was
the
assumes
which
which
eliminated,the step-wise,
"quantal" relation
or
is correct, then
be
that the increment
for the
argument
limen
and
independent,
the characteristic
that
experiment.The data are better

indicated
probabilityintegral)
by
to
in
adjustedso
1.5 times
plottedas
Figure4 shows
The
singlegraph. Values of APjP heard 50 percent

the datum
pointson each function are plottedrelative
point represents 100 judgments.
combined
designatedas 1.5Q, and
are
3Q
j.n.d.
Figure
The
R.
that the
Garner,
the psychometric
on
presentation
1944, 57, 451discrimination.Am. J. Psychol.,
Effect
quantaltheory of
of random
of the psychometric function is determined by the size of the difference limen for all values of stimulus-intensity. The present data accord with this second prediction. The standard deviations of the probability integrals which describe the data are approximately one-third the means (or 0.5 $\Delta I_Q$) for all the thresholds measured for both subjects. This invariance in the slope of the function is necessary but not sufficient evidence for a neural quantum, and it makes possible the representation of the results in the form shown in Fig. 4.
Symbolic Representation of the Data

In order to represent the experimental results in symbolic form, the following symbols will be used:

b = numerical constant = 1.333,
c = numerical constant = 0.066 = $\Delta I_Q/I$ when $I \gg I_0$,
DL = difference limen (just noticeable difference) expressed in decibels,
f = frequency in cycles per second,
I = sound intensity (energy flow),
I~ = sound intensity per cycle,
I_0 = sound intensity which is just audible in quiet,
I_m = sound intensity which is just masked in noise,
ΔI_Q = quantal increment in sound intensity = 0.667 ΔI_50,
ΔI_50 = increment in sound intensity heard 50 percent of the time,
L = loudness in sones,
M = masking in decibels,
N_Q = number of quantal increments above threshold,
R = signal-to-noise ratio per cycle at any frequency,
Z = effective level of noise at any frequency.
An
empirical
equation
where
the
variable
time
we
A/g^
"
the solid curve
in
the
Fig.3
are
"Weber's
Law,"
a
1-
German
181.
the
H.
Helmholtz,
ed..Vol.
II. The
be
can
developed from
the
(1)
I"
which
increment
have
to
be heard
can
101ogio[l+
justnoticeable
know
the
only
computations
The
obtained.
obtain
that
which
of the law
Treatise
at
fixed
50
percent
and
of the
on
sensations
fit of this
that
the
values
size of
it is added.
at
to the form
low
in decibels
I and
through,the
curve
to
of the
in
expressed
as
/q and
function
not
their
values indicated
the data
is
by
good enough
function.
(1) is
to
equivalent
justnoticeable
Differential
intensities,and
(2)
1.5b(IJI)].
ratio between
high intensities Eq.
which
Law
1.5c
increment
carried
smoothed
states
to
proportionalto the intensity
departs from Weber's
modification
I "
is assumed
stimulus-energy
are
of Eq. (2) to
the use
justify
It is interesting
to note
known
Table
write
can
level,althoughwe
When
in
bl"
$10 \log_{10}(1 + \Delta I_{50}/I)$
values.
absolute
data
to compute the
(2) it is possible
of sensation
to
Since
component.
the
in the
quantalincrement
DL
From
M^
equals1.5A/q,
"
frequency.
adequate descriptionof
the
well-
difference
is
teristically
characsensitivity
Fechner
Eq. (1).^^The
long ago suggested

essential feature
physiological
optics(translatedby P. C. Southall from
of vision, 1911),Optical
Societyof America, 1924, pp.
3rd
172-
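The relation labeled Eq. (2) can be evaluated numerically. The sketch below assumes the form $DL = 10 \log_{10}[1 + 1.5c + 1.5b(I_0/I)]$, which is read from the surviving fragments of the equation and from the statement that the 50 percent increment equals 1.5 quantal increments. The constants b = 1.333 and c = 0.066 are taken from the symbol list, and the function name is hypothetical. Treat it as an illustration of the functional form under those assumptions rather than a verified transcription.

```python
import math

B = 1.333  # constant b as it appears in the symbol list
C = 0.066  # constant c as it appears in the symbol list


def difference_limen_db(sensation_level_db, b=B, c=C):
    """Difference limen in decibels at a given sensation level,
    using the assumed form of Eq. (2):
        DL = 10 log10[1 + 1.5 c + 1.5 b (I0 / I)],
    where the sensation level is 10 log10(I / I0)."""
    i0_over_i = 10 ** (-sensation_level_db / 10)
    return 10 * math.log10(1 + 1.5 * c + 1.5 * b * i0_over_i)


for level in (5, 10, 20, 30, 50, 100):
    print(level, round(difference_limen_db(level), 2))
```

At high sensation levels the term in $I_0/I$ vanishes and the expression approaches $10 \log_{10}(1 + 1.5c)$, about 0.41 db, matching the constant value quoted for intensities well above threshold (Weber's Law); at low levels the added term makes the limen grow.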
White
Noise.
PSYCHOLOGY
TABLE
of White
Masking
of
Values
of the
Masking
Masking
It is
givenin
Noise
by
Obtained
Noise.
Table
I, and
function
values
of
determine
into
of the sensation
level of the
computed
are
intensities
intense
than
able
to
obvious
when
with
used
over
for
wide
masks
this noise
less energy
is needed
than
the
of the
noise
step is to
to
purposes
when
the
when
Eqs.
two
which
results of
Fig.5 where
masking noise.
(3a)
converting
thresholds
are
masking is plotted
addition,Table
In
(1) and
the information
II contains
combined:
are
(4)
masking noise
threshold, the
is about
12 db
more
ask whether
mask
these
tones
or
results
human
correspondto
the functions
speech. Fortunately,we
are
masking effects of noise on

equipment directly
comparable
the
here.
which
forms
in
The
Eq. (4).
to
into masked
then
the
Level
noise.
is used
1000-cycletone
when
Eq. (3a).
and
question.Hawkins^ has measured

and
with
speech
experimentalconditions
those
find
into
Iq from
and
A/q
and
$10 \log_{10}[(cI/I_0) + b]$
above
more
of
shown
are
in Decibels
of the Sensation
Masking According
this
Suppose,
We
next
noise
answer
and
tones
or
masked
the
The
obtained
25 db
Increments
Function
values
values
listeners,and
masking which
of
quantalincrements
M
For
the
substitute these
to
as
Values
II for the two
givenin Table
a
Listeners
Computed
possibleto
now
the differential thresholds
as
for Two
II
Quantal
energy
range
of
is 20 db
we
itself,
for
is
comparison,
of intensities
less intense.
conclude
when
audibility
the
over
spread
Since
the
entire
to
mask
white
particular
the
that, for this
masking functions, therefore,we

the 1000-cycle
tone.
masks
choose
we
this
that
energy
tone.
1000-cycle
noise
justmasks
corresponding value
is concentrated
In
subtract
order
8 db
at
to
from
1000
compare
the
is 12 db
noise spectrum,
specific
spectrum.
can
8 db
c.p.s.
the
level of
[Figure 5 axis: sensation-level of noise]
in
increments
experiments. Solid
of white
intensity
line represents function

and
When
and
tone
1000-cycle
obtain
we
intensity,
is
from
taken
curve,
remarkably
function
to
warrant
noise
by
of
by
of
the ratio
at that
data, and
the
as
Fig. 5.
for
noise
of the corrected
function
The
data
level for Hawkins'
in the noise
this
correspondence between
in the present experiment
pointsobtained
computed from Eq. (4) falls

in Fig.5.
separate presentation
close
too
to
Hawkins'
of
choice
Munson^''
function
in
noise.
1000 c.p.s. is not crucial to this correspondence. As Fletcher and Munson have pointed out, a single function is adequate to describe the masking of pure tones, if the intensity of the noise is corrected by a factor which is a function of the frequency of the masked tone. This factor is given at any frequency $f$ by the ratio $R$ of the intensity of the masked tone to the intensity per cycle of the noise
The
and
white
function
The
its
analogous to masking
plottedin a manner
the masking of tones
Hawkins
for
by
of this tone
solid line shown
Hawkins'
close.
100
obtained
of 8 db
plotthe masking
the
noise
speech by
this correction
make
we
90
Figure
Discriminable
80
70
(///qin decibels)
have
frequency: R
intensities well
JJI'^.
above
is
threshold"
determined
experimentally
rectilinear portionof
the
on
for all
the
at
frequencies
function
in
shown
Fig. 5.
noises
For
attributed
to
the
with
continuous
noise
band
in the
it is convenient
to
spectra,the masking of
of
relate the
tone
of
frequency/ can
be
immediately adjacent f}^

frequencies
of frequency/tothe intensity
of
tone
a
masking
to
sequently,
Con-
of
in decibels re the threshold
and to express this intensity
/',
be
which
can
regarded
hearingat any frequency. This proceduregives10 logio(/~//o)'
of the
level
Z
effective
The
noise.
of
band
the
sensation
level
at /of a one-cycle
as
noise at that frequencyis then defined as
per
cycleof
the
noise
at
$$Z = 10 \log_{10}(I_f/I_0) + 10 \log_{10} R. \tag{5}$$

13 H. Fletcher and W. A. Munson, Relation between loudness and masking, J. acoust. Soc. Am., 1937, 9, 1-10.
H. Fletcher, Auditory patterns. Rev. Mod. Phys., 1940, 12, 47-65.
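The definition of the effective level can be tried out numerically. The sketch below assumes Eq. (5) reads Z = 10 log10(I_f/I_0) + 10 log10 R, which is an interpretation of the damaged print; the function name and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def effective_level_db(band_sensation_level_db, ratio_r):
    """Effective level Z in decibels, under the assumed reading of Eq. (5).

    band_sensation_level_db: 10*log10(I_f/I_0), the sensation level at f
        of a one-cycle band of the noise.
    ratio_r: the ratio R of the masked tone's intensity to the intensity
        per cycle of the noise at that frequency (must be positive).
    """
    if ratio_r <= 0:
        raise ValueError("R must be positive")
    return band_sensation_level_db + 10.0 * math.log10(ratio_r)


# Hypothetical illustration: a 20-db band level and R = 100 (i.e., 20 db).
print(effective_level_db(20.0, 100.0))
```

Because both terms are already in decibels, the correction for R is a simple additive shift of the noise level.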
masking of pure tones is plottedas a function of Z, the relation between M

function
of frequency.A single
to be independent
expresses the relation
and Z for all frequencies.
between
M
M
in
When
obtained
to Z with the function
we
relating
compare the function
find
that
the
sensation
of
level
the noise is equivalent
the present experiment,we
to
When
and
the
is found
Therefore,
ll.Sdb.
15.14/?(/~//o).
///o
this expression
into Eq. (4) gives
Substituting
M
equation,
along with the
compute the masking of pure
10 logio(/~//o)
is greater than
be
computed more
masking can
This
Hawkins'
show
results
+6].
101ogio[i?(/~//o)
functions
tones
by
random
any
enables
Iq to frequency,
15
noise
of known
for
db, b is negligible
10
to
us
spectrum. When
and
frequencies,
+
logio(/~//o)simplyas
log^,/?
that the function
of Eq. (4) can
also be adaptedto
about
of human
the
and
R
relating
(6)
all
the
10
white
scribe
de-
noise.
speech by
masking
Thus the correspondence seems complete. When the masking and the masked sounds are identical, masking and sensitivity to changes in intensity are equivalent. The results obtained with identical masking and masked noises are directly comparable to results obtained with different masked sounds. It is reasonable to conclude, therefore, that the determination of sensitivity to changes in intensity is a special case of the general masking experiment.
of masking is also applicable
It is worth
to visual
noting that this interpretation
of white
and
to changes in the intensity
sensitivity
light.Data obtained by Graham
of the similarity
of their
Bartlett^^ providean excellent basis for comparison, because
and
because
they used homogeneous,
procedure to that of the masking experiment,
more
rod-free, foveal
plottedas
function
and
we
retina.
of visual
measures
that
of the
areas
have
used
When
these
masking,the result
to
express
the
data
are
substituted
be described
can
auditorymasking by
by
noise
into
the
Eq. (3) and
same
of tones,
general
speech,
noise.
Relation to Loudness

When Fechner adopted the just noticeable difference as the unit for sensory scales, he precipitated a controversy which is still alive today: Are equally-often-noticed differences subjectively equal? In the case of auditory loudness, the answer seems to be negative. Just noticeable differences (j.n.d.'s) at high intensities are subjectively much larger than j.n.d.'s at low intensities.

C. H. Graham and N. R. Bartlett, The relation of size of stimulus and intensity in the human eye: III. The influence of area on foveal intensity discrimination, J. exper. Psychol., 1940, 27, 149-159.
Crozier has used similar visual data to demonstrate that the reciprocal of the just detectable increment is related to the logarithm of the light intensity by a normal probability integral. This is deduced on the assumption that sensitivity is determined by the not-already-excited portion of the total population of potentially excitable neural effects. Crozier's equations give an excellent description of the auditory data presented here. W. J. Crozier, On the law for minimal discrimination of intensities. IV. ΔI as a function of intensity, Proc. Nat. Acad. Sci., 1940, 26, 382-388.
GEORGE
In order
need
we
to
the
two
be
can
in
is the
need
We
If these
as
the
threshold, and
Tone
as
Sones.
for
Data
Listeners.
12
of
The
tones,
intensity
loudness
subjective
was
rightand j.n.d.'s
not
agree, Fechner
was
III
Sensation-Level
of Sensation-Level
Function
of Quanta.
Number
for pure
to the
correspond,Fechner
If.they do
loudness-scale.
on
a subjective
pictureis more
complex than he imagined.
the
as
noise
relating
functions
two
TABLE
and
well
as
the functions
units
and
Loudness
for noise
case
to know
distinguishable
steps above
sones.
used
wrong,
that such
of information.
of
number
of the noise
to demonstrate
kinds
131
MILLER
A.
of
of
Noise, with
The
Last
Column
Quantal
Units
Above
Gives
1000-Cycle
Equally Loud
Loudness
in
Corresponding
the Corresponding Number
Threshold.
of differential
level
to a givensensation
quanta Nq corresponding
of
scale
a
readilyobtained
by "steppingoff" the quant.alincrements
against
consists
of
the
number
of
increments
The
unit
procedure
finding
quantal
per
number
of noise is
decibels.
of
and
intensity
then
integrating:
NQ=(llMQ-dI.
If
increment
substitute for the size of thee quantal
qu
we
When
we
solve
convert
in terms
We
assume
loudness
in
of the noise with

to the
alternately
equations were
The
of
that the number
Eq. (4) indicates that M

Eq. (9) are given in Table
The
base
logarithmsto the
masking M, we find
Nq
same
made
result of this
III, and
sones
was
the loudness
ear,
by
and
experiment
"
10, insert the values
for the
and
constants,
that
3.49M
(9)
K.
is
Therefore, K
zero
when
"5.1.
Values
/",and
at
of
obtained
Nq
this
point
by
plottedin Fig. 7.
ness
determined
by requiringlisteners to equate the loudThe two sounds
of a 1000-cycle
tone.
were
presented
of the tone.
Five
the listener adjustedthe intensity
of twelve
each
(8)
-JnMQ+C.
-
quantalincrements
1.46 db.
6/o
to
of
accordingto Eq. (1),
-dl
77-rTr-dr
"TT
cl
^Q-
(7)
the
listeners
level of the
for
the
six noise-intensities
1000-cycletone
which
sounds
studied.
equal in
132
READINGS
IN
MATHEMATICAL
20
40
PSYCHOLOGY
60
Figure
and
Observed
defines
"
III and
in Table
tabulated
values
in
from
sones
givesthe
Stevens'
standard
indicated
are
by
the
been
loudness
for the
constructed
loudness-scale^^
deviations
are
Standard
of
deviations
lengthsof the vertical bars.
plottedin Fig. 6, the

has
noise.
level of the noise.
the loudness
which
the loudness-scale
from
also
noise
to the
loudness
are
computed
values for 15 listeners
the
level of white
of the loudness
values
100
80
of noise in decibels
Sensation-level
included
With
in
these
is determined
sones
1000-cycletone.
in Table
of loudness
data, which
III.
Table
levels obtained
The
III
for
the 12 listeners.
Loudness
for
also be
can
loudness
calculating
computed.
from
the
Fletcher
masking
and
which
Munson
the
sound
developeda procedure
this
produces. When
Figure
Comparison
of the number
noticeable
!'"' S.
S. Stevens
and
of
80
60
discriminatory
quanta
increments
of intensity
are
H.
Davis, Hearing,New
100
of noise
Sensation-level
with
not
York:
the loudness
of white
equal.
subjectively
Wiley,1938, p.
118.
noise.
Just
GEORGE
procedureis appliedto
the
computed
shown
in Fig.6.
quitesatisfactory.
The
the
masking
equippedto present the

number
of quantalunits as
functions
two
a
by noise, we get
computed and experimental
of pure tones
between
agreement
now
are
shows
curve
for the
data
values
results is
We
Hawtcins'
133
MILLER
A.
in
shown
of sensation
function
Fig.7.
level.
solid
The
The
dashed
Masking (A7/7o in decibels)

8
Figure
Relation
curve
the
shows
error
the
loudness
of Fechner's
are
When,
not
as
S. S.
in sones.
The
and
masking
for white
discrepancybetween
assumption. Loudness
and
noise.
these
two
the
number
of
are
both
related
curves
affirms
justnoticeable
ferences
dif-
related.
linearly
in the
to determine
possible
"
loudness
between
present
case,
their relation
Stevens, A
two
to each
variables
other.
scale for the measurement
Rev., 1936, 43, 405-416.

Psychol.
of
Stevens^^ has used

a
to
third, it is
Riesz's
data
for
psychological
magnitude: loudness,
134
the
arrive
to
tones
pure
in
tone
sones,
is the
in Table
of the
most
the
is
and
loudness.
noise
an
alternative
In
the
itself. Let
on
Fig. 7
Table
the
expression L
in
form
than
KM^
loud
is related
fits the
the
to
of the
power
the
cast
we
Fig.
relation
data
third
rather
of the
power
logarithm
of the
that
j.n.d.'s
over
why
that
it
along
of
in
masking
to
be
can
creases
in-
noise
itself,
on
whatever
intensity.In
faint
that
seen
white
the
level;
the noise
smaller
are
j.n.d.'s
of loudness.
scale
the
of
defined, and
of
in
sensitivity
to
sensitivity
relation
loudness
it is obvious
equal
noise
masking
so
Fig. 8
increment
units
tinguishabl
dis-
but
sensation
to
The
In
The
quantal
not
are
the
masking produced by
empiricalequation, however,
j.n.d.'sand
that
masking,
level.
well.
functions,
computed
functions.
two
for
well
differential
is related
masking
data
rather
notion
the
between
of
of
apparent.
between
we
sensation
to
these
and
the
power
is not
developed
II
with
loudness
number
relation
be
to
is the
is the
the
relation
masking,
the
out
tones
the
Table
by combining
proportion
third
i.e.,the
of
turn
and
we
where
computation
describes
both
state
kN^-^,
first step, and
kN^
section
5 and
loudness
III
to
way
examine
now
In
is obtained
loudness
specialcase
the
for noise
preceding
us
subjectiveloudness.
and
that
different
be
intensityis
in
changes
sones
that
It is interesting
should
There
find
we
of
parallelStevens'
we
III,
range.
exponent
in
size
When
steps.
presented
empirical equation
the
at
Acknowledgment

The author wishes to express his gratitude to Miss Shirley Mitchell, who assisted in obtaining the experimental data, and to Professor S. S. Stevens, who contributed valuable criticism and advice during the preparation of this manuscript.
Correction
In
1963
is written
D.
as
added
pressures
pressures,
from
recomputed
above
noise
12
alone
in
Osman,
and
M.
acoust.
to
the
and
V.
Soc.
When
obtains.
db
there
therefore,
able
Table
I
the
threshold,
(not
were
of
20
noise
of the
Rich;
page
1962,
(cf. Fig. 1),
effect
for
At
db
25
then
would
noise.
This
sinusoids
model
instead
have
fact
has
for
In
of
of
been
was
been
monaural
masking
25
the
than
about
below
verified
if
db
is
or
masked
25
db,
listeners
masking;
inaudible
sound
summed
their
intensities
intense
levels
their
so
amount
for
more
sensation
which
Eq. (3), which
of
square
the
logô(AP/Pq),
is about
Energy-detection
the
When
pressures.
in
error
an
generated independently.
was
(negative masking)
"masking"
similar
Mathews,
noticed
been
power
128).
in-phase increments
absence
Am.,
on
facilitation
was
hear
squared
masking
stated
as
combined
using
had
perfectlycorrelated
were
their
Rich
E.
noises
masking
their
phase;
sum
and
Osman,
and
noises
two
in
the
not
E.
Raab,
masked
the
fact, however,
more
H.
if the
presented
directlyby Raab,
reported by
S.
auditory
M.
Pfafflin
detection,
/.
34, 1842-1853.
this correction
is
made,
of course,
the
relation
shown
in
Fig. 8
no
longer
the variance. And it also enables us to compare results obtained in quite different experimental situations where it would be meaningless to compare variances based on different metrics. So there are some good reasons for adopting the newer concept.

The similarity of variance and amount of information might be explained this way: When we have a large variance, we are very ignorant about what is going to happen. If we are very ignorant, then when we make the observation it gives us a lot of information. On the other hand, if the variance is very small, we know in advance how our observation must come out, so we get little information from making the observation.

If you will now imagine a communication system, you will realize that there is a great deal of variability about what goes into the system and also a great deal of variability about what comes out. The input and the output can therefore be described in terms of their variance (or their information). If it is a good communication system, however, there must be some systematic relation between what goes in and what comes out. That is to say, the output will depend upon the input, or will be correlated with the input. If we measure this correlation, then we can say how much of the output variance is attributable to the input and how much is due to random fluctuations or "noise" introduced by the system during transmission. So we see that the measure of transmitted information is simply a measure of the input-output correlation.

There are two simple rules to follow. Whenever I refer to "amount of information," you will understand "variance." And whenever I refer to "amount of transmitted information," you will understand "covariance" or "correlation."

The situation can be described graphically by two partially overlapping circles. Then the left circle can be taken to represent the variance of the input, the right circle the variance of the output, and the overlap the covariance of input and output. I shall speak of the left circle as the amount of input information, the right circle as the amount of output information, and the overlap as the amount of transmitted information.

In the experiments on absolute judgment, the observer is considered to be a communication channel. Then the left circle would represent the amount of information in the stimuli, the right circle the amount of information in his responses, and the overlap the stimulus-response correlation as measured by the amount of transmitted information. The experimental problem is to increase the amount of input information and to measure the amount of transmitted information. If the observer's absolute judgments are quite accurate, then nearly all of the input information will be transmitted and will be recoverable from his responses. If he makes errors, then the transmitted information may be considerably less than the input. We expect that, as we increase the amount of input information, the observer will begin to make more and more errors; we can test the limits of accuracy of his absolute judgments. If the human observer is a reasonable kind of communication system, then when we increase the amount of input information the transmitted information will increase at first and will eventually level off at some asymptotic value. This asymptotic value we take to be the channel capacity of the observer: it represents the greatest amount of information that he can give us about the stimulus on the basis of an absolute judgment. The channel capacity is the upper limit on the extent to which the observer can match his responses to the stimuli we give him.

Now just a brief word about the bit
and we can begin to look at some data. One bit of information is the amount of information that we need to make a decision between two equally likely alternatives. If we must decide whether a man is less than six feet tall or more than six feet tall and if we know that the chances are 50-50, then we need one bit of information. Notice that this unit of information does not refer in any way to the unit of length that we use (feet, inches, centimeters, etc.). However you measure the man's height, we still need just one bit of information.

Two bits of information enable us to decide among four equally likely alternatives. Three bits of information enable us to decide among eight equally likely alternatives. Four bits of information decide among 16 alternatives, five among 32, and so on. That is to say, if there are 32 equally likely alternatives, we must make five successive binary decisions, worth one bit each, before we know which alternative is correct. So the general rule is simple: every time the number of alternatives is increased by a factor of two, one bit of information is added.

There are two ways we might increase the amount of input information. We could increase the rate at which we give information to the observer, so that the amount of information per unit time would increase. Or we could ignore the time variable completely and increase the amount of input information by increasing the number of alternative stimuli. In the absolute judgment experiment we are interested in the second alternative. We give the observer as much time as he wants to make his response; we simply increase the number of alternative stimuli among which he must discriminate and look to see where confusions begin to occur. Confusions will appear near the point that we are calling his "channel capacity."

Absolute Judgments of Unidimensional Stimuli

Now let us consider what happens when we make absolute judgments of tones. Pollack (17) asked listeners to identify tones by assigning numerals to them. The tones were different with respect to frequency, and covered the range from 100 to 8000 cps in equal logarithmic steps. A tone was sounded and the listener responded by giving a numeral. After the listener had made his response he was told the correct identification of the tone.

When only two or three tones were used the listeners never confused them. With four different tones confusions were quite rare, but with five or more tones confusions were frequent. With fourteen different tones the listeners made many mistakes.

These data are plotted in Fig. 1. Along the bottom is the amount of input information in bits per stimulus. As the number of alternative tones was increased from 2 to 14, the input information increased from 1 to 3.8 bits. On the ordinate is plotted the amount of transmitted information. The amount of transmitted information behaves in much the way we would expect a communication channel to behave; the transmitted information increases linearly up to about 2 bits and then bends off toward an asymptote at about 2.5 bits. This value, 2.5 bits, therefore, is what we are calling the channel capacity of the listener for absolute judgments of pitch.

Fig. 1. Data from Pollack (17, 18) on the amount of information that is transmitted by listeners who make absolute judgments of auditory pitch. As the amount of input information is increased by increasing from 2 to 14 the number of different pitches to be judged, the amount of transmitted information approaches as its upper limit a channel capacity of about 2.5 bits per judgment.
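The quantities plotted in such a figure can be computed from a table of stimulus-response counts. What follows is only a minimal sketch of the standard information-theoretic estimate, not the procedure used in the original studies; the function names and the two small count tables are hypothetical.

```python
import math


def input_information(n_alternatives):
    """Bits needed to choose among n equally likely alternatives."""
    return math.log2(n_alternatives)


def transmitted_information(counts):
    """Estimate transmitted information in bits from a count table.

    counts[i][j] is the number of trials on which stimulus i received
    response j (a hypothetical layout chosen for this sketch).
    """
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    bits = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n == 0:
                continue  # empty cells contribute nothing to the sum
            p_joint = n / total
            p_independent = (row_totals[i] / total) * (col_totals[j] / total)
            bits += p_joint * math.log2(p_joint / p_independent)
    return bits


# Hypothetical data: four tones always named correctly, and two tones
# answered at random.
perfect = [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
guessing = [[5, 5], [5, 5]]

print(transmitted_information(perfect))   # all 2 bits of input recovered
print(transmitted_information(guessing))  # responses unrelated to stimuli
print(round(input_information(14), 1))    # fourteen tones, as in the text
```

With error-free responses the transmitted information equals the input information; as confusions accumulate, the estimate falls below the input, which is the bending of the curve described above.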
So now we have the number 2.5 bits. What does it mean? First, note that 2.5 bits corresponds to about six equally likely alternatives. The result means that we cannot pick more than six different pitches that the listener will never confuse. Or, stated slightly differently, no matter how many alternative tones we ask him to judge, the best we can expect him to do is to assign them to about six different classes without error. Or, again, if we know that there were N alternative stimuli, then his judgment enables us to narrow down the particular stimulus to one out of N/6.

Most people are surprised that the number is as small as six. Of course, there is evidence that a musically sophisticated person with absolute pitch can identify accurately any one of 50 or 60 different pitches. Fortunately, I do not have time to discuss these remarkable exceptions. I say it is fortunate because I do not know how to explain their superior performance. So I shall stick to the more pedestrian fact that most of us can identify about one out of only five or six pitches before we begin to get confused.

It is interesting to consider that psychologists have been using seven-point rating scales for a long time, on the intuitive basis that trying to rate into finer categories does not really add much to the usefulness of the ratings. Pollack's results indicate that, at least for pitches, this intuition is fairly sound.

[Fig. 2 plot: stimulus range 15-110 db; abscissa: input information.]

Fig. 2. Data from Garner (7) on the channel capacity for absolute judgments of auditory loudness.

Next you can ask how reproducible this result is. Does it depend on the spacing of the tones or the various conditions of judgment? Pollack varied these conditions in a number of ways. The range of frequencies can be changed by a factor of about 20 without changing the amount of information transmitted more than a small percentage. Different groupings of the pitches decreased the transmission, but the loss was small. For example, if you can discriminate five high-pitched tones in one series and five low-pitched tones in another series, it is reasonable to expect that you could combine all ten into a single series and still tell them all apart without error. When you try it, however, it does not work. The channel capacity for pitch seems to be about six and that is the best you can do.

While we are on tones, let us look next at Garner's (7) work on loudness. Garner's data for loudness are summarized in Fig. 2. Garner went to some trouble to get the best possible spacing of his tones over the intensity range from 15 to 110 db. He used 4, 5, 6, 7, 10, and 20 different stimulus intensities. The results shown in Fig. 2 take into account the differences among subjects and the sequential influence of the immediately preceding judgment. Again we find that there seems to be a limit.
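The step from 2.5 bits to "about six" alternatives, and the N/6 narrowing, follow from the rule that each added bit doubles the number of equally likely alternatives. A minimal sketch of that arithmetic is given here; the function names and the 60-stimulus example are hypothetical.

```python
def equivalent_alternatives(bits):
    """Number of equally likely alternatives worth the given number of bits."""
    return 2.0 ** bits


def remaining_candidates(n_stimuli, capacity_bits):
    """Roughly how many of n_stimuli stay plausible after one judgment.

    The result is floored at 1, since a judgment cannot narrow the field
    to fewer than one stimulus.
    """
    return max(1.0, n_stimuli / equivalent_alternatives(capacity_bits))


print(round(equivalent_alternatives(2.5), 2))    # 5.66, i.e. about six
print(round(remaining_candidates(60, 2.5), 1))   # 10.6, close to 60/6
```

The text rounds 5.66 up to six, so its "one out of N/6" is a slight simplification of the exact N / 2^2.5.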
The channel capacity for absolute judgments of loudness is 2.3 bits, or about five perfectly discriminable alternatives.

Since these two studies were done in different laboratories with slightly different techniques and methods of analysis, we are not in a good position to argue whether five loudnesses is significantly different from six pitches. Probably the difference is in the right direction, and absolute judgments of pitch are slightly more accurate than absolute judgments of loudness. The important point, however, is that the two answers are of the same order of magnitude.

The experiment has also been done for taste intensities. In Fig. 3 are the results obtained by Beebe-Center, Rogers, and O'Connell (1) for absolute judgments of the concentration of salt solutions. The concentrations ranged from 0.3 to 34.7 gm. NaCl per 100 cc. tap water in equal subjective steps. They used 3, 5, 9, and 17 different concentrations. The channel capacity is 1.9 bits, which is about four distinct concentrations. Thus taste intensities seem a little less distinctive than auditory stimuli, but again the order of magnitude is not far off.

[Fig. 3 plot: judgments of taste (saline concentration); abscissa: input information; marked asymptote: 1.9 bits.]

Fig. 3. Data from Beebe-Center, Rogers, and O'Connell (1) on the channel capacity for absolute judgments of saltiness.

On the other hand, the channel capacity for judgments of visual position seems to be significantly larger. Hake and Garner (8) asked observers to interpolate visually between two scale markers. Their results are shown in Fig. 4. They did the experiment in two ways. In one version they let the observer use any number between zero and 100 to describe the position, although they presented stimuli at only 5, 10, 20, or 50 different positions. The results with this unlimited response technique are shown by the filled circles on the graph. In the other version the observers were limited in their responses to reporting just those stimulus values that were possible. That is to say, in the second version the number of different responses that the observer could make was exactly the same as the number of different stimuli that the experimenter might present. The results with this limited response technique are shown by the open circles on the graph. The two functions are so similar that it seems fair to conclude that the number of responses available to the observer had nothing to do with the channel capacity of 3.25 bits.

[Fig. 4 plot: abscissa: input information.]

Fig. 4. Data from Hake and Garner (8) on the channel capacity for absolute judgments of the position of a pointer in a linear interval.

The Hake-Garner experiment has been repeated by Coonan and Klemmer. Although they have not yet published their results, they have given me permission to say that they obtained channel capacities ranging from 3.2 bits for
very short exposures of the pointer position to 3.9 bits for longer exposures. These values are slightly higher than Hake and Garner's, so we must conclude that there are between 10 and 15 distinct positions along a linear interval. This is the largest channel capacity that has been measured for any unidimensional variable.

At the present time these four experiments on absolute judgments of simple, unidimensional stimuli are all that have appeared in the psychological journals. However, a great deal of work on other stimulus variables has not yet appeared in the journals. For example, Eriksen and Hake (6) have found that the channel capacity for judging the sizes of squares is 2.1 bits, or about five categories, under a wide range of experimental conditions. In a separate experiment Eriksen (5) found 2.8 bits for size, 3.1 bits for hue, and 2.3 bits for brightness. Geldard has measured the channel capacity for the skin by placing vibrators on the chest region. A good observer can identify about four intensities, about five durations, and about seven locations.

One of the most active groups in this area has been the Air Force Operational Applications Laboratory. Pollack has been kind enough to furnish me with the results of their measurements for several aspects of visual displays. They made measurements for area and for the curvature, length, and direction of lines. In one set of experiments they used a very short exposure of the stimulus, 1/40 second, and then they repeated the measurements with a 5-second exposure. For area they got 2.6 bits with the short exposure and 2.7 bits with the long exposure. For the length of a line they got about 2.6 bits with the short exposure and about 3.0 bits with the long exposure. Direction, or angle of inclination, gave 2.8 bits for the short exposure and 3.3 bits for the long exposure. Curvature was apparently harder to judge. When the length of the arc was constant, the result at the short exposure duration was 2.2 bits, but when the length of the chord was constant, the result was only 1.6 bits. This last value is the lowest that anyone has measured to date. I should add, however, that these values are apt to be slightly too low because the data from all subjects were pooled before the transmitted information was computed.

Now let us see where we are. First, the channel capacity does seem to be a valid notion for describing human observers. Second, the channel capacities measured for these unidimensional variables range from 1.6 bits for curvature to 3.9 bits for positions in an interval. Although there is no question that the differences among the variables are real and meaningful, the more impressive fact to me is their considerable similarity. If I take the best estimates I can get of the channel capacities for all the stimulus variables I have mentioned, the mean is 2.6 bits and the standard deviation is only 0.6 bit. In terms of distinguishable alternatives, this mean corresponds to about 6.5 categories, one standard deviation includes from 4 to 10 categories, and the total range is from 3 to 15 categories. Considering the wide variety of different variables that have been studied, I find this to be a remarkably narrow range.

There seems to be some limitation built into us either by learning or by the design of our nervous systems, a limit that keeps our channel capacities in this general range. On the basis of the present evidence it seems safe to say that we possess a finite and rather small capacity for making such unidimensional judgments and that this capacity does not vary a great deal from one simple sensory attribute to another.
Absolute Judgments of Multidimensional Stimuli

You may have noticed that I have been careful to say that this magical number seven applies to one-dimensional judgments. Everyday experience teaches us that we can identify accurately any one of several hundred faces, any one of several thousand words, any one of several thousand objects, etc. The story certainly would not be complete if we stopped at this point. We must have some understanding of why the one-dimensional variables we judge in the laboratory give results so far out of line with what we do constantly in our behavior outside the laboratory. A possible explanation lies in the number of independently variable attributes of the stimuli that are being judged. Objects, faces, words, and the like differ from one another in many ways, whereas the simple stimuli we have considered thus far differ from one another in only one respect.

Fortunately, there are a few data on what happens when we make absolute judgments of stimuli that differ from one another in several ways. Let us look first at the results Klemmer and Frick (13) have reported for the absolute judgment of the position of a dot in a square. In Fig. 5 we see their results. Now the channel capacity seems to have increased to 4.6 bits, which means that people can identify accurately any one of 24 positions in the square.

[Fig. 5 plot: points in a square, no grid, .03 sec. exposure; abscissa: input information.]

Fig. 5. Data from Klemmer and Frick (13) on the channel capacity for absolute judgments of the position of a dot in a square.

The position of a dot in a square is clearly a two-dimensional proposition. Both its horizontal and its vertical position must be identified. Thus it seems natural to compare the 4.6-bit capacity for a square with the 3.25-bit capacity for the position of a point in an interval. The point in the square requires two judgments of the interval type. If we have a capacity of 3.25 bits for estimating intervals and we do this twice, we should get 6.5 bits as our capacity for locating points in a square. Adding the second independent dimension gives us an increase from 3.25 to 4.6, but it falls short of the perfect addition that would give 6.5 bits.

Another example is provided by Beebe-Center, Rogers, and O'Connell. When they asked people to identify both the saltiness and the sweetness of solutions containing various concentrations of salt and sucrose, they found that the channel capacity was 2.3 bits. Since the capacity for salt alone was 1.9, we might expect about 3.8 bits if the two aspects of the compound stimuli were judged independently. As with spatial locations, the second dimension adds a little to the capacity but not as much as it conceivably might.

A third example is provided by Pollack (18), who asked listeners to judge both the loudness and the pitch of pure tones. Since pitch gives 2.5 bits and loudness gives 2.3 bits, we might hope to get as much as 4.8 bits for pitch and loudness together. Pollack obtained 3.1 bits, which again indicates that the second dimension augments the channel capacity but not so much as it might.

A fourth example can be drawn from the work of Halsey and Chapanis (9) on confusions among colors of equal luminance.
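The comparison running through these examples is between the sum of the one-dimensional capacities and the capacity actually observed for the compound judgment. A small sketch of that bookkeeping follows, using only figures quoted in the text; the names are hypothetical, and the 1.9 bits entered for sweetness is the text's own working assumption (it doubles the figure for salt), not a separate measurement.

```python
# (label, one-dimensional capacities in bits, observed joint capacity in bits)
cases = [
    ("dot position in a square", [3.25, 3.25], 4.6),
    ("saltiness and sweetness", [1.9, 1.9], 2.3),  # sweetness assumed equal to salt
    ("pitch and loudness", [2.5, 2.3], 3.1),
]


def shortfall(parts, observed):
    """Bits by which the observed capacity misses perfect addition."""
    return sum(parts) - observed


for label, parts, observed in cases:
    expected = sum(parts)
    print(f"{label}: expected {expected:.2f}, observed {observed:.2f}, "
          f"shortfall {shortfall(parts, observed):.2f} bits")
```

In each case the observed capacity exceeds either single dimension but falls well short of the sum, which is the pattern the text describes.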
IN
Although they
MATHEMATICAL
did not
their results in informational
they estimate
15
that there
are
about
3.6 bits.
in both
hue
1 1 to
our
terms,
dimensional
with
Since these colors varied
and
saturation,it is probably
regard this as a twojudgment. If we compare
to
correct
this
lyze
ana-
terms,
about
identifiable colors,or, in
PSYCHOLOGY
Eriksen's
bits
3.1
for
hue
(which
a questionablecomparison to
draw), we again have something less
than
second
perfect addition when
a
is
dimension
NUMBER
is added.
It is still a long way, however, from these two-dimensional examples to the multidimensional stimuli provided by faces, words, etc. To fill this gap we have only one experiment, an auditory study done by Pollack and Ficks (19). They managed to get six different acoustic variables that they could change: frequency, intensity, rate of interruption, on-time fraction, total duration, and spatial location. Each one of these six variables could assume any one of five different values, so altogether there were 5^6, or 15,625 different tones that they could present. The listeners made a separate rating for each one of these six dimensions. Under these conditions the transmitted information was 7.2 bits, which corresponds to about 150 different categories that could be absolutely identified without error. Now we are beginning to get up into the range that ordinary experience would lead us to expect.

Suppose that we plot these data, fragmentary as they are, and make a guess about how the channel capacity changes with the dimensionality of the stimuli. The result is given in Fig. 6. In a moment of considerable daring I sketched the dotted line to indicate roughly the trend that the data seemed to be taking.

[Fig. 6. The general form of the relation between channel capacity and the number of independently variable attributes of the stimuli. Horizontal axis: number of variable aspects.]

Clearly, the addition of independently variable attributes to the stimulus increases the channel capacity, but at a decreasing rate. It is interesting to note that the channel capacity is increased even when the several variables are not independent. Eriksen (5) reports that, when size, brightness, and hue all vary together in perfect correlation, the transmitted information is 4.1 bits as compared with an average of about 2.7 bits when these attributes are varied one at a time. By confounding three attributes, Eriksen increased the dimensionality of the input without increasing the amount of input information; the result was an increase in channel capacity of about the amount that the dotted function in Fig. 6 would lead us to expect.

The point seems to be that, as we add more variables to the display, we increase the total capacity, but we decrease the accuracy for any particular variable. In other words, we can make relatively crude judgments of several things simultaneously.

We might argue that in the course of evolution those organisms were most successful that were responsive to the widest range of stimulus energies in their environment. In order to survive in a constantly fluctuating world, it was better to have a little information about a lot of things than to have a lot of information about a small segment of the environment.
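The arithmetic behind the six-dimensional tone experiment can be checked directly. This is a small sketch using only the numbers quoted in the text (six variables, five values each, 7.2 bits transmitted); the variable names are illustrative, and dividing the transmitted information evenly over the six dimensions is an assumption made here only to show how crude each separate judgment is.

```python
import math

values_per_dimension = 5
dimensions = 6

# Every combination of values is a distinct tone.
n_tones = values_per_dimension ** dimensions

# Information in the input if all tones are equally likely.
input_bits = math.log2(n_tones)

# Transmitted information reported in the text, and the number of
# perfectly identified categories it is equivalent to.
transmitted_bits = 7.2
equivalent_categories = 2 ** transmitted_bits

# Assumption for illustration: spread the transmitted bits evenly.
bits_per_dimension = transmitted_bits / dimensions

print(f"{n_tones} tones carry {input_bits:.1f} bits of input information")
print(f"{transmitted_bits} bits transmitted is about {equivalent_categories:.0f} categories")
print(f"roughly {bits_per_dimension:.1f} bits, or {2 ** bits_per_dimension:.1f} "
      f"levels, per dimension")
```

The total is far above the unidimensional limit even though each dimension contributes only a little more than a binary judgment.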
SUBITIZING

I cannot leave this general area without mentioning, however briefly, the experiments conducted at Mount Holyoke College on the discrimination of number (12). In experiments by Kaufman, Lord, Reese, and Volkmann random patterns of dots were flashed on a screen for 1/5 of a second. Anywhere from 1 to more than 200 dots could appear in the pattern. The subject's task was to report how many dots there were.

The first point to note is that on patterns containing up to five or six dots the subjects simply did not make errors. The performance on these small numbers of dots was so different from the performance with more dots that it was given a special name. Below seven the subjects were said to subitize; above seven they were said to estimate. This is, as you will recognize, what we once optimistically called "the span of attention."

This discontinuity at seven is, of course, suggestive. Is this the same basic process that limits our unidimensional judgments to about seven categories? The generalization is tempting, but not sound in my opinion. The data on number estimates have not been analyzed in informational terms; but on the basis of the published data I would guess that the subjects transmitted something more than four bits of information about the number of dots. Using the same arguments as before, we would conclude that there are about 20 or 30 distinguishable categories of numerousness. This is considerably more information than we would expect to get from a unidimensional display. It is, as a matter of fact, very much like a two-dimensional display. Although the dimensionality of the random dot patterns is not entirely clear, these results are in the same range as Klemmer and Frick's for their two-dimensional display of dots in a square. Perhaps the two dimensions of numerousness are area and density. When the subject can subitize, area and density may not be the significant variables, but when the subject must estimate perhaps they are significant. In any event, the comparison is not so simple as it might seem at first thought.

This is one of the ways in which the magical number seven has persecuted me. Here we have two closely related kinds of experiments, both of which point to the significance of the number seven as a limit on our capacities. And yet when we examine the matter more closely, there seems to be a reasonable suspicion that it is nothing more than a coincidence.

The Span of Immediate Memory

Let me summarize the situation in this way. There is a clear and definite limit to the accuracy with which we can identify absolutely the magnitude of a unidimensional stimulus variable. I would propose to call this limit the span of absolute judgment, and I maintain that for unidimensional judgments this span is usually somewhere in the neighborhood of seven. We are not completely at the mercy of this limited span, however, because we have a variety of techniques for getting around it and increasing the accuracy of our judgments. The three most important of these devices are (a) to make relative rather than absolute judgments; or, if that is not possible, (b) to increase the number of dimensions along which the stimuli can differ; or (c) to arrange the task in such a way that we make a sequence of several absolute judgments in a row.

The study of relative judgments is one of the oldest topics in experimental psychology, and I will not pause to review it now. The second device, increasing the dimensionality, we have just considered. It seems that by adding more dimensions and requiring crude, binary, yes-no judgments on each attribute we can extend the span of absolute judgment from seven to at least 150. Judging from our everyday behavior, the limit is probably in the thousands, if indeed there is a limit. In my opinion, we cannot go on compounding dimensions indefinitely. I suspect that there is also a span of perceptual dimensionality and that this span is somewhere in the neighborhood of ten, but I must add at once that there is no objective evidence to support this suspicion. This is a question sadly needing experimental exploration.

Concerning the third device, the use of successive judgments, I have quite a bit to say because this device introduces memory as the handmaiden of discrimination. And, since mnemonic processes are at least as complex as are perceptual processes, we can anticipate that their interactions will not be easily disentangled.

Suppose that we start by simply extending slightly the experimental procedure that we have been using. Up to this point we have presented a single stimulus and asked the observer to name it immediately thereafter. We can extend this procedure by requiring the observer to withhold his response until we have given him several stimuli in succession. At the end of the sequence of stimuli he then makes his response. We still have the same sort of input-output situation that is required for the measurement of transmitted information. But now we have passed from an experiment on absolute judgment to what is traditionally called an experiment on immediate memory.

Before we look at any data on this topic I feel I must give you a word of warning to help you avoid some obvious associations that can be confusing. Everybody knows that there is a finite span of immediate memory and that for a lot of different kinds of test materials this span is about seven items in length. I have just shown you that there is a span of absolute judgment that can distinguish about seven categories and that there is a span of attention that will encompass about six objects at a glance. What is more natural than to think that all three of these spans are different aspects of a single underlying process? And that is a fundamental mistake, as I shall be at some pains to demonstrate. This mistake is one of the malicious persecutions that the magical number seven has subjected me to.

My mistake went something like this. We have seen that the invariant feature in the span of absolute judgment is the amount of information that the observer can transmit. There is a real operational similarity between the absolute judgment experiment and the immediate memory experiment. If immediate memory is like absolute judgment, then it should follow that the invariant feature in the span of immediate memory is also the amount of information that an observer can retain. If the amount of information in the span of immediate memory is a constant, then the span should be short when the individual items contain a lot of information and the span should be long when the items contain little information. For example, decimal digits are worth 3.3 bits apiece. We can recall about seven of them, for a total of 23 bits of information. Isolated English words are worth about 10 bits apiece. If the total amount of information is to remain constant at 23 bits, then we should be able to remember only two or three words chosen at random. In this way I generated a theory about how the span of immediate memory should vary as a function of the amount of information per item in the test materials.
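The constant-information hypothesis described above amounts to dividing a fixed budget of bits by the information per item. The sketch below works through that arithmetic; the seven-digit span and the 10 bits per word are taken from the text, while the function name and the extra test materials (binary digits, letters of a 26-letter alphabet assumed equally likely) are illustrative assumptions.

```python
import math

SPAN_FOR_DIGITS = 7              # decimal digits recalled, as quoted in the text
BITS_PER_DIGIT = math.log2(10)   # about 3.3 bits apiece

# Total information in the span if it is measured on decimal digits.
total_bits = SPAN_FOR_DIGITS * BITS_PER_DIGIT


def predicted_span(bits_per_item, budget=total_bits):
    """Span predicted if the number of bits retained were a constant."""
    return budget / bits_per_item


materials = {
    "binary digits": 1.0,
    "decimal digits": BITS_PER_DIGIT,
    "letters (26 equally likely, an assumption)": math.log2(26),
    "English words (about 10 bits, from the text)": 10.0,
}

print(f"budget: {total_bits:.1f} bits")
for name, bits in materials.items():
    print(f"{name}: predicted span {predicted_span(bits):.1f} items")
```

Under this hypothesis the span would range from about 23 binary digits down to two or three words, which is the prediction the experiments were designed to test.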
The measurements of memory span in the literature are suggestive on this question, but not definitive. And so it was necessary to do the experiment to see. Hayes (10) tried it out with five different kinds of test materials: binary digits, decimal digits, letters of the alphabet, letters plus decimal digits, and with 1,000 monosyllabic words. The lists were read aloud at the rate of one item per second and the subjects had as much time as they needed to give their responses. A procedure described by Woodworth (20) was used to score the responses.

The results are shown by the filled circles in Fig. 7. Here the dotted line indicates what the span should have been if the amount of information in the span were constant. The solid curves represent the data. Hayes repeated the experiment using test vocabularies of different sizes but all containing only English monosyllables (open circles in Fig. 7). This more homogeneous test material did not change the picture significantly. With binary items the span is about nine and, although it drops to about five with monosyllabic English words, the difference is far less than the hypothesis of constant information would require.

[Fig. 7. Data from Hayes (10) on the span of immediate memory plotted as a function of the amount of information per item in the test materials. Horizontal axis: information per item in bits. Curve labels: binary digits, decimal digits, letters, letters and digits, 1000 words; dotted line: constant information.]

There is nothing wrong with Hayes's experiment, because Pollack (16) repeated it much more elaborately and got essentially the same result. Pollack took pains to measure the amount of information transmitted and did not rely on the traditional procedure for scoring the responses. His results are plotted in Fig. 8. Here it is clear that the amount of information transmitted is not a constant, but increases almost linearly as the amount of information per item in the input is increased.

[Fig. 8. Data from Pollack (16) on the amount of information retained after one presentation plotted as a function of the amount of information per item in the test materials. Horizontal axis: information per item in bits.]

And so the outcome is perfectly clear. In spite of the coincidence that the magical number seven appears in both places, the span of absolute judgment and the span of immediate memory are quite different kinds of limitations that are imposed on our ability to process information. Absolute judgment is limited by the amount of information. Immediate memory is limited by the number of items. In order to capture this distinction in somewhat picturesque terms, I have fallen into the custom of distinguishing between bits of information and chunks of information. Then I can say that the number of bits of information is constant for absolute judgment and the number of chunks of information is constant for immediate memory.
The span of immediate memory seems to be almost independent of the number of bits per chunk, at least over the range that has been examined to date.

The contrast of the terms bit and chunk also serves to highlight the fact that we are not very definite about what constitutes a chunk of information. For example, the memory span of five words that Hayes obtained when each word was drawn at random from a set of 1000 English monosyllables might just as appropriately have been called a memory span of 15 phonemes, since each word had about three phonemes in it. Intuitively, it is clear that the subjects were recalling five words, not 15 phonemes, but the logical distinction is not immediately apparent. We are dealing here with a process of organizing or grouping the input into familiar units or chunks, and a great deal of learning has gone into the formation of these familiar units.

Recoding

In order to speak more precisely, therefore, we must recognize the importance of grouping or organizing the input sequence into units or chunks. Since the memory span is a fixed number of chunks, we can increase the number of bits of information that it contains simply by building larger and larger chunks, each chunk containing more information than before.

A man just beginning to learn radio-telegraphic code hears each dit and dah as a separate chunk. Soon he is able to organize these sounds into letters and then he can deal with the letters as chunks. Then the letters organize themselves as words, which are still larger chunks, and he begins to hear whole phrases. I do not mean that each step is a discrete process, or that plateaus must appear in his learning curve, for surely the levels of organization are achieved at different rates and overlap each other during the learning process. I am simply pointing to the obvious fact that the dits and dahs are organized by learning into patterns and that as these larger chunks emerge the amount of message that the operator can remember increases correspondingly. In the terms I am proposing to use, the operator learns to increase the bits per chunk.

In the jargon of communication theory, this process would be called recoding. The input is given in a code that contains many chunks with few bits per chunk. The operator recodes the input into another code that contains fewer chunks with more bits per chunk. There are many ways to do this recoding, but probably the simplest is to group the input events, apply a new name to the group, and then remember the new name rather than the original input events.

Since I am convinced that this process is a very general and important one for psychology, I want to tell you about a demonstration experiment that should make perfectly explicit what I am talking about. This experiment was conducted by Sidney Smith and was reported by him before the Eastern Psychological Association in 1954.

Begin with the observed fact that people can repeat back eight decimal digits, but only nine binary digits. Since there is a large discrepancy in the amount of information recalled in these two cases, we suspect at once that a recoding procedure could be used to increase the span of immediate memory for binary digits. In Table 1 a method for grouping and renaming is illustrated. Along the top is a sequence of 18 binary digits, far more than any subject was able to recall after a single presentation. In the next line these same binary digits are grouped by pairs. Four possible pairs can occur: 00 is renamed 0, 01 is renamed 1, 10 is renamed 2, and 11 is renamed 3.
That is to say, we recode from a base-two arithmetic to a base-four arithmetic. In the recoded sequence there are now just nine digits to remember, and this is almost within the span of immediate memory. In the next line the same sequence of binary digits is regrouped into chunks of three. There are eight possible sequences of three, so we give each sequence a new name between 0 and 7. Now we have recoded from a sequence of 18 binary digits into a sequence of 6 octal digits, and this is well within the span of immediate memory. In the last two lines the binary digits are grouped by fours and by fives and are given decimal-digit names from 0 to 15 and from 0 to 31.

TABLE 1
Ways of Recoding Sequences of Binary Digits

Binary Digits (Bits)    1 0 1 0 0 0 1 0 0 1 1 1 0 0 1 1 1 0

2:1   Chunks            10   10   00   10   01   11   00   11   10
      Recoding          2    2    0    2    1    3    0    3    2

3:1   Chunks            101    000    100    111    001    110
      Recoding          5      0      4      7      1      6

4:1   Chunks            1010     0010     0111     0011     10
      Recoding          10       2        7        3

5:1   Chunks            10100      01001      11001      110
      Recoding          20         9          25

It is reasonably obvious that this kind of recoding increases the bits per chunk, and packages the binary sequence into a form that can be retained within the span of immediate memory. So Smith assembled 20 subjects and measured their spans for binary and octal digits. The spans were 9 for binaries and 7 for octals. Then he gave each recoding scheme to five of the subjects. They studied the recoding until they said they understood it, for about 5 or 10 minutes. Then he tested their span for binary digits again while they tried to use the recoding schemes they had studied.

The recoding schemes increased their span for binary digits in every case. But the increase was not as large as we had expected on the basis of their span for octal digits. Since the discrepancy increased as the recoding ratio increased, we reasoned that the few minutes the subjects had spent learning the recoding schemes had not been sufficient. Apparently the translation from one code to the other must be almost automatic or the subject will lose part of the next group while he is trying to remember the translation of the last group.

Since the 4:1 and 5:1 ratios require considerable study, Smith decided to imitate Ebbinghaus and do the experiment on himself. With Germanic patience he drilled himself on each recoding successively, and obtained the results shown in Fig. 9. Here the data follow along rather nicely with the results you would predict on the basis of his span for octal digits. He could remember 12 octal digits. With the 2:1 recoding, these 12 chunks were worth 24 binary digits. With the 3:1 recoding they were worth 36 binary digits. With the 4:1 and 5:1 recodings, they were worth about 40 binary digits.
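The grouping-and-renaming scheme of Table 1 is easy to state as a procedure. The following is a minimal sketch, not a description of how Smith's subjects worked: the function name `recode` is illustrative, and dropping an incomplete final group is an assumption that mirrors the table, where the leftover digits are left unnamed.

```python
def recode(bits, ratio):
    """Group a string of binary digits into chunks of `ratio` digits and
    rename each complete chunk by its numeric value.

    An incomplete chunk at the end is dropped, as in Table 1.
    """
    usable = len(bits) - len(bits) % ratio
    return [int(bits[i:i + ratio], 2) for i in range(0, usable, ratio)]


# The 18 binary digits used in Table 1.
sequence = "101000100111001110"

for ratio in (2, 3, 4, 5):
    names = recode(sequence, ratio)
    print(f"{ratio}:1 recoding -> {len(names)} chunks: {names}")
```

Each increase in the recoding ratio leaves fewer names to hold in memory, while each name stands for more binary digits.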
It is a little dramatic to watch a person get 40 binary digits in a row and then repeat them back without error. However, if you think of this merely as a mnemonic trick for extending the memory span, you will miss the more important point that is implicit in nearly all such mnemonic devices. The point is that recoding is an extremely powerful weapon for increasing the amount of information that we can deal with. In one form or another we use recoding constantly in our daily behavior.

[Fig. 9. The span of immediate memory for binary digits is plotted as a function of the recoding procedure used. The predicted function is obtained by multiplying the span for octals by 2, 3 and 3.3 for recoding into base 4, base 8, and base 10, respectively. Horizontal axis: recoding ratio (2:1, 3:1, 4:1, 5:1). Plot labels: predicted from span for octal digits; one highly practiced subject.]

In my opinion the most customary kind of recoding that we do all the time is to translate into a verbal code. When there is a story or an argument or an idea that we want to remember, we usually try to rephrase it "in our own words." When we witness some event we want to remember, we make a verbal description of the event and then remember our verbalization. Upon recall we recreate by secondary elaboration the details that seem consistent with the particular verbal recoding we happen to have made. The well-known experiment by Carmichael, Hogan, and Walter (3) on the influence that names have on the recall of visual figures is one demonstration of the process.

The inaccuracy of the testimony of eyewitnesses is well known in legal psychology, but the distortions of testimony are not random; they follow naturally from the particular recoding that the witness used, and the particular recoding he used depends upon his whole life history. Our language is tremendously useful for repackaging material into a few chunks rich in information. I suspect that imagery is a form of recoding, too, but images seem much harder to get at operationally and to study experimentally than the more symbolic kinds of recoding.

It seems probable that even memorization can be studied in these terms. The process of memorizing may be simply the formation of chunks, or groups of items that go together, until there are few enough chunks so that we can recall all the items. The work by Bousfield and Cohen (2) on the occurrence of clustering in the recall of words is especially interesting in this respect.

Summary

I have come to the end of the data that I wanted to present, so I would like now to make some summarizing remarks.

First, the span of absolute judgment and the span of immediate memory impose severe limitations on the amount of information that we are able to receive, process, and remember. By organizing the stimulus input simultaneously into several dimensions and successively into a sequence of chunks, we manage to break (or at least stretch) this informational bottleneck.

Second, the process of recoding is a very important one in human psychology and deserves much more explicit attention than it has received. In particular, the kind of linguistic recoding that people do seems to me to be the very lifeblood of the thought processes.
Recoding procedures are a constant concern to clinicians, social psychologists, linguists, and anthropologists and yet, probably because recoding is less accessible to experimental manipulation than nonsense syllables or T mazes, the traditional experimental psychologist has contributed little or nothing to their analysis. Nevertheless, experimental techniques can be used, methods of recoding can be specified, behavioral indicants can be found. And I anticipate that we will find a very orderly set of relations describing what now seems an uncharted wilderness of individual differences.

Third, the concepts and measures provided by the theory of information provide a quantitative way of getting at some of these questions. The theory provides us with a yardstick for calibrating our stimulus materials and for measuring the performance of our subjects. In the interests of communication I have suppressed the technical details of information measurement and have tried to express the ideas in more familiar terms; I hope this paraphrase will not lead you to think they are not useful in research. Informational concepts have already proved valuable in the study of discrimination and of language; they promise a great deal in the study of learning and memory; and it has even been proposed that they can be useful in the study of concept formation. A lot of questions that seemed fruitless twenty or thirty years ago may now be worth another look. In fact, I feel that my story here must stop just as it begins to get really interesting.

And finally, what about the magical number seven? What about the seven wonders of the world, the seven seas, the seven deadly sins, the seven daughters of Atlas in the Pleiades, the seven ages of man, the seven levels of hell, the seven primary colors, the seven notes of the musical scale, and the seven days of the week? What about the seven-point rating scale, the seven categories for absolute judgment, the seven objects in the span of attention, and the seven digits in the span of immediate memory? For the present I propose to withhold judgment. Perhaps there is something deep and profound behind all these sevens, something just calling out for us to discover it. But I suspect that it is only a pernicious, Pythagorean coincidence.

REFERENCES

1. Beebe-Center, J. G., Rogers, M. S., & O'Connell, D. N. Transmission of information about sucrose and saline solutions through the sense of taste. J. Psychol., 1955, 39, 157-160.
2. Bousfield, W. A., & Cohen, B. H. The occurrence of clustering in the recall of randomly arranged words of different frequencies-of-usage. J. gen. Psychol., 1955, 52, 83-95.
3. Carmichael, L., Hogan, H. P., & Walter, A. A. An experimental study of the effect of language on the reproduction of visually perceived form. J. exp. Psychol., 1932, 15, 73-86.
4. Chapman, D. W. Relative effects of determinate and indeterminate Aufgaben. Amer. J. Psychol., 1932, 44, 163-174.
5. Eriksen, C. W. Multidimensional stimulus differences and accuracy of discrimination. USAF, WADC Tech. Rep., 1954, No. 54-165.
6. Eriksen, C. W., & Hake, H. W. Absolute judgments as a function of the stimulus range and the number of stimulus and response categories. J. exp. Psychol., 1955, 49, 323-332.
7. Garner, W. R. An informational analysis of absolute judgments of loudness. J. exp. Psychol., 1953, 46, 373-380.
8. Hake, H. W., & Garner, W. R. The effect of presenting various numbers of discrete steps on scale reading accuracy. J. exp. Psychol., 1951, 42, 358-366.
9. Halsey, R. M., & Chapanis, A. Chromaticity-confusion contours in a complex viewing situation. J. Opt. Soc. Amer., 1954, 44, 442-454.
10. Hayes, J. R. M. Memory span for several vocabularies as a function of vocabulary size. In Quarterly Progress Report, Cambridge, Mass.: Acoustics Laboratory, Massachusetts Institute of Technology, Jan.-June, 1952.
REMARKS ON THE METHOD OF PAIRED COMPARISONS: I. THE LEAST SQUARES SOLUTION ASSUMING EQUAL STANDARD DEVIATIONS AND EQUAL CORRELATIONS*

Frederick Mosteller
HARVARD UNIVERSITY

Thurstone's Case V of the method of paired comparisons assumes equal standard deviations of sensations corresponding to stimuli and zero correlations between pairs of stimuli sensations. It is shown that the assumption of zero correlations can be relaxed to an assumption of equal correlations between pairs with no change in method. Further the usual approach to the method of paired comparisons Case V is shown to lead to a least squares estimate of the stimulus positions on the sensation scale.

1. Introduction. The fundamental notions underlying Thurstone's method of paired comparisons (4) are these:

(1) There is a set of stimuli which can be located on a subjective continuum (a sensation scale, usually not having a measurable physical characteristic).
(2) Each stimulus when presented to an individual gives rise to a sensation in the individual.
(3) The distribution of sensations from a particular stimulus for a population of individuals is normal.
(4) Stimuli are presented in pairs to an individual, thus giving rise to a sensation for each stimulus. The individual compares these sensations and reports which is greater.
(5) It is possible for these paired sensations to be correlated.
(6) Our task is to space the stimuli (the sensation means), except for a linear transformation.

*This research was performed in the Laboratory of Social Relations under a grant made available to Harvard University by the RAND Corporation under the Department of the Air Force, Project RAND. This article appeared in Psychometrika, 1951, 16, 3-9. Reprinted with permission.
There are numerous variations of the basic materials used in the analysis; for example, we may not have different individuals, but only one individual who makes all comparisons several times; or several individuals may make all comparisons several times; the individuals need not be people. Furthermore, there are "cases" to be discussed; for example, shall we assume all the intercorrelations equal, or shall we assume them zero? Shall we assume the standard deviations of the sensation distributions equal, or not? The case which has been discussed most fully is known as Thurstone's Case V. Thurstone has assumed in this case that the standard deviations of the sensation distributions are equal and that the correlations between pairs of stimulus sensations are zero. We shall discuss a standard method of ordering the stimuli for this Case V. Case V has been employed quite frequently and seems to fit empirical data rather well in the sense of reproducing the original proportions of the paired comparison table. The assumption of equal standard deviations is a reasonable first approximation. We will not stick to the assumption of zero correlations, because this does not seem to be essential for Case V.

2. Ordering Stimuli with Error-Free Data. We assume there are a number of objects or stimuli, $O_1, O_2, \ldots, O_n$. These stimuli give rise to sensations which lie on a single sensation continuum $S$. If $X_i$ and $X_j$ are single sensations evoked in an individual by the $i$th and $j$th stimuli, then we assume $X_i$ and $X_j$ to be jointly normally distributed for the population of individuals with

\[
\begin{aligned}
\text{mean of } X_i &= S_i && (i = 1, 2, \ldots, n),\\
\text{variance of } X_i &= \sigma^2(X_i) = \sigma^2 && (i = 1, 2, \ldots, n),\\
\text{correlation of } X_i \text{ and } X_j &= \rho_{ij} = \rho && (i, j = 1, 2, \ldots, n).
\end{aligned}
\tag{1}
\]

The marginal distributions of the $X_i$'s appear as in Figure 1.

[Figure 1. The Marginal Distributions of the Sensations Produced by the Separate Stimuli in Thurstone's Case V of the Method of Paired Comparisons.]
The figure indicates the possibility that $X_2 < X_1$, even though $S_1 < S_2$. In fact this has to happen part of the time if we are to build anything more than a rank-order scale. An individual compares $O_i$ and $O_j$ and reports whether $X_i > X_j$ (no ties are allowed).

We first work through the problem of ordering the stimuli for the case of nonfallible data. For the case of nonfallible data we assume we know the true proportion of the time $X_i$ exceeds $X_j$, and that the conditions given above in (1) are exactly fulfilled. Our problem is to find the spacing of the stimuli (or the spacing of the mean sensations produced by them, the $S_1, \ldots, S_n$ points in Figure 1). Clearly we cannot hope to do this except within a linear transformation, for the data reported are merely the percentages of times $X_i$ exceeds $X_j$, say $p_{ij}$:

\[
p_{ij} = P(X_i > X_j) = \frac{1}{\sqrt{2\pi}\,\sigma(d_{ij})} \int_0^{\infty} \exp\left\{ -\frac{[d_{ij} - (S_i - S_j)]^2}{2\sigma^2(d_{ij})} \right\} \, d(d_{ij}),
\tag{2}
\]

where $d_{ij} = X_i - X_j$ and $\sigma^2(d_{ij}) = 2\sigma^2(1 - \rho)$. There will be no loss in generality in assigning the scale factor so that

\[
2\sigma^2(1 - \rho) = 1.
\tag{3}
\]

It is at this point that we depart slightly from Thurstone, who characterized Case V as having equal variances and zero correlations. However, his derivations only assume the correlations are zero explicitly (and artificially), but are carried through implicitly with equal correlations (not necessarily zero). Actually this is a great easing of conditions. We can readily imagine a set of attitudinal items on the same continuum correlated .34, .38, .42, i.e., nearly equal. But it is difficult to imagine them all correlated zero with one another. Past uses of this method have all benefited from the fact that items were not really assumed to be uncorrelated. It was only stated that the model assumed the items were uncorrelated, but the model was unable to take cognizance of the statement. Guttman (2) has noticed this independently.

With the scale factor chosen in equation (3), we can rewrite equation (2)
\[
p_{ij} = \frac{1}{\sqrt{2\pi}} \int_{-(S_i - S_j)}^{\infty} e^{-y^2/2} \, dy.
\tag{4}
\]

From (4), given any $p_{ij}$ we can solve for $-(S_i - S_j)$ by use of a normal table of areas. Then if we arbitrarily assign as a location parameter $S_1 = 0$, we can compute all other $S_i$. Thus given the $p_{ij}$ matrix we can find the $S_i$. The problem with fallible data is more complicated.
complicated.
fallible data,
have
p'a which
have
we
equation
to
Analogous
Scaling with
Comparison
Paired
3.
(4)
where
the D'i, are
normal
deviate
further
notice
p'ii
"z:^
estimates
of
the
Dij
were
corresponding
that
not
hold
We
conceive
for
Djh
the D'a
Da
to
Si
=^
we
pa
(5)
e-'y'dy.
We
Si"Sj
p'a
to get the
not
be
look
merely
of D'ij
matrix
consistent
in the
the
up
We
.
that
sense
Sj
"
as
(S'i
[D'ij
"
Sk^^ Dik
"
follows:
of the Si's called
Sj
S'i,such
the
D'ij
to construct
that
is to be
S'j)y
"
from
minimum.
(6)
i,i
It will
data.
One
help
can
to
indicate
set up
the
another
Si
Totals
Means
S-i
On
2'Si-nSx
5
"
S,
OF
S2
On
25i
S
of
form
solution
nonfallible
for
Sj matrix:
"
MATRIX
the problem
set of estimates
When
the true
i.e.,
Dij
does
estimates
of
/""
need
the D'a
Data.
have
we
are
Fallible
"
WS2
S2
Si
"
Sn
25i
S
S3
"
"
Si
wâ
S3
Sn
lSi
S
(Sn
"
"
nS"
Sn
by setting Si
Now
(S
Si)
"
the
S'i
(S
"
(5
S^)
"
will
We
on.
so
to minimize
respect
(S
"
this
use
the
(S'i
"
with
Since
S'i
"
of squares
sum
D'^
S'i
S2), Ss
"
plan shortly
for
respect to S'i
need
above
from
for
S'i
only
the
main
which
i "
rivative
partial de-
the
and
"D'ji
we
take
we
"
S'j) matrix, i.e.,terms
"
(6)
expression
S'i
to
DH
and
S'i)
"
with
D'ij
S3), and
"
wish
we
with
(S'j
get S2
we
If
"
S'j
"
concern
selves
our-
diagonal
j
^=
in the
ing
Differentiat-
get:
we
9(2/2)
dS'i
S'j + S'i)-^
(D'ji
S'i
(D'ij
-
S'j)
(7)
j=i+l
j=l
{i=l,2,---,n).
this
Setting
partial
+S'i
S'.
derivative
+S'i-i"
"""
equal
to
(n"l)S'i
zero
have
we
S'i+i +
+S'n
"""
(8)
i-l
-Lf
"
j-l
but
D'ij
D'ji
"
and
D'a
0 ; this makes
be
can
have
and
to
is to
not
of the
be
suggested
in the
by
the
matrix
(8)
choose
from
S'i
of Si
S'i
"
of
we
the
left side
the
have
There
are
This
.
solution
of
(9)
means
S'
ishes.
van-
scale
our
various
by setting
try the
left side
of
only chosen
example,
=
(9)
ways
we
(10) which
(9) to the
or
will
is
umn
total col-
^D'ji/n
j=i
we
for
Sj
of
to set S'i
Then
similarity
l,2,---,n).
parameter.
parameter,
distances
measure
because
location
We
coefficients
expected
assign this location
by setting 5"i
of
j=l
(i
^D'j
assigned
side
written
determinant
This
right
2 D'ji
i=i+l
^S'j-nS'i
The
the
D'ji
j=l
(8)
j=i+l
2 D'ji + D'a
Thus
{i=l,2,--,n)
-Lf ij
ji
"
^D'ji/n.
j=i
(10)
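The least squares solution marked (10) above, in which each scale value is the mean of a line of normal deviates, can be sketched in a few lines. This is only an illustration under stated assumptions: `p[i][j]` is taken to be the observed proportion of times stimulus i is judged greater than stimulus j, the deviate `D[i][j]` is taken to estimate S_i - S_j on the scale fixed by 2σ²(1 - ρ) = 1, and the location is fixed by setting the mean scale value to zero. The function name `scale_values` and the index convention are assumptions of this sketch; the printed table may use the opposite row and column order.

```python
from statistics import NormalDist


def scale_values(p):
    """Estimate scale values from a matrix of paired-comparison proportions.

    p[i][j] is assumed to be the proportion of judgments in which
    stimulus i exceeded stimulus j (strictly between 0 and 1 off the diagonal).
    """
    n = len(p)
    unit_normal = NormalDist()
    # Convert each proportion to a unit normal deviate; the diagonal is zero.
    deviates = [
        [0.0 if i == j else unit_normal.inv_cdf(p[i][j]) for j in range(n)]
        for i in range(n)
    ]
    # With the mean scale value set to zero, each estimate is a line mean.
    return [sum(row) / n for row in deviates]


if __name__ == "__main__":
    # Hypothetical proportions for three stimuli (illustrative values only).
    proportions = [
        [0.50, 0.31, 0.16],
        [0.69, 0.50, 0.31],
        [0.84, 0.69, 0.50],
    ]
    print(scale_values(proportions))
```

Proportions of exactly 0 or 1 have no finite deviate, which is the practical difficulty the text raises for extreme stimuli; this sketch does not handle them.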
i^l
when
that
Notice
S'i
0 and
that
because
which
because
happens
double
;'
and
term
every
Therefore, substituting
sum.
negative
its
of
left side
in the
(10)
this
in
appear
(9)
we
have
^D'j,/n-:ZD'ji/n
2i5';i-n
(11)
^D'r.,
;=l
is
linear
transformation
an
The
the
of
this
solution
of
sense
not
to
known
condition
be
This
the
p'ij tend
This
to
and
zero
S'i will be
the
After
table.
in
in
solution
the
seems
been
have
lines.
these
entirely satisfactory because

stimuli are
extreme
compared.
when
numbers
in the
order, the
proper
say,
of columns
preliminary arrangement
approximately
D'a table.
beyond,
all numbers
by excluding
customary
it may
closely along
of
is not
unity
met
the
solution
although
worked
he
that
squares
for
assumption
the
least squares
unsatisfactorily large
difficultyis usually
the
least
literature
solution
squares
introduces
from
since
(3),
that
show
to
is
this
the
in
mentioned
least
is
That
(6).
to Horst
is
and
comparisons
paired
to
any
course,
equally satisfactory.
to provide a background
to indicate
is unnecessary,
Of
solved.
are
is
presentation
paired comparisons,
correlations
zero
equations
solutions
of the
of
point
theory
the
identity, and
which
so
This
2.0
that
quantity
^{D'ij-D\,i,,)/k
is
where
computed
entries
such
in both
appear
discussion
method
which
therefore
take
reasonable
the
of
account
of the
It should
D'a
also
one
original y'a
the
because
In
.
to be
seem
be
the
remarked
we
other
The
differing
this
words,
want
a
more
ford's
Guil-
(see for example
This
to
for methods
computations
variabilities
unmercifully
really
between
paired comparisons).
results.
that
which
of i for
differences
Then
of
method
k values
separations
scale
give reasonable
to
seems
of
the
over
y+1
j and
the
as
(1)
is
summation
column
taken
are
means
the
of
the
p' a
and
extensive.
solution
check
is not
our
reasonable
entirely
results
solution
against
might
such
be
one
by
p"i;
that
the
once
minimize,
and
S'i
computed
are
we
can
the
estimate
p'a
say,
'2ip'ij
or
p"ijy
"
perhaps
Such
thing
can
do
attempts
method
not
be
to
sin
(arc
VP
doubt
no
differ
to
seem
be
sin
arc
"
ii
but
done,
^/p"ii)^.
the
from
enough
results
the
of
results
author's
the
of
the
present
pursuing.
worth
REFERENCES
1. Guilford, J. P. Psychometric Methods. New York: McGraw-Hill Book Co., 1936, 227-8.
2. Guttman, L. An approach for quantifying paired comparisons and rank order. Annals of math. Stat., 1946, 17, 144-163.
3. Horst, P. A method for determining the absolute affective values of a series of stimulus situations. J. educ. Psychol., 1932, 23, 418-440.
4. Thurstone, L. L. Psychophysical analysis. Amer. J. Psychol., 1927, 38, 368-389.
Manuscript received 8/22/50
interval Pyt is 0.5 and
time
interval.
one
in
response
in
responses
obvious
Equations (1) and
probabilityof
and
Assume,
however,
that
at
response
time
probabilityof
rate
tions
equa-
gettingno
into
measure
continuous
intervals
interval
in the
bility
proba-
distribution
is zero,
able.
specifiand
stimulus
the
but
is finite and
between
conditions
stimulus
some
between
particulartime
time
directlyconsidered
the
within
getting exactly
relation
the
with
any
given
the
to
is not
thus
probabilityof
transform
dealing
within
0.69/r. Equation
/ is
of responses
Po is e~''^.
case
to
us
are
a
response
response
the
consider
we
In this
we
usually refers
Latency
of
when
probabilityof
(time),the
median
the
or
numbers
example,
(2) permit
Since
measure.
rate
For
interval, T.
an
loge0.5
The
interval, T, is {rT)e~'^'^.
an
(2) is
(1) and
is
"rl
probabilityof various
(2) gives the
specifiedtime
previous development.
are
of the
determinant
one
responding, that is, that the rate has different values for different stimulus conditions. This assumption is consistent with the discussions by Skinner and others who have emphasized the measurement of rate; the assumption would presumably be an elementary requirement for any measure.
the
Under
circumstances
latency,
of the
beginning
since
of the
the
observation
the first response.
Thus,
determinant
of
of rate
are
responses
of the
the
on
delay
distributed
distribution
value, say
the
but
on
the
in time,
length / between
of latencies
time
employed
interval
between
the
presented) and
conditions
previousassumption
a
in discussing
of the
statement
that
rate
are
of
the
sponding
re-
conditions
of the first response.
occurrence
the
be
may
period (when a stimulus was

the assumption that stimulus
specifiedstimulus
of
be
would
responding and
randomly
under
latter
assumption, t
also
the
This
of the
median
ment
implies a probabilitystateand
of
the
stimulus
presentation
tells
statement
not
us
relationshipbetween
latency, and
the
rate
of
only of
some
the
sentative
repre-
responding;
for
latency,
probability of a response greater than the median latency, $t_{md}$, is 0.5; and from equation (1) we see that $-r\,t_{md} = \log_e 0.5$, or that the median latency equals $0.69/r$.
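The relations among rate, interval probability, and median latency can be checked numerically. The following is a minimal sketch, assuming equation (1) has the exponential form P(interval > t) = e^{-rt} and that the count of responses in an interval T follows the Poisson form used for equation (2); the function names are hypothetical, and the rate of 0.20 responses per second is the value quoted for the figure 1 session.

```python
import math


def prob_interval_exceeds(t, r):
    """Probability that an inter-response interval (or latency) exceeds t."""
    return math.exp(-r * t)


def median_latency(r):
    """Median latency: the t at which prob_interval_exceeds(t, r) is 0.5."""
    return math.log(2) / r  # log(2) is approximately 0.69


def prob_n_responses(n, r, T):
    """Probability of exactly n responses in an interval of length T."""
    return (r * T) ** n * math.exp(-r * T) / math.factorial(n)


if __name__ == "__main__":
    r = 0.20  # responses per second
    print("median latency (sec):", median_latency(r))
    print("P(interval > 5 sec):", prob_interval_exceeds(5.0, r))
    print("P(no response in 5 sec):", prob_n_responses(0, r, 5.0))
```

The last two printed values coincide, since an interval longer than T is the same event as no response within T.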
The preceding development does not imply any particular theory of conditioning but may be incorporated into a large class of theories. For example, if the foregoing discussion is combined with a theory that states that rate of responding is proportional to the number of responses that remain to be given in extinction, the measure of number of responses in extinction is immediately related to our latency and probability terms. In other words, if

$$r = k(N - n), \tag{3}$$

where $N$ is the number of responses in extinction, $n$ is the number of responses already given, and $k$ is a constant, we may substitute for $r$ in equation (1) and obtain

$$P_{>t} = e^{-k(N-n)t}. \tag{4}$$
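To see what the extinction relation implies, the sketch below evaluates the rate, the interval probability, and the median latency at several stages of extinction. It assumes the forms r = k(N - n) and P(interval > t) = e^{-k(N-n)t}; the values k = 0.002 and N = 100 are hypothetical and chosen only so that the initial rate is 0.20 responses per second.

```python
import math


def rate(k, N, n):
    """Rate of responding when n of the N extinction responses have been given."""
    return k * (N - n)


def prob_exceeds(t, k, N, n):
    """Probability that the next inter-response interval exceeds t."""
    return math.exp(-rate(k, N, n) * t)


def median_latency(k, N, n):
    """Median interval at this stage of extinction (0.69 / rate)."""
    return math.log(2) / rate(k, N, n)


if __name__ == "__main__":
    k, N = 0.002, 100  # hypothetical constants for illustration
    for n in (0, 25, 50, 75, 90):
        print(n, rate(k, N, n), median_latency(k, N, n), prob_exceeds(5.0, k, N, n))
```

Under these assumptions the median interval lengthens as extinction proceeds, so the distribution of intervals falls off less steeply at later stages.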
This
equation
terms
N,
n,
and
number
the
distribution
distribution
through
for direct evidence
above
The
course
outlined
data
of
figure 1
in
were
this interval
are
this
as
taken
distributed
getting
is the rate
of
text.
The
in
The
data
bar-press
line
drawn
equation (1).
well
from
The
session
as
for evidence
relating to
between
obtained
measurements
data
the
rate
the
is whether
responses
the
of
responding
the responses
that
greater than
same
the
a
ditioning.
periodicrecon-
of
Equation (1) states
responding expressed in
during
represent the responses
of "three-minute"
period
question at issue
randomly.
interval
an
intervals
time
rats
observation
The
constant.
the
data
examine
to
SECONDS)
white
plot of
of
assumption
25
in seconds.
time
the
of randomness.
diu-inga 20-minute
approximately
is
data
the
20
inter-response
of randomness
Within
where
periodic reconditioning.^
singleanimal
of
described
as
the
consequences
in
FIGURE
experiment with
an
situation
IN
t is
t, where
than
from
15
predict
tion.
stages of extinc-
at various
from
to
strength and
constant
mainly
(TIME
of
percentage
greater
for
responses
10
the
latency, rate
be used
in time, it is of interest
The
equation (4) may
follows
for
(4)
relationshipsamong
in extinction
of responses
k{N"n)
substitute
may
existing among
relationships
the
to
in extinction,
present argument
are
for
intervals between
distribution
random
examined
of responses
of time
the
be
we
-k(N"n)t
"
In addition
/.
of responses
and
Since
then
may
161
MUELLER
units
as
the
/
/.
was
in
bility
probais
e^",
In the
session, 238
20-minute
and
the
that
responses
measiu'ed
the
data
intervals,theory specifies
uniquely. In this
greater than t (in seconds) is
of the
percentage
the various
time
the
figure 1
date
consistent
are
per
corded,
re-
Thus,
in
with
intervals
values
the
theory
found
be
may
tween
be-
on
specified
represents
assumption
the
that
occurred
from
deviations
example,
through
were
second.
randomly in time.
of the agreement
be representative
data of figure1 may
certain cases
of
theory under the conditions specified,
and
data
for
The
responses
the
Although
greater than
were
function.
theoretical
the
shows
figure1
solid line
The
abscissa.
the
of
ordinate
g-o.2oj. xhe
response
intervals
responses
interval
time
time
237
of time
intervals between
probabihtyof gettinga
the
case
the distribution
to
of time
the distribution
is 0.20
session
in this
direct reference
without
made,
were
responses
rate
One class of systematic deviations may be noted in cases where animals show marked "holding" behavior, i.e., where the bar is depressed and held down for many seconds. Although the "holding" period is not a "refractory" period in the usual sense of the term, it obviously affects the data in a similar way. During the "holding" period, the probability of response occurrence is zero. One complicating feature in analyzing responses characterized by "holding" is the fact that "holding" is of variable length.
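One way to allow for "holding" of variable length, in line with the treatment the text applies to the figure 2 data, is to measure each interval from the end of one response to the beginning of the next and to express the rate per unit of "available" time (total time minus "holding" time). The sketch below is an illustration only: the record format, a list of (press start, press end) times in seconds, and both function names are assumptions of this example.

```python
def end_to_start_intervals(presses):
    """Intervals from the end of one response to the beginning of the next."""
    return [
        next_start - end
        for (_, end), (next_start, _) in zip(presses, presses[1:])
    ]


def available_time_rate(presses, session_length):
    """Responses per unit of available time (session time minus holding time)."""
    holding_time = sum(end - start for start, end in presses)
    return len(presses) / (session_length - holding_time)


if __name__ == "__main__":
    # Hypothetical record: three bar presses in a 10-second observation period.
    presses = [(0.0, 1.0), (3.0, 5.0), (6.0, 6.5)]
    print(end_to_start_intervals(presses))
    print(available_time_rate(presses, 10.0))
```

The corrected rate can then be used in the exponential form of equation (1) to obtain the theoretical curve for the corrected intervals.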
The
data available
at
present
of this
problem, but the simplicity

the factor of
result from apparatus changes designed to eliminate
that may
from the additional
"holding" and the advantages that may accrue
response
do
warrant
not
extensive
an
be shown.
specification
may
An
shown
example of a distribution showing systematicdeviations from theory is

The
in figure 2.
computations and plot are similar to those in
ordinate
figure 1.
The
responses
greater
is correct
except
the
appHed
to
time
the
in the
and
when
the
of
as
solid
line
in the
reference
in
case
the
to
the function
is
tribution
dis-
appears
analysisleading to equation (1) and applied to

when
period
applied to all portions of the observation
be
additional
An
then
in
test
"holding."
may
spent
data
the
which
from
one
the rate
response
time
term,
minus
of
form
in the
figm-e 2
of the
beginning of
abscissa
specified
total
determined,
responding without
fitis obviously poor;
The
meastu-ement
meastuements
the end
was
The
values.
between
the
that
assume
1
figtu-e
Hne
intervals
asymmetric.
and
us
percentage of
of
rate
intervals.
of time
Let
of the
the
the
specifiedabscissa
the
constant
figure1, directlyfrom
sigmoid
represents
than
The
theoretical.
the
treatment
values.
obtained.
interval between
Figure 3
shows
plot of the percentage of

beginning of the next that
The
Now
we
the end
the
are
of
ested
interone
sponse
re-
results of such
intervals between
were
solid line through the data
equal to the ratio of the number

"holding" time, i.e.,to the number
r, is set
the
the next.^
the
and
time
was
greater than
is theoretical
of responses
to
of responses
CONRAD
G.
163
MUELLER
FIGURE 2. A plot similar to figure 1 showing the deviation from theory in cases of "holding" behavior. The line drawn through the data is from equation (1). Abscissa: t (time in seconds).

FIGURE 3. The data of figure 2 "corrected for holding." The plot is similar to that in figures 1 and 2, except that the time interval is measured between the end of one response and the beginning of the next response. The line drawn through the data is theoretical and is described in the text. Abscissa: t (time in seconds).
time.
per unit of "available"

the
independently of
to the
relevant
Data
The
agreement
percentage
between
of time
and
Yamaguchi
In the
straightline
of intervals.
evaluate
case
are
the
Hull^ is shown
In the
of
is evaluated
in
ous.
numer-
reported
figure4,
where
plotted.
are
of the
latency data under

tion
independently of the distribu-
figure4
fitted to
not
data
specifiedabscissa values
curve.
possibleto
intervals.
slope of
the
2 the constant
present theory and
of latencies greater than
it is not
consideration
and
distribution
the
solid line is the theoretical
The
by
of the
figures1
present analysisof latency measures
by Felsinger, Gladstone,
the
in
As
shape
case
the constant
was
determined
plot of logeP"/ againstt.
0123456789
IN
(TiME
SECONDS)
FIGURE
The
of latencies
percentage
than
greater
may
tests
be made.
that
have
One
promising.
of time
intervals
tried
animals,
will be
and
the
curve
between
as
from
are
say, response
distributed
found
for
in
manner
In other
in figure 1 will
of the
of the formulation
tests
not
sampling
available.
data
and
the
of animals
at
For
the
theory
is
distribution
comparable
tion
stage in extincspecified
of
i?" and Rn+i^ for a largenumber
in figure1
similar to that shown
at
(thereforethe steepness
n.
further
concerns
number
expectationis that
The
constant,
tested
been
figure 1 of Felsinger,
Hull.*
between
agreement
has
responses
systematicallywith
such
the
predictionthat
the intervals between,
vary
data
and
be
been
stages in extinction.
that
The
gained at this time by

additional
of equation (1),but many
tests appropriatedata are
For some
little is to
consequences
t.
Yamaguchi
Gladstone,
Probably
of
drop
of the
curve)
will
words, the steepness of the drop of
depend
on
where
in extinction
the
inter-
CONRAD
vals
measured.
are
measured,
cases
extinction
fact
In
this
the
although
is not
G.
expectation
number
of the
features
of the
distributions
of the
many
values
one
may
changes
in another,
changes
say
using different
data
functions
only
that
Summary.
"
responses
obtain
we
directlya
(or of
any
of the
function
obtain
intervals
number
length
the
that
probability
the
these
where
between
various
between
terms
values, the
time
of
of
and
are
responding
and
to
In
for
of
addition
has
we
that
assume
in time,
occurrence
of
interval
responding
inter-responsetime
specified
out
be
fixed
yet
responses
to
to
the
consequences
and
about
the
the
occur,
latency
statements
for
to
bility
probaaverage
distribution
of
and, by extension, for the distribution
responses
also
we
sponse
reas
responding,it turns
various lengths may
of
be related
may
rate.
present formulation
between
to
measures
that, for any

of
ing.
test-
probabilityvalue,
be specified. (3) Finally,
may
theory specifying the relationship
number
in extinction
as
rate
some
rate
theory
distributed
of
assume
responding, or,
added
of
occurrence
we
If
specifiedtime
of the rate
of latencies
occurrence
aid to
(1)
probabilityof
If
Therefore,
multiplicityof
descriptivestatistics not
randomly
of
the
relationships
among
are
and
corresponds
median.
and
considered.
of the
latency
well
as
intervals
latency
been
in terms
mean,
actually an
some
probabilityof
rates
of responses
number
of
the
or
is
but
of
account
considerations
rate
mean
of representative
preceding equations,
arithmetic
of different
of the interval
condition, there
relation
the
comparable
use
varying lengths. (2)
specifiedfor
basis
of responses) within
stimulus
the
of the
geometric
of the
statement
of
arbitraryselection
made
Since
eliminate
possibleto
the
the
statement
it is
the
are
has
account
data.
experimental
On
"free-response"situation
of
stage
quency
possibilityof specifyingthe fre-
statistic,say
theoretical
present
with
difficult problem
of
one
of strength of conditioning has

in
the
by
out
each
at
of the
discussed
data.
arise from
may
form
is the
measures
statistics
to pose
ceases
in
the
treatment
associated
summarizing
state
borne
of measurements
that
account
of the
problems
in
be
to
seems
large.
Finally,it may be pointed out

for the
important consequences
one
165
MUELLER
of
measures.
1 Hull, C. L., Principles of Behavior, D. Appleton-Century Co., New York, 1943. Skinner, B. F., The Behavior of Organisms, D. Appleton-Century Co., New York, 1938.
^
slightly different
i.e.,that
getting
period
after
immediately
a
to
response
randomness
equation
is
zero.
is
we
we
there
response
If
if
results
is
that
assume
instantaneous,
the
that
assume
"refractory"
the
transition
probability of getting
t is
P"t
period during which
-r{t-to)
the
from
an
the
interval
period exists,
probability of
"refractory"
greater
than
where
treated
*
as
The
"refractory"
the
is
to
gradual
of
Laboratories
is
This
for
useful
*
37,
merely
is
intervals
at
first
measure
make
with
this
in
analyses
of
the
transition
is
Psychological
is
next
that
that
show
may
the
indicate
the
the
independent
not
is
approximation
distribution
the
of
expression
an
and
Hull
fact
the
does
that
0.5
at
the
have
not
The
point
reasons.
intervals
step
of
use
normal.
the
two
the
starting
zero,
The
more
of
deviation
with
than
greater
appropriate.
Psychol.,
Exptl.
for
distribution
were
more
[/.
Hull
formulation
our
frequency
be
and
of
test
latency
for
is
Yamaguchi
in
exponential
Yamaguchi
the
notion
the
After
has
from
the
responses
short
zero
as
of
in
distribution
step
lower
method
second
maximum
marizing
sum-
figure
4.
reported
frequency
by
the
at
many
of
under
for
of
stimulus
and
same"
light
[J.
is
would
the
stimulus
first
of
conditions.
stays
more
and
on
until
the
and
long
to
subsequent
sure
in-
would
the
notion
used
procedure
the
sponse
re-
one
procedures
closely
of
measurement
cedure
pro-
(1948)]
96-123
sponse,
re-
theory
present
sufficiently
The
responding.
response
time
the
experimental
experimental
parallel
the
permitting
26,
and
of
the
tinuous
con-
extend
to
occurrence
of
an
Psychol.,
presented
Such
of
the
the
possible
is
test
from
period
fixed
rate
before
obtained
Prick
and
end
it
equivocal
less
(Frick).
stimuli
advantage
"the
cit.).
some
responses
determine
the
which
A
latencies
light,
Although
stimuli
than
role
important
more
bar.
the
duration
(op.
on
transient
additional
onset
stays
or
of
of
discriminative
transient
many
play
may
of
no
say,
the
required.
Skinner
of,
period
bar
presence
are
by
used
sort
the
that
distribution
conditions
the
of
stimuli
to
unspecified
stimulus
of
assumed
the
with
occurrence
minimize
be'
may
exposure
(Skinner)
occurs
the
it
assumptions
expected
be
others.
between
an
would
point
associated
present
of
between
the
length.
Kaplan
beginning
optimal
shortest
account
may
additional
Frick
variable
Michael
procedure
our
reported
the
an
place,
second
ones
that
if
complex
interval.
the
may
of
the
Gladstone,
are
If
of
associated
the
data
Gladstone,
step
In
response
provide
not
lowest
data
deviation
Felsinger,
and
results
Felsinger,
zero.
easily
the
The
one
The
may
the
the
at
could
limit
that
begin
which
more
Subsequent
approximation.
of
by
(1947)
first
Mr.
by
is
has
present.
214-228
The
first
end
experiment
The
period
recorded
were
period.
the
"refractory"
the
University.
the
"holding"
the
if
or
here
Columbia
between
interval
of
one
formulation
The
period.
reported
data
time
interval
intervals
by
Any equipment which the observer uses

Therefore
the voltagewith which
"receiver."
"receiver
The
input."
"receiver"
the
of
this
by
who
the
is
placed
receiver.
here
and
out
instructions
fact that
have
been
various
these
the
actual
performanceis evaluated
its
called the
presentedis called
the
primarilyin specifying
of statisticalmethods
applications
intended
are
minimum
possess
subsequent sections
In
carried
the
on
consist
may
judgment is
is
observer.
signaldetectability.
They
problem
to
subjectfor those
definitions of "optimum"
to
the
this
observer
sections of this article survey the
first three
The
instructions
optimum
be used
to
make
to
the
to
serve
as
of mathematical
proposed by
definitions
other
lead
introduction
an
Several
training.
authors.
Emphasis
the same
essentially
optimum receiver is
of practical
cases
to
of the
specification
for some
numerically
interest [17].
1.1 Population SN and N

Either noise alone or signal plus noise may be capable of producing many different receiver inputs. The totality of all possible receiver inputs when noise alone is present is called "Population N"; similarly, the collection of all receiver inputs when signal plus noise is present is called "Population SN." The observer is presented with a receiver input from one of the two populations, but he does not know from which population it came; indeed, he may not even know the probability that it arose from a particular population. The observer must judge from which population the receiver input came.

1.2 Sampling plans

A sampling plan is a system of making a sequence of measurements on the receiver input during the observation interval in such a way that it is possible to reconstruct the receiver input for the observation interval from the measurements. Mathematically, a sampling plan is a way of representing functions of time as sequences of numbers. The simplest way to describe this idea is to list a few examples.
that
the
interval.
observation
interval begins
A:
Fourier
series on an
Suppose
at time t^ and is T seconds
long,and that each function in the population6'A'ând A'^can
A
be
in
expanded
for
series
Fourier
receiver input can

particular
each
input,which
in turn
can
ao +
process
(oq,ai, bi,
InntjT
On,
in the
population the
(^0,"î,bi,
.
ay,
of
bn ) is
that
on
by
the
formula
t^ "
-"-
"
to +
(1)
T.
function
.) is
Fourier
x(t) by
the
samplingplan
series which
sequence
in the
involve
sense
the
of its Fourier
described
cosine
and
efficients
co-
above.
sine
of
For such a
simply "series-bandlimited."
the
finite
representingeach receiver input x(t) by
sequence
a finite sample plan.*
series
sense
sampling plan is finite if there

inputsin the population.
A
measurements
population
cyclesper second. Suppose that for a particular
the population
of frequencygreaterthan njTa.ve zero; i.e.,
Fourier
process
bn,
in the
frequencynjT
inputsthe terms
is bandlimited
receiver
is of
of receiver
pairof terms
The
Zj"sm
each
representing
by making
coefficients
Fourier
The
l-rrnt
+
""
interval.
these measurements
from
lirnt
^71 COS
Z
n
the
obtained
be
be reconstructed
^
x(t)
Thus
observation
the
on
or
is a finite maximum
lengthfor
the sequences
for all
W.
Shannon''s
B:
all time
and
that
frequencies
greater than
(.
x{t^
this
x{t) by its
njlW),
the
case
\J2W\
The
instants
0 and
of time
a^
^0
receiver
from
input
is
populationis to
of the
represent
to)
apart,
.). In
inputis
receiver
77-[2
W(t
called
are
zero
/?]
(2)
2H^/
?(,+ n/lW
includes
band
spaced 1/2[^seconds
XjlW),
x(to+ nj2W),
x{t^ +
x(to),
sin
interval
to
every
this
samplingplan
at times
amplitudemeasured
[2] for the reconstruction
x(t)
of
for
169
FOX
observation
the
transform
W.
x(to
...,
formula
that
C.
W,
"transform-bandlimited"
Fourier
cyclesper second, i.e.,the
AND
BIRDSALL,
populationsare
to
function
G.
samplingplan. Suppose
the
for
each
T.
PETERSON,
W.
7T[2Wit
to)
n]
Each
sampling-times.
If the
all
different
observation
choice
of
/q between
samplingplan.
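The reconstruction formula for sampling plan B can be illustrated numerically. The sketch below assumes the usual form of that formula, in which x(t) is the sum over n of x(t0 + n/2W) multiplied by sin π[2W(t - t0) - n] / π[2W(t - t0) - n]; the function names and the short list of sample values are hypothetical. Because only finitely many samples are summed here, the result between sampling times is an approximation to the infinite sum, not an exact reconstruction.

```python
import math


def interpolation_term(u):
    """sin(pi*u) / (pi*u), with the limiting value 1 at u = 0."""
    if u == 0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)


def reconstruct(samples, t, t0, W):
    """Value at time t rebuilt from amplitudes measured 1/(2W) seconds apart.

    samples[n] is assumed to be the amplitude measured at t0 + n / (2W).
    """
    return sum(
        x * interpolation_term(2 * W * (t - t0) - n)
        for n, x in enumerate(samples)
    )


if __name__ == "__main__":
    W, t0 = 2.0, 0.0                  # bandwidth (cycles per second), first sampling time
    samples = [1.0, -0.5, 0.25, 2.0]  # hypothetical amplitudes
    print(reconstruct(samples, 0.5, t0, W))    # at a sampling time
    print(reconstruct(samples, 0.375, t0, W))  # between sampling times
```

At each sampling time every term but one vanishes, so the formula returns the measured amplitude itself; a different choice of t0 gives a different set of sampling times and hence a different sampling plan.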
againincludes
from
transform-bandlimited
to a frequencyband
time, but
populationsare
then each receiver input
^/2 to/o + W/l which does not contain zero frequency,
/o
be considered
and
modulated
as
an
waveform,
x(t) can
x(t)
amplitude
frequency
ous
+ B{t));r(t)is the amplitude of the envelopeand
6(t)is the instantaner(t)cos{iTrfot
is obtained in this
phase of the carrier. A sampling plan employingsampling-times
each
receiver
the
Kô + "1^)^
case
by representing
inputby
sequence (. Kô)'^(ô)'
a
1/2ff yields
interval
the
"
6(tQ+ njlV), .) of envelope amplitudes and carrier phases

times spaced by 1/ff seconds
apart [1]. The reconstruction of
this
sequence is givenby
.
n=
that
the
^0 +
''
77; I
CO
"
2-/o?+ 0|ro+-
cos
"
Fourier
have
all time.
observation
If the
populationsare
series-bandlimited,then
times
similar
are
populationsand
beginningof
populations
plan for this
rrlW^t
can
in
from
obtained
be
of its amplitudes measured
0 to
inputfrom
n]
to)
to)
n]
interval.
therefore
and
appliesonly
the
the
when
Only
hypothesis
observation
lengthand if the
are
sampling plansutilizing
samplingfor
B
paragraph
interval
interval.
which
interval,
the observation
situation
there
described
those
series-bandlimited
are
sequence
to
infinite observation
an
transforms,
populationsare
interval includes
the
sampling-
at
(3)
''I
for all times
known
which
"
the receiver
Sampling plan using sampling-times

for a finiteobservation
C:
functions
"
sin TT[W{t
"t)
"
measured
is of
Suppose
is T seconds
f^
finite
that
time
long,and
cyclesper second.
by representingeach
seconds
l/2P'f^
is measured
that the
suppose
A
finite
receiver
from
sampling
input by
the
apart [1]
x{to),x\to+
fo +
(4)
T2W
2W
and
the
reconstruction
of the
2Trr-i
^1
receiver
input from
sin7r[2W{t
this sequence
is
n]
t^)
-
,0 "
2W(t
-to)
"
T.
(5)
-n
2^Frsin
2WT
Again
each
choice
samplingplan.
In
of the
a
(initial)
sampling-timeto between
similar
fashion,
if the
observation
0 and
interval
1/2^F yieldsa
is
unchanged
different
but
the
series-bandlimited
on
are
populations
which
include
does
not
to/o + W/2
this interval
to
frequency,then
zero
from
frequencyband
each
receiver
/q
Wjl
"
be
inputcan
+ ^1^), ^(ô+ V^),

''(''o
representedby a finite sequence [Kô)'^(''o)'
fUo + T
at sample
IjW), 6(t()
+ T
l/fV)]of envelopeamplitudesand carrier phases measured
points 1/^F seconds apart; /q is againused to denote the initial samplingtime which
of the receiver inputfrom
may be chosen anywhere from 0 to 1/ff The reconstruction
-
"
"
"
"
this
of measurements
sequence
is
givenby
"
Infy
dlto+
these
From
examples
that
seen
kind, e.g., instantaneous
same
kinds, e.g., envelope amplitude and

the
common
made
property
role which
of
primarilyone
representedas
methods.
that
the receiver
the
an
can
in order
to
ferences
important dif-
from
the
of different
have
in
ments
measure-
Grenander
back
the
theory presentedin
The
populationsN
samplingplansin order
concerning an
to the
available for
that
SN
paper is
will
be
applystatistical
receiver, it is often
languageof receiver inputs.

of the theory,then
particular
application
parameters of the "optimum"

Both
for this reason
and
plans.
approximatedby usingfinite-sampling
the
the
here
where
is
restricted
to cases
simpify exposition, theorypresented
by
[3] shows
to
this
familiar
more
a
"optimum"
and
the
desired
be
available.
finite-sampling
plansare
2.
2.1
of
use
is obtained
answer
to translate this answer

possible
a finite-sampling
plan is not
work
of
be reconstructed
inputcan
convenience.
throughthe
sequences
Once
receiver
number
or
amplitude measurements,
phase. However, they all
carrier
sampling plan plays in
mathematical
If
recent
are
it.
on
The
there
samplingplans such as (a) the length of the observation

the measurements
are
employed, and (c) whether
sampling-times
be of the
all to
be
can
various
between
interval,(b) whether
are
it
(6)
"T.
"t
2. Optimum Tests on Fixed Observation Intervals

2.1 Probability density functions

This part of the paper is concerned with a method of statistical analysis which requires for raw data a finite sequence of numbers $(x_1, x_2, \ldots, x_n)$, which is the result of measurements made at the receiver input according to some particular finite sampling plan. The sequence is often called a "sample" from the population from which it arose, and is denoted by a single letter; thus, if the receiver input is $x(t)$, and the sampling plan yields a sequence $(x_1, x_2, \ldots, x_n)$, then this sequence is called the sample $X$.

The theory to be developed here is intended to specify an optimum receiver and is couched in the language of samples, $X = (x_1, x_2, \ldots, x_n)$. If $n$ is very large, a receiver which had to make the measurements called for by a sampling plan would certainly be impractical. However, this practical difficulty is avoided when the specification of the receiver is translated back from the language of samples to the language of the receiver inputs; this can be done because it is possible to reconstruct the inputs from the samples.
For the purposes of the subsequent development any finite sampling plan may be used, provided enough properties of the associated sample $X$ are known so that certain probabilities may be calculated. Specifically, the probability density functions $f_{SN}(X)$ and $f_N(X)$ for the sample variable $X$, for the two cases when $X$ is drawn from populations $SN$ and $N$, respectively, must be known.* The basic properties of density functions to be considered are

$$\int f_N(X)\,dX = 1, \qquad f_N(X) \ge 0, \tag{7}$$

and

$$\int f_{SN}(X)\,dX = 1, \qquad f_{SN}(X) \ge 0,$$

where the integration symbol represents the multiple integral taken over the entire range of the sample variable $(x_1, x_2, \ldots, x_n)$.

2.2 The concept of a criterion

Consider now an observer who has as available data the sample $X = (x_1, x_2, \ldots, x_n)$. The observer's job is to judge for each sample whether or not it was taken from population $SN$.

Although it is not possible to determine the (probably subconscious) criterion used by the observer, it is quite possible to find an external manifestation of it. Ideally all that is necessary is to submit each possible sample to the observer and to record his judgment. This will yield a tabulation of those samples which the observer decided were drawn from population $SN$. If any other observer is given this tabulation and instructed to base his decisions on it, he will behave exactly as did the first observer. Thus, the tabulation of these responses can be used to replace the mental criterion employed by the observer. Such a tabulation will also be called a criterion and will be denoted by the letter $A$, which in the phraseology common to statistics refers to "Accepting the hypothesis that a signal is present." The tabulation of the remaining samples, those which the observer concluded were drawn from population $N$, will be denoted by $B$.

2.3 Probabilities associated with criteria

There are, of course, as many possible different criteria as there are observers. Among all criteria it is necessary to select those that are best for various purposes. To do so, certain numerical quantities must be associated with each criterion. It will be necessary to know the probability that a sample from one of the populations will be listed in a particular criterion $A$. According to the standard definitions, these probabilities are given by

$$P_{SN}(A) = \int_A f_{SN}(X)\,dX \tag{8}$$

and

$$P_N(A) = \int_A f_N(X)\,dX,$$

where the multiple integral is taken over all samples listed in the criterion $A$.

* In this discussion it should be kept in mind that "the event of the sample being drawn from population SN" corresponds to signal and noise being present at the receiver input. Also "the event of population SN being sampled" means the same thing.
172
MATHEMATICAL
IN
READINGS
PSYCHOLOGY
sample plan might have

example,a particular
For
function
density
of the form
would
+
Kexp [" {x^+ x^ +
lie outside
consist of those
a
,x^) which
(aî,
samples X
x^,
sphereof radius 1
Then
the integral
would
be taken over
the exterior of this sphere.
centered at the origin.
These
have
a special
P\(A) is the conditional
ability
probprobabilities
significance.
in
that a sample from
criterion
that
will
be
TV
will
be
listed
A
is,
;
population
Thus
SN.
from
F
is
the
conditional
false
PyiA)
population
judged as a sample
alarm
of a certain kind of
Also, Pgy{A) is the conditional probability
probability.
that
called
correct
a hit (that of judgingcorrectly
a sample is from
population
response
of judgingfalsely
that a sample is from population
SN). The conditional
probability
of a miss.
^'A'^ is,therefore,givenby 1
M, the conditional
Pgy{A)
probability
alarms
and
misses
their
conditional
The only errors
which
false
can
occur
are
;
abilities
prob-
fyiî,X2,
.r")
x^)]. A
possible criterion
"
called
F and
M,
reader
familiar
are
with
the
briefly
the formal
probabilities.
error
content
of
probability
theory should
note
that
these
the sample
on
are
true conditional probabilities
; the firstis conditional
quantities
from
drawn
from
second
is
conditional
drawn
the
its
on
SN;
being
being
population
them
from a prioriprobabilities
(the probabilities
populationN. This is to distinguish
that a certain population
will be sampled,for example)which are not as yet assumed
known.
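The definitions of $P_N(A)$, $P_{SN}(A)$, F and M can be illustrated numerically. The sketch below is not from the paper: it assumes, purely for illustration, a single measurement that is Normal(0, 1) under population N and Normal(`signal_mean`, 1) under population SN, and a criterion A that lists every sample at or above a hypothetical `cutoff`.

```python
import math


def upper_tail(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def criterion_probabilities(cutoff, signal_mean):
    """Return (F, hit, M) for the assumed one-measurement criterion x >= cutoff.

    Assumptions (illustrative only): N is Normal(0, 1) and SN is
    Normal(signal_mean, 1).
    """
    false_alarm = upper_tail(cutoff)        # P_N(A) = F
    hit = upper_tail(cutoff - signal_mean)  # P_SN(A)
    miss = 1.0 - hit                        # M = 1 - P_SN(A)
    return false_alarm, hit, miss


if __name__ == "__main__":
    F, hit, M = criterion_probabilities(cutoff=0.5, signal_mean=1.0)
    print(F, hit, M)
```

Raising the cutoff lowers F and raises M together, which is the trade-off the criteria of the following sections are designed to settle.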
2.4 Likelihood ratio and the ratio criteria

It is convenient to introduce a new function defined as the ratio $l(X) = f_{SN}(X)/f_N(X)$ for sample points $X = (x_1, \ldots, x_n)$, called the likelihood ratio; $l(X)$ represents the likelihood that the sample was drawn from SN relative to the likelihood that it was drawn from N. Hence, if $l(X)$ is sufficiently large, it would be reasonable to conclude that X was in fact drawn from population SN, i.e., that X should be listed in the desired "best" criterion. Thus, for each number $\beta \ge 0$, a certain criterion $A(\beta)$ will be selected; $A(\beta)$ is chosen by listing each sample X for which $l(X) \ge \beta$. The problem then reduces to that of making a wise choice of $\beta$; that is, to determine how large "sufficiently large" is. Criteria of the form $A(\beta)$ will be called ratio criteria.

A number of writers have presented varying definitions of a criterion being "optimum." It turns out that each of these optimum criteria can be expressed as a ratio criterion, so that a receiver designed to yield likelihood ratio as output could be used with any of them.

2.5 Weighted combination criteria

Suppose it is possible to assign a certain number w as a weighting factor representing the importance of a false alarm relative to a hit. Since $P_{SN}(A)$ is the probability of a hit, and $P_N(A)$ the probability of a false alarm, it would then be reasonable to find a criterion A which maximizes the quantity

$$P_{SN}(A) - w P_N(A). \tag{9}$$

But this quantity can be written as

$$\int_A [f_{SN}(X) - w f_N(X)]\,dX, \tag{10}$$
where the integration is taken over the sample points X listed in A. To maximize this integral, one would list in A every sample for which the integrand was not negative. Solving that inequality for w, one sees that A should contain those sample points X for which $l(X) = f_{SN}(X)/f_N(X) \ge w$. Thus the optimum criterion is simply $A(w)$, which is a ratio criterion.

2.6 Neyman-Pearson criteria

If it is critically important to keep the probability of false alarm below a certain level k, then it would be reasonable to choose from among such criteria that one which maximizes the probability of a hit. Thus Neyman and Pearson [4] proposed as optimum any criterion $A_k$ for which

(1) $P_N(A_k) \le k$, and
(2) $P_{SN}(A_k)$ is a maximum for all the criteria A with the property $P_N(A) \le k$. (11)

The desired criterion $A_k$ of the Neyman-Pearson type can also be expressed as a ratio criterion. This can be made plausible as follows. To begin with, it is necessary to consider only those criteria A for which $P_N(A) = k$, because $P_N(A)$ will be taken as large as possible in order to meet condition (2). Now consider the curve given parametrically by the equations

$$X(\beta) = P_N[A(\beta)] \quad\text{and}\quad Y(\beta) = P_{SN}[A(\beta)]. \tag{12}$$

This curve will be called the Receiver Operating Characteristic (briefly, ROC) curve for a receiver whose output is likelihood ratio and with which ratio criteria are being used.

The basic property of the ROC curve is that it passes through the points (0, 0) and (1, 1), the first at $\beta = \infty$ and the second at $\beta = 0$. At $\beta = 0$, $l(X) \ge \beta = 0$ for all X, so $A(0)$ consists of all possible samples. Thus the observer will report that every sample is drawn from SN, so he will be certain to make a false alarm and to make a hit. (This assumes that the samples will not be drawn exclusively from one of the populations.) This can be verified, using the density functions, by the following equations:

$$P_{SN}[A(0)] = \int f_{SN}(X)\,dX = 1 \tag{13}$$

and

$$P_N[A(0)] = \int f_N(X)\,dX = 1,$$

where the integration is taken over all possible samples X. These equations mean that $X(0) = 1$ and $Y(0) = 1$. Moreover, at $\beta = \infty$ no sample can satisfy $l(X) \ge \beta$; i.e., $A(\infty)$ contains no samples at all and there are no samples for which the operator can report a signal is present. Therefore, the operator cannot possibly make a false alarm nor can he make a hit. Thus $P_N[A(\infty)] = 0$ and $P_{SN}[A(\infty)] = 0$, so that $X(\infty) = 0$ and $Y(\infty) = 0$. These considerations, together with those of the next section, show that the ROC curve can be sketched somewhat as in Fig. 1.
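The parametric definition of the ROC curve in Eq. (12) can be traced numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes a single measurement that is Normal(0, 1) under N and Normal(`mu`, 1) under SN with `mu` > 0. For that case $l(x) = \exp(\mu x - \mu^2/2)$, so the ratio criterion $A(\beta)$ is the set of samples with $x \ge (\ln\beta)/\mu + \mu/2$.

```python
import math


def upper_tail(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def roc_point(beta, mu=1.0):
    """Return (X(beta), Y(beta)) = (P_N[A(beta)], P_SN[A(beta)]).

    Assumptions (illustrative only): N is Normal(0, 1), SN is
    Normal(mu, 1), mu > 0, and A(beta) lists samples with l(x) >= beta.
    """
    if beta <= 0.0:
        return 1.0, 1.0          # A(0) lists every sample
    if math.isinf(beta):
        return 0.0, 0.0          # A(infinity) lists no sample
    cutoff = math.log(beta) / mu + mu / 2.0
    return upper_tail(cutoff), upper_tail(cutoff - mu)


if __name__ == "__main__":
    for beta in (0.0, 0.5, 1.0, 2.0, math.inf):
        print(beta, roc_point(beta))
```

The two end points reproduce the (1, 1) and (0, 0) limits discussed above, and intermediate values of `beta` move along the curve from upper right to lower left as `beta` grows.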
[Figure 1. Typical ROC curve; horizontal axis $X(\beta) = P_N[A(\beta)]$.]

To determine the desired $A_k$, recall that k is between zero and one, so that there is a point Q of the ROC curve which lies vertically above the point (k, 0). The coordinates (X, Y) of Q are $X = P_N[A(\beta)] = k$ and $Y = P_{SN}[A(\beta)]$ for some $\beta$, which will be written $\beta_k$. The ratio criterion $A(\beta_k)$ will be the desired $A_k$ if $P_{SN}[A(\beta_k)] \ge P_{SN}(A)$ for any criterion A with the property that $P_N(A) = k$. Now $A(\beta_k)$ satisfies condition (1) because $P_N[A(\beta_k)] = k$. From paragraph 2.5, it is clear that the ratio criterion $A(\beta_k)$ is an optimum weighted-combination criterion with weighting factor $w = \beta_k$. Therefore, the weighted combination using the criterion $A(\beta_k)$ is greater than or equal to the weighted combination using any other criterion A, i.e.,

$$P_{SN}[A(\beta_k)] - \beta_k P_N[A(\beta_k)] \ge P_{SN}(A) - \beta_k P_N(A). \tag{14}$$

In this case both probabilities $P_N[A(\beta_k)]$ and $P_N(A)$ are the same and are equal to k. If this value is substituted into the inequality above, one obtains

$$P_{SN}[A(\beta_k)] \ge P_{SN}(A). \tag{15}$$

Therefore, the desired Neyman-Pearson criterion $A_k$ should be chosen to be this particular ratio criterion, $A(\beta_k)$.

2.7 ROC curve

It is desirable to digress for a moment to study the ROC curve more closely. Its value lies in the fact that if the type of criterion chosen for a particular application is a ratio criterion, $A(\beta)$, then a complete description of the detection system's performance can be read off the ROC curve. By the very definition of the ROC curve, the X coordinate is the conditional probability, F, of false alarm, and the Y coordinate is the conditional probability of a hit. Similarly, $(1 - X)$ is the conditional probability of being correct when noise alone is present, and $(1 - Y)$ is the conditional probability of a miss. It will be shown in a moment that the operating level $\beta$ for the ratio
In the same way the probability of a miss is given by $P(SN)M$. Because an error can occur in exactly these two ways, the probability of error is the sum of these quantities:

$$P(E) = P(N)F + P(SN)M. \tag{21}$$

It has already been pointed out that

$$F = P_N(A) \quad\text{and}\quad M = 1 - P_{SN}(A). \tag{22}$$

If these are substituted into the expression for $P(E)$, simple algebraic manipulation gives

$$P(E) = P(SN) - P(SN)\left[P_{SN}(A) - \frac{P(N)}{P(SN)}\,P_N(A)\right]. \tag{23}$$

It is desired to minimize $P(E)$. But from the last equation this is equivalent to maximizing the quantity

$$P_{SN}(A) - \frac{P(N)}{P(SN)}\,P_N(A), \tag{24}$$

and, of course, this will yield a weighted combination criterion with $w = P(N)/P(SN)$, which is known to be simply a ratio criterion $A(w)$. This criterion can be determined if the a priori probabilities are known.

2.9 Maximum expected-value criteria

Another way to assign a weighting factor w depends on knowing the a priori probabilities $P(SN)$ and $P(N)$ and on numerical values being assigned to each of the four alternatives. Let $V_D$ be the value of detection and $V_Q$ the value of being "quiet," that is, of correctly deciding that noise alone is present. The other two alternatives are also assigned values: $V_M$, the value of a miss, and $V_F$, the value of a false alarm. The expected value associated with a criterion can now be determined. In this case it is natural to define an "expected value" optimum criterion as one which maximizes the expected value. It can be shown that such a criterion maximizes

$$P_{SN}(A) - \frac{P(N)}{P(SN)}\,\frac{V_Q - V_F}{V_D - V_M}\,P_N(A). \tag{25}$$

By definition (see paragraph 2.5), this criterion is a weighted combination criterion with weighting factor

$$w = \frac{P(N)}{P(SN)}\,\frac{V_Q - V_F}{V_D - V_M}, \tag{26}$$

and hence a likelihood ratio criterion. Siegert's "Ideal Observer" criterion is the special case for which $V_Q - V_F = V_D - V_M$.

2.10 A posteriori probability and signal detectability

Heretofore the observer has been limited to two possible answers, "signal plus noise is present" or "noise alone is present." Instead he may be asked what, to the best of his knowledge, is the probability that a signal is present. This approach has the
advantage of getting more information from the receiving equipment. In fact, Woodward and Davies point out that if the observer makes the best possible estimate of this probability for each possible transmitted message, he is supplying all the information which his equipment can give him [6]. A good discussion of this approach is found in the original papers by Woodward and Davies [6, 7]. Their formula for the a posteriori probability, $P_X(SN)$, becomes, in the notation of this paper,

$$P_X(SN) = \frac{f_{SN}(X)P(SN)}{f_{SN}(X)P(SN) + [1 - P(SN)]\,f_N(X)}, \tag{27}$$

or

$$P_X(SN) = \frac{l(X)P(SN)}{l(X)P(SN) + 1 - P(SN)}. \tag{28}$$

If a receiver which has likelihood ratio as its output can be built, and if the a priori probability $P(SN)$ is known, a posteriori probability can be calculated easily. The calculation could be built into the receiver calibration, since (28) is a monotonic function of $l(X)$; this would make the receiver an optimum receiver for obtaining a posteriori probability.
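Equation (28) is a one-line calibration from likelihood ratio to a posteriori probability. A minimal sketch follows; the function and argument names are hypothetical and chosen only for this illustration.

```python
def posterior_signal_probability(likelihood_ratio, prior_sn):
    """A posteriori probability P_X(SN) from Eq. (28).

    likelihood_ratio is l(X) >= 0 and prior_sn is the a priori
    probability P(SN) in [0, 1]. Both names are illustrative.
    """
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio must be non-negative")
    if not 0.0 <= prior_sn <= 1.0:
        raise ValueError("prior probability must lie in [0, 1]")
    weighted = likelihood_ratio * prior_sn
    denominator = weighted + 1.0 - prior_sn
    if denominator == 0.0:
        raise ValueError("posterior undefined when l(X) = 0 and P(SN) = 1")
    return weighted / denominator


if __name__ == "__main__":
    for l in (0.1, 1.0, 10.0):
        print(l, posterior_signal_probability(l, prior_sn=0.2))
```

With $l(X) = 1$ the data are uninformative and the posterior equals the prior; larger likelihood ratios raise it monotonically, which is why the mapping can be absorbed into the receiver calibration.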
3. Sequential Tests with Minimum Average Duration

3.1 Sequential testing

The idea of sequential testing is this: make one measurement $x_1$ on the receiver input; if the evidence $x_1$ is sufficiently persuading, decide as to whether the receiver input was drawn from population SN or from population N. If the evidence is not so strong, make a second measurement $x_2$ and consider the evidence $(x_1, x_2)$. Continue to make measurements until the resulting sequence of measurements is sufficiently persuading in favor of one population or the other. Obviously this involves the theoretical possibility of making arbitrarily many measurements before a final decision is made. This does not mean that infinitely many measurements must be made in an actual application, nor does it mean that the operation might necessarily entail an arbitrarily long interval of time. If, in a particular application, measurements are taken at evenly spaced instants of time, then the "time base" of such a measurement plan is infinite. However, another plan might call for measurements to be made at the times $t = 0, 1/2, \ldots, (n-1)/n, \ldots$, and as these times all lie in the time interval from zero to one, such a plan would have a time base of only one unit of time.

If the measurement plan has been carried out to the stage where n measurements have been made, the variable $X_n = (x_1, x_2, \ldots, x_n)$ is called the nth stage sample variable. A specific measurement plan will be considered only if for each possible stage n, the two density functions $f_{SN}(X_n)$ and $f_N(X_n)$ of the nth stage sample variable $X_n$ are known; the first of these density functions is applicable when population SN is being sampled and the second is applicable when population N is being sampled. These density functions may very well differ at different stages, so that they should be written $f_{SN}^{(n)}(X_n)$ and $f_N^{(n)}(X_n)$; however, the n appearing in the argument $X_n$ should always make the situation clear, and the superscript on the density functions themselves will be omitted.
3.2 Sequential tests

A sequential test will consist of two things:

(1) An (infinite) measurement plan with density functions $f_N(X_n)$ and $f_{SN}(X_n)$,
(2) An assignment of three criteria to each stage of the measurement plan.

These three criteria represent the three possible conclusions:

(A) Signal plus noise is present, i.e., the sample comes from population SN,
(B) Noise alone is present, i.e., the sample comes from population N,
(C) Another measurement should be made.

At the first stage of the measurement plan, any (real) number could theoretically result from the first measurement. This means that the first stage sample variable $X_1 = (x_1)$ ranges through the entire (real) number system, which will be written $S_1$ to stand for the first stage sample space. Suppose three first-stage criteria $A_1$, $B_1$, and $C_1$ have been chosen. If the sample $X_1$ is listed in $A_1$, the conclusion is that a signal is present and the test is terminated. If it is listed in $B_1$, the conclusion is that noise alone is present, and again the test is terminated. If $X_1$ should be listed in $C_1$, the conclusion is that another measurement will be made, and the test moves on to the second stage instead of terminating.

When the first stage criteria have been chosen, a limitation is placed on $S_2$, the space through which the second stage sample variable $X_2 = (x_1, x_2)$ ranges. The only way the test can proceed to the second stage is for $X_1 = (x_1)$ to be listed in $C_1$. Therefore, $S_2$ does not contain all possible second stage samples $X_2 = (x_1, x_2)$, but only those for which $(x_1)$ is listed in $C_1$.

Three second stage criteria, $A_2$, $B_2$, and $C_2$, must now be chosen from those samples $X_2$ listed in $S_2$. They must be chosen in such a way that there are no duplications in the listings and that no sample in $S_2$ is omitted. These criteria carry exactly the same significance as those chosen in the first stage. That is, the three conclusions that signal is present, that no signal is present, or that the test should be continued, are drawn when the sample $X_2$ is listed in $A_2$, $B_2$, or $C_2$ respectively.

The selection of criteria proceeds in the same way. If the nth stage criteria $A_n$, $B_n$, and $C_n$ have been chosen, then the next stage's sample space $S_{n+1}$ consists of those samples $X_{n+1} = (x_1, x_2, \ldots, x_n, x_{n+1})$ for which $X_n = (x_1, x_2, \ldots, x_n)$ was listed in $C_n$. Then from $S_{n+1}$ are drawn the three $(n+1)$st stage criteria $A_{n+1}$, $B_{n+1}$, and $C_{n+1}$.

When an entire sequence

$$(A_1, B_1, C_1),\ (A_2, B_2, C_2),\ \ldots,\ (A_n, B_n, C_n),\ \ldots$$

of criteria is selected, a "sequential test" has been determined. This does not mean, of course, that the test will necessarily be a very useful one. However, among all the possible ways of selecting a sequence of criteria and hence a sequential test, there may be particular ones which are particularly useful.
3.3 Probabilities associated with sequential tests

If $Q_n$ is any nth stage criterion, then the quantities*

$$P_N(Q_n) = \int_{Q_n} f_N(X_n)\,dX_n \quad\text{and}\quad P_{SN}(Q_n) = \int_{Q_n} f_{SN}(X_n)\,dX_n \tag{29}$$

represent the (N or SN) conditional probabilities that an nth stage sample $X_n$ will be listed in the criterion $Q_n$.

* The notation $\int_{Q_n}$ indicates that the integration is to be carried out over all sample points listed in $Q_n$.

Conditional probabilities of particular interest are:

(1) The nth stage conditional error probabilities: If population N is sampled, then the probability that the sample variable $X_n$ will be listed in $A_n$ is $P_N(A_n)$. This is the N-conditional probability of a false alarm. If population SN is sampled, then the probability that the sample variable $X_n$ will be listed in $B_n$ is $P_{SN}(B_n)$. This is the SN-conditional probability of a miss.

(2) The conditional error probabilities of the entire test:

$$F = \sum_{n=1}^{\infty} P_N(A_n), \text{ the N-conditional probability of false alarm,} \tag{30}$$

and

$$M = \sum_{n=1}^{\infty} P_{SN}(B_n), \text{ the SN-conditional probability of miss,} \tag{31}$$

are merely the sums of the same error probabilities over all stages.

(3) The conditional probabilities of terminating at stage n are

$$T_N^n = P_N(A_n) + P_N(B_n), \tag{32}$$

and

$$T_{SN}^n = P_{SN}(A_n) + P_{SN}(B_n). \tag{33}$$

These equations can be justified by a simple argument. The only way the test can terminate at stage n is for the sample variable $X_n$ to be listed in either $A_n$ or $B_n$. The probability of this event is the sum of the probabilities of the component events, since $X_n$ can be listed in at most one of $A_n$ and $B_n$, which are mutually exclusive.

(4) The conditional probabilities that the entire test will terminate are

$$T_N = \sum_{n=1}^{\infty} T_N^n \tag{34}$$

and

$$T_{SN} = \sum_{n=1}^{\infty} T_{SN}^n. \tag{35}$$

3.4 Average sample numbers

There are two other quantities which must be introduced. One feature of the sequential test is that it affords an opportunity of arriving at a decision early in the sampling process when the data happen to be unusually convincing. Thus one might
expect that, on the average, the stage of termination of a well-constructed sequential test would be lower than could be achieved by an otherwise equal, good standard test. It is therefore important to obtain expressions for the average or expected value of the stage of termination. As with other probabilities, there will be two of these quantities: one conditional on population N being sampled; the other conditional on population SN being sampled. They are given by

$$E_N = \sum_{n=1}^{\infty} n\,T_N^n \tag{36}$$

and

$$E_{SN} = \sum_{n=1}^{\infty} n\,T_{SN}^n. \tag{37}$$

The letter E is used to refer to the term "expected value." The quantities $E_N$ and $E_{SN}$ are called the average sample numbers. These formulas can be justified (somewhat freely) on the grounds that the variable "stage of termination" will in fact take the value n with (conditional) probability $T_N^n$ (or $T_{SN}^n$), so that the average is the sum of these values, each weighted by its probability.

It should be heavily emphasized that the average sample numbers are strictly average figures. In actual runs of a sequential test, the stage of termination will sometimes be less than the average sample number, but it will also be, upon occasion, much larger.

Any sequential test whose average sample numbers were not finite would be useless in applications. Therefore the only ones to be considered are those for which the average sample numbers are finite. Under this assumption,* it can be shown that $T_N = T_{SN} = 1$, so that the test is certain to terminate (in the sense of probability). On the other hand, if it is known that $T_N = T_{SN} = 1$ it does not always follow that the average sample numbers are finite. Such a situation would mean only that if a sequence of runs of the test were made, each run would probably terminate, but the average stage of termination would become arbitrarily large as more runs were made.

* Remember that the sampling process is not assumed to yield independence among the $x_i$.

3.5 Sequential ratio tests

In studying non-sequential tests using finite samples it was found that the best tests may always be expressed in terms of likelihood ratio. Therefore, it may be useful to introduce likelihood ratios at each stage of an infinite sample plan. The nth stage likelihood ratio function is defined as the ratio

$$l(X_n) = f_{SN}(X_n)/f_N(X_n).$$

Optimum criteria in the finite-sample tests all turned out to be criteria listing samples X for which $l(X)$ is greater than or equal to a certain number. It should be possible to choose sequential criteria in the same way. For each stage two numbers $a_n$ and $b_n$ with $b_n \le a_n$ could be chosen. Then the criteria $(A_n, B_n, C_n)$ determined by the numbers $a_n$ and $b_n$ would be:

$A_n$ lists all samples $X_n$ of the sample space $S_n$ for which $l(X_n) \ge a_n$,
$B_n$ lists all samples $X_n$ of the sample space $S_n$ for which $l(X_n) \le b_n$,
$C_n$ lists all samples $X_n$ of the sample space $S_n$ for which $b_n < l(X_n) < a_n$.
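The three-way rule above can be sketched in a few lines. The example below makes assumptions that the text does not: measurements are taken to be independent, Normal(0, 1) under N and Normal(`mu`, 1) under SN, and the thresholds are held constant at every stage ($a_n = a$, $b_n = b$ with $0 < b < a$). Under those assumptions the logarithm of the stage likelihood ratio is a running sum.

```python
import math


def sequential_ratio_test(measurements, a, b, mu=1.0):
    """Run a constant-threshold sequential ratio test.

    Returns (conclusion, stage): "A" (signal plus noise), "B" (noise
    alone), or "C" if the data ran out while the test was still asking
    for another measurement. Independence and the Gaussian model are
    illustrative assumptions, not requirements of the general theory.
    """
    if not 0.0 < b < a:
        raise ValueError("thresholds must satisfy 0 < b < a")
    log_a, log_b = math.log(a), math.log(b)
    log_l = 0.0
    stage = 0
    for stage, x in enumerate(measurements, start=1):
        # ln[f_SN(x) / f_N(x)] for the assumed unit-variance Gaussians.
        log_l += mu * x - 0.5 * mu * mu
        if log_l >= log_a:
            return "A", stage
        if log_l <= log_b:
            return "B", stage
    return "C", stage


if __name__ == "__main__":
    print(sequential_ratio_test([0.4, 1.9, 2.2], a=10.0, b=0.1))
```

Working with the logarithm avoids overflow in long runs; because the logarithm is monotonic, the stage at which a threshold is crossed is unchanged.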
If criteria selected in this way meet the requirements that the average sample numbers be finite, then the resulting sequential test is called a "sequential ratio test."

3.6 Optimum sequential tests

It is customary [8] to define an optimum sequential test as one for which the average sample numbers $E_N$ and $E_{SN}$ are minimum among all sequential tests with fixed error probabilities F and M. In addition to the formulas given in Section 3.4, alternative formulas [9] for the average sample numbers are

$$E_N = 1 + \sum_{i=1}^{\infty} P_N(C_i) \tag{38}$$

and

$$E_{SN} = 1 + \sum_{i=1}^{\infty} P_{SN}(C_i). \tag{39}$$

Thus, if a set of sequential criteria $\{(A_n^*, B_n^*, C_n^*)\}$ is presented as a possible optimum sequential test, then its optimum character is decided by ascertaining whether the inequalities

$$\sum_{i=1}^{\infty} P_N(C_i^*) \le \sum_{i=1}^{\infty} P_N(C_i) \tag{40}$$

and

$$\sum_{i=1}^{\infty} P_{SN}(C_i^*) \le \sum_{i=1}^{\infty} P_{SN}(C_i) \tag{41}$$

hold for every other set of sequential criteria $\{(A_n, B_n, C_n)\}$ with the same error probabilities, i.e., with

$$\sum_{i=1}^{\infty} P_N(A_i^*) = \sum_{i=1}^{\infty} P_N(A_i) \tag{42}$$

and

$$\sum_{i=1}^{\infty} P_{SN}(B_i^*) = \sum_{i=1}^{\infty} P_{SN}(B_i). \tag{43}$$

The problem of constructing an optimum sequential test is difficult because the equalities (42) and (43) can be satisfied even when there is no apparent term-by-term relation between the sequences $\{P_N(C_i^*)\}$ and $\{P_N(C_i)\}$. Wald has proposed as optimum the tests in which each of the sequences $\{a_n\}$ and $\{b_n\}$ is constant, that is, $a_n = a_1$ and $b_n = b_1$ for all n. Moreover Wald and Wolfowitz [10] proved that these tests are optimum whenever the density functions at successive stages are independent, as, for example, in the case when both noise and signal plus noise consist of "random noise." However, this "randomness" is not met with in most applications of the theory of signal detectability, at least not in the sense that the hypotheses of Wald and Wolfowitz are satisfied.

Consider an optimum sequential test with error probabilities F and M and the optimum test of fixed length as described in Section 2, with these same error probabilities. Although the sequential test generally requires less time, on the average, than the fixed length test, it has the disadvantage that it will sometimes use much more time than the fixed length test requires. In a conversation with the authors, Professor Mark Kac of Cornell University suggested that the dispersion, or variance, of the sample numbers may be so large as seriously to affect the usefulness of the sequential tests in applications to signal detectability. Certainly this matter should be investigated before a final decision is reached concerning the merits of sequential tests relative to tests on a fixed observation interval. However, it is a difficult matter to calculate the variance of the sample numbers. Therefore an electronic
simulator is being built at the University of Michigan which will simulate both types of tests and will provide data for ROC curves of both types as well as the distribution of the (sequential) sample numbers.

4. Optimum Detection for Specific Cases

4.1 Introduction

The chief conclusion obtained from the general theory of signal detectability presented in Section 2 of this paper is that a receiver which calculates the likelihood ratio for each receiver input is the optimum receiver for detecting signals in noise.

TABLE

Section | Description of Signal Ensemble | Application
4.4 | Signal known exactly* | Coherent radar with a target of known range and character
4.5 | Signal known except for phase* | Ordinary pulse radar with no integration and with a target of known range and character
4.6 | Signal a sample of white Gaussian noise | Detection of noise-like signals; detection of speech sounds in Gaussian noise
4.7 | Detector output of a broad band receiver | Detecting a pulse of known starting time (such as a pulse from a radar beacon) with a crystal-video or other type broad band receiver
4.8 | A radar case (A train of pulses with incoherent phase) | Ordinary pulse radar with integration and with a target of known range and character
4.10 | Signal one of M orthogonal signals | Coherent radar where the target is at one of a finite number of non-overlapping positions
4.11 | Signal one of M orthogonal signals known except for phase | Ordinary pulse radar with no integration and with a target which may appear at one of a finite number of non-overlapping positions

* Our treatment of these two fundamental cases is based upon Woodward and Davies' work, but here they are treated in terms of likelihood ratio, and hence apply to criterion type receivers as well as to a posteriori probability type receivers. These first two cases have been solved for the more general problem in which the noise is Gaussian but has an arbitrary spectrum [11, 12]. Those solutions require the use of an infinite sampling plan and are considerably more involved than the corresponding derivations in this report.
probability distributions are described by giving the probability density functions $f_{SN}(x)$ and $f_N(x)$ for the receiver inputs x with signal plus noise and with noise alone. The probability density function for the receiver inputs with noise alone is assumed to be

$$f_N(x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi N}} \exp\left[-\frac{x_i^2}{2N}\right], \tag{46}$$

or

$$f_N(x) = \left(\frac{1}{2\pi N}\right)^{n/2} \exp\left[-\frac{1}{2N}\sum_{i=1}^{n} x_i^2\right],$$

where n is 2WT and N is the noise power. It can be verified easily that this probability density function is the description of noise which has a Gaussian distribution of amplitude at every time, is stationary, and has the same average power in each of its Fourier components. Thus we shall refer to it as "stationary bandlimited white Gaussian noise."

The functions $\psi_i(t)$ are orthogonal and have energy $1/2W$, and therefore

$$\sum_{i=1}^{n} x_i^2 = 2W \int_0^T [x(t)]^2\,dt, \tag{47}$$

so that

$$f_N(x) = \left(\frac{1}{2\pi N}\right)^{n/2} \exp\left[-\frac{1}{N_0}\int_0^T x(t)^2\,dt\right], \tag{48}$$

where $N_0 = N/W$ is the noise power per unit bandwidth.

In a practical application, information is given about the signals as they would appear without noise at the receiver input, rather than about the signal plus noise probability density. Then $f_{SN}(x)$ must be calculated from this information and the probability density function $f_N(x)$ for the noise. The noise and the signals will be assumed independent of each other.

If the input to the receiver is the sum of the signal and the noise, then the receiver input $x(t)$ could have been caused by any signal $s(t)$ and noise $n(t) = x(t) - s(t)$. The probability density in the input for signal plus noise is thus the probability (density) that $s(t)$ and $x(t) - s(t)$ will occur together, averaged over all possible $s(t)$. If the probability of the signals is described by a density function $f_S(s)$, then

$$f_{SN}(x) = \int f_N(x - s)\,f_S(s)\,ds, \tag{49}$$

where the integration is over the entire range of the sample variable s. A more general form is used when the probability of the signals is described by a probability measure $P_S$; the formula in this case is

$$f_{SN}(x) = \int f_N(x - s)\,dP_S(s). \tag{50}$$

This integral is a Lebesgue integral, and is essentially an "average" of $f_N(x - s)$ over
all values of s, weighted by the probability $P_S$. If $f_N(x)$ is taken from Eq. (46), this becomes

$$f_{SN}(x) = \int f_N(x - s)\,dP_S(s) = \left(\frac{1}{2\pi N}\right)^{n/2} \exp\left[-\frac{1}{2N}\sum_{i=1}^{n} x_i^2\right] \int \exp\left[-\frac{1}{2N}\sum_{i=1}^{n} s_i^2\right] \exp\left[\frac{1}{N}\sum_{i=1}^{n} x_i s_i\right] dP_S(s), \tag{51}$$

and, from Eq. (48),

$$f_{SN}(x) = \left(\frac{1}{2\pi N}\right)^{n/2} \exp\left[-\frac{1}{N_0}\int_0^T x(t)^2\,dt\right] \int \exp\left[-\frac{1}{N_0}\int_0^T s(t)^2\,dt\right] \exp\left[\frac{2}{N_0}\int_0^T x(t)s(t)\,dt\right] dP_S(s). \tag{52}$$

The factor $\exp[-(1/2N)\sum x_i^2] = \exp[-(1/N_0)\int_0^T x^2\,dt]$ can be brought out of the integral since it does not depend on s, the variable of integration. Note that

$$\int_0^T s(t)^2\,dt = \frac{1}{2W}\sum_{i=1}^{n} s_i^2 = E(s) \tag{53}$$

is the energy* of the expected signal, while

$$\int_0^T x(t)s(t)\,dt = \frac{1}{2W}\sum_{i=1}^{n} x_i s_i \tag{54}$$

is the cross correlation between the expected signal and the receiver input.

* This assumes that the circuit impedance is normalized to one ohm.

4.3 Likelihood ratio with Gaussian noise

Likelihood ratio is defined as the ratio of the probability density functions $f_{SN}(x)$ and $f_N(x)$. With white Gaussian noise it is obtained by dividing Eq. (51) and (52) by (46) and (48) respectively:

$$l(x) = \int \exp\left[-\frac{E(s)}{N_0}\right] \exp\left[\frac{1}{N}\sum_{i=1}^{n} x_i s_i\right] dP_S(s), \tag{55}$$

$$l(x) = \int \exp\left[-\frac{E(s)}{N_0}\right] \exp\left[\frac{2}{N_0}\int_0^T x(t)s(t)\,dt\right] dP_S(s). \tag{56}$$

If the signal is known exactly or completely specified, the probability for that signal is unity, and the probability for any set of possible signals not containing s is zero. Then the likelihood ratio becomes

$$l_s(x) = \exp\left[-\frac{E(s)}{N_0}\right] \exp\left[\frac{1}{N}\sum_{i=1}^{n} x_i s_i\right], \tag{57}$$

or

$$l_s(x) = \exp\left[-\frac{E(s)}{N_0}\right] \exp\left[\frac{2}{N_0}\int_0^T x(t)s(t)\,dt\right]. \tag{58}$$

Thus the general formulas (55) and (56) for likelihood ratio state that $l(x)$ is the weighted
average of $l_s(x)$ over the set of all signals, i.e.,

$$l(x) = \int l_s(x)\,dP_S(s). \tag{59}$$

An equipment which calculates the likelihood ratio $l(x)$ for each receiver input is the optimum receiver. The form of equation (59) suggests one form which this equipment might take. First, for each possible expected signal s, the individual likelihood ratio $l_s(x)$ is calculated. Then these numbers are averaged. Since the set of expected signals is often infinite, this direct method is usually impractical. It is frequently possible in particular cases to obtain by mathematical operations on Eq. (59) a different form for $l(x)$ which can be recognized as the response of a realizable electronic equipment, simpler than the equipment specified by the direct method. It is essentially this which is done in the following paragraphs.

If the distribution function $P_S(s)$ depends on various parameters such as carrier phase, signal energy, or carrier frequency, and if the distributions in these parameters are independent, the expression for likelihood ratio can be simplified somewhat. If these parameters are denoted by $r_1, r_2, \ldots, r_n$, and the associated probability density functions are denoted by $f_1(r_1), f_2(r_2), \ldots, f_n(r_n)$, then the likelihood ratio becomes

$$l(x) = \int \cdots \int l_s(x)\,f_1(r_1) f_2(r_2) \cdots f_n(r_n)\,dr_1 \cdots dr_n = \int f_n(r_n) \left[\cdots \left[\int f_1(r_1)\,l_s(x)\,dr_1\right] \cdots \right] dr_n. \tag{60}$$

Thus the likelihood ratio can be found by averaging $l_s(x)$ with respect to the parameters.

4.4 The case of a signal known exactly

The likelihood ratio for the case when the signal is known exactly has already been presented in Section 4.3:

$$l(x) = \exp\left[-\frac{E}{N_0}\right] \exp\left[\frac{1}{N}\sum_{i=1}^{n} x_i s_i\right], \tag{61}$$

$$l(x) = \exp\left[-\frac{E}{N_0}\right] \exp\left[\frac{2}{N_0}\int_0^T x(t)s(t)\,dt\right]. \tag{62}$$

As the first step in finding the distribution functions for $l(x)$, it is convenient to find the distribution for $(1/N)\sum x_i s_i$ when there is noise alone. Then the input $x = (x_1, x_2, \ldots, x_n)$ is due to white Gaussian noise. It can be seen from Eq. (46) that each $x_i$ has a normal distribution with zero mean and variance $N = WN_0$ and that the $x_i$ are independent. Because the $s_i$ are constants depending on the signal to be detected, $s = (s_1, s_2, \ldots, s_n)$, each summand $(s_i x_i)/N$ has a normal distribution with mean $s_i/N$ times the mean of $x_i$, and variance $(s_i/N)^2$ times the variance of $x_i$, which are zero and $s_i^2/N$ respectively. Because the $x_i$ are independent, the summands $(s_i x_i)/N$ are independent, each with normal distribution, and therefore their sum has a
normal distribution with mean the sum of the means, i.e., zero, and variance the sum of the variances:

$$\sum_{i=1}^{n} \frac{s_i^2}{N} = \frac{2WE(s)}{N} = \frac{2E}{N_0} = 2\,\frac{\text{Signal Energy}}{\text{Noise Power Per Unit Bandwidth}}. \tag{63}$$

The distribution for $(1/N)\sum x_i s_i$ with noise alone is thus normal with zero mean and variance $2E/N_0$. Recalling from Eq. (61) that $l(x)$ is defined by

$$l(x) = \exp\left[-\frac{E}{N_0} + \frac{1}{N}\sum_{i=1}^{n} x_i s_i\right], \tag{64}$$

one sees that the distribution for $(1/N)\sum x_i s_i$ can be used directly by introducing $\alpha$ defined by

$$\beta = \exp\left[-\frac{E}{N_0} + \alpha\right], \quad\text{or}\quad \alpha = \frac{E}{N_0} + \ln\beta. \tag{65}$$

The inequality $l(x) \ge \beta$ is equivalent to $(1/N)\sum x_i s_i \ge \alpha$, and therefore

$$F_N(\beta) = \sqrt{\frac{N_0}{4\pi E}} \int_{\alpha}^{\infty} \exp\left[-\frac{N_0 y^2}{4E}\right] dy. \tag{66}$$

The distribution for the case of signal plus noise can be found by using Eq. (19), which states that

$$dP_{SN}[A(\beta)] = \beta\,dP_N[A(\beta)]. \tag{67}$$

Because these probabilities are equal to the complementary distribution functions for likelihood ratio, this can be written as

$$dF_{SN}(\beta) = \beta\,dF_N(\beta). \tag{68}$$

Differentiating Eq. (66),

$$dF_N(\beta) = -\sqrt{\frac{N_0}{4\pi E}} \exp\left[-\frac{N_0 \alpha^2}{4E}\right] d\alpha, \tag{69}$$

and combining (65), (68), and (69), one obtains

$$dF_{SN}(\beta) = -\sqrt{\frac{N_0}{4\pi E}} \exp\left[-\frac{N_0}{4E}\left(\alpha - \frac{2E}{N_0}\right)^2\right] d\alpha. \tag{70}$$

Thus,

$$F_{SN}(\beta) = \sqrt{\frac{N_0}{4\pi E}} \int_{\alpha}^{\infty} \exp\left[-\frac{N_0}{4E}\left(y - \frac{2E}{N_0}\right)^2\right] dy. \tag{71}$$

In summary, $\ln l$ has normal distribution with signal plus noise as well as with noise alone; the variance of each distribution is $2E/N_0$, and the difference of the means is $2E/N_0$. The receiver operating characteristic curves in Figs. 2 and 3* are plotted for any case in which $\ln l$ has normal distribution with the same variance both with noise alone and with signal plus noise.

* In Fig. 3, the receiver operating characteristic curves are plotted on "double-probability" paper. On this paper both axes are linear in the error function $\operatorname{erf}(x) = (1/\sqrt{2\pi}) \int_{-\infty}^{x} \exp[-t^2/2]\,dt$; this makes the receiver operating characteristic curves straight lines.

The parameter d in this figure is equal to the square of
the difference of the means, divided by the variance. These receiver operating characteristic curves apply to the case of the signal known exactly, with $d = 2E/N_0$.

[Figure 2. Receiver operating characteristic. $\ln l$ is a normal deviate.]

Eq. (62) describes what the ideal receiver should do for this case. The essential operation in the receiver is obtaining the correlation, $\int_0^T s(t)x(t)\,dt$. The other operations, multiplying by a constant, adding a constant, and taking the exponential function, can be taken care of simply in the calibration of the receiver output. Electronic means of obtaining cross correlation have been developed recently [13]. If the form of the signal is simple, there is a simple way to obtain this cross correlation [6, 7]. Suppose $h(t)$ is the impulse response of a filter. The response $e_0(t)$ of the filter to a voltage $x(t)$ is

$$e_0(t) = \int_{-\infty}^{\infty} x(\tau)\,h(t - \tau)\,d\tau. \tag{72}$$
W.
W.
If
filter can
be
G.
T.
PETERSON,
W.
AND
BIRDSALL,
C.
189
FOX
that
so
synthesized
h{t)
s{T
h(t)
0,
t),
"t
"T
(73)
otherwise,
then
eoiT)
so
that the response
ideal receiver
consists
0-
of this filter at time
simplyof
(74)
xir)s(r)dT,
Tis
filter and
correlation
the cross
the
required.Thus,
amplifiers.
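The equivalence between Eqs. (72) to (74) can be checked numerically. The following is a minimal sketch in discrete time, assuming NumPy is available; the sinusoidal signal, the noise level, and the sample count are illustrative choices that do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 64                                         # samples in the interval (0, T), illustrative
s = np.sin(2 * np.pi * 5 * np.arange(n) / n)   # known signal (hypothetical choice)
x = s + rng.normal(scale=1.0, size=n)          # receiver input: signal plus noise

# Eq. (73): the impulse response is the time-reversed signal, h(t) = s(T - t).
h = s[::-1]

# Eq. (72): the filter output is the convolution of the input with h.
e0 = np.convolve(x, h)

# Eq. (74): the output sample at the end of the interval equals the
# cross correlation sum of x and s.
filter_output_at_T = e0[n - 1]
cross_correlation = np.dot(x, s)
```

The two quantities agree to rounding error, which is the discrete counterpart of the statement that the filter output at time $T$ is the required cross correlation.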
[Figure 3. Receiver operating characteristic, plotted on double-probability paper. $\ln l$ is a normal deviate, $\sigma_{SN}^2 = \sigma_N^2$, $(M_{SN} - M_N)^2 = d\,\sigma_N^2$.]

It should be noted that this filter is the same, except for a constant factor, as that specified when one asks for the filter which maximizes peak signal to average noise-power ratio [14].

4.5 Signal known except for carrier phase
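The signal ensemble considered in this section consists of all signals which differ from a given amplitude and frequency modulated signal only in their carrier phase, and all carrier phases are assumed equally likely.

$$s(t) = f(t)\cos[\omega t + \phi(t) - \theta]. \tag{75}$$

Since the unknown phase angle $\theta$ has a uniform distribution,

$$dP_S(\theta) = \frac{d\theta}{2\pi}. \tag{76}$$

The likelihood ratio can be found by applying Eq. (56), and since the signal energy $E(s)$ is the same for all values of the carrier phase $\theta$,

$$l(x) = \exp\left[-\frac{E}{N_0}\right] \int \exp\left[\frac{1}{N}\sum x_i s_i\right] dP_S(s). \tag{77}$$

Expanding $s(t)$ into the coefficients of $\cos\theta$ and $\sin\theta$ will be helpful:

$$s(t) = f(t)\cos[\omega t + \phi(t)]\cos\theta + f(t)\sin[\omega t + \phi(t)]\sin\theta, \tag{78}$$

and

$$\frac{1}{N}\sum x_i s_i = \frac{\cos\theta}{N}\sum x_i f(t_i)\cos[\omega t_i + \phi(t_i)] + \frac{\sin\theta}{N}\sum x_i f(t_i)\sin[\omega t_i + \phi(t_i)].^{*} \tag{79}$$

Because we wish to integrate with respect to $\theta$ to find the likelihood ratio, it is easiest to introduce parameters similar to polar coordinates $(r, \theta_0)$ such that

$$r\cos\theta_0 = \sum x_i f(t_i)\cos[\omega t_i + \phi(t_i)], \qquad r\sin\theta_0 = \sum x_i f(t_i)\sin[\omega t_i + \phi(t_i)], \tag{80}$$

and therefore

$$\frac{1}{N}\sum x_i s_i = \frac{r}{N}\cos(\theta - \theta_0). \tag{81}$$

Using this form the likelihood ratio becomes

$$l(x) = \exp\left[-\frac{E}{N_0}\right]\int_0^{2\pi}\exp\left[\frac{r}{N}\cos(\theta-\theta_0)\right]\frac{d\theta}{2\pi} = \exp\left[-\frac{E}{N_0}\right] I_0\!\left(\frac{r}{N}\right), \tag{82}$$

where $I_0$ is the Bessel function of zero order and pure imaginary argument.

* $t_i$ denotes the $i$th sampling time, i.e., $t_i = i/2W$.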
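The step from the phase integral to the Bessel function in Eq. (82) can be verified numerically. This is a sketch only, assuming NumPy; the function names and the grid size are hypothetical and chosen for illustration.

```python
import numpy as np

def phase_averaged_factor(r_over_n, theta0, num_points=4096):
    """Average of exp[(r/N) cos(theta - theta0)] over a uniform carrier phase."""
    theta = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    return float(np.mean(np.exp(r_over_n * np.cos(theta - theta0))))

def likelihood_ratio(r_over_n, e_over_n0):
    """Right-hand side of Eq. (82): exp(-E/N0) * I0(r/N)."""
    return float(np.exp(-e_over_n0) * np.i0(r_over_n))
```

For any reference angle `theta0` the average equals `np.i0(r_over_n)`, which shows that the likelihood ratio depends on the input only through $r/N$.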
[Figure 4. Receiver operating characteristic. Signal known except for phase. Both axes are linear and run from 0 to 1.0.]
Eqs. (85) and (89) yield the receiver operating characteristic in parametric form, and Eq. (84) gives the associated operating levels [15]. These are graphed in Fig. 4 for some of the same values of signal energy to noise power per unit bandwidth as were used when the phase angle was known exactly, Figs. 2 and 3, so that the effect of knowing the phase can be easily seen.

If the signal is sufficiently simple so that a filter could be synthesized to match the expected signal for a given carrier phase $\theta$ as in the case of a signal known exactly, then there is a simple way to design a receiver to obtain likelihood ratio. For simplicity let us consider only amplitude modulated signals [$\phi(t) = 0$ in Eq. (75)]. Let us also choose $\theta = 0$. (Any phase could have been chosen.) Then the filter has impulse response

$$h(t) = \begin{cases} f(T-t)\cos[\omega(T-t)], & 0 \le t \le T,\\ 0, & \text{otherwise.} \end{cases} \tag{90}$$
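A short numerical sketch may help show why a filter matched at one arbitrary phase suffices: the envelope statistic $r$ built from the cosine and sine correlations does not depend on the carrier phase of the incoming signal. The envelope shape, carrier frequency, and sample count below are assumptions made for the illustration (they are chosen so that the double-frequency terms cancel exactly on the sample grid), and NumPy is assumed.

```python
import numpy as np

n = 1000
i = np.arange(n)
omega = 2.0 * np.pi * 50 / n     # carrier with 50 cycles in the interval (illustrative)
f = np.sin(np.pi * i / n)        # known pulse envelope (hypothetical choice)

def envelope_statistic(x):
    """r = sqrt(I^2 + Q^2), with I and Q the cosine and sine correlations."""
    in_phase = np.sum(x * f * np.cos(omega * i))
    quadrature = np.sum(x * f * np.sin(omega * i))
    return float(np.hypot(in_phase, quadrature))

# Noise-free signals that differ only in carrier phase give the same r.
phases = [0.0, 0.7, 2.0, 4.5]
r_values = [envelope_statistic(f * np.cos(omega * i - th)) for th in phases]
```

With noise added the values fluctuate, but their dependence on the unknown phase still enters only through the rotation of the in-phase and quadrature pair, which the square root of the sum of squares removes.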
The output of the filter in response to $x(t)$ is then

$$e_0(t) = \int_{t-T}^{t} x(\tau)\,h(t-\tau)\,d\tau = \int_{t-T}^{t} x(\tau) f(\tau + T - t)\cos[\omega(\tau + T - t)]\,d\tau$$

$$= \cos\omega(T-t)\int_{t-T}^{t} x(\tau) f(\tau+T-t)\cos\omega\tau\,d\tau \;-\; \sin\omega(T-t)\int_{t-T}^{t} x(\tau) f(\tau+T-t)\sin\omega\tau\,d\tau. \tag{91}$$

The envelope of the filter output will be the square root of the sum of the squares of the integrals,* and the envelope at time $T$ will be proportional to $r/N$, since

$$\left(\frac{r}{N}\right)^2 = \left(\frac{2W}{N}\right)^2\left\{\left[\int_0^T x(\tau)f(\tau)\cos\omega\tau\,d\tau\right]^2 + \left[\int_0^T x(\tau)f(\tau)\sin\omega\tau\,d\tau\right]^2\right\}, \tag{92}$$

in which the quantity in braces can be identified as the square of the envelope of $e_0(t)$ at time $T$. If the input $x(t)$ passes through the filter with an impulse response given by Eq. (90), then through a linear detector, the output will be $(N_0/2)\,r/N$ at time $T$. Because the likelihood ratio, Eq. (82), is a known monotone function of $r/N$, the output can be calibrated to read the likelihood ratio of the input.

* If the line spectrum of $s(t)$ is zero at zero frequency and at all frequencies equal to or greater than $2\omega/2\pi$, then it can be shown that these integrals contain no frequencies as high as $\omega/2\pi$.

4.6 Signal consisting of a sample of white Gaussian noise

Suppose the values of the signal voltage at the sample points are independent Gaussian random variables with mean zero and variance $S$, the signal power. The probability density due to signal plus noise is also Gaussian, since the sum of two Gaussian random variables is Gaussian:

$$f_{SN}(x) = \left[\frac{1}{2\pi(N+S)}\right]^{n/2}\exp\left[-\frac{1}{2(N+S)}\sum x_i^2\right], \tag{93}$$

where $n = 2WT$. The likelihood ratio is

$$l(x) = \left[\frac{N}{N+S}\right]^{n/2}\exp\left[\frac{S}{2N(N+S)}\sum x_i^2\right]. \tag{94}$$

In determining the distribution functions for $l$, it is convenient to introduce the parameter $\alpha$, defined by the equation

$$\beta = \left[\frac{N}{N+S}\right]^{n/2}\exp\left[\frac{S}{2(N+S)}\,\alpha^2\right]. \tag{95}$$

The condition $l(x) \ge \beta$ is equivalent to the condition that $(1/N)\sum x_i^2 \ge \alpha^2$. In the presence of noise alone the random variables $x_i/\sqrt{N}$ have zero mean and unit variance, and they are independent. Therefore, the probability that the sum of the squares of these variables will exceed $\alpha^2$ is the chi-square distribution with $n$ degrees of freedom, i.e.,

$$F_N(\beta) = K_n(\alpha^2). \tag{96}$$
Similarly, in the presence of signal plus noise the random variables $x_i/\sqrt{N+S}$ have zero mean and unit variance. The condition that $(1/N)\sum x_i^2 \ge \alpha^2$ is the same as requiring that $[1/(N+S)]\sum x_i^2 \ge [N/(N+S)]\alpha^2$, and again making use of the chi-square distribution,

$$F_{SN}(\beta) = K_n\!\left(\frac{N}{N+S}\,\alpha^2\right). \tag{97}$$

For large values of $n$, the chi-square distribution is approximately normal over the center portion; more precisely [16], for $\alpha^2 > 0$,

$$F_N(\beta) = K_n(\alpha^2) \approx \frac{1}{\sqrt{2\pi}}\int_{\sqrt{2\alpha^2}-\sqrt{2n-1}}^{\infty}\exp\left[-\frac{y^2}{2}\right]dy, \tag{98}$$

and

$$F_{SN}(\beta) = K_n\!\left(\frac{N}{N+S}\,\alpha^2\right) \approx \frac{1}{\sqrt{2\pi}}\int_{\sqrt{2N\alpha^2/(N+S)}-\sqrt{2n-1}}^{\infty}\exp\left[-\frac{y^2}{2}\right]dy. \tag{99}$$

If the signal energy is small compared to that of the noise, $\sqrt{N/(N+S)}$ is nearly unity and both distributions have nearly the same variance. Then Figs. 2 and 3 apply to this case too, with the value of $d$ given by

$$d = (2n-1)\left[1 - \sqrt{\frac{N}{N+S}}\right]^2. \tag{100}$$

For small signal to noise ratios and large samples, there is a simple relation between signal to noise ratio, the number of samples, and the detection index $d$:

$$d \approx \frac{n}{2}\left(\frac{S}{N}\right)^2. \tag{101}$$

Two signal to noise ratios, $(S/N)_1$ and $(S/N)_2$, will give approximately the same operating characteristic if the corresponding numbers of sample points, $n_1$ and $n_2$, satisfy

$$n_1\left(\frac{S}{N}\right)_1^2 = n_2\left(\frac{S}{N}\right)_2^2. \tag{102}$$
By Eq. (94), the likelihood ratio is a monotone function of $\sum x_i^2$. But the output of an energy detector,

$$e_0(T) = \int_0^T [x(t)]^2\,dt = \frac{1}{2W}\sum x_i^2, \tag{103}$$

is proportional to $\sum x_i^2$. Therefore an energy detector can be calibrated to read likelihood ratio, and hence can be used as an optimum receiver in this case.
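The behavior of the energy detector of this section can be illustrated with a small simulation. The sketch below assumes NumPy, draws noise and signal samples as independent Gaussian variables, thresholds $(1/N)\sum x_i^2$ at $\alpha^2$, and compares the observed rates with the large-$n$ normal approximation to the chi-square distribution. The sample size, powers, threshold, and trial count are illustrative assumptions only.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)

n = 200                 # number of sample points (illustrative)
noise_power = 1.0       # N
signal_power = 0.2      # S
alpha_sq = 220.0        # threshold on (1/N) * sum(x_i^2)
trials = 10000

noise = rng.normal(scale=np.sqrt(noise_power), size=(trials, n))
signal = rng.normal(scale=np.sqrt(signal_power), size=(trials, n))

# Energy statistic with noise alone and with signal plus noise.
stat_noise = np.sum(noise ** 2, axis=1) / noise_power
stat_signal_noise = np.sum((noise + signal) ** 2, axis=1) / noise_power

p_false_alarm_mc = float(np.mean(stat_noise >= alpha_sq))
p_detection_mc = float(np.mean(stat_signal_noise >= alpha_sq))

def normal_tail(chi_sq_threshold, degrees):
    """Normal approximation to the chi-square tail, valid for large `degrees`."""
    z = np.sqrt(2.0 * chi_sq_threshold) - np.sqrt(2.0 * degrees - 1.0)
    return 1.0 - NormalDist().cdf(float(z))

p_false_alarm_approx = normal_tail(alpha_sq, n)
p_detection_approx = normal_tail(
    alpha_sq * noise_power / (noise_power + signal_power), n)
```

Sweeping `alpha_sq` traces out an operating characteristic; the single threshold used here gives one point on it.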
4.7 Video design of a broad band receiver

The problem considered in this section is represented schematically in Fig. 5. The signals and noise are assumed to have passed through a band pass filter, and at the output of the filter, point A on the diagram, they are assumed to be limited in spectrum to a band of width $W$ and center frequency $\omega/2\pi > W/2$. The noise is assumed to be Gaussian noise with a uniform spectrum over the band. The signals and noise then pass through a linear detector. The output of the detector is the envelope of the signals and noise as they appeared at point A; all knowledge of the phase of the receiver input is lost at point B. The signals and noise as they appear at point B are considered receiver inputs, and the theory of signal detectability is applied to these video inputs to ascertain the best video design and the performance of such a system. The mathematical description of the signals and noise will be given for the signals and noise as they appear at point A. The envelope functions, which appear at point B, will be derived, and the likelihood ratio and its distribution will be found for these envelope functions.

The only case which will be considered here is the case in which the amplitude of the signal as it would appear at point A is a known function of time.
Any function at point A will be band limited to a band of width $W$ and center frequency $\omega/2\pi > W/2$. Any such function $f(t)$ can be expanded as follows:

$$f(t) = x(t)\cos\omega t + y(t)\sin\omega t, \tag{105}$$

where $x(t)$ and $y(t)$ are band limited to frequencies no higher than $W/2$, and hence can themselves* be expanded by sampling plan C, yielding

$$f(t) = \sum_i \left[x\!\left(\frac{i}{W}\right)\psi_i(t)\cos\omega t + y\!\left(\frac{i}{W}\right)\psi_i(t)\sin\omega t\right]. \tag{106}$$

The amplitude of the function $f(t)$ is

$$r(t) = \sqrt{[x(t)]^2 + [y(t)]^2}, \tag{107}$$

and thus the amplitude at the $i$th sampling point is

$$r_i = r\!\left(\frac{i}{W}\right) = \sqrt{x_i^2 + y_i^2}. \tag{108}$$

The angle

$$\theta_i = \arctan\frac{y_i}{x_i} = \arccos\frac{x_i}{r_i} \tag{109}$$

might be considered the phase of $f(t)$ at the $i$th sampling point. The function $f(t)$ then might be described by giving the $r_i$ and $\theta_i$ rather than the $x_i$ and $y_i$.

[Figure 5. Block diagram of the broad band receiver: input from antenna or mixer, band pass filter, Point A, linear detector, Point B, video amplifier.]

* Because any function $f(t)$ at A has no frequency greater than $(\omega/2\pi) + (W/2)$, sampling plan C might have been used on $f(t)$. However, the usual distribution in noise alone, $f_N(x)$, would probably not be applicable.
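Let us denote by $x_i$, $y_i$, or $r_i$, $\theta_i$, the sample values for the receiver input after the filter (i.e., at the point A in Fig. 5). Let us denote by $a_i$, $b_i$, or $f_i$, $\phi_i$ the sample values for the signal as it would appear at point A if there were no noise. The envelope of the signal, and hence the amplitude sample values $f_i$, are assumed known. Let us denote by $P_S(\phi_1, \phi_2, \ldots, \phi_{n/2})$ the distribution function of the phase sample values $\phi_i$. The probability density function for the input at A when there is white Gaussian noise and no signal is

$$f_N(x, y) = \left(\frac{1}{2\pi N}\right)^{n/2}\exp\left[-\frac{1}{2N}\sum_{i=1}^{n/2}(x_i^2 + y_i^2)\right], \tag{110}$$

where $n$ is $2WT$, and for signal plus noise, it is

$$f_{SN}(x, y) = \left(\frac{1}{2\pi N}\right)^{n/2}\int\exp\left[-\frac{1}{2N}\sum_{i=1}^{n/2}\left[(x_i - a_i)^2 + (y_i - b_i)^2\right]\right]dP_S(a, b). \tag{111}$$

Expressed in terms of the $(r, \theta)$ sample values, Eq. (110) and Eq. (111) become

$$f_N(r, \theta) = \left(\frac{1}{2\pi N}\right)^{n/2}\prod_{i=1}^{n/2} r_i\,\exp\left[-\frac{1}{2N}\sum_{i=1}^{n/2} r_i^2\right] \tag{112}$$

and

$$f_{SN}(r, \theta) = \left(\frac{1}{2\pi N}\right)^{n/2}\prod_{i=1}^{n/2} r_i\int\exp\left[-\frac{1}{2N}\sum_{i=1}^{n/2}\left[r_i^2 + f_i^2 - 2 r_i f_i\cos(\theta_i - \phi_i)\right]\right]dP_S(\phi_1, \ldots, \phi_{n/2}). \tag{113}$$

The factors $\prod r_i$ are introduced because they are the Jacobian of the transformation from the $x$, $y$ sampling plan to the $r$, $\theta$ sampling plan [16].*

The probability density function for $r$ alone, i.e., the density function for the output of the detector, is obtained simply by integrating the density functions for $r$ and $\theta$ with respect to $\theta$:

$$f_N(r) = \int_0^{2\pi}\!\!\cdots\!\int_0^{2\pi} f_N(r, \theta)\,d\theta_1\,d\theta_2\cdots d\theta_{n/2}, \quad\text{or}\quad f_N(r) = \prod_{i=1}^{n/2}\frac{r_i}{N}\exp\left[-\frac{r_i^2}{2N}\right], \tag{114}$$

and

$$f_{SN}(r) = \int_0^{2\pi}\!\!\cdots\!\int_0^{2\pi} f_{SN}(r, \theta)\,d\theta_1\,d\theta_2\cdots d\theta_{n/2}, \quad\text{or}\quad f_{SN}(r) = \prod_{i=1}^{n/2}\frac{r_i}{N}\exp\left[-\frac{r_i^2 + f_i^2}{2N}\right]I_0\!\left(\frac{r_i f_i}{N}\right). \tag{115}$$

* For example, in two dimensions, $f_N(x, y)\,dx\,dy = f_N(r, \theta)\,r\,dr\,d\theta$.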
Notice that the probability density for $r$, which is the video input, is completely independent of the distribution which $\phi_i$ had; all information about the phase of the signals has been lost.

The likelihood ratio for a video input, $r(t)$, is

$$l(r) = \prod_{i=1}^{n/2}\exp\left[-\frac{f_i^2}{2N}\right]I_0\!\left(\frac{r_i f_i}{N}\right). \tag{116}$$

Again it is more convenient to work with the logarithm of the likelihood ratio. Thus,

$$\ln l(r) = -\sum_{i=1}^{n/2}\frac{f_i^2}{2N} + \sum_{i=1}^{n/2}\ln I_0\!\left(\frac{r_i f_i}{N}\right), \tag{118}$$

which is approximately

$$\ln l[r(t)] \approx -\frac{E}{N_0} + W\int_0^T \ln I_0\!\left[\frac{r(t)f(t)}{N}\right]dt. \tag{119}$$

The function $\ln I_0(x)$ is approximately the parabola $x^2/4$ for small values of $x$ and is nearly linear for large values of $x$. Thus, the expression for likelihood ratio might be approximated by

$$\ln l[r(t)] \approx -\frac{E}{N_0} + \frac{W}{4N^2}\int_0^T [r(t)f(t)]^2\,dt \tag{120}$$

for small signals, and by

$$\ln l[r(t)] \approx C_1 + C_2\int_0^T r(t)f(t)\,dt \tag{121}$$

for large signals, where $C_1$ and $C_2$ are chosen to approximate $\ln I_0$ best in the desired range.
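The two regimes of $\ln I_0(x)$ used in Eqs. (120) and (121) can be inspected numerically. This is a minimal sketch assuming NumPy; the sample points are arbitrary illustrative values.

```python
import numpy as np

def log_i0(x):
    """Natural logarithm of the modified Bessel function I0."""
    return np.log(np.i0(x))

# Small arguments: ln I0(x) is close to the parabola x^2 / 4.
small = np.array([0.05, 0.1, 0.2, 0.5])
parabola_error = np.abs(log_i0(small) - small ** 2 / 4.0)

# Large arguments: ln I0(x) is nearly linear, with slope close to one.
large_slope = float((log_i0(30.0) - log_i0(20.0)) / 10.0)
```

The parabola error grows like $x^4/64$, so the square law approximation degrades quickly once $x$ is no longer small; that is why a different calibration (the constants $C_1$ and $C_2$) is needed for large signals.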
The integrals in Eqs. (120) and (121) can be interpreted as cross correlations. Thus the optimum receiver for weak signals is a square law detector, followed by a correlator which finds the cross correlation between the detector output and $[f(t)]^2$, the square of the envelope of the expected signal. For the case of large signal to noise ratio, the optimum receiver is a linear detector, followed by a correlator which has for its output the cross correlation of the detector output and $f(t)$, the amplitude of the expected signal.

The distribution function for $l(r)$ cannot be found easily in this case. The approximation developed here will apply to the receiver designed for low signal to noise ratio, since this is the case of most interest in detection studies. An analogous approximation for the case of large signal to noise ratios would be even easier to derive.

First we shall find the mean and standard deviation for the distribution of the logarithm of the likelihood ratio as shown above,

$$\ln l(r) \approx -\frac{E}{N_0} + \sum_{i=1}^{n/2}\frac{r_i^2 f_i^2}{4N^2}, \tag{122}$$

for the case of small signal to noise ratio. The probability density functions for each $r_i$ are

$$g_N(r_i) = \frac{r_i}{N}\exp\left[-\frac{r_i^2}{2N}\right] \quad\text{and}\quad g_{SN}(r_i) = \frac{r_i}{N}\exp\left[-\frac{r_i^2 + f_i^2}{2N}\right]I_0\!\left(\frac{r_i f_i}{N}\right). \tag{123}$$

The notation $g_N(r_i)$ and $g_{SN}(r_i)$ is used to distinguish these from the joint distributions of all the $r_i$ which were previously called $f_N(r)$ and $f_{SN}(r)$. The mean of each term $r_i^2 f_i^2/4N^2$ in the sum in Eq. (122) is
or
(124)
Similarly,
"
The
second
of each
moment
f^SN
dn
zm
is
rff^jAN^
term
167V
or
r_m\
ji_ r"
^^f^lM\"in
"I67v0o
^'^""^
^^^'^167V^j
^
2N
(125)
Similarly,
N^^''^'''^'^'"
~\6N^]o
^""{leN^j
rift\
or
16N'
The
r^ "5
ft
00
for
integrals
the
case
16N
of noise alone
Ara^xp
Jo N^
.
can
be evaluated
dri
2N
:
easily
t^N
(126)
and
M-N
of signal
for the case
plusnoise can
integrals
which
for
turns
out
f
unction,
hypergeometric
The
be evaluated
the
cases
in terms
above
to
of the confluent
reduce
to
simple
200
READINGS
of the
Thus
MATHEMATICAL
there
(fi),
numbers
WT
IN
consecutive
are
PSYCHOLOGY
which
ones
are
not
These
zero.
givenby
are
=7^
U
where
is the
pulseenergy
in
pointA
at
(132)
Fig.5
in the absence
of noise.
For
this case,
Eq. (130) and Eq. (131) become

1 "2
/i^;vDn/(r)]
="
^,
=0,
/.^[ln/(r)]
(133)
""2 /
0|^[ln/(r)]=-
"-""'WI-mFjI'+m^
and
normal
distribution of In /(/-)
is approximately
The
for,by
the central
with
variables
limit theorem, the distribution of

distribution
common
large.The
becomes
actual
nearlynormal
are
than
The
its distribution
The
usingthe
noise
than^v('*i)operatingcharacteristic for the case

distribution as approximationto the true
normal
the
case
distributions
distribution then
have
leads to the
radar
that the
and
a
in
of times
signal
plusnoise is more
uniform
16
is
plottedin Fig.6
distribution.
In many
cases
approximatelythe
of Figs.2 and
curves
same
variance.
Assuming
3, with
(2Ef
case
This section deals with

assume
itselfany number
2E
as
be calculated
can
that
1
4.8
distribution
norma)
alone, since the distributions gsN^.^'i)
receiver
it will be found
normal
the
of noise alone
case
distribution of In /(r)for
with
than one,
larger
independentrandom
nearlynormal
more
In such
is much
of M
with
for Xh^gjîxi)
integral
expressedin closed form.
be
can
if M
sum
approach
must
distribution for the
this case, since the convolution
givenrange. That is,we shall

rence
pulseswhose time of occurbe
assumed
The
known.
carrier phase will
to have
of
all
the
are
others,
i.e.,
independent
pulses
pulse
at
a radar target
detecting
if it occurs,
signal,
envelopeshape
are
distribution for each
consists
of
train of M
incoherent.
The
set
of
can
signals
be described
as
follows
M-l
^(0
2 /(^ +
where
the
^^^
cos
{(ot+ 0,.),
(136)
angles6^ have independentuniform
and
distributions,
the function
/,
W.
W.
PETERSON,
T.
G.
AND
BIRDSALL,
W.
201
FOX
C.
99.9
99.5
20
0.5
0.1
0.1
0.3 0.5
8 10
20
30
100
40
characteristic.
operating
envelope of
is the
which
Broad
70
60
90
95
F;,^/;
6
Figure
Receiver
50
band
singlepulse,has
receiver with
the
video
optimum
property
16.
design,M
that
IE
(137)
j\t +ir)f{t +jr)dt

Jo
where
The
is
(5,;
time
enough so
f(t)is also
*
because
the
Kronecker
delta
is
function, which
zero
if / ^
between
states
factor 2 appears
the total energy
function
that
The
/.
j, and unity if /
far
are
spaced
pulses
that the
pulses.Eq. (137)
The
and that the total signalenergy is E.
they are orthogonal,
as
as
assumed
loJItt.
no
to have
high
frequency components
is the interval
"
in
is M
(137) because /(O is the pulseenvelope; the factor

times
the energy
of
singlepulse.
appears
202
ratio
likelihood
The
MATHEMATICAL
IN
READINGS
be
can
obtained
PSYCHOLOGY
by applyingEq. (56).
r
"2
/(.i)
exp
dt
exp
A^n
Then
Jos(t)x(t)
(138)
dPsis)
'
No
or
r-lir
l{x)
exp
J'2tt
TVn
exp
The
be
M-l
Jo
N"
can
integral
Jo
0
T
?)i
+ 0,")cf/ dd,---dB,j_^.
f{t + tm)x{t) cos {(x"t
(139)
0'
evaluated,
in Section
as
4.5, yielding
M-\
l{x)
exp
No.
(140)
h
Q
where
"
fit + mT)x{t) cos
CO?
t/r
/(/ + mT)x{t) sin
\NoJo
ojt
dt
0 JO
(141)
This
of the
discussion
of the
case
with
quantityTq is connected
filter for the
signalknown
in
obtained
be
r"i could
each
identical with
is almost
quantityr"j
the first
described
pulse;it could
be obtained
by designing
the
fact,
section.
an
will be
(Nj2)rjN
delayof the filter. The

the pulses
which come
It is convenient
at
other
later.
to
instant
some
The
output
the
of time
differ
r^
quantities
have
{cot+
The
ideal
(142)
e)
puttingthe output through a
phase angleQ, and
of the
value
output
in that
In
signal
=
for any
appearedin
phase,Section 4.5.
manner
Soit) fit)cos
The
which
quantityr
for carrier
except
in the
receiver
the
r^which
only
in that
of the filter at time
receiver
calculate
for each
r,", and
the
linear detector.
is determined
they are
t^ +
nn
by
the time
with
associated
will be
logarithmof the
iNj2)rmlN.
likelihood
ratio.
Thus
In
the
Io(r"jN)must
be found
will
usuallybe
previoussection, r"jN
^(r"J^f
approximatedby x^l4. The quantities
As
in the
than
rather
detector
times
to, t^ +
consists of
detector
an
(for
We
T,
IF
the
shall
logarithmof the
linear
detector, and
tQ + (M
"
1)t
then
the
must
be added.
must
quantities
that
In
so
Iq(x)can be
enough
found
by usinga square law
these
small
can
be
outputs
be
of the
added.
at
square law detector
ideal system thus
The
passband matched to a singlepulse,*a square

device.
threshold
signalcase), and an integrating
find normal
approximationsfor the distribution functions of
likelihood ratio usingthe approximation
with
amplifier
its
In/nlT^'l^^,
law,
the
(144)
47V2
*
of the IF
It is usuallymost
amplifier.
convenient
to
make
the ideal filter(or an
approximationto it)a part
W.
T.
PETERSON,
for small
is vahd
which
W.
values
of
the
for the
distributions
individual
of the
discussion
same
known
signal
except
for
of
TT
{cut +
cos
distribution
the
as
W.
for
phase;
the
203
FOX
yields
(145)
"
0,")are
the
C.
(144) into (143)
follows
independent;this
are
+ wt)
pulsefunctions/(/'
is the
for each
Substitution
-77+1
quantities
r"j
AND
BIRDSALL,
r.,jN.*
In/^
The
G.
from
orthogonal.
quantity
r
which
the
The
tion
distribu-
appears
in the
both
analysisapplies
to
same
fact that
cases.
Thus, by Eq. (83)t
^"'F
IE
Pv|^'^")=exp
and
by (89),
1*00
NMr,
exp
SN
IE
AV
The
(146)
A^
be
can
^N
dy.,
obtained
Io(a)da.
oexp
NM
4E
(146) and
by differentiating
MN.
is\
This
is the
in the
same
\N
and
the
and
manner,
"
AE
as
situation,mathematically,
appeared
same
(147):
yVoM
2E
deviation
standard
(147)
\N
IE
exp
IE
densityfunctions
IE
exp
for the
mean
in the
(148)
previoussection.
logarithmof the likelihood
ratio
can
The
be found
they are
MNl
/)
/i^v(ln
0,
"2
a|^-(ln/)
2E
4(ln /)
and
If the
their means
distributions
and
previous section.
receiver
*
can
variances.
The
be
These
problem
See the footnote
t The
rather than
A/ appears
E.
below
in the
MNl
MM
MN^
assumed
normal, they are
formulas
is the
curves
(149)
1 +
are
same,
at
the
identical with
completelydetermined
the formulas
mathematically,and
end
of Section
4.7
the
discussion
apply to
by
(133) of the
both
and
cases.
equation (131).
equationsbecause
following
the energy
of
singlepulseis E'M
4.9 Approximate evaluation of an optimum receiver

In order to obtain approximate results for the remaining two cases, the assumption is made that in these cases the receiver operating characteristic can be approximated by the curves of Figs. 2 and 3, i.e., that the logarithm of the likelihood ratio is approximately normal. This section discusses the approximation and a method for fitting the receiver operating characteristic to the curves of Figs. 2 and 3.

By (68), $F_{SN}(l)$ can be calculated if $F_N(l)$ is known. Furthermore, it can be seen that the $n$th moment of the distribution $F_{SN}(l)$ is the $(n+1)$th moment of the distribution $F_N(l)$. Hence, the mean of the likelihood ratio with noise alone is unity, and if the variance of the likelihood ratio with noise alone is $\sigma_N^2$, the second moment with noise alone, and hence the mean of the likelihood ratio with signal plus noise, is $1 + \sigma_N^2$. Thus the difference between the means is equal to $\sigma_N^2$, which is the variance of the likelihood ratio with noise alone. Probably this number characterizes the ability to detect signals better than any other single number.
logarithmof
the
the
ratio has
likelihood
normal
with
distribution
noise alone, i.e.,

1
=
"th
is the
and
mean
d the variance
of the likelihood
moment
ratio
'00
J
the
square
the likelihood
(x
exp
in the
has
made.
been
exponent and
The
usingthe
dx
exp
ratio. The
nif-
"
dx,
(151)
Id
VlndJ-oD
(150)
follows:
as
exp [nx] exp
'
logarithmof
/"oo
substitution
completingthe
of the
l^^dFyil)
where
dx.
2d
be found
can
nif
"
exp
Jmi
Jin
Vlird
where
(x
J'
^.v(/)
can
integral
be
evaluated
by
fact that
\/2tt d.
2d
(152)
Thus,
'n^d
fJ-.wil'')
exp
In
the
particular,
mean
of l(x),which
unity,is
be
must
mn
~d
l^dO
and
therefore
The
variance
of l(x)with noise alone
is
(154)
"
a^, and
therefore the
f^Nil') l/'NiOf+ o%(l)

=
and
this must
1 +
second
moment
of
l(x)is
o%(l),
(155)
agree with (152). It follows that
/*iv(^^)^
~
and
(153)
-|-m
exp
"n
"
6xp [2d +
2m]
(156)
exp [d].
therefore
^
The
distribution
of likelihood
ln(l
a%).
ratio with
signalplus noise
(157)
can
be
found
by
W.
W.
T.
PETERSON,
G.
AND
BIRDSALL,
W.
C.
205
FOX
applyingEq. (68). Thus

dFsdn
=IciFJl),
(158)
Fsx(l)
If
dF^{l) is
obtained
from
ldFj,{l).
Eq. (150) and
/ is
exp .r, then
by
replaced.
f'Oj
f'sNil)
Jlni
77-
dx
[.r]exp
exp
V2
(159)
f^
Vl-ndJlnl
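The moment relations of this section can be checked by direct numerical integration. The sketch below uses only the Python standard library; the function name, the step count, and the value of $d$ in the usage are hypothetical choices, and the Riemann sum stands in for the exact integral.

```python
import math

def moment_of_likelihood_ratio(n, m, d, steps=20001, half_width=14.0):
    """n-th moment of l = exp(x), where x is normal with mean m and variance d.

    Computed with a Riemann sum centered on the peak of the integrand.
    """
    sigma = math.sqrt(d)
    center = m + n * d                  # where exp(n x) times the density peaks
    lo = center - half_width * sigma
    dx = 2.0 * half_width * sigma / (steps - 1)
    total = 0.0
    for k in range(steps):
        x = lo + k * dx
        total += math.exp(n * x - (x - m) ** 2 / (2.0 * d))
    return total * dx / math.sqrt(2.0 * math.pi * d)

d_example = 1.5
m_example = -d_example / 2.0            # mean required for a unit-mean likelihood ratio
first_moment = moment_of_likelihood_ratio(1, m_example, d_example)
second_moment = moment_of_likelihood_ratio(2, m_example, d_example)
variance = second_moment - first_moment ** 2
recovered_d = math.log(1.0 + variance)
```

With $m = -d/2$ the first moment comes out as unity and the second as $e^d$, so `recovered_d` reproduces the $d$ that was put in.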
Thus
with
^/2 and
mean
variance
abilityto
detect
likelihood
ratio has
that with
with
in both
The
the
are
the
of Section
cases
to
when
distribution
the
cases
approximate
results
the
has
could
followingsections for
characteristic curves
operating
Suppose
which
have
that
the
same
one
of M
the
set
plottedin Fig.2,
(160)
is the
distribution
which
4.8 this distribution
sample
pointsis
In
occurs.
to be the
is found
large.Certainlyin
it seems
Thus
reasonable
that
general form.
for
a%
a
by calculating
only
given case
is
signals
two
are
the
approximately
distribution.
normal
in the
of
is also the diflFerence
those
are
and
distribution
most
useful
and
obtained
ratio has
is
is d, which
curves
logarithm of
this distribution
ajfis given.The
cases
this
exactly,
Section
of the
Signal which
case
measures
a%).
of
this
be
ln(l
4.7, and
to detect
ability
4.10
if
in both
number
assumingthat the
likelihood
this
t"ythe equation
cry
4.6, Section
distribution
limiting
alone, then
signalknown
noise
ratio
If the
singlenumber.
with
Its variance
cases.
d related
of
case
signalplus noise,in
of the likelihood
oy
other
any
d
In the
is
there
completelydetermined
receiver
parameter
than
distribution
normal
plusnoise
signal
means.
when
d.
signalsbetter
In l(x)is normal
also
it is probable that the variance
In summary,
of the
of In / is normal
distribution
the
and
cases,
the
this
On
assertion
of
approximatedby those
as
same
if the
logarithm
basis, orv(/)is calculated

is made
Fig.2
with
that
d
"
the
receiver
ln(l
o^^).
4.10 Signal which is one of M orthogonal signals

Suppose that the set of expected signals includes just $M$ functions $s_k(t)$, all of which have the same probability, the same energy $E$, and are orthogonal. That is,

$$\int_0^T s_k(t)\,s_q(t)\,dt = E\,\delta_{kq}. \tag{161}$$

Then the likelihood ratio can be found from Eq. (56) to be

$$l(x) = \frac{1}{M}\sum_{k=1}^{M}\exp\left[-\frac{E}{N_0}\right]\exp\left[\frac{1}{N}\sum_i x_i s_{ki}\right], \tag{162}$$

where $s_{ki}$ are the sample values of the function $s_k(t)$.

With noise alone, each term of the form $(1/N)\sum_i x_i s_{ki}$ has normal distribution with mean zero and variance $2E/N_0$.* Furthermore, the different quantities $(1/N)\sum_i x_i s_{ki}$ are independent, since the functions $s_k(t)$ are orthogonal. It follows that the terms

$$Z_k = \exp\left[-\frac{E}{N_0} + \frac{1}{N}\sum_i x_i s_{ki}\right]$$

are independent.

The moments of the distribution can be found from Eq. (152). Since the logarithm of each term $Z$ has normal distribution with mean $(-E/N_0)$ and variance $2E/N_0$, the $n$th moment is

$$\mu_N(Z^n) = \exp\left[n(n-1)\frac{E}{N_0}\right]. \tag{163}$$

It follows that the mean of each term is unity, and the variance is

$$\sigma_N^2(Z) = \mu_N(Z^2) - [\mu_N(Z)]^2 = \exp\left[\frac{2E}{N_0}\right] - 1. \tag{164}$$

The variance of a sum of independent random variables is the sum of the variances of the terms. Therefore

$$\sigma_N^2(Ml) = M\left(\exp\left[\frac{2E}{N_0}\right] - 1\right), \tag{165}$$

and it follows that the variance of the likelihood ratio is

$$\sigma_N^2(l) = \frac{1}{M}\left(\exp\left[\frac{2E}{N_0}\right] - 1\right). \tag{166}$$

It was pointed out in Section 4.9 that the receiver operating characteristic curves are approximately those of Fig. 2, with

$$d = \ln(1 + \sigma_N^2) = \ln\left[1 - \frac{1}{M} + \frac{1}{M}\exp\left(\frac{2E}{N_0}\right)\right]. \tag{167}$$

This equation can be solved for $2E/N_0$:

$$\frac{2E}{N_0} = \ln[1 + M(e^d - 1)]. \tag{168}$$

* The reasoning is the same as that in Section 4.4.
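As a worked illustration of Eqs. (167) and (168), the sketch below converts between $2E/N_0$ and the detection index $d$ for $M$ orthogonal signals, and reads one point off the equal-variance normal operating characteristic for a given $d$. It uses only the Python standard library; the function names are hypothetical, and the operating-characteristic formula assumes the normal approximation of Section 4.9.

```python
import math
from statistics import NormalDist

def detection_index(two_e_over_n0, m):
    """Eq. (167): d = ln[1 - 1/M + (1/M) exp(2E/N0)]."""
    return math.log(1.0 + math.expm1(two_e_over_n0) / m)

def required_energy_ratio(d, m):
    """Eq. (168): 2E/N0 = ln[1 + M (e^d - 1)]."""
    return math.log1p(m * math.expm1(d))

def roc_point(d, p_false_alarm):
    """Detection probability when ln l is normal with equal variances and
    squared mean difference divided by variance equal to d."""
    nd = NormalDist()
    threshold = nd.inv_cdf(1.0 - p_false_alarm)   # in units of the standard deviation
    return 1.0 - nd.cdf(threshold - math.sqrt(d))
```

For example, `detection_index(4.0, 1)` returns 4.0, which is the signal-known-exactly value $d = 2E/N_0$, while larger `m` lowers `d` for the same energy.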
208
The
READINGS
value of each
mean
MATHEMATICAL
IN
PSYCHOLOGY
is
term
'00
2E
exp
This
be evaluated
can
as
No'
N,oJ
Jo
page 174 of Threshold
on
d(x.
exp
Signals[5],and
(176)
the result is that
/."^)(i3)1.
=
The
second
of each
moment
is
term
-J.
/.",^)(|S2)=
fi^dF'-J^Hfi),
(177)
or
2E
"f
/4'^
(^')
The
exp
L 'No_\
be evaluated
can
integral
IE
in
as
Appendix
of Part
da.
exp
II of reference
[17],and
the
result is
IE
M
M'r(^')
h\TF
(178)
No
The
variance
of each
in
term
Eq. (172) is
IE
["rW)f
It follows
^^'^(^')
that the variance
of Ml
1/^''-'
m'
io\j^^
(179)
(180)
(181)
1.
is
"
(IE
4(M/)
and
therefore
IE
o\{l)
since the
variance
for the
of
sum
Nn
variables
independentrandom
is the
sum
of the
variances.
If the
characteristic
approximationdescribed in Section 4.9 is used,

those of Fig.2, with
curves
are
approximately
1
4.12 The broad band receiver and the optimum receiver

Two examples of practical knowledge obtainable from the theory are presented in this section and in the next. A few further applications of the results of Section 4 are suggested in Table I, Section 4.1.

One common method of detecting pulse signals in a frequency band of width $B$ is to build a receiver which covers this entire frequency band. Such a receiver with a pulse signal of known starting time is studied in Section 4.7. This is not a truly optimum receiver; it would be interesting to compare it with an optimum receiver. We have been unable to find the distribution of likelihood ratio for the case of a signal which is a pulse of unknown carrier phase if the frequency is distributed evenly over a band. However, if the problem is changed slightly, so that the frequency is restricted to points spaced approximately the reciprocal of the pulse width apart, then pulses at different frequencies are approximately orthogonal, and the case of the signal which is one of $M$ orthogonal signals known except for phase can be applied. Eq. (182) should be used with $M$ equal to the ratio of the frequency band width $B$ to the pulse band width. Since the band width of the pulse is approximately the reciprocal of its pulse width, the parameter $M$ used in Section 4.7 also has this value. Curves showing $2E/N_0$ as a function of $d$ are given in Fig. 7 for both the approximate optimum receiver and the broad band receiver for several values of $M$. In the figure, $d$ is calculated from Eq. (135) and Eq. (182), which hold for large values of $M$.

[Figure 7. Comparison of optimum and broad band receivers. Vertical axis: $2E/N_0$; horizontal axis: $d$.]

4.13 Uncertainty and signal detectability
In the two cases where the signal considered is one of $M$ orthogonal signals, the uncertainty of the signal is a function of $M$. This provides an opportunity to study the effect of uncertainty on signal detectability. In the approximate evaluation of the optimum receiver when the signal is one of $M$ orthogonal functions, the ROC curves of Figs. 2 and 3 are used with the detection index $d$ given by

$$d = \ln\left[1 - \frac{1}{M} + \frac{1}{M}\exp\left(\frac{2E}{N_0}\right)\right]. \tag{167}$$

This equation can be solved for the signal energy, yielding

$$\frac{2E}{N_0} = \ln[1 - M + Me^d] \approx \ln M + \ln(e^d - 1), \tag{175}$$

the approximation holding for large $2E/N_0$.* From this equation it can be seen that the signal energy is approximately a linear function of $\ln M$ when the detection index $d$, and hence the ability to detect signals, is kept constant. It might be suspected that $2E/N_0$ is a linear function of the entropy, $-\sum p_i \ln p_i$, where $p_i$ is the probability of the $i$th signal. The linear relation holds only when all the $p_i$ are equal. The expression which occurs in this more general case is:

$$\frac{2E}{N_0} \approx -\ln\left(\sum p_i^2\right) + \ln(e^d - 1). \tag{176}$$
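The relation between signal energy and uncertainty can be made concrete with a few lines of code. The sketch below, using only the Python standard library, evaluates the general approximate expression for unequal signal probabilities and compares it, in the equally likely case, with the exact solution of the detection-index equation. The function names and the example probabilities are hypothetical.

```python
import math

def energy_ratio_general(probabilities, d):
    """Approximate 2E/N0 = -ln(sum p_i^2) + ln(e^d - 1) for signals
    with unequal a priori probabilities."""
    return -math.log(sum(p * p for p in probabilities)) + math.log(math.expm1(d))

def energy_ratio_equal_exact(m, d):
    """Exact 2E/N0 = ln[1 - M + M e^d] for M equally likely signals."""
    return math.log1p(m * math.expm1(d))

def entropy(probabilities):
    """Entropy -sum p_i ln p_i, for comparison with -ln(sum p_i^2)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

equal = [1.0 / 16] * 16
unequal = [0.5, 0.25, 0.25]
```

For `equal`, `-ln(sum p_i^2)` and `entropy` both equal `ln 16`; for `unequal` they differ, which is the point made in the text about the linear relation holding only for equal probabilities.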
* If $2E/N_0 > 3$, the error is less than 10%.

REFERENCES

[1] S. Goldman. Information theory. New York: Prentice-Hall, 1953. Chapter II, pp. 65-84, is devoted to sampling plans.
[2] C. E. Shannon. Communication in the presence of noise. Proc. IRE, January, 1949, 37, 10-21.
[3] U. Grenander. Stochastic processes and statistical inference. Arkiv för Matematik, 1950, Bd 1, nr 17, p. 195.
[4] J. Neyman and E. S. Pearson. On the problems of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London, 1933, 231, Series A, 289.
[5] J. L. Lawson and G. E. Uhlenbeck. Threshold signals. New York: McGraw-Hill, 1950.
[6] P. M. Woodward and I. L. Davies. Information theory and inverse probability in telecommunication. Proc. I.E.E. (London), March, 1952, 99, Part III, 37-44.
[7] I. L. Davies. On determining the presence of signals in noise. Proc. I.E.E. (London), March, 1952, 99, Part III, 45-51.
[8] A. Wald. Sequential analysis. New York: Wiley, 1947.
[9] W. C. Fox. Signal detectability: A unified description of statistical methods employing fixed and sequential observation processes. Electronic Defense Group, University of Michigan, Technical Report No. 19 (unclassified).
[10] A. Wald and J. Wolfowitz. Optimum character of the sequential probability ratio test. Ann. Math. Stat., September, 1948, 19, 326.
W.
[11]
Reich
E.
and
Physics,
[12]
C.
R.
J.
V.
Y.
Conf.
1952,
North.
See
[15]
periodic
J.
W.
in
wave
C.
211
FOX
Gaussian
noise,
Journal
Applied
in
noise.
Applied
Journal
Physics,
January,
Signal-to-noise
October,
1950,
A
device
improvement
38,
1197.
for
computing
through
functions.
correlation
in
integration
Rev.
Sci.
Jr., and
J.
in
signals
Reintjes.
of
the
Wiesner.
B.
noise.
five
Applications
Proc.
channel
of
October,
I.R.E.,
electronic
correlation
1950,
analysis
38,
correlator.
analog
to
1165.
El.
Nat.
Proc.
8.
An
analysis
RCA
systems.
[5],
of
values
values
large
F.
reference
of
Graphs
sine
signals
sure
Meade.
E.
Cheatham,
carrier
also
J.
P.
and
D.
pulsed
of
Rogers.
F.
of
T.
Levin
O.
of
AND
347.
23,
Lee,
T.
I.R.E.,
and
1952,
J.
BIRDSALL,
detection
detection
and
Proc.
detection
M.
G.
289.
24,
the
On
Harting
W.
the
[14]
1953,
tube,
E.
The
Swerling.
Harrington
Instr.,
T.
PETERSON,
76-82.
25,
storage
A.
P.
March,
Davis.
1954,
[13]
W.
of
Rpt
(89)
integral
in
appear
which
determine
signal-noise
discrimination
in
1943.
PTR-6C,
206.
p.
the
factors
Laboratory
S.
with
along
approximate
Mathematical
Rice,
O.
for
expressions
of
analysis
random
and
small
noise,
B.
for
S.T.J.
,
1944-1945,
J.
Marcum
Project
[16]
P.
[17]
The
G.
Hoel.
of
Defense
Part
W.
be
may
noise,
theory
that
of
Journal
detection.
Trans.
noise.
in
Birdsall,
D.
of
No.
G.
the
this
Rand
function
have
been
Table
Corporation,
the
material
November
Applied
2,
p.
of
371,
26,
Physics,
signals
1953;
March,
'statistical
criteria
January,
for
in
this
the
D.
1953;
April,
of
signal
No.
of
criteria
Wiley,
reference
of
theory
Section
Statistical
24,
from
Report
in
detection
York:
New
drawn
The
Birdsall,
Physics,
PGIT-3,
Discussion
is
paper
Michigan,Technical
Optimum
35,
IRE.,
I, 11', Journal
of
statistics.
this
Middleton,
Applied
I
of
T.
contains
detection.
Report
G.
and
University
report
Technical
T.
and
Peterson,
found
of
report
mathematical
to
Sections
Group,
of
II
field
W.
Tables
compiled
of
by:
^-functions.
RM-399.
Introduction
of
46-156.
24,
unpublished
an
Report
material
Part
in
in
Rand
and
282-332
23,
noise,
D.
1954;
for
1954,
the
25,
paper.
Middleton,
detection
128-130.
Electronic
July,
Other
work
of
pulsed
Middleton,
M.I.T.
The
Lincoln
W.
of
1953.
in
this
carriers
statistical
Laboratory.
theory
W.
from
and
(unclassified),
detection
246.
p.
above
detectabiHty,
Statistical
Middleton,
D.
13
1947,
[9]
of
Peterson,
pulsed
carriers
signal
and
in
FOUNDATIONAL ASPECTS OF THEORIES OF MEASUREMENT

DANA SCOTT AND PATRICK SUPPES

1. Definition of measurement. It is a scientific platitude that there can be neither precise control nor prediction of phenomena without measurement. Disciplines as diverse as cosmology and social psychology provide evidence that it is nearly useless to have an exactly formulated quantitative theory if empirically feasible methods of measurement cannot be developed for a substantial portion of the quantitative concepts of the theory. Given a physical concept like that of mass or a psychological concept like that of habit strength, the point of a theory of measurement is to lay bare the structure of a collection of empirical relations which may be used to measure the characteristic of empirical phenomena corresponding to the concept. Why a collection of relations? From an abstract standpoint a set of empirical data consists of a collection of relations between specified objects. For example, data on the relative weights of a set of physical objects are easily represented by an ordering relation on the set; additional data, and a fortiori an additional relation, are needed to yield a satisfactory quantitative measurement of the masses of the objects.

The major source of difficulty in providing an adequate theory of measurement is to construct relations which have an exact and reasonable numerical interpretation and yet also have a technically practical empirical interpretation. The classical analyses of the measurement of mass, for instance, have the embarrassing consequence that the basic set of objects measured must be infinite. Here the relations postulated have acceptable numerical interpretations, but are utterly unsuitable empirically. Conversely, as we shall see in the last section of this paper, the structure of relations which have a sound empirical meaning often cannot be succinctly characterized so as to guarantee a desired numerical interpretation.

Nevertheless this major source of difficulty will not here be carefully scrutinized in a variety of empirical contexts. The main point of the present paper is to show how foundational analyses of measurement may be grounded in the general theory of models, and to indicate the kind of problems relevant to measurement which may then be stated (and perhaps answered) in a precise manner.
Received September 24, 1957.
We would like to record here our indebtedness to Professor Alfred Tarski, whose clear and precise formulation of the mathematical theory of models has greatly influenced our presentation (see [7]). Although our theories of measurement do not constitute special cases of the arithmetical classes of Tarski, the notions are closely related, and we have made use of results and methods from the theory of models. This research was supported under Contract NR 171-034, Group Psychology Branch, Office of Naval Research.
This article appeared in J. symbolic Logic, 1958, 23, 113-128.
Before turning to problems connected with construction of theories of measurement, we want to give a precise set-theoretical meaning to the notions involved. To begin with, we treat sets of empirical data as being (finitary) relational systems, that is to say, finite sequences of the form $\langle A, R_1, \ldots, R_n \rangle$, where $A$ is a non-empty set of elements called the domain of the relational system $\mathfrak{A}$, and $R_1, \ldots, R_n$ are finitary relations on $A$. The relational system $\mathfrak{A}$ is called finite if the set $A$ is finite; otherwise, infinite. It should be obvious from this definition that we are mainly considering qualitative empirical data. Intuitively we may think of each particular relation $R_i$ (an $m_i$-ary relation, say) as representing a complete set of "yes" or "no" answers to a question asked of every $m_i$-termed sequence of objects in $A$. The point of this paper is not to consider that aspect of measurement connected with the actual collection of data, but rather the analysis of relational systems and their numerical interpretations.

If $s = \langle m_1, \ldots, m_n \rangle$ is an $n$-termed sequence of positive integers, then a relational system $\mathfrak{A} = \langle A, R_1, \ldots, R_n \rangle$ is of type $s$ if for each $i = 1, \ldots, n$ the relation $R_i$ is an $m_i$-ary relation. Two relational systems are similar if there is a finite sequence $s$ of positive integers such that they are both of type $s$. Notice that the type of a relational system is uniquely determined only if all the relations are non-empty; the avoiding of this ambiguity is not worthwhile. Suppose that two relational systems $\mathfrak{A} = \langle A, R_1, \ldots, R_n \rangle$ and $\mathfrak{B} = \langle B, S_1, \ldots, S_n \rangle$ are of type $s = \langle m_1, \ldots, m_n \rangle$. Then $\mathfrak{B}$ is a homomorphic image of $\mathfrak{A}$ if there is a function $f$ from $A$ onto $B$ such that, for each $i = 1, \ldots, n$ and for each sequence $\langle a_1, \ldots, a_{m_i} \rangle$ of elements of $A$, $R_i(a_1, \ldots, a_{m_i})$ if and only if $S_i(f(a_1), \ldots, f(a_{m_i}))$. If the function $f$ is one-one, then $\mathfrak{B}$ is an isomorphic image of $\mathfrak{A}$, or simply $\mathfrak{A}$ and $\mathfrak{B}$ are isomorphic. $\mathfrak{A}$ is a subsystem of $\mathfrak{B}$ if $A \subseteq B$ and, for each $i = 1, \ldots, n$, the relation $R_i$ is the restriction of the relation $S_i$ to $A$. $\mathfrak{A}$ is imbeddable in $\mathfrak{B}$ if some subsystem of $\mathfrak{B}$ is a homomorphic image of $\mathfrak{A}$.*

A numerical relational system is simply a relational system whose domain is the set $Re$ of all real numbers. A numerical assignment for a relational system $\mathfrak{A}$ with respect to a numerical relational system $\mathfrak{N}$ is a function which imbeds $\mathfrak{A}$ in $\mathfrak{N}$. Notice that a numerical assignment is not required to be one-one.

Within the framework of the preceding formal definitions it is now possible to give in general outlines an exact characterization of theories of measurement. First of all the relational systems of the theory are determined by fixing a finite sequence $s$ of positive integers and only considering relational systems of type $s$. Next a numerical relational system $\mathfrak{N}$ of type $s$ is selected which corresponds to the intended numerical interpretation of the theory, and only relational systems imbeddable in $\mathfrak{N}$ are permitted. Moreover the theory need not concern all relational systems of type $s$ imbeddable in $\mathfrak{N}$ but only a distinguished subclass. Since it is reasonable that no special set of objects be preferred, we require that the distinguished subclass be closed under isomorphism. We thus arrive at the following characterization of theories of measurement:

* Although in most mathematical contexts imbeddability is defined in terms of isomorphism rather than homomorphism, for theories of measurement this is too restrictive. However, the notions of homomorphic imbeddability and isomorphic imbeddability are actually closely connected; the facts here are explained in detail in Section 2.
A theory of measurement is a class $K$ of relational systems closed under isomorphism for which there exists a finite sequence $s$ of positive integers and a numerical relational system $\mathfrak{N}$ of type $s$ such that all relational systems in $K$ are of type $s$ and imbeddable in $\mathfrak{N}$.

Some readers may object that the definition of theories of measurement should be linguistic rather than set-theoretical in character, since a theory is ordinarily thought of as a linguistic entity and not as a class of models. To be sure, many theories of measurement have a natural formalization in first-order predicate logic with identity. Notice, however, that first-order axioms by themselves are not adequate, for if they admit one infinite relational system as a model, then they have models of every infinite cardinality, and it is difficult to see how any natural connection can be established between numerical models and models of arbitrary cardinality. Even neglecting this criticism, first-order axioms are not adequate to express properties involving arbitrary natural numbers, for example, that an ordering is Archimedean or that a relational system is finite, and we will wish to use theories which have these properties. Any linguistic definition that would permit expression of more general properties would immediately require extensive machinery and be involved in some of the deepest problems of modern metamathematics. On the other hand, we do not wish to give the impression that we reject any linguistic questions. In fact, we see our set-theoretical definition as a point of departure for asking just such questions.
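
The definitions of this section can be made concrete for small finite systems. The following is a minimal sketch, not part of the original paper: it assumes a relational system is stored as a list of domain elements plus sets of tuples, and it checks by brute force whether a given function is a numerical assignment, that is, whether $R_i(a_1, \ldots, a_{m_i})$ holds if and only if $S_i(f(a_1), \ldots, f(a_{m_i}))$ holds. The function name `is_numerical_assignment` and the sample weights are hypothetical.

```python
from itertools import product


def is_numerical_assignment(domain, relations, arities, f, numeric_relations):
    """Check R_i(a_1..a_m) iff S_i(f(a_1)..f(a_m)) for every i and every sequence."""
    for R, m, S in zip(relations, arities, numeric_relations):
        for seq in product(domain, repeat=m):
            if (seq in R) != bool(S(*(f[a] for a in seq))):
                return False
    return True


# Hypothetical data: "x weighs at most y" on three objects, two of which tie.
domain = ["a", "b", "c"]
weight = {"a": 1, "b": 1, "c": 2}
at_most = {(x, y) for x, y in product(domain, repeat=2) if weight[x] <= weight[y]}

f = {"a": 0.0, "b": 0.0, "c": 5.0}  # a homomorphism that is not one-one
g = {"a": 0.0, "b": 1.0, "c": 5.0}  # one-one, but it separates the tied objects

ok_f = is_numerical_assignment(domain, [at_most], [2], f, [lambda u, v: u <= v])
ok_g = is_numerical_assignment(domain, [at_most], [2], g, [lambda u, v: u <= v])
```

Here `f` is an acceptable assignment precisely because assignments are not required to be one-one, while the one-one map `g` fails because it orders two objects that the data treat as tied.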

On
definition
basis of the
the
questions naturally arise, to

first place,is a given class of
in the
And
it be
second
not
a
of which
devote
we
relational systems
place,given a theory
theory
of measurement,
adopted,
a
two
In
section.
the
of measurement?
in what
sense
can
of measurement.
simple counterexample
shows
that
every class of relational systems of a given type closed under isomorphism

Let O be the class of all relational systems of
theory of measurement.
type "2"
3
each
of measurement
axiomatized?
2. Existence
is
of theories
In
that
contexts
some
relative
are
to
'^. Notice
of measurement,
then
simple orderings.Let ",A,Ry

we
shall say
that
so
that
the
consequence
is every
the class of all systems imbeddable
subclass
class K
of this
of K
in members
is
be
system
theory of
of K
under
is also
isomorphism.
a
where
theory
of type
measurement
is that, if K
definition
closed
in O
is
theory
Moreover,
of measurement
216
READINGS
numerical
relations in various
covered
method
such
of all
morass
PSYCHOLOGY
MATHEMATICAL
IN
unifying
possiblenumerical
that
relational systems is the most

has
found.
been
yet
From an empirical standpoint only a very few among the relational systems are of any computational value, indeed only those definable in terms of the ordinary arithmetical notions. But most important sets of qualitative data can find numerical interpretation by relations defined in terms of addition and ordering alone. By way of example we may cite the measurement of masses, distances, sensation intensities, and subjective probabilities. Frequently the consideration of weighted averages requires also the use of the multiplication of numbers. However, in the examples given in this paper we shall restrict ourselves to the notions of addition and ordering.
No natural scientific situation would seem strictly to require the consideration of sets of infinite data. This state of affairs suggests that theories of measurement containing only finite relational systems would suffice for empirical purposes. The problem is delicate, however, for the measurement of a meteorological quantity such as temperature by an automatic recording device is usually treated as continuous both in its own scale and in time. Yet the important problem of measurement does not really lie in the correct use of such recording devices but rather in their initial calibration, a process proceeding from a finite number of qualitative decisions. Because of the awkwardness of the uniform application of finite relational systems, we shall not generally make this restriction.
Further remarks about establishing the existence of measurement are best motivated by reference to a concrete example. In a recent paper [4], Luce has introduced a generalization of simple orderings which he calls semiorders. A semiorder is a relational system $\langle A, P \rangle$ of type $\langle 2 \rangle$ which satisfies the following axioms for all $x, y, z, w \in A$:

S1. Not $xPx$.
S2. If $xPy$ and $zPw$, then either $xPw$ or $zPy$.
S3. If $xPy$ and $zPx$, then either $wPy$ or $zPw$.*

Such relations are most likely to occur in situations where objects are to be arranged in order and where it is difficult to say exactly when two objects are indifferent. For example, to say that $xPy$ might be interpreted as meaning that the pitch of the sound $x$ is definitely higher than the pitch of $y$, or that the hue of color $x$ is definitely brighter than the hue of color $y$, or that the weight of the object $x$ is noticeably greater than that of $y$, etc. Indifference between two objects $x$ and $y$ (in symbols: $xIy$) is defined as not $xPy$ and not $yPx$. The point of Luce's axioms is that the relation $I$ of indifference is not always transitive, a fact easily appreciated for each of the intuitive interpretations given above.

In his paper Luce gives a certain numerical interpretation for certain kinds of semiorders, but he does not show that any particular class of semiorders is a theory of measurement in the sense used here, because his numerical interpretations are not given relative to a fixed numerical relation. However, in the finite case the situation becomes relatively simple. Let $\gg$ be the relation between real numbers defined by the condition that $x \gg y$ if and only if $x > y + 1$. Clearly, if $x \gg y$, then it is fair to say that $x$ is noticeably greater than $y$, or better, $x$ is definitely greater than $y$. It is in fact a simple exercise to give the proof that the relational system $\langle Re, \gg \rangle$ is a semiorder. Further we shall prove the following result:

The class of finite semiorders is a theory of measurement relative to the numerical relational system $\langle Re, \gg \rangle$.

* See [4], Section 2, p. 181. The axioms given here are actually a simplification of those given by Luce.
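
The semiorder axioms and the failure of transitivity for indifference can be checked mechanically on a finite sample. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: it restricts the relation $x > y + 1$ to four hypothetical sample points, tests S1 to S3 by brute force, and defines indifference as "neither is related to the other". The names `is_semiorder`, `points`, `much_greater` and `indifferent` are invented for the example.

```python
from itertools import product


def is_semiorder(domain, P):
    """Brute-force check of axioms S1-S3 for a binary relation P (a set of pairs)."""
    for x in domain:
        if (x, x) in P:  # S1: not xPx
            return False
    for x, y, z, w in product(domain, repeat=4):
        # S2: xPy and zPw imply xPw or zPy
        if (x, y) in P and (z, w) in P and not ((x, w) in P or (z, y) in P):
            return False
        # S3: xPy and zPx imply wPy or zPw
        if (x, y) in P and (z, x) in P and not ((w, y) in P or (z, w) in P):
            return False
    return True


# Hypothetical sample of real numbers with the relation x > y + 1.
points = [0.0, 0.6, 1.2, 2.5]
much_greater = {(x, y) for x, y in product(points, repeat=2) if x > y + 1}


def indifferent(x, y):
    """xIy: neither xPy nor yPx."""
    return (x, y) not in much_greater and (y, x) not in much_greater
```

On this sample 0.0 is indifferent to 0.6 and 0.6 is indifferent to 1.2, yet 1.2 is noticeably greater than 0.0, so indifference is not transitive even though all three axioms hold.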
Before presenting the proof of the above, it would be well to outline a general method in proofs of the existence of measurement which we shall call the method of cosets. Let $\mathfrak{A} = \langle A, R_1, \ldots, R_n \rangle$ be a relational system of type $\langle m_1, \ldots, m_n \rangle$. A uniquely determined equivalence relation $E$ is introduced into $\mathfrak{A}$ by the condition:

$xEy$ if and only if for each $i = 1, \ldots, n$ and each pair $\langle z_1, \ldots, z_{m_i} \rangle$, $\langle w_1, \ldots, w_{m_i} \rangle$ of $m_i$-termed sequences of elements of $A$, if $z_j \neq w_j$ implies $\{z_j, w_j\} = \{x, y\}$ for $j = 1, \ldots, m_i$, then $R_i(z_1, \ldots, z_{m_i})$ if and only if $R_i(w_1, \ldots, w_{m_i})$.*

Even though the above definition is complicated to state in general, the meaning of the relation $E$ is simple: elements $x$ and $y$ stand in the relation $E$ just when they are perfect substitutes for each other with respect to all the relations $R_1, \ldots, R_n$. The notion of a weak ordering can serve as an example. Let $\mathfrak{A} = \langle A, R \rangle$ where the binary relation $R$ is connected and transitive. Then $xEy$ is equivalent to the condition: For all $z \in A$, $xRz$ if and only if $yRz$, and $zRx$ if and only if $zRy$. However, this simplifies finally to: $xRy$ and $yRx$.

Returning now to the general case, define, for each $x \in A$, $[x]$ to be the class of all $y$ such that $xEy$. $[x]$ is called the coset of $x$. Let $A^*$ be the class of all $[x]$ for $x \in A$. Directly from the definition of $E$ we can deduce that it is permissible to define $m_i$-ary relations $R_i^*$ over $A^*$ such that, for all $x_1, \ldots, x_{m_i} \in A$, $R_i^*([x_1], \ldots, [x_{m_i}])$ if and only if $R_i(x_1, \ldots, x_{m_i})$. The relational system $\mathfrak{A}^* = \langle A^*, R_1^*, \ldots, R_n^* \rangle$ is called the reduction of $\mathfrak{A}$ by cosets. It is at once obvious that $\mathfrak{A}^*$ is a homomorphic image of $\mathfrak{A}$ and that $\mathfrak{A}^{**}$ is isomorphic with $\mathfrak{A}^*$. What is not quite obvious is the following:

If $\mathfrak{B}$ is a homomorphic image of $\mathfrak{A}$, then $\mathfrak{A}^*$ is a homomorphic image of $\mathfrak{B}$.

* The authors are indebted to the referee for pointing out the work by Hailperin in [3] which suggested this general definition.
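
For a small finite system the reduction by cosets can be computed directly from the definition of $E$. The following sketch is only an illustration: it assumes relations are given as (set of tuples, arity) pairs, tests $xEy$ by comparing every pair of sequences that differ only by exchanging $x$ and $y$, and then groups elements into cosets, relying on the fact stated in the text that $E$ is an equivalence relation. The names `equivalent` and `reduce_by_cosets` and the sample weights are hypothetical.

```python
from itertools import product


def equivalent(x, y, domain, relations):
    """xEy: exchanging x and y in any positions never changes any relation."""
    for R, m in relations:
        for z in product(domain, repeat=m):
            for w in product(domain, repeat=m):
                # z_j != w_j must imply {z_j, w_j} == {x, y}
                if all(zj == wj or {zj, wj} == {x, y} for zj, wj in zip(z, w)):
                    if (z in R) != (w in R):
                        return False
    return True


def reduce_by_cosets(domain, relations):
    """Return (A*, reduced relations, map from each element to its coset)."""
    cosets = []
    for x in domain:
        for c in cosets:
            if equivalent(x, c[0], domain, relations):
                c.append(x)
                break
        else:
            cosets.append([x])
    coset_of = {x: frozenset(c) for c in cosets for x in c}
    reduced = [({tuple(coset_of[a] for a in seq) for seq in R}, m)
               for R, m in relations]
    return set(coset_of.values()), reduced, coset_of


# Hypothetical weak ordering: "weighs at most", with a and b tied.
domain = ["a", "b", "c"]
weight = {"a": 1, "b": 1, "c": 2}
R = {(x, y) for x, y in product(domain, repeat=2) if weight[x] <= weight[y]}

a_star, reduced, coset_of = reduce_by_cosets(domain, [(R, 2)])
```

In this example the tied elements fall into one coset, and the reduced relation is a simple ordering of two cosets, in line with the remark that for weak orderings $xEy$ simplifies to $xRy$ and $yRx$.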
By
proof,let /
of
way
that if
for
f{x)
simplicitythat
We
must
show
that
91 and
homomorphism
[x]
[y].Instead
if
from
that
show
g is
of type "2" and
are
91
wish
to
generalcase,
assume
"^, i^", $8
"B, 5".
to
have
that
therefore
[x] tor
g{f{x))
=
of 39 onto
shown
x
following relation between

subsystem : if 93 is a homomorphic
subsystem
function
any
of 91. For
B
from
of 91 to the
restriction
let
/ be
into
there
is
function
A. It is trivial to
verifythat
the
concepts of homomorphic
phic
image of 91,then 93 is isomorof 91 onto
homomorphism
such
of g
range
that
91*.
the
and
image
that
f{g{y))
for all y
35. Let g
of 91
yields the subsystem
B.
The
isomorphic
33.
the
Using
above
remarks
in
35 if and
91 is imheddahle
Further, it follows
that
of 91* onto
isomorphism
we
Let
function
subsystem
shown
have
(i)K
SfJif and
in addition
relative to
is closed under
be
remarked
satisfyexactly
notion
of
the
that for
K". In effect
without
of
one-one
subsystems, then if *
numerical
ments.
assign-
class of weak
proof in
the
relational system
orders, then
the
first
relational system 91,91 and

of first-order
logic not
identity,then
K*
is
always
involving the
satisfying
class of all systems
satisfyingthe
in addition
satisfyingthe axioms for K and

(*) If xEy, then x
y.
The applicationof this remark
is the
K*
paragraph
91*
is the class of all relational systems
if K
Hence,
identity.
tirst-order axioms
formulas
same
numerical
possessingonly
our
It should
images
isomorphism.
91* for 91
some
the formation
example again, if K is the

that
the class of simple orders. Notice
of this section is a specialcase
of (ii).
use
always an
above:
class of all systems in K
is the
91* in 35 is
of all homomorphic
isomorphic to
(ii)If K
equivalence:
in 35.
of 91*.
only if K* is also.
is
imbedding
of 35, and
the
once
class of relational systems closed under
any
be the class of all systems
K*
To
be
now
at
only if 91* is imbeddahle

any
establish
can
of 91 this property is characteristic
we
of the
33. We
"
homomorphism
with
such
A*
onto
Notice
Let
of 91 onto
35
clearlysymmetric.We
be
be
f{x) f[y),then xEy, or in other words, for all 2: ^,

if and only if zRy. Assume
xRz. It follows
if and only if yRz, and zRx
hence
S
w
hich
that
a
nd
f{y) f{z),
implies
yRz. The argument
f{x)S f{z),
xRz
is
MATHEMATICAL
f{y),then
PSYCHOLOGY
IN
READINGS
axiom:
^=
to weak
simple orderings
orderings and
is left to the reader.

Consider
semiorders.
defined
of E
again
For
above.
the
any
In terms
xEy if and
of semiorders.
case
"^
P"
of /
S, consider
one
only forif all z
can
e
Let
the
establish
A, xlz if and
be
the class
relation
of all finite
of indifference
simplifiedcharacterization
only ifylz.
DANA
Introduce
(*) as
SCOTT
axiom
new
AND
S4. The
just the class S*. Notice that

orderingsand simpleorderings,the
of subsystems even
though S is.
unlike
is
For
semiorder
any
only iffor all z,
leave
We
a
weak
if
"Â,Py
and
reader
S*, then
"
is clearer
x^Pyi, and
Now
the
to
Â
has
an
Let
in
assignment
{xq,
.,
the
uniquely by
If
(1)
the formation
relation
and
weak
then
follows
as
xPz.
of the
fact that
xPy impliesxRy,
that, if xRx^^,
and
xPy.
fixed member
a.
". Under
"Re, "
sequence
then
a,-
of S*. We
the
XiRXi_-ând
wish
relation R, A
to
is
Xi_^. Define
x^ ^
rational
of
Uq, ...,""
show
that 9t
simply ordered.
of
by a course
numbers
determined
conditions:
followingtwo
xJxq,
closed under
with
that
x^} where
recursion
values
then
notices that
Py he
satisfyingS4
pleasant situation
further
.S
xEy if and only if xRy and yRx. Thus,

simple ordering of A. The connection between
a.
one
then
yiRy,
let SH
if
all 21
if yPz
zPy,
verification
elementary
and
is
of
class S* is not
if zPx
the
ordering of A,
is
the
introduce
"y4 P"
219
SUPPES
class
if and
xRy
PATRICK
i+1
i
If
(2)
and
xJXj
XiPXj_^where
0, then
^"
a^
a^ +
i-\-\
Notice
that
show
Passing
Let
such
have
always
while
that
the
we
"
choice
that
If k ^
have
aj_-^ and
then
the induction
such
k ",
course
It follows
that
a^_-^. \i k
ai "
flj_i+
to
that
must
implies i "
that
the
j. By
a^. "
"
0.
Clearly
(I ) this is obvious.
If Xi_jÎxQ,
then
1
XiPXj_-^^.
fl,_i "
in other words
or
Xi_jÎxQ,
Xi_^PxQ.
case
and Xi_-i^Pxj^_-^.
Xi_-iÎx,^
By definition
^. there
is
no
then
problem. Assurhe
xJXj
xjx^.
show
1.
and
so
XjIXi_-^,
by our
it
induction
follows
hypothesis on i,
j, the requiredinequalityis obvious.
Similarly"i_i
a,_i, and
and
aj_i ^
hence
We
x^Px^^^^.
a,^.But
that, if a^ "
k. Assume
Let
hence
x^ be
by
the
way
a^ +
1 "
then
1, but
again, by
a^.j.
Let x^ be the
a^.-\-\.
^"1 " k, and, in view
a^ "
have
aj_i+
aj.+
"
a^ "
that, if x^Pxj^,then
prove
precedingargument,
Converselywe
of
not
1- If ?
hypothesis,a^. ^
step is
next
first element
of the
such
i. For
on
and xjx^,
XiRXi_-i^,
Xi_-i^RXj,
we
/" I,
The
0, /
of k
"
fl,.
and
assume
rcik-i
"
i. Now
"
cases:
1
aj^ -\
"
that
induction
xjx^
can
first element
i-\
-
by
ai_i
that
assume
1. Hence
two
are
0.
a^ ^
(2),
to
flj "
xÎx^,there
first that a^ "
x,^ be
"!_!
in
every
first element
We
that
we
a^^^ +
(2) the hypothesis implies that / ^ i, while in the case

formula
for a^ simplifiesto a^.
Notice
further
flj._^-f z+1.
either under
element
(1) or (2);for lettingXj be the
x^ comes
i the
i-\-^
a^, whence
The
x^Px,,.
of contradiction
first element
such
that
that
a^ "
aj.+ 1
hypothesis
not
x^Px,..
xjx.;
then
220
READINGS
and
'^
then
0 ": a^ "
a,. "
We
aj^-\-\.
and
Uj. If
Uj^ ^
thus
xJxq
a^. "
1, which
which
cij^-\-\,
that
is
again
is
the argument
covered, and
0, then
0 ^
conclude
can
"
a^
and
on
of cosets,
Let
us
given
class, K
a^. ^
a,,
been
have
cases
that
a^. We
been
have
actually
that
proved
S*
is
and, by the general remarks

S is also
theory of
ment
measure-
in the infinite
also work
the steps in
The
upon.
Second, if the proof that K
existence
semiorders.
systems, the
numerical
the
ment
measure-
First, after
one
relational
nummcal
systems in K, and
theory of
as
marked,
re-
all the
orderingof
and
be
was
where
systems
of addition
a
of
relational system should
numerical
is
as
co.
the
establishing
of relational
say,
be decided
long
case
real
is not
measurement
at
tion.
obvious, the cardinalityof systems in K should be taken into considera-
once
restriction to countable
The
and
justified,
of K,
member
helpfulin
the
convenient
of
After
the
existence
is
Consider
relational
introduce
that
of interest:
Let H
For
and
such
that
xyMûv
be the class of all such
for every
x, y, z, u,
Al.
// xyDzw
and
A2.
xyDzw
zwDxy.
or
v,
zwDuv,
and
given
then
We
or
method
for each
feasible
so
relational
present
an
In
is K.
subsystems.
established, there is
of type "4". For
one
system, what
example.
such
systems
if and
only if xyDyy. xyM^zw

if and only if there
zRy. xyM^+hw
xRy
uvM^zw.
A:
then
be
systems. This plan was
relational systems which

w
by the
assignments
plan : cosets
been
iA,Dy
the followingdefinitions:
only if xyDzw, zwDxy, yRz
exist u,
either
assignments?
systems %
in K
of measurement,
has
of measurement
finite
considers some
is to say, one
element
of K" is a subsystem
every
used
have
to
often
can
of cosets, it is sometimes
theory
class of all its numerical
if and
the reduced
Instead
such
could
we
if often
question which
axioms
If K'
.
of semiorders
case
is the
K'
in K'
system
some
the
subclass
empirically
restriction
of measurement
find numerical
only on
of semiorders.
case
relational system
trying to
concentrates
one
existence
imbedding by subsystems. That
consider
to
of
Then, instead
are
the
always seem
possiblewith
of each
the reduction
simplified
by
of cosets.
proof of
the
systems would
results
adequate
systems. Third,
we
it has
"Re, ^",
of
suggested naturallyby the structure
it is most
practicalto consider
be simply defined in terms
relations can
numbers.
All
examples simple orderings and
as
system should
a^ "
f{Xi)
well-orderingof type
summarize
now
using
is
is
ordering i?
the
as
that
Thus
proofwould
the above
that
inequahty
"Re, ^".
relative to
Notice
x^Rxj^.But
the
a,-fl, but
contradiction.
conclude
we
x,JXfj,because
contradicts
0. Now
"
such
relative to
of measurement
method
the
and
complete.
Finally define a function / on ^

% in "Re, ^".
that / imbeds
shown
theory
PSYCHOLOGY
MATHEMATICAL
IN
xyDuv.
satisfythe following
DANA
A3.
then
A5.
// xRy
A6.
There
is
A7.
//
xyDzw and
not uRy.
and
xRu,
not
A8.
and
These
axioms
between
relational
w.
not
xzDzy
and
then
there
xRy,
then
xRy,
that
there
are
zyDxy.
is
u,
such
and
between
Making heavy
H is a theory
that
system 9t
that zwDxu,
an
such
use
and
be formulated
y is not
of the
is the
only if x"y
"
last three
that
weak
and
yRx
case
the interval
existence
axioms, it
relative to the numerical
quaternary relation defined

for all x, y, z,
z"w
property of the
in first-order
in
greater than
of measurement
Archimedean
the
the relation i? is
in H,
interpretationof xyDzw
the intuitive
xy/!"izw
if and
cannot
for
"Re, A" where
system
be stressed
in A8
xzDuv.
that
such
not
interval
that
the condition
must
and
and
and
be shown
can
imply
the
is that
wzDyx.
yzDuv, then
zuDxy.
ordering of A,
wRz
221
SUPPES
PATRICK
then
and
// xyDzw
zuM'^vw
AND
xzDyw.
A4.
// xyDzw,
// xyDzw,
not
SCOTT
logic,because
by
Re.
It
ordering embodied
it implies that all systems in $H^*$ have cardinality not more than the power of the continuum. In addition, it can be shown that, if $\mathfrak{A}$ is in $H$, and $f$ and $g$ are two numerical assignments of $\mathfrak{A}$ relative to $\langle Re, \Delta \rangle$, then $f$ and $g$ are related by a positive linear transformation; that is, there exist $\alpha, \beta \in Re$ with $\alpha > 0$ such that, for all $x \in A$, $f(x) = \alpha g(x) + \beta$.* This gives in a certain sense the answer to the question above: If we know one numerical assignment for $\mathfrak{A}$, we know them all. Except for very special systems in $H$, nothing more specific can really be expected.

Notice that all relational systems in $H$ are necessarily infinite. In the next section we shall consider in detail the theory of measurement $F$ consisting of all finite relational systems imbeddable in $\langle Re, \Delta \rangle$. Here the situation is quite hopeless. There simply is no apparent general statement that can be made about numerical assignments. In as much as any function $\varphi$ which imbeds $\langle Re, \Delta \rangle$ in itself is necessarily a linear transformation, and conversely, it follows that, if $\mathfrak{A}$ is a system in $F$ and $f$ is an assignment for $\mathfrak{A}$, then $f$ composed with a linear transformation is also an assignment. The main difficulty with $F$ is that two assignments for the same system in $F$ need not be related by a linear transformation.
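
The closing remark about $F$ can be illustrated with a three-element system. The sketch below rests on an assumption: it takes the quaternary relation to hold for $x, y, z, w$ exactly when $f(x) - f(y) \leq f(z) - f(w)$, which is how the garbled definition of $\Delta$ earlier in this section appears to read. The helper name `delta_relation` and the sample values are hypothetical.

```python
from itertools import product


def delta_relation(assign):
    """All quadruples (x, y, z, w) with assign[x]-assign[y] <= assign[z]-assign[w]."""
    elems = sorted(assign)
    return {q for q in product(elems, repeat=4)
            if assign[q[0]] - assign[q[1]] <= assign[q[2]] - assign[q[3]]}


f = {"a": 0, "b": 1, "c": 3}
h = {k: 2 * v + 7 for k, v in f.items()}  # positive linear transformation of f
g = {"a": 0, "b": 1, "c": 4}              # NOT a linear transformation of f
neg = {k: -v for k, v in f.items()}       # negative coefficient, order reversed

same_under_linear = delta_relation(f) == delta_relation(h)
same_without_linear = delta_relation(f) == delta_relation(g)
```

Both `h` and `g` induce the same finite relational system as `f`, but only `h` is a positive linear transformation of `f`; this is the kind of non-uniqueness that the text attributes to finite systems.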
* The proofs of both these facts are very similar to the corresponding proofs in Suppes and Winet [6].

3. Axiomatizability. Given a theory of measurement, it is natural to ask various questions about its axiomatizability, for the axiomatic analysis of any mathematical theory usually throws considerable light on the structure of the theory. In particular, given an extrinsic characterization of a theory of measurement via a particular numerical relational system, it is quite desirable to have an intrinsic axiomatic characterization of the theory to be able better to recognize when a relational system actually belongs to the theory. In view of the paucity of metamathematical results concerning the axiomatics of higher-order theories, we shall restrict ourselves to the problem of axiomatizing theories of measurement in first-order logic.
It is a well-known result that, if a set of first-order axioms has one infinite model, then it has models of unbounded cardinalities. Since we are interested in one-one assignments with values in the set of real numbers, models of unbounded cardinalities are hardly an asset. That is to say, the class of all relational systems that are models of a given set of first-order axioms is usually not a theory of measurement. To remove such difficulties without having to understand them, we simply restrict the cardinalities under consideration. Even the restriction to finite cardinalities leads to some rather difficult questions. Thus for the most part of the remainder of this section we shall consider only finitary theories of measurement, i.e., theories containing only finite relational systems. Such a theory is called axiomatizable, if there exists a set of sentences of first-order logic (the axioms of the theory) such that a finite relational system is in the theory if and only if the system satisfies all the sentences in the set. A theory is finitely axiomatizable if it has a finite set of axioms. A theory is universally axiomatizable if it has a set of axioms each of which is a universal sentence (i.e., a sentence in prenex normal form with only universal quantifiers).
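
For a finite relational system, satisfaction of a universal sentence can be decided by trying every assignment of domain elements to the quantified variables. The sketch below is illustrative only: it assumes a single binary relation, represents the quantifier-free part of the sentence as a Python function, and uses transitivity as a hypothetical example of a universal axiom. It also checks the standard fact that a universal sentence true in a system remains true in each of its subsystems. The names `satisfies_universal`, `transitive` and `subsystems` are invented for the example.

```python
from itertools import combinations, product


def satisfies_universal(domain, R, matrix, n_vars):
    """True if `matrix` holds for every assignment of the variables in `domain`."""
    def holds(a, b):
        return (a, b) in R
    return all(matrix(holds, *vals) for vals in product(domain, repeat=n_vars))


def transitive(holds, x, y, z):
    """Quantifier-free matrix of: for all x, y, z, if xRy and yRz then xRz."""
    return not (holds(x, y) and holds(y, z)) or holds(x, z)


def subsystems(domain, R):
    """Yield every non-empty subsystem (subdomain, restricted relation)."""
    for k in range(1, len(domain) + 1):
        for sub in combinations(domain, k):
            yield list(sub), {(a, b) for (a, b) in R if a in sub and b in sub}


domain = [1, 2, 3, 4]
R = {(a, b) for a, b in product(domain, repeat=2) if a < b}

whole_ok = satisfies_universal(domain, R, transitive, 3)
all_subsystems_ok = all(
    satisfies_universal(d, r, transitive, 3) for d, r in subsystems(domain, R)
)
```

The exhaustive search grows as the domain size raised to the number of variables, so this is practical only for very small systems.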
It should
observed, first,that
be
finitarytheory of
any
measurement
deeper than saying that in first-order logicwe

write down
can
a sentence
completelydescribingthe isomorphism type of
each finite relational system not in the given theory, and clearlythe negations
This
is axiomatizable.
of these
for
closed
relational
that
of whether
recursive
the
system
is
recursive
enumeration
imbeddable
the
as
in each
case
or
in
in
may
theories
sentences
problem
is
true
or
as
theory
universal
consistingof
tences,
sen-
all finite
relational system, then

is simply the
effective axiomatization
recursivelyenumerable
this last
be taken
given numerical
number
also that if the
Notice
considers
one
give an
effective method
clearlya continuum
are
the axioms
It is of
required set of axioms.

instance
measurement.
class of universal
that
relational
then
systems imbeddable
establish
not
serve
cannot
we
subsystems
conversely.In
problem of
relational
can
finitarytheories of
under
and
no
the axioms, since there
writing down
of distinct
the
sentences
quiteobvious
course
is
in
not.
the
given
It is not
problem
numerical
difficult to
equivalent to the problem of giving a
of all the relation types of finite relational systems

the
systems whose
given numerical
relations
are
relational
definable
system. For
in first-order
numerical
logicin
terms
We turn now to the proof that there are interesting examples of finitary theories of measurement closed under subsystems which are not axiomatizable by a universal sentence. Let $F$ be the class of all finite relational systems of type $\langle 4 \rangle$ imbeddable in the numerical relational system $\langle Re, \Delta \rangle$. A wide variety of sets of empirical data are in $F$. In fact, all sets of psychological data based upon judgments of differences of sensation intensities or of differences in utility qualify as candidates for membership in $F$. For example, in an experiment concerned with the subjective measurement of the loudness of sounds, the appropriate empirical data would be obtained by asking subjects to compare the difference of loudness in every pair of sounds with every other. More elaborate interpretations are required to obtain appropriate data on utility differences for individuals or social groups (cf. Davidson, Suppes and Siegel [2], Suppes and Winet [6]). It may be of some interest to mention one probabilistic interpretation closely related to the classical scaling method of paired comparisons. Subjects are asked to choose only between objects, but they are asked to make this choice a number of times. There are many situations in which they vacillate in their choice, and the probability $p_{xy}$ that $x$ will be chosen over $y$ may be estimated from the relative frequency with which $x$ is chosen over $y$. From inequalities of the form $p_{xy} \leq p_{zw}$ we may obtain a set of empirical data, that is, a finite relational system of type $\langle 4 \rangle$, which is a candidate for membership in $F$. The intended interpretation is that, if $p_{xy} \geq \frac{1}{2}$ and $p_{zw} \geq \frac{1}{2}$, then $p_{xy} \leq p_{zw}$ if and only if the difference in sensation intensity or difference in utility between $x$ and $y$ is equal to or less than that between $z$ and $w$, the idea being, of course, that if $x$ and $y$ are closer together than $z$ and $w$ in the subjective scale, then the relative frequency of choice of $x$ over $y$ is closer to one-half than that of $z$ over $w$.

Before formally proving that $F$ is not axiomatizable by a universal sentence, we intuitively indicate for a relational system of ten elements the kind of difficulty which arises in any attempt to axiomatize $F$. Let the ten elements $a_1, \ldots, a_{10}$ be ordered as shown in the following diagram with atomic intervals given the designations indicated.
I"1 I
I"3 I
"2
"4
I/^lI/^2I
/^3
i^4
(a^,a^),let /5be the interval (ag,a^,),and let y be

or
largerthan
/?.We suppose further that a^, a.^, a.^, a.^ is equal in size to
^^
but
is less than /3.
/Sg,
Pi,Pi,^3, respectively,
Let
be
the
interval
"
to
Essentiallythis example
show
that
was
first
particular set of axioms
given
in another
is defective.
context
by
Herman
Rubin
DANA
size
The
that
now
elements
is
225
SUPPES
intervals
remaining
is imbeddable
in
chosen
so
"Re, A", whereas
clearlynot.
using the criterion derived
and
be
may
from
Vaught's
prove
The
Theorem.
PATRICK
elements
Generalizingthis example
we
the
of nine
of ten
the full system
theorem
AND
relationships
among
subsystem
any
SCOTT
theory of
is not
measurement
axiomatizable
by
versal
uni-
sentence.
In order to
Proof.
sentence,
need
we
is in F but
show
to
% of type "4" such
apply
make
F.) To
as
specifyingthe
"i for
We
then
now
1,
set
a^
1,
I.
and
a^m
and
^^
for i
2*
a^
2.
we
have
two
Then
m"\
for k
be
may
atomic
1,
to
cases
to
consider
take
we
one
10.
easily described
by
We
for
and
m"\,
.,
2m
"
than
intervals.
"^+i
I elements
numerical
greater
or
most
a^+.+i"
compact,
disrupt exactly
{"!,
.
define
1,
a^+i
depending
.,
a^
the
1.
w"
In
2^^.
"
on
ag^} defined,
then
we
parity
".x,y, z, w} is not
and
if and
only
(1)
nine
let
b
a-^^,
permutations
(These nine
^^
a^/2
set
a^,
of "a,
"
b, c, dy
relation
of
have
a^,
as
a^,
the
y ^
""+!,
"
a^^. Then
we
12,
set
and
merical
nu-
flgm-
I^
a^^-^, a^m),
the
following
in D:
"a, c, d, by
"c, d, a, by
"J),
d, c,
"a, d, c, by
"c, d, b, ay
the
2m
expected
put
ib,d, a, c"
d"c.
2,
w.
"c, b, d, a}
"
if
"", b, d, c"
b"a
for i
the
a^^j
ib, a, d, c"
ay
2,
permutation of ".a-^,
a^,
some
we
the
for i
.,
permutations correspond exactly
followingfrom
of iA, Dy
a^/g
m"\.
define
now
except for permutations
ix, y, z.wyeD
Moreover,
relation
X, y, z,
^^
m"2.
Thus if w
1,3,
and/Sj
c(.(^î)iîori
/Sg,a.^
i^^,ag
î, cx.^ jSg,(x^
p^. With
.
set
we
"
a^
,
is odd, and
i
I,3,
a-(rn+i-i)/2 ior
Then
is even,
and
m"\
is odd.
.,
have
even.
^^
.,
m"\
,
is
m"2
Case
4,
k elements
and
finite
2m"
integerequal
even
construct
we
m.
Case
4,
system
in its domain
subsystem of
definite
and
size of the
..,m"\
an
10
of 2m"
both
domain
a^,
numerical
we
fixingthe size of /S,-,
of
be
of numbers
selection
"i+i"
elements
every
subsystem
every
of the
elements
2m
that
construction
the
Let
relationship.
The
universal
finite relational
of 9t with
integer n
even
(A fortiori
is in F.
numbers
subsystem
51 of type "4" such
relational system
is in
axiomatizability
by
there is
% is not.
this end, for every
To
for every
that
that every
of
the criterion
All nine
are
needed
appropriate properties.)
to
to
the
make
strict
the
inequalities
subsystems
226
the choice
From
subsystem of
that every
Case
be
in the
The
2.
Case
2a.
(2)are
merely
in D
imbeddable
to
show
naturallyarise.
cases
or
a^^^
restricted to the
in "Re, A", but
a^ "
this situation
For
a^.
assignment the function
a^, a^,
subsystem. There
in the
not
is neither
omitted
element
element
the
is not
length equal
but by hypothesis
to
is a^, a^,
subsystem
not
in F;
not
a^m-
subsystem,
(1)
virtue of
by
of it.
subsystem
Case
up
is in F. Two
1 elements
"
omitted
element
subsystem
add
must
a^
the
and
is
The
2m
it is obvious
a^+j and a^^,

the interval {a^^-^,
a^.^.It remains
the nine permutationsof
Then
a^ and
(a^,a^ is less than
the interval
the definition of D
intervals between
of the atomic
sum
and
"Re, A", that is,that iA, Dy is
in
intervals between
for the atomic

to the
in A
of the numbers
imbeddable
is not
that iA, Dy
we
a^ra- Let
nor
a^+i
are
two
cases
may
use
for
a^
consider.
to
numerical
our
.,i"\,
1,
/ defined by /(a,_j) "t-i+l for /
to
It is straightforward but tedious
1. ...,n"i.
=
CLi+j for ?
that
/ is a numerical
verify
f{ai^j)
"
lation
assignment, that is, that it preserves the reand
D
to
observations
crucial
two
defined
are
as
(2).Only
by (1)
intervals (in the full system), if
this verification. First, regarding atomic
"i_,+i-a,_,.
(a^_j"1)
chosen
(^x,
y) 7^ {z,w)
from
the
the
defined,
as
the
1,
of
would
.
proof of
A,
z"w,
that
a^-\-\.
ai"
{z,w) is
and
then
not
numbers
no
so
it is clear
f(w). (Note that the
"
distinct
two
were
interval, and
Then
z"w.
f{y)^ f{z)
"
in A
atomic
an
x"y-\-2
f{x)
result that
the
non
atomic
intervals
size.)
same
2b.
"
x"y
definition
Case
for
implies the weaker
above
have
and
/(",_,+i)
-/(",_,") (a^.^+i-l)-
Second,
f{cik+i)"f{^k)-
^fc+i" aj.
that, if x, y, z,
=
then
k"i,
for
a^+i-a^
.,
be
i"\,
Here
expected
use
we
may
from
the
fia-i+j)^j+j+l
=
for
numerical
previous
/
1,
assignment /
f{ai_"j ai_j
by
case,
n"i.
This completes
=
.,
the theorem.
It would be pleasant to report that we could prove a stronger result about the theory of measurement F, namely, that it is not finitely axiomatizable. Unfortunately, there seems to be a paucity of tools available for studying such questions for classes of finite relational systems. However, we would like to state a conjecture which if true would provide one useful tool for studying the axiomatizability of finitary theories of measurement like F which are closed under submodels. We say that two sentences are finitely equivalent if and only if they are satisfied by the same finite relational systems, and we conjecture: If S is a sentence such that if it is satisfied by a finite model it is satisfied by every submodel of the finite model, then there is a universal sentence finitely equivalent to S. If this conjecture is true, it follows that any finitary theory of measurement closed under submodels is finitely axiomatizable if and only if it is axiomatizable by a universal sentence.
The proof (or disproof) of this conjecture appears difficult. It easily follows
DANA
that,
sense
wider
sentences.
is
the
[7]
sense
and
Since
logical
by
logically equivalent
of
consequence
full set, it also
the
implies
submodels,
under
is
universal
class
of
set
of
it
because
equivalent
is thus
and
subset
sentences,
but
them;
in
universal
of universal
set
subset
finite
the
implies
this
to
finite
some
class
the
in
of relational
the
denumerable
wider
throughout
for
is true;
conjecture
is axiomatizable
is
removed
are
the
in
classes
(arithmetical)
restrictions
modified
universal
on
finitistic
thus
AND
satisfying S, being closed
systems
the
if
the
conjecture,
it
results
Tarski's
from
SCOTT
it.
to
Our
is
conjecture
its
pertinence
is not
we
should
like
to
special
area
in the
definable in
theory of
of
measurement
is
problem
Let
in
manner
all
for
negative
be
arise
which
relation
finitary
the
Is
finitely axiomatizable?
the
of
theory
the
that
shows
relations
quaternary
less than.
and
then
is true,
and
conclusion
binary numerical
in
imbeddable
models
In
typical of those
any
and
models
of
theory
measurement.
of plus
terms
systems
finite
about
of
problem
finitely axiomatizable
is not
an
general
theories
to
unsolved
of measurement.
conjecture
(If our
restricted
mention
elementary
an
the
concerning
one
in
definable
ment
measure-
of
terms
this
to
answer
plus and
less than.)
REFERENCES

[1] G. Birkhoff, Lattice theory, American Mathematical Society colloquium series, vol. 25, revised ed. (1948), xiv + 283 pp.
[2] D. Davidson, P. Suppes and S. Siegel, Decision making: An experimental approach, Stanford, California (Stanford University Press), 1957, 121 pp.
[3] T. Hailperin, Remarks on identity and description in first-order axiom systems, this Journal, vol. 19 (1954), pp. 14-20.
[4] R. D. Luce, Semiorders and a theory of utility discrimination, Econometrica, vol. 24 (1956), pp. 178-191.
[5] W. Sierpiński, Hypothèse du continu, Warsaw and Lwów, 1934, 192 pp.
[6] P. Suppes and M. Winet, An axiomatization of utility based on the notion of utility differences, Management science, vol. 1 (1955), pp. 259-270.
[7] A. Tarski, Contributions to the theory of models, I, II, III, Indagationes mathematicae, vol. 16 (1954), pp. 572-581, 582-588, and vol. 17 (1955), pp. 56-64.
[8] R. Vaught, Remarks on universal classes of relational systems, Indagationes mathematicae, vol. 16 (1954), pp. 589-591.

PRINCETON UNIVERSITY AND STANFORD UNIVERSITY

MODELS FOR CHOICE-REACTION TIME

Mervyn Stone
Medical Research Council*

In the two-choice situation, the Wald sequential probability ratio decision procedure is applied to relate the mean and variance of the decision times, for each alternative separately, to the error rates and the ratio of the frequencies of presentation of the alternatives. For situations involving more than two choices, a fixed sample decision procedure (selection of the alternative with highest likelihood) is examined, and the relation is found between the decision time (or size of sample), the error rate, and the number of alternatives.
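
As a rough illustration of the two-choice procedure named in the abstract, the following sketch accumulates $c(x) = \log p_1(x) - \log p_0(x)$ over successive samples and stops as soon as the running total exceeds $\log A$ (decide for $S_1$) or falls below $\log B$ (decide for $S_0$). The function names `sprt` and `normal_log_pdf`, the unit-variance normal densities, and the error targets of 0.05 are assumptions made only for this example; the thresholds use Wald's approximations $A \approx (1-\beta)/\alpha$ and $B \approx \beta/(1-\alpha)$.

```python
import math


def sprt(samples, log_p0, log_p1, A, B):
    """Sequential probability ratio test over an iterable of samples.

    Returns (decision, n): decision is 1 for S_1, 0 for S_0, or None if the
    samples run out before a threshold is crossed; n is the number of
    samples consumed (the "decision time" in units of one sample).
    """
    log_a, log_b = math.log(A), math.log(B)
    total = 0.0
    n = 0
    for x in samples:
        n += 1
        total += log_p1(x) - log_p0(x)  # c(x) = log p1(x) - log p0(x)
        if total > log_a:
            return 1, n
        if total < log_b:
            return 0, n
    return None, n


def normal_log_pdf(mean):
    """Log density of a unit-variance normal (illustrative assumption)."""
    def log_pdf(x):
        return -0.5 * (x - mean) ** 2 - 0.5 * math.log(2 * math.pi)
    return log_pdf


# Hypothetical setup: S_0 gives N(0, 1) samples, S_1 gives N(1, 1) samples.
log_p0 = normal_log_pdf(0.0)
log_p1 = normal_log_pdf(1.0)
alpha = beta = 0.05
A = (1 - beta) / alpha   # upper threshold, about 19
B = beta / (1 - alpha)   # lower threshold, about 1/19

decision_hi, n_hi = sprt([1.5] * 10, log_p0, log_p1, A, B)
decision_lo, n_lo = sprt([-0.5] * 10, log_p0, log_p1, A, B)
decision_none, n_none = sprt([0.5] * 10, log_p0, log_p1, A, B)
```

With these densities $c(x) = x - 0.5$, so samples at 0.5 carry no evidence either way and the third call never reaches a decision.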
This
models
of
is made
no
is
and
with
data
is
that
need
designed with
to be
The
models
subject (S) is given

identify some
after
for
signal and
given
independent
The
present
models
dynamic
run
and
their
in
time, the
the
following
about
or
The
response.
time.
time
Thus
appeared
taken
the
in
may
choice
of
procedures, but
Also
no
(i)the
to avoid
parisons
com-
paucity
premature
orthogonal
at
of
a
for the
onset
change
stream
signal, a
into
rate
S. After
S's
reaches
Chaucer
recorded
Road,
1960, 25, 251-260.
up
signal
that
is,
mutually
with
time.
of three
Reprinted
certain
taking
time, S makes
will be
Cambridge,
of
stream
decision
time, the decision

is made
are
to
The
They will be hydro-
each
of
required
sequence;
not
the
reaction.
signals
do
which
presented with
is
of response.
time
228
is
random
response
15
signal and
S
to be
Unit,
in
appropriate
an
uniform
further
choice-reaction
Psychometrika,
mode
this
situation
of different
the
At
or
presentation
settled
quick
The
reasons:
the
to
form
attributes
After
Research
decision
pendices
ap-
that
hoped
in directions
ia made.
attributes
flows
"computer."
making
have.
be kept open
make
reaction
sense.
the
*Applied Psychology
article
signal
signal
input time, the front
mechanism
This
S has
in
to
in mind.
and
probabilitiesof
that
It is
for several
applying
as
the
successive
text.
he
confined
powerful discrimination,experiments will
most
of signals, the
assume
information
motor
the
in the
summarized
specificmodels
until
are
statistical
data
often
are
of the
details
reader
time-stationary stimulus
attribute
signal remains
data
field should
envisaged
are
the
mathematical
several
psychologically unreasonable.
the
interests;(iii)for the
our
appear
assist
experimental
means
rejections;(ii)published data
to
results
by analogy with
presented which
are
working
analysis of the
mainly
made
of available
The
presentation will
"calculated-observed"
model
point of usefulness
time.
only definitions
method
models
the
to
choice-reaction
for
and
this
develops
paper
called
the
components:
England.
with
permission.
MERVYN
the
the decision
input time, T", ;
apply
which
to Td
makes
Ti and
models
"
(the number
variables
will be related to the environmental
frequencies of presentation) and
their
the
Tm
are
It is assumed
that
that
commences;
information
the
the
paced
is supposed
there
that
patterns of information
may
"S's computer
which
variables
at
time
either
arise from
is
the
need
no
We
sources.
for
of
occurred) until
or
Sq
or
If there
Si
of
the
self-
is, some
is
that
variable
the
certainty
un-
at
has
variable
which
on
on
random
independent
random
the
information
(dependent
certainty
un-
added
noise
series of
each
no
The
that
will suppose
t and
random
Si
stream
statistical computer.
situation,from
external
intervals
(stationary)distribution
the
information; that
the
in
operates is equivalent to
short
signal (eitherSq
examining
preparatory warning signal is given. It
overlap
may
both
from
or
to
some
arise from
start
is "noisy" until
(This stream
it.)This assumption holds in the
some
there
in this sense,
input stage,
added
is
to
computer.
also when
and
when
knows
the
when
implied
it is not
Situation
Two-Choice
jar the
subject knows
is,he
signal is
condition
Models
the
arriving at
from
stream
Ratio
which
at
rate
By concentratingon T^ in this way,

necessarilyindependent of these factors.
Likelihood
say)
The
time, Tm
motor
incorrect responses.
that
has
] the
time, T^
signals and
of
the
signal
is made.
the response
Signal
Xi
Poix) and
Let
respectively.If
the
x's
of information
stream
auto-correlation
the
be
Pi(x)
x's
are
the
probabilitiesof
are
instantaneous
then
to
the
integralsof
the
stream
samples
over
not
the
stored
the
of
an
So and
less than
time
lags (or
at
zero
apart. If
intervals,then
the
least for
transforms
computer
Si
continuous
almost
an
successive
the
in
signal is
independence implies
for all time
t).Suppose
is then
quantity c(x) which
of
stream
auto-correlation
with
when
assumption
parts of the
between
assumption requires zero

those not small compared
x
X3
X2
each
adder.
SequentialCase
The
A
and
makes
computer
with
log B
the appropriate motor

the
been
total has
made
not
for Si
"
a
are
running
total of
preselectedso
c(xi),0(^2),
that
"
S decides
"
Constant
"
.
log
for So
(and makes
log B, provided
the total falls below

as
action) as soon
the decision would
exceeded
log A when
previously
facilitates
(The odd way of expressing the constants
have
later
230
READINGS
references.)If
IN
decision
the
is made
of the function
c(x)
S is familiar
with
different
that
situation
of
number
So
Si
distributions
then
[1]in
on
xi
X2
and
the
decisions
"
of
thought
S
by
let no
when
who
have
may
to
ceivable
con-
of
structure
of
(1) is
of the
averages
other
decision
incorrect
be
trained
are
cedure
pro-
response
this
possible that
is
signals presented
for any
It is
it is
optimality
the
and
exploratory,
as
the
be the
ni
learning,
task
However
from
The
,
averages
ni
of
process
equal probabilitiesof
n*
S,
to
Even
decision
information
The
that
initiation
T"
the
implies that
and
incorrect
of
Without
to
form
of
it
use
by
to
ways
variable
assume
called
the
much
T"
assumption
affect the
may
therefore
tion
assump-
is the
same
is made
for
T" and
deliberatelyshortened.
However,
Tj
in the
uncertainty
may
distribution
for So
(or of
be
without
about
for
(and
So
a
hold
a
reasonable
needed
to
the
n's,and
natively,
; alterit does
therefore
leading to Si) is
With
the
above
the
of
same
assumption,
of the correct
for
a comparison
comparison of those leading to Si).
test
about
of the
those
incorrect.
or
examining
"condition
The
.
affect Ta
result should
of
for So
that Ti +
possibility
and
is
requires only the
of Ti +
same
which
processed since the

by information
is
shown
1
it
Appendix
that, with mild
In
assumptions
of
it is T
influenced
correct
would
errors
be
decision
basis
the
time, Ti
Pi{x), the
same
decision
(The
exclude
action.
leading to
the
leading to
that
which
the distribution
wrong.
may
are
making
more
intervening
the
T's
provides
proportion
decisions
this
of
be
cannot
motor
is available
the computer
to
remembered
not
length
Td's,leading to
of
does
po{x) and
on
that
right or
long, T"
of the
restrictions
whether
is
be
test
trials
of Tj
presented
if Td is
assume
Consider
.) This
for Si
it must
so,
the value
correlated.
unwise
the
is found.
one
decision
are
fio and
"
Td
not
is,given
whether
think
n%
be
his computer.
on
for
appeal
not
be
etc., with
n*
followingassumption.
This
choice
probabilitydistributions
discrimination
deduced
following terms:
testing the model,
measured
the
theory
optimum
reward.
Before
be
the
optimality does
suitable
the
result of
may
optimal
can
necessary
then
Si
the
imposed
respectively.If n*o
and
So
the
samples
based
to
c(x)'suntil
and
by Wald
the
the
of the
of results. *S's computer
given knowledge
stated
that
nt. The
logpo(a:).
be
Po(x) and Pi(x).Such familiaritymay
S
has
trials
provided
performed many
the
sample T^
[1] shows
test
\ogp,(x)
implies that
function
trying out
nth
c(x) is
(1)
a
PSYCHOLOGY
the
at
sequential probability ratio
of the
Such
MATHEMATICAL
give
of the
a
Po(x)
model.
powerful
and
validity of
However,
fair
test.
Pi(x),it
the
model.
is difficult to
Since
is
an
operational definition,it would

clearly be
and
is
there
one
Pi(x).However,
Po{x)
tion,
assumpof
symmetry,"
which
in
some
discrimination
232
in
in
Appendix 3 that, provided lOe "

the followingrelation between
Ta
Td
The
J(e,e)
cc
po "
and
e
,
lOe, the minimization
"
results
po
J(po,Po).
Non-Sequential Fixed-Sample Case

If S has an incentive to react quickly and correctly, then the advantage of the sequential decision procedure is that those discriminations which by chance happen to be easy are made quickly and time is saved. However it is possible that S may adopt a different, less efficient strategy which is to fix $T_d$ for all trials at a certain value which will give an accepted error rate. Let the sample size corresponding to this decision time be $n$. The likelihood ratio procedures are as follows: decide for $S_0$ if $c(x_1) + \cdots + c(x_n) < \log C$; and decide for $S_1$ if $c(x_1) + \cdots + c(x_n) > \log C$; $c(x) = \log p_1(x) - \log p_0(x)$, $C > 0$. These procedures are optimal in the sense that, if any other procedure based on $x_1, \ldots, x_n$ is used, there exists one of the likelihood ratio procedures with smaller error probabilities. It was remarkable that in the sequential case useful predictions were obtainable under mild restrictions on $p_0(x)$ and $p_1(x)$. Unfortunately this does not hold for the fixed-sample case, making more difficult the problem of testing whether such a model holds.
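
To make the trade-off between the fixed sample size $n$ and the accepted error rate concrete, here is a minimal sketch under an assumption that this passage does not make: $p_0$ is taken as $N(0, 1)$ and $p_1$ as $N(\mu, 1)$. Then $c(x) = \mu x - \mu^2/2$, the total of $n$ terms is normal under either alternative, and both error probabilities have closed forms. The function names and the numerical values are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def fixed_sample_errors(n, mu, C=1.0):
    """Error probabilities of the rule 'decide S_1 if sum c(x_i) > log C'.

    Assumes p0 = N(0, 1) and p1 = N(mu, 1) with mu > 0, so the total of the
    c(x_i) is N(-n*mu^2/2, n*mu^2) under S_0 and N(+n*mu^2/2, n*mu^2) under S_1.
    Returns (alpha, beta) = (P(decide S_1 | S_0), P(decide S_0 | S_1)).
    """
    sd = mu * math.sqrt(n)
    shift = n * mu ** 2 / 2
    alpha = 1.0 - _STD_NORMAL.cdf((math.log(C) + shift) / sd)
    beta = _STD_NORMAL.cdf((math.log(C) - shift) / sd)
    return alpha, beta


def smallest_sample_size(mu, target, C=1.0, n_max=10_000):
    """Smallest n for which both error probabilities are at most `target`."""
    for n in range(1, n_max + 1):
        alpha, beta = fixed_sample_errors(n, mu, C)
        if alpha <= target and beta <= target:
            return n
    return None


alpha4, beta4 = fixed_sample_errors(n=4, mu=1.0)
n_needed = smallest_sample_size(mu=1.0, target=0.05)
```

With $C = 1$ the two error probabilities are equal, which is why a single accepted error rate fixes $n$ in this symmetric special case.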
If there is no input storage, it is possible that the results of the self-imposed strategy just outlined are equivalent to those obtainable when the experimenter himself cuts off the signals after an exposure time $T_d$. But this is the type of situation considered by Peterson and Birdsall [2]. The emphasis of these authors is mainly on the external parameters (such as energy) rather than on any supposed intervening variable. They define a set of physical situations for auditory discrimination in terms of a parameter $d$, which is equivalent to the difference between the means of two normal populations with unit variance. (For, in the cases considered, it happens that the logarithm of the likelihood ratio of the actual physical random variables for the two alternatives is normally distributed with equality of variance under the two alternatives.) This parameter sets a limit to the various performances (error probabilities to $S_0$ and $S_1$) of any discriminator using the whole of the physical information. It therefore sets an upper bound on the performance of S who can only use less than the whole. In [2] the authors make the assumption that the information on the basis of which S makes his discrimination nevertheless gives normality of logarithm of the likelihood ratio. They examine data to see whether S is producing error frequencies that lie on a curve defined by a $d$ greater than that in the external situation.
For
alternatives
variable
More
than
there
are
(which
may
be
Two
Alternatives
vening
probabilitydistributions for the interthat is,signals,- induces an
multivariate);
m
with
the
of
the consequences
where
pi(x) ior
distribution
probabUity
1,
"
fixed-sampledecision procedure
will consider
We
m.
"
"
based
on
.Xi
"
"
"
x"
is fixed.
signalsare presented independently with probabilities

Pm
Pi
the
when
the
of
and
is
to
if
signals.
q:.(D)
probability error
(adding to unity)
decision procedure D (based on a;i
.t")is used, then the probability
of error
to a singlepresentationis
If the
"
"
"
"
"
"
J2pM^)-
minimizing e is that which effectively

selects the
posteriorprobability.In this section, this
minimum
(or T^/t) and m when distributions are normal.
e will be related to n
it might be necessary
to supplement
in the validation of the model
However
Td with a time T^ representingthe time the computer requiresto examine
is the largest.For, although it
the m posteriorprobabilities
to decide which
It is shown
Appendix 4 that
signal with maximum
the
in
3D
might
be reasonable
expect Ta
that Ta
with
to vary
the
We
"
the
will state
relation between
take
Pi
x{l),
"
P2
"
inputs
is readilyseen
on
the
to
and
is held
is all-round
with
channel
of
x{l),
"
"
"
mean
a;(l),
"
x(m)
variance,
ances.
unit vari-
regarded
be
can
is stimulated
iih channel
The
unit
and
means
"
independent
are
0 and
zero
"
variable
random
x(m)
"
with
normal
in the
[3],who stated the

by the experimenter):we
under
Si
ing
signalcorrespondAppendix 5 that,
the
optimal procedure is to choose

the largesttotal. It is shown
that the
two
any
is constant
multivariate
with
symmetry.
similar channels.
suppose
Birdsall
constant
that
are
when
and
and
1/m
of
components
there
the
with
p"
to
larger.
Peterson
would
one
be
to compare
necessary
Thus
It
when
m
=
"
the other
while
"
'
between
x(m). Under
"."
suppose
x(i) is normally distributed
"
that
and
which
relation
and
is the
(treatedby
followingspecialcase
as
t' is the time
where
1)^',
(m
and decide
probabilities
=
for Ta would
simplest model
The
m.
Ti -f T^ is independent of w,
that
to suppose
in
this procedure,
nn'
for those
{1 +
for which
distribution
have
been
n(j.^
and
[0.64(m
e
results
nearly linear
are
(1/m). $"'
The
values
of
ni/
for certain
against log
m,
which
agrees
be
can
with
values
ized
standardof
and
Td is proportional to
then
is
^-\l/m)]'
is the inverse of the normal
independent of m,
plotted in Figure 1. It
If
e)
"
function.
calculated.
the
"
[$"'(1
l)"'''
-f-0.45]'}
that
seen
T^ is
very
experimental findings
some
in this field.
The
question may
be
raised
model.
whether
symmetry
condition
of the
to the
where
auditory signalis
case
an
Peterson
any
w-choice
and
task
Birdsall
presented in
one
can
obey
the
the
model
apply
equal periods
of four
LOGg^
Figure
of
an
(Ta) for Error
Time
Decision
The
present, but
expect the model
apply to
lightsarranged
discriminable
positional.However,
on
is
one
of which
the
to
low
in the
in
where
intensityvisual
positionaleffect
the
difficult,
would
to
of
one
not
fairlyeasily
be highly
display,for the noise would
some
case
superficially
upset it. We
of response
case
is
symmetry
case
would
difficulties of S
memory
any
Equally Likely Alternatives (m)
of
noise. In this
of S to "white"
exposure
1
Number
(e) and
Rate
lightsare patches
signalis superimposed so
the
not
may
be
important
and
noise
of white
that response
be
there may
symmetry.
Appendix
Let n,y be the sample size for a decision in favor of ","when

s, is presented.
by its characteristic function,
distribution of n,^ is completely determined
The
rPijFrom
A5.1
of
[1],if
"f"iii)
=
12Piix)[pi(x)/poix)Y,
then
a)JBVoo[-log0o(O] + aAVio[-log0o(O]
(6)
(1
(7)
/35Voi[-log0i(O] +
Now
^)AVn[-log"/.,(0]
1,
1,
F" defined in Appendix 2 are small. If a "

quantitiesEi
/8/(l
^)/a and B
(I
iS " 0.1 then to a good approximation A
and
in
t
1
+
(7),
w
(6)
+
w)
"/"o(1
î(w);so, putting
provided
and
the
0.1
/35Voo[-log0i(i^)]+
(1
"
(1
^)AV,o[-log"î(w)]
a)SVo:[-log0o(w)] + aAVn[-log"ô(w)]
1,
"
a).
MERVYN
By
comparing these equations with
and
^10
of
\pn
and
nio
(7), it is found that iôo 'A"i

of noo and noi (and similarly those
(6) and
the distributions
Therefore
.
uu)
identical.
are
Appendix
In the
of symmetry,
case
X Po{x)log [po{x)/pi(x)] J2 Pi(x)log [pi(x)/Po(x)]
E,
and
var
log [Pq{x)/p\{x)]under
oi[l],\i E and
A:12
From
paix)
(8)
no
V.
small,
are
J{a, ^)/E;
Pi{x)
log [Pi{x)/pQ{x)]under
var
n^
J(/3,a)/E.
Therefore
/((8,a)/J (a,/3).
ni/flo
=
By
with
differentiating(6) twice
(8) and
the fact that
a(l
VJ{oc,/3)
'"
t and
to
respect
lAyis the characteristic
substituting t
function
oc)[^n\ (ni
0, using
of riij
,
n,y]
E'
{I
"y
-a-
By symmetry
^(1
F/(/3,g)
""'
^)ml
(no
fiO^]
E'
a-
py
a-
Hence
J{a, P)vi
/(/3,a)yo
a)[4n? (fii no)2]

niy]]/(l
J(", /3)/3(l |3)[4n^ (no
a)"(l
{,1(13,
^y
Appendix 3
If
"
Keeping
lOe
"
po
the usual
"
and
0.1
[or po
1
^
+
"
"
(1
then, by (8), Ta
0.1
Po)/3]constant
"
lOe, the condition
"
methods
on
that the minimum
at
set of
be
the
for which
set
a
of all
decision
value
in the
a).
po)/(i3,
range
Td is proportional to J{e, e)
"
by
J{po, Po)-
possiblevalues
is made
given by
/3will be satisfied. It is found
and
Appendix
Let
(1
p(J{a, /3)+
oc
for Si
of
a:
(xi
"
"
"
Xn) and
Xi the
Then
$$e = \sum_{i=1}^{m} p_i \sum_{x \notin X_i} p_i(x).$$
Suppose $X_i$ and $X_j$ have a common boundary; then, for $e$ to be a minimum, it will not be changed by small displacements in this boundary. Hence, on the boundary, $p_i p_i(x) = p_j p_j(x)$; that is, the posterior probability of $s_i$ equals that of $s_j$. Considering all possible boundaries, the solution is that $X_i$ is the set of $x$'s for which $s_i$ has greater posterior probability than the other signals.
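
The maximum-posterior rule can be tried numerically in the symmetric special case the paper treats: $m$ equally likely signals, independent unit-variance normal channels, mean $\mu$ on the stimulated channel and zero on the others, so that the rule reduces to choosing the channel with the largest total. The sketch below estimates the error rate by simulation. The function names, the seed, the trial count, and the parameter values are all assumptions made for illustration; the closed form is given only for $m = 2$ as a check.

```python
import math
import random
from statistics import NormalDist


def simulated_error_rate(m, n, mu, trials=4000, seed=0):
    """Monte Carlo error rate of 'choose the channel with the largest total'.

    Channel 0 is taken as the stimulated one (by symmetry this loses no
    generality). The total of n unit-variance normal observations is drawn
    directly as N(n*mu, n) for the signal channel and N(0, n) for the others.
    """
    rng = random.Random(seed)
    sd = math.sqrt(n)
    errors = 0
    for _ in range(trials):
        totals = [rng.gauss(n * mu if i == 0 else 0.0, sd) for i in range(m)]
        chosen = max(range(m), key=totals.__getitem__)
        if chosen != 0:
            errors += 1
    return errors / trials


def two_choice_error(n, mu):
    """Exact error rate for m = 2: P(noise total exceeds signal total)."""
    return NormalDist().cdf(-mu * math.sqrt(n / 2))


e2 = simulated_error_rate(m=2, n=4, mu=0.5)
e8 = simulated_error_rate(m=8, n=4, mu=0.5)
exact2 = two_choice_error(n=4, mu=0.5)
```

Holding $n$ and $\mu$ fixed, the estimated error rate grows with $m$, which is the qualitative relation between error rate, sample size, and number of alternatives that the text goes on to quantify.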
Appendix
Write
^(*)
under
Then,
Si
1] xXi)/n.
\/nx{l)is A^(\/n/i,
1) and
\/nx{i)is A^(0, 1) for
i t^
1.
Therefore,
ai(2D)
"
arrk!^
"
"
On
mi)r-'
[-Kw
exp
aA fjif]
du.
integrationby parts,
(9)
Z pM^)
(w
f ûr-'^(u
l)(27r)-'''
6
say, where
their tabulation.
"
while
The
ei{6)
same
that oiv
as
-\-w, where
of
not
are
addition
of
to
determine
"
will
the
and
"
as
as
the basis of
5
-^
oo
"
for 6.
the distribution
to
(wj
max
"
"
"
variables.
of
of 6 turns
v^î)and
v,Vi
out
"
Referring to Graph
second
(9),e"(0)
moment
distribution.
normal
and
of Peterson
o-âs follows. From
e"(^)
"
"probabilitydensity function"
those
d is
"
"
be
y"_i
4.2.2(7)
quotients
the
Also
approximately normal,
we
N(v, o-^),
(1/m). Also e"(0)
Birdsall. If 6 is
=
"
Therefore
(" v/cr).
=
0-^
[0.64(m
-^
and
oo
improve normality. Hence
calculations
(-iw') du
exp
this form
[3]use
20, the first and
"
v/aAlso
different from
very
agreeing with
normal
as
hence
and
m
independent
[4],it can be seen that,for
of
"
"
is
\ei{d)\
standard
are
Birdsall
and
e"(0)
However
0. Therefore
"
Vnix)
"00
-s/nn.
Peterson
characteristic function
the
'^^
(27r)-
the constant
nti"
=
var
var
1)~*+ 0.45]'for
error
{1 +
and
w
m
"
-$"'(l/m).
from
w
4.2.2(6) of [4], var
o"^
determines
Putting e"{d)
Graph
20, which
rate,
[0.64(m
ly'''+ 0.45f}[$-'(! e)
-
*-'(l/m)f.
e,
REFERENCES

[1] Wald, A. Sequential analysis. New York: Wiley, 1947.
[2] Peterson, W. W. and Birdsall, T. G. The theory of signal detectability. Tech. Rep. No. 13, Electronic Defense Group, Univ. Michigan, 1953.
[3] Peterson, W. W. and Birdsall, T. G. The probability of a correct decision in a forced choice among M alternatives. Quarterly Prog. Rep. No. 10, Electronic Defense Group, Univ. Michigan, 1954.
[4] Gumbel, E. J. Statistics of extremes. New York: Columbia Univ. Press, 1958.

Manuscript received 10/^6/69
Revised manuscript received 1/4/60

I
STATISTICAL INFERENCE ABOUT MARKOV CHAINS

T. W. Anderson and Leo A. Goodman¹
Columbia University and University of Chicago

Received August 29, 1955; revised October 18, 1956.
¹ This work was carried out under the sponsorship of the Social Science Research Council, The RAND Corporation, and the Statistics Branch, Office of Naval Research.
This article appeared in Ann. Math. Stat., 1957, 28, 89-110. Reprinted with permission.

Summary. Maximum likelihood estimates and their asymptotic distribution are obtained for the transition probabilities in a Markov chain of arbitrary order when there are repeated observations of the chain. Likelihood ratio tests and $\chi^2$-tests of the form used in contingency tables are obtained for testing the following hypotheses: (a) that the transition probabilities of a first order chain are constant, (b) that in case the transition probabilities are constant, they are specified numbers, and (c) that the process is a $u$th order Markov chain against the alternative it is $r$th but not $u$th order. In case $u = 0$ and $r = 1$, case (c) results in tests of the null hypothesis that observations at successive time points are statistically independent against the alternate hypothesis that observations are from a first order Markov chain. Tests of several other hypotheses are also considered. The statistical analysis in the case of a single observation of a long chain is also discussed. There is some discussion of the relation between likelihood ratio criteria and $\chi^2$-tests of the form used in contingency tables.

1. Introduction. A Markov chain is sometimes a suitable probability model for certain time series in which the observation at a given time is the category into which an individual falls. The simplest Markov chain is that in which there are a finite number of states or categories and a finite number of equidistant time points at which observations are made, the chain is of first-order, and the transition probabilities are the same for each time interval. Such a chain is described by the initial state and the set of transition probabilities; namely, the conditional probability of going into each state, given the immediately preceding state.

We shall consider methods of statistical inference for this model when there are many observations in each of the initial states and the same set of transition probabilities operate. For example, one may wish to estimate the transition probabilities or test hypotheses about them. We develop an asymptotic theory for these methods of inference when the number of observations increases. We shall also consider methods of inference for more general models, for example, where the transition probabilities need not be the same for each time interval.

An illustration of the use of some of the statistical methods described herein has been given in detail [2]. The data for this illustration came from a "panel study" on vote intention. Preceding the 1940 presidential election each of a number of potential voters was asked his party or candidate preference each month from May to October (6 interviews).
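
For a stationary first-order chain observed on many individuals, the familiar estimate of a transition probability is the observed proportion: the number of transitions from state $i$ to state $j$ divided by the number of transitions out of state $i$. The sketch below computes that proportion from a handful of made-up panel sequences. It is offered as an illustration of the estimation problem posed in the Introduction, on the assumption that this is the estimator intended; the function names and the three short sequences (with "R" and "D" standing for Republican and Democrat) are hypothetical and are not data from the study.

```python
from collections import Counter


def transition_counts(sequences):
    """Count transitions i -> j pooled over all individuals and all times."""
    counts = Counter()
    for seq in sequences:
        for i, j in zip(seq, seq[1:]):
            counts[(i, j)] += 1
    return counts


def estimate_transition_probabilities(sequences):
    """Estimate p_ij as n_ij / sum_j n_ij, assuming stationary transitions."""
    counts = transition_counts(sequences)
    row_totals = Counter()
    for (i, _), c in counts.items():
        row_totals[i] += c
    return {(i, j): c / row_totals[i] for (i, j), c in counts.items()}


# Hypothetical panel: three respondents, three interviews each.
panel = [
    ["R", "R", "D"],
    ["D", "D", "D"],
    ["R", "D", "D"],
]
counts = transition_counts(panel)
p_hat = estimate_transition_probabilities(panel)
```

Pooling over time is what the stationarity assumption licenses; for the nonstationary case the same counting would be done separately for each time interval.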
At
each
interview
intention
at
constant
over
It
and
This
to
both
use
Some
in
the
herein
present
paper.
for
that,
The
of
Hoel
[10] in the
they consider
increases. We
a
a
and
of view
the
of
by
present
and
further
ordered
these methods
of state
"
"
as
the
be
time
and
states
so
discuss
Wi(0)
the
case
as
Bartlett
by
is observed;
time
points
where
that
is
likelihood ratio
the
to
ratio
are
t, given
likelihood
ratio
1, 2,
is referred to
Markov
first-order
tests
"
1 to m,
1,
i at
state
i is
the
Though
actual
no
of
is made
use
"
"
"
time
m;
^
"
1,
"
"
"
T)
are
the probability
he
with
shall deal both
We
1
.
for each
ni(0) individuals
they
[7].
chain.
-jm.
"
[2] or
a political
party, a
(a, b), etc. Let the times of observation
state
same
though they
where
x^-tests,
ordinary contingency
some
between
relation
and
criteria
that i might be, for example,
Pij(t){i,j
with (b) nonstationary

probabihties need not be the
there
ing
test-
hypothesis
pij
ior t
transition probabilities(that is,where
the
of the
of states
of
from
treat
for
given
(a) stationary transition probabilities(that is,Pij{t)
section that
their
5 of the present paper,
likelihood
integer running from
an
at
studied
number
chains, the reader

of
T. Let
"
presented
final section.
of Markov
arrangement,
0, 1,
=^
the
is
related
are
geographical place, a pair of numbers

he
given
and
motivation
been
considered
was
of the
parameters
Let
usually thought of
this
has
as
tables
present both
discussion
of the
model.
also
[2] are
single sequence
theory
contingency
we
discussion
2. Estimation
2.1. The
only
asymptotic
in
in the
x^-tests appears
explains
potheses
general hy-
more
[10].
paper,
procedures.
For
chain
of the
where
hypothesis that
how
it is shown
table
order
the
the
Hoel
given
[2],and
and
transition probabilities,
of the
situation
used
form
generahzation
In
their
methods,
shall discuss this situation in Section
x"-test of the
point
some
these
hypothesis.
methods
new
pothesis
hy-
first mentioned
were
[1] and
in
the
was
for all
simpler.
estimation
of fit and
goodness
of
this
for dealing Avith
of
the im-
at
the methods
in [1] and
given
or
this null
to
differed from
appearing
of
advantage
users
many
of the
problem
[3] and
An
to be
application seem
methods
new
party
probabiUty
the theory and
those
the probability of
conformed
methods, which
newer
residual
probabilitieshold
same
the data
of formulas
corrections
is
some
old and
the
the
extends
different from
somewhat
on
the data
how
see
develops and
paper
are
to
that
specificways
[2].It also presents
in [9],that
how
also in what
present
in [1] and
of interest
was
only
that
on
his intention
(first-ordercase), that such
and
(stationarity)
,
time
individuals.
one
interview
mediatelj''preceding
depended
interview
decided
was
person
the latter being
Democrat, or
Know,"
of
who
had not
primarily
people
consisting
category
candidate. One of the null hypotheses in the study was
voter's
each
"Don't
classified as RepubUcan,
were
time
in state
i Sit t
nonrandom,
random
interval).We
=
while
variables. An
0.
In
I,
"
"
"
T)
the
transition
assume
in this
this
in Section
observation
section,we
4,
on
we
a
shall
given
T.
iiO),i(\),i{2),
T, namely
"
These
possiblesequences.
transition
the
nij{t)denote
show
shall
of rnT
form
the
is
sum
probability, in
for the
individuals
be
i "t
t
m;
"
and
"
1,
"
"
T),
"
observed
for the
whose
t.
set
sequences.
of states
sequence
at
a
is
2(0),
i(t
1)
"
and
dimensions), of
nT
are
i(t)
j. The
for all
all sequences
given ordered
is
"
"
"
"
Pi,T-.u"rATr''-'''''''"
"
\i(T-l).i(T)
^^p."(0"-"'^
products
the
in the
0,
"
"
of
distribution
factorials.
of
"
"
lines
first two
nij{t)form
of numbers
set
actual
Let
1)
"
"
"
set
all values
over
of sufficient
of the
"
,m,
1)
2J7=i nij{t).Then
given ni(t
"
-{- 1 indices.
announced.
as
statistics,
nij(t)is (2.3) multiplied by
the
ni(t
1,
are
appropriate
an
the
conditional
1) (or given nk{s),k
1,
"
"
"
is
'^4^^^^-Updtr'''.
is the
on
numbers
(2.5)
"
PminAir'"''''''')
( n
( U
the
tions
pm-DUD
"""
(2.4)
This
"
describing
space
there
initial state
distribution of nij{t),
j
m;
abilities
prob-
should
(n[p.-(.-i).-(r)(r)r'"^^"''--(nb.(o)m)(i)]"^'"^^''^-"^"0
probabilities
transition
the
form
in state
z's with
of the
\i(0),i{l)
are
dimensional
nmT
"
function
"
m'"
^
[p.(0).-a)(l)
p.-(i)i(2)(2)p.(r-i)i(r)(r)r^-""'^"'-"
The
"
Z)n,-(0),(i)...,(7-)
all values
over
the
of sequences
Thus,
of the
1,
of individuals
the number
(for each
individuals
where
stationary. (When
nij(t){i,j
n,y(0
(2.3)
with
events
"
i(0),there
initial state
of sufficient statistics
set
(2.2)
set
i(T). Then
"
where
0,
PHT-l)i(T)
"
of individuals
of
the set
ni(o)i(i)...i(r) be
"
"
is in at "
throughout.)
the number
that
numbers,
"
"
probabilitiesare
replaced by pm-Diwit)
Let
the
necessarilystationary,symbols
not
are
i(l),
individual
the
i{T). Given
"
Pi{0)ia) PiWHi)
when
Let
of states
represent mutually exclusive
(2.1)
We
"
of the sequence
consists
individual
same
distribution
one
distribution
multinomial
nij(t).The
as
distribution
would
with
of the
obtain
if
one
had
probabilitiesPij(t)and
nij(t)(conditionalon
'"4^^ÛpAtr''''
n ndt)
L
J=i
ni{t
J=l
"
1)
with
obsei-va-
resulting
the ^^(0))is
For
with
chain
Markov
concerning
form
of
set
from
nnp;r''
(2.6)
necessarily stationary
not
minimal
Maximum
likelihood
estimated
be
can
subject of
transition
estimates.
The
the
are
Uij
actual
factor
does
This
depend
not
{i
pi,
on
1, 2,
probabilitiespij (i,j
and
the
easily verified that
that
as
"
1,2,
"
,w,
obtained
m) consists of
"
1, 2,
"""
precisely of the
is
probability
the ith. sample
trials with
samples, it is well-known
for pij
that
1,
to
respect
0 and
observations.
samples, where
multinomial
zli'iîi
estimates
probabilities
stationary transition
probability (2.6) with
the restrictions pa
to
course
independent
such
probabilities Vai^), the nij{t) are
the
by maximizing
form, except for
same
nf
written
be
can
the
npr.
T^pij
the
when
^J=i Uijit)
that, when
i,j
(2.7)
for
n,;
fact
of sufficient statistics.
set
2.2.
Pij
set
the
probability (2.3)
o,i
'=1
Pij
from
result
form
in the
For
follows
stationary, the
are
the
(2.3); namely,
sufficient statistics. This
probabiHties
transition
stationarytransition probabilities,a stronger
sufficiencyfollows
"
"
"
m).
For
likelihood
the maximum
are
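$$\hat p_{ij} = \frac{n_{ij}}{n_i^*} = \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{k=1}^{m}\sum_{t=1}^{T} n_{ik}(t)} = \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{t=0}^{T-1} n_i(t)}. \tag{2.8}$$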
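The estimate (2.8) amounts to pooling the T transition tables and dividing each pooled entry by its row total. The following is a minimal sketch of that computation, not part of the original paper; the function name `mle_stationary`, the nested-list layout `counts[t-1][i][j]` for n_ij(t), and the example counts are all assumptions made for illustration.

```python
def mle_stationary(counts):
    """Pool transition counts over time and estimate stationary p_ij.

    counts[t-1][i][j] is assumed to hold n_ij(t): the number of individuals
    in state i at time t-1 and in state j at time t, for t = 1..T.
    Returns (pooled, estimates), where pooled[i][j] = n_ij and
    estimates[i][j] = n_ij / n_i^*.
    """
    m = len(counts[0])
    # n_ij = sum over t of n_ij(t): add corresponding entries of the T tables.
    pooled = [[sum(table[i][j] for table in counts) for j in range(m)]
              for i in range(m)]
    estimates = []
    for i in range(m):
        row_total = sum(pooled[i])  # n_i^* = sum over j of n_ij
        if row_total == 0:
            raise ValueError(f"state {i} is never occupied before a transition")
        estimates.append([pooled[i][j] / row_total for j in range(m)])
    return pooled, estimates


# Made-up counts for m = 2 states and T = 2 transitions (illustration only).
counts = [
    [[30, 10], [5, 15]],  # n_ij(1)
    [[28, 7], [9, 16]],   # n_ij(2)
]
pooled, p_hat = mle_stationary(counts)
```

Each row of `p_hat` sums to one, mirroring the restriction that the p_ij sum to one over j for every i.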
and
hence
this is also
probabilityis of the
the Pij
on
the parameters
pij
approach used
likelihood
in the
the
are
in
estimates
form
same
other
except
in which
the elementary
for parameter-free factors, and
particular,it applies
In
same.
distribution
the
strictions
re-
of
the estimation
to
(2.6).
the transition
When
for any
true
probabilitiesare
preceding paragraph
applied,and the maximum
stillbe
can
found
for the Pij{t)are
necessarily stationary, the general
not
to
be
/mZ
nikit).
k=l
The
same
consider
the conditional
distribution
are
the
on
numbers
likelihood
maximum
same
a
as
one
multinomial
nij{t).
would
obtain
"
"
"
nij{T)
if for each
distribution
with
is used.
i and
obtained
Pait) are
of nij{t)given ni(t
distribution
of the niy(l),nij(2),
for the
estimates
one
1)
"
as
Formally
had
"
we
the joint
when
these
n,(i
probabilitiesPij{t)and
when
estimates
tions
1 ) observa-
with
resulting
estimates
The
can
estimate
to
tables
way
ptj
for
for
The
Yltriijit).
the
1,
"
"
by the
following
m
is the
of pa
of
estimate
is the
pait)
entries in the ith
of the
sum
entries nij{t)
the
Let
way:
The
table.
order
In
row.
i,
the corresponding entries in the two-
T, obtaining
"
of the
entries
with
table
two-way
i,jth entry
table
given further
n,j
of n,y'sdivided
by
find the
shall
n,y(i).We
the
presented
estimates
in
on.
nij(t).To
of
behavior
likelihood
maximum
of the
structure
consider
first
the
entries in the ith. row.
Asymptotic
2.3.
Pij
will be
this section
in
two-way
stationary chain, add
covariance
The
described
estimate
of the
sum
be
divided
table
in the
jih.entry
in
for given t be entered
T.
nk{0)/2l wy(0)
that
assume
the
of
behavior
asymptotic
"
"
77^
(Vk
0,^
"
Pi(0)i(i) Pi(i)i(2)
the
"
"
variables
variables, and
P
of sequences
seek
we
time
including
the low
order
given
n,(o)(0)
size
n,(0),(i)..w(r)
and
are
parameters
asymptotically normally distributed

combinations
linear
are
nomial
multi-
of these
asymptotically normally distributed.

P'. Then
p]'/is the
of the matrix
state
time
0,
i at
nk-ij(t)be
0. Let
i at time
time
the
number
and j at time
"
t.
Then
of
moments
Uijit)
=
probability associated
set
the elements
k at
(2.10)
The
are
also
are
state
state
hence
nvXO
be
p,-y'
j at
probability of
sample
The
i{0), the
each
For
00.
hence
let
(pij)and
-"
with
and
Pi{T-\)i{T)
"
size increases.
sample
Let
Zl nj{0)
as
multinomial
simply
as
1)
r]k
^nk;ij{t)pi'"^'
pa
nk-ij{t)is
with
with
size of
sample
nt(0).
Thus
Var{n,.;,XO}
(2.12)
(2.13)
Cov
since the
set
us
now
follows
of nk;ij{t)
given
were
examine
procedures. The
to
be
n,(0)pir"po[l Pkf'^P.A,
-n,(0)?)ir'VoWr'V(7A,
Z!y i^k;ij{t);
they
seen
{nk.,i
At), nk;gK{t)\
other variables
Let
nk{Q)pkf'^Pii
"nk;ij{t)
(2.11)
in
will
be
of nk;ij{t)
in
probabilitiespa
-
8{n4;,X0
nk:i{t
-
between
"
theory for
"
1)
Thus,
PijUk-iit
1),
l)pij]
(2.15)
=
SS{[w,.;,;(0 nk-i{t
-
l)po] Ink:i{t
-
1)!
1)
nk-i{t
of nk-ij{t)given nk-i{t
IUk-.iit 1)}
Z[nk;ii{t)
(2.14)
"
obtaining the asymptotic
distribution
the
\)pii where
rik-iH
"
needed
multinomial, with
^).
(S'.
[1].
moments
conditional
Covariances
distribution.
multinomial
ih J)
0.
is
test
easily
The
of this
variance
is
quantity
rik.iit 1)
"[nk;ij{t)
pijf
88{[n,;,v(0
"nk;i{t
1) pijf\ ",;,(^
nk-i{t
1)}
(2.16)
rikiO)pir^'Pij(l
of pairs of such
covariances
The
"[nk-ijit)
"
(2.17)
iik-iit
Z[-nk-i{t
"
"[nk-ij{t)
"
"
l)Pik]\nk.,,{t l)\
-Wfc(O) pkf^^PijPih
nk-i{t
\)pij\{nk-gh{t) nk-g{t
h,
1) pg,]
"
"
S8{[Wi-;i"(0
l)pij][nk;gh(t) rik-git
nk;i(t
=
1) p.u]
"
l)pi,][nk;ih(t) n,;,("
1) PijPih]
"
Pij).
1) Pij\[nk-ih{t) nk-i{t
"
Pij)
quantities are
""{[nk",î{t) Uk-iit
1) Pij{l
"
\)pg}]
(2.18)
Ink-,{t
1),Uk-git
0,
8[Mi.;o(0
=
l)j
+ r)
nk;i{,t l)p^j\\nk-gh{t
8g{K-;iX0
nk;r(t
nk-g(t+
+ r)
l)pij][nfc;0A(^
l)pgi,]
+
Mi.;g(i
^.
l)pg/J
(2.19)
IWt;g(^+
=
To
nk;gh(s)
Since
"
we
and
nu-gis
size
"
ii t ^
1,
variables
Uk-ijit) Uk-jit
variables
uncorrected
Wi(0) fixed, nk-ait) and
assume
"
of multinomial
covariances
The
nkiO)pki~^\
l)pgh are
"
l)pijfor^
nk;ij(t) nk;i{t
variables
variances
probabilitiespij and sample

and
"
0 and
means
1),ni-;i(^ l), ni;iy(0|
0,
summarize, the random
have
"
"
i ^
or
ni-gh(t)are
0.
"
"
"
"
with
l)pij
g.
independent
if A- ^
l.
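Thus

$$\mathcal{E}[n_{ij}(t) - n_i(t-1)p_{ij}] = 0, \tag{2.20}$$

$$\mathcal{E}[n_{ij}(t) - n_i(t-1)p_{ij}]^2 = \sum_{k=1}^{m} n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}(1 - p_{ij}), \tag{2.21}$$

$$\mathcal{E}[n_{ij}(t) - n_i(t-1)p_{ij}][n_{ih}(t) - n_i(t-1)p_{ih}] = -\sum_{k=1}^{m} n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}p_{ih}, \quad j \ne h, \tag{2.22}$$

$$\mathcal{E}[n_{ij}(t) - n_i(t-1)p_{ij}][n_{gh}(s) - n_g(s-1)p_{gh}] = 0, \quad t \ne s \text{ or } i \ne g. \tag{2.23}$$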
Let
T
E Z
(2.30)
fr=l
Then
limiting variance
the
limiting covariance
the
the
Because
pir"
Vk
of the numerator
between
of (2.27) is "i"i
Pij{l
linear
limiting normal
distribution
n'^ {pa
Since
n"^ {pa
Pij(l
have
are
other terms,
the
Q,
(i
"
"
"
1, 2,
Pij from
often
for
for
shall also make
hence
have
the
of the
samples with
the
This
this section.
values
fact
Hence,
can
in
of i
be
or
two
with
sample size
in the iih. state
for
different values
of i
(i.e.,the
the
about
probabilities
1, 2,
pij in
pij{t)
"
"
"
terms
m).
of
methods
of t
It
m
in
terms
standard
the variables Pij{t)
those
to
of
test
1)
of
pendent.
asymptotically inde-
are
similar
"
the estimates
as
1), and
"
nij(t)/ni(t
testinghypotheses concerning the Pij{t)it
reformulate
limiting
same
trials.
different values
proved by
the
of multinomial
sample sizes "ni{t
and
trials,
used
earlier in
will sometimes
T"- independent
procedures
may
then
applied.
3. Tests
of
3.1. Tests
On
covariances
limiting joint
limiting joint distribution as
same
hypotheses
hypotheses
be
pa) for
"
estimates
the
possible to
Ui
sample sizes n({)i(i
samples consistingof multinomial
be
i has
same
probabilitieswith
different
for two
given
of the fact that the variables
use
t have
i and
given
multinomial
pa)
"
limiting
pa) and
"
probabilitiespa
independent samples consistingof multinomial

We
Pij(l
pij(l
pij) has
"
of observations
{ruîf^
{pa
possible to reformulate
be
pa)
"
number
functions
independent
0, variances
means
0, variances
means
asymptotically independent
are
similar
from
obtained
m)
(2.27),the variables
as
{ntf^ {pa
set
of multinomial
variables
\. The
"
with
0, variances
means
^Jlo ni(t).
set {n(t"if
{pa
factors),and
distribution
will
"
with
expected total
"
"
(see,e.g.,
"
"
covariances
The variables (ri4î)^'^(pij

"5i(,pijP9ft/"/"i.
pij)
Also, the
the estimates
as
is the
and
rK/"i, which
of this limit
covariances
distribution with
distribution
"digPaPgh
n*
distribution
and
limit distribution
same
the covariances
where
"îgPiiPoh
In
distribution
joint normal
the variances
limitingjoint normal
limiting joint normal
covariances
and
has the
pa)
"
and
pij)/0i
"
pgh
nomial
multi-
[4]).
Pij)have
"
4"i pij
of normalized
increasing sample size,they have
limits of the respectivevariances
2, p. 5 in
Theorem
and
distribution
the
Pij),and
"
is "dig
combinations
variables,with fixed probabilitiesand

a
different numerators
two
of (2.27) are
numerators
0i
(=1
of
the basis
derive
can
every
pij "
First
we
hypotheses
and
hypotheses
of the
certain
confidence
about
regions.
asymptotic distribution theory

methods
confidence
specificprobabilitiesand
of statistical
inference.
in the
Here
regions.
preceding section,we
we
shall
assume
that
0.
consider
testing the hypothesis
that
certain transition
probabiUties
T.
have
p.j
specifiedvalues
and
{n*f
the
zero,
determine
or
We
1,
"
"
for
m,
"
given
i. Under
test
for
one
the usual
of
/A
which
m
the
A
of the set pij for which
in the denominator
for different i
be
can
instance
(3.1) over
The
use
all
i,resultingin
of the
of
-test
there is
for
good reason
as
goodness of fit,described
of
were
borrowed
3.2.
In
Testing
the
is that the
the
null
the
region
less than
and
"
Markov
1 moves
hence
(i,j
1, 2,
"
"
"
goodness
p,j
ph j
freedom
critical
of the
region
pij for
set
coefficient
sists
con-
n*{pij
are
x^-variables.
other
obtained
be
can
po
pijf
"
(3.1)for different
to obtain
m)
to
x^-distribution
with
variables
forms
by adding
1) degrees of freedom.
"
in the
as
the
the
of fit is discussed
in this section
(according
significancepoint. (The
adopting the tests, which
chain,
to
the
is the
pij
state
transition
t. A
at
(t
pij
of the transition
1,
in
are
[5].We
believe
analogous
situation
from
to
that
x^-tests
which
probabilities are
probability
general
t; let
on
"
"
"
that
they
us
say
T). Under
time
the
t are
"
function
1)
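The likelihood function maximized under the null hypothesis is

$$\prod_{t=1}^{T}\prod_{i,j} \hat p_{ij}^{\,n_{ij}(t)}. \tag{3.3}$$

The likelihood function maximized under the alternative is

$$\prod_{t=1}^{T}\prod_{i,j} [\hat p_{ij}(t)]^{n_{ij}(t)}. \tag{3.4}$$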
to
it is
""'^^^^
pijit)
constant.
individual
an
alternative
probabiUties for
riiit
(3.4)
be added
can
transition probability depends
(3.2)
The
that
of confidence
the
with m(m
x^-variable
hypothesis H:pij(t)
hkelihood
more
consists
replaced by pij.) Since the
hjrpothesis that
the estimates
The
or
one
(see [5]).
stationary
state i at time
about
variables).Thus
significancepoint of
confidence
for all pij
test
pij
degrees of
"
asymptotically independent, the
are
tains
ob-
as
way
\2
significancelevel
(3.1) is
asymptotically independent,
For
of multinomial
than
of freedom.
degrees
with
of this hypothesis at
(3.1) is greater
"
means
Pij
-distribution
asymptotic theory
test
one
more
same
^P^LZ^
n*
i=i
asymptotic
or
testing the hypothesis
an
the
hypothesis
with
asymptotic theory for
standard
use
to
p?jin
on
the null hypothesis
distribution
limiting normal
the null hypothesis,
(3.1)
has
of the fact that under
can
region
consider
specificexample
We
depending
distributions
confidence
use
covariances
As
make
estimates.
normal
or
pij)have
"
and
for multinomial
pij
(pij
variances
multinomial
Pij
in
this assumption
We
Pij{t).
alternate
test
pothesis
hy-
The ratio is the likelihood ratio criterion

$$\lambda = \prod_{t}\prod_{i,j} \left[\frac{\hat p_{ij}}{\hat p_{ij}(t)}\right]^{n_{ij}(t)}. \tag{3.5}$$

A slight extension of a theorem of Cramér [6] or of Neyman [11] shows that $-2\log\lambda$ is distributed as $\chi^2$ with $(T-1)[m(m-1)]$ degrees of freedom when the null hypothesis is true.

ratio (3.5) resembles
Hkelihood
The
in
of homogeneity
tests
further this similarityto usual
equivalent
For
those
to
has
table, which
used
1, 2,
"
have
rows
"
"
"
jth.column
the
to test
the
Pirn
is
equivalent
^y
with
,
equal
pij
in Section
given
as
i and
given
for j
be
can
1, 2,
6.
mates
the esti-
contingency table,
from
multinomial
that
is,in order
random
that
"
"
"
1, such
that
variables represented by the
the
data
the hypothesis that
to
homogeneous
are
there
are
constants
the probability associated
that is,Pij(t)
for t
pa
this
in
1,2,
"
pn
with
"
"
the
T.
"
appropriate here ([6],p. 445); that is,in order
seems
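calculate

$$\chi_i^2 = \sum_{t,j} n_i(t-1)\,[\hat p_{ij}(t) - \hat p_{ij}]^2 / \hat p_{ij}; \tag{3.6}$$

if the null hypothesis is true, $\chi_i^2$ has the usual limiting distribution with $(m-1)(T-1)$ degrees of freedom.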
of the
test
the
in all T rows;
to pa
hypothesis
1) degrees
Another
for
pij{t)
distribution,so
same
is
"
as
appearance
interest is that
this hypothesis, we
if the null
asymptotically
are
same
proof
r.
"
(3.6)
(T
formal
same
of homogeneity
x^-test
The
the
respect. This
Pi2
"
tables.
velop
de-
now
probabilitiesPij{t)for T independent samples. An
hypothesisof
The
pij(t)has
represent the joint estimates
to
and
procedures for contingency
by this contingency table approach
set
of multinomial
for standard
shall
presentedearlier in this section will be
given i, the
obtained
ratios
(see [6],p. 445). We
contingency
that the results obtained
likelihood
tables
hypothesis
trials
test
to
can
of
homogeneity
be obtained
by
for
independent samples
of the likelihood ratio criterion;
use
this hypothesis for the data
given
in the
table,
calculate

$$\lambda_i = \prod_{t,j} [\hat p_{ij}/\hat p_{ij}(t)]^{n_{ij}(t)}, \tag{3.7}$$

which is formally similar to the likelihood ratio criterion. The asymptotic distribution of $-2\log\lambda_i$ is $\chi^2$ with $(m-1)(T-1)$ degrees of freedom.
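For a fixed state i, the homogeneity statistic of (3.6) and the criterion $-2\log\lambda_i$ of (3.7) can both be computed from the T rows n_ij(1), ..., n_ij(T) of the transition tables. The sketch below is an illustration only; the function name `stationarity_statistics`, the layout `counts[t-1][i][j]` for n_ij(t), and the example counts are assumptions, not material from the paper.

```python
import math


def stationarity_statistics(counts, i):
    """Chi-square and likelihood ratio statistics for row i.

    counts[t-1][i][j] is assumed to hold n_ij(t) for t = 1..T.
    Returns (chi_sq, neg2_log_lambda, dof) with dof = (m - 1)(T - 1).
    """
    T = len(counts)
    m = len(counts[0])
    n_prev = [sum(counts[t][i]) for t in range(T)]  # n_i(t-1) for t = 1..T
    pooled = [sum(counts[t][i][j] for t in range(T)) for j in range(m)]
    total = sum(pooled)
    chi_sq = 0.0
    neg2_log_lambda = 0.0
    for t in range(T):
        for j in range(m):
            # Expected count under constant p_ij: n_i(t-1) * p_hat_ij.
            expected = n_prev[t] * pooled[j] / total
            if expected == 0:
                continue  # empty row or empty column contributes nothing
            observed = counts[t][i][j]
            chi_sq += (observed - expected) ** 2 / expected
            if observed > 0:
                neg2_log_lambda += 2.0 * observed * math.log(observed / expected)
    return chi_sq, neg2_log_lambda, (m - 1) * (T - 1)


# Made-up counts for m = 2 states and T = 2 transitions (illustration only).
counts = [
    [[30, 10], [5, 15]],
    [[28, 7], [9, 16]],
]
chi_sq, neg2_log_lambda, dof = stationarity_statistics(counts, 0)
```

The two statistics would each be referred to a $\chi^2$ table with `dof` degrees of freedom; with counts this close to homogeneous they are nearly equal, as the asymptotic equivalence suggests.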
T.
of i.
value
given
relating
the
Hence,
the
to
table
contingency
hypothesis
be
can
approach
dealt
tested separately for each
of i.
value
Let
us
consider the jointhypothesis that
now
1, 2,
"
"
of i
values
the set of
the
m,
"
I,
the fact that
directlyfrom
and
preceding remarks
The
with
m,
"
"
T. A
"
the
random
variables
calculated for each i
1,2,
"
"
for all i
Pij{t)and
1, 2,
the
null
"
"
"
follows
hypothesis
pij for two
under
different
hypothesis,
asymptotically independent,
are
,m
"
pa
joint null
asymptotically independent. Hence,
are
Xi
Vait)
of this
test
sum
TO
x'
(3.8)
E X?
E E riiit
with
based
Similarly,the test criterion
Vi^'I
Va
t,j
limiting distribution
has the usual
DlMt)
m{m
(3.5) can
on
1)(7'
"
"
be written
Ttl
(3.9)
log Xi
-2
log
-2
X.
i=l
of the
3.3. Test
second-order
hypothesis that the

chain. Given
Markov
of being in state
Pijk for t
2,
"
t. When
k at
"
T.
"
{j,k)
at
composite
t
given
of state
form
composite
(h,k), h
chain
9^
with
to
This
representation is useful
chains
Now
and
be
can
let
t, and
the ni(0) and

the
ni(0)
riijkii)
(i,j,
the
let nij{t
1,
different sequences
nij{t
"
because
"
"
1)
m;
of states.
"
of
the
first-
i and
states
composite
state
the probabihty
course,
states
easily
are
transition probabiUties 0.
of the results for first-order Markov
"
The
is well-
as
comphcated
more
composite
certain
in state
the n^Xl)
"
probabihty
special second-order
at
T)
is
conditional
were
set
2, in
"
j at
in this section
assume
extending the idea of
2,
2 and
"
stationary,pukit)
Of
Pijk{t).
The
zero.
i "t t
2.
and
=
as
1 is
"
We
E^ ''^Hkit).
nonrandom,
"
first a
the other hand,
probability
of individuals
"
is
i. On
on
with
some
Section
nonrandom
were
and
states
number
Wij(l) are
the
(i,j) at
state
from
over
be the
riijkit)
in k at
where
carried
is
Consider
the
this,let the pair of successive
j, given (i,j), is
seen
chain
represented
(i,j). Then
state
the
be
can
do
order.
is in state
2,3,---,T)he
depend
not
chain
(see,e.g. [2]).To
chain
a
individual
an
the second-order
Pijk(t)does
known, the second-order
given
first-order stationary chain
for which
chain, one
define
is of
1, let pukit) (i,j,k=l,---,m;t
injatt"
order
chain
that
the
earUer
random
"
1,
that
sections
variables.
The
of sufficient statistics for
distribution
of nijk{t),
given
1),is
(When
the transition probabiUties need
the symbols pijk should, of course,
be
not
be
the
same
for each
time
intervâl,
replaced by the appropriate Pukit)through-
the
of
set
riijk
is the
obtained
be
^r=2
nijk{t)form
of pijk for
likelihood estimate
of
for
was
'^m
;=i
Now
let
against
the
Pi"fc
that
"
T,
"
t.
stronger result
ing
concern-
the
maximum
1).
"
"=2
hypothesis that the chain
it is second-order.
1,
==
"
"
is first-order
hypothesis is
null
The
A:
'Pmik
Vik,
say, fory,
testing this hypothesis is^
"""
ratio criterion for
and
"
null
2,
is
(=2
testing the
alternative
V'2.jk
consider
us
i,
21 Wiyt(0/ 2Z nait
(3.10) over
/m
and
"
first-order chains; namely,
stationary chains
"
of sufficient statistics. The
set
"
probabiUties, a
it
as
I,
product
transition
stationary
sufficiencycan
numbers
nijk{t)for i,j, k
n,;(l) is given,
with
chains
For
of
joint distribution
out). The
when
The
m.
"
that
likelihood
(3.12)
/ pî.)"'"'^
(PiA.
i,i,k=\
where
m
(3.13)
p^vi
maximum
is the
from
likelihood
(2.8).This
random
nij(l) were
were
We
1)
"
1)
"
m{m
the
given j,
estimates
2,
"
"
"
m).
n'^ {puk
X
for t, /e
2,
"
"
"
m,
1, 2,
and
the
"
used
be
"
has
table, which
contingency table, can

and
probabilitiesfor
develop further
now
formal
same
the
distribution
seems
that
for
p^*
Vjk
yuk
appropriate.
To
as
appearance
estimates
hypothesis is
of homogeneity
x^-test
tained
ratios ob-
independent samples {i
the
represent
null
The
m.
"
to
of freedom.
likelihood
shall
the nij{l)
asymptotic x"-
1)^degrees
"
(3.12) resembles
ratio
that
an
"
of multinomial
An
has
procedures for contingency tables.

asymptotic
puk) have the same
this similarityto standard

the
what
some-
earlier section the
assumed
we
log
"2
pjk differs
that
in the
for problems relatingto contingency tables. We
For
T"\
here
see
the fact that
to
hypothesis,
likelihood
the
We
"
in this section
m{m
"
Z) ^7Vfc(0
/ 23 w/O
of y^k
while
the null
rn(m
that
observe
'^m
difference is due
Under
with
distribution
estimate
variables
nonrandom.
w,yfc /HI]
as
1,
a
given j
for
1,
this hypothesis
test
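calculate

$$\chi_j^2 = \sum_{i,k} n_{ij}^*(\hat p_{ijk} - \hat p_{jk})^2/\hat p_{jk}, \tag{3.14}$$

where

$$n_{ij}^* = \sum_{k} n_{ijk} = \sum_{k}\sum_{t=2}^{T} n_{ijk}(t) = \sum_{t=2}^{T} n_{ij}(t-1) = \sum_{t=1}^{T-1} n_{ij}(t). \tag{3.15}$$

If the hypothesis is true, $\chi_j^2$ has the usual limiting distribution with $(m-1)^2$ degrees of freedom.

[Footnote 2] The criterion (3.12) was written incorrectly in (6.35) of [1] and (4.10) of [2].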
analogy with
111 continued
for
homogeneity
by
ANDERSON
W.
use
is
The
formally similar
of "2
of
given value
of
Let
us
2,
"
"
"
log
the
to
of the
test
trials
multinomial
We
hypothesis of
be
can
tained
ob-
calculate
(pjkI ViiLT''\
likelihood
ratio
asymptotic
The
criterion.
with (m
1)^degrees
relatingto the contingency table approach dealt
with
for
each
X" is %
remarks
preceding
value
from
253
GOODMAN
A.
3.2, another
independent samples
X,
distribution
Section
LEO
of the likelihood ratio criterion.
(3.16)
which
AND
the
j. Hence,
of freedom.
"
hypothesis
tested
be
can
separately
j.
A
m.
the
consider
now
of this joint hypothesis
test
that
joint hypothesis
p,yt
can
for all i, j, k
pjk
be obtained
1,
by computing the
sum
X^
(3.17)
Hxj
which
Similarly the
criterion based
test
Vik
j.i.k
with
Umiting distribution
the usual
has
Z) n*j(pijk VikfI
m(m
of freedom.
written
be
(3.12) can
on
1)^degrees
"
-2
log Xy
log
-2
-=
X)
(3.18)
/ p"J
log [pijk
nijk
''"
^='
X)
[logpijk
nijk
log pjk].
ijk
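The joint test of first order against second order in (3.17) and (3.18) needs only the triple counts n_ijk. The sketch below shows one way the two statistics might be computed; the function name `first_vs_second_order`, the layout `n[i][j][k]` for n_ijk, and both example arrays are assumptions made for illustration and do not come from the paper.

```python
import math


def first_vs_second_order(n):
    """Statistics for testing p_ijk = p_jk (first order) within second order.

    n[i][j][k] is assumed to hold n_ijk, the pooled number of individuals
    observed in states i, j, k at three successive times.
    Returns (chi_sq, neg2_log_lambda, dof) with dof = m (m - 1)^2.
    """
    m = len(n)
    chi_sq = 0.0
    neg2_log_lambda = 0.0
    for j in range(m):
        n_star = [sum(n[i][j]) for i in range(m)]  # n*_ij = sum over k of n_ijk
        column = [sum(n[i][j][k] for i in range(m)) for k in range(m)]
        total = sum(column)
        for i in range(m):
            for k in range(m):
                if n_star[i] == 0 or column[k] == 0:
                    continue  # no observations for this row or column
                # Expected count under the null: n*_ij * p_hat_jk.
                expected = n_star[i] * column[k] / total
                observed = n[i][j][k]
                chi_sq += (observed - expected) ** 2 / expected
                if observed > 0:
                    neg2_log_lambda += 2.0 * observed * math.log(observed / expected)
    return chi_sq, neg2_log_lambda, m * (m - 1) ** 2


# Made-up triple counts for m = 2 (illustration only).
# In `flat`, the split over k does not depend on i, as a first-order chain implies.
flat = [[[10, 30], [8, 8]], [[5, 15], [4, 4]]]
# In `skewed`, the split over k clearly depends on i.
skewed = [[[30, 10], [8, 8]], [[5, 15], [2, 14]]]
flat_stats = first_vs_second_order(flat)
skewed_stats = first_vs_second_order(skewed)
```

For each fixed j the inner double loop is the usual homogeneity computation on the m by m table of n_ijk; summing over j gives the joint statistics.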
preceding remarks
The
Let
{i,j,
pij...ki
I at
state
and
state
time
"
"
"
i at time
is
(t
"
chain
directlygeneralized
be
can
k,
"
"
1, 2,
"
"
r,r
-{- 1,
of order
"
"
"
"
and
"
state
it is not
m) against the alternate hypothesis that
chain
of order
at
time
r.
an
for i
Pj-ki
"
-{- I
"
hypothesis
shall test the null
(that is,Pij...ki
"
"
T). We
"
for
the transition probability of
denote
m)
"
k at time
t, given state
that the process
2,
"
1 but
I,
r-order
an
chain.
denote
nij...ki{t)
Let
respectivetimes
the
We
^7=1 'nij...ki{t).
r, t
here
assume
(3.19)
where
-\- I,
"
Pij...ki
nij...ki
"
"
"
the
"
.ki
For
as
2,
"
"
"
set
j,
estimates
m), and
"
"
k,
"
let nij...k(t
"
are
nonrandom.
I at
1)
The
nij...ki/n*j...k
,
and
Xr=r nij...ki(t)
n*j...k X
given
1)
i, j,
is
r-i
(3.20)
states
I, t, and
"
the nij...k(r
that
of pij..
likelihood estimate
maximum
frequency of
the observed
"
"
"
"
nij...ki
fc,the
be
^ nij...k(t 1)
set Pij...ki will have
of multinomial
may
"
the
probabilitiesfor
represented by
an
same
n,j...k{t).
asymptotic
tion
distribu-
independent samples {i
table.
If the null hypothesis
('P:j-ki
for
Pj--ki
1, 2,
"
"
is true, then
m)
"
the
of homogeneity
-test
appropriate, and
seems
xl-k
(3.21)
Pj...kif
1^ n*j...k(pij...ki
I Pj...ki
where
(3.22)
the usual
has
that
here
assumed
in this
differs somewhat
from
the
the
m~^
2Zi, -,* Xy
sum
"
"
"
"
"
Another
and
sets
j,
"
"
""
all values
test of the null
from
1)
"
(j
1, 2,
see
for
1 to
"
"
m;
"
"
; k
"
1, 2,
"
k)
m),
"
"
if
"
for i
is true.
obtained
be
can
"
"
ni'~\m
Pj-.-ki
"
while
Pj...ki
fixed.
are
hypothesis {pij...ki
of j,
hypothesis
"
1)-order chain,
"
hmiting distribution with
the joint null
under
We
estimate
parameters
that the nj...ki{r
",
(r
an
with
variables
the usual
will have
-fc
degrees of freedom
1,2,
random
assumed
have
we
are
1), for
"
be multinomial
of freedom.
likelihood
maximum
the fact that the nj...ki(r
to
paragraph
1)^degrees
"
This
?^y..-i^(0/2Z^=r-2
nj...k(t)).
(viz.,2ir=r-i
l)-orderchain
"
to
Since there
(m
difference is due
are
21 n*j...k 22 nj...kiit)
/ 2^ nj...k(t),
with
(r
an
I
n.ij...ki
pj...ki
for
Pj...ki
pj...ki
by
of the likelihood
use
ratio criterion
(3.23)
\j...k
=
",
{Pr..ki/Pij...ir'
i.l
"2
where
asymptotically as x^ with
log Xj.-.k is distributed
of freedom.
2Z {-21ogXy...t}
nij...ki\og{pij...u/pj-ki)
i,j,-'',k,l
j,---,k
with m~^{m
limiting x^-distribution
joint null hypothesisis true (see [10]).
a
In
where
the special case
successive
at
time
1, the
points are
=
alternate hypothesis that observations
is
the process
is of order
order
will note
reader
The
chain
be
can
of order
is of the
test
to
these statistics
from
test
when
section, we
for each
used
to
hypothesis that
test
null
the
servations
ob-
hypothesis
in
it is of order
(m
this section, we
asymptoticallyas
"
can
ratio and
x^ with
[m''
"
that
r).By
an
compute
"
proach
ap-
the
that
observe
m"]im
it
is of
the process
the null hypothesis that
earUer
hypothesis that
grees
1) de-

have
assumed
that
the
transition
probabihtiesare
interval,that is,stationary. It is possible to

has
the rth order chain
stationary transition
time
the
first-order chain.
1 against the alternate
"
when
hypothesis that
null
the logarithm of the Hkelihood
distributed
are
of freedom
this
times
presented
of freedom
statisticallyindependent against the
are
method
the
generalizedto
that
"2
X^-criterion
or
same
that
\f degrees
"
against the alternate hypothesis that

similar
In
if degrees
"
Also,
(3.24)
has
(m
test
the
the null
probabihties
where
naff,^v
When
2Jf=iiia^.^vit)-
hkelihood
Under
of
the number
{A
\){B
\){AB
modified
4. A
-qi
is
imum
max-
where
,
has
asymptotic x"-distribution,and
an
ABÂB
1)
"
A(A
"
"
1)
B{B
"
1)
"
-^ A+B).
An
probability
with
log
"2
of freedom
In
model.
nonrandom.
were
is
hypothesis,
degrees
the null hypothesis is assumed, the
of Pa".iiv is qa^ r^y
estimate
null
the
the
is that
the ni(0) are
size
Then
alternative
and
assumed
preceding sections,we
sample
n.
the
(2.5)multiphed by the marginal distribution
distributed
distribution
of the
the
that
multinomially
of the
/i/(0)which
set
ni(0)
n,j(0
set
is

$$\frac{n!}{\prod_{i=1}^{m} n_i(0)!}\prod_{i=1}^{m} \eta_i^{\,n_i(0)}. \tag{4.1}$$

In this model, the maximum likelihood estimate of $p_{ij}$ is again (2.8), and the maximum likelihood estimate of $\eta_i$ is

$$\hat\eta_i = \frac{n_i(0)}{n}. \tag{4.2}$$
The
variances, and
means,
taking
the
expected
nfc(O)replaced by
nrik
Since nk{Q)/n estimates

of n'^ {pij
"
pij) are
asymptotic theory
The
starts
of (2.20) to
values
Also
.
T]k
as
n,XO
~"
(2.23); the
î(^
in Section
of the tests
given
and
2.4. It follows
in Section
stationary state; that is,if
k=l
uncorrelated
variances
from
Vi-
found
l)pij are
these
and
with
by
with
apply
^.^(0).
covariances
facts
for this modified
simplify somewhat
covariances
^VkPki
3 hold
"
formulas
same
^)Pij are
(4.3)
ni{t
"
consistently,the asymptotic
asymptotic variances
from
of nij{t)
covariances
that
the
model.
if the
chain
T.
then
For
from
/_,
of
available.
known
efficient
have
for the
rji
and
Tj{
ignored.
subject
the
of
case
likelihood
Pij
Lagrange
multipliers can
subject
observation
asymptotic
results
00, while
has
when
shown
"
"
"
the
I and
"
m)
from
for
Section
2.4. This
we
stationary
the
7?i
log
the
zljPii
the
not
because
relevant
estimates
/mAO)
1, Pa
the
obtained
=
maximizing
Vj
for
equations
known,
are
th
by
zlt ViPa
1)
is
mates
esti-
likelihood
where
are
it
are
state
pa
case
zLîjlog p,_, +
Sj Vj
Vj
"
are
likelihood
state
maximum
same
for
as
likelihood
of the
chain
is
the likelihood
0.
Pij
maximum
the
the process
is of
[10],and
Hoel
the
that
time
t, for t
"
rii
probabiUties
an
"
that
0 and
means
in
proved
Wi(0)
as
on
"
"
the
is of order
r{u
r), which
"
alternative
to
Bartlett's
Pij...ki
Pa-.-ki
(i
argument
have
pa)
"
and
same
given
Gardner
=
test
1 is
and
are
that
essentially
For
tests
the
to
(specified).
is
case
3.1
the order
parallelto
x
-test
the
hypothesis
can
3.3,
be
are
example,
where
in Section
in Section
ratio
3.3. The
the
alternate
presented
likelihood
it is
in
[8].
procedures
chains.
the
like
limiting
covariances
L. A.
1,
that
m), and
"
in Section
given
the generalizationsof the
n,j/nt
test
criterion for this test
order
"
possibly nonstationary
also
given by the usual
0. An
by
against the alternate hypothesis
the process
"
and
co
Hence,
="
pa
observations
are
variances
different way
asymptotically
estimates
of i
in
was
(see [3],p. 91). He
1, 2,
is
T,
"
(n*)^'^
(pij
the
1)
oo). Bartlett
-^
sequence
covariances
(j
pij
the variables
criterion [10] to
"
different values
applicablefor large T. Also, the x"-test presented

provide
"
(T
observed
the
independent
to
(n
they consider the asymptotic
appropriate
tests
ratio
sequence
likelihood
maximum
of states
increases
times
at
z27=iî(O)
hence
asymptotic variances
was
such
ratio
is that
one
previous sections,
the asymptotic theory for T
this hypothesis, and
hypothesis
and
oo
'positivelyregular'situation
fixed and
for
except
-^
observed
of observation
of
Uij
Wi(0)
of
the
and
with
that
see
for
case
for two
result
Hoel's
to
in the
where
case
general
more
great length. In
that
2.4 shows
distribution
valid
for the
of
multinomial
of Section
Thus
starts
use
of the
maximum
chain
in state
in the
covariances
normal
the
the
obtain
number
formulas
asymptotic
that
general
to
of times
([3],p. 93)
multinomial
2,
more
stationary
[3]and
Bartlett
z27=iîj) have
even
used
fixed. The
normally distributed
(n*
in
presented
were
by
that
i at time
has
the number
shown
state
or
the
1, "î ViPa
restrictions
be
on
was
studied
been
theory
the
chain
additional
some
estimates
used
in
'^jPa
of
be
by maximizing
estimates
to
the
estimates.
5. One
has
have
chain
chain
that
obtained
the restrictions
to
hood
of
obtained
are
pij
and
-m
with
the special case,
In
of the
estimates
special case
is
maximum
(4.3) holds,
zLnijlog
Ti)i If it is known
"f)^
in this paper
The
case.
in the
0- In
rji
the
knowledge
dealt
whether
information
log
and
Vi
when
-pn
We
for this
0,
stationary state, equations (4.3) should
estimation
not
pi!"
Vk
for
null
is that
are
also
generalized
criterion [3] for testing
6. $\chi^2$-tests and likelihood ratio criteria. The $\chi^2$-tests presented in this paper are asymptotically equivalent, in a certain sense, to the corresponding likelihood ratio tests, as will be proved in this section. This fact does not seem to follow from the general theory of
from
those
that
x^-tests
in each
individuals
2.1) as
need
consider
not
herein
For
small
are
the
to
based
are
may
samples,
chain
preferred (see
asymptotic distributions
samples
known.
will be
power
when
shall
now
that
prove
the
p's have
relevant
this will be
statistics
ratio
the
of the form
but
to
under
null
here
can
be
herein, and
Let
us
under
applied
consider
From
mi{t)
p,j(l
by
X
usual
"
the
formally
are
show
well
similar
[nmi{t
tingency
con-
that
the
In
particular,
to
prove
that
the likelihood
appropriate
of proof presented
method
-distribution
lence
question of the equiva-
the
in
the
other
sections
oo
x^-statistic
(3.8) under
that n'^ (pijit)

"
Pii(l
variances
or
Va)
Vij]have
are
the
'Pij)/mi{t
are
null
pothesis
hy-
asymptotically
"
"
different i, they
1)]^'^
[piXO
"
whenever
the
asymptotic
an
certain
log \i is asymptotically
"2
it has
mogeneity
ho-
of the
0 and
to
ratios, have
where
as
different t
"ni{t)/n. For
in
used
their motivation
distribution.
hypothesis. The
see
means
form
used
be
can
shall discuss
we
as
2.4, we
with
be preferred
asymptotic x^-distribution
appropriate statistics given

=o
to
are
(see (3.6)).In order
x"
therefore
distribution
Section
1), etc.,
asymptotically
ances
asymptotically vari-
2Zt fîit
1). Then
p*1) pij(t)/^t'ini{t
has
^nrwi
p*jf/p*i
x^-theory,
asymptotic
an
(t
l)[pij{t)
Ptj),etc. Let
-distribution under
(6.1)
the
T
the
independent. Then
the
then
the alternate
to
also where
normally distributed
where
shall
hypothesis. Then
of the tests
small
x^-tests
(tests of
the
an
of proof
actually hkelihood
and
the Xi-statistic,
equivalent
for
specificalternate
methods,
and
ratio
has
x^-statistic
method
X, (see (3.7)),which
are
tests
of the
are
which
approach
tests
asymptotically equivalent in
are
of the form
not
of
tive
related to the rela-
there is
appropriate limiting normal
asymptotic distribution,we
the
3.2
the
that
for statistics
true
criterion
likelihood
hypothesis. The
null
and
simpler.
the
in Section
shall show
First,we
be
of the
deciding which
of these
decide
to
somewhat
large and
users
many
to
chain
relative rate
power
which
x^-tests,
of the
advantage
presented
sense.
relative
for
tentatively suggested
their application seem
under
the
accumulated
[5]).The
this section,a method
In
tables, is that, for

We
and
been
in
the sample size is moderately

An
hypothesis.
and
has
comments
is not
on
Markov
model.
data
enough
not
be
to
from
extremely general, while the x^-tests

presented
be
Markov
obtained
of
(see Section
m^ categories
sequences
x'^-tests
based
of interest. The
having been
as
different
are
directly by considering the number
possible mutually exclusive
the data
on
presented herein
-tests
obtained
be
variables
hypothesis
the alternate
tests
of the
the multinomial
-tests; the
can
"
"
"
the
null
"
hypothesis. But
p
lim
(p*
pij)
because
(6.2)
the
From
and
(^
p lim
in probability of (p*
pa) and {niiit) ni{t)/n),
it follows that
n'^ (piAt)
Pa) has a limiting distribution,
convergence
the fact that
"
"
"
[nZ^-^^ ^^^f^ ^*^^' ^ rjMj^JliMLlljl]

o.
"
(6.3)
miit))0.
p lim
"
Pij
Pij
Hence, the $\chi^2$-statistic has the same asymptotic distribution as $\sum n\,m_i(t-1)[\hat p_{ij}(t) - \hat p_{ij}]^2/p_{ij}$; that is, a $\chi^2$-distribution. This proof also indicates that the $\chi_i^2$-statistics (3.6) also have a limiting $\chi^2$-distribution. We shall now show that $-2\log\lambda_i$ (see (3.7)) is asymptotically equivalent to $\chi_i^2$ under the null hypothesis; and hence will also have a limiting $\chi^2$-distribution.
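The asymptotic equivalence claimed here can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes the data for one fixed initial state i are arranged as a table `counts[t][j]` (a hypothetical layout) holding the number of transitions from i to j in period t, and it computes both the Pearson statistic and the likelihood ratio statistic against the pooled estimate. The function name is illustrative only.

```python
import math


def stationarity_statistics(counts):
    """Return (chi2, neg2loglambda) for one initial state.

    counts[t][j] is the observed number of transitions into state j
    during period t (hypothetical layout, assumed for this sketch).
    """
    periods = len(counts)
    states = len(counts[0])
    row_totals = [sum(row) for row in counts]  # plays the role of n_i(t-1)
    col_totals = [sum(counts[t][j] for t in range(periods)) for j in range(states)]
    n = sum(row_totals)
    pooled = [c / n for c in col_totals]  # pooled estimate of p_ij

    chi2 = 0.0
    neg2loglambda = 0.0
    for t in range(periods):
        for j in range(states):
            expected = row_totals[t] * pooled[j]
            observed = counts[t][j]
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
            if observed > 0:  # observed > 0 implies expected > 0
                neg2loglambda += 2.0 * observed * math.log(observed / expected)
    return chi2, neg2loglambda
```

When the period-specific estimates are close to the pooled estimate, the two returned values differ only by terms of third order in the relative deviations, which is the content of the argument that follows.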
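We first note that for $|x| < \tfrac12$

(6.4)  $(1+x)\log(1+x) = (1+x)\left(x - x^2/2 + x^3/3 - x^4/4 + \cdots\right) = x + x^2/2 - (x^3/6)(1 - x/2 + \cdots),$

and

(6.5)  $\left|(1+x)\log(1+x) - x - x^2/2\right| = \left|(x^3/6)(1 - x/2 + \cdots)\right| \le |x|^3$

(see p. 217 in [6]). We see also that

(6.6)  $-2\log\lambda_i = -2\sum_{j,t} n_{ij}(t)\log[\hat p_{ij}/\hat p_{ij}(t)] = 2\sum_{j,t} n_i(t-1)\,\hat p_{ij}(t)\log[\hat p_{ij}(t)/\hat p_{ij}] = 2\sum_{j,t} n_i(t-1)\,\hat p_{ij}\,[1 + x_{ij}(t)]\log[1 + x_{ij}(t)],$

where $x_{ij}(t) = [\hat p_{ij}(t) - \hat p_{ij}]/\hat p_{ij}$.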
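The difference $\Delta$ between $-2\log\lambda_i$ and the $\chi_i^2$-statistic is

(6.7)  $\Delta = -2\log\lambda_i - \chi_i^2 = 2\sum_{j,t} n_i(t-1)\,\hat p_{ij}\left\{[1 + x_{ij}(t)]\log[1 + x_{ij}(t)] - x_{ij}(t) - [x_{ij}(t)]^2/2\right\},$

since $\sum_{j=1}^{m} \hat p_{ij}\,x_{ij}(t) = 0$.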
i.e., for any
probability;
tends
e, under the null hypothesis,
probability satisfies the relation
that A converges
of
the
relation |A | "
probability
shall show
We
Ei ni{t)
"^
Pr{ IA I "
(6.9)
[x.mm.
0,
(6.8)
log [1 + xiM
xiM
The
00
.
e} ^ Pr-{IA 1 "
^
Pr{ 12
Pr{2n
to 0 in
to
"
0, the
unity as
IXij{t)
\" U
and
and |XiXO \ "

1"
J^j,t
riiit l)pii[Xii{t)f
"
Y.j.t
I'"
IXiiit)
"
and
\"h].
]Xij{t)
h\
It is therefore
only
necessary
[pait)
Xij(t)
Since
the null
hypothesis, and
(6.10)
v^^;Mn x,m
to
that
prove
n[Xij{t)]converges
pijl/pij
converges
"
to
in
zero
in
to
probability.
probability under
{[^"%~
^'']
[-^-^^^]}
v^^;m^
'
it follows that
n[xiM'
(6.11)
to
converges
Since
and
the
has
"
-statistic has
log X,
method
The
log Xi and
in
converges
null
Xi
also
"2
under
probability to
zero,
log
would
It
apply
to
in Section
we
the likelihood ratio criterion under
as
have
the alternate
suppose
t, s, i,j. It is
to
easy
under
increases, the
may
Pait)
also
moved
hypothesis,
This
be
can
be
can
We
when
we
shall
alternate
Since
(6.12)
the
to
null
that
the two
the
and
tests
hypothesis given
is
statistic
in
the
approach
(xôr
"2
some
to
as
some
ratio test
are
of Pij(t)
as
close to 1 in large
hypothesis
If the
closer
move
values
the
to
of
null
again asymptotically equivalent.

of the
proof of asymptotic
in this section
(see also [5], p. 323).
the
null
preferred
of these
is
rejected
specified critical
the likelihood
to
"2
when
tests
hypothesis
(stochastically) larger than
sense
for
[11]).
increases.
is true.
kept fixed, then
are
is not
log X) exceeds
be
Pais)
tests, the alternate
are
kept fixed. Since
is
x^-test
limiting
hypothesis
the comparison
to
likelihood
same
the likelihood
fixed but
the
null
null
is,Pij(t)^
slight modification
1)
"
the null hypothesis.
the power
not
X,
m(m
actually likelihood
not
are
to
"2
of statistics
that, under
other words, if the values
hypothesis
hypothesis are
seen
value,
ratio
log
test
under
hypothesis.
ni(t) is
converges
test
applied
1 (see [5] and
by
x^ is
ni(t)/n converges
/n
to
hypothesis
statistic
the alternate
tends
deduced
an
appropriate
might decide that
if the
-test
In
suggest another
now
comparisons between
make
to
alternate
it
is true; that
the situation in which
closer
equivalence under
the
of each
3.2
-statistic has the
hypothesis.
hypothesis and the significance level
power
for the
the
3.2
where
case
hypothesis
that both
see
order to examine
samples and
the
alternate
any
for the alternate
be
refer to
remarks
previous
In
another
in Section
Xt since they
proof that the
distribution
consistent
x? +
asymptotic equivalence
proved
ratios.) Hence,
Now
log X,
"2
with
(The proof
not
the
was
"
has
show
to
and
of freedom.
ratio criterion, but
The
Q.E.D.
the null hypothesis,
the null hypothesis.
used
be
log
hypothesis, "2
1) degrees
"
-distribution
presented herein for showing the asymptotic equivalence of

could
of the form
(T
x"
"
limiting
under
the
probability when
in
zero
Xij{t)f
[ixiAt)ny''
in
in
linear
combination
probability to its
probability
"
of multinomial
expected value
variables, we
to
m,(t
see
that
"[ni{t)/'n] mi{t).Hence,
=
l)[pii(t) pijlVpii
-
and
log X)/n
"
in
converges
(6.13)
probability to
Mt
Dpiiit)log [pijit)/pi,],
i,},t
where
(6.14)
pij
XI Pijit)mi(t 1)/ Xl niiit
(6.15)
is
alternative
tests
are
for the
applied,
simple alternatives
(see p.
decrease
the
as
alternate
reducing the
hypothesis
be
(b)
of the
statistic
has
form
be
where
en,
limit
quite different from
approximate computations
it will tend
and
the
can
we
suggest which
points
enough
limit
information
the
seen
methods
The
x^ and
to
the
for
least it
test
to
I, or
(b)
this
of
the
methods
certain sense,
alternate
determining
the
of
the
test
herein
critical value
the
critical
question
some
from
see
will tend
test
to
is greater than
of the
the power
ratio
test, and
if
the
some
preferred.
the significance level
of
asymptotic theory
so
II
particular Type
some
as
the
-test
does
not
error
give
limits
of stochastic
comparison
the
significance
can
also be
used
study of
in the
ordinary contingency tables. We
for
x^ and
hypothesis
which
the
hypothesis.
powers.
comparison discussed
ratio
0 if
is to be
vary
usual
problem,
null
desirable sequences
that
seems
en). While
handle
comparison of
likelihood
that, in
when
method
c' and
lie between
suggest
may
(or at
likelihood
of the
power
to
find that
appealing approach is to
However, a more
ratio of significance level to the probability
approaches
made
When
be
comment
be
can
really suitable), we
of
of type
the
increases.
?i
the power
that
Hence, by this approach
limit.
error
to
(there may
is
critical value
for the
as
constant
an
compared
hypothesis.
(a) is used, then
will increase
is
of
steadily closer
moved
preceding paragraph
the stochastic
is less than
and
computed
alternate
when
case
related to Cochran's
chance
[3]. If method
in
log X)
"
this form
stochastic
can
(xôr
in the
remarks
can
discussed
was
the
whether
to
in the
limits. When
usually the
be
can
is somewhat
tests
increases, thus
Method
value
included
are
as
[5]) that either (a) the significance probability
in
323
that
comparing
for
limits
stochastic
these
then
method
This
composite hypothesis,
some
0,
Pij)/pijis
{pij{t)
"
the two
between
is
differ from
limits
If
is better.
test
difference
small
stochastic
two
suggests which
only
Pijf/i^pD-
hypothesis, these
will be
small, then there

the
l)[piAt)
of them
computation
and
Zwii (t
pij
n-"c!0
(6.13) is approximately
alternate
the
Under
lim
(6.12) and
between
difference
The
1)
likelihood ratio methods

and
is true
is to be
fixed, and
we
are
have
not
have
equivalent
suggested
preferred.
REFERENCES
[1] T. W. Anderson, "Probability models for analyzing time changes in attitudes," RAND Research Memorandum No. 455, 1951.
[2] T. W. Anderson, "Probability models for analyzing time changes in attitudes," Mathematical Thinking in the Social Sciences, edited by Paul F. Lazarsfeld, The Free Press, Glencoe, Illinois, 1954.
[3] M. S. Bartlett, "The frequency goodness of fit test for probability chains," Proc. Cambridge Philos. Soc., Vol. 47 (1951), pp. 86-95.
[4] H. Chernoff, "Large-sample theory: parametric case," Ann. Math. Stat., Vol. 27 (1956), pp. 1-22.
[5] W. G. Cochran, "The $\chi^2$-test of goodness of fit," Ann. Math. Stat., Vol. 23 (1952), pp. 315-345.
[6] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, Princeton, 1946.
[7] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, John Wiley and Sons, New York, 1950.
[8] L. A. Goodman, "On the statistical analysis of Markov chains" (abstract), Ann. Math. Stat., Vol. 26 (1955), p. 771.
[9] L. A. Gardner, Jr., "Some estimation and distribution problems in information theory," Master's Essay, Columbia University Library, 1954.
[10] P. G. Hoel, "A test for Markoff chains," Biometrika, Vol. 41 (1954), pp. 430-433.
[11] J. Neyman, "Contribution to the theory of the $\chi^2$-test," Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, 1949, pp. 239-274.
which
those
permit
kind
of the second
be also included
to
in
description of
unitary
stochastic
behavior.
choice
(t,t +
time
"implicit"response
which
Particular
re
Considered
It is believed
the
that
which
hypotheses upon
description is based
choice
most
applicable to
However, the
situations.
of
derivation
mathematical
model
be
can
hypotheses
readily applied to experimental data
additional
without
assumptions is
for a cerachieved
tain
more
conveniently
which
these
from
may,
in
taken
to
certain
equivalent to
usually classified
there
VTEs
are
the
as
partial
VTE.
situations in which
some
observed
not
are
be
circumstances,
be
response
But
an
with
specific interpretation is given
term
"implicit response." It
underlying
the stochastic
are
the
to
of
occur
kind
the
is associated.
the parameter
No
Situations Which
Choice
there will
At),
and
would
unlikely to be present. In these

the "implicit" response
be
cases
may
regarded as a tendency to make a given
or
might perhaps be given
response,
some
physiological interpretation.
seem
probabilities of the various

"implicit" responses
ring
occurconsists of experiments where
edge
knowlconsidered to be independent
are
of the outcome
of
correctness
or
of one
another.
So that for given conditions,
is not
available to the S
a
response
of
each
kind
implicit responses
until after the choice has been made.
intervals unare
affected
appearing at random
junctive
Thus, for example, most
ordinarj^disof other
by the appearance
time studies are
reaction
not
It follows from
implicit responses.
considered because the S in these experiments
the first assumption that the distribution
class
of
situations.
This
The
class
kinds
match
can
a
known
the
requirement.
class
of situations
is not
considered
of
with
of the
Nevertheless,
which
trivial
can
one.
be
is
It
others
(a) Discrimination experiments, including most conventional psychophysical procedures in this category,
(b) Studies of preference and conflict,
(c) Investigations of learning in choice situations.
includes among
his response
The next section of the paper is mainly concerned with the events supposed to be taking place during a single experimental trial.
intervals between successive implicit responses of a given kind is exponential and is determined entirely by the response parameter [e.g., see Feller, 1950, p. 220].
Assumption 2. It is assumed that a final choice response is made when a run of K implicit responses of a given kind appears, this run being uninterrupted by occurrences of implicit responses of other kinds. K may either be assumed to take a particular value or can be regarded as a further parameter, which can be estimated from experimental data.
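The two assumptions can be sketched as a small simulation of a single trial. This is an illustrative sketch only, under stated assumptions: the two kinds of implicit response are taken to arise from independent Poisson processes with rates `alpha` and `beta`, which is equivalent to one merged process of rate `alpha + beta` whose events are of kind a with probability `alpha / (alpha + beta)`. The function name and return layout are hypothetical.

```python
import random


def simulate_trial(alpha, beta, K=2, rng=random):
    """Simulate one trial; return (choice, number_of_implicit_responses, latency).

    alpha, beta: rates of implicit a and b responses (Assumption 1).
    K: length of the uninterrupted run that triggers the overt choice (Assumption 2).
    rng: the `random` module or a random.Random instance.
    """
    total = alpha + beta
    elapsed = 0.0
    last_kind = None
    run_length = 0
    count = 0
    while True:
        # Waiting time to the next implicit response of either kind.
        elapsed += rng.expovariate(total)
        kind = "a" if rng.random() < alpha / total else "b"
        count += 1
        run_length = run_length + 1 if kind == last_kind else 1
        last_kind = kind
        if run_length >= K:
            return kind.upper(), count, elapsed
```

With `K=1` the first implicit response decides the choice; with `K=2` every sequence is a simple alternation ended by a repeated response.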
The Stochastic Model
Assumptions
The notions upon which the model is based are very simple and involve only two assumptions:
Assumption 1. It is first assumed that, for given stimulus and organismic conditions, there is associated with each possible choice response a single parameter. This
parameter
probability that
in
determines
small
interval
the
of
has been employed before. Mueller (1950) has used this approach to describe the intervals between bar-presses in an operant conditioning experiment where only one response is involved. For the same situation, Estes (1950) and Bush and Mosteller (1951) have used an assumption which is very similar, the only difference being that their models used discontinuous rather than continuous distribution of responses
keep the exposition as brief
in order to
as
consideration
possible,
in this paper
to situations
involving
Christie (1952) in discussing will be limited

choice
between
of response
a
abilities
prob-
in time.
only
determination
the
in
discrimination
the
used
has
experiment,
assumption
same
for situations where
two
are
responses
Finally, the author of the present paper (Audley: 1957, 1958) has previously used the same notions
competing.
combine
to
response
in
Bush
1.
in
an
been
has
it
all
and
analysis of
that
(1955),
Mosteller
with
in
times
response
model
continuous
amples,
ex-
assumed
situation, have
runway
these
considered
K
"
There
several
are
Firstly,when
of
for all alternatives
identicallythe
same,
be
shown
and
will be
and
have
elaborated
possible
two
and
called A
and
the
Let
the
with
jS. Assumption
and
that
means
occurring
+ A/) is
an
{t,t
interval
time
small
parameters
be
responses
two
of
p{a),the probability
in
be
will
responses
will be labelled
respectively.
2.
B, and implicit responses
kinds
of the two
b
situation with K
given by:
$p(a) = \alpha\,\Delta t$ [1a]
1.
$p(b) = \beta\,\Delta t$ [1b]
Similarly
1,
times
response
can
presented.
general case
more
two-choice
sults
Re-
be
can
distributions
be
which
reasons
for assuming that K "

i^
1, but not if i^ "
be advanced
the
in
is relatively
only this
that
so
elsewhere.
associated
to
the
Furthermore,
2,
will
derived
1, but
generalization does not appear

have
been previously employed
situation involving choice.
this
for the
been
learning behavior. The
individual
However,
sponse
re-
simple when
special case
alternatives,
two
problem
mathematical
scription The
stochastic de-
in a
probabilities
of
and
times
i.e.,m
2.
are
implicit
an
but
of either kind
response
exponential
of
probability p(a or b)
The
to
not
terval
Audley, 1958). both, occurring in the small time inis
of these
Neither
properties is in
with
experimental findings.
agreement
$p(a \text{ or } b) = p(a) + p(b) - 2p(a)p(b) = (\alpha+\beta)\,\Delta t - 2\alpha\beta(\Delta t)^2$
Secondly, when K > 1, the sequence
fore
of "implicit" responses
occurring be-
(e.g.,see
final
ofifer a
of
various sequences
"implicit"choice
suggests an approach to descriptors of the second kind. For example, "perfect confidence" might be identified with sequences consisting of "implicit" responses of one kind only.
Derivation
No
Hence
including VTE's
havior.
the description of choice bethe
of
classification
Thirdly,
if
of
possible means
within
is made
choice
in
p{aorb)
of order
terms
This
the distribution
choice
of the Stochastic
Model
further assumptions are required

of the model, which
the
to
{a-\-^)^t
(A^^
continuous
transition
case
when
of implicitresponses
in time
follows that of
[Ic]
ignored.
are
possibleif
becomes
is made
Poisson process
220).
Therefore
probability, p(n, t), of obtaining n implicit responses in the time interval (0, t) is (e.g., again see Feller, 1950, p. 221):
Feller, 1950,
(e.g.,see
p.
the
in the derivation
can
any
be applied
number,
m,
to
situations
of choices.
involving
However,
$p(n, t) = \dfrac{(\alpha+\beta)^n t^n}{n!}\,e^{-(\alpha+\beta)t}$ [2]
particular the probability, p(0, t), of obtaining no implicit response of either kind in time t is given by:
$p(0, t) = e^{-(\alpha+\beta)t}$ [3]
series of sequence probabilities. Thus,
$P_A = p^2 + p^2 q + p^3 q + p^3 q^2 + \cdots$ [5]
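The infinite series for the probability of an A choice can be checked numerically. The sketch below is an illustration under stated assumptions: it takes p = α/(α+β) and q = β/(α+β) as the probabilities that the next implicit response is an a or a b, sums the series term by term, and compares the result with the closed form p²(1+q)/(1−pq), which is what the two interleaved geometric series add up to. Written in terms of α and β this closed form is α²(α+2β)/{(α+β)[(α+β)²−αβ]}. Function names are hypothetical.

```python
def choice_probability_series(alpha, beta, terms=200):
    """Sum the probabilities of all alternating sequences that end in 'aa'."""
    p = alpha / (alpha + beta)
    q = 1.0 - p
    total = 0.0
    for k in range(terms):
        total += p ** (k + 2) * q ** k        # a(ba)^k a : aa, abaa, ...
        total += p ** (k + 2) * q ** (k + 1)  # (ba)^(k+1) a : baa, babaa, ...
    return total


def choice_probability_closed(alpha, beta):
    """Closed form of the same sum, expressed in alpha and beta."""
    s = alpha + beta
    return alpha ** 2 * (alpha + 2.0 * beta) / (s * (s * s - alpha * beta))
```

Swapping the arguments gives the probability of a B choice, and the two probabilities sum to one.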
Whence, simplifying, and substituting
for p and q from Equations 4a and 4b
that
the
The
first
Pa,
probability:
is
is
to occur
a
implicitresponse
an
In
nite
of
/*
=1
Pa
a2[a + 2/3]
[a + ^][(a + 0Y
00
"
t=o
/3T2a -f /3]
[4a]
[" + /3][(a+ ^y
Similarly, for implicit b
Equation 6a
following form
responses
(8
+
Since
be
may
of
occurrences
follow
implicit
Poisson
[4b]
Pa
written
in
the
[(g + ^y
say,
[6b]
aiS]
say,
[6a]
Similarly
a
Pb
"
a^-\
p(o, t)adt
"
/32]
a-f/3'[(a+ |S)2-a^]
sponses
re-
process.
that
SO
when
/S,Pa
"
and
"
Equations 4a and 4b also give the

/3
probability that, starting at any given
+ /3
a
the next
to
implicit response
moment,
the
Thus
the difference between
5 respectively.
will be an
a
occur
or
probabilities of the various
implicit
Therefore, ignoring for the moment
in
occurring is accentuated
responses
questions concerning the time intervals
the expressions for the probabilities
of
successive
between
implicit responses,
choice
overt
the
to
a
final choice
of
sequence
events
of independent
sequence
trials, with
Pb, of the
increases
be treated
can
as
that
there
binomial
two
4a
and
is believed
4b.
Probability, P_A, That
is an
A Response
the
possible sequences
which
Final
Choice
with
can
For
the
of
occurrence
an
"
between
a's
occur.
and
b, until
The
two
and
clearly: $p^2$, $p^2 q$, $p^3 q$, $p^3 q^2$, etc.

over-all probability, P_A, that the
an
A, is the
sum
The
final
of this infi-
be
Trial
b, with
them.
This
which
property
and
VTEs,
Error
obtained.
to
the moments
of VTEs
distribution
successive
early members
determine
of the
can
here
Attention
the
choice
any
of the
readily be
will be
confined
of VTEs
number
mean
preceding (a)
of this class of sequences are: aa, baa, abaa, babaa, etc. The respective probabilities
The
of these various
is
sequences
is
in the
identify alternating appearances

"implicit" responses,
we
terminate
implies
organisms exhibit.
Vicarious
If
K
be easily classified when
2.
all be simple alternations
they must
choice
than
to
accentuation
and
certainty in the
underlying
more
which
processes
many
The
with
the
Equations
The
is
choices
overt
Pa and
probabilities,
given by
types of event
The
responses.
leading
(b)
particular
choice.
Mean Number of VTEs Preceding Any Choice, V
There are no VTEs if the sequence of implicit responses is aa or bb.
is 1 VTE
There
baa
if the sequence
is
if the sequence
is
Va
yields the following results
choice
abb.
or
There
2 VTEs
are
2afi
/?
(a + /3)2
a^
abaa
babb, and
or
Dividing the
so
on.
with
into those
responses
of implicit
sequences
odd
an
with
those
number
even
an
(a + j8)2
of
following probabilities are

VTEs,
Since
found (letting P(V = n) be the probability
of obtaining n VTEs):
$P(V = 0) = p^2 + q^2$
$P(V = 1) = p^2 q + p q^2$
$P(V = 2) = p^3 q + p q^3$
$P(V = 3) = p^3 q^2 + p^2 q^3$
$P(V = 4) = p^4 q^2 + p^2 q^4$
$P(V = 5) = p^4 q^3 + p^3 q^4$
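The pattern in these probabilities can be written compactly and used to check the mean number of VTEs. The following sketch is illustrative only and assumes K = 2 with p = α/(α+β) and q = 1 − p. For even n = 2k the probability is (pq)^k (p² + q²), and for odd n = 2k + 1 it is (pq)^(k+1); summing n·P(V = n) gives 3pq/(1 − pq), that is 3αβ/[(α+β)² − αβ]. Function names are hypothetical.

```python
def vte_probability(alpha, beta, n):
    """P(V = n): probability of exactly n VTEs before the final choice (K = 2)."""
    p = alpha / (alpha + beta)
    q = 1.0 - p
    if n % 2 == 0:
        k = n // 2
        return (p * q) ** k * (p * p + q * q)
    k = (n - 1) // 2
    return (p * q) ** (k + 1)  # equals (pq)^k * pq * (p + q)


def mean_vtes(alpha, beta):
    """Mean number of VTEs, 3*alpha*beta / ((alpha+beta)^2 - alpha*beta)."""
    s = alpha + beta
    return 3.0 * alpha * beta / (s * s - alpha * beta)
```

The mean depends only on the ratio of the two parameters and is largest, equal to 1, when they are equal.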
given
moment,
Pb, Va
"
SP{V
"""
latencies
algebraic manipulation
some
again substituting for p and q
from Equation 4a and 4b.
La and
$V = \dfrac{3\alpha\beta}{(\alpha+\beta)^2 - \alpha\beta}$ [7]
Let
then
be
7 may
Equation
re-
q:
written
as
P{a, t) be
time
at
If 7
consideration
the
mean
and
all responses
B
and
for A
Latencies
Mean
3a0
=
of
the
sponses
re-
and
Lb
respectively.
The
all the
distribution
separately. L_A
taken
after
and
Choice
Here, however,
latency, L, of
mean
and
Final
will be limited to
3) +
of
time
of the
2)
i.e.,if
possible to determine
It is
2P{V
1) +
VTEs
is dominant
Vb.
"
Distribution
Time
the
on
fewer
response
final responses.
P{V
be
would
any
moments
that
seen
at
The
Now
can
which
etc.
respec-
precedingthe
Pa
re-
p
^+2
be
there
average
be
may
iS
tively, it
etc.
/3
and ^
as
-+2
PY
0)
2a
1
written
=
and
-f 2/3
P(V
2a
a^
13
the
p{V
[8b]
=
-
and
2/3
2aj8
Vb
number
[8a]
/,
for
and
sponses,
Re-
Lb
probability that,
the
a's
consecutive
two
no
and
or
last
the
that
appeared,
Let
an
a.
was
implicit response
P(a, t; n) be the probability that, at
6's have
37
Line
7)==
have
appeared,
there
was
response
been
have
/,
no
a's
consecutive
two
or
b's
(1 +
V is dependent only on the ratio of β to α, and becomes a maximum when γ = 1, i.e., α = β.
Therefore
Thus
responses.
and
that
an
plicit
the last imalso that
a, and
exactly
implicit
Thus
00
the number
when
Pa
be
would
of VTEs
Pb
maximum
P{a,t)
h-
and
of VTEs
Responses, Va
Preceding
Number
Mean
B
and
P{a,t;n)
and
Vb
the
an
and
the
method
combined.
Let
mean
P{a,t;n), Equation 2
employed to find Pa
determine
are
Separate consideration of
number
of VTEs
preceding
E
n=l
To
The
be the probability that
P{a;n)
sequence
of
events
ends
with
an
a,
consecutive
two
no
as
b's
or
having
Similarly it
be
may
Clearly,
occurred.
determined
^.
,_
that
,"
-l)
P(^0=e-(-+^)'[(^
^
P(a;l)
P(a;2)
P{a;Z)
V^l
JJ M
W + ^y
Now
etc.
,
(a + ^y
P(a, t)atdt
being respectively probabilities
these
with
associated
aba,
the
a,
t=o
2(a + /3)
ba,
etc.
(a + ^y
P{n,t). P{a\n),
P(a,t;n)
2
gives P{n, t),so that
Equation
Now
and
sequences;
P(o, Oa^^
S=o
a/3
i8
+
(a + /?)("+
[10a]
2/3)
P{a,t; 1)
and
similarly
P(l,t)-P{a; 1)
2(a + i3)
Ln
a/3
a-\- (3
+
P{a,t;2)
By
al3
kind
same
latency for
a^th--{a+")t
argument
that the mean
all responses,
is given
L,
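$L = \dfrac{2(\alpha+\beta)^2 + \alpha\beta}{[\alpha+\beta][(\alpha+\beta)^2 - \alpha\beta]} = \dfrac{2}{\alpha+\beta} + \dfrac{3\alpha\beta}{[\alpha+\beta][(\alpha+\beta)^2 - \alpha\beta]}$ [11]
-{a+")l
P{a,t;2")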
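The mean latency for all responses can be compared with a simulation. This is a rough sketch under stated assumptions (K = 2, implicit responses generated by a merged Poisson process of rate α+β, no minimum choice time added); the function names, the seed, and the number of trials are arbitrary choices made for illustration. The formula used is the one for L above, which also equals (V + 2)/(α+β), since a choice preceded by V VTEs requires V + 2 implicit responses, each taking 1/(α+β) on average.

```python
import random


def simulate_latency(alpha, beta, rng):
    """Return (kind, latency) for one trial with K = 2."""
    total = alpha + beta
    elapsed = 0.0
    last_kind = None
    while True:
        elapsed += rng.expovariate(total)
        kind = "a" if rng.random() < alpha / total else "b"
        if kind == last_kind:  # two consecutive responses of one kind end the trial
            return kind, elapsed
        last_kind = kind


def mean_latency_formula(alpha, beta):
    """Mean latency of all responses as given by the closed form."""
    s = alpha + beta
    return (2.0 * s * s + alpha * beta) / (s * (s * s - alpha * beta))


def estimate_mean_latency(alpha, beta, trials=20000, seed=1):
    """Monte Carlo estimate of the mean latency over many simulated trials."""
    rng = random.Random(seed)
    total = sum(simulate_latency(alpha, beta, rng)[1] for _ in range(trials))
    return total / trials
```

With a few tens of thousands of trials the estimate should agree with the formula to within ordinary sampling error.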
a^^tê3!
Returning
and
10a
Equations
to
Hence
/3
be
10b it can
=
by
Similarly
PiaJ)
it
2!
etc.
of
be demonstrated
may
{a + ^y
the
[10b]
(a + /3)(2a+ ^)
V.
that
seen
(a+^)(a + 2^)
P{a,t;n)
and
(a+^)(2a+/3)
be written
may
as
1
and
1!
re-
2)
2) (a+^)(^
(a+^)(^
+
"*"
which,
upon
3l
"^
spectively. Thus
will, on
simplification,
gives
shorter
choice
i.e.,if Pa
2
~V
response
time
Pb, La
"
In order
the
the
to compare
time
dominant
than
"
the
other,
Lb.
the
theoretical
distribution to observed
data, the probability P(0, t) of

response
sponse
re-
have
average,
having occurred
by
no
time
final
/ is
is
also given. This
situation, then generally speaking
clearly
ical
would
one
P(0, 0
P{o, i) +
P{a, /) + P{b, t)
of
P{o, t) is given by Equation 3 and

7
{a
P(a, t) and P(b, t) by Equations 9a and
9b so that, upon
some
simplification,
7
the
T7^
"
the
probability
"
-rrz
Hence, the probability, P_C,
being
for this type
correct
Pc
and
Model
present, it is only possible to advance
At
is
[13]
^2
the
Descriptors of
Kind
Second
fident
con-
i.e.,choosing A,
a2 +
The
of
given by
,_
,_
bb,
/3)2
"choice,"
a+/3
of
/3)'
{a +
of
/3. The
ability
probwould
be
aa
"
sequence
and
"
"
expect
speculations concerning
dence"
as
"degree of confiin the correctness
of a given
worth
choice.
Nevertheless, it seems
there
these
since
considering
appears
Comparing this probability with the

overall probability of an
A response,
Pa given by Equation 6a,
some
variables
be
to
such
definite relation between
Pc-P.
a2(a+ 2i3)
a2+/32
[a+/3][(a+/3)2-a/3]
a^^^ja-^)
the
~
kind of descriptor and the

indices
of choice
conventional
second
(1911),whose
Henmon
be
will
in
considered
behavior.
paper
detail
more
that
choices
regarded
later, showed
with
confidence
are
by an S
generally
than others.
accurate
quicker and more
This
result
discrimination
psychophysical
where
definite
in
demonstrated
was
choice
existed.
be
"
^ and
Since
only
Pc
"
Pa-
for these "confident"
implicit responses
two
responses
fore
be-
occur
final choice, it is clear that
mean
hence
Clearly, Equation 14 is positive when
situation
correct
[a2+/32][a+j8][(a+/3)=^-a/3]
[14]
more
is shorter
time
response
their
the
than
This
time.
over-all average
response
approach consists essentially in equating "degree of confidence" with some function of the reciprocal of the number of VTEs preceding the final choice.
The second suggested approach to judgmental confidence is based upon the fact that these appraisals of a response,
possible ways might be attributed to a particular choice. The first of these involves some classification of the various sequences of implicit responses preceding a final
There
choice.
vacillation
no
bb, might be regarded
confident"
than
large number
abababaa.
kind
all,such
at
as
sequences
It will be shown
sequence
suppose
be the correct
the incorrect choice in
has
the
data.
and
psychophys-
implicit responses
with
to
as
that this
itself. Degree
response
of confidence, therefore, might be
involving a
properties required by Henmon's
For,
"more
of vacillations, such
of "confident"
after the
as
instructions, follow
normal
under
which
For example, sequences
involve
or
two
"confidence"
in which
aa,
to
seem
response
has
response
occurs
in
after
occur
has
been
the
time
of confidence
might be taken
to
overt
an
made,
before
is
lead
choice
If, after
occurred.
an
further
the
statement
produced,
to
associated
continuing
greater
this
confidence than if nothing
Indeed, it might be
or
appeared.
develop
tion
of the
making the primary

choice
and
mate
giving an estiresponse
of
confidence
from
for degree
this kind of assumption.
Other approaches to the second kind
of descriptor are undoubtedly possible
under
another
be
these
assumed
This
data
where
it
that
there
in
can
one
reasonably
are
no
systematic
S's behavior.
an
therefore
can
trials
closely resemble
changes
various
hypotheses quite
predict how often
be
regarded
further
special assumptions.
before examining Henmon
The
with
between
Properties
of
that
set
of very
which
simple
might be expected
which
observed
variables
In
situation.
it is not
it
is
assumptions
in
among
choice
an exposition of this
possible to examine, in detail, the success of the model in describing the results of experiments which are relevant. For one thing, only the particular case arising when K = 2 has been presented, whereas in practice it may be more profitable to treat K as a parameter. Also, the argument so far presented is concerned with the events supposed to occur at a single experimental trial. The
in which
the model
is applied
manner
to
a
experimental data based upon
of trials will depend
number
very
's
results,
exhibit
the model
seems
be examined
cannot
preferable to
seems
about
test
here
hypotheses
individual
relations upon
A brief argument
for this
functional
data.
derive relations
to
to
about
empirical evidence
choice behavior in general.
In effecting a general appraisal of the model, one is hindered by the general lack of individual results in the experimental literature. For reasons
Model
of this paper
be used
can
in which
manner
Data
principal aim
worthwhile
seems
However,
match
to
the
the
Empirical
AND
show
the
level of confidence
each
Agreement
The
it
be determined.
can
kind
and
the conditions
individual
portant as appropriate for testing the model

The impresent scheme.
without there being any need to make
point is that it is possible to
associated
the
which
conducted
were
easily. They each

a
given level of confidence would be
Also
the expected distribution
employed.
kind
of
the
first
of descriptors
to
(1911), in which
Henmon
the
within
test
of quantitative evidence
will be
confined
to
mainly
an
experiment by
possible to
between
times
distribution
for the
model
has been presented by

(1955) and for the study of
by Audley and
learning behavior
Jonckheere (1956). The reader is referred
of view
point
Bakan
to
any
these papers
However,
taken
stand
clear
on
for further
details.
irrespective of the
this question, it is
is concerned
present model
with individual results and that
that
the
much
upon
the way
trials resemble
may
be actual
conditions
may
upon
be
from
a
in which
one
separate
There
another.
variations
in stimulus
trial to trial,or
direct dependence
earlier
trials,as
For
in
this reason,
generally available. For this reason, the following comparison of the model with experimental evidence is largely qualitative,
although, given appropriate data,
quantitative comparisons would have
such
results
been
possible.
not
Psychophysical Discrimination
In
there
of later
are
tions
Situa-
physical
considering results from psychoexperiments, say using the
constant
method,
it is necessary
to
separately the comparison of

periments.consider
learning exconsidera-
each
variable with
the standard.
This
stimulus
conditions
the
are
Nevertheless,
all trials.
for
same
general
predictions can be advanced.
Here, "degree of confidence" will be
function of the
equated with some
number
of VTEs
reciprocal of the
preceding
VTEs
of
can,
course,
number
range
from
of
zero
infinity. Generally speaking,confidence
to
is rated
scale from
some
upon
unity. Let C, be the degree of

with
confidence
associated
a
given
of VTEs
choice, and, V, the number
Determining
preceding this choice act.
to
zero
suitable relation between
would, in fact, be one of the experimental

problems suggested by the
For the moment,
present approach.
however, it will be assumed
that.
V+
that when
1
1 ; and
0.
00, C
It will be recalled from
=
when
with
concerned
If the stimulus
VTEs
section
that the
number
of these will, when
less than the number
i^
mean
2, be
see
e.g.,
varied
are
different sets of trials,
for
as
in
the
method
discussed
constant
example
in the previous section, general
conclusions
are
again possible. For
discussing Equation 7, it was shown
in
that
the
number
mean
of VTEs
the ratio of
only upon
Again assuming
that
depends
/S.
to
(a -+-j8) is
constant,
approximately
be
would
roughly symmetrical function of the

magnitude of the variable, having a
maximum
the
at
PSE.
Thus
the
degree of confidence, C, would

roughly U-shaped function having
average
a
minimum
maximum
the
at
been
has
time
PSE.
Since
shown
PSE
the
at
have
to
and
decrease
to
side of this point,

again vary inversely.
This agrees
with experimental data
(see Guilford, 1954).
either
upon
the
conditions
confidence
between
choice
0, C
between
judgment time,
Guilford
again
(1954).
[16]
relation
and
be
so
of the
C and
This hyperbolic function is in agreement with some experimental determinations
The
final choice.
and
would
of implicit
choice.
final
preceding
a
responses
Preference and Conflict Situations
In this kind of situation, a number of objects are paired and the subject makes a choice indicating the preferred object of each pair. For any given pair of objects, say A and B,
Now it can be easily demonstrated, using Equation 1c, that the mean choice time when n implicit responses occur, T_n, is given by
T
two
Whence,
choice
is
possible to
time
T,
the
n
is eliminated
17, it
since
[17]
2, and
"
from
express
the
V,
to
given by
V
-\-2
{a-\-^)C
+
a
|3
To.
[19]
of ence
preferthere are
measure
Because
of objects, it is more convenient to label the r objects presented to the subject as X_i, and to let the
associated
"absolute
preference" for each,
the
(i
1, 2,
"
"
"
r).
equations will
and
say
aj
the
^'th and
This, of
ak,
now
with
kind
parameter
ai
Substituting for V from Equation 16

and adding an arbitrary constant.
To,
choice time possible,
for the minimum
be taken
/3 can
number
mean
of
and
represent some
for A and B.
to
Equation
function
as
cause
be-
parameters
The
and
/S of
for the comparison
is to
be
be replaced by,
jth. objects,Xj and
course,
of
make
of
Xk.
the very strong assumption that the α_i's are independent of the particular comparison in which they are involved. This assumption could be readily tested by using the model appropriately, and is accepted here only in order to simplify notation.
The results of the following
would
be qualitatively the
argument
if
even
same,
contextual
there
effects
to
comparison.
Variation
in choice
comparisons.
time
The
among
of
set
i be
a paired comparison
usually be ranked.
individual's ranking of
object,
X^"X2"
This
on.
"
a,
that
so
we
write
may
"Xi"Xi+,"
X,- is preferred to X2
"
meaning
"
an
"
"
"
that
means
tti+i
"
"
"
"
"
ai
and
0:2
"
Consider
ar.
can
is given by Equation
be rewritten as
now
11, and
that
the
scale
Number
"
"
be
any
this
have
tained.
main-
the
was
lute
abso-
an
basis, so
relative
values
of VTEs
It
"
would
than
by
comparisons.
should
be
inclusion
of
affected
unnew
for different comparisons.
shown, in discussing
number
Equation 7, that the mean
of
in a given situation, depends
the ratio of a
entirely upon
to
(3.
the
Using
present notation this would
VTEs
be the ratio of
and
Xk.
The
maximum
aj
to ajt,
number
when
ay
for
objects X_j
of VTEs
=
ak,
has
and
decreases
become
the values of the parameter

the
more
disparate. Thus
number
of VTEs
as
Lu.k)
values
so
pair of parameters, say uj and ak, and

let these be the a and /3of the earlier
choice
the mean
equations. Then
time
scale
rather
"Xr,
"
"
absence
an
be
to
the basis of
on
an
of
interesting to determine
far the assumption of

contextual
effects can
objects, the
technique, can
Let
It will be
how
If the assumption turns

out
approximately true, then the
parameters, at, would provide a means
of scaling the stimulus objects for a
In essence,
fact, given individual.
such an
each
approach would resemble that adopted
by Bradley and Terry (1952), but
would have the added advantage that
different
in
were
peculiar
"
ocj
ak
upon
SajUk
+ oikY
[aj -\-a/t][(ay
"
Clearly $L(j,k)$
depends
upon
aya*]
two
[20]
things ;
the
the
should
depend
entirely
differences in preference
and not upon

the general level of
preference for the two paired objects.
Thus
for adjacent objects, Xi and
of VTEs
before a
Xi+1, the number
of the parameters
($\alpha_j + \alpha_k$)
sum
final choice will not rise with
choice
and, secondly, the product of the parameters,
time
from
as one
proceeds
preferred to
$\alpha_j \alpha_k$.
nonpreferred objects. This is slightly
Other things being equal, the choice time will decrease as ($\alpha_j + \alpha_k$) increases. Again, with ($\alpha_j + \alpha_k$) constant, $L(j,k)$ will increase with the product, reaching a maximum when $\alpha_j = \alpha_k$.
Choice time will therefore (a) depend upon the general level of preference for objects, being quicker for preferred objects, (b) will be quicker the greater the difference in preference for the two paired objects. This is in agreement with experimental finding, e.g., for children choosing among liquids to drink, Barker (1942), for aesthetic preferences, Dashiell (1937).
complicated by differences in "preference distance" between adjacent objects, but the prediction is again found to be in agreement with experimental evidence, e.g., see Barker (1942).
Learning in choice situations. It is in considering learning behavior that the need for individual results is greatest (Audley & Jonckheere, 1956).

The full advantages of the present approach to response variables can only be gained by incorporating the assumption in a stochastic model for learning. The way in which this might be contrived, when K = 1, has already been outlined and illustrated elsewhere (Audley, 1957, 1958). On the whole, therefore, the experimental literature does not provide results in a way which enable the predictions of the model to be falsified, even at a qualitative level. The most that can
be
done
is to show
here
that
the
be good approximations to the properties of learning data.
particular theory of
of course,
it would,
anchor
the theory
variables
more
by
be
possible to
closely to
identifying
of the choice model
parameter
response
the
with
the ratio of
will
rise to
Pb
fall
Let
with
A, the correct
B, the incorrect
way
in which
with
and
and
response,
response.
with
vary
$\beta$
The
reward
punishment is naturally a
for investigation and
would
matter
form
condition
the
of
the
certainly
made.
be
would
prediction which
Nevertheless, it is not unreasonable to assume that $\alpha$ will be some increasing function, and $\beta$ some monotonic decreasing function of practice and of punishments and rewards.
Let
supposed that the S has

strong tendency to produce
it be
first a
incorrect
choice, i.e., $\alpha$
is small
this
upon
($\alpha + \beta$), and if the levels of,

punishment and reward, are such
then
of
say
as
of this quantity,
accentuation
an
of
flattening of the curve
latency as a function of practice. The
or
monotonic
decline
in
when
latencies
response
an
is introduced
learning situation for the

first time does not counter
this prediction.
For, then, it is to be expected that ($\alpha + \beta$) will be initially small and the effect of increasing $\alpha$, and, hence, ($\alpha + \beta$) will be reinforced by the growing difference in magnitude between $\alpha$ and $\beta$. In original learning, therefore, the two factors work together and produce the monotonic decrease in latency.
a
number
The
be expected
a
from
of VTEs,
to be
7, is seen
function
to $\beta$. Thus
to
rise to
13, i.e..Pa
would
maximum
Pb
VTEs
Equation
only of
until
0.5, and
the
decline.
These
predictions are
applicable
choice
to
the
situations
very
so
probably only
simple two-
far
considered.
studies, the problem
the
in
be
the
is
to
to
complicated
by
might
expected
happen
way
which
the
over-all latency L, and the latencies
relevant
being
cues
are
of A and B, La and Lb respectively. utilized by the organism and there is
In discussing Equations 10a and 10b
no
point in reviewing the controversy
relative
it
was
to
what
$\beta$. Consider, firstly,
then
influence
the
the constancy
there will be
latency,
constant,
until Pa
Superimposed
the ratio of
monotonic
over-all
$\beta$) and
fall will be
observed
be associated
(i.e.,a
again.
into
and
properties of the model
Consider,
simple learning behavior.
for example, learning in a simple two-
dependent
($\alpha + \beta$) and
are
sum
maximum
0.5
Lb.
than
$\beta$. The
to
La
0.5, when
L, if ($\alpha + \beta$) remains
an
The
choice situation.
latencies
factors, the
two
upon
exceeds
generally shorter
All of the
learning
appropriate theoretical construction.
the
will be
to disturb
Given
at
Pa, reaches and
predictions rise and
well
might
shown
that
the
dominant
response,
For
over
discrimination
this
matter.
It does
however
pointing out that, in

behavior, it is very probable that there appears something like the problem of the use of the third category in psychophysical proced-
place it will be expected that $L_a$ will be greater than $L_b$ until the probability of making the correct choice,
on
will have
the
the average,
Thus in the first
shorter choice time.
seem
worthwhile
discrimination
R.
That
ures.
be
is, a distinction
between, on
necessary
hand,
definite
because
in the
done
one
This
situation.
of the parameters
may
behavior
upon
the
exert
in
an
two
be
about
be
speculative
because
possible that both of
accounted
these
size
influence
which
few
suitable
of confidence
advanced
speculations
in
the
present
The
important point, it seems

the
the author, is that
to
general
stochastic
model
is capable of dealing
paper.
with
ways.
the
only
been
have
differences
for by
analysis of judgments
on
to
may
occurs
has
something
is raised
point
the
which
the other hand, behavior
simply
to
seems
of choice and,
act
J.
this
kind
of issue, rather
than
the
at
present time.
Firstly, by determining the probability of making a particular response when a "true" choice is made and, secondly, by determining the probability that a "true" choice is made.
Henmon gives the distribution of all choice times for each individual.
Since
Henmon's
experiment. The experiment

Henmon
conducted
(1911) is
by
of particular interest, because it provides
data
from individual Ss, in a
situation
stimulus
where
be assumed
can
to
therefore, are
concerned
conditions
be fairly constant
important for any model

with
the
properties of
choice behavior.
Henmon
required Ss,
horizontal
shorter
than
of the
20.3
5s
lines
was
other.
the
lines
always
respectively.
mm
in
each
of
one
of
confidence
The
in each
to
choice
larger than
In
responses
each
indicate
time
that
are
category
their
5s
There
that
this
in
the parameters
of choice times
it was
However,
it would
matter,
the distribution
decided
that
be
from
alone.
perhaps
if
could be made
out
stronger case
the only time datum
used to estimate
model, the wrong

relatively quicker in
is
was
are
equations
if values
correct
some
appears
indication
there
is
with longer
general decline in accuracy
choice times, again predicted by the
model, there is also a slight rise in
in going from
short to
accuracy
very
It is
moderately short choice times.
the present
no
of
must,
can
to
determine
6a
This
from
tables of results, because
in
For
the
assumed
and
following
minimum
$\beta$
were
is not
Henmon's
the data
intervals
this
possible time
way.
was
of
best fit was
entirely a
with
The
are
200
the
reason,
estimated
For
various
times, estimates of
the
determined, and
theoretical distribution of choice
computed.
some
before which
occur.
already grouped
be
course,
time
response
response
easy
estimates
Equations
upon
for the
11.
There
in
for
chosen
was
based
be determined,
to
probability of
Accordingly
minimum
function
the
Pa,
latency.
required
mean
course
$\beta$ are
response,
second.
and
the
of
and
milliseconds.
The
confidence.
although
of
minimum
average
the
of
the parameters
Two
are
is
for wrong
responses
for correct
choices, as
qualitative difference
a
as
examining accuracy
of time.
from
of the
usual to estimate
judgment.
second
some
addition,
is qualitatively in agreement
Henmon's
data, except in
predicted by
derived
two
comparison
distributions should give further indications
the
to
as
adequacy of the
present approach to choice behavior.
In testing the goodness of fit of the
and
mm
things. Firstly, although
two
also be
can
model,
and
model
with
in
longer or
The lengths
20
were
instructed
were
this
in all details
1,000 trials, to decide whether

two
the
model
observations,
trial to trial. The
from
it succeeds
times
leading to the
then adopted. This is not
satisfactory procedure, but
assumed
value
to
be
2, and
with
no
TABLE
observations
ignored
indication
of
These
direct
the
the
Bl and
Br
be
to
this basis
3.19 and
time
giving a
comparison
=
1.28, these
scale
13
minimum
sec.
4.28.
and
times
of response
The
1.
agreement
given
between
in Table
model
data
and
seems
be
to
reasonably good.
Concluding
the
looseness
there
whole,
in the
hypotheses
are
linked
variables.
therefore,
whether
related
and
to
It
to
in which
way
theories
contemporary
is
needs
In
indicate
try
observed
model
the
seems
to
of the
the
to
one
another
share
of
potentialities
than
be
to
is to
the
make
approach,
specific
=
Each
with
choice
to
situation
certain
have
to
be
does
important
behavior
will
and
into
account.
to
seem
properties
therefore
reasonable
undoubtedly
unique conditions
be taken
model
the
certain
appears
these variables might not
general
intention
case
which
worth-while,
determine
not
K
2.
arising when
It is not to be expected that the two
simple assumptions will alone account
for the relations existing between
response
variables in a wide diversity of
tests
But
response
most
way,
be tested.
to
rather
many
local
this
this presentation of the
stochastic
certain
even
In
are
have
On
in
operate
descriptions of choice behavior
considerably simplified, but
better ways
of formulating and testing
theories
model
suggested. The
are
itself is naturally also a theory about
certain aspect of behavior, and
a
as
only
situations.
Remarks
which
situations.
expected
distributions
is
choice
such
measured
be 0.34
to
observed
the
On
sec.
Br, the
and
6.68
of
$\beta$
For
taken
was
0.40
time
in seconds.
5s
194)
p.
possible time
about
referring to
values
for
below.
considered
are
taken
was
2,
Bl, the minimum
For
in
results
(1911, Table
Henmon's
simple laws
minimum
The
circumstances.
in calculations.
best available
the
time, it seemed
it
initial
working hypothesis. It can be tested

the
in great detail against data, and
by relatively parameters
are
of
kind
which
could
J.
R.
either
with
identified
be
R.
Bradley,
psychological
of
Methods
estimating
statistical
will
For
neither
of
method
metrika,
of
goodness
elsewhere.
model,
present
parameters
of
tests
discussed
be
fit
example,
Rev.,
of
of
one
and
the
time,
R.
R.,
solved
easily
and
D.
Cartwright,
mean
sponse
re-
L.
R.
as
behaviour
J.
Quart.
ject.
sub-
1957,
Psychol.,
exp.
S.
The
J.
criminative
dis-
of
measurement
Rev.,
Psychol.
Affective
F.
J.
determinant
J.
of
1952,
value-distances
esthetic
1937,
Psychol.,
K.
W.
Estes,
individual
an
Amer.
174-196.
54,
judgments.
57-67.
50,
of
description
of
decision-time
of
response.
443-452.
59,
stochastic
J.
of
behaviour.
Amer.
learning
Wiley,
may
priate
appro-
REFERENCES
the
Stochastic
York:
Relation
1941,
Psychol.,
11
the
give
New
categories
the
to
Dashiell,
Audley,
F.
Mosteller,
learning.
1955.
values.
parameter
chol.
Psy-
of
alternative
over-all
to
matical
mathe-
learning.
313-323.
58,
"
for
Christie,
be
F.
simple
problems.
the
Equations
I.
Bio-
comparisons.
324-345.
for
1951,
rank
cedures
pro-
probability
the
given
responses
The
designs.
Mosteller,
"
model
models
occurrence
E.
block
paired
39,
R.,
the
these
novel
any
of
1952,
R.
Bush,
Bush,
involves
For
H.
Terry,
"
incomplete
constructs.
The
and
A.,
of
analysis
physiological
or
Toward
learning.
statistical
Psychol.
Rev.,
of
theory
1950,
57,
94-107.
9,
W.
An
theory
and
its
Wiley,
1950.
Feller,
introduction
probability
to
12-20.
R.
Audley,
times
within
inclusion
The
J.
response
description
stochastic
of
of
individual
of
Attitude
S.
S.
in
1958,
Psychometrika,
J.,
Jonckheere,
"
A.
Brit.
learning.
statist.
A
"
The
Skills,
R.
the
resolution
Q.
McNemar
Studies
G.
Hill,
of
5,
M.
of
Perceptual
by
New
Rev.,
of
study
In
children.
Merrill
A.
V.
Henmon,
aggregate:
211-212.
conflict
personality.
1942.
the
experimental
An
"
in
and
distinction.
1955,
P.
J.
York:
(Eds.),
Mc-
1911,
New
18,
Nat.
relation
of
the
time
Psychol.
accuracy.
Theoretical
relationships
of
measures
Acad.
R.
York:
its
186-201.
G.
some
Woodworth,
The
to
C.
among
Proc.
C.
A.
New
1954.
Hill,
judgment
Mueller,
Methods.
Psychometric
McGraw
87-94.
9,
general
methodological
Motor
Barker,
Graw
1956,
Psychol.,
D.
psychophysical
Psychol.,
1-37.
28,
J.
York:
Bakan,
J.
Stochastic
R.
Guilford,
for
processes
to
Amer.
25-31.
23,
1917,
R.
relation
subjects.
judgment.
Audley,
York:
the
George,
behaviour
learning
New
applications.
Sci.,
S.
Henry
1950,
conditioning.
36,
Experimental
Holt,
1938.
123-130.
psychology.
A MATHEMATICAL MODEL FOR SIMPLE LEARNING

BY ROBERT R. BUSH 1 AND FREDERICK MOSTELLER

Harvard University 2

Introduction

Mathematical models for empirical phenomena aid the development of a science when a sufficient body of quantitative information has been accumulated. This accumulation can be used to point the direction in which models should be constructed and to test the adequacy of such models in their interim states. Models, in turn, frequently are useful in organizing and interpreting experimental data and in suggesting new directions for experimental research.

Among the branches of psychology, few are as rich as learning in quantity and variety of available data necessary for model building. Evidence of this fact is provided by the numerous attempts to construct quantitative models for learning phenomena. The most recent contribution is that of Estes (2).

In this paper we shall present the basic structure of a new mathematical model designed to describe some simple learning situations. We shall focus attention on acquisition and extinction in experimental arrangements using straight runways and Skinner boxes, though we believe the model is more general; we plan to extend the model in order to describe multiple-choice problems and experiments in generalization and discrimination in later papers. Wherever possible we shall discuss the correspondence between our model and the one being developed by Estes (2), since striking parallels do exist even though many of the basic premises differ. Our model is discussed and developed primarily in terms of reinforcement concepts while Estes' model stems from an attempt to formalize association theory. Both models, however, may be re-interpreted in terms of other sets of concepts. This state of affairs is a common feature of most mathematical models. An example is the particle and wave interpretations of modern atomic theory.

We are concerned with the type of learning which has been called "instrumental conditioning" (5), "operant behavior" (10), or "type R conditioning" (10), and not with "classical conditioning" (5), "Pavlovian conditioning" or "type S conditioning" (10). We shall follow Sears (9) in dividing up the chain of events as follows: (1) perception of a stimulus, (2) performance of a response or instrumental act, (3) occurrence of an environmental event,

1 SSRC-NRC Post-doctoral Fellow in the Natural and Social Sciences.

2 This research was supported by the Laboratory of Social Relations, Harvard University, as part of a program of the Laboratory's Project on Mathematical Models. The authors are grateful to many persons for helpful advice and constant encouragement, but especially to Drs. W. O. Jenkins, R. R. Sears, and R. L. Solomon.

This article appeared in Psychol. Rev., 1951, 58, 313-323. Reprinted with permission.
to write our operator in the form

$$Qp = p + a(1 - p) - bp. \qquad (2)$$

This is our basic operator and equation (2) will be used as the cornerstone for our theoretical development. To maintain the probability between 0 and 1, the parameters a and b must also lie between 0 and 1. Since a is positive, we see that the term, a(1 - p), of equation (2) corresponds to an increment in p which is proportional to the maximum possible increment, (1 - p). Moreover, since b is positive, the term, -bp, corresponds to a decrement in p which is proportional to the maximum possible decrement, -p. Therefore, we can associate with the parameter a those factors which always increase the probability and with the parameter b those factors which always decrease the probability. It is for these reasons that we rewrote our operator in the form given in equation (2).

We associate the event of presenting a reward or other reinforcing stimulus with the parameter a, and we assume that a = 0 when no reward is given as in experimental extinction. With the parameter b, we associate events such as punishment and the work required in making the response. (See the review by Solomon [11] of the influence of work on behavior.) In many respects, our term, a(1 - p), corresponds to an increment in "excitatory potential" in Hull's theory (6) and our term, -bp, corresponds to an increment in Hull's "inhibitory potential."

In this paper, we make no further attempt to relate our parameters, a and b, to experimental variables such as amount of reward, amount of work, strength of motivation, etc. In comparing our theoretical results with experimental data, we will choose values of a and b which give the best fit. In other words, our model at the present time is concerned only with the form of conditioning and extinction curves, not with the precise values of parameters for particular conditions and particular organisms.

Continuous Reinforcement and Extinction

Up to this point, we have discussed only the effect of the occurrence of a response upon the probability of that response. Since probability must be conserved, i.e., since in a time interval h an organism will make some response or no response, we must investigate the effect of the occurrence of one response upon the probability of another response. In a later paper, we shall discuss this problem in detail, but for the present purpose we must include the following assumption. We conceive that there are two general kinds of responses, overt and non-overt. The overt responses are subdivided into classes A, B, C, etc. If an overt response A occurs and is neither rewarded nor punished, then the probability of any mutually exclusive overt response B is not changed. Nevertheless, the probability of that response A is changed after an occurrence on which it is neither rewarded nor punished. Since the total probability of all responses must be unity, it follows that the probability gained or lost by response A must be compensated by a corresponding loss or gain in probability of the non-overt responses. This assumption is important in the analysis of experiments which use a runway or Skinner box, for example. In such experiments a single class of responses is singled out for study, but other overt responses can and do occur. We defer until a later paper discussion of experiments in which two or more responses are reinforced differentially.
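The behavior of the operator in equation (2) can be explored numerically. The following is a minimal sketch and not part of the original paper: the function names `apply_q` and `iterate_q` and the parameter values are illustrative assumptions. It applies Q repeatedly to an initial probability and compares the result with the fixed point a/(a + b), which is the value obtained by setting Qp = p in equation (2).

```python
def apply_q(p, a, b):
    """One application of the operator in equation (2): Qp = p + a(1 - p) - bp."""
    return p + a * (1.0 - p) - b * p


def iterate_q(p0, a, b, n):
    """Apply Q to p0 successively n times; return the list [p0, Qp0, Q^2 p0, ...]."""
    trajectory = [p0]
    for _ in range(n):
        trajectory.append(apply_q(trajectory[-1], a, b))
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    a, b, p0 = 0.2, 0.1, 0.05
    trajectory = iterate_q(p0, a, b, 200)
    print("probability after 200 applications:", trajectory[-1])
    print("fixed point a / (a + b):           ", a / (a + b))
```

Setting a = 0 in `apply_q` gives the no-reward case, in which each application multiplies p by (1 - b).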
With the aid of our mathematical operator of equation (2) we may now describe the progressive change in the probability of a response in an experiment such as the Graham-Gagné runway (3) or Skinner box (10) in which the same environmental events follow each occurrence of the response. We need only apply our operator repeatedly to some initial value of the probability p. Each application of the operator corresponds to one occurrence of the response and the subsequent environmental events. The algebra involved in these manipulations is straightforward. For example, if we apply Q to p twice, we have

$$Q^2 p = Q(Qp) = a + (1 - a - b)Qp = a + (1 - a - b)[a + (1 - a - b)p]. \qquad (3)$$

Moreover, it may be readily shown that if we apply Q to p successively n times, we have

$$Q^n p = \frac{a}{a + b} - \left[\frac{a}{a + b} - p\right](1 - a - b)^n. \qquad (4)$$

Provided a and b are not both zero or both unity, the quantity $(1 - a - b)^n$ tends to an asymptotic value of zero as n increases. Therefore, $Q^n p$ approaches a limiting value of a/(a + b) as n becomes large. Equation (4) describes a curve of acquisition.

It should be noticed that the asymptotic value of the probability is not necessarily either zero or unity. For example, if a = b (speaking roughly this implies that the measures of reward and work are equal) the ultimate probability of occurrence in time h of the response being studied is 0.5.

Since we have assumed that a = 0 when no reward is given after the response occurs, we may describe an extinction trial by a special operator E which is equivalent to our operator Q of equation (2) with a set equal to zero:

$$Ep = p - bp = (1 - b)p. \qquad (5)$$

It follows directly that if we apply this operator E to p successively for n times we have

$$E^n p = (1 - b)^n p. \qquad (6)$$

This equation then describes a curve of experimental extinction.

Probability, Latent Time, and Rate

Before the above results on continuous reinforcement and extinction can be compared with empirical results, we must first establish relationships between our probability, p, and experimental measures such as latent time and rate of responding. In order to do this, we must have a model. A simple and useful model is the one described by Estes (2). Let the activity of an organism be described by a sequence of responses which are independent of one another. (For this purpose, we consider doing "nothing" to be a response.) The probability that the response or class of responses being studied will occur first is p. Since we have already assumed that non-reinforced occurrences of other responses do not affect p, one may easily calculate the mean number of responses which will occur before the response being studied takes place. Estes (2) has presented this calculation and shown that the mean number of responses which will occur, including the one being studied, is simply 1/p. In that derivation it was assumed that the responses were all independent of one another, i.e., that transition probabilities between pairs of responses are the same for all pairs. This assumption is a bold one indeed (it is easy to think of overt responses that cannot follow one another), but it appears to us that any other assumption would
require a detailed specification of the many possible responses in each experimental arrangement being considered. (Miller and Frick [8] have attempted such an analysis for a particular experiment.) It is further assumed that every response requires the same amount of time, h, for its performance. The mean latent time, then, is simply h times the mean number of responses which occur on a "trial":

$$L = \frac{h}{p}. \qquad (7)$$

The time, h, required for each response will depend, of course, on the organism involved and very likely upon its strength of drive or motivation.

The mean latent time, L, is expressed in terms of the probability, p, by equation (7), while this probability is given in terms of the number of trials, n, by equation (4). Hence we may obtain an expression for the mean latent time as a function of the number of trials. It turns out that this expression is identical to equation (4) of Estes' paper (2) except for differences in notation. (Estes uses T in place of our n; our use of a difference equation rather than of a differential equation gives us the term $(1 - a - b)$ instead of Estes' $e^{-q}$.) Estes fitted his equation to the data of Graham and Gagné (3). Our results differ from Estes' in one respect, however: the asymptotic mean latent time in Estes' model is simply h, while we obtain

$$L_\infty = h\left(1 + \frac{b}{a}\right). \qquad (8)$$

This equation suggests that the final mean latent time depends on the amount of reward and on the amount of required work, since we have assumed that a and b depend on those two variables, respectively. This conclusion seems to agree with the data of Grindley (4) on chicks and the data of Crespi (1) on white rats.

Since equation (7) is an expression for the mean time between the end of one response of the type being studied and the end of the next response of the type being studied, we may now calculate the mean rate of responding in a Skinner-box arrangement. If $\bar{t}$ represents the mean time required for the occurrence of n responses, measured from some arbitrary starting point, then each occurrence of the response being studied adds an increment in $\bar{t}$ as follows:

$$\frac{\Delta \bar{t}}{\Delta n} = \frac{h}{p}. \qquad (9)$$

If the increments are sufficiently small, we may write them as differentials and obtain for the mean rate of responding

$$\frac{dn}{dt} = \omega p, \qquad (10)$$

where $\omega = 1/h$. We shall call $\omega$ the "activity level" and by definition $\omega$ is the maximum rate of responding which occurs when $p = 1$ obtains.
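The relations among equations (4), (7), (8) and (10) can be checked with a few lines of code. This is an illustrative sketch only: the function names and the numerical values of a, b, p0 and h are assumptions made for the example, not values fitted in the paper.

```python
def probability_after_trials(p0, a, b, n):
    """Equation (4): probability after n successive applications of Q to p0."""
    limit = a / (a + b)
    return limit - (limit - p0) * (1.0 - a - b) ** n


def mean_latent_time(p, h):
    """Equation (7): mean latent time L = h / p."""
    return h / p


def asymptotic_latent_time(a, b, h):
    """Equation (8): limiting mean latent time h(1 + b/a)."""
    return h * (1.0 + b / a)


def mean_rate(p, h):
    """Equation (10): mean rate of responding, omega * p with omega = 1/h."""
    return p / h


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    a, b, p0, h = 0.2, 0.1, 0.05, 2.0
    for n in (0, 5, 20, 100):
        p = probability_after_trials(p0, a, b, n)
        print(n, round(mean_latent_time(p, h), 4), round(mean_rate(p, h), 4))
    print("asymptotic latent time:", asymptotic_latent_time(a, b, h))
```

With these assumed values the latent time falls from h/p0 toward h(1 + b/a) as n grows, while the rate rises toward (1/h) times a/(a + b).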
The Free-Responding Situation

In free-responding situations, such as that in Skinner box experiments, one usually measures rate of responding or the cumulative number of responses versus time. To obtain theoretical expressions for these relations, we first obtain an expression for the probability p as a function of time. From equation (2), we see that if the response being studied occurs, the change in probability is $\Delta p = a(1 - p) - bp$. We have already assumed that if other responses occur and are not reinforced, no change in the probability of the response being studied will ensue. Hence the expected change in probability during a time interval h is merely the change in probability times the probability p that the response being studied occurs in that time interval
The
of
rate
by the time
as
derivative
of
change
As
the
dt
(12)
hp\
where, as already defined, $\omega = 1/h$ is the activity level. This equation is easily integrated to give p as an explicit function of time t. Since equation
(10) states that the mean

responding, dn/dt, is $\omega$ times
p,
and
is
(13)
and
and
integration
oipo
(13)
have
we
of
initial rate
Fo
the
time
/ is
The
h/a.
responding at
coô, and
long
very
let
of responses,
and
results
0 is
ours
above,
of work
between
is the
Estes'
dependence,
of the final rate
and
upon
of reward
amount
trial.
extend
analysisto give
our
expressions for rates and cumulative
during extinction. Since we
responses
have
assumed
that a
0 during extinction,
in place of equahave
we
tion
(12)
may
after
final rate
number
form as
respectively,have the same
the analogous equations derived by
Estes (2) which were
fitted by him to
data
on
a
bar-pressing habit of rats.
We
Pq{\+u)-\-[\ -^0(1+")"-""'
function
with
cumulative
per
where
linear
equation
agrees
that
the
asymptotic
says
Both
constant.
equations
(15) for rate of responding
which
rate
large, the
equation (15) approach
This
amount
H^
(15)
very
becomes
discussed
dn
/ becomes
probability
The essential difference
the
after the
obtain
we
of
rate
time
time.
(14)
dp
o:p{a{l-p)
e-""')+ e-""']
exponentials in
zero
have
we
(1
probability
Writing this
h.
of
3";
(11)
with time is then this expression
rate
result is
\uit-\+ u)
log [j"o(l
"
1 +
p{a{\-p)-hp].
expected
divided
The
/.
Expected (Ap)
=
time
(16)
dt
[
(14)
dt\t==oc~'i.+u~l
+ bla
dvA
Foo
ci)
which
when
by
Equation (13) is quite
expression obtained
by
for
similar
to
Estes
inclusion of the ratio
per
response,
hence
with
the
follow from
expression
number
do
conclusions
results
for
of responses
reinforcement
of work
amount
Estes'
b and
the
where
the
(17)
Oibpjt
we
extinction
is Ve
cope.
write equation (17) in
ma}/
form
not
dm
V
cumulative
during
multiplied
$p_e$ is the probability at the beginning
The rate at the
of extinction.
(2).
is obtained
beginning of
Hence
per
Oipe
It
decreases with
These
response.
An
and
dm
except
and
the
b/a.
The final rate of responding according
with a and
to equation (14),increases
of reward
hence with the amount
given
our
CO
integrated for p
gives
Ve
(18)
dt
Vebt
continuous
by integrating
equation (13) with respect
to
An
integration of this equation gives
for the cumulative
number
of extinction responses
where
^log [1 +
have
we
VebO
let
^log
(^")(19)
to the empirical
log t,used by Skinner
is similar
result
This
K
equation m
in fitting
experimentalresponse
=
equation
(10).
advantage of passing through the origin
if
of
number
extinction
responses,
limit.
Thus,
m, has no upper
result is correct, and
indeed if
our
equation (21) into equation
(20),we
is
no
of extinction
responses.
all
of the time
for large values
slow
the
to
practical purposes,
is
the logarithmic variation
For
however,
limit
upper
size of the "reserve"
to
justified
We
use
some
t,
arbitrary
"completion" of
criterion for the
1,
niT
extinction.
(1
responses
the
be about
5, and
assumed
ofa
in
responses
data,
unity. Values
0.026
result is shown
to
was
chosen
were
the data.
to
in the
figure.
Fixed Ratio
and
Ratio
Random
Reinforcement
In
from
psychological language,
present day
the term
1,
m.=-log_
Ve
(20)
"fixed ratio" (7) refers to the procedure of rewarding every $k$th response ($k = 2, 3, \ldots$) in a free-responding situation. In a "random
express
of extinction
this "total"
responses,
mr,
explicit function of the number

of preceding reinforcements, n.
The
in
which
only quantity
equation (20)
depends upon n is the rate, Ve, at the
If we assume
beginning of extinction.
that this rate is equal to the rate at
the end of acquisition, we have from
have from
equations (4) and (10)
an
shall
rates
over
now
of
numbers
types
an
"
animal
is rewarded
but the
after k responses
of responses
per reward
the average
on
actual number
varies
"
ratio" schedule,
as
"
fitting equation (23)
The
to
number
the
the ratio Fo/F/
and
0.014
tinction
ex-
estimated
was
be about
to
From
Fmax/F/
ratio
of
number
after 5, 10, 30 and
reinforcements.
90
this criterion is
wish
(23)
")"|
shall consider extinction
of extinction
now
measuring the "total"
be
number
Fol
This result may

be compared with the
data
of Williams
(12) obtained
by
of
We
rF",ax
Fâx
-7
rate
"complete" when the mean
responding V has fallen to some
value, F/. Thus, the "total"
specified
to
obtain
empirical equation is correct,
there
it is
beginning
that the logarithmic

of equation (19) implies that
Skinner's
so
$\omega p_0$ is the rate at the

acquisition. If we now
=
substitute
be noted
total
then
Fo
of
it must.
as
It may
character
the
where
curves
has the additional
Our
and
h'
(22)
"
'max
specified range.
some
We
derive expressions for mean

responding and cumulative
of
responses
of reinforcement
for these
schedules.
two
If
of
equation
apply our operator Q,
(2), to a probability p, and then apply our operator E, of equation (5), to Qp repeatedly for ($k - 1$) times, we obtain
we
"
^^
Tr
at
$(V_{max} - V_0)(1 - a - b)^n \qquad (21)$

$$(E^{k-1}Q)p = (1 - b)^{k-1}[p + a(1 - p) - bp] = p + a'(1 - p) - b'p \qquad (24)$$
ROBERT
R.
20
lO
BUSH
A/ur?76e.r
"Total"
Curve
70
00
so
/^"//7/oro"/?7ey7A3
as
a
=
0.014, F"u"x
/?
of reinforcements.
of the number
function
0.026,
5Fo, V,
Fo.
This equation is identical to our result

for continuous
reinforcement, except
where
that
=
(25)
a\\-{k-\)h^---\â
and
We
the
a'/k replacesa
obtain
may
b'/kreplacesb.
and
a
schedule
Q operates on
the probability that E
l/k). Hence
p IS (\
change in p per response
"
symbol
equal to."
"approximately
means
the
In
approach
would
present
case
be to retain
on
and
operates
the
on
expected
is
Expected (Ap)
tQP
-f (1
After
and
from
\/k)Ep
inserted
equations (2)and (5)are

the
we
simplified,
result
(28)
p.
obtain
equation (28)
Expected (Ap)
-"
P)
bp.
(29)
b'
,
-(l
p)
This
-jp
(27)
result is identical to the
p) -bp.
in
result shown
the fixed ratio
^-(l
bility
proba-
p is l/k and
a'
Ap
lows:
fol-
the
the
ever
throughout; howthe approximationsprovide a link
The
the
with
previous discussion.
approximations on the right of these
two
justifiedif kh is
equations are
the
small compared
to unity. Now
mean
change in p per response will be
the second and third terms of equation
(24) divided by k:
primes
as
the
response,
any
that
The
result for
similar
ratio"
"random
After
exact
no
100
(12).
Williams
from
Data
of
of extinction responses
equation (23) with
number
plotted from
60
-so
AO
30
(27)
and
mate
approxi-
equation (27) for
case.
(29)
Since both tions

equahave
the same
form
as
we
case,
for the
write
the
result for the continuous
our
reinforcement
at
may
mean
equationidentical to equation (13),

that a is replaced by a'/k.
except
Similarly,we
the
to
of responding identical
placed
equation (14) except that a is reis
This
result
meant
by a'/k.
final rate
fixed ratio and
to both
apply
to
expression for
an
ratio schedules
comparing
In
with
dom
ran-
result
random
comparing
those under
different
ratios
of
ment
reinforce-
fixed ratio and
ratios
difficultyof
large). The
are
comparing
no
rates
of comparing rates
(unless both
ratio, nor
under
see
we
under
rates
schedules
various
does not
forcement
reinto
seem
of
of reinforcement.
the above
of
responding under continuous
an
obtain
time,
recovery
meaningfulway
once
of responding
rate
mean
for
be a weakness
our
model, but rather
natural consequence
of the experia
mental
portance
imthe
However,
procedure.
of these considerations hinges

equation
the orders of magnitude involved,
(14) for continuous reinforcement,we
upon
and such questions are empiricalones.
be careful about
must
equating the
âctivitylevel, w, for the three cases
(continuous, fixed ratio and random
Aperiodic and Periodic Reinforcement
ratio reinforcements). Since 1/corepresents
Many experiments of recent years
the minimum
time between
mean
designed so that an animal was
were
it
includes
successive responses,
reinforced
at a rate aperiodicor periodic
both the eating time and a "recovery
in time (7). The usual procedure
time."
the
mean
By the latter we
the
asymptotic
time
or
with
rates
for the animal
necessary
to
organize
re-
itselfafter eating and get in

make
another bar press
position to
key peck.
presumably
In the fixed ratio case,

the animal
learns to look
after each
for food not

as
both
the
are
the
fore
There-
eating time
mean
time
recovery
for the continuous

ratio
In the
case.
it seems
reasonable
random
the
ratio than
mean
reward
would
To
the
was
The
analyze this situation
consider
the
or
random
be lower
of responses
for
per
Moreover,
same.
k, the
first
per
reward,
interval
mean
we
may
of
number
mean
time
to
be
equal
to
sponses
re-
the
multipliedby
responding:
rate
of
T-t:
at
conclude
for fixed ratio when
number
of time
sequence
rewards.
to occur.
to
for either fixed ratio

that
of this set is
rewarded.
dft
be
the activity level,co, would
smaller
for continuous
reinforcement
ratio, and
value
mean
arrangement
intervals between
that
than
have
the actual
as
dom
ran-
expect
similar but smaller difference
Hence,
Tn, which
"
mean
would
one
case,
"
Some
used
and
per response
less for the fixed ratio case
than
mean
"
T.
intervals,
or
k response.
every
Ti,
of time
set
peck, response
after one
of
press
which
occurs
but
ideally
case,
intervals has elapsed is
these time
in the continuous
only after
choose
is to
we
(30)
Tc^p.
Equation (29)for the expected change

is stillvalid
probability
per response
in
if we
as
consider ^ to be
now
variable
expressedby equation(30). Thus,
the time
rate
of change of p is
should
with
increase
expect that co would
the number
of responses
ward,
per reif eating time were
k.
Even
subtracted
expect
Without
out
these
a
in all
cases
arguments
we
should
to
apply.
quantitative estimate
of
(31)
f f;(l-,)-"6,l
=
With
little effort, this differential
equationmay
be
integratedfrom
0 to
for continuous than for all other reinforcement schedules. This deduction is consistent with experimental results (7). However, this is just part of the story. For one thing, it seems clear that it is easier for the organism to discriminate between continuous reinforcement and extinction; we have not handled this effect here.

Summary

A mathematical model for simple learning is presented. Changes in the probability of occurrence of a response in a small time interval are described with the aid of mathematical operators. The parameters which appear in the operator equations are related to experimental variables such as the amount of reward and work. Relations between the probability and empirical measures of rate of responding and latent time are defined. Acquisition and extinction of behavior habits are discussed for the simple runway and for the Skinner box. Equations of mean latent time as a function of trial number are derived for the runway problem; equations for the mean rate of responding and cumulative numbers of responses versus time are derived for the Skinner box experiments. An attempt is made to analyze the learning process with various schedules of partial reinforcement in the Skinner type experiment. Wherever possible, the correspondence between the present model and the work of Estes (2) is pointed out.

REFERENCES

1. Crespi, L. P. Quantitative variation of incentive and performance in the white rat. Amer. J. Psychol., 1942, 55, 467-517.
2. Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57, 94-107.
3. Graham, C., and Gagné, R. M. The acquisition, extinction, and spontaneous recovery of a conditioned operant response. J. exp. Psychol., 1940, 26, 251-280.
4. Grindley, G. C. Experiments on the influence of the amount of reward on learning in young chickens. Brit. J. Psychol., 1929-30, 20, 173-180.
5. Hilgard, E. R., and Marquis, D. G. Conditioning and learning. New York: Appleton-Century Co., 1940.
6. Hull, C. L. Principles of behavior. New York: Appleton-Century-Crofts, 1943.
7. Jenkins, W. O., and Stanley, J. C. Partial reinforcement: a review and critique. Psychol. Bull., 1950, 47, 193-234.
8. Miller, G. A., and Frick, F. C. Statistical behavioristics and sequences of responses. Psychol. Rev., 1949, 56, 311-324.
9. Sears, R. R. Lectures at Harvard University, Summer, 1949.
10. Skinner, B. F. The behavior of organisms. New York: Appleton-Century-Crofts, 1938.
11. Solomon, R. L. The influence of work on behavior. Psychol. Bull., 1948, 45, 1-40.
12. Williams, S. B. Resistance to extinction as a function of the number of reinforcements. J. exp. Psychol., 1938, 23, 506-521.

[MS. Received September 21, 1950]
A MODEL FOR STIMULUS GENERALIZATION AND DISCRIMINATION *

BY ROBERT R. BUSH ¹ AND FREDERICK MOSTELLER

Harvard University ²

* This article appeared in Psychol. Rev., 1951, 58, 413-423. Reprinted with permission.
¹ SSRC-NRC Post-doctoral Fellow in the Natural and Social Sciences.
² This research was supported by the Laboratory of Social Relations, Harvard University, as part of a program of the Laboratory's Project on Mathematical Models. We are indebted to many persons for assistance and encouragement, but in particular to F. R. Brush, C. I. Hovland, K. C. Montgomery, F. D. Sheffield, and R. L. Solomon. We are also grateful to W. K. Estes for sending us a pre-publication copy of his paper (2).

Introduction

The processes of stimulus generalization and discrimination seem as fundamental to behavior theory as the simple mechanisms of reinforcement and extinction are to learning theory. Whether or not this distinction between learning and behavior is a useful one, there can be little doubt that few if any applications of behavior theory to practical problems can be made without a clear exposition of the phenomena of generalization and discrimination. It is our impression that few crucial experiments in this area have been reported compared with the number of important experiments on simple conditioning and extinction. Perhaps part of the reason for this is that there are too few theoretical formulations available. That is to say, we conceive that explicit and quantitative theoretical structures are useful in guiding the direction of experimental research and in suggesting the type of data which are needed.

In this paper we describe a model, based upon elementary concepts of mathematical set theory. This model provides one possible framework for analyzing problems in stimulus generalization and discrimination. Further, we shall show how this model generates the basic postulates of our previous work (1) on acquisition and extinction, where the stimulus situation as defined by the experimenter was assumed constant.

Stated in simplest terms, generalization is the phenomenon in which an increase in strength of a response learned in one stimulus situation implies an increase in strength of response in a somewhat different stimulus situation. When this occurs, the two situations are said to be similar. Although there are several intuitive notions as to what is meant by "similarity," one usually means the properties which give rise to generalization. We see no alternative to using the amount of generalization as an operational definition of degree of "similarity." In the model, however, we shall give another definition of the degree of similarity, but this definition will be entirely consistent with the above-mentioned operational definition.

We also wish to clarify what we mean by stimulus discrimination. In one sense of the term, all learning is a process of discrimination. Our usage of the term is a more restricted one, however. We refer specifically to the process by which an animal learns to make response A in one stimulus situation and response B (or response A with different "strength") in a different stimulus situation. We are not at the moment concerned with, for example, the process by which an animal learns to discriminate between various possible responses in a fixed stimulus situation.

As prototypes of the more general problems of stimulus generalization and discrimination, we shall consider the following two kinds of experiments:

(i) An animal is trained to make a particular response, by the usual reinforcement procedure, in an experimentally defined stimulus situation. At the end of training, the response has a certain strength or probability of occurrence. The animal is then "tested" in a new stimulus situation similar to the training one and in which the same response, insofar as it is experimentally defined, is possible. One then asks about the strength or probability of occurrence of the response in this new stimulus situation and how it depends on the degree of similarity of the new situation to the old stimulus situation.

(ii) An animal is presented alternately with two stimulus situations which are similar. In one, an experimentally defined response is rewarded, and in the other, that response is either not rewarded or rewarded less than in the first. Through the process of generalization, the effects of rewards and non-rewards in one stimulus situation influence the response strength in the other, but eventually the animal learns to respond in one but not the other, or at least to respond with different probabilities (rates or strengths). One then asks how the probability of the response in each situation varies with the number of training trials, with the degree of similarity of the two situations, and with the amount of reward.

We do not consider that these two kinds of experiments come close to exhausting the problems classified under the heading of generalization and discrimination, but we do believe that they are fundamental. Thus, the model to be described has been designed to permit analysis of these experiments. In the next section we will present the major features of the model, and in later sections we shall apply it to the above described experiments.

The Model

We shall employ some of the elementary notions of mathematical set theory to define our model. A particular stimulus situation, such as an experimental box with specific properties (geometrical, optical, acoustical, etc.) is regarded as separate and distinct from the rest of the universe. Thus, we shall denote this situation by a set of stimuli which is part of the entire universe of stimuli. The elements of this set are undefined and we place no restriction on their number. This lack of definition of the stimulus elements does not give rise to any serious difficulties since our final results involve neither properties of individual elements nor numbers of such elements.

We next introduce the notion of the measure of a set. If the set consists of a finite number of elements, we may associate with each element a positive number to denote its "weight"; the measure of such a set is the sum of all these numbers. Intuitively, the weight associated with an element is the measure of the potential importance of that element in influencing the organism's behavior. More generally, we can define a density function over the set; the measure is the integral of that function over the set.

To bridge the gap between stimuli and responses, we shall borrow some of the basic notions of Estes (2). (The concept of reinforcement will play an integral role, however.) It is assumed that stimulus elements exist in one of two states as far as the organism involved is concerned; since the elements are undefined, these states do not require definition but merely need labelling. However, we shall speak of elements which are in one state as being "conditioned" to the response, and of elements in the other state as being "non-conditioned."

On a particular trial or occurrence of a response in the learning process, it is conceived that an organism perceives a sub-set of the total stimuli available. It is postulated that the probability of occurrence of the response in a given time interval is equal to the measure of the elements in the sub-set which had been previously conditioned, divided by the measure of the entire sub-set. Speaking roughly, the probability is the ratio of the importance of the conditioned elements perceived to the importance of all the elements perceived. It is further assumed that the sub-set perceived is conditioned to the response if that response is rewarded.

The situation is illustrated in Fig. 1. It would be wrong to suppose that the conditioned and non-conditioned elements are spatially separated in the actual situation as Fig. 1 might suggest; the conditioned elements are spread out smoothly among the non-conditioned ones. In set-theoretic notation, we then have for the probability of occurrence of the response

$$p = \frac{m(X \cap C)}{m(X)}, \tag{1}$$

where $m(\ )$ denotes the measure of any set or sub-set named between the parentheses, and where $X \cap C$ indicates the intersection of X and C (also called set-product, meet, or overlap of X and C). We then make an assumption of equal proportions in the measures so that

$$p = \frac{m(X \cap C)}{m(X)} = \frac{m(C)}{m(S)}. \tag{2}$$

Fig. 1. Set diagram of the single stimulus situation S with the various sub-sets involved in a particular trial. C is the sub-set of elements previously conditioned, X the sub-set of S perceived on the trial. The sub-sets A and B are defined in the text.

Heuristically, this assumption of equal proportions can arise from a fluid model. Suppose that the total situation is represented by a vessel containing an ideal fluid which is a mixture of two substances which do not chemically interact but are completely miscible. For discussion let the substances be water and alcohol and assume, contrary to fact, that the volume of the mixture is equal to the sum of the partial volumes. The volume of the water corresponds to the measure of the sub-set of non-conditioned stimuli, S − C (total set minus conditioned set), and the volume of the alcohol corresponds to the measure of the sub-set C of conditioned stimuli. The sub-set X corresponds to a thimbleful of the mixture and of course if the fluids are well mixed, the volumetric fraction of alcohol in a thimbleful will be much the same as that in the whole vessel. Thus the fraction of measure of conditioned stimuli in X will be equal to the fraction in the whole set S, as expressed by equation (2). Our definition of p is essentially that of Estes (2) except that where he speaks of number of elements, we speak of the measure of the elements.

We next consider another stimulus situation which we denote by a set S'. In general this new set S' will not be disjunct from the set S, i.e., S and S' will intersect or overlap as shown in Fig. 2. We denote the intersection by

$$I = S \cap S'. \tag{3}$$

We can now define an index of similarity of S' to S by the notation

$$\eta(S' \text{ to } S) = \frac{m(I)}{m(S')}. \tag{4}$$

In words this definition says that the index of similarity of S' to S is the measure of their intersection divided by the measure of the set S'. (Our notation makes clear that we have made the tacit assumption that the measure of an element or set of elements is independent of the set in which it is measured.) Definition (4) also gives the index of similarity of S to S' as

$$\eta(S \text{ to } S') = \frac{m(I)}{m(S)} = \frac{m(S')}{m(S)}\,\eta(S' \text{ to } S). \tag{5}$$

From this last equation it is clear that the similarity of S' to S may not be the same as the similarity of S to S'. In fact, if the measure of the intersection is not zero, the two indices are equal only if the measures of S and S' are equal. It seems regrettable that similarity, by our definition, is non-symmetric. However, we do not care to make the general assumption that (a) the measures of all situations are equal and at the same time make the assumption that (b) the measure of an element or set of elements is the same in each situation in which it appears. For then the importance of a set of elements, say a light bulb, would have to be the same in a small situation, say a 2' × 2' × 2' box, as in a large situation, say a ballroom. Further this pair of assumptions, (a) and (b), leads to conceptual difficulties.

Fig. 2. Diagram of two similar stimulus situations after conditioning in one of them. The situation in which training occurred is denoted by the set S; the sub-set C of S represents the portion of S which was conditioned to the response. The new stimulus situation in which the response strength is to be measured is represented by the set S', and the intersection of S' and S is denoted by I.

The Generalization Problem

We are now in a position to say something about the first experimental problem described in the Introduction. An animal is trained to make a response in one stimulus situation and then his response strength is measured in a similar situation. After the animal has been trained in the first situation whose elements form the set S, a sub-set C of S will have been conditioned to the response as shown in Fig. 2. But part of the sub-set C is also contained in the second situation whose elements form the set S'; we denote this part by $C \cap S'$. From the discussion preceding equations (1) and (2), we can easily see that the probability of the response occurring in S' is

$$p' = \frac{m(C \cap S')}{m(S')}. \tag{6}$$

We now use the assumption of equal proportions so that

$$\frac{m(C \cap S')}{m(I)} = \frac{m(C \cap I)}{m(I)} = \frac{m(C)}{m(S)}. \tag{7}$$

The first equality in this equation follows from the fact that the only part of C which is in S' is in the intersection as shown in Fig. 2. The second equality in equation (7) is an application of our assumption that the measure of C is uniformly distributed over S and so the intersection contains the same fraction of measure of C as does the entire set S. If now we combine equations (6) and (7), we obtain

$$p' = \frac{m(I)}{m(S')} \cdot \frac{m(C)}{m(S)}. \tag{8}$$

From equation (4) we note that the first ratio in equation (8) is the index of similarity of S' to S, while from equation (2) we observe that the second ratio in equation (8) is merely the probability p of the response in S. Hence

$$p' = \eta(S' \text{ to } S)\, p. \tag{9}$$

Equation (9) now provides us with the necessary operational definition of the index of similarity, $\eta(S' \text{ to } S)$, of the set S' to the set S. The probabilities p and p' of the response in S and S', respectively, can be measured either directly or through measurements of latent time or rate of responding (1). Therefore, with equation (9), we have an operational way of determining the index of similarity.

As a direct consequence of our assumption of equal proportions, we can draw the following general conclusion. Any change made in a stimulus situation where a response was conditioned will reduce the probability of occurrence of that response, provided the change does not introduce stimuli which had been previously conditioned to that response. This conclusion follows from equation (9) and the fact that we have defined our similarity index in such a way that it is never greater than unity.

A word needs to be said about the correspondence between our result and the experimental results such as those of Hovland (3). Our model predicts nothing about the relation of the index of similarity defined above to such physical dimensions as light or sound intensity, frequency, etc. In fact, our model suggests that no such general relation is possible, i.e., that any sensible measure of similarity is very much organism determined. Therefore, from the point of view of our model, experiments such as those of Hovland serve only as a clear demonstration that stimulus generalization exists. In addition, of course, such experiments provide empirical relations, characteristic of the organism studied, between the proposed index of similarity and various physical dimensions, but these relations are outside the scope of our model.

We conclude, therefore, that our model up to this point has made no quantitative predictions about the shape of generalization gradients which can be compared with experiment. Nevertheless, the preceding analysis of generalization does provide us with a framework to discuss experiments on stimulus discrimination. In the following sections we shall extend our model so as to permit analysis of such experiments.

The Reinforcement and Extinction Operators

In this section we develop some results that will be used later and show that the model of the present paper generates postulates used in our previous paper (1). We shall examine the step-wise change in probability of a response in a single stimulus situation S. We generalize the notions already presented as follows: Previous to a particular trial or occurrence of the response, a sub-set C of S will have been conditioned. On the trial in question a sub-set X of S will be perceived as shown in Fig. 1. According to our previous assumptions, the probability of the response is

$$p = \frac{m(X \cap C)}{m(X)} = \frac{m(C)}{m(S)}. \tag{10}$$
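The measure-based definitions of equations (2) through (10) can be checked on a small finite example. The Python sketch below is only an illustration under stated assumptions and is not part of the original paper: the element names, the integer weights, and the value of p are hypothetical, and the helper functions `measure` and `similarity` are introduced here for convenience. It evaluates the index of similarity of equation (4) in both directions and the generalized probability of equation (9).

```python
from fractions import Fraction


def measure(elements, weight):
    """Measure of a finite set: the sum of the weights of its elements."""
    return sum(weight[e] for e in elements)


def similarity(s_prime, s, weight):
    """Index of similarity of S' to S, equation (4): m(S n S') / m(S')."""
    return Fraction(measure(s & s_prime, weight), measure(s_prime, weight))


# Hypothetical stimulus elements with integer weights (illustrative only).
weight = {"a": 2, "b": 1, "c": 1, "d": 3, "e": 1}
S = {"a", "b", "c", "d"}       # training situation, m(S) = 7
S_prime = {"c", "d", "e"}      # test situation, m(S') = 5, m(I) = 4

# Assumed probability of the response in S after training, p = m(C)/m(S).
p = Fraction(3, 7)

eta = similarity(S_prime, S, weight)          # eta(S' to S), equation (4)
eta_reverse = similarity(S, S_prime, weight)  # eta(S to S'), equation (5)
p_prime = eta * p                             # equation (9)

print("eta(S' to S) =", eta)
print("eta(S to S') =", eta_reverse)
print("p' =", p_prime)
```

With these made-up weights the two indices differ because m(S) and m(S') differ, which is the asymmetry noted after equation (5); p' cannot exceed p because the index never exceeds unity.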
We now assume that a sub-set A of X will be conditioned to the response as a result of the reward given and that the measure of A will depend on the amount of reward, on the strength of motivation, etc. We further assume that another sub-set B of X will become non-conditioned as a result of the work required in making the response. For simplicity we assume that A and B are disjunct. (The error resulting from this last assumption can be shown to be small if the measures of A and B are small compared to that of S.) We extend our assumption of equal proportions so that we have

$$\frac{m(A \cap C)}{m(A)} = \frac{m(B \cap C)}{m(B)} = \frac{m(C)}{m(S)}. \tag{11}$$

Now at the end of the trial being considered, sub-set A is part of the new conditioned sub-set while sub-set B is part of the new non-conditioned sub-set. Thus, the change in the measure of C is

$$\Delta m(C) = [m(A) - m(A \cap C)] - m(B \cap C) = m(A)(1 - p) - m(B)\,p. \tag{12}$$

This last form of writing equation (12) results from the equalities given in equations (10) and (11). If we then let

$$a = \frac{m(A)}{m(S)}, \qquad b = \frac{m(B)}{m(S)}, \tag{13}$$

and divide equation (12) through by m(S), we have finally for the change in probability:

$$\Delta p = \frac{\Delta m(C)}{m(S)} = a(1 - p) - bp. \tag{14}$$

We thus define a mathematical operator Q which when applied to p gives a new value of probability Qp effective at the start of the next trial:

$$Qp = p + a(1 - p) - bp. \tag{15}$$

This operator is identical to the general operator postulated in our model for acquisition and extinction in a fixed stimulus situation (1). Hence, the set-theoretic model we have presented generates the basic postulates of our previous model which we applied to other types of learning problems (1). When the operator Q is applied n times to an initial probability $p_0$, we obtain

$$Q^n p_0 = p_n = p_\infty - (p_\infty - p_0)\,g^n, \tag{16}$$

where $p_\infty = a/(a + b)$ and $g = 1 - a - b$. In the next section we shall apply these results to the experiment on stimulus discrimination described in the Introduction.

The Discrimination Problem

We are now in a position to treat the second experimental problem described in the Introduction. An animal is presented alternately with two stimulus situations S and S' which are similar, i.e., which have a non-zero intersection.

Fig. 3. Set diagram for discrimination training in two similar stimulus situations, S and S'. The various disjunct sub-sets are numbered. Set S includes 1, 3, 5, and 6; S' includes 2, 4, 5, and 6. The intersection I is denoted by 5 and 6. T, the complement of I in S, is shown by 1 and 3; T', the complement of I in S', is shown by 2 and 4. C, the conditioned sub-set in S, is represented by 3 and 6, while the conditioned sub-set in S' is represented by 4 and 6. Tc is denoted by 3, Tc' by 4, and Ic by 6.
"discrimination
operator,"denoted
by
S which
are
not
in S'
D, which operates on the similarity Tc divided

index r; each time the environmental
The
second
by
the
following the
event
from
e.g., from
reward
changes
another,
response
of event
type
one
to
non-reward.
to
thus alternate
associated
events
with
the operators
of the
Q'. So if tj, is the ratio
and
of I
measure
that
to
of S
of le divided
S and
Our
Drî.
task is to postulate the form
next
nor
pn' [""'
=
For
choose
choice of such
our
mathematical
wish
simplicitywe
have
to
operator
an
which
always decreases 77 ôr holds it

lead to
fixed),but which will never
values
of
negative
Therefore, we
tj.
postulate that
Dr]
where
^ is
in the range
then have
ki],
(25)
which
is
new
parameter
between
and 1. We
zero
"
value
D^riQ
of
v(S' to S)
m(S)
(29)
m(S')
We
shall
consider
now
k^no.
it
reasonable
seems
the
"operant"
the
in
to
m{Tc)
m{I)
m{TJ)
mjC)
(2""o(l D-rio)+ (Q'QyôD-m

[""" (a" ao)g"](l ^"t?o)
Hence, from
/)o
ao
ao'
/So.
is
variation
Côc-(ôo-ô)/"""r7o.
(27)
response
our
final expression for the
of pn, the probabilityof the

in situation S, as a function
of the trial number

is composed
first term
n.
This
of two
major
corresponds to
of the
inspectionof equation (27)
Moreover,
shows
have
measures
'
m{S')
equations (17),(18),and
have
we
(30)
_
~
m(T')
+
This
assume
may
m(Ic)
Pn
same.
assumptions
m{T)
(19),
the
are
Moreover, in view of our

of equal proportions,we
that initially:
m(C)
that
assume
levels of performance
situations
two
(26)
Combining equations (20), (21),(22),

and
(26),we have
special
some
examples for which certain simplifying

assumptions can be made.
(a) No conditioning before discrimination training. If no previous conditioning took place in either S or S',
m(S)
=
ao')g'"](lk'^no)
[âo-(ôo-)8o)/''"V,(28)
Vn
help
operator which represents

transformation
over,
Moreon
ij.
we
of S).
measure
a'
late.
postu-
an
linear
in
a' /(a'-f b'),and g'

a"'
1
is the initial
b',and where t/o'
where
v'
(aJ
experimental
intuition is of much
our
in guiding
by the
m(I)
of the operator D.
find that neither
We
data
(24)
elements
S' (the measure
in S':
"
Tjiî
the
of the symmetry
between
S', we may write for the probability
after the
ith presentation of S, the ratio after

the (i-|-l)th presentation is
S).
to
Because
of the
occurrences
of the
measure
of
of
measure
corresponds
term
the intersection of S and
In
the present problem, we

ing
considerare
alternate presentationsof 5 and S'
and
relative
(the measure
stimulus
equation
The
terms.
the
relative
elements
of
that, except
p^
ax,
and
when
in
like
1,
we
manner
equation (28) for k 9^ 1, we have

In Fig. 4 we
have plotted
aj.
pj
equations (27) and (28)with the above
0.12,
assumptions. The values a
b
W
0.03, /"o
0.05, 170
0.50,
from
0.95
chosen
were
As
can
for these calculations.
be seen,
the proba-
These
is learned.
i.o
describe the
curves
by
general sort
Woodbury for auditory discrimination
in dogs (4).
have
We
argued (1) that the mean
latent
time varies
inversely as the
in
Thus
Fig. 5 we have
probability.
result
of
obtained
p" and pn
exhibit
curves
of
plotted the reciprocals
Fig. 4. These
perimental
general property of the exof
time
on
running
curves
(5)
rats obtained
by Raben
(b) Complete conditioning in S before discrimination training. Another
spein
given
the
same
ao
4.
Curves
of probability,p (in S),
trial number, n, for
p' (in S'), versus
discrimination trainingwithout previous conditioning.
Fig.
and
It
assumed
was
in S
rewarded
was
but
not
that the response

in S'.
rewarded
Equation (27),equation (28),and

p^'
0.05, a
0.12, o'
0.03, ijo
no' 0.50, and k
p^
the values
0,
0.95
6'
were
used.
bilityof the response in 5 is a monoerated

tonicallyincreasing,negatively accelfunction of the trial number,
creases
while the probability in S' first indue to generalization,
but then
decreases
to
zero
as
the discrimination
Curves
of probability,p, and its
FiG. 6.
trial number, n, for the case
reciprocalversus
criminatio
of complete conditioning in S before the distraining. Equation (27)with the
values /".o
0,90,
0.80, k
1, /3co 0, 7/0
0.50 were
used.
and /
=
cial
the
of interest is that
case
set
in which
5 is completely conditioned to
before the discrimination
the response
experiment is performed. In this case,

In Fig. 6 we
have
px/So ^0
ao
tions
plottedpn and \/pn with these condi=
the values
and
2.0
7?o
The
Fig.
5.
Reciprocals of probability,p, of
in S',
in S, and p',of the response
0.80,
curve
of
p"
0.90,
1/p
the
versus
1, i8""
and
/
n
0,
0.50.
is similar
experimental latency
Solomon
(6) from
by
for discrimination
n,
with
rats.
experiment
a jumping
training without previous conditioning. In the model described earlier (1), mean latent time is proportional to the reciprocal of probability. The curves were plotted from the values of probability shown in Fig. 4.

(c) Limiting case of S and S' identical. Another limiting case of the kind of discrimination experiment being
the response
trial number,
versus
in
shape
curve
to
obtained
here obtains when
considered
the
two
into
one
where,
S'
type of partialreinforcement
is refor example, an animal
warded
on
S' is of
I oi S and
S'.
5 and
trial in
second
every
The
situation.
stimulus
both
S and
problem degenerates
The
identical.
make
we
situations
stimulus
fixed
intersection
the
of
measure
of 5.
equal the measure

equation (5),we have
/ must
From
the
(31)
'
ni(S)
while according to
postulate about
equation (26), the
our
D,
operator
similarity index
from
varies
trial to
trial:
For
S and
S' identical,
the above
equations
are
take
I.
cues
depends
that k
way
available.
are
how
on
to
the problem.
The
usual
procedure is to select a block of {j -f-/)
trials during which 5 is presented j
times and S' presented / times.
The
is determined
actual sequence
ing
by draw"6" balls" at random
"5 balls" and

from
"6"
containing j "S
urn
an
and
balls."
I when
none
since / and
Moreover,
are
trials
describe
we
applying Q
its
to
index
ij
by the
the
number
For
mean
of times
number
mean
S'.
to
the
we
have
just argued that for 5

have
t}
I.
Thus
the
of shifts from
of j
of shifts is j.
f,
Since
(Q'Q)%D"'r,i. (36)
(34)
of the analysis exactly allels

parof
that given above for the case
rest
alternate
The
P
to
previously,we applied D tori for each

write for the (i4-l)th
we
pair of shifts,
block of {2j) trials
The
But
determined
specialcase
number
pected
ex-
operand j times, Q'
f times, and
its operand
or
of probability by
(33)
^V-
and S' identical,

we
is
sequence
effective
an
value
new
+
=
This
balls"
repeated throughout training.

In our
describe the
model, we
can
effects on the probabilityof a known
by an appropriate application
sequence
of our
and
D for
Q,
Q',
operators
of
S, presentations of
presentations
S', and shifts from one to the other,
less cumbersome
respectively. A
reasonable
method
provides a
mation
approxifor each
block of (j + j')
:
to
many
identical, the measure of T, the complement of I in S, must be zero. Since Tc is a subset of T, the measure of Tc must also be zero. Therefore, equations (17) and (19) give in place of equation (20)
5
simple
analysis
will handle
we
available for discrimination
are
in such
two
incompatible, unless
Thus, we are forced
that k
assume
(32)
k^VO-
Vn
then
m(I)
"^
the basis of temporal order.

generalization of the above
identical to
course
Thus
PSYCHOLOGY
for the
S'.
presentations of S and
results will
value
be
of k
identical
except
involved
in
the
operator D.
Equation (22) gives us then
Summary
Pn
(Q'Qypo
P.-
(poo po)f".(35)
-
This equation agrees with our previous result on partial reinforcement (1).

(d) Irregular presentations of S and S'. In most experiments, S and S' are not presented alternately, but in an irregular sequence so that the animal cannot learn to discriminate on

A mathematical model for stimulus generalization and discrimination is described in terms of simple set-theoretic concepts. An index of similarity is defined in terms of the model but is related to measurements in generalization experiments. The mathematical operators for acquisition and
extinction, discussed in an earlier paper (1), are derived from the set-theoretic model presented here.
ROBERT R. BUSH AND
HOVLAND, C.
The
The
I.
conditioned
applied
the
to
stimulus
on
of
analysis
1937,
17,
received
II.
periments
ex-
discrimination.
October
/.
C.
Psychol.,
B.
by
of
R.
R.,
"
Mosteller,
model
Psychol.
2.
ESTES,
of
W.
Rev.,
K.
learning.
94-107.
F.
for
1951,
Toward
Psychol.
58,
matical
mathe-
simple
rat's
in
Rev.,
1949,
Solomon,
313-323
statistical
theory
1950,
measured
/.
response.
learning.
57,
42,
R.
L.
56,
"
physiol.
of
Latency
learning
discrimination.
1943,
by
comp.
nation
discrimiof
intensity
running
Psychol.,
254-272.
of
measure
ulus
stimcamp.
29-40.
white
differences
illumination
of
J.
dogs.
35,
The
W.
279-291.
51,
learning
References
Bush,
eralization
gen-
responses:
1937,
The
1943,
M.
Raben,
1.
gen.
The
125-148;
Psychol.,
genet.
Woodbury,
1950]
13,
of
/.
conditioned
patterns
[MS.
I.
is
of
finally
generalization
responses:
Psychol.,
model
299
MOSTELLER
FREDERICK
Amer.
422-432.
in
response
'single
J.
as
door'
Psychol.,
TWO-CHOICE BEHAVIOR OF PARADISE FISH *
ROBERT R. BUSH AND THURLOW R. WILSON
Harvard University
Our
individua
S's. In general, most
Ss
problem stems
principallyfrom
served in a contingent experiment are
found
two
experiments. Brunswik
(1) obthe
bution
acquisition of a position to have an asymptotic choice distrifood was
discrimination
of
the
selection
of
by rats when
100%
in
box.
favorable
gent
Noncontinalternative.
placed more
frequently
one
Research
situations
by Humphreys
(9) was
give rise to other kinds
of choice distributions;in such expericomparable in that S had two choices
ments,
with
of
reinforcement
both.
the
partial
asymptotic proportion of
He required college students
choices
of the
favorable
alternative
to
guess
trial whether
match
the proon
not
to
or
a
light has been observed
every
portion
would
scheduled
of reinforcements
flash, and then in accordance
with
for the alternative.^
a
predetermined schedule, the
did
flash.
The
We
the nonlight did
not
or
attempted to obtain
Ss.
Humphreys
study exemplifies a noncontingent results with nonhuman
for
confronted
two-choice
Red
fish
contingent procedure
were
paradise
by
with partial
learning since the flash of the light a position discrimination
did not
the choice made
side was
in which
reinforcement
depend upon
one
random
and
the
other
Brunswik's
faced a conrats
correct
by S.
tingent
a
75%
situation
since
the
side correct
for the remaining 25%.
mental
environdiscrimination
change, presentation of food, The
was
a
apparatus
box with adjacent goal compartments.
was
contingent in part on S's response.
A contingent two-choice
For the experimental Ss, E placed the
research
on
humans
has been performed by Goodin
the
food
correct
compartment
Ss decided regardless of whether
entered
now
S had
(2, pp. 294-296). Her
trial
which
the
the
division
of
on
two
correct
goal box;
every
buttons
If the choice
between
the
to
two
was
was
goal boxes
press.
for
the
they earned
a
experimental
poker chip, transparent
correct,
otherwise
learning
has
Human
not.
with
been
two-choice
partial reinforcement
further
observed
Bush
and
Mosteller
two
associated
of
types
with
(2) suggest
procedures are
different
forms
food
when
under
"contingent procedure (3) and under

noncontingent procedure (3, 4, 5, 6,
7, 8, 10).
that these
the
see
that
so
group
these
in the
they
The
Ss
correct
had
control
able
were
to
ment
compart-
chosen
rectly.
incorwas
group
run
separatingthe
in order to produce
goal compartments
conditions
comparable to those used
with
by
an
opaque
divider
Brunswik.
of asymptotic choice distribution (choice distribution after learning) for

This research was supported by the Laboratory of Social Relations, Harvard University. We are indebted to W. S. Verplanck for suggesting that we use fish in learning experiments and to F. Mosteller for numerous suggestions and criticisms.

Besides contingent and noncontingent procedure, other kinds of factors, such as a gambling versus a problem-solving orientation, have been related to asymptotic choice distribution (6). We shall not deal with these other factors.

This article appeared in J. exp. Psychol., 1956, 51, 315-322. Reprinted with permission.

Theory

We attempt to describe the experimental data within the framework of the
ROBERT
On
Mosteller
(2).
1,2,.. .) there
that
each
trial and
of
of four
One
side.
the
that
to
favorable
more
events
leads
As
pn+i-
assume
choose
will
this
on
occurs
different value
in
similar
the
effect
analyses,
we
feeding
of
is
lyzing
previously used for anatwo-choice
experiments using the
from
contingent procedure is obtained
the above table by imposing the further
The
model
restriction
that
This
1.
ai
event
an
301
WILSON
R.
the
symmetrical for
two
goal boxes; we
assumption for nonthat a
feeding. In addition, we assume
side
of
on
one
feedings
long sequence
the probability of
make
would
tend
to
special assumptions
going there unity. These
the general model
reduce
to
about
the following statements
pn+i-
make
similar
information
This
schedules.
model
is
by Bush
equivalent to
and
Estes
Mosteller
and
(2)
by
(4) for
human
with
the
experiments
describing
the models
used
non-contingent procedure.
The
other specific model
tion
assump-
implies that nonfeeding is
THURLOW
AND
and
given by Bush
trial " (where k
0,
exists a probability /""
model
Stochastic
BUSH
R.
for
the
perimental
ex-
the
group,
bilities.
probais
obtained
It was
secondary reinforcement model,
expected that this model
X
1
from
the additional
restrictions,
describe learning by the control
would
that
This
model
and Q!2 " ai.
assumes
in the present
experiment. Given
group
S enters
when
one
goal box and sees food
be shown
that
this specific model, it can
in the other
goal box it is secondarily
the asymptotic p for each S will be either
the response
reinforced
for
just made.
1.0 or
0; for the 75:25 schedule it is
which
does
herein
alter the response
not
called
predicted that
will tend
towards
depends
We
the
exact
of
value
of
the
models
ai.
for the
present
p of 1
towards
of additional
that
than
of
the values
We
The
i's will
tend
precise
portion
pro-
1 depends
towards
and
ai
more
0.
about
the
dicts
pre-
asymptotic
an
on
ai-
with
chiefly concerned
are
tions
restric-
this model
S will have
0 and
or
that
shown
each
that tend
foregoing table by imposing

sets
been
that
periment.
ex-
obtained
are
It has
centage
per-
specificmodels
group
different
two
the
upon
These
from
The
1.0.
two
propose
experimental
of i's
high percentage
of the
forms
dictions
preptotic
asym-
of the
of choices
distributions
suggested by two
side.
favorable
These
predictionscould
different theories of learning. The first
be
tested
experimentally by running
formation
specific model, herein called the intaining
trials in the experiment and obmany
by taking
model, is obtained
for
of
choices
each
a
proportion
X
and
0.
As
a
result, the
ai
ai
the last 100 trials. The
S during, say,
in the forefirst and fourth listed events
going
thus
obtained
would
form
proportions
table have the
effect on
in
which
turn
are
p";
same
they correspond
the favorable
and
third
to
side.
listed
food
being placed on
Similarly, the second
have
events
effect; they correspond

placed on the unfavorable
to
the
food
side.
same
Karlin
These
each
trial may
providing information
be
about
described
(11)
of
models
could
be
is very
many
experiment
distributions
slow.
to
obtain
In view
we
are
that
suggests
the
trials would
as
the payoff
which
distribution
being
restrictions appear
arise most
to
readily
from a cognitive learning point of view,
because
compared
with the predicted ones.

Unfortunately,
the mathematical
analysis presented by
forced
the
of
vergence
con-
these
Therefore, a great
required in the
be
the desired
bution.
distri-
of these considerations,
examine
the "nearto
302
predicts that
model
clustered
will be
distribution
mation
infor-
The
distributions.
asymptotic"
such
around
START
BOXES
STARTGATE
the
.75, whereas
model
predicts
that it will be U-shaped with a peak near
smaller peak near
0.
somewhat
1 and
a
The model for the control group (α₂ = 1) also predicts a U-shaped near-asymptotic distribution, but the peak near 0 should be very small compared to that for the secondary reinforcement model. These predictions are compared with data
below
just
point
GOAL
DIVIDER
CHAMBER
"/"\z"
reinforcement
secondary
Fig.
Sketch of the discrimination
apparatus.
below.
Method
suction
by
Subjects. The Ss were 49 red paradise fish, 27 in the control group and 22 in the experimental group. The red paradise fish (Macropodus opercularis) is a hardy tropical fish about 2 in. in length selected because of its small demands for care. The Ss were housed separately in
"
with
tanks
This
water
the
was
turned
each
day
fish has
diurnal
Apparatus.
box
of
in
J-in.
white
1"F.
by
our
matically
auto-
were
12-hr. period
activity cycle. (This

of
activity.)
apparatus
for parts of the

had
group
the
The
"
which
standard
rhythm
shown
as
constructed
except
for
on
control
to
indicated
"
appetite. Lighting
fixtures
fluorescent
by
of 80"
temperature
temperature
for maximum
feedingstudies
was
were
The
maze
white
opaque
nation
discrimi-
was
Fig. 1.
control
divider,whereas
opaque
side of the other

could
box
was
white
opaque.
These
interchanged. (Exploratory studies indicated that a position discrimination with identical goal boxes is learned very slowly by these fish.) The apparatus was placed in a 10-gal. tank shielded from room lights. Lighting came largely from a 75-w. spotlight 2 ft. above the
sides
and
maze
was
taken
be
focused
to
that
experimentaltank
those
of the home
the
on
ensure
were
start
water
as
Care
of this
close
tanks of Ss.
as
possibleto
to
mm.
the
in
mm.
opening of
the
not
made
an
In the
eat
in
error
Procedure.
All
"
apparatus
Ss
received
75%
on
for
scheduled
trials.
On
were
that
than
two.
The
runs
selected
of Incorrect
All fish had
right,
yellowside
was
140
goal was
remaining
goal box was
the
one
restricted
by
The
could
the
of
the favorable side
of 20.
blocks
total
goal box (the
on
trials for which
within
was
for reinforcement
given trial only
incorrect
was
because
the other
reinforcement
The
correct.
trials while
One
20 trials a day or less.

trials,
scheduled
favorable side) was
of the
or
procedure.)
same
domization
ran-
restriction
not
be
longer
schedule.
favorable for about one-fourth of the Ss; right, white for one-fourth; left and yellow for one-fourth, and left and white for one-fourth.
The procedure for the control group was as follows. The fish was released from the start chamber, and it swam down to the goal boxes.
experimental food was
pared
preinto the goal box which
If the fish poked Its nose
an
inexpensive(10 cents an
E lowered
for that trial,
correct
a medicine
("Lumpfish caviar" packed by
was
the
Hansen
with
fish
into
Caviar Co., New
a
York, N. Y.). These
dropper
compartment
egg
found
food of
secured to an arm) to allow
to be a highlypreferred
(the dropper was
eggs were
If the fish entered the Incorrect
the paradisefish and were
convenient
the fish to feed.
to obtain
In
and
The
store.
were
presented singly; goal box, no food was placedin the goal box.
eggs
fish
chased
he
held
the
into
the
the
end
of
medicine
back
either case,
t
was
on
a
dropper
egg was
Feeding.
"
The
chamber.
conditions
larger than
"
for
this divider was

parent.
transexperimentalgroup
For one
goal box the side oppositethe
formed
from a pieceof
to the box
entrance
was
lightyellow plastic;the corresponding
opaque
the
was
egg
dropper). To secure the egg, the fish was obliged to pull it from the dropper. A fish was required to earn all of its food by solving the discrimination problem.
Pretraining. The pretraining took two or three days. On the first day the fish was fed eggs (10 or 20) by eye dropper in its home tank. For the next one or two days the fish underwent forced trials (10 or 20) in the maze. Half of the forced trials were to the right-side goal box. About one-third of the fish were rejected from the experiment at the end of pretraining or after one or two days of discrimination training, leaving 49 Ss. (Fish were rejected because they
Plexiglas, would
The
goal boxes.
was
(the
and
diameter
fish eggs
ounce) caviar
from
predict the
of
relative
of
groups
Ss
framework
of
the
learning are
determined
from
the
the
of
values
be
must
The
mated
esti-
models
other
considered
data
Within
models,
by the
data.
predict,however,
different
rates
which
parameters
learning
under
run
conditions.
experimental
of
of
rates
do
properties of
the following
in
sections.
The
The
near-asymptoticdistributions.
for the experimodels
specific
mental
"
two
group
and
model
140
120
100
80
60
about
the
of
Fig.
2.
parallelthe
is
for
22
the
of the
the
The
spread
of
152-160)
pp.
from
likelihood
maximum
used
was
the data, giving .7

The
during 10
The
distribution
trials
results
ratio
gives
test
considered
be
shown
are
of Table
column
of fish.
control
than
little
the
goal
box
slowed
Just
can
sight of
when
down
how
this
determined
analysisof
We
the
be
can
One
hasten
models
food
was
the
learning
only by
the
to
we
two
the
this
conjecture,of course,
food in the opposite
obtained
not
about
comes
a
more
process.
can
be
detailed
data.
note
described
at
model
this
above
point that
do
between
is
the
ber
num-
on
the
.22.
The
first
formation
in-
predicts a clustering
TABLE
Distribution
was
but
from
inferred
the
last
rapidly
group,
second
show
we
choices
trials
10
last
Trials
AND
Side)
for
for
not
Successes
of
Group
(Choices
During
Two
the
the
Parallel
that
more
experimental
more
figure.
that
and
in
of the
is clear
learned
group
the
It
successes
of 10 trials for each
groups
of favorable
Favorable
Fig.
"
coefficient
satisfactory.
In
Learning curves.
show
the proportion of
blocks
fit
the
In
Table
likelihood
This
.4.
of
learning is
frequenciesof successes
during the last
49
of successes
trials (the number
varies from 0 through 49). The
served
obtribution
U-shaped near-asymptotic disis not
determined
by initial
the
rank-order
relation
corpreferences alone;
mate.
esti-
the
butions
distri-
the
after
successes
computed.
the
tions
predic-
of
shape
successes
in
1 and
P
the
as
of
then
can
(12,
estimate
to
different
complete.
portion
pro-
side is plotted
block of 10 trials.
method
nearly
column
Mean
of the favorable
which
determines
parameter
the
distribution.
of
two
stat-fish which
experimental group.
of choices
for each
for each
Learning curve
of fish and
groups
very
model
reinforcement
secondary
make
"
TRIALS
the information
"
the
22
the
of
the
Groups
Stat-Fish
Experimental
Real
Fish
of
Last
of
Fish
Which
the
49
ROBERT
around 37, but this prediction is clearly not confirmed by the experimental group data. The secondary reinforcement model, on the other hand, predicts a U-shaped distribution with greater density at the high end than at the low end. This prediction is confirmed. On this basis alone we can choose the secondary reinforcement model in favor of the information model. Detailed questions of goodness of fit are considered in the following sections.
The
the control
assumption that
nonreward
has no effect (ao
1) and
it predictsthat the near-asymptotic
the
involves
group
the model
to the data in a detailed way.

that
the control group,
we
assume
the same
mate
model appliesand then esti-
For
both
whether
=
a2
of
U-shaped
but with
the
end.
at
low
small
very
with the data shown
density
indeed
This
agrees
of 27 fish stabilized
at
side
2 ; one
out
the unfavorable
during the
side 46 times
other
the
favorable
the
trials
a-z
yet
In
run.
the basic
in the
1 made
on
estimates.
three
of
secondary
we
experimental group,
the primary reward
estimate
for
model
need
parameter,
than
of
(the smaller the
Estimates
FOR
Each
of
the
of
Two
the
is
nonreward
for the control

ai
as
control
fact that it is not
does
control
1.0
near
for the
model
reward
the
pectation
ex-
large
the
group,
in the
assumed
but
group,
the
that
quite 1.0 suggests
slightlyreinforcingeven
The
group.
result that
is less for the
experimentalgroup than
(primary reward
group
is
not
effective)
predictedby any
more
control
of the models.
The
effects of
relative
as
follows.
942
We
this
and
is
reward
secondary
primary
for each group
secondary reward
and
be
can
that
note
that
means
about
60%
as
for the
effective
as
primary reward
primary and
Similarly,
in this experiexperimental
(.956)-^^
group.
ment
value
of a,
the
.986,and
Parameters
Groups
Obtained
of
Fish
so
is about
secondary reward
effective for the
These
Two
is
a2
than
event
the
For
value.)
of
confirms
that primary reward

is more
(A small value of a impliesa
effective
more
30%
TABLE
This
ai.
effective.
(915).69
effects
of 10
in
used
are
correspondingprimary
the
parameter,
a^.
relative
butions
distri-
block
that
estimated
the
observed
be
shown in Table 3. It can
the secondary reward
rameter,
pais larger for both groups
a2,
are
noted
a\, and
two
scribed
de-
the first
uses
conjunctionwith
for moments
of the p-value
formulas
and
distributions derived
by Bush
however,
Mosteller (2, p. 98). The results,
to
the secondary reward parameter,

These
estimates are
required for
reasons:
{a) we wish to measure
It
in each
successes
be
cannot
of the
moments
for the
Having chosen
"
reinforcement
the
to
parameters
in detail here.
value
for the control group.
Parameter
the
not
consider
we
that
did
or
the
during
assumption
model
the
side
section
next
last 49 trials.
26 fish either stabilized
The
stabilize
it chose that
"
estimate
procedure used
in the last column
of Table
that
reward
will also be
successes
assumption
1 is tenable.
The
two
determine
and
parameters
the
not
or
distribution
and {b) the estimates

effect),
measuring goodness of fit ot
greater the
used in
are
these
trials;
proposedfor
model
305
WILSON
R.
considered
of
in the
THURLOW
AND
percentages
because
may
control
be
in
group.
error
of the sampling
in the parameter
estimates,but
indicate roughly the effects.
preciably
aperrors
they
do
Stat-fish. A convenient way of comparing model predictions with data is to run Monte Carlo computations or "stat-fish" as described elsewhere (2, pp. 129-131, 251-252). One hundred runs of 140 trials each were carried out on IBM
machines^
using the
Table
given in
sample of
the
22
initial
stratified
such
drawn
was
runs
TABLE
Comparison
Data
that
22
probabilities
symmetric beta
the
with
from
Experimental
the
Fish
Statistics Computed
of
for
of
distribution
approximate
would
experimental
these 100 runs,
From
group.
values
parameter
for the
and
from
Obtained
from
the
Group
of
Sequences
the
22 Stat-Fish
the
the parameter
.7.
s
distribution
These 22 stat-fish can then be compared directly with the 22 paradise fish in the experimental group. The "learning curve" of the stat-fish is shown in Fig. 2 along with those of the real fish.
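A stat-fish run of the kind described here can be sketched as a small Monte Carlo routine. The sketch below is an illustration under stated assumptions, not the computation carried out on the IBM machines: the linear operator forms follow the secondary reinforcement assumption that both reward and the sight of food in the other goal box strengthen the response just made, the parameter values `alpha1 = 0.92` and `alpha2 = 0.94` are purely illustrative, and the initial probabilities are drawn from a symmetric beta distribution whose parameter `s = 0.7` is likewise an assumption.

```python
import random


def run_stat_fish(p0, alpha1, alpha2, n_trials=140, pi=0.75, rng=None):
    """Simulate one stat-fish; return its list of choices and final p.

    p is the probability of choosing the favorable side. alpha1 applies
    after a rewarded choice, alpha2 after a nonrewarded choice (assumed
    operator forms for the secondary reinforcement model).
    """
    rng = rng or random.Random()
    p = p0
    choices = []
    for _ in range(n_trials):
        chose_favorable = rng.random() < p
        food_on_favorable = rng.random() < pi  # 75:25 schedule
        rewarded = chose_favorable == food_on_favorable
        alpha = alpha1 if rewarded else alpha2
        if chose_favorable:
            p = alpha * p + (1 - alpha)  # move p toward 1
        else:
            p = alpha * p                # move p toward 0
        choices.append(1 if chose_favorable else 0)
    return choices, p


def successes_in_last(choices, n_last=49):
    """Number of favorable-side choices during the last n_last trials."""
    return sum(choices[-n_last:])


def simulate_group(n_fish=22, s=0.7, alpha1=0.92, alpha2=0.94, seed=0):
    """Run n_fish stat-fish with symmetric-beta initial probabilities."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_fish):
        p0 = rng.betavariate(s, s)
        choices, _ = run_stat_fish(p0, alpha1, alpha2, rng=rng)
        results.append(successes_in_last(choices))
    return results
```

Tabulating the values returned by `simulate_group()` gives a near-asymptotic distribution of successes that can be set beside the observed one. Setting `alpha2 = 1.0` corresponds to the assumption that nonreward has no effect.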
fish
It
is
curve
the
be
can
seen
between
it is
model
the
the model
curve
parameters
for
discrepancy
of
Rather
how
well
estimated
were
stat-
should
the data.
and
indication
some
the
the
This
experimental group.
be
interpreted as
not
that
slightlyabove
the
from
mates
Loosely speaking, the estiobtained
by requiring that
the learning rates
of the model
lation
popuand
of the experimental sample
be equal. To
measure
goodness of fit
look at other properties of the
we
must
data.
were
these
All
chose
the
others
chose
four
The
of the 22 stat-fish
in the
The
distribution
near-asymptotic
same
manner
results
of Table
shown
are
for the real fish.
as
in the third column
the
that
for goodness
consider
we
of fit would
side
and
two
each.
once
initial
These
bilities
proba-
success
real fish is five.
.85, respectively.
of failures
This
suggests
have
been
would
agreement
if the initial distribution of bilities
probahad
had
less density in the
the
extremes;
formal
be
only
never
better
that
are
group
tests
by
found
close to the
sufficiently
corresponding frequenciesof the experimental
2 and
it
of .95, .95, .85, and

The smallest number
of
obtained
was
unfavorable
of
result
are
of the stat-fish
two
stat-fish had
data.
successes
discrepancies
the fact that
the
to
fluous.
super-
beta
symmetric
only as
used
was
distribution.
initial
true
bution
distri-
mation
approxi-
an
Furthermore, learning during the first 10 trials tends to spread out the distribution of response probabilities and so the true initial distribution probably had less variance than the symmetric beta
Many sequential properties of the data can be compared to the corresponding properties of the stat-fish "data" in order to obtain further measures of goodness of fit. Thus we have tabulated the distribution of
(of successes
for the experimental group
failures)
for the
the
stat-fish.
and
mean
In
and
show
we
of the total number
SD
of the number
runs,
Table
of
distribution
of
The
given
distributions
stat-fish
manner
lengths,as
well
S.
per
but
of
one
the
as
It
the
be
can
slightlysmaller for the

the stat-fish,
and
that
these
'
We
measures
are
number
tabulated
to
of
cesses
suc-
that
seen
be
can
real fish than
B. P. Cohen
and
Seymour for making these computations.
of iSs.
The
and
we
so
the
putations.
com-
compare
distributions
are
statistics listed
fish.
P. D.
greater
the
that
much
of the data.
the
same
two
groups
not
normal
the Mann-Whitney
used
(13). Comparison
values
fish and
in
compared
to
statistics
real
of each
of the
for
the variabilityof
real
used
as
of
for the
all
are
means
is less for the
indebted
Table
in
of various
runs
stat-fish
the
in
used
and
runs
in
than
Table
.3.
model
led
Thus,
we
test
seven
to
clude
con-
adequately
scribes
de-
of the fine-grain character
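The Mann-Whitney comparison of real fish and stat-fish mentioned above rests on a simple rank statistic. The following sketch computes the U statistic directly from its definition; the function name is hypothetical and the scores used in any call are for the reader to supply, since the tabulated statistics are not reproduced here. Significance levels would still have to be read from a table or computed separately.

```python
def mann_whitney_u(xs, ys):
    """Return U for xs: the number of (x, y) pairs with x > y, ties counting 1/2."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def mann_whitney_both(xs, ys):
    """Return (U_x, U_y); the two values always sum to len(xs) * len(ys)."""
    u_x = mann_whitney_u(xs, ys)
    return u_x, len(xs) * len(ys) - u_x
```

A value of U near half of len(xs) * len(ys) indicates that neither sample tends to exceed the other, which is the outcome one hopes for when the stat-fish are meant to resemble the real fish.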
ROBERT
R.
BUSH
AND
THURLOW
Summary
307
WILSON
R.
multiple-choice
behavior.
/.
chol,
Psy-
exp.
1955, 49, 97-104.

A
two-choice
with
Ss
of
each
Ss
The
two
experiment
choice
49
were
conventional
the
given
were
the
Both
maze.
75%
side
Two
of
of
placed
the
on
that
predicts
the
side
of
The
data
the
reward
detailed
and
box
predicts
100%
the
choice
Hake,
which
9.
H.
W.,
of
binary
the
between
and
model
It
then
is
reinforcement
much
of
the
of
with
differing
/.
exp.
1-5.
R.
of
Perception
of
structure
L.
random
/.
G.
series
Psychol,
exp.
M.
E.
11.
fine-grain
S.
data.
in
alternative
Some
random
models
learning
learning
and
the
serial
symbols.
/.
1951, 41, 291-297.
Psychol,
Karlin,
/.
294-301.
effect
of
anticipation
exp.
tinction
ex-
in
conditioning.
to
Probability
recency
and
expectations
1939, 25,
Psychol,
Jarvik,
Acquisition
verbal
analogous
negative
cluded
con-
model
ation.
situ16-22.
Hornseth,
"
reinforcement.
symbols.
Humphreys,
exp.
10.
predictions
made.
are
secondary
describes
data
bility
Proba-
extinction
Hyman,
"
statistical
W.,
and
response
of
situation
ure
meas-
secondary
H.
1951, 42,
the
/.
1953, 45, 64-74.
secondary
and
primary
results
the
the
and
8.
sumes
as-
goal
opposite
made
approach
from
comparisons
of
the
hand,
Parameters
of
experimental
structure
"secondary
L.
1955, 49,
Psychol,
conditioned
Psychol,
in
theory.
problem-solving
Acquisition
model
proaches
ap-
sis
Analy-
situation
Postman,
"
A., Hake,
D.
P.
verbal
was
choices
other
support
estimated
that
J.
of
model.
adequately
Grant,
other.
effectiveness
are
This
The
the
in
obtained
reinforcement
food
of
just
will
7.
L.
New
H.
J.
learning
in
exp.
percentages
on
food
fish
the
or
trial.
an
choosing
which
on
all fish.
response
individual
one
for
iish
distribution
.75
sight
that
that
side
model,"
reinforces
assumed
of
preceding
about
of
the
the
reinforcement
model"
R.
"
processes.
1954, 47, 225-234.
learning
/.
in
theory.
Coombs,
conditioning
J. J.,
cussed.
dis-
are
certain
un-
1954.
Psychol,
exp.
H.
statistical
GooDNOw,
the
predicting
group
probability
trial
particular
for
of
C.
Straughan,
"
verbal
terms
the
K.,
in
association
Decision
Wiley,
W.
of
one
side
Thrall,
(Eds.),
Estes,
behavior
interpretation
an
statistical
M.
R.
of
on
Individual
of
York:
6.
experimental
the
sides
other
K.
Davis
5.
time.
"information
in
the
on
In
the
mental
experi-
both
on
W.
terms
observe
to
rewarded
models
the
The
increment
and
the
stochastic
behavior
on
time
25%
food
were
groups
the
of
remaining
the
Estes,
situations:
into
with
run
whereas
of
4.
comes
out-
divided
were
opportunity
an
absence
or
presence
the
is described.
fish
Ss
procedure
Ss
trial
paradise
control
provide
to
about
each
on
red
the
groups;
designed
information
complete
I.
walks
arising
Pacific
J.
in
Math.,
1953, 3, 725-756.
12.
References
Mood,
A.
M.
Introduction
the
to
York:
New
statistics.
theory of
McGraw-Hill,
1950.
1.
Brunswik,
of
rat
E.
Probability
behavior.
/.
exp.
as
determiner
Psychol.,
13.
1939,
25, 175-197.
2.
Bush,
R.
quantitative
(Ed.),
R.,
"
Mosteller,
Stochastic
F
.
models
New
for learning.
York
Wiley,
F.,
Mosteller,
"
Bush,
techniques.
Handbook
Cambridge,
of
Mass.
In
social
Detambel,
M.
H.
test
of
model
for
(Received
May
26,
G.
LIndzey
psychology.
Addison-Wesley,
1954.
1955.
3.
Selected
R.
R.
1955)
TOWARD A STATISTICAL THEORY OF LEARNING
BY WILLIAM K. ESTES
Indiana University
* For continual reinforcement of his efforts at theory construction, as well as for many specific criticisms and suggestions, the writer is indebted to his colleagues at Indiana University, especially Cletus J. Burke, Douglas G. Ellson, Norman Guttman, and William S. Verplanck.

This article appeared in Psychol. Rev., 1950, 57, 94-107. Reprinted with permission.

Improved experimental techniques for the study of conditioning and simple discrimination learning enable the present day investigator to obtain data which are sufficiently orderly and reproducible to support exact quantitative predictions of behavior. Analogy with other sciences suggests that full utilization of these techniques in the analysis of learning processes will depend to some extent upon a comparable refinement of theoretical concepts and methods. The necessary interplay between theory and experiment has been hindered, however, by the fact that none of the many current theories of learning commands general agreement among researchers. It seems likely that progress toward a common frame of reference will be slow so long as most theories are built around verbally defined hypothetical constructs which are not susceptible to unequivocal verification. While awaiting resolution of the many apparent disparities among competing theories, it may be advantageous to systematize well established empirical relationships at a peripheral, statistical level of analysis. The possibility of agreement on a theoretical framework, at least in certain intensively studied areas, may be maximized by defining concepts in terms of experimentally manipulable variables, and developing the consequences of assumptions by strict mathematical reasoning.

This essay will introduce a series of studies developing a statistical theory of elementary learning processes. From the definitions and assumptions which appear necessary for this kind of formulation, we shall attempt to derive relations among commonly used measures of behavior and quantitative expressions describing various simple learning phenomena.

Preliminary Considerations

Since propositions concerning psychological events are verifiable only to the extent that they are reducible to predictions of behavior under specified environmental conditions, it appears likely that greatest economy and consistency in theoretical structure will result from the statement of all fundamental laws in the form R = f(S), where R and S represent behavioral and environmental variables respectively. Response-inferred laws, as for example those of differential psychology, should be derivable from relationships of this form. The reasoning underlying this position has been developed in a recent paper by Spence (8). Although developed within this general framework, the present formulation departs to some extent from traditional definitions of S and R variables.

Many apparent differences among contemporary learning theories seem to be due in part to an oversimplified definition of stimulus and response. The view of stimulus and response as elementary, reproducible units has always had considerable appeal because of its simplicity. This simplicity is deceptive, however, since it entails the postulation of various hypothetical processes to account for observed variability in behavior. In the present formulation, we shall follow the alternative approach of including the notion of variability in the definitions of stimulus and response, and investigating the theoretical consequences of these definitions. It will also be necessary to modify the traditional practice of stating laws of learning in terms of relations between isolated stimuli and responses.

Attempts at a quantitative description of learning and extinction of operant behavior have led the writer to believe that a self-consistent theory based upon the classical S-R model may be difficult, if not impossible, to extend over any very wide range of learning phenomena without the continual addition of ad hoc hypotheses to handle every new situation. A recurrent difficulty might be described as follows. In most formulations of simple learning, the organism is said originally to "do nothing" in the presence of some stimulus; during learning, the organism comes to make some predesignated response in the presence of the stimulus; then during extinction, the response gradually gives way to a state of "not responding" again. But this type of formulation does not define a closed or conservative system in any sense. In order to derive properties of conditioning and extinction from the same set of general laws, it is necessary to assign specific properties to the state of not responding which is the alternative to occurrence of the designated response. One solution is to assign properties as needed by special hypotheses, as has been done, for example, in the Pavlovian conception of inhibition. In the interest of simplicity of theoretical structure, we shall avoid this procedure so far as possible. The role of competing reactions has been emphasized by some writers, but usually neglected in formal theorizing.

The point of view to be developed here will adopt as a standard conceptual model a closed system of behavioral and environmental variables. In any specific behavior-system, the environmental component may include either the entire population of stimuli available in the situation or some specified portion of that population. The behavioral component will consist in mutually exclusive classes of responses, defined in terms of objective criteria; these classes will be exhaustive in the sense that they will include all behaviors which may be evoked by that stimulus situation. Given the initial probabilities of the various responses available to an organism in a given situation, we shall expect the laws of the theory to enable predictions of changes in those probabilities as a function of changes in values of independent variables.

Definitions and Assumptions

1. R-variables. It will be assumed that any movement or sequence of movements may be analyzed out of an organism's repertory of behavior and treated as a "response," various properties of which can be treated as dependent variables subject to all the laws of the theory. (Hereafter we shall abbreviate the word response as R, with appropriate subscripts where necessary.) In order to avoid a common source of confusion, it will be necessary to make a clear distinction between the terms R-class and R-occurrence. The term R-class will always refer to a class of behaviors which produce environmental effects within a specified range of values. This definition is not without objection (cf. 4) but has the advantage of following the actual practice of most experimenters. It may be possible eventually to coordinate R-classes defined in terms of environmental effects with R-classes defined in terms of effector activities.
By R-occurrence we shall mean a particular, unrepeatable behavioral event. All occurrences which meet the defining criteria of an R-class are counted as instances of that class, and as such are experimentally interchangeable. In fact, various instances of an R-class are ordinarily indistinguishable in the record of an experiment even though they may actually vary with respect to properties which are not picked up by the recording mechanism. Indices of tendency to respond, e.g., probability as defined below, always refer to R-classes.

These distinctions may be clarified by an illustration. In the Skinner-type conditioning apparatus, bar-pressing is usually treated as an R-class. Any movement of the organism which results in sufficient depression of the bar to actuate the recording mechanism is counted as an instance of the class. The R-class may be subdivided into finer classes by the same kind of criteria. We could, if desired, treat depression of a bar by the rat's right forepaw and depression of the bar by the left forepaw as instances of two different classes provided that we have a recording mechanism which will be affected differently by the two kinds of movements and mediate different relations to stimulus input (as for example the presentation of discriminative stimuli or reinforcing stimuli). If probability is increased by reinforcement, then reinforcement of a right-forepaw bar-depression will increase the probability that instances of that subclass will occur, and will also increase the probability that instances of the broader class, bar-pressing, will occur.

2. S-variables. For analytic purposes it is assumed that all behavior is conditional upon appropriate stimulation. It is not implied, however, that responses can be predicted only when eliciting stimuli can be identified. According to the present point of view, laws of learning enable predictions of changes in probability of response as a function of time under given environmental conditions.

A stimulus, or stimulating situation, will be regarded as a finite population of relatively small, independent, environmental events, of which only a sample is effective at any given time. In the following sections we shall designate the total number of elements associated with a given source of stimulation as S (with appropriate subscripts where more than one source of stimulation must be considered in an experiment), and the number of elements effective at any given time as s. It is assumed that when experimental conditions involve repeated stimulation of an organism by the "same stimulus," that is by successive samples of elements from an S-population, each sample may be treated as an independent random sample from S. It is to be expected that sample size will fluctuate somewhat from one moment to the next, in which case s will be treated as the average number of elements per sample over a given period.

In applying the theory, any portion of the environment to which the organism is exposed under uniform conditions may be considered an S-population. The number of different S's said to be present in the situation will depend upon the number of independent experimental operations, and the degree of specificity with which predictions of behavior are to be made. If the experimenter attempts to hold the stimulating situation constant during the course of an experiment, then the entire situation will be treated as a single S. If in a conditioning experiment, a light and shock are to be independently manipulated as the CS and US, then each of these sources of stimulation will be
Simple Conditioning: Reinforcement by Controlled Elicitation

Let us consider first the simplest type of conditioning experiment. The system to be described consists of a sub-population of stimulus elements, $S_c$, which may be manipulated independently of the remainder of the situation, S, and a class, R, of behaviors defined by certain measurable properties. By means of a controlled original stimulus, that is, one which has initially a high probability of evoking R, it is ensured that an instance of R will occur on every trial contiguously with the sample of stimulus elements which is present. In the familiar buzz-shock conditioning experiment, for example, $S_c$ would represent the population of stimulus elements emanating from the sound source and R would include all movements of a limb meeting certain specifications of direction and amplitude; typically, the R to be conditioned is a flexion response which may be evoked on each training trial by administration of an electric shock.

Designating the mean number of elements from $S_c$ effective on any one trial as $s_c$, and the number of elements from $S_c$ which are conditioned to R at any time as x, the expected number of new elements conditioned on any trial will be

$$\Delta x = s_c\,\frac{S_c - x}{S_c}. \tag{1}$$

If the change in x per trial is relatively small, and the process is assumed continuous, the right hand portion of (1) may be taken as the average rate of change of x with respect to number of trials, T, at any moment, giving

$$\frac{dx}{dT} = s_c\,\frac{S_c - x}{S_c}. \tag{2}$$

This differential equation may be integrated to yield

$$x = S_c - (S_c - x_0)\,e^{-qT}, \tag{3}$$

where $x_0$ is the initial value of x, and q represents the ratio $s_c/S_c$. Thus x will be expected to increase from its initial value to approach the limiting value, $S_c$, in a negatively accelerated curve. A method of evaluating x in these equations from empirical measures of response latency, or reaction time, will be developed in a later section.

If the remainder of the situation has been experimentally neutralized, the probability of R in the presence of a sample from $S_c$ will be given by the ratio $x/S_c$. Representing this ratio by the single letter p, and making appropriate substitutions in (3), we have the following expression for probability of R as a function of the number of reinforced trials.

$$p = 1 - (1 - p_0)\,e^{-qT}. \tag{3'}$$

Since we have not assumed any special properties for the original (or unconditioned) stimulus other than that of regularly evoking the response to be conditioned, it is to be expected that the equations developed in this section will describe the accumulation of conditional relations in other situations than classical conditioning, provided that other experimental operations function to ensure that the response to be learned will occur in the presence of every sample drawn from the S-population.

Operant Conditioning: Reinforcement by Contingent Stimulation

In the more common type of experimental arrangement, variously termed operant, instrumental, trial and error, etc. by different investigators, the response to be learned is not elicited by a controlled original stimulus, but has some initial strength in the experimental situation and occurs originally as part of so-called "random activity." Here the response cannot be evoked concurrently with the presentation of each new stimulus sample, but some of the same effects can be secured by making changes in the stimulating situation contingent upon occurrences of the response.

Let us consider a situation of this sort, assuming that the activities of the organism have been catalogued and classified into two categories, all movement sequences characterized by a certain set of properties being assigned to class R and all others to the class $R_e$, and that members of class R are to be learned.

If changes in the stimulus sample are independent of the organism's behavior, we should expect instances of the two response classes to occur, on the average, at rates proportional to their initial probabilities. For if x elements from the S-population are originally conditioned to R, then the probability of R will be $x/S$; the number of new elements conditioned to R if an instance occurs will be $s[(S - x)/S]$, s again representing the number of stimulus elements in a sample; and the mathematically expected increase in x will be the product of these quantities, $sx[(S - x)/S^2]$. At the same time, the probability of $R_e$ will be $(S - x)/S$, and the number of new elements conditioned to $R_e$ if an instance occurs will be $sx/S$; multiplying these quantities, we have $sx[(S - x)/S^2]$ as the mathematically expected decrease in x. Thus we should predict no average change in x under these conditions.

In the acquisition phase of a learning experiment two important restrictions are imposed by the experimenter which tend to force a correlation between changes in the stimulus sample and occurrences of R. The organism is usually introduced into the experimental situation at the beginning of a trial, and the trial lasts until the pre-designated response, R, occurs. For example, in a common discrimination apparatus the animal is placed on a jumping stand at the beginning of each trial and the trial continues until the animal leaves the stand; in a runway experiment the trial lasts until the animal reaches the end box, and so on. Typically the stimulating situation present at the beginning of a trial is radically changed, if not completely terminated, by the occurrence of the response in question; and a new trial begins under the same conditions, except for sampling variations, after some pre-designated interval. The pattern of movement-produced stimuli present during a trial may be changed after occurrences of R by the evocation of some uniform bit of behavior such as eating or drinking; in some cases the behavior utilized for this purpose must be established by special training prior to a learning experiment. In the Skinner box, for example, the animal is trained to respond to the sound of the magazine by approaching it and eating or drinking. Then when operation of the magazine follows the occurrence of a bar-pressing response during conditioning of the latter, the animal's response to the magazine will remove it from the stimuli in the vicinity of the bar and ensure that for an interval of time thereafter the animal will not be exposed to most of the S-population; therefore the sample of elements to which the animal will next respond may be considered very nearly a new random sample from S.

In the simplest operant conditioning experiments it may be possible to change almost the entire stimulus sample after each occurrence of R (complete reinforcement), while in other cases the sampling of only some restricted portion of the S-population is correlated with R (partial reinforcement). We shall consider the former case in some detail in the remainder of this section.
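Two of the relations stated in the preceding sections lend themselves to a quick numerical check. The sketch below is offered only as an illustration: the function names and the parameter values ($S_c = 100$, $s_c = 5$, $x_0 = 10$) are assumptions chosen for convenience and do not come from the paper. It applies the trial-by-trial increment of equation (1) repeatedly, compares the result with the continuous curve of equation (3), and evaluates the expected gain and loss in x for the case in which changes in the stimulus sample are independent of behavior.

```python
import math


def iterate_trials(S_c, s_c, x0, n_trials):
    """Apply equation (1) once per trial: x gains s_c * (S_c - x) / S_c."""
    x = float(x0)
    history = [x]
    for _ in range(n_trials):
        x += s_c * (S_c - x) / S_c
        history.append(x)
    return history


def closed_form(S_c, s_c, x0, T):
    """Equation (3): x = S_c - (S_c - x0) * exp(-q * T), with q = s_c / S_c."""
    q = s_c / S_c
    return S_c - (S_c - x0) * math.exp(-q * T)


def net_expected_change(S, s, x):
    """Expected gain minus expected loss in x when the stimulus sample
    changes independently of the organism's behavior."""
    gain = (x / S) * (s * (S - x) / S)      # P(R) times elements newly conditioned to R
    loss = ((S - x) / S) * (s * x / S)      # P(R_e) times elements newly conditioned to R_e
    return gain - loss


if __name__ == "__main__":
    S_c, s_c, x0 = 100.0, 5.0, 10.0         # hypothetical values, for illustration only
    stepwise = iterate_trials(S_c, s_c, x0, 40)
    for T in (0, 10, 20, 40):
        print(T, round(stepwise[T], 2), round(closed_form(S_c, s_c, x0, T), 2))
    print(net_expected_change(100.0, 5.0, 30.0))
```

With a small ratio $s_c/S_c$ the stepwise and continuous values stay close together, which is the condition under which the paper treats the process as continuous; the net expected change vanishes for any x, in agreement with the prediction of no average change.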
By our definition of the conditional relation, we shall expect that instances of all R-classes which actually occur on any trial will be conditioned to stimulus elements present on that trial. The first movement to occur will be conditioned to the environmental stimulus elements present at the beginning of the trial; the situation is not completely constant during a trial, and the next movement will be conditioned to some external cues, and to proprioceptive cues from the first movement, and so on, until the predesignated response, R, occurs and terminates the trial.

If complete constancy of the stimulating situation could be maintained, the most probable course of events on the next trial would be the recurrence of the same sequence of movements. In practice, however, the sample of effective stimulus elements will change somewhat in composition, and some responses which occur on one trial may fail to occur on the next. The only response which may never be omitted is R, since the trial continues until R occurs. This argument has been developed in greater detail by Guthrie (4). In order to verify the line of reasoning involved, we need now to set these ideas down in mathematical form and investigate the possibility of deriving functions which will describe empirical curves of learning.

Since each trial lasts until R occurs, we need an expression for probable duration of a trial in terms of the strength of R. Suppose that we have categorized all movement sequences which are to be counted as "responses" in the given situation, and that the minimum time needed for completion of a response-occurrence is, on the average, h. For convenience in the following development, we shall assume that the mean duration of instances of class R is approximately equal to that of class $R_e$. Let the total number of stimulus elements available in the experimental situation be represented by S, the sample effective on any one trial by s, and the ratio $s/S$ by q. The probability, p, of class R at the beginning of any trial will have the value $x/S$; if this value varies little within a trial, we can readily compute the probable number of responses (of all classes) that will occur before the trial is terminated. The probability that an instance of R will be the first response to occur on the trial in question is p; the probability that it will be the second is $p(1 - p)$; the probability that it will be the third is $p(1 - p)^2$; etc. If we imagine an indefinitely large number of trials run under identical conditions, and represent the number of response occurrences on any trial by n, we may weight each possible value of n by its probability (i.e., expected relative frequency) and obtain a mean expected value of n. In symbolic notation we have

$$\bar{n} = \sum n\,p\,(1 - p)^{n-1} = p \sum n\,(1 - p)^{n-1}.$$

The expression inside the summation sign will be recognized as the general term of the well-known infinite series with the sum $1/[1 - (1 - p)]^2$. Then we have, by substitution,

$$\bar{n} = \frac{p}{[1 - (1 - p)]^2} = \frac{1}{p}.$$

Then L, the average time per trial, will be the product of the expected number of responses per trial and the mean time per response.

$$L = \bar{n}h = \frac{h}{p} = \frac{Sh}{x}.$$
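The step from the series to the value 1/p, and from there to the mean time per trial, can be checked by direct simulation. The following sketch is illustrative only: the function names, the trial count, and the seed are assumptions introduced here, and each response is treated as an independent draw that is an instance of R with fixed probability p, which is the simplification the derivation itself makes.

```python
import random


def responses_on_trial(p, rng):
    """Count responses (of all classes) up to and including the first R."""
    n = 1
    while rng.random() >= p:    # with probability 1 - p the response is not R
        n += 1
    return n


def simulated_mean(p, n_trials=20000, seed=0):
    """Average number of responses per trial over many simulated trials."""
    rng = random.Random(seed)
    return sum(responses_on_trial(p, rng) for _ in range(n_trials)) / n_trials


def series_mean(p, n_terms=2000):
    """Partial sum of n * p * (1 - p)**(n - 1), which approaches 1/p."""
    return sum(n * p * (1 - p) ** (n - 1) for n in range(1, n_terms + 1))


def mean_latency(p, h):
    """Mean time per trial: L = n_bar * h = h / p."""
    return h / p


if __name__ == "__main__":
    p, h = 0.25, 2.0            # hypothetical values, for illustration only
    print(series_mean(p), simulated_mean(p), 1 / p)
    print(mean_latency(p, h))
```

Both the partial sum and the simulated average should fall close to 1/p, so that the mean time per trial is h/p as stated.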
Since all new stimulus elements present on each trial will be conditioned to R, we may substitute for x its equivalent from equation (3), dropping the subscripts from $S_c$ and $s_c$, and obtaining

$$L = \frac{Sh}{S - (S - x_0)\,e^{-qT}} = \frac{h}{1 - \dfrac{L_0 - h}{L_0}\,e^{-qT}}. \tag{4}$$

Thus, L will decline from an initial value of $L_0$ (equal to $Sh/x_0$) and approach an asymptotic minimum value h over a series of trials.

A preliminary test of the validity of this development may be obtained by applying equation (4) to learning data from a runway experiment in which the conditions assumed in the derivation are realized to a fair degree of approximation. In Fig. 1 we have plotted acquisition data reported by Graham and Gagne (3). Each empirical point represents the geometric mean latency for a group of 21 rats which were reinforced with food for traversing a simple elevated runway. The theoretical curve in the figure represents the equation

$$L = \frac{2.5}{1 - .9648\,e^{-.12T}},$$

where values of $L_0$, h, and q have been estimated from the data. This curve appears to give a satisfactory graduation of the obtained points and, it might be noted, is very similar in form to the theoretical acquisition curve developed by Graham and Gagne. The present formulation differs from theirs chiefly in including the time of the first response as an integral part of the learning process. The quantitative description of extinction in this situation will be presented in a forthcoming paper.

Fig. 1. Latencies of a runway response during conditioning, obtained from published data of Graham and Gagne (3), are fitted by a theoretical curve derived in the text.

In order to apply the present theory to experimental situations such as the Skinner box, in which the learning period is not divided into discrete trials, we shall have to assume that the intervals between reinforcements in those situations may be treated as "trials" for analytical purposes. Making this assumption, we may derive an expression for rate of change of conditioned response strength as a function of time in the experimental situation, during a period in which all responses of class R are reinforced.

Since T, as defined above, will represent the number of occurrences (and therefore reinforcements) of R, and L the time between any two occurrences of R, we have from the preceding development $L = Sh/x$. Then if we let t represent time elapsed from the beginning of the learning period to a given occurrence of R, L may be considered as the increment in time during the (T+1)th trial, and we can write the identity

$$\frac{\Delta x}{\Delta t} = \frac{\Delta x}{\Delta T}\cdot\frac{\Delta T}{\Delta t}.$$

Substituting for $\Delta x/\Delta T$ its equivalent from (1), without subscripts, and for
$\Delta T/\Delta t$ its equivalent from the preceding equation, we have

$$\frac{\Delta x}{\Delta t} = \frac{s(S - x)}{S}\cdot\frac{x}{hS} = \frac{s(S - x)\,x}{hS^2}. \tag{5}$$

If the change in x per reinforcement is small and the process is assumed to be continuous, the right hand portion of equation (5) may be taken as the value of the derivative $dx/dt$ and integrated with respect to time:

$$x = \frac{S}{1 + \dfrac{S - x_0}{x_0}\,e^{-Bt}}, \tag{6}$$

where $B = s/Sh$. In general, this equation defines a logistic curve with the amount of initial acceleration depending upon the value of $x_0$. Curves of probability ($x/S$) vs. time for S = 100, B = 0.25, and several different values of $x_0$ are illustrated in Fig. 2.

Fig. 2. Illustrative curves of probability vs. time during conditioning; parameters of the curves are the same except for the initial x-values.

Since we are now considering a situation in which reinforcement is administered (or a new "trial" is begun) after each occurrence of R, we are in a position to express the expected rate of occurrence of R as a function of time. Representing rate of occurrence of R by $r = dR/dt$, and the ratio $1/h$ by w, we have

$$\frac{dR}{dt} = \frac{dT}{dt} = \frac{wx}{S} = \frac{w}{1 + \dfrac{S - x_0}{x_0}\,e^{-Bt}},$$

and if we take the rate of R at the beginning of the experimental period as $r_0 = wx_0/S$ this relation becomes

$$r = \frac{w}{1 + \dfrac{w - r_0}{r_0}\,e^{-Bt}}. \tag{7}$$

To illustrate this function, we have plotted in Fig. 3 measures of rate of
responding during conditioning of a bar-pressing response by a single rat. The apparatus was a Skinner box; motivation was 24 hours thirst; the animal had previously been trained to drink out of the magazine, and during the period illustrated was reinforced with water for all bar-pressing responses. Measures of rate at various times were obtained by counting the number of responses made during the half-minute before and the half-minute after the point being considered, and taking that value as an estimate of the rate in terms of responses per minute at the midpoint. The theoretical curve in the figure represents the equation

$$r = \frac{13}{1 + 25\,e^{-.24t}}.$$

Fig. 3. Number of responses per minute during conditioning of a bar-pressing habit in a single rat; the theoretical curve is derived in the text. (Abscissa: time in minutes.)

A considerable part of the variability of the empirical points in the figure is due to the inaccuracy of the method of estimating rates. In order to avoid this loss of precision, the writer has adopted the practice of using cumulative curves of responses vs. time for most purposes, and fitting the cumulative records with the integral of equation (7):

$$R = wt + \frac{w}{B}\,\log_e\!\left[\frac{r_0}{w} + \frac{w - r_0}{w}\,e^{-Bt}\right], \tag{8}$$

where R represents the number of responses made after any interval of time, t, from the beginning of the learning period. The original record of responses vs. time, from which the data of Fig. 3 were obtained, is reproduced in Fig. 4. Integration of the rate equation for this animal yields

$$R = 13t + 125\,\log_{10}\!\left(.038 + .962\,e^{-.24t}\right).$$

Magnitudes of R computed from this equation for several values of t have been plotted in Fig. 4 to indicate the goodness of fit; the theoretical curve has not been drawn in the figure since it would completely obscure most of the empirical record.
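The relation between the rate function (7) and the cumulative function (8) can be verified numerically. The sketch below is an illustration under stated assumptions rather than a reproduction of the author's computations: the function names and the trapezoidal check are introduced here, and the constants w = 13, B = .24 and $(w - r_0)/r_0 = 25$ (so that $r_0 = 0.5$) are read off the fitted rate equation quoted for the single rat. The logarithm in the general form is taken as natural, which is what makes the coefficient 125 of the base-10 version come out approximately right.

```python
import math


def rate(t, w, r0, B):
    """Equation (7): r = w / (1 + ((w - r0) / r0) * exp(-B * t))."""
    return w / (1.0 + ((w - r0) / r0) * math.exp(-B * t))


def cumulative(t, w, r0, B):
    """Equation (8): responses accumulated from time 0 to t (natural log)."""
    return w * t + (w / B) * math.log(r0 / w + ((w - r0) / w) * math.exp(-B * t))


def integrate_rate(t, w, r0, B, steps=10000):
    """Trapezoidal integration of the rate, as an independent check on (8)."""
    step = t / steps
    total = 0.5 * (rate(0.0, w, r0, B) + rate(t, w, r0, B))
    total += sum(rate(i * step, w, r0, B) for i in range(1, steps))
    return total * step


if __name__ == "__main__":
    w, r0, B = 13.0, 0.5, 0.24      # constants taken from the fitted rate equation
    for t in (5.0, 10.0, 20.0):
        print(t, round(rate(t, w, r0, B), 2),
              round(cumulative(t, w, r0, B), 2),
              round(integrate_rate(t, w, r0, B), 2))
```

Agreement between the last two columns indicates that (8) is the integral of (7); small departures from the rounded base-10 equation in the text are to be expected, since its constants are given to only two or three figures.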
In an experimental report now in press (2), equation (8) is fitted to several mean conditioning curves for groups of four rats; in all cases, the theoretical curve accounts for more than 99 per cent of the variance of the observed R values. Further verification of the present formulation has been derived from that study by comparing the acquisition curves of successively learned bar-pressing habits, obtained in a Skinner-type conditioning apparatus which included two bars differing only in position. It has been found that the parameters w and $s/S$ can be evaluated from the conditioning curve of one bar response, and then used to predict the detailed course of conditioning of a second learned response.

Fig. 4. Reproduction of the original cumulative record from which the points of Fig. 3 were obtained. Solid circles are computed from an equation given in the text. (Abscissa: time in minutes.)

The overall accuracy of these equations in describing the rate of conditioning of bar-pressing and runway responses should not be allowed to obscure the fact that a small but systematic error is present in the initial portion of most of the curves. It is believed that these disparities are due to the fact that experimental conditions do not usually fully realize the assumption that only one R-class receives any reinforcement during the learning period. A more general formulation of the theory, which does not require this assumption, will be discussed in the next section.

Partial Reinforcement

It can be shown that in a trial and error situation a given response may be "learned" provided that some sub-population of stimulus elements is so controlled by experimental conditions that each sample of elements drawn from it is contiguous with an occurrence of the response. The sort of derivation needed to handle this kind of partial reinforcement will be sketched briefly in this section. A more detailed treatment will be given, together with relevant experimental evidence, in a paper now in preparation. It should be emphasized that we are using the term "partial" to refer to incomplete change of the stimulus sample on each occurrence of a given response, and not to periodic, or intermittent reinforcement.

Consider a behavior system involving two classes of competing behaviors, R and $R_e$, which may occur in a situation, S, composed of two independently manipulable sub-populations, $S_r$ and $S_e$. Experimental conditions are to ensure that of the sample, s, of elements stimulating the organism at any time, elements from $S_r$ remain effective until terminated by occurrence of R, while elements from $S_e$ remain effective until terminated by occurrence of $R_e$. This kind of system might be illustrated by a Skinner box in which the entire stimulus sample is not terminated by occurrence of the bar-pressing response; for example, if the box is illuminated, visual stimulation will be relatively unaffected by bar-pressing but will be terminated if the animal closes its eyes (the latter behavior being, then, an instance of $R_e$). Let $x_r$ represent the number of elements from $S_r$ conditioned to R at a given time, x the total number of elements from S conditioned to R, $T_r$ and $T_e$ the numbers of occurrences of R and $R_e$ prior to the time in question, and q the ratio $s/S$. By reasoning similar to that utilized in deriving equations (2) and (5), we may obtain for the average rate of change of $x_r$ with respect to
crease or decrease in probability of occurrence over a series of trials depending upon whether the momentary probability is less than or greater than the equilibrium value for those conditions.

Discussion

The foregoing sections will suffice to illustrate the manner in which problems of learning may be handled within the framework of a statistical theory. The extent to which the formal system developed here may be fruitfully applied to interpret experimental phenomena can only be answered by a considerable program of research. A study of concurrent conditioning and extinction of simple skeletal responses which realizes quite closely the simplified conditions assumed in the derivations of the present paper has been completed, and a report is now in press. Other papers in preparation will apply this formulation to extinction, spontaneous recovery, discrimination, and related phenomena.

The relation of this program to contemporary theories of learning requires little comment. No attempt has been made to present a "new" theory. It is the purpose of the investigation to clarify some of our conceptions of learning and discrimination by stating important concepts in quantitative form and investigating their interrelationships by mathematical analysis. Many similarities will be noted between functions developed here and "homologous" expressions in the quantitative formulations of Graham and Gagne (3) and of Hull (6). A thorough study of those theories has influenced the writer's thinking in many respects. Rather than build directly on either of those formulations, I have felt it desirable to explore an alternative point of view based on the statistical definition of behavior and environment and doing greater justice to the theoretical views of Skinner and Guthrie. A statistical theory seems to be an inevitable development at the present stage of the science of behavior; agreement on this point may be found among writers of otherwise widely diverse viewpoints, e.g., Brunswik (1), Hoagland (5), Skinner (7), and Wiener (9). It is to be expected that with increasing rigor of definition and continued interplay between theory and experiment, the various formulations of learning will tend to converge upon a common set of concepts.

It may be helpful to outline briefly the point of view on certain controversial issues implied by the present analysis.

Stimulus-response terminology. An attempt has been made to overcome some of the rigidity and oversimplification of traditional stimulus-response theory without abandoning its principal advantages. We have adopted a definition of stimulus and response similar to Skinner's (7) concept of generic classes, and have given it a statistical interpretation. Laws of learning developed within this framework refer to behavior systems (as defined in the introductory section of this paper) rather than to relations between isolated stimulus-response correlations.

The learning curve. This investigation is not intended to be another search for "the learning function." The writer does not believe that any simple function will be found to account for learning independently of particular experimental conditions. On the other hand, it does seem quite possible that from a relatively small set of definitions and assumptions we may be able to derive expressions describing learning under various specific experimental arrangements.

Measures of behavior. Likelihood of responding has been taken as the primary dependent variable. Analyses presented above indicate that simple relations can be derived between probability and such common experimentally obtained measures as rate of responding and latency.
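The relations among probability, latency, and rate referred to here can be put in a few lines of code. The sketch below is a hedged illustration: the function names are assumptions, the constants 2.5, .9648 and .12 are taken from the fitted runway equation quoted for Fig. 1 (with $L_0$ derived from them), and the rate relation r = p/h presupposes the free-responding situation in which each response of class R is reinforced.

```python
import math


def latency(T, L0, h, q):
    """Equation (4): L = h / (1 - ((L0 - h) / L0) * exp(-q * T))."""
    return h / (1.0 - ((L0 - h) / L0) * math.exp(-q * T))


def probability_from_latency(L, h):
    """Invert L = h / p to recover the probability of R."""
    return h / L


def rate_from_probability(p, h):
    """Rate of responding r = w * p, with w = 1 / h."""
    return p / h


if __name__ == "__main__":
    h, q = 2.5, 0.12
    L0 = h / (1.0 - 0.9648)         # initial latency implied by the fitted constants
    for T in (0, 5, 10, 20, 40):
        L = latency(T, L0, h, q)
        p = probability_from_latency(L, h)
        print(T, round(L, 2), round(p, 3), round(rate_from_probability(p, h), 3))
```

On these assumptions latency falls from $L_0$ toward h as probability rises toward unity, and rate is simply the reciprocal of latency.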
Laws of contiguity and effect. Available experimental evidence on simple learning has seemed to the writer to require the assumption that temporal contiguity of stimuli and behavior is a necessary condition for the formation of conditional relations. At the level of differential analysis, that is of laws relating momentary changes in behavior to changes in independent variables, no other assumption has proved necessary at the present stage of the investigation. In order to account for the accumulation of conditional relations in favor of one R-class at the expense of others in any situation, we have appealed to a group of experimental operations which are usually subsumed under the term "reinforcement" in current experimental literature. Both Guthrie's (4) verbal analyses and the writer's mathematical investigations indicate that an essential property of reinforcement is that it ensures that successive occurrences of a given R will be contiguous with different samples from the available population of stimuli. We have made no assumptions concerning the role of special properties of certain after-effects of responses, such as drive-reduction, changes in affective tone, etc. Thus the quantitative relations developed here may prove useful to investigators of learning phenomena regardless of the investigators' beliefs as to the nature of underlying processes.

Summary

An attempt has been made to clarify some issues in current learning theory by giving a statistical interpretation to the concepts of stimulus and response and by deriving quantitative laws that govern simple behavior systems. Dependent variables, in this formulation, are classes of behavior samples with common quantitative properties; independent variables are statistical distributions of environmental events. Laws of the theory state probability relations between momentary changes in behavioral and environmental variables.

From this point of view it has been possible to derive simple relations between probability of response and several commonly used measures of learning, and to develop mathematical expressions describing learning in both classical conditioning and instrumental learning situations under simplified conditions.

No effort has been made to defend the assumptions underlying this formulation by verbal analyses of what "really" happens inside the organism or similar arguments. It is proposed that the theory be evaluated solely by its fruitfulness in generating quantitative functions relating various phenomena of learning and discrimination.

REFERENCES

1. Brunswik, E. Probability as a determiner of rat behavior. J. exp. Psychol., 1939, 25, 175-197.
2. Estes, W. K. Effects of competing reactions on the conditioning curve for bar-pressing. J. exp. Psychol. (in press).
3. Graham, C. H., & Gagne, R. M. The acquisition, extinction, and spontaneous recovery of a conditioned operant response. J. exp. Psychol., 1940, 26, 251-280.
4. Guthrie, E. R. Psychological facts and psychological theory. Psychol. Bull., 1946, 43, 1-20.
5. Hoagland, H. The Weber-Fechner law and the all-or-none theory. J. gen. Psychol., 1930, 3, 351-373.
6. Hull, C. L. Principles of behavior. New York: Appleton-Century, 1943.
7. Skinner, B. F. The behavior of organisms. New York: Appleton-Century, 1938.
8. Spence, K. W. The nature of theory construction in contemporary psychology. Psychol. Rev., 1944, 51, 47-68.
9. Wiener, N. Cybernetics. New York: Wiley, 1948.

[MS. received July 15, 1949]
THEORY
STATISTICAL
SPONTANEOUS
OF
AND
REGRESSION
W.
K.
ESTES
Indiana
the
From
in
viewpoint of
constructing
would
be
habits
any
if
convenient
of
responding
given
situation
situation.
of
that
In
unreasonable,
be
that
could
be
much
tendencies
do
forgetting
"
How
construct
law
measure
as
has
to
situation
very
or
the
candidate
has
or
usually
paper
tenure
Social
as
Science
article
the
neural
varies
The
appeared
in
clearly
sition
po-
or
purely
of
Rev.,1955,
pothetic
hy-
now
area
gests
sug-
have
this type
of
Until
ables
vari-
explanatory
to
or
define
which
problems
of
evaluate
to
that
quire
re-
the
have
proposed.
parsimonious"
"more
By
explanation,
intrinsic
to
situation
and
play
this
special hypotheses
thor's
au-
Council.
Psychol.
class
refer
ordinarilystimulus
spontane-
which
"
possible either
be
of
fully explored, it will
explanation
been
array
prematurely.
scene
been
the
vorite various
fa-
postulated
prepared during the

fellow
faculty research
Research
scarcely
ronment
envi-
the
postulates
have
some
in
parsimonious
an
intervening
been
which
was
in
organism.
either
process,
hypothetical,
This
for
of
trace
attention
the
more
temporal
events
that
entered
never
as
for
The
events.
e.g., set, reactive
"
memory
den
bur-
statistical
to
the
of
the
hypothesized
organism
constructs
time,
interval
inferred,
or
in
of
the
extensiveness
havioral
be-
some
The
be filled with
sort, observed
This
properties
satisfying
variable.
explanatory
from
of environmental
compete
with
illustrated
shift
to
attempts
in
enough
cured
se-
out
events
will be
explanation
inhibition,
to
hoc assumptions.
learntime-dependent ing
to
processes
function
permanently
remains
the
expressing
temporal
unfilled
an
the
face
cannot
intervals
It is easy
for?
accounted
to
state
recovery,
than
having
once
which
paper
of
ways
it is al-
ad
few
approach
in this
with
difficulty
empirical
of
phenomena
separated.
these "spontaneous" changes
be
gap
and
they
the
hypothetical entities
foothold
of
for
postulate
to
Few
turn
whatever
is that
that
aid
response
in
The
ill-favored
new
The
are
to
but
the
ogy
psychol-
during
organism
the
well
are
each
tions
rela-
of
spontaneous
occur
so
a
certain, however,
more
e.g.,
"
are
mental
environ-
in
Nothing
unpostulate.
in
account
construct
easier
the
ing
learn-
of
and
much
hope
to
terms
orderly changes
that
when
in
laws
behavioral
variables.
is
facie,
prima
stated
between
than
it would
of
not
to
exposure
to
changes.
this type
modifiable
case,
empirical
the
all of
behavioral
to
respect
intervals
rest
is required
manner
organism's
an
with
were
only during periods
ously during
it
theory,
learning
University
interested
one
RECOVERY
role
62, 145-154.
any
variables,
variables, which
given type
thus
in
the
to
must
of
be
of
sources
are
behavioral
expected
to
interpretive schema.
Reprinted
with
permission.
W.
the
present instance
K.
ested
inter-
323
ESTES
way response tendencies change during rest intervals following experimental periods. And we note that there are two principal ways in which stimulus variables could lead to modification in response tendencies during rest intervals. The first is the direct effect that changes in the stimulus characteristics of experimental situations from trial to trial or period to period may have upon response probability. The second is learning that may occur between periods if the stimulating situations obtaining within and between periods have elements in common. The former category can again be subdivided according as the environmental variation is systematic or random.

The random component has been selected as our first subject of investigation for several reasons. One is that it has received little attention heretofore in learning theory. Another is that in other sciences apparently spontaneous changes in observables have frequently turned out to be attributable to random processes at a more molecular level. Perhaps not surprisingly, considerable analysis has been needed in order to ascertain how random environmental fluctuations during rest intervals following learning periods would be expected to influence response probabilities.

carried during a given period, some newly conditioned stimulus elements will be replaced before the next period by elements which have not previously been available for conditioning. Similarly, during the interval following an extinction period, random fluctuation will lead to the replacement of some of the just extinguished stimulus elements by others which were sampled during conditioning but have not been available during extinction. In either case, the result will be a progressive change in response probability as a function of duration of the rest interval.

In order to make these ideas testable, we must state more formally and explicitly the concepts and assumptions involved. Once this is done, we will have in effect a fragmentary theory, or model, which may account for certain apparently spontaneous changes in response tendencies. At a minimum, this formal model will enable us to derive the logical consequences of the concept of random environmental fluctuation so that they may be tested against experimental data. If the correspondence turns out to be good, we may wish to incorporate this model into the conceptual structure of S-R learning theory, viewing it as a limited theory which accounts for a specific class of time-dependent phenomena.
of
It will require the remainder
shall reMost
of the assumptions we
quire
In
in
specifically
this paper
to summarize
and
results of this one
over-all
we
are
the
learned
the
of
methods
of
phase
the
investigation.
(8)
for
Theory
Stimulus
of
our
Even
can
prior to a
anticipatethat
for
fluctuation
of
response
analysis,we
mental
environ-
whenever
occurs,
at
the
the end
only
ability in
probof
For
present purposes.
environmental
experimental period will not be the

the probabilityat the beginsame
ning
as
If conditioning is
of the next.
terms
that
of
the
It
the
could
fluctuation
continua.
shall
concepts
within
will
be
Hullian
be
simplicity
mathematical
of
reasons
theory.
argument
the
convenience
one
briefly
situation,as
given time, determines
given organism a population of
and
restated
be
Any
constituted at
Fluctuation
detailed
need
elsewhere
discussed
been
have
and
a.
General
out
the
worked
of stimuli
develop
of
ideas
these
ing
statisticallearn-
however,
apparent,
system
out
in
similar
terms
of
along generahzation
324
stimulus
which
from
events
organism'sbehavior
affects the
PSYCHOLOGY
MATHEMATICAL
IN
READINGS
mental
at
constant
any
instant; in statisticallearning theories

a
the populationis conceptualizedas
of which
all
situation,
sample
random
let
Now
in which
type of
the
consider
us
undergo
fluctuation.
periment
ex-
is
run
organism
period in the same
ior
random
sample is drawn on each trial. apparatus. In dealing with the behavthat
extinction
and
mental
experioccur
occurs
b. Conditioning
during any given
pled
period, the total population S*
only with respect to the elements samon
exhaustive
an
population
and
ment
ele-
is conditioned
at
can
any
be
the
available
and
are
derived by various
Under
not.
the
periment
ex-
two
elements
of
during
5' of
that
period
which
elements
ered
conditions consid-
probabilityof a
time
during the
given
any
ments
equal to the proportionof elethe
in this paper,
response
periodis
subset
subset
the
the
in
during the
partitionedinto
which
are
available
time
portions:
of these response classes.

exactly
of these assumptions,
basis
the
On
been
an
one
elements
stimulus
situation
one
functions have
than
more
be
stimulus
time, each
any
in the
ganism
or-
classes.
response
d. At
for
of
available to
behaviors
in a given situation may

categorizedinto mutually exclusive
to
trial.
The
c.
which
from
elements
of stimulus
set
at
scribe
in the available set S that are
investigators(2, 5, 8, 16, 21) to deof learning predicted conditioned to that response.
the course
Owing to
the
in
which
for an
idealized situation
there is some
environmental fluctuation,
is
physicalenvironment
idealized situations
trial. No
each
populationon
stimulus
stant
perfectlycon-
organism samples the
the
and
testingpurposes, but the theory seems

pirical
to
give good approximationsto emin
obtained
functions
learning
an
the
unavailable,
terval
into S', during any given inthat
an
;'
A^, and a probability
i.e.,go
in S' will enter
element
are
in
element
available set 5 will become
for
available
are
probability; that
5.
These
illustrated in Fig. 1 for
ideas
cal
hypotheti-
situation.
experimentalperiodsunder well-
short
controlled conditions.
In the present paper
behavioral
from
tention
at-
our
changes
that
experimentalperiods
within
occur
turn
we
the
changes that
the
intervals between
as
to
of
spondingly,
periods. Correreplace the simplifying
we
of
assumption
occur
function
perfectlyconstant
domly
assumption of a ransituation.^
cally,
Specififluctuating
the
with
situation
ability
that the avail-
it will be assumed
during a
given learning period depends upon a
of independently variable
large number
stimulus
of
components
or
elements
aspects of the environFig.
It is
possiblenow
to
functions
derived
the
for
be
this
able
paper.
random
to
go
go
back
earlier
variation, but
into
this
point
we
in
and
rect"
"cor-
to
allow
wUl
not
the present
Fig. 1. Fluctuations in stimulus sets during spontaneous regression (upper panel) and spontaneous recovery from extinction (lower panel). Circles represent elements connected to response A. Values of p represent probabilities of response in the available set S.
The relevance of the scheme for learning phenomena arises from the fact that both conditioned and unconditioned elements will constantly be fluctuating in and out of the available set S. During an experimental period in which conditioning or extinction occurs, the proportion of conditioned elements in S will increase or decrease relative to the proportion in S'. But during a subsequent rest interval, these proportions will tend toward equality as a result of the fluctuation process.
rium
in which
Recovery
The
essentials of
the
at
any
period
in
curve
The
is
the
in
an
that will be
in
ing
condition-
given by the topmost

panel of Fig. 2.
upper
equation of
In
equal.
proportion
followingthe
time
are
tioned
condi-
spontaneous
the
will be
curve
rived
de-
later section.
analogous fashion
the essentials
of the spontaneous
recovery process
in the lower panel of
are
Fig.
We
tion
begin at the left with a situamaximal
following
conditioningso
that all elements

A.
of
treatment
our
of
of
Regression
and
of
elements
Spontaneous
of
S'
of conditioned
1.
Interpretation
and
predicted course
regressionin terms
schematized
process.
densities
in 5
The
the
the
elements
conditioned
are
During
to
singleperiod
sponse
re-
of
all elements in the available

and
spontaneous
regression extinction,
recovery
5
conditioned
set
to the class of
are
will be clear from
an
inspection of
competing responses A and the probability
Fig. 1. The upper panel illustrates a
A
of
to
zero.
temporarily
in which, starting from
goes
case
a
zero
Then
during a recovery interval,the
level,conditioningof a given response
random
interchangeof conditioned and
A is carried out during one
period until
unconditioned
elements
between
5 and
probabilityof A in the available
5'
results
in
increase
in
the
a
gradual
situation represented by the set S is
elements in
unity. At the end of the conditioning proportion of conditioned
5
until
the
final
is
state
equilibrium
will have, neglecting any
period we
reached.
The
of
predictedcourse
fluctuation
that
occurred
taneous
sponhave
may
the
during
in 5
the
period, all
conditioned
to
of
the
and
elements
all of
the
temporarily unavailable elements in S'

unconditioned.
val
During the first interof
the
the ensuing rest interval,
At
.6 of the conditioned
proportion }
elements will escape
from 5, being replaced
the
the
of
by
proportion;' .2
=
unconditioned
elements
from
5'.
ing
Dur-
as
recovery
function
of time
is
in the
given by the topmost curve
lower panel of Fig. 3.
According to this analysis, spontaneous regression and recovery are to be regarded as two aspects of the same process.
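The claim that regression and recovery are two aspects of one process can be illustrated with a small Monte Carlo sketch. It assumes the fluctuation scheme described in the text: in each interval an element in the available set S becomes unavailable with probability j, and an element in S' enters S with probability j'. The function name, the parameter values (j = 0.2, j' = 0.1), the population size, and the seed are hypothetical choices made only for illustration; they are not taken from the paper.

```python
import random


def simulate(conditioned_in_s, j=0.2, j_prime=0.1, n_total=3000, steps=40, seed=1):
    """Return p(t), the proportion of conditioned elements in S, for t = 0..steps.

    All numerical defaults are illustrative assumptions, not values from the text.
    """
    rng = random.Random(seed)
    # Start with S at its equilibrium share of the population, j' / (j + j').
    n_s = round(n_total * j_prime / (j + j_prime))
    in_s = [i < n_s for i in range(n_total)]
    # Regression case: elements in S start conditioned, those in S' unconditioned.
    # Recovery case (after extinction): the reverse.
    conditioned = [member == conditioned_in_s for member in in_s]
    curve = []
    for _ in range(steps + 1):
        available = [c for c, member in zip(conditioned, in_s) if member]
        curve.append(sum(available) / len(available))
        # One interval: each element independently leaves S with probability j
        # or enters S from S' with probability j_prime.
        in_s = [(rng.random() >= j) if member else (rng.random() < j_prime)
                for member in in_s]
    return curve


regression = simulate(conditioned_in_s=True)   # p(0) = 1, p'(0) = 0
recovery = simulate(conditioned_in_s=False)    # p(0) = 0, p'(0) = 1
```

Because both runs share the same seed, the same elements move in and out of S in each; only the conditioning labels are swapped. The two curves are therefore mirror images, one falling from unity and the other rising from zero, which is the sense in which a single interchange process produces both phenomena in this sketch.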
process
In
is
each
curve
the
case
given by
with
form
of the
celerated
negatively ac-
the relative rate of
acteristics
interchange change depending solelyupon the charof the physicalsituation embodied
will continue, at a
creasing
progressivelydein
the
and
Rates
;'.
parameters
;
rate, until the system arrives
and recovery should,then,
at the final state of statisticalequilib- of regression
the variability
together whenever
vary
of the stimulating situation is modified. It cannot be assumed, however, that amounts of regression and recovery should be equal and opposite in all experiments. The illustrative example of Fig. 1 meets two special conditions that do not always hold: (a) the conditioning

* The term spontaneous regression will be used here to refer to any decrease in response probability which is attributable solely to stimulus fluctuation. It is assumed that over short time intervals, the empirical phenomenon of forgetting may be virtually identified with regression, but that over longer intervals forgetting is influenced to an increasing extent by effects of interpolated learning.
further
intervals the
and extinction series start from initial response probabilities of zero and unity, respectively; and (b) conditioning and extinction are carried to comparable criteria within the experimental period preceding the rest interval.

Predictions Concerning Effects of Experimental Variables

Terminal level of conditioning or extinction. If other conditions remain fixed, the level of response probability attained at the end of a single learning period will determine both the initial value and the asymptote of the curve of regression or recovery. For the situation represented by the upper panel of Fig. 1, the curve of conditioning goes to unity, and the predicted course of spontaneous regression is given by the top curve in the upper panel of Fig. 2. If in the same situation, conditioning has been carried only to a probability level of, say, .67, then the total number of conditioned elements will be smaller and the curve of regression will not only start at a lower value, but will run to a lower asymptote, and so on. Similarly, if in the situation represented by the lower panel of Fig. 1, response probability goes to zero during the extinction period, the predicted course of spontaneous recovery is given by the lowest curve in the upper panel of Fig. 3; if extinction terminates at higher probability levels, we obtain the successively higher recovery curves shown in the figure.

Number of preceding learning periods. Increasing the number of preceding acquisition periods would tend to increase the total number

Fig. 2. Families of spontaneous regression curves. In the upper panel the proportion of conditioned elements in S' at the end of conditioning is zero and the proportion in S is the parameter. In the lower panel the proportion of conditioned elements in S at the end of conditioning is unity and the proportion in S' is the parameter.

Fig. 3. Families of spontaneous recovery curves. In the upper panel the proportion of conditioned elements in S' at the end of extinction is unity and the proportion of conditioned elements in S at the end of extinction is the parameter. In the lower panel, the proportion of conditioned elements in S' at the end of extinction is the parameter and the proportion in S is zero.
From elementary probability theory we have for the probability that an element is in S at the end of the (t + 1)st interval following an experimental period:

$$f(t+1) = [1 - f(t)]\,j' + f(t)\,(1 - j).$$

This difference equation can be solved by standard methods (2, 12) to yield a formula for f(t) in terms of t and the parameters:

$$f(t) = J - [J - f(0)]\,a^{t}, \qquad [1]$$

where f(0) is the initial value of f(t); J represents the fraction j'/(j + j'); and a represents the quantity (1 - j - j'). Since a is bounded between -1 and +1 by the definition of j and j', the probability that any element is in S will settle down to the constant value J after a sufficiently long interval of time, and the total numbers of elements in S and S' will stabilize at mean values N and N', respectively, which satisfy the relation

$$N = J(N + N'). \qquad [2]$$

Spontaneous recovery and regression. Curves of spontaneous recovery and regression can now be obtained by appropriate application of Equation 1. Let us designate by p(t) and p'(t) the proportions of conditioned elements, and therefore the response probabilities, in S and S' respectively at time t following an experimental period. The set of conditioned elements in S at time t will come in part from the conditioned elements, p(0)N in number, that were in S at the end of the experimental period, and in part from the conditioned elements, p'(0)N' in number, that were in S'. The probabilities of finding elements from these two sources in S at time t are obtained from Equation 1 by setting f(0) equal to 1 and 0, respectively. With these relations at hand we are ready to write the general expression for spontaneous recovery and regression:

$$p(t) = \frac{1}{N}\Bigl\{p(0)\,[J - (J - 1)a^{t}]\,N + p'(0)\,J\,(1 - a^{t})\,N'\Bigr\}$$

$$\phantom{p(t)} = p(0)\,[J - (J - 1)a^{t}] + p'(0)\,(1 - a^{t})(1 - J), \qquad [3]$$

the parameters N and N' having been eliminated by means of Equation 2.

The functions illustrated by the curve families of Fig. 2 and 3 are all special cases of Equation 3. In the upper panel of Fig. 2, p'(0) has been set equal to 0; in the lower panel, p(0) has been set equal to 1. In the upper panel of Fig. 3, p'(0) has been set equal to 1; in the lower panel, p(0) has been set equal to 0.
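As a check on the algebra, Equation 1 can be compared with direct iteration of the difference equation, and Equation 3 can be evaluated for the special cases named for Fig. 2 and 3. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the parameter values j = 0.2 and j' = 0.1 are arbitrary illustrations, not values reported in the paper.

```python
def f_closed(t, f0, j, j_prime):
    """Equation 1: f(t) = J - [J - f(0)] a^t, with J = j'/(j + j') and a = 1 - j - j'."""
    big_j = j_prime / (j + j_prime)
    a = 1.0 - j - j_prime
    return big_j - (big_j - f0) * a ** t


def f_iterated(t, f0, j, j_prime):
    """Iterate f(t+1) = [1 - f(t)] j' + f(t)(1 - j) for t steps."""
    f = f0
    for _ in range(t):
        f = (1.0 - f) * j_prime + f * (1.0 - j)
    return f


def p_of_t(t, p0, p0_prime, j, j_prime):
    """Equation 3: proportion of conditioned elements in S at time t."""
    big_j = j_prime / (j + j_prime)
    a = 1.0 - j - j_prime
    return (p0 * (big_j - (big_j - 1.0) * a ** t)
            + p0_prime * (1.0 - a ** t) * (1.0 - big_j))


if __name__ == "__main__":
    j, j_prime = 0.2, 0.1  # illustrative values only
    for t in (0, 1, 2, 5, 10, 20):
        # Regression with p'(0) = 0 (upper panel of Fig. 2) and
        # recovery with p(0) = 0 (lower panel of Fig. 3).
        print(t, p_of_t(t, 1.0, 0.0, j, j_prime), p_of_t(t, 0.0, 1.0, j, j_prime))
```

Varying `p0` with `p0_prime` held at 0 traces a family like the upper panel of Fig. 2, and varying `p0_prime` with `p0` held at 0 traces one like the lower panel of Fig. 3, subject to the assumed parameter values.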
For simplicity, it has been assumed in this paper that all of the elements in S* have the same values of j and j'. In dealing with some situations it might be more reasonable to assume that different parameter values are associated with different elements. For example, data obtained by Homme (11) in the Skinner box suggest that a portion of the elements should be regarded as fixed and always available, while the remainder fluctuate. Application of an analytic method described elsewhere (8) shows that conclusions in the general case will differ only quantitatively from those given in this paper.

Empirical Relevance and Adequacy

General considerations. The theoretical developments of the preceding sections present two aspects, one general and one specific, which are by no means on the same footing with regard to testability. It will be necessary to discuss separately the general concept of stimulus fluctuation and the specific mathematical model utilized for purposes of deriving its testable consequences.

The reason why the fluctuation concept had to be incorporated into a formal theory in order to be tested was, of course, the difficulty of direct observational check. Thus for the present this concept must be treated with the same reserve and even suspicion as any interpretation which appeals to unobservable events. This remoteness from direct observation may, however, represent only a transitory stage in the development of the theory. Relatively direct attacks upon certain aspects of the stimulus element concept are provided by recent experiments (1, 21) in which the sampling of stimulus populations has been modified experimentally and the outcome compared with theoretical expectation. Further, it should be noted that the idea of stimulus fluctuation is well grounded in physical considerations. Surely no one would deny that stimulus fluctuation must occur continuously; the only question is whether fluctuations are large enough under ordinary experimental conditions to yield detectable effects upon behavior.

The surmise that they are is not a new one; the idea of fluctuating environmental components has been used in an explanatory sense by a number of investigators in connection with particular problems: e.g., by Pavlov (19) and Skinner (22) in accounting for perturbations in curves of conditioning or extinction, by Guthrie (10) in accounting for the effects of repetition, and recently by Saltz (20) in accounting for disinhibition and reminiscence.

Considered in isolation, the concept of stimulus fluctuation is not even indirectly testable; it must be incorporated into some broader body of theory before empirical consequences can be derived. In the present paper we have found that when this concept is taken in conjunction with other concepts and assumptions common to contemporary statistical learning theories (2, 5, 8, 16), the result of the union is a mathematical model which yields a large number of predictions concerning changes in response probability during rest intervals. Once formulated, this model is readily subject to experimental test. Its adequacy as a descriptive theory of spontaneous recovery and regression can be evaluated quite independently of the merits of the underlying idea of stimulus fluctuation.

Spontaneous recovery. Space does not permit detailed discussion of the experimental studies, and we shall have to limit ourselves to a brief summary of empirical relationships derivable from the theory, together with appropriate references to the experimental literature. To the best of my knowledge, the references cited include all studies which provide quantitative data suitable for comparison with predicted functions.

a. The curve of recovery is exponential in form (3, 9, 17) with the slope independent of the initial value (3).
b. The asymptote of recovery is inversely related to the degree of extinction (3, 11).
c. The asymptote of recovery is directly related to the number of conditioning periods given prior to extinction (11).
d. The asymptote of recovery is directly related to the spacing of preceding conditioning periods (11).
e. Amount of recovery progressively decreases during a series of successive extinction periods (4; 13; 19, p. 61).

It may be noted that items c and d represent empirical findings growing out of a study conducted expressly to test certain aspects of the theory. Many additional predictions derivable from the theory must remain unevaluated until appropriate experimental evidence becomes available, e.g., the inverse relation between asymptote of recovery and spacing of extinction trials or periods, and the predictions concerning "extinction below zero" mentioned in a previous section.
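The exponential form in item a, and the way the asymptote depends on the state of S', can be read off Equation 3 by regrouping its terms. The following is a sketch of that algebra using only the symbols of Equation 3. Reading the asymptote in terms of degree of extinction rests on an added assumption, made here only for illustration: that more extensive extinction leaves a smaller p'(0).

```latex
\begin{aligned}
p(t) &= p(0)\,[J - (J-1)\,a^{t}] + p'(0)\,(1 - a^{t})(1 - J) \\
     &= \underbrace{J\,p(0) + (1-J)\,p'(0)}_{\text{asymptote } p(\infty)}
        \;+\; (1-J)\,[\,p(0) - p'(0)\,]\,a^{t}.
\end{aligned}
```

When |a| < 1, the second term shrinks geometrically at a rate fixed by a alone, whatever the value of p(0), and the curve approaches J p(0) + (1 - J) p'(0). With p(0) = 0 after complete extinction, the asymptote of recovery is (1 - J) p'(0), so under the stated assumption a smaller p'(0) gives a lower asymptote.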
Spontaneous regression. Predictions
tween
beconcerning functional relationships
spontaneous regressionand such
experimentalvariables as trial spacing
or
degree of learning parallel those
given
but
for
above
in the
of
in amount
Summary
In
are
of
purposes
predictedexponential
verification. The
decrease
for
available
progressivelyreduced if we
successful in bringingother relevant
are
cal
independent variables into the theoretiof the
fold by further applications
here.
analyticalmethod illustrated
regressionas
this paper
be
extinction,may
from
recovery
for in terms
ing
of preceding learnbeen
observed in several
investigated
have
we
that certain apparently

the possibility
behavioral
changes, e.g.,
spontaneous
in stimulus
function of number
periodshas
be
it may
recovery,
regressionthere
of
case
data
fewer
spontaneous
PSYCHOLOGY
MATHEMATICAL
IN
READINGS
of random
conditions.
counted
ac-
tion
fluctuaTaken
the concept of random

isolation,
in
lus
stimu-
has proved untestable,

fluctuation
tions
(6, 11, 13, 14). Predicit
into a model
but when
incorporated
concerning regressionin relation
of
have
led
not
has
descriptions
of
to
quantitative
to spacing
learningperiods
pirical
tested in conditioningsituations, a variety of already established embeen
to be in agreement with
but they seem
relationshipsconcerning spontaneous
and
to
and
lationships
regression
rather widely established empirical rerecovery
of
and
tion
retenthe
determination
new
ones.
some
between
spacing
A forthcoming paper in which the same
in human
learning (15, pp. 156tribution
model is appliedto the problem of dis158; 18, p. 508).
ther
furwill
be
raised
of
provide
the
practice
question may
Finally,
studies
there
whether
that would
are
no
embarrass
ness
evaluation of its scope and usefulof
in the interpretation
learning
facts
experimental
the
ory.
present the-
comprehensiveness phenomena.
the theory, then
made
for
had
been
REFERENCES
negative instances would be abundantly
K., " Hexlyer,
1. BxjRKE, C. J., EsTES, W.
available. Under
conditions,for
some
lation
of verbal conditioning in reS. Rate
regressionfails to
or
example, recovery
to stimulus
variability. /. exp.
at all following extinction or
appear
Psychol, 1954, 48, 153-161.
F.
Stochastic
ever,
2. Bush, R. R., " Mosteller,
conditioning,respectively.Since, howmodels
for learning. New York: Wiley,
dealing with a theory that
are
we
in press.
pendent
is limited to effects of a single inde3. EixsoN, D. G.
Quantitativestudies of the
variable,stimulus fluctuation,
ery
interaction of simple habits: I. Recovfects
instances of that sort are of no special
from
specificand generalizedef/. exp. Psychol.,
of extinction.
limited
Like
theory,
significance.
any
1938, 23, 339-358.
tions
be tested only in situathis one
can
If
where
claim
of
suitable
measures
are
taken
4.
G.
D.
Ellson,
Successive
extinctions of
in rats.
bar-pressingresponse
Psychol.,1940, 23, 283-288.
a
and
where
the
represented
negligibleor
effects of variables not
in
And
the
else
model
are
5.
these
evidence
seems
cations,
qualifito
K.
Toward
statisticaltheory
learning. Psychol. Rev., 1950, 57,
94-107.
6.
be
danger
uniformly confirmatory. The
dence
of continually evading negative eviables
by ad hoc appeals to other varibe entirelyobviated, but
cannot
EsTES, W.
of
quantitatively
predictable.
subject to
available
either
/. gen.
EsTES, W.
K.
Effects
of competing
tions
reac-
for bar
conditioning curve
pressing. /. exp. Psychol., 1950, 40,
on
the
200-205.
7.
EsTES, W.
K.
Statistical theory of
phenomena
Psychol. Rev., in press.
in
tributiona
dis-
learning.
K.
W.
8.
K.,
W.
EsTES,
of
C.
Graham,
H.,
16.
The
A.,
and
extinction,
conditioned
operant
Psychol.,
exp.
17,
1952,
369-396.
sponse.
re-
17.
/.
tistical
sta-
learning.
spontaneous
Psychometrika,
of
recovery
J.
verbal
of
description
acquisition,
The
New
1952.
W.
McGill,
"
L.
learning.
Green,
Longmans,
G.
Miller,
A.
Irion,
"
human
of
psychology
York:
M.
R.
A.,
J.
McGeoch,
276-286.
Gagne,
"
15.
theory
learning.
in
60,
1953,
Rev.,
J.
variability
Psychol.
9.
C.
Burke,
"
stimulus
331
ESTES
N.
Miller,
E.,
S.
Stevenson,
"
tated
Agi-
S.
26,
1940,
behavior
of
during
rats
mental
experi-
251-280.
extinction
10.
and
York:
New
of
Spontaneous
extinction
in
reinforcements,
13.
of
D.
"
I.
Pavlov,
diana
In-
E.
W.
1927.
Press,
cence,
reminis-
for
theory
other
and
regression,
Psychol.
sive
Succes-
K.
reflexes.
London:
Anrep.)
single
act
1950.
EsTES,
V.
Univer.
Oxford
Saltz,
1953.
Conditioned
G.
perimental
ex-
York:
New
Press,
P.
by
(Trans,
20.
differences.
finite
of
Chelsea,
W.,
Univer.
Oxford
19.
in
theory
and
psychology.
tion,
acquisi-
thesis,
Ph.D.
Method
E.
C.
Osgood,
number
extinction
initial
of
Calculus
York:
Lauer,
spacing
18.
1953.
Univer.,
C.
Jordan,
New
to
Unpublished
period.
12.
relation
duration
and
recovery
205-231.
21,
1936,
from
taneous
spon-
Psychol.,
comp.
1952.
Harper,
E.
L.
Homme,
of
curve
/.
recovery.
11.
ing.
learn-
of
psychology
The
R.
E.
Guthrie,
Rev.,
1953,
nomena.
phe159-
60,
171.
extinctions
and
acquisitions
of
21.
habit
jumping
in
relation
to
M.
Schoeffler,
compounds
to
of
/.
reinforcements.
48,
1955,
of
Probability
of
comp.
/.
exp.
Psychol.,
48,
1954,
8-13.
323-329.
14.
Lauer,
D.
W.,
"
W.
Estes,
K.
of
Rate
22.
successive
learning
discrimination
F.
B.
Skinner,
in
relation
to
trial
1953,
8,
York:
spacing.
Crofts,
Psychologist,
The
384.
behavior
of
versals
re-
New
Amer.
sponse
re-
discriminated
physiol.
stimuli.
Psychol.,
S.
schedule
1938.
stract)
(Ab(Received
April
18,
1954)
Appleton-Century-
isms.
organ-
VARIABILITY
STIMULUS
OF
THEORY
W.
K.
AND
ESTES
Indiana
There
are
by
ments
learningexperirecognizedas important
in
stimulatingsituation
J. BURKE
University
of aspects of the
number
C.
LEARNING
IN
others.
Statistical theories of learning
differ from
Hull
in
making
way
1
are
to
the necessary
tions
condito
as
ble
plausigenerally appear
for
drawn
from
are
learning
tiguity
conbut they have not gained wide acceptance
theories
from
reinforcement
or
ing,
of learninvestigators
among
certain
characteristics
of the
theories,
possiblybecause Guthrie's assumptions
invariant
with
spect
relearning process are
been
formalized in a
have
not
while
other
stimulus
to
properties
them
make
that would
easilyused
characteristics depend in specificways
This paper
is based upon
a
reported upon
paper
the
of the
nature
stimulating
by the writers
Institute
and
at
the
Boston
of Mathematical
The
lines has
has
meetings of the
situation.
ber
Statistics in Decem-
writers' thinking along these
1951.
related
research
been
been
stimulated
facDitated
by
and
their
The
participation
in mathematical
interuniversityseminar
at
for
behavior
met
theory which
of 19S1 and
Tufts
College during the summer
sponsored by SSRC.
was
in
central
stimulus
be used
variability
concept
for explanatory purposes
than
rather
points
by theorists of otherwise diverse viewit
and
of
as
a
but which
source
treating
they
explicit
error,
require
resentation
repand Guthrie
in attempting
model
for effectivego beyond Skinner
in a
formal
that
One
to construct
utilization.
a formalism
find, for
may
will permit unambiguous statements
of
example, in the writings of Skinner,
about
stimulus
variables
of
clear
and
Guthrie
assumptions
recognition
Hull,
and
of
the conthe statistical character of the stimulus
sequences
rigorous derivation
these
of
All
conceive
a
assumptions.
stimulating
concept.
It has
been
shown
in a
of many
situation as made
previous
nents
compoup
that
several
pendently.
less inde(7)
which
pects
or
quantitativeasmore
paper
vary
of learning,for example the exthis locus of agreeFrom
ponential
ment,
of habit growth regucurve
strategies
diverge. Skinner (17)
larly
obtained
in
certain conditioning
incorporates the notion of variability
of
into
his stimulus-class
experiments, follow as consequences
concept, but
statistical assumptions and need not be
little use
of it in treatingdata.
makes
accounted
for by independent postuHull states the concept of multiplecomponents
lates.
All of the derivations were
ried
carexplicitly(13) but proceeds to
tions
write postulatesconcerning the condiout, however, under the simplifying
of a
of single assumption that all components
of learning in terms
the
stimulatingsituation are equally likely
leaving a gap between
components,
trial. By removing
to
fined
deand
occur
on
formal
experimentally
any
theory
that
in a posivariables.
Guthrie
now
we
are
restriction,
tion
(11) gives
the theory
to generalizeand extend
verbal
nomena,
interpretationsof various phein several respects. It will be possible
e.g., effects of repetition,in
show
that regardlessof whether
to
sumptions
asthese interpretations
of stimulus variability;
terms
that
Set
Generalized
Assumptions
and
Model:
Notation
an
models
This
article
appeared
in
The
exposure
of
stimulating situation
Psychol. Rev., 1953, 60, 276-286.

332
an
organism
determines
Reprinted with
to
a
set
permission.
W.
of
referred
events
ESTES
K.
AND
C.
that any change in the

shall attempt to deal
It is assumed
collectivelyas
to
333
BURKE
J.
situation (and we
constitute
plines
ing
only with controlled changes correspondspecial disciof
to
experimental
concerned with vision,audition,
manipulations
tion
distribumodel
a
wish to formulate
new
variables) determines
We
our
etc.
of values of the 6i. By repeating
mation
of the stimulus situation so that inforthe same
these special disciplines the "same"
mean
from
we
situation,
described
in
as
be fed into the theory, although
physical terms, and we
can
will depend
utilization of that information
speaking,repetition
recognizethat,strictly
refers
of
the
situation
to an
demands
of
the
exsame
learning
upon
which
be
idealized state of affairs
can
pferiments.
shall make
approached by increasingexperimental
For the present we
only
completely
the followingvery
general assumptions control but possibly never
realized.
the stimulating situation:
about
(a)
It is recognizedthat some
of
sources
The effect of a stimulus situation upon
internal
the
stimulation
to
made
are
be
ism.
organregarded as
an
organism may
This means
that in order to have
events,
of many
(b)
component
up
a
reproduciblesituation in a learning
When
a series
a situation is repeated on
to control
these
of
experiment it is necessary
of trials,
one
component
any
These
stimulation.
data
the
of
stimulus
events
trials and
first
events
various
the
occur
may
fail to
others; as a
tive
least,the rela-
occur
when
the
stimulus
(as
situation
same
experimentally)occurs
on
fined
de-
series
pendent
be representedby indetrials,
may
formulate
probabilities.We
lows:
these assumptions conceptually as folof
and
preceding
any given organism

elements.^
set 5* of N*
elements
N*
of 5*
stimulus
of the
in that
each
represent all
that
events
organism
with
to
are
can
of these
values
by di
event
of S*
N,
and
the
In
we
assume
have
noted
distribution
stimulus
the
of
6; we represent
probabilitythat the stimulus
corresponding to the i^^ element
occurs
on
given trial.
any
the sequel, various sets will be designated
by the letter 5, accompanied by priate
approsubscripts and superscripts. The letter
the
same
arrangement
set.
denotes
"trial"
the behavior
on
situation
certain
to be
elements
trial.
that
sampled
probability
those
elements
upon
the
of
of the parameter
superscripts,always
that
given trial is assumed
possibleevents
the
with
use
trial.
on
function
which
If in
elements
of
have
behavior
correspondingto an element of the set.

this reason
For
(b) For any reproduciblestimulating specificsituation
situation
present
the term
of 5*
are
given
have
occur
situation whatever
in any
the
In
ganism
or-
immediately
sociate
as-
The
trial.
the
the
of
extended
to necessense
sitate
sufficiently
the
distribution
in
6
including
movement-produced-stimulation arising
the
from
the responses
occurring on
in
we
also activities
shall not
we
paper
We
(a) With
schedule
maintenance
on
approximation, at
frequenciesof the various
events
the
some
on
set 5.
An
being sampled,
negligibleeffect
in
we
by
that
often
means
element
situation.
represent
of
of 5*
a
duced
re-
is in 5
value of
only if it has a non-zero
the given situation.
These
sets
nection,
representedin Fig. 1. In this conthat a probmust
note
we
ability
of zero
for a given event
does
if and
6
in
are
not
mean
that
the
event
can
never
"accidentally";this probability
has the weaker
tive
meaning that the relaof subscripts
of the
frequency of occurrence
the size of
is zero
in the long run.
For a
event
occur
3 +.1
Fig.
5
set
1.
schematic
representation of stimulus elements,the stimulus space 5*, the reduced

with non-zero
and the response
0 values for a given stimulatingsituation,
containingelements
classes A
partitionof
more
and
detailed
the reader
A.
The
S into Sa and
joiningelements of S
arrows
explicationof
this
is referred to Cramer
It should
events.
be
different
many
For
associated
example,
with
The
point
(5).
clearlyunderstood that
the probability,
6, that a given stimulus
trial may
event
occurs
on
a
depend
upon
to the response
classes represent the
S^
environmental
stimulus
The
response
previous paper
for its
model
(7)
formulated
will be
used
in
here
We
of
important modification.
any
shall deal only with the simple case
tive
two
mutually exclusive and exhaus-
event
may
Model
without
response
visual stimulation
Response
classes.
class being recorded
in
The
a
response
tion
given situa-
probabilityupon several
will be designatedA and the complementary
in the environment.
light sources
The
dependent
class,A.
lus
Suppose that for a given stimuvariable of the theory is the probability
the
associated
element,
probability that the
occurring on a given
response
^ in a given situation
depends only trial is a member
of class A.
It is recognized
two
separatelymanipulable components
upon
that in a learning experiment
of the environment, a and
b, the behaviors available to the organism
and
that the probabilities
ment's
of the eledifferent
be
classified in many
may
being drawn if only a or b alone
the
interests
of
depending upon
ways,
were
present are 6a and ^s,respectively. the
experimenter. The response class
Then
the probabilityattached
to this
be anyselected for investigation
thing
may
element
in the situation with both components
from the simplestreflex to a complex
present will be
chain of behaviors
involvingmany
different groups of effectors. Adequate
6
Oa-\- Ob
daQhdepend
different
"
336
READINGS
correspond
to
and
system,
speak
of
case
this
of
parameter
we
evaluated
of
the
not
where
havior
beto
cease
the
be
it cannot
relative frequency. It
as
MATHEMATICAL
probabilityin
as
situation
do
IN
PSYCHOLOGY
that
specifythe probabilities
element
that is sampled
on
trial will become
to
A.
For
shall
connected
convenience
limit
ourselves
lus
stimu-
any
given
to
or
in
exposition,we
in
this paper
to
a homogepreviouspaper (7) the simplestspecialcase, i.e.,

neous
trials with
series of discrete
be related in a simple manthat p can
ner
ments
that all eleto rate or latency of responding in
probabilityequal to one
nected
tions
occurringon a trial become consituations;thus in all applicamany
A.
to response
of the theory, p is evaluated in
with the rules prescribedby
We
accordance
begin by asking what can be said
about
the course
of learningduring a
the theory,either from frequency data
and
trials
tribution
from
other
of
or
appropriate data,
regardlessof the dissequence
has
in
shown
been
matical
is treated for all mathe-
evaluated
once
as
purposes
probability.
of
be
shown
define
Representation
Learning
of
stimulus
that
our
familyof
events.
It
will
generalassumptions
mathematical
tors
opera-
describinglearningduring any prescribed

t
he
member
of
trials,
sequence
the family applicablein a given situation
depending upon the 6 distribution.
shall first inquire into the characteristics
Processes
gradual of
of learning in most
course
situations,
number
the
earlier
of
a
quantitative We
of a
to all members
common
theories,e.g., those of Hull (13), Gulliksen and
Wolfie
family, and then into the conditions
(10), Thur stone
under
which
the operators can
be apassumed
individual
have
that
nections
con(18)
proximate
formed
adequatelyby the relatively
gradually over
are
a
series of learningtrials. Once we adopt
simple functions that have been found
convenient
for
statistical
view
the
of
a
representing learning
stimulating
data in previouswork.
be
shown
situation,however, it can
Let us consider the course
of learning
rigorouslythat not only the gradual
the
in
but
the
form
the
of
trials
of
of
course
during a sequence
learning
be accounted
simplifiedsituation. Each trial in the
can
typicallearningcurve
series is to begin with the presentation
for in terms
of probabilityconsiderations
that
of a certain stimulus
if
assume
even
we
tions
conneccomplex. This
In
order to account
are
basis.
to
This
be
would
no
formed
on
an
all-or-none
being the case, there

evidence
whatsoever
require a
formation
for the
of
seems
that
postulate of gradual
individual
connections.
situation defines
5*
so
that each
distribution
of 6
element in 5* has
over
some
di,of occurringon any trial,

probability,
and
we
represent by S the subset of
elements
with non-zero
6 values; any
element
that occurs
on
a trial becomes
Psychologicallyan all-or-none assumption
connected
has the advantage of enabling us
to ^
(or remains connected
has
drawn
A
if
it
been
for
the
that
to
fact
to
account
on
a previous
readily
reader
the
situations
in some
learning is sudden
trial). For concreteness
and gradual in others;mathematically, might think
of a simple conditioning
it has the advantage of great simplicity. experiment with the CS
preceding the
For
these
statistical US
recent
by an optimal interval,and with
reasons,
is
theories of learninghave adopted some
conditions
arranged so that the UR
each
trial
and
decremental
form of the all-or-none assumption (3, evoked
on
the situation repfactors are negligible;
7, 15).
resented
Under
all-or-none theory,we must
an
by S is that obtainingfrom the
W.
of the CS
onset
The' number
of elements
US,
shall
connected
at the
This
p equal to
zero.
"*"*element
after the
Sj
it is not
in 5 will stillremain
w**' trial if and
only
"
element
is connected
(3)
The
to A
F,{n)
(1
to
E[Ni{n)],
will be
ed-.
from
individual
[1
(1
e.)"]
now
in
iv
E (1
positionto
î)".
express
p, the probabilityof response ^, as a

function of the number
of trials in this
for the term
By substituting
of
its
Fi{n)
equation (1)
equivalent
from
equation (3), we obtain the relation
curve
reduces to
(6)
p{n)
which, except
is the
(1
to
and
no
for
equal
^)"
notation,
previously
in
change
function
same
for the
(7)
have
derived
and
6 case*
sponds
corre-
linear operator used by

Mosteller (2) for situations
the
factor
decremental
is
volved.
in-
the
relation
same
observed
to
Hull's
probability
responding
theory as in the present formulation.
tion
Except where the distribution funcof the 9i either is known, or can
be assumed
theoretical grounds to
on
be approximated by some
pression,
simple exwill
be
not
venient
conequation (5)
with.
In practicewe
to work
are
apt to assume
equal di and utilize
describe
to
equation (6)
experimental
data.
The
in
nature
of the
of approximatio
error
involved in doing this
can
be stated
generally. Immediately after

first trial,
the curve
for the general
the
must
case
4
=
can
situation.
(5) P{n)
$i. It
verified
of
(4) E[_NA{n)'] i:Fi{n)
are
independent
the
in 5
elements:
We
there is
where
of elements
contributions
of
by substitution that
fixed point at /"
a
1, and this
the asymptote approached by
of /"(") vs. n as " increases
will be
the
possible
number
In mathematical
form, equation
Hull's wellas
(6) is the same
after the
n^^ trial, known
expression for growth of habit
the sum
of these expected
strength, but the function does not
expected number
connected
easilybe
Bush
distribution of
after the n'''
obtain:
we
trial,
the
on
trials;the
{\
6iY. Hence, if Fi{n) represents
the expected probabilitythat this
is
of
of
family
each
if
any of the first n

likelihood that this occurs
sampled
Equation (5) defines

for
learning curves, one
6 distribution,
and it has
simple propertiesthat are
this simplification
negativelyaccelerated curves, approaching
a
easilybe extended
simple negative growth function
initial
the
Bi tend toward
as
arbitrary
equality. If all
of the 9i are
equal to 6, equation (5)
condition.
The
that the learning
337
BURKE
J.
in
results may
of any
case
the
to
are
C.
all bounds.
all begin with N^.
Members
of
the
over
will
No
loss of erality
be monotonicallyincreasing,
family
gen-
is involved
our
in S
beginning of the
means
obtained
curves
and
of the elements
to A
experiment.
simplicitywe
tions
following deriva-
the
in
suppose
that none
in S will
For
N.
designatedby
in
of the
to the onset
AND
the response probabilityp will refer

tion.
of ^ in this situato the probability
and
be
ESTES
K.
This
j^^êl\-{\-e:)--]
is
for
lie above
the
the
essentially
the
equal $
that
paper.
function
same
case
(7) ; the terms 9 and

correspond to the terms
q
paper
for the
curve
in
of
s/S,
veloped
de-
previous
equation (6)
and
of
Fig. 2. Response probability, in S, as a function of number of trials for the numerical examples presented in the text. The solid curve is the exact solution for a population of elements, half of which have θ = 0.1 and half θ = 0.3. The dashed curve describes the equal θ approximation with θ = 0.2. Initially no elements of S are conditioned to A.
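The comparison shown in Fig. 2 can be recomputed numerically. The following is a minimal sketch, assuming the exact curve has the form p(n) = 1 − (1/(Nθ̄)) Σ θ_i(1 − θ_i)^n with no elements conditioned to A at the start, and that the equal θ approximation is p(n) = 1 − (1 − θ̄)^n. The function names are hypothetical and are introduced here only for illustration.

```python
def exact_curve(thetas, n):
    """Response probability after n trials for elements with unequal theta values.

    Assumes no element is conditioned to A at the start and that each
    element's contribution to p is weighted by its own theta.
    """
    total = sum(thetas)  # equals N times the mean theta
    unconditioned = sum(t * (1.0 - t) ** n for t in thetas)
    return 1.0 - unconditioned / total


def equal_theta_curve(theta_bar, n):
    """Equal theta approximation: every element is given the mean theta."""
    return 1.0 - (1.0 - theta_bar) ** n


# Population from the numerical example: half the elements at 0.3, half at 0.1.
thetas = [0.3, 0.1]
theta_bar = sum(thetas) / len(thetas)

for n in range(0, 13):
    print(n, round(exact_curve(thetas, n), 4), round(equal_theta_curve(theta_bar, n), 4))
```

Under these assumptions the exact curve lies above the approximation on the early trials and falls slightly below it on later trials, while both approach unity.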
involved
ple
of a simby means
numerical example. Imagine that
increases for a few trials,
two
curves
then decreases until they cross
structing we
are
dealing with some
particular
(in condistributions
of
in
which
the
6
conditioning experiment
hypothetical
have
CS can be representedby a set 5, comdiverse
forms we
posed
usually found
ments,
of two
this crossing in the neighborhood of
subsets of stimulus eleand
the
after
of
sizes
the fourth to eighth trial)
ing,
crossSj,
5i
N^
N^
;
of elements
the curves
tent
diverge to a smaller exN/2, where N is the number
Assume
that for all elements
than before,then come
in S.
togetheras
the
both
the
at
in
of
to
same
probability
being
5i
as3niiptote
go
be
drawn
I.
It can
0.3 and for
proved that the
on
p
any trial is ^^
those in Sg, $2
wish to
for the general and specialcase
^"^- Now
we
curves
from
the
n
one
as
cross
exactly once
predicted learning curve
compute
goes
make
cannot
to infinity.We
sponses
during a series of trials on which A reeral
any genabout
the maximum
statement
ror
erare
reinforced,assuming that
involved
in approximating expreswe
begin with all elements connected to
sion
(5) with expression(6), but after /4. Equation (S) becomes
of specialcases, we
studying a number
equal
6 case;
the
siderat
=
inclined to
are
believe that
the
error
p(n)
approximation will
be too small to be readily detectable
ing
experimentally for most
simple learnintroduced
by
situations
the
that
do
not
involve
$$p(n) = 1 - \frac{1}{N\bar{\theta}}\left[N_1(0.3)(1-0.3)^n + N_2(0.1)(1-0.1)^n\right] = 1 - 2.5\left[0.3(0.7)^n + 0.1(0.9)^n\right]$$
compounding
of stimuli.
The
development of equations (5)

(6) has necessarilybeen given in
rather general terms, and
be
it may
of the conhelpful to illustrate some
and
Plotting
from
curve
Now
this
numerical
computed
values
equation,we
given in Fig. 2.
let us approach
obtain
the
the solid
same
prob-
W.
ESTES
K.
AND
supposing this time that we

the
different 6
know
nothing about
values in the subsets S^ and 5, and are
We
tain
ob0.2.
now
given only that d
under
predicted learning curves
the equal 9 approximation. Equation
lem,
but
(6) becomes:
p{n)
values
numerical
and
(1
yield the dashed

Inspectionof Fig.
this
computed from
of Fig. 2.
curve
that
shows
leads
treatment
exact
0.2)"
to
the
higher values
early trials but to lower

the difference
values on
the later trials,
The
for
large n.
becoming negligible
in brief,for the steeper curvature
reason,
of p{n)
of the exact
with
high
drawn, and
is that elements
curve
values
conditioned
therefore
earlier in the learningprocess

low
with
values,and
will tend
to
be
likely to
are
recur
A,
to
than
ments
ele-
then
cause
be-
frequently
J.
339
BURKE
theory developed above, limiting ourselves to the equal θ case.

As written, equation (6) represents the predicted course of conditioning for a single organism with an initial response probability of zero. We can allow for the possibility that an experiment may begin at some value of p(0) other than zero by rewriting (6) in the more general form
(7)
p{n)
which
the
on
they
C.
=:
has
the
of
course
to
(8) Pin)
ey
(6)
as
consider
conditioningin
organisms with
varying initial
need simply
we
the group
form
same
wish
we
pmn
cept
ex-
initial value.
for the
If
[1
like
mean
of
group
values
of
but
probabilities,
response
equation (7) over

by m, obtaining
sum
divide
and
the
^Zpin)
in successive
high
late
the
of
would
be
an
that
have
early trials will
unconnected
with
values
lead
p.
elements
not
During
been
contribute
per
tively
rela-
to
stages of learning,elements
low 6 values
on
samples, to
with
drawn
the
The
[1
standard
p(o)](i
deviation
these circumstances
is
ey.
of p{n)
simply
der
un-
more
trial than
appearing at the same

stage
equal 6 distribution and will
(9) "tM
P'(n) pKn)
\Jlî:
-
^)Vp(O)
(1
depress the value of p below the curve
where
for the equal 6 approximation.
o-p(O) is the dispersionof the
bility
Variainitial p values for the group.
be
It should
emphasized that the
around
the
curve
mean
learning
the
to
of
generality
present approach
in
decreases
manner
to zero
a
simple
troduced
learningtheory lies in the concepts inand
the methods
developed as learningprogresses.
ing,
The
of counter-conditiontreatment
for operating with
them, not in the
extinguishingone response by
i.e.,
tion
Equaparticular equations derived.
to a competing
(5), for example, can be expected giving uniform reinforcement
follows
automatically
to apply only to an
response,
extremely narrow
from
of the acquisition
the
account
class of learningexperiments. On
our
riving process.
utilized in deother hand, the methods
Returning to equation (6)
that the probabilities
of A
and
to
recalling
equation (5) are applicable a
and
A
terest
to unity, we
wide varietyof situations.
For the inmust
always sum
that while response
A undergoes
note
of the experimentally oriented
with
accordance
in
will
few
of
indicate
(6),
conditioning
brieflya
reader,we
extinction
in
the
of
A
the
obvious
extensions
must
most
undergo
response
=
between
with the function
accordance
of
terms
the
respond
theory and experimentalvariables,will

1
(1
e)\
pA{n)
plin)
be
experimental evaluation
possible.
If,then, we begin with any arbitrary Limitations of space preclude a detailed
theoretical analysisof individual learning
p(0) and arrange conditions so that A
=
conditioned
and
is evoked
drawn
to
to
in this paper.
In order
how
the model
will be utilized
indicate
and
will be
to
suggest
of
some
its
planatory
ex-
shall
function
simple decay
situations
all elements
tinction
trial,the exgiven by
each
on
of response
the
we
clude
conpotentialities
few
a
general remarks
of learning
concerning the interpretation
within
the
theoretical
phenomena
framework
have developed.
we
Applicationof the model to any one
isolated experiment will always involve
element
for information
of circularity,
an
about a given 6 distribution must
with
$$p(n) = p(0)(1-\theta)^n. \tag{10}$$
Again the mean and standard deviation of p(n) can be easily computed for a group of organisms with like values of θ but varying values of p(0):
$$\bar{p}(n) = \bar{p}(0)(1-\theta)^n, \tag{11}$$
$$\sigma_{p(n)} = (1-\theta)^n\,\sigma_{p(0)}. \tag{12}$$
As in the case of acquisition, variability around the mean curve decreases to zero in a simple manner over a series of
be
obtained
from
behavioral
This
disappears as
circularity
data
are
related
available
from
The
data.
soon
number
as
of
utilityof the
experiments.
is expected to lie in the possibility
of predictinga variety of facts
concept
trials.
Since
due to variation in
variability
the parameters of the 0 distribution
once
ing
p{0) is reduced during both conditionhave
been
evaluated
for a situation.
and counter-conditioning,
it will be
involved
has
The
methodology
that in general we
should
seen
expect
been illustrated on a small scale by an
of reless variabilityaround
a
curve
6
experiment (6) in which the mean
of original
learningthan around a curve
value for an operant conditioningsituation
learning for a given group of subjects.
estimated
from the acquisition
was
bar-pressinghabit and then

of
predicting the course
bar-pressing
acquisition of a second
under
habit
animals
by the same
modified
conditions.
slightly
curve
Application
of
Model
to
the
Statistical
Learning
Experiments
Since
been
our
with
in this paper
has
the development of a stimulus
concern
of
utilized
When
in
the
statistical model
is taken
tion
togetherwith an assumption of associahas been necessary
have
the
in the interests of
tials
essenby contiguity,we
clear exposition to omit
reference
of a theory of simple learning.
to
of the empirical material
The
most
learning functions (5), (6), and
upon
which
theoretical
our
(10) derived above should be expected
assumptions are
based.
The
evaluation
of the model
to provide a descriptionof the course
detailed interpretation of learning in certain elementary exmust
rest upon
periments
of specificexperimental situations.
It
in the areas
of conditioning
is clear,however, that the statistical
be emand verbal association.
It must
phasized,
model
developed here cannot be tested
however, that these functions
in isolation;
will not constitute an
gether alone
adequate
only when it is taken toof
with
for a number
assumptions as to how
theory of conditioning,
and with rules of correlevant variables,
especiallythose conlearningoccurs
model
of considerable
it
generality,
W,
been
not
tions.
deriva-
our
factors
decremental
(1,4, 9, 14, 16)
that the
mized,
mini-
are
evidence
is considerable
there
of
curve
C.
AND
conditioning experiments
In
where
in
into account
taken
have
decrement,
trollingresponse
ESTES
K.
ditioning
con-
argument
equation (5) and can

by the equal
The
that
fact
the model
curves
from
fitted to certain
be
is
(7).
case
pirical
em-
desirable outcome,
of
be
regarded as providing
of
the
test
a
exacting
very
theory; probably any
contemporary
to accomplish
quantitativetheory will manage
but
course,
cannot
this
much.
the
On
detail in the
be noted
that the
tain
theory yieldscerspecificpredictionsconcerning the
effects of past learningupon
the course
situation.
of learningin a new
In general,
the
increment
extent
end
in p
depends to a certain
immediately preceding
trials. Suppose that we
the
upon
of
sequence
identical
have
two
which
decrement
or
trial
during any
proximated
ap-
well
derived
functions
can
be
6
in mathematical
present paper, it may

statisticalassociation
has the principalpropertiesof

our
341
BURKE
J.
animals
each
of
/"(")equal,say, to 0.5 at the

trial n of an experiment, and
has
of
that for each animal

suppose
trial n +
A
is reinforced
on
histories
of
the
two
animals
to differ in that
other
response
The
I.
are
sumed
prethe first animal
has
arrived
via a seat
quence
p{n)"O.S
hand, the fact that the propertiesof our
reinforced
trials
while
the
of
tistical
learning functions follow from the stasecond animal has arrived at this value
tion
of the stimulatingsituanature
of unreinforced
trials.
is of some
interest;in this respect via a sequence
animal
will
On
trial n 4- 1, the second
the structure
of the present theory is
receive the greater increment
to p (except
simpler than certain others, e.g., that
the
9
in
the
pendent
indewhich
of Hull
reason
equal case);
is,
(13),
require an
animals
the
in brief, that for both
for the
postulate to account
stimulus elements
most
form of the conditioningcurve.
likelyto occur
\
those
with high 6
tions
trial
It should also be noted that devian
+
are
on
ments
from
the exponentialcurve
form
values; for the first animal these eleoccurred
will
have
of
be
instances
frequently
as
a
s
significant
may
the present model
quence
during the immediately preceding sewe
good fit. From
of trials and thus will tend to
kind of deviation
must
predicta specific
be preponderantly connected to A prior
when
the stimulatingsituation contains
of the secelements
values.
of widely varying 6
to trial " -h 1 ; in the case
ond
of conditioning
animal, the high 6 elements will
If, for example, curves
to A
stimuli taken
to two
during the
separatelyyield have been connected
and
different values of 6, then
immediately preceding sequence
significantly
trial
is reinforced
thus
when
A
the curve
of conditioning to a comon
pound
of the two
to deviate
the
separate
growth
one
(16); Miller's
theory as
relevant
this
to
The
only
results appear
analysis,but
substantiated
Although
become
we
pected
ex-
simple
relevant
discovered
this
be
either of
a
in
reported by
regard
data
than
from
have
we
literature is
hesitate
further
curves
function.
line with
stimuli should
to
we
aspect
periment
ex-
the
Miller
be
in
would
of
the
until additional
develop
-\-\, the
second
the
animal
will receive
in weight of conthe greater increment

nected
elements.
this analysisit
From
that,other thingsequal,a curve

reconditioning will approach its
rapidly than the curve
asymptote more
tion
of originalconditioningunless extinchas actuallybeen carried to zero.
How
important the role of the unequal
follows
of
distribution will prove
to
be
in
counting
ac-
empirical phenomena of
relearningcannot be adequately judged
for
available.
shall not
until
further
means
for
has
research
provided
estimating the orders
of the effects
3.
R.
Bush,
for
of magnitude
R.,
"
stimulus
F.
Mosteller,
generalization and
Psychol. Rev.,
mentioned
have
we
model
nation.
discrimi-
1951, 58, 413-
423.
here.
.4.
Decremental
Calvin, J. S.
factors in
con-
ditioned-respo^nselearning. Unpublished
Summary
statistical
Earlier
Ph.D.
learning
associative
of
treatments
have
5.
ple
sim-
thesis,Yale
H.
Cramer,
statistics.
been
Univer.,
1939.
Mathematical
Princeton:
methods
Princeton
of
Univer.
fined
re-
Press, 1946.
and
stimulus
generalized by analyzing
heretofore
fact
in greater detail
concept
and
by taking
different
that
the
6.
of
components
corresponding
to
bar
statistical model
relative
The
set.
which
various
variable
7.
with
given
set
model,
assumption
contiguity, provides
of
a
9.
1950.
D.
A.,
Grant,
10.
together
association
by
limited
theory
E.
Guthrie,
1946, 43,
Hilgard,
characteristics
the
of
R.
E.
C.
York:
14.
here
R.,
"
other
quantitative
learning.
Brogden,
In
J.
Miller,
G. A.,
of
16.
Miller,
S. S. Stevens
of
human
York
R.
R.,
"
model
Skinner,
Crofts,
18.
Bush,
Hosteller,
for
F.
W.
verbal
J.
313-323.
/.
tistical
sta-
learning.
in press.
rate
subjects
to
stimuli.
of conditioning of
single and
multiple
/.
gen.
Psychol.,
399-408.
B.
F.
The
York:
behavior
of
isms.
organ-
Appleton-Century-
1938.
Thurstone,
/. gen.
matical
mathe-
L.
L.
The
learning function.
Psychol., 1930, 3,
469-493.
simple learning. Psychol.
Rev., 1951, 58,
tinction
ex-
in
Wiley, 1951.
2.
and
ing.
learn-
(Ed.), Handbook
New
New
1943.
conditioning.
to
McGill,
"
The
J.
New
of experimental psychology.
Ap-
expectations
description of
1939, 20,
studies
tioning
Condi-
York:
Acquisition
verbal
analogous
conditioned
Animal
G.
exp. Psychol, 1939, 25, 294-301.

15.
17.
1.
Bull.,
1940.
G.
L.
of
REFERENCES
W.
New
Appleton-Century,
with
formulations
D.
Marquis,
learning.
Humphreys,
situation
facts and
Psychol.
Principles of behavior.
L.
D.
model
compared
are
127-149.
theory.
Psychometrika,
elaborated
Psychol., 1951,
exp.
Psychological
pleton-Century,
Hull,
forcement
rein-
eyelid
1-20.
and
13.
tion
adap-
H.,
of
psychological
12.
Dark
random
in human
chometrika, 1938, 3,
11.
quisition,
ac-
extinction,and relearning,are
compared with experimental findings.
Salient
/.
York:
A
D.
L.
"
Wolfle,
transfer.
learning and
Psy-
Gulliksen,
taken
theory concerning
W.
Humphreys
conditioning.
42, 417-423.
periment
ex-
conditioning phenomena.
theory it has been possible
of the learning
to distinguish aspects
that
depend
properties of
upon
process
situation
from
those
the
stimulating
that
do
Certain
not.
general predictions
the
H.
Hake,
"
the
probability
to
applications. New
phenomenon
tions
opera-
1950, 57,
Rev.,
introduction
its
Wiley,
this
from
statistical theory
Psychol.
An
and
certain
Within
Toward
theory and
theory
an
of
K.
learning.
Feller, W.
functions.
statistical
The
8.
stimulus
represented by
are
and
in
actions
re-
for
curve
94-107.
events
the
of
behavior
competing
pressing. J. exp. Psychol., 1950, 40,
EsTES, W.
of
by a mathematical
frequencies with
aspects
affect
of
conditioning
200-205.
perimental
independent exis represented in
an
variable
the
stimulus
of
population
Effects
the
on
have
ent
differstimulating situation may
probabilitiesof affecting behavior.
The
K.
than
of the
account
W.
EsTES,
[MS. received November 12, 1952]
344
READINGS
trial. The
dependence of
the
upon
S's responses
stimulating situation
is
pressed
ex-
theory by defining a
such that each
relationship
conditional
in Sc is conditioned
that
evokes
when
from
which
it evokes
Then
on
expect
(1) that
Sc on the
which
of
If
Ei
relation
of Ss begins
group
with
the value
of Trial
-\- I will be
$$p(n+1) = \pi[(1-\theta)\,p(n) + \theta] + (1-\pi)(1-\theta)\,p(n) = (1-\theta)\,p(n) + \theta\pi.$$
experiment
an
p(0), then
the end
at
have
would
we
(3)
E2
p(i)
of class Aj.
response
occurs
we
the end
at
will
trial will become
be
that
so
on
ii-e)viO)+6T,
of Trial 2
trials
-e)viO)
A2.
to
Sc
from
are
+ dir
e-ir']
ditioned
con=
and
so
that
[t
successive
be shown
can
the end
at
p(0)](i
for
on
sufficientlygeneral it
are
samples
(i-0)[(i
E2 trial the
an
conditioned
if successive
discrete
ciples
prinp(2)
sampled
all elements
Ai while
to
sample
p(n+l)
bility
proba-
average
the basis of association
on
from
the response
that when
an
trial on
given by the
to
is
compatible
predictingEi but
of
and
predictingE2,
Now
belonging
response
the
of Ai after Trial
it
occurs
interferes with
occurs
(1"jt);then
class Ai, i.e.,

one
with the response
which
Ei
an
situation.Equation 1 will
applicableon the proportion tt of
trials and Equation 2 on
the proportion
be
(tends
to
to evoke) either Ai or
A2. In order to
of
interpretthe formal model in terms
verbal conditioning experiment, we
a
assume
PSYCHOLOGY
forcement
the
in
element
MATHEMATICAL
IN
e)\
trials;in
by
induction
of the nth. trial
statistically
independent, the probability
of
Ai after Trial
an
p(n), is
defined
ated
n, abbrevi-
in the
the proportion of elements
model
as
Since
(1
6)
"
in Sc that
Ai,
to
and
for the
similarly that
Equation
of an A2, [.l"p{n)^.
probability
accelerated
initial value
value
With these definitions the rule for calculating the change in response probability on an E1 trial may be stated formally as
$$p(n+1) = (1-\theta)\,p(n) + \theta \tag{1}$$
and on an E2 trial as
$$p(n+1) = (1-\theta)\,p(n). \tag{2}$$
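The two rules for E1 and E2 trials can be applied trial by trial to a single sequence of events. The following is a minimal sketch, assuming that E1 occurs independently on each trial with probability π. The function names, the seed, and the parameter values are hypothetical and chosen only for illustration.

```python
import random


def apply_trial(p, theta, e1_occurred):
    """Apply Equation 1 after an E1 trial, Equation 2 after an E2 trial."""
    if e1_occurred:
        return (1.0 - theta) * p + theta
    return (1.0 - theta) * p


def simulate(p0, theta, pi, n_trials, seed=0):
    """Return p(0), p(1), ..., p(n_trials) for one random series of events."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p = p0
    history = [p]
    for _ in range(n_trials):
        # E1 is assumed to occur with probability pi on every trial.
        p = apply_trial(p, theta, rng.random() < pi)
        history.append(p)
    return history


history = simulate(p0=0.5, theta=0.1, pi=0.75, n_trials=120)
print(len(history), min(history) >= 0.0, max(history) <= 1.0)
```

Because each rule is a weighted average of p(n) with either 1 or 0, the computed probability remains between zero and one on every trial.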
This
genesis of these equationswill be

The proportion (1-^)
fairlyobvious.
of stimulus
the
and
sampled
elements
on
trial does
the proportion 6 is
elements
are
is not
of elements
status
not
all conditioned
not
are
change;
and
these
either
Ai or to A2 accordingly as an Ei
Now
in a random
E2 occurs.^
Consequently
should be expectedto apply only to
paper
the
be
be
seen
negatively
running from
curve
p(0)
to
the
the
asymptotic
of the statisticallearning
surprisingat first
since it makes
asymptotic response
the
probabilitydepend solely upon
reinforcement.
It
o
f
probability
seems,
however,
is rather
be in excellent
agreement
results
of Grant,
experimental
to
(3) and Jarvik

Hake, and Hornseth
(5). The question that interests us
or
to
an
rein-
ing situations which
functions derived in this

leam-
are
lowing
symmetrical in the fol-
each response class there must

condition which, ifprescorresponda reinforcing
ent
To
sense.
on
any
to
the
that
ensures
trial,
class will terminate
response
ing
belong-
the trial. These
functions should,for example, be applicableto

discrimination with
learningof a simpleleft-right
discrimination
left-right
in the
free
responding
correction,
Skinner box, or to Pavlovian conditioning.
correction ; but
*
fraction between
it will
sampled,
that
sampled
4 must
outcome
with the
The
one,
(4)
T.
model
p(n-\-l)
be
must
and
zero
conditioned
are
$$p(n) = \pi - [\pi - p(0)](1-\theta)^n.$$
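A short numerical check of the relation between the trial-by-trial rule and this closed form may be helpful. The sketch below assumes that the expected change per trial is p(n+1) = (1 − θ)p(n) + θπ, obtained by weighting the E1 and E2 rules by π and 1 − π. The function names and parameter values are illustrative only and are not taken from the experiment.

```python
def mean_step(p, theta, pi):
    """One trial of the averaged rule: E1 with probability pi, E2 otherwise."""
    after_e1 = (1.0 - theta) * p + theta  # sampled elements conditioned to A1
    after_e2 = (1.0 - theta) * p          # sampled elements conditioned to A2
    return pi * after_e1 + (1.0 - pi) * after_e2


def closed_form(p0, theta, pi, n):
    """Closed-form solution of the averaged recursion after n trials."""
    return pi - (pi - p0) * (1.0 - theta) ** n


p0, theta, pi = 0.5, 0.1, 0.85  # illustrative values only
p = p0
for n in range(1, 121):
    p = mean_step(p, theta, pi)
    # The iterated rule and the closed form should agree on every trial.
    assert abs(p - closed_form(p0, theta, pi, n)) < 1e-9

print(round(p, 4))  # close to pi after many trials
```

The check illustrates that, under the stated assumption, the asymptote depends only on π, while θ governs the rate of approach.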
without
not
to
to
regarded
as
or
We
as
estimate
cannot
conducted
not
were
the theory, and

that we would be
to
the
since
for the latter conclusion
test
TABLE 1
Experimental Design in Terms of Probability of Reinforcement (π Value) During Each Series
level
confidence
345
STRAUGHAN
H.
J.
theory.
the
of
confirmation
AND
is to be
this agreement
coincidence
remarkable
is whether
now
ESTES
K.
W.
periments
ex-
cally
specificannot
we
alert to
as
guarantee
the theory
to
notice results contrary
in the literature
which might appear
in the
been
have
we
a"
of these
case
It has
decidedly positiveinstances.
seemed
to
out
of
of
predictionfor
situation,it
can
testable
one
given experimental
to
generally be made
In the
more.
yieldmany
to be reportedwe
experiment
tried to set
have
situation similar in essentials
separate
from that of
to
value
event
groups
to
1 has
in Table
been
into four subgroups of four
; within
treatment
group,
say
subgroups have the same

Group
but each receives a separate
value
"K
of Ei's and
randomly drawn
sequence
I, all
Ea's.
Method
that
Humphreys, Grant, and others

experimental design which
permit testing of a variety of
Each
of the theory.
Ss each
orders
particular
each of the three
occurrences,
indicated
up
used by
with
an
would
the effect of over-all
subdivided
theory,namely,
if it will generate
that
IT
of
tures
fea-
convenient
of the
one
mathematical
is to
ing
experiments,mak-
new
some
use
able
objection-
of this impasse
out
way
carry
that the least
us
In
differs.
reinforcement
preceding
order
Apparatus. The experiment was run in a room containing a 2-ft. square signal board and four booths. Upon the signal board were mounted 12 12-v., .25-amp. light bulbs spaced evenly in a circle 18 in. in diameter. The bulbs occupied the half-hour positions of a clock face. Only the top two lights on the board were used as signals in this experiment. The signal board was mounted vertically on a table 40 in. high and about 5 ft. in front of Ss' booths.

of 120 trials in an individualized modification of the Humphreys situation with the schedule of π values shown in
ft.
about
and
consequences
through
run
was
successive series
two
was
Table
will be able
and
from
to
learning rates
compare
different
be able
different
same
of
probabilities
the second
ment
reinforce-
series
startingat
to
groups
compare
initial values but
will
we
exposed to
of reinforcement.
probabilities
Comparison of Group I with
the other groups
over
permit
learning rate
series when
{B value)
the
tt
series will
both
of
stability
of the
evaluation
value
from
change. Series Ia and

will provide a comparison
not
initial response
values are the same
series
does
or
which
probabilitiesand
but the
to
does
series IIb
in
amount
booths
The
we
of groups
starting
asymptotes
similar initial values but exposed
; within
the
to
first series
the
Within
1,
ir
of
from
The booths were made from two 30 × 60 in. tables, 30 in. high, placed end to end but meeting at an angle so that Ss sitting behind them would be facing almost directly toward the signal board, about 7 ft. in front of Ss' eyes. Two Ss sat at each table. The four Ss were separated from one another by panels 2 ft. high and 32 in. wide. These panels were mounted vertically on the table tops so as to extend 14½ in. beyond the edge of the table between the seated Ss. In each booth, 18 in. back from S's edge of the table, was a wooden panel 12 in. high mounted vertically on the table top and extending across the width of the booth. On the side of this panel facing S were two reinforcing lights of the same size as those on the signal board but covered by white, translucent lenses. These lights were directly in front of S, 4 in. apart and 8 in. above the table top. On the table below each reinforcing light was a telegraph key.
The orders of presentation and the durations of the signal lights and reinforcing lights were controlled by a modified Esterline-Angus recorder using a punched tape and a system of electrical pick-up brushes. The recorder was placed on a table behind the signal board. Recorder pens which were activated by depression of the telegraph keys in Ss' booths were mounted between the brushes. Thus, the presentations of the lights and Ss' responses were recorded on the same tape. A panel light was mounted above the Esterline-Angus recorder so that E, seated behind the signal board, could watch the operation of brushes and pens during the experiment.

Windows in the experimental room were covered with opaque material and the experiment was run in darkness except for light that came from the apparatus.

Subjects. The Ss were 48 students obtained from beginning lecture courses in psychology during the fall semester of 1952 and assigned at random to experimental groups.

Procedure. At the beginning of a session, Ss were brought into the room, asked to be seated, and read the following instructions:

"Be sure you are seated comfortably; it will be necessary to keep one hand resting lightly beside each of the telegraph keys throughout the experiment and to watch both the large board in the front of the room and the two small lights in your own compartment. Your task in this experiment will be to outguess the experimenter on each trial, or at least as often as you can. The ready signal on each trial will be a flash from the two top lights on the big board. About a second later either the left or the right lamp in your compartment will light for a moment. As soon as the ready signal flashes you are to guess whether the left or the right lamp will light on that trial and indicate your choice by pressing the proper key. If you expect the left lamp to light, press the left key; if you expect the right lamp to light, press the right key; if you are not sure, guess. Be sure to make your choice as soon as the ready signal appears, press the proper key down firmly, then release the key before the ready signal goes off. It is important that you press either the left or the right key, never both, on each trial, and that you make your decision and indicate your choice while the signal light is on.

"Now we will give you four practice trials."

At this point the overhead lights were extinguished and the recorder started. If any obvious mistakes were made during the four practice trials, they were pointed out by E. During the four practice trials the reinforcing lights were always given in the order: E1, E1, E2, E2. After the practice trials the following instructions were read:

"Are you sure you understand all of the instructions so far? The rest of the trials will have to be run off without any conversation or other interruptions. Please make a choice on every trial even if it seems difficult. Make a guess on the first trial, then try to improve your guesses as you go along and make as many correct choices as possible."

Questions were answered by rereading or paraphrasing the appropriate part of the instructions. If there were any questions about tricks the following additional paragraph was read.

"We have told you everything that will happen. There are no tricks or catches in this experiment. We simply want to see how well you can profit from experience in a rather difficult problem-solving situation while working under time pressure."

The recorder was now started again and the 240 experimental trials were run off in a continuous sequence with no break or other indication to S at the transition from Series A to Series B. On each trial, the signal lamps were lighted for approximately 2 sec.; 1 sec. later the appropriate reinforcing light in each S's booth lighted for .8 sec.; then after an interval of .4 sec. the next ready signal appeared; and so on. The high rate of stimulus presentation was used in order to minimize verbalization on the part of the Ss.

Results and Discussion

Terminal response probabilities. It will be clear from Equation 4 that the predicted asymptote of response probability for each series will be the value of π obtaining during the series. For our discussion we have taken the mean proportion of A1 responses during the last 40 trials of each series as an estimate of terminal response probability, and these values are summarized for all groups and both series in Table 2.

TABLE 2
Terminal Mean Response Probabilities for Each Series
W.
ESTES
K.
For the firstseries a simpleanalysis

of variance
yields an F significant
.001
level for diff'erences
the
beyond
estimate
variance
groups
value for the standard
each
between
series the
.6"
"*
falls
Group
series.
/^'
means
"
Group I in the
tests
computed
the last
significantly
[Fig. 1, horizontal axis: BLOCKS OF 20 TRIALS (m), numbered 1 to 12]
in
the
theoretical
series,but
level
probability
same
P(m)" 300
"
yt
^
Pimh .850-260(.92)20"'n-'"
both
in
second
of
in the
had
t
Fig. 1.
series.
asymptote
reaches the
"p(m)-300*J73O82)
and theoretical
curves
Empirical
resenting
repapproximates it in
of Ei predictions
mean
(Ai
proportion
falls
II
Group
nificantly
sigresponses)
per 20-trialblock for each series
the first series but

short
iiT
"
short of the theoretical asymptote

the second
For
mean.
between-groups
asymptote
"
the
seems
forward.
straightinterpretation
III
approximates
Group
theoretical
"
20(m-l)
t test
and
differences among
subgroup
at the .05 level.
significant
the
"
the .05
between
F has a probability
and .10 levels. In neither series were
The
group
in the
appropriatetheoretical
the second
of
mean
group
P(mJ-.300*280(.982)^"
"^T
withinobtain
we
error
this is used
and
mean,
the
From
means.
among
347
STRAUGHAN
J. H.
AND
firstseries.
Of the
for differences
blocks
two
as
tween
be-
Taking the theoretical "r

ally
equal to "V407r(l tt),which is actuof the true
underestimate
a slight
value,we find that approximatelyhalf
distribution.
"
of 20 trials
of the scores
in each series fall within
all
yielded
probabilities
series,
and
one
a of the theoretical asymptote
greater than .10 except thef for Series
only one score in each series deviates
the .02
at
IIb which was
significant
It
than three a.
more
in each
level.
concerning
Evidently the predictions
mean
asymptotic values are
correct, but
the
rate
of
approach to
asymptote
is faster with
than
the other conditions.
under
Group III
According to theory,not only group

means,
but
also
should approach
To obtain evidence
individual
tt
curves
by
then,that except
appears,
for
ant
few widelydevi-
the p values of individual Ss

theoretical asymptote.
cases
approach the
One might raise

what
is meant
curve
empirical
kind.
asymptotically.the
Ss
Naturallyone
to
questionas
to
just
by the asymptote of an
in a situation of this
perform
would
at
not
constant
expect
rates
the
tenability indefinitely.It does not seem

that
have
of this aspect of the theory we
of
sort
breaking point was
proached
apany
examined
vidual
the distributions of indiever;
in the present study, howAi response proportionsfor the
one
subgroup of Group I was
last 40 trials of Series IIIa,IIIb,and
for an additional 60 trials beyond
run
^B. If allindividual p values approxiTrial 240 and maintained
mate
an
average
as
to
the theoretical asymptotes over

then for each of the series
these trials,
the
individual
an
.304 Ai responses
the
mean
over
these trials.
Mean
proportions
learning curves.
data
mean
are
value,
plottedin
of Ai responses
approximately binomial
proportion
response
should cluster around

TT, with
proportionof
"
In
terms
Fig.
of the
per block of
348
READINGS
trials. The
20
which
readilyobtained
is
curves
empirical
obtained
tion
Equa-
ordinal
Trial
by
20
Ai responses
=
p(0)](l
[7r
Equation
m,
obtaining for K
[1
(1
all values
over
of
blocks of trials
should
(5)
each
describe
of
Fig. 1
mean
of the
numerical
mean
curves
values
once
for the eters

param6; furthermore,the
differ
of 6 required should not
substituted
are
and
X, p{0),
groups
should be
from
constant
The
group.
series to series
values
of
are
of
Equation 7 to the sum
proportionsfor a given
solve for 6.
For Group I
then equate
the observed
series and
the estimate
we
obtain
for
IIIa,6
.08.
values
.018 and
Using these
eter
param-
have
computed the
theoretical curves
for Group I and for
the first series of Group III, which
be
may
we
in Fig. 1.
seen
In this analysis
find agreement
between
we
theory in one respect but
either series and
within
er-]
expressionbeing simply the
for each
be
procedure.
used is
P(i)](i
^)20('"-i)
Kir-lirP(l)]
(1
[1
fl)2o^]
(7)
1
(1
6y
g)a''("'-i)
the wth block of 20

value of p (n) over
trials. According to theory, Equation
among
pirical
em-
can
have
we
T. -^-b^-
20 6
value
sum
method
to
""=1
this
only
these
the
P(m)
P(m)
lack
we
of 6 and
simple statistical
The
ber
num-
n
+
inclusive,
expected proportionof
in the block,we can write
to
respectively.Now
estimates
block of 20 trials running from
Trial_"+
and
function
PSYCHOLOGY
from
Letting m be the
4.
of
theoretical
describe these
should
MATHEMATICAL
IN
and
another.
The
theoretical
data
in
not
curves
vide
pro-
reasonably good descriptionsof

by the experimental
in the
the observed
points,especially
procedure. The values of p (0) in the
the
of
but
values
for
6
case
Group
I,
in
the neighborfirst series should be
hood
the
two
are
no
means
by
equal.
of size 16
groups
of .50,but for groups
of
fixed
course
sampling deviations
large
so
in favor
p{0)
measured
this
could
be
quite
best to get rid of

of P(l) which
be
can
it will be
accurately. To do
1
write Equation 5 for m
more
we
The
latter finding does
in
asymptote
Group III
P(l)
[1
20 6
solve for
then
did
We
Cx-p(O)]
[x
(1
p(0)]
20e[x-P(l)]
(1
the
not
try
substitute
this result
to
and
6y
into
estimate
values
for
value
without
using
Series II b
used
above,
interest
more
predictedcurves
from
estimate
to
IIIb by the method
tion
Equa-
5 giving
P{m)
series,while
but it will be of
and
first
Group II since the

is virtually
horizontal
empiricalcurve
and
closely approximates the line
.50. We
could proceed
r
P(w)
as
not.
was
Lr
come
for the first series of
en
p(0)]]
"
not
had found in
as
we
surpriseinasmuch
the previoussection that Group I was
short of its theoretical
significantly
to
struct
con-
for these series
tion
additional informa-
any
the data.
According
the
to
[tt
Observed
be .58 and
P(l)](l
values
(6)
")2''('"-i).
of P(l)
theory,it should
those
turn
.59 for Series Ia and
out
to
IIIa,
already
values
at
be
from
curves
our
possible
to
disposal. The
in the second
pute
com-
information
p(0)
series should
be
W.
TABLE
Predicted
THE
Ai
ESTES
AND
J.
Observed
and
OF
K.
Mean
Second
be
will
Response
expected
except insofar
Frequencies
p(0), so
the
in
349
STRAUGHAN
H.
Series
as
except
.018
we
d, respectively,
theoretical
III, respectively.The
difference between
lies in the
number
"
80
160
cedural
pro-
lis
to
the
this
1"''
2400
tical
statis-
utilized
in
variable
80
160
have
p{0),it, and
computed
Similarly,the
for Series III
should
Series B
"
80
curve
160
have
fitting,the
been
spondence
corre-
the theoretical and
between
240
data
240
"III
80
160
240
TRIALS
Fig. 2.
Empiricaland theoretical cumulative
and
also to
in the
forcements;
preceding rein-
of
according
model, however,
only
Ia and
I should
IIIb, and we have used this value, .08, together with .30 for π and .85 for p(0) to compute the predicted curve for IIIb shown in Fig. 1.
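As a rough illustration of how a predicted curve of this kind can be computed, the Python sketch below averages an assumed trial-level curve, p(n) = π − [π − p(0)](1 − θ)^n, over successive blocks of 20 trials, using the values quoted above (π = .30, p(0) = .85, θ = .08). The trial indexing within a block, the function names, and the number of blocks shown are assumptions made for the example, not details taken from the text. The closed form in block_mean_closed follows the pattern legible in the Fig. 1 legends, e.g. P(m) = .300 + .280(.982)^{20(m−1)}.

```python
# Sketch only. Assumes the trial-level curve
#     p(n) = pi - (pi - p0) * (1 - theta) ** n
# and that block m covers trials n = 20*(m-1), ..., 20*m - 1 (an indexing assumption).


def trial_prob(n, pi, p0, theta):
    """Assumed response probability on trial n."""
    return pi - (pi - p0) * (1.0 - theta) ** n


def block_mean(m, pi, p0, theta, block=20):
    """Mean of trial_prob over the m-th block of `block` trials (m starts at 1)."""
    start = block * (m - 1)
    total = sum(trial_prob(n, pi, p0, theta) for n in range(start, start + block))
    return total / block


def block_mean_closed(m, pi, first_block_mean, theta, block=20):
    """Block mean written in terms of the first block mean P(1):
    P(m) = pi - (pi - P(1)) * (1 - theta) ** (block * (m - 1))."""
    return pi - (pi - first_block_mean) * (1.0 - theta) ** (block * (m - 1))


def predicted_curve(pi, p0, theta, n_blocks=6, block=20):
    """Predicted block means for blocks 1..n_blocks."""
    return [block_mean(m, pi, p0, theta, block) for m in range(1, n_blocks + 1)]


if __name__ == "__main__":
    # Values quoted in the text for the second series of Group III.
    for m, value in enumerate(predicted_curve(pi=0.30, p0=0.85, theta=0.08), start=1):
        print(m, round(value, 3))
```

Writing the curve in terms of the first block mean P(1) rather than p(0) is convenient because P(1) is observed directly, whereas p(0) has to be inferred.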
dom
Considering that no degrees of free-
apply
and
of
Fig. 1.
estimated
d value
in
the
error
for Series II b,
curve
this is plottedin
change
Using .50,.30,
the values
as
to
effect
no
for Group
applicableto lis-
be
and
of the first
the theoretical asymptotes
and
for
.85
.50
series,or
Groups II
have
for sampling
estimated
6 value
to
it leads
response
curves
for individual Ss of Group I
350
empiricalcurves
The
for
reason
does
seem
bad.
of the irregularities
some
brought
will be
section.
IN
not
in the
out
statistical
increase
no
Slopes
of
similar
and
in resistance
the
two
change.
to
curves
are
very
the
totals do not
response
differ significantly.This
result is in
next
of
test
to
one
line with predictions from

be
the statistical
correspondence can
retical
model, but a little surprising,
by calculatingfor each theototal
curve
a
predicted mean
perhaps,from the viewpoint of Thornin the second
ory
thereinforcement
of Ai responses
series, dikian or Hullian
since partial reinforcement
has
of Equation 5, and
paring
comby means
held
increase
these values with the observed
to
generally (6) been
aspect
obtained
of the
This has been
totals.
mean
the
is
comparison
The
low.
satisfactorily
In order to give an
which
to
the
to
form
Fig.
for the
the
of
data
of
the
then
to
effect, some
curves being too irregular for curve-fitting purposes. The theoretical curves in Fig. 2 represent Equation 7 with θ values obtained by a method of approximation.
of the
are
curves
require other values for this parameter,

viz.,.075,.45,.24,and .18,respectively.
Curves 3 and 4 deviate considerably from the theoretical form.

In general, it appears that the empirical curves for most individual Ss can be described quite satisfactorily by the theoretical function, and this fact gives us some basis for inferring that in this situation mean learning curves for groups of Ss reflect the trend of individual learning uncomplicated by any gross artifacts of averaging.
IT
value
effect of 120 reinforcements

of
comparing
response
We
.50 may
curve
be
forms
find that
predict the
(b) that
learning
the
rate
response
of reinforcement
the
in
between
initial
the
bility
proba-
probabilityand
obtaining
ing
dur-
series.
Sequence effects. The mean curves studied in the preceding section may not reflect adequately all of the learning that went on during the experiment. The irregularities in some of the mean curves of Fig. 1 might be accounted for if there is a significant tendency for Ss' response sequences to follow the vagaries of the sequences of E1's and E2's. To
we
have
check
proportionsof Ai
mean
trial block
In
Ss,
there
blocks and
Ai
to
per
block.
IIb-
in
lead
blocks
they werd
in which
into
no
12
cessive
suc-
were
these
of
trial
classified according
of Ei
Then
120 trials
Since there
576
the number
vs.
10-
in Series B.
divided
were
3 the
responses
preparingthis graph, the

were
bility
possi-
Fig.
occurrences
for all groups
of Series B
48
this
on
plotted in
frequencies of Ei
by
the reinforcements
of
course
the difference
upon
evaluated
totals for Series Ia and
the
parameters
series of learning trials and
one
and
mean
ate
evalu-
from
blocks of 10.
at
seem
stances
circum-
some
learning curve
depends,
approaches its asymptote
an
as
ner,
yet incompletely specifiedman-
artifacts of averaging.
gross
The
of
trend
would
possibleto
series ; and
a
new
the
which
mean
described
for groups
of
study
our
"
fitted quite well by this function with

.30 as the asymptote
T
parameter.
Four
Numbers
2,
11, IS, 16,
curves,
Curves
in
at
smoothing
this
in
theoretical
function,
noncumulative
Ten
least it is
at
vidual
indi-
from
learning curves
{a) that under
mean
be
Ss
curves
response
I.
The
cumulative
chosen
conclusions
extent
for
Group
was
The
seem
extinction
to
situation.
the
of individual
theoretical
cumulative
all Ss of
3.
between
idea of the
plotted in
have
we
in Table
resistance
to
the behavior
conforms
and
theoretical values
and
observed
given
differences
for
values
done
for
occurrences
the
set
of all
Ei's occurred, the
of reinforcement; and that response probabilities should change in accordance with exponential functions, learning rates (as measured by slope parameters) being independent of both initial condition and probability of reinforcement.

independent of π, as required by the theory, the numerical values are all larger than the θ estimates obtained from the mean response curves. The
most
of this
straightforward interpretation
disparitywould
short
that,owing to the
interval, successive
intertrial
trials
are
independent in the
not
the
required by
Nonindependence
as
theoretical
would
the present
First,stimulus
at
least
learning that
occurred
affect behavior
than
on
cessive
suc-
the
overlap,and
trial
on
one
the
next
on
to
end
statisticalcriterionfor
asymptote
of the second
both
model.
drawn
samples
trials would
would
have
sense
in so far
consequences
isconcerned.
experiment
immediate
two
The
be
first and
series,
Group
II
met
was
series
second
was
retical
approach to theoby Group I by the
and by Group III in
series. In
the second
short of theoretical asymptote
but
reached the same

response probability
Group I duringthe firstseries.
Learning rates were
virtuallyidentical for
and Group II,second series,
Group I,firstseries,
that resistance of response
indicating
probability
forcement
reinaltered by 50% random
to change is not
in this situation. Learning rates differed
within both
significantly
among
groups
series. In general,learningrate
was
directly
as
had
random
related to difference between

initial response
sampling
greater
and
of
reinforcement
probability
probability
would allow for,thus increasingPAiEj,
It was
that this relationship
a series.
during
suggested
and decreasingPaê^- Second, the reinforcing
depend upon temporal massing of
may
stimulus of one
trial,Ei or
but individual
trials. Not only group
means,
plex
E2, would be part of the stimulus comcould be described satisfactorily
learningcurves
effective at the beginning of the
by theoretical functions.
extent
is
interpretation
widely spaced trials
correct, then more

should result in better
the alternative
of
tendency was
as
whole.
for Ss
to
respond
On
creased
(Ei and E2 occurrences)inas
a function of trials.
significantly
nonreinforcements
of d
estimates
observed
tivity
the contrary, sensieffectsof individual reinforcements and
series
to
tween
be-
agreement
learning rate
mean
to
of the
also in reduction
and
No
If this
trial.
next
dependence
ability
probupon
of reinforcement.

Summary

Learning rates, asymptotic behavior, and sequential properties of response in a verbal conditioning situation were studied in relation to predictions from statistical learning theory.

Forty-eight college students were run in an individualized modification of the "verbal conditioning" experiment originated by Humphreys (4). Each trial consisted in presentation of a signal followed by a left-hand or right-hand "reinforcing" light; S operated an appropriate key to indicate his prediction as to which light would appear on each trial. For each S, one of the lights, selected randomly, was designated as E1, the other as E2. On the first series of 120 trials, E1 occurred with probability .30, .50, and .85 for Groups I, II, and III, respectively. On the second 120 trials, E1 occurred with probability .30 for all groups.

Theoretical predictions were that mean probability of predicting E1 should tend asymptotically to the actual probability of E1, both during original learning and following a shift in probability

References

1. Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57, 94-107.
2. Estes, W. K., & Burke, C. J. A theory of stimulus variability in learning. Psychol. Rev., 1953, 60, 276-286.
3. Grant, D. A., Hake, H. W., & Hornseth, J. P. Acquisition and extinction of a verbal conditioned response with differing percentages of reinforcement. J. exp. Psychol., 1951, 42, 1-5.
4. Humphreys, L. G. Acquisition and extinction of verbal expectations in a situation analogous to conditioning. J. exp. Psychol., 1939, 25, 294-301.
5. Jarvik, M. E. Probability learning and a negative recency effect in the serial anticipation of alternative symbols. J. exp. Psychol., 1951, 41, 291-297.
6. Jenkins, W. O., & Stanley, J. C. Partial reinforcement: a review and critique. Psychol. Bull., 1950, 47, 193-234.
7. McNemar, Q. Psychological statistics. New York: Wiley, 1949.

(Received July 10, 1953)
AN INVESTIGATION OF SOME MATHEMATICAL MODELS FOR LEARNING

CURT F. FEY
University of Pennsylvania
this
In
determine
to
is
is made
attempt
the results of
whether
study
an
of
merit
The
lies in
model
invariance
parameter
The
learning experiments
be described by stochastic models
can
Mosteller
and
proposed by Bush
Luce
(1959) without
(1955) and
changing the model parameters.
different
two
this
investigate
to
of
question
in
tail.
de-
greater
experimental design
was
that of Galanter
and
improved over
Bush
(1959) by running only one
rather than three trials per day, and
extended
it was
to
parison
provide a combetween
100% reinforcement
reinforcement.
and 75% random
its
ability to describe and predict data

mum
successfullywith the aid of a miniof free parameters.
For any
this can
be done with
Models
The
experiment
one
models.
several
Consequently
stringent test
to predict the
data with
of
such
of the
values
in
can
Galanter
and
studied
Mosteller
showed
in
it is not
clear
the
P/a
and
analysis
invariance
basic
or
was
errors
of
and
attributable
was
mechanism
consequence
in
of
difficulties in
the
model
sampling
estimating
of the
present
gets
rewarded,
Based
on
the author's
PhD
dissertation
This
article
appeared
in /. exp.
the
let
ai,
given
one
on
which
it
probability
the
on
models
of
next
specify the
changes.
probability of going
right-hand side (probabilityof an
the
trial
on
such
with
associated
with
can
n;
;8i and
a2,
parameters
associated
then
be
let
that
reward
and
ai
and
in
pn',
nonnegative
and
02
The
nonreward.
defined
qn
jSa be
the
jSi are
/S2are
models
following
way:
Psychol.,1961, 61,
353
for
response
The
be
pn
"error")
supervised by R. R. Bush, and read by R. D. Luce and J. Beck. The data analysis was performed at the Computer Center of the University of Pennsylvania with the assistance of S. Corn and P. Z. Ingerman.
Now at General Dynamics/Electronics, Rochester, New York.
same
any
if
that
the
then
either
turn
can
the
on
outcome
of these
Let
to
only
the
right on
state
ing
mak-
pathprobability
are
response
that
making
maze
the
trial increases.
study
makes
stochastic,
are
models
models
trial
beta
quantity
probabilitiesof
to
or
The
manner
purpose
in
left
trial.
parameters.
The
with
The
animal
the
to
parameter
to
lack
models
these
linear
the
the
independent: the response

on
a
given trial depends
probability and
response
the previous trial.
on
An
the
of
i.e.,they deal
responses.
T-maze
whether
in
probability p;
is applied to
P).
Both
model
uses
alpha model
is applied to the
response
it
model
invariance
Bush
beta
the
the linear transformation

same
may
(Bush
alpha
In
transformations.
lack of parameter
situation, but
apparent
an
invariance
in this paper
model
used
the
as
Mosteller, 1955) and the

of them
(Luce, 1959). Each
predict
experiment.
(1959) previously
Their
(1955).
"
to
parameter
of
model
linear
the
Bush
of the
models
two
designated
mined
deter-
these
used
be
of another
ability
be
once
are
experiment
the outcome
in
that
way
The
more
of parameters
set
parameters
one
parameters
structure
invariant
one
in
fine
is its
model
455-461.
Reprinted with
permission.
Response        Alpha Model
left turn       $p_{n+1} = \alpha_1 p_n$
right turn      $p_{n+1} = \alpha_2 p_n$
right turn      $q_{n+1} = \alpha_1 q_n$
left turn       $q_{n+1} = \alpha_2 q_n$
Mathematical properties of the alpha model were listed by Galanter and Bush (1959, pp. 272-273) for the special condition that the left response is always rewarded and the right-hand response is never rewarded (100:0). For the beta model the mathematical properties have been determined by Kanal (1960, 1961), Bush, Galanter, and Luce (1959, p. 387), and Bush (1960).

Subjects. The Ss were male hooded rats of the Long-Evans strain, from Rockland Farms, New City, New York. They weighed about 75 gm. on arrival. Eight rats were used for the preliminary experiment. In the main experiment 63 rats were used, but the final N = 50, because 13 died during the experiment.
'
TABLE
Period
Comparison
Statistics
of
Model
Corresponding
aa
.955
FOR
from
the
Values
ALPHA
/32 =
Note.
"
Standard
error
of the mean
was
MODEL
computed
1960.
Fey,
Group
35
Trials
Calculated
Beta
from
of
with
pi
AND
.642, FOR
see
2, 100:0
First
details
For
range
Group
Experimental
pi
.97,î
Model
approximation.
.858, and
1, ai
.952, AND
=
with
CURT
F.
TABLE
Model
of
Statistics
Values
Alpha
75 : 25 Experimental
of
Calculated
Model
pi
and
The
model
approximation.
Note.
"
parameters
were
from
Corresponding
with
Beta
estimated
Group
.858,and
1, ai
.97,j3i .952, and ^2
î
with
FOR
range
2, 75:25 Group
Period
Comparison
355
FEY
a2
=
.955
for
.642
Model
the 100.0
Standard
group,
error
was
computed
from
Apparatus. The T maze was a replica of that used by Galanter and Bush (1959). It consisted of a straight alley runway for pretraining and a T maze for the main experiment. The T maze was built in such a way that the crossbar and the start arm of the T could be separated and a goalbox could be hooked to the stem of the T, thereby changing the maze into a straight runway. The maze was built of plywood with a removable wire mesh top and pressed wood doors. The inside of the stem and the attachable goalbox were painted medium gray, the right arm was painted light gray, and the left, dark gray. The length of the cross arm was 60 in., the length of the stem was 26 in., and the attachable goalbox was 10 in. The alleys were 4 in. wide and the walls were 8 in. high. The starting compartment was 10 in. long with a guillotine door on the maze side and a hinged door on the outside. Another guillotine door was at the choice point. The goal cups were placed at the end of each arm. The metal goal cups had double floors, the bottom part contained inaccessible wet food mash to balance olfactory cues, and the top contained the reward pellet.

Procedure. This experiment consisted of three parts: (a) preliminary handling; (b) straight alley pretraining; and (c) T-maze learning.

The Ss were kept in the laboratory for 23 days at ad lib. food and water and were handled daily. Then Ss were deprived of food for 18, 21, 21i, 21 1, and 22 hr. on Days 24, 25, 26, 27, and 28, respectively. The pretraining started on Day 29. For the remainder of the experiment Ss were under 18 hr. food deprivation at the beginning of each daily run. They were fed 4 hr. later for a 2-hr. period. Water was always available in the cages.

The Ss were given one trial per day of pretraining on the straight alley runway. Pretraining lasted for three days.

During the 30 days of Period 1 of the T-maze learning, the following procedure was adhered to: A .038-gm. pellet was deposited in the right goal cup; nothing was placed in the left goal cup. The S was placed in the startbox and the startbox door was raised. As S passed the choice point, its door was lowered. The S was left in the maze until it ate the pellet, until it investigated the goal cup (on the nonrewarded side), or until 3 min. were up, whichever occurred first.

At the end of Period 1 Ss were divided at random into two groups. One group was always rewarded on the left side during Period 2, and the other was rewarded according to the following schedule obtained from a random number table with P(L) = 0.75: LLRLLRLLLLLRLRLLLLLR LLRRRRLLLLL. Period 2 lasted for 35 days.

Results

Estimation of parameters. The parameters of the alpha model were estimated in the following way: The initial probability p1 was taken to be 1.00. The other two parameters were estimated from the Period 2 data of the 100:0 group by equating the observed mean number of trials before the first success and the observed mean total number of errors to their respective expected values.

Initial estimates of the beta model parameters were determined by methods similar to those used for finding the alpha model parameters. These estimates were modified by exploration of the parameter space until response probabilities (Monte Carlo computations) were similar to the experimental data of the 100:0 group. The following criteria were used: the mean total number of errors generated by the model had to match the data, and a plot of trial-by-trial mean response probabilities produced by the model had to appear similar to the corresponding plot of the data.

The results of the experiment are summarized in Fig. 1 and 2 and Tables 1 and 2. Figure 1 presents the proportions of R response of the 100:0 group during Period 2 and the corresponding curves generated by the models. Figure 2 depicts the same for the 75:25 group data during Period 2. Tables 1 and 2 give
Fig. 1. Period 2, Group 100:0. Trial by trial proportions of L responses made by 25 experimental Ss (filled circles); generated by alpha model (smooth line) computed with p1 = 1.00, α1 = 0.858, and α2 = 0.955; and generated by 500 beta model Monte Carlo analogs (open circles) computed with p1 = 0.97, β1 = 0.952, and β2 = 0.647. (Legend: experiment, α model, β model. Horizontal axis: trials.)
Fig. 2. Period 2, Group 75:25. Trial by trial proportions of R responses made by 25 experimental Ss (filled circles); by 100 alpha model Monte Carlo analogs (open circles) computed with p1 = 1.00, α1 = 0.858, and α2 = 0.955; and by 200 beta model Monte Carlo analogs (triangles) computed with p1 = 0.97, β1 = 0.952, and β2 = 0.647. (R = food reward is in right maze arm, otherwise the left arm is baited. Legend: experiment, α model, β model. Horizontal axis: trials (N).)
comparative results of this experiment and corresponding model values. A more detailed analysis of results is presented by Fey (1960).

Discussion

The merit of a mathematical model of learning lies not so much in describing the data of any one experiment with the aid of parameters estimated from that particular experiment as in its ability to represent accurately the learning process of a variety of different experimental situations using the same set of parameters. In other words, once the parameters are estimated for one experimental situation the model should be able to predict the course of learning in other experiments. Models which will handle a variety of experimental situations with the same set of parameters are called parameter invariant.

This experiment indicates that the models under consideration fit the Period 2, 100:0 group data, from which their parameters were estimated, quite well, but the fit to the Period 2, 75:25 group data (using parameters estimated from Period 2, 100:0 group) is less successful. Both models show an apparent lack of parameter invariance of approximately equal magnitude.

Tables 1 and 2 might give the impression that the alpha model fits the data slightly better than the beta model. This conclusion is hardly warranted if the magnitudes of the differences and the methods of estimating the parameters are considered. The alpha model parameters were determined analytically; those of the beta model were estimated by Monte Carlo procedures. Thus the alpha model parameters were determined more exactly than those of the beta model.
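The kind of Monte Carlo computation referred to above can be sketched in a few lines of Python. The sketch assumes a linear-operator reading of the alpha model for the 100:0 schedule: the probability p of a right turn (an error) is multiplied by a reward parameter after a rewarded left turn and by a nonreward parameter after an unrewarded right turn. The pairing of .858 with reward and .955 with nonreward, the function names, and the choice of 500 simulated animals are assumptions made for the example and are not taken from the text.

```python
import random

# Sketch only. Assumed update rule for the 100:0 schedule (left always rewarded):
#   left turn (rewarded):      p <- alpha_reward * p
#   right turn (not rewarded): p <- alpha_nonreward * p
# where p is the probability of a right turn (an error).


def simulate_stat_rat(p1, alpha_reward, alpha_nonreward, n_trials, rng):
    """Return a list of 0/1 errors (1 = right turn) for one simulated animal."""
    p = p1
    errors = []
    for _ in range(n_trials):
        error = rng.random() < p
        errors.append(1 if error else 0)
        p *= alpha_nonreward if error else alpha_reward
    return errors


def monte_carlo(n_rats, n_trials, p1, alpha_reward, alpha_nonreward, seed=0):
    """Return (trial-by-trial error proportions, mean total errors per animal)."""
    rng = random.Random(seed)
    runs = [
        simulate_stat_rat(p1, alpha_reward, alpha_nonreward, n_trials, rng)
        for _ in range(n_rats)
    ]
    error_curve = [sum(run[t] for run in runs) / n_rats for t in range(n_trials)]
    mean_total_errors = sum(sum(run) for run in runs) / n_rats
    return error_curve, mean_total_errors


if __name__ == "__main__":
    curve, total = monte_carlo(500, 35, p1=1.00, alpha_reward=0.858, alpha_nonreward=0.955)
    print("mean total errors:", round(total, 2))
    print("error proportion, first 5 trials:", [round(v, 2) for v in curve[:5]])
```

A fixed seed keeps the run reproducible. Matching the simulated mean total errors and the trial-by-trial curve against the data corresponds to the two criteria described in the estimation of parameters.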
The
basic
IN
lack of long runs

seems
of the models.
difficulty
little consequence
but it does
in 100:0
well
as
(Derks, 1960). This
generally manifested
but
only
curves,
the
of
data.
analysis
of
The
fact
serious
as
runs.
than
change
will
parameters
"stat
rats"
experimental
lack of long
the
as
in the
ing
learn-
sequential
the
size of the
the
correct
is
runs
mean
75:25
the
slowly
more
is not
^s
that
in
in
mals
ani-
behavior
long
in
not
ing,
learn-
for
choice
lack
PSYCHOLOGY
is of
important
schedules
in human
as
be
to
This
animal
be
to
seem
partial reinforcement
learn
MATHEMATICAL
model
former
ficiency.
deBAITED
"
The
reducing
by
the
about
match
for
those
of
by
ARM
parameters
total
number
of
model
analogs
experimental
the
ARM
UNBAITED
These
will
5s
reduced
the
fit
learning of
the
75:25
the
to
group.
The
slow
could
analogs
100:0
be
handled
in which
manner
modified
lack
the
the
that
group.
decrease
parameters
the
model
75:25
the
100:0
indicate
beta
25%,
by
made
errors
data
when
to
75:25.
the
With
parameter
values
fit of model
to
Galanter
no
would
slightlyduring
phenomenon
the
to
respect
small
increase
it
occurs
5s of Period
first few
began
also
to
noted
the
in
that
the
more
quently
fre-
situations
acquisition
in other
This
experi-
(Gibson "
Walk,
Lach"
Jensen, 1960; Kendler
1958). In the present experiment,
initial dip is hardly noticeable.
1956;
man,
the
A
to decrease
rise.
look
the
baited
the
maze
ARM
(Fig. 3
the
baited
in
3
TRIALS
Fig.
3.
maze
on
the
are
(N)
5s
of the
unbaited
the
up,
that
Galanter
removed
were
than
later
maze;
removal
occurred
the
the
of
arms
unbaited
of the
arm
from
until it
pellet on
min.
The
side
either
for removing
the
those
in
spent
4) indicates
for this is found
reason
in the
cup
and
^s and
rats
unbaited
experiment,
approximately
interval
The
the
Ill*
the
after
ARM
time
in the
(1959) Exp.
quickly from the
from
in
the
and
Bush
more
UNBAITED
at
initially
our
and
BAITEO
1.
in
mental
side tended
the
Trial by trial distribution of time

baited, left (filled
circles)and unbaited, right (open circles)maze
arm
by
4.
in
spent
50
change
Bush
rewarded
before
are
changes from
(1959)
three
of their
experiments
probability of turning to the
trials
Fig.
parameters
data.
and
TRIALS
by specifying
the schedule
of perseverance,
model
time
same
of
the
maze.
in the criteria
5
maze:
is left
the
investigates
food
side, until it eats

baited
side, or until
whichever
investigate the food
first.
occurs
cup
on
the
Trial by trial distribution of time spent in baited, left (filled circles) and unbaited, right (open circles) maze arm by 20 Ss of Galanter and Bush (1959) Exp. III.
The data plotted in Fig. 3 were obtained from the original protocols of the experiment reported by Galanter and Bush (1959).
FUNCTIONAL EQUATION ANALYSIS OF TWO LEARNING MODELS*

LAVEEN KANAL†

GENERAL DYNAMICS/ELECTRONICS
ROCHESTER, NEW YORK

One-absorbing barrier random walks arising from Luce's nonlinear beta model for learning and a linear commuting-operator model (called the alpha model) are considered. Functional equations for various statistics are derived from the branching processes defined by the two models. Solutions to general functional equations, satisfied by statistics of the alpha and beta models, are obtained. The methods presented have application to other learning models.

The two-response, two-event, path-independent, contingent version of a number of stochastic models for learning is given by the equations
$$p_{n+1} = \begin{cases} Q_1 p_n & \text{with probability } p_n, \\ Q_2 p_n & \text{with probability } (1 - p_n), \end{cases} \tag{1}$$
where
Qi and
Q2 represent
respectively, the
model
in
discussed
(1) are
transition
probabilities of
by
defined
Bush
and
this paper,
of the
operators
^^^
"beta"
defined
are
^'^-^
terms
of the
*Abstracted
June
for the
many
valuable
A2
trial
on
[8] is obtained
(1
"
n.
the
when
p")
A
are,
linear
operators
from
îPn
(0
"
QzPn
CX2Pn
(0
"
is called
v"
author
and
helpful discussions
"alpha" model.
the
QJl
Q!2
"
1),
"
1)
when
the
^'^l'^;
'
p"/{l
is indebted
encouragement
and
/3,"0;
Pn) the transition
"
to
doctoral
Robert
received
for partial support

School
of Electrical
R.
from
from
equations for this
sylvania,
dissertation. University of Pennsupervisor
Bush, his dissertation
him
an
9^p,9^1.
NSF
and
to
R.
Duncan
Luce
for
grant.
fFormerly at the Moore

Engineering, University of Pennsylvania,
Philadelphia, Pa. The author is grateful to J. G. Brainerd, S. Gorn, and C. N. Weygandt
of the Moore
School, and N. F. Finkelstein, D. Parkhill and A. A. Wolf of General
Dynamics
for their
This
encouragement.
article
appeared
"
specialization
[13] is obtained
by Luce
proposed
portions of the author's
The
help
QlP"
by the equations:
variable
1960.
model
model
(T-Dp.
1 +
In
and
p"
by the equations
this linear
nonlinear
and
Ai
responses
Mosteller
/2)
In
operators, and
Pn),
"
in Psychometrika, 1962, 27, 89-104. Reprinted with permission.
In the beta model the response probabilities undergo nonlinear rather than linear transformations from trial to trial. Since the transformations of choice probabilities inevitably enter into the derivation of stochastic properties of the model, the methods generally used to derive properties of linear learning models do not apply to the beta model. Analytical methods applicable to both the alpha and beta models are presented in this paper. The approach used is to consider the branching process defined by the decision rules of the two models, and from it to formulate functional equations for various statistics of interest. Tatsuoka and Mosteller [15] used a functional equation approach to obtain some statistics for the alpha model. Their techniques differ somewhat from those presented here; the approach developed here leads to a unified method of attack for the alpha and beta models and can be extended to others.
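The branching process defined by the two decision rules can be made concrete with a small simulation. The sketch below is illustrative only: it assumes the alpha model acts as $Q_i p = \alpha_i p$ and the beta model acts as $Q_i v = \beta_i v$ on the odds $v = p/(1-p)$, with response $A_1$ occurring with probability $p_n$ on trial $n$. The function names, the starting value 0.5, and the parameter values 0.9 and 0.8 are hypothetical choices, not values from the paper.

```python
import random


def alpha_step(p, a1_occurred, alpha1, alpha2):
    """Alpha (linear) model, assumed form: Q_i p = alpha_i * p."""
    return (alpha1 if a1_occurred else alpha2) * p


def beta_step(p, a1_occurred, beta1, beta2):
    """Beta model, assumed form: Q_i v = beta_i * v with v = p / (1 - p).

    Requires p < 1 so that the odds v are finite.
    """
    v = p / (1.0 - p)
    v *= beta1 if a1_occurred else beta2
    return v / (1.0 + v)


def simulate(step, p0, c1, c2, n_trials, rng):
    """Return the indicator sequence x_1..x_n (1 = A1, 0 = A2) and the final p."""
    p, xs = p0, []
    for _ in range(n_trials):
        a1 = rng.random() < p  # response A1 occurs with probability p_n
        xs.append(1 if a1 else 0)
        p = step(p, a1, c1, c2)
    return xs, p


rng = random.Random(0)  # fixed seed so the run is repeatable
xs_alpha, p_alpha = simulate(alpha_step, 0.5, 0.9, 0.8, 200, rng)
xs_beta, p_beta = simulate(beta_step, 0.5, 0.9, 0.8, 200, rng)
```

With both parameters below 1, every trial shrinks $p$ (alpha model) or $v$ (beta model), so each simulated path drifts toward the single absorbing barrier at 0, whatever responses occur.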
Random
Some
In
and
(4),/?, "
nonreward
Aa is
always
1, 182 "
(81 "
shown
1 and
Walks
î
1. If neither
these
If response
nature
of the
in this paper.
model
either response
model
is
are
responses
rewarded
response
always
rewarded
/3i "
1. It is
1, 182 "
to
and
barriers for these
OAB
the
model
beta
case
when
probabilityof
random
other
model
beta
two-outcome
for the
Except
diminishes
reward
and
one-absorbing-barrier(OAB),
(TRB) walks. Rigtwo-refiecting-barrier
orous
resultingfrom the two-alternative,

Lamperti and Suppes [12].Only the
considered
ever
lead
cases
and
two-absorbing-barrier(TAB),
proof of the
is
rewarded
Aj is never
1- If both
response
three
Model
with
be identified,respectively,
1, (82 "
"
the Beta
Arising from
1 may
/3, "
of the response.
rewarded
[11] that
in
to
(j8i"
a,-
walks
is given
by
/Sz "
1) is
1, in the
alpha
alpha
1,
response
Ai
; the
model.
one-absorbing-barrier
Functional Equations for Statistics of the One-Absorbing Barrier Models

The
OAB
p" which
has
on
the
part of organisms
eventually
processes
and
alpha and beta

all its density at
learn
is obtained
Sternberg [9]on
are
0.
an
(Considering response
learning, this
means
that
A. 1
as
an
of
error
all organisms
information
about
the
errors).Additional
various
statistics. Following the work
of Bush
from
the
statistics
considered
m
odel,
simple single-operator
to
not
which
lead to
models
make
those
are
mean,
which
describe
weighted
the first
statistics
and
Ai
are
derived
of
variance
oi
of the
A2
an
to the
approach
of responses;
runs
occurrence
an
the rate
mean,
statistics concerning
of
IN
approach; sequential
such as those describing
statistics,
(success)and the last occurrence
other
response
by consideringthe
the
as
of
rate
Functional
(failure).
response
asymptote, such
equations
branching processes
satisfied
shown
by these
in Fig. 1.
[Figure 1a: The Beta Model Lattice. Figure 1b: The Alpha Model Lattice. Branching lattices of $v$ and $p$ over successive trials.]
For the analysis which follows, a sequence $x_1, x_2, \ldots, x_n$ of random variables is defined such that

$$x_n = \begin{cases} 1 & \text{if response } A_1 \text{ occurs on trial } n, \\ 0 & \text{if response } A_2 \text{ occurs on trial } n. \end{cases}$$
random
The
The
trials is
E(Xn)
the
decrease
variables
In
(|Si ^2) and
obtained.
are
equal
to the
of Ai
Fig. 1
from
N)
"t"^{v,
to
responses
A. 1 response.
an
functional
the
E(X)
in A^"trials starting
A2
an
by /3
in the two
in
of Ai responses
result of trial 1 is
responses,
for
responses
Xat-i
number,
models
of the
bounds
X^ of Ai
Xjv-i if the result of trial 1 is

number
expectation
a:" with
\imE(X^)
02) finite upper
trial 2 if the
trials/starting from
obtained
the number
Now
trial 1 will be
equal to 1 +
the expected
^^=1
Aj, and
response
(ai
max
from
X^v
fact, by replacing the parameters
is of interest. In
models
one-absorbing barrier models, both
the
probabilityof
of Ai responses
the total number
x"
variable
E{X)
max
responses
the random
given by
]^^=ip"
expectationsp"
of the random
In terms
in N
of Ai
number
mean
have
variables
(N
"
and
response
1)
be
Letting 0 denote
equations
for "t"are
be
A^
"t",W,v,
pM
r^v ^^^^^'''
(1
1)] +
p^)M^2V, N
^
Y+~v '^'^^'''
1] +
1) +
1)
1)'
and
N)
"f",{p,
When
-^
uv)
(6)
the
=
above
1] +
(1
N
p)"i"a{cc2P,
-
1).
functions
must,
r+^
(1
of course,
"^
'^^^^^''^
r+^
p)"t"M2P)+
'
p.
satisfythe boundary
condition
0.
second
then,
as
E{X_^) the
denote
A''
"
of Ai
of the number
moment
Letting 6
(7)
Y^y '^^(î^)+
0(0)
are
1) +
"t"a{p) p"t"MiP) +
Both
The
equations become
these
(5)
p["f"M.P,iV
"
responses
functional
equations for
the second
moment
"o
,
9,iv)
=
Y^^ e^(M
]-qr^ UM
y^^ [i +
20^(^1^)]^
eM
(8)
6{0)
0 is
V"MiV)
(1
P)0.{a2p)+ p[l
condition.
boundary
Finite
2"f"MiP)]-
bounds
upper
and
^^3(2;)
exist for
max
max
{^1 jSa)and a
(ai aa)
da{p)',
replacingthe parameters by jS
if
of X^ is y^Li Pn(l
the variance
Pn) which remains finite as iV *
y^T Pn does. Functional
are
equations for higher moments
easily obtained
=
"
in this
manner.
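The functional equation for the expected number of $A_1$ responses in $N$ trials can be checked against direct enumeration. The sketch below assumes the alpha-model recursion obtained by conditioning on the result of trial 1, namely $\phi(p, N) = p[1 + \phi(\alpha_1 p, N-1)] + (1-p)\,\phi(\alpha_2 p, N-1)$ with $\phi(p, 0) = 0$. Both function names are hypothetical, and the brute-force sum is practical only for small $N$.

```python
from functools import lru_cache
from itertools import product


def mean_a1_alpha(p, alpha1, alpha2, n_trials):
    """Expected number of A1 responses in n_trials, from the recursion."""

    @lru_cache(maxsize=None)
    def phi(i, j, n):
        # After i A1 responses and j A2 responses the probability is
        # alpha1**i * alpha2**j * p, because the two operators commute.
        if n == 0:
            return 0.0
        q = (alpha1 ** i) * (alpha2 ** j) * p
        return q * (1.0 + phi(i + 1, j, n - 1)) + (1.0 - q) * phi(i, j + 1, n - 1)

    return phi(0, 0, n_trials)


def mean_a1_bruteforce(p, alpha1, alpha2, n_trials):
    """Same expectation, summed over all 2**n_trials response sequences."""
    total = 0.0
    for seq in product((1, 0), repeat=n_trials):
        q, prob = p, 1.0
        for x in seq:
            prob *= q if x == 1 else (1.0 - q)
            q *= alpha1 if x == 1 else alpha2
        total += prob * sum(seq)
    return total
```

Because the operators commute, the recursion visits only the lattice points $(i, j)$ rather than every path, which is why memoizing on $(i, j, n)$ keeps the work polynomial in $N$.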
The
functional
of Ai
number
and
Tatsuoka
been
X2
of A
the random
Yq.n represents
weighting function
"
"
as
"
aî
the
^2
"
(10)
^"(p)
boundary condition
Number
Fi -\- 1 denote
first time
equal
to
at trial
so
that
if A2
zero
F2 denotes
2, if trial
p,(v) pMM
equal
(1 +
Fi,Ar_i
to
Yi,^m-i))
infinite number
P)^("2P) +
to
of trials,
Mv),
-r
(t"a(p).
(success)occurs
response
trial number
the
an
an
responses,
Fq.atis equal
is
7-4-MM
(1
the
is equal to
the first A2 response

denote
Letting v
equations for
response.
[(1
response
of trials before
F, the functional
1] +
which
on
first trial and
of trials,before
variable
0.
on
For
response.
Fi is the number
1 results in
variables
the
occurs
the number
of the random
(11)
^(0)
and
response
P^PMP)
is
variables
of Ai
number
that
by noting
of trials beforethe firstA
Let
the
^1
MM
T~"
i -f- V
the random
weighted
the
A2
an
if the result of the first trial is an
Mv)
weighted
relabelingthe random
represented by
expectation of
equations are obtained
(9)
by
From
n.
trials with
trial 2 on, the
if the result of the first trial is
which
be
can
"
in N
of Ai responses
trial number
is ^^=2 ^^n
"
number
weighted
being the
for the
the functional
[14]
is somewhat
of derivation
responses
If }p stands
by Tatsuoka
the
variable
of Ai responses
X3
method
of
moment
presented here.
Then
number
second
obtained
previously
the
and
mean
[15].Their
Mosteller
that
weightednumber
Define
have
responses
the
for
equations
and
different from
The
""
"
pM
A2
first A2
(1 +
for
occurs
Fi is
7^2),where
starting
expectation
occurs,
the
are
+ ]r^~v
YJr-y'^(M
'
LA
the
denotes
equations for
second
of the
moment
V-
random
F, the functional
variables
p are
pM
(13)
Y^
[1 +
(14)
Pa(p)
number
'
1 +v
PPaioCi V) + P[l + 2j/(aip)].
last Ai response
at which
P^(M]
2"',(M +
'^"^"^
1 +
TnaZ
+
VVaiotiV)
v"{p)
(12)
If
VEEN
occurs
Let
fO if no
L"
"^
Ai
1 if the last
[(N +
Then
Ai
on
1)
random
the
Ai
any
the
trial and
and
by definition
of A
occurrence
by Ai
/x
equation
fiaip)
=
for
Ma
on
(1
+
=
[(1
3(1
(1
the
may
be deduced
p){l
(1
p)oi2p+
p)(l
p)(l
(1
response
last
occurs
A2A2A1
of responses
by Ao
the
the second
on
that
random
L, the functional
variables
1. For
infinite number
an
of trials
+ 2]
p)a2p[(Jia{aia2p)
(1
p)a2p
2(1
p){\
a2p){l
"
p)(l
"
"
"
+
ci2p)alp
"
"
"
a2p)a2p
a2p)alp+
"
"
(1
"
p)a2PHa(.ocia2p)
+
"].
a2p)a2piJLc(oiia2p)
"
"
in the
Ai
no
n.
on.
last
expression is just (1
from, the expression for
p)a2p
if
zero
"
which
at
the sequence
+ 3] +
a2p)al'p[iJLMiOclp)
in brackets
term
(1
(1
1] +
pnaiotip)+
But
so
expectation of the
developed from Fig.
p[Ma(aip)+
trial N
on
the first trial followed
the
is
occurs
the third trial. It is evident
on
denote
Li is
development,
and
Letting
trial
on
L^ represents the trial number
trial. In the following
denotes
occurs
response
after trial
or
on,
if the last A^ response
"
variable
occurs,
response
occurs
response
p){l
"
p)Ma("2?")as
Ma(p)- Also
a2p)a2p +
"
"
"
JI
(1
"
"2P)-
similar
development
(15)
,x,(v)
=
(16)
-^
1 +
MATHEMATICAL
IN
for
n,iM
T-^
M^(M
1 +
'
'^''^'
'
equations
+
p)Ma("2?")
^^\ (1 +
+ (1
PfJLaiotlP)
Ma(?")
in the functional
np{v)results
^''^^^
PSYCHOLOGY
11 (l
"
^\v)
'
^2^)
"
1=0
with
n(0)
0, since for
for the
the
mean
model
alpha
the
For
will be found
Y^^y"(M
at which
second
for 7
Hosteller
to
in
occurs
[15].
consider the expectationsof

random
of the
moment
variables
are
y"m
:i-^
tion
different deriva-
occurs.
the last A, response
and
it is necessary
the
equations
ever
response
in Tatsuoka
expectation of Ll
7 the functional
(17) 7,"
Ai
Ono
1)^,etc. Denoting
(L2 +
by
of the trial number
[2,Mf\^
+
l]
,
and
(18) 7a(p)
with
be
7(0)
v^Mv)
the above
n (1
[2m.(p)
equations
for
higher
"2?")
of Lj
moments
can
easily
manner.
of lengthj, of A^
of runs,
The
vhMv)
0. Functional
generated in
Number
(1
responses
of responses
sequence
Ai^jTix
"
"
"
A.iA.2
"
'
j trials
is termed
of
The
process
of
Kronecker
of
runs
"
total number
Fig.
delta
1 it is
of
seen
function.
"
runs
that
Fig.
1 (a) For
.
5;t,,is the Kronecker
the expression
of
an
delta
to
;, and
and
o-,
i?",,+
denote
equation
the
for aj^
infinite number
function.
the
the
From
.
5",,+2
number
length greater
number
termination
the
length j is then iî
Ri,j
of
Let /?",,denote
,,
Letting
length j, the functional
lattice of
where
length j.Statistics concerning the
length exactly equal
runs
model
of
responses
of
"),are of interest.
1, 2
equal to j (j
of length j, which
between
trial n
occur
or
process.
of
of Ai responses,
run,
of A^
runs
than
of
of the
branching
5",,+2 is the
where
,
expectation of the
is developed from
number
the beta
of trials
for part
Substituting "tj^{îv)
of
gives,
i
+ (1
"Jif"{v)Pi"r,p{^,v)
=
+ IIPi[(l
Pi)(Tie{fi2V)
-
P/
i)
P/
i(l
P"+2)].
$$f(v, \beta_1, \beta_2) = \frac{v}{1+v}\, f(\beta_1 v, \beta_1, \beta_2) + \frac{1}{1+v}\, f(\beta_2 v, \beta_1, \beta_2) + g(v, \beta_1, \beta_2), \tag{25}$$
where
0 "
0"y"oo,
The
term
13, "
0 "
I,
g{v,^j, ^2) is,in general, different
132 "
for each
1.
statistic considered.
all except the
For
statistics
run
giv,/5i ^2) "
0,
^(0, ^1
0,
(26)
Umg{v,0,
these
For
^2)
,|S2)"
1.
statistics
/(O,^1
(27)
lim
^2)
0,
J(V,;Si 182)
"
Equation (26) does not hold for the run statistics and the boundary conditions for the run statistics have to be defined separately.

The functional equations for the statistics of the alpha model are seen to have the general form

$$y(p, \alpha_1, \alpha_2) = p\, y(\alpha_1 p, \alpha_1, \alpha_2) + (1 - p)\, y(\alpha_2 p, \alpha_1, \alpha_2) + z(p, \alpha_1, \alpha_2), \tag{28}$$
where
0 "
0"p"l,
For
the statistics of the
alpha
/29N
and
the
boundary conditions
0 "
1,
a2
"
1.
model
^(0,"i
z(l,ai
"2)
02)
0,
"
0,
for all the statistics considered are

$$y(0, \alpha_1, \alpha_2) = 0 \quad \text{and} \quad \lim_{p \to 1} y(p, \alpha_1, \alpha_2) \text{ is finite.} \tag{30}$$

The functional equations for the run statistics of the beta model differ in nature from the functional equations for the other statistics considered. A discussion of the functional equations for the run statistics is presented in [11]. The sections which follow present formal solutions to (25) and (28) under the boundary conditions (27) and (30) respectively. Theorems concerning existence, uniqueness and other properties of the solutions have been proved in [11] by methods similar to those of Bellman [3]. Some of these theorems are stated here without proof.

On the functional equation for the OAB beta model
Writing $f(v, \beta_1, \beta_2)$ simply as $f(v)$, (25) takes the form

$$f(v) = \frac{v}{1+v}\, f(\beta_1 v) + \frac{1}{1+v}\, f(\beta_2 v) + g(v), \tag{31}$$
where
giv)"0,
^(0)
lim^(i;)"l,
0,
and
/(O)
lim
0;
Further, let $0 < \beta_1 < 1$; $0 < \beta_2 < 1$. The cases ($\beta_1 = 1$, $\beta_2 < 1$) and ($\beta_1 < 1$, $\beta_2 = 1$) can be considered separately.

Existence of solution. For any function $r(v)$ define the operator $T$ by

$$T \cdot r(v) = \frac{v}{1+v}\, r(\beta_1 v) + \frac{1}{1+v}\, r(\beta_2 v) + g(v). \tag{32}$$

Theorem 1. $f(v) = \lim_{n \to \infty} T^{n} \cdot g(v)$ when the limit exists.
Theorem 2. If $g(v)$ is a monotone increasing function of $v$, then the solution $f(v)$ exists if

$$\sum_{i=0}^{\infty} g(\beta^{i} v)$$

is finite for $0 \le v < \infty$, where $0 < \beta = \max(\beta_1, \beta_2) < 1$.

As almost all the $g(v)$ occurring in the beta model first-moment equations are monotone increasing functions of $v$ which satisfy the conditions of Theorem 2, the existence of the mean of most of the random variables introduced for the OAB beta model is assured.

From a proof similar to that for Theorem 2 it follows that when $g(v)$ is a monotone increasing function of $v$,

$$\sum_{i=0}^{\infty} g(\beta_{\min}^{i} v) \le f(v) \le \sum_{i=0}^{\infty} g(\beta_{\max}^{i} v), \tag{33}$$

for $\beta_{\max} = \max(\beta_1, \beta_2)$ and $\beta_{\min} = \min(\beta_1, \beta_2)$.
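The existence result and the series bounds can be illustrated numerically by iterating the operator $T$ of (32). The sketch below is a hedged illustration, not the method of [11]: it assumes the forcing term $g(v) = v/(1+v)$, which is what conditioning on the first trial gives for the mean number of $A_1$ responses, and it evaluates $T^n \cdot g$ on the lattice of points $\beta_1^i \beta_2^j v$. The names `iterate_T` and `series_bound` and all parameter values are hypothetical.

```python
def g(v):
    """Assumed forcing term for the mean number of A1 responses."""
    return v / (1.0 + v)


def iterate_T(g, v, beta1, beta2, n_iter=80):
    """Evaluate (T^n_iter g)(v) for the beta-model operator T of (32)."""

    def point(i, j):
        return (beta1 ** i) * (beta2 ** j) * v

    # Level 0: T^0 g = g on every lattice point with i + j <= n_iter.
    cur = {(i, j): g(point(i, j))
           for i in range(n_iter + 1) for j in range(n_iter + 1 - i)}
    for k in range(1, n_iter + 1):
        nxt = {}
        # Level k is needed only on points with i + j <= n_iter - k.
        for i in range(n_iter + 1 - k):
            for j in range(n_iter + 1 - k - i):
                x = point(i, j)
                nxt[(i, j)] = (x / (1.0 + x) * cur[(i + 1, j)]
                               + 1.0 / (1.0 + x) * cur[(i, j + 1)]
                               + g(x))
        cur = nxt
    return cur[(0, 0)]


def series_bound(g, v, beta, n_terms=2000):
    """Partial sum of g(beta**i * v), the series appearing in the bounds."""
    return sum(g((beta ** i) * v) for i in range(n_terms))
```

For example, with $\beta_1 = 0.8$, $\beta_2 = 0.6$ and $v = 1$, `iterate_T(g, 1.0, 0.8, 0.6)` should fall between `series_bound(g, 1.0, 0.6)` and `series_bound(g, 1.0, 0.8)`. Storing values by lattice index $(i, j)$ relies on the two operators commuting.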
Continuity. If $|g(v)|$ is bounded in $0 \le v < \infty$, the solution $f(v)$ is continuous.
If
Monotonicity.
1^2, then
/5i "
j{v)is
the
solution
oo
OAB
Existence.
alpha models
(1
p-y(a,p)+
+ z{p)
p)y{ot2p)
of existence,uniqueness and
is similar to that
(31).Some
for
For
any
(1
pQic^^v)+
other
propertiesof the
propertiesof y{p) are stated

Q{p), define the operator
function
miv)
(35)
if
equation
yip)
development
of v, and
the functional
(34)
and
increasing function
increasingfunction of v.
/(y)is unique in 0 " y "
monotone
junctionalequationfor the
For
the
monotone
The
Uniqueness.
On
g{v) is
solution
without
proof.
+ z(p)
p)Q(c^2p)
let
A^"^-z(p)
Ipî
lim
Theorem
c(ai ag)-
3.
y(p)
A^"^-z(p).
lim
n-"co
Theorem
// z{p) is
4.
increasingin
monotone
p, then
00
CO
Z) ^("nP)
yip)
"
Zl zialp)
"
i=0
"=0
where
If
Monotonicity.
If
Solution
zip) is
and
is convex
2(7?)
Functional
of the
solution
model
derived
to
Theorem
(36)
is shown
here. A
min
(ai a2).
,
p, and
increasingin
monotone
(31) is
in
detailed
5.
Along
fiv)
==
ai
The
solution
Fig.
presentationwill
aj
then
,
yip) is
OAB
aj
"
0:2
then
,
be
parameter
for
Model
solutions
space
of the
specialparameter
found in [11].
(1) and (2) of Fig. 2,
E dii^Tm n
convex.
Beta
by generalizingfrom
values.
2. One
sides
"
Equation for the
obtained
equation for special parameter

beta
oi"
increasing in p.
yip) is monotone
Convexity.
The
("! 0:2),
max
a"
^2^lV
f/o(1 + /?2/3"
of the
OAB
values
is
"
/3
Figure
Parameter
Proof.
summation
side
Along
over
is
n,
(1),182
Model
Beta
1, and
0, |8i"
The
nonzero.
of OAB
only
the
resultingexpression is
equation, for in this
the functional
from
Space
the
0 term
one
of the
obtained
case
mv)
1 +
g(v),
givmg
^7v
/(/?"
for
0, 1,
substitution.
"
"
which
from
"
(1 +
K^r'v) +
m
the
desired
g{(3:v),
result is obtained
Along
z gm
side
(2),/3i
z
^0
1, and
1, jSa "
02V
f-4 (1 + m
rr
Ed
by
(36) becomes
+
02v)g{02v),
successive
372
the last expressionbeing also the

for the
Kv)
to
/3i
case
KM
Note
exists
case
the functional
equation
equation reduces
the functional
v)g{v).Q.E.D.
"
"
the parameter
be obtained.
resultingfunctional
in the form
of g-difference
be written
equations for which
extensive body of literature [1,2].
of the solutions for various specialparameter values suggests
an
the form
the
1, for which
from
point (1,1) of the parameter space the solutions diverge.

/S*(fc 1,2,
/Sg jSg
"),solutions along arcs of the form
Examination
by
(1 +
obtained
one
that at the
By letting/81
(6) and (7) of
equations may
there
1, 182 "
of the
space
can
general solution.
The
The
general solution
(31)is given
to
followingtheorem.
Theorem
6.
m=0
n=0
where
Ao,o(v)
1,
Ao.M=Jlr^r^
A^.M
X) Ao.MA,.o(02v)A"_,,,,_,{^,^lv)(m,n
Proof.
Z
m
E
0
(n=h2,-..),
Substitution
A","iv)9(^Tm
in
rxT,
1
Z
V
T-XT
E A^JMgi^T^r'v)
that
(37)
which
E A^,o{v)g{^7v) g(v)+
=
y^^
E A^_,,o(M9(^:v),
gives
m
Ao.oiv)
(38)
1;
A",o(v)
E Ao,Mgi02v)
n=l
").
E A",sM9{^r'm
m=o
so
""
(31) gives
-]-
n=0
1,2,
YJr,A^-..o(M
-A-,E
1
~\-
"=!
Ao,n-,{M9(m,
yo
"
YfW'v'
g(v)
LAVEEN
which
373
KANAL
gives,
"1
and
i:i: A_,."(îi;)^(CT2t^)
i:i:^"."(i;)^(/3r^^?;)
7^
(39)
m-l
n=l
-t
"=i
"=i
1 +
which
from
.4^1
^2V
JJ
^rv)
(1 +
"
-i
TT
(1 +
to the OAB
Solution
Alpha
Model
solution
similar
is
Theorem
/3i/3r*-V)
'
to
that
for the
used
beta
Functional
Equation
gives the parameter space of

be derived
for (34) can
[11]in
functional
model
equation. The
Replacing /Sgand /3iby az and ai in Fig.

OAB
alpha model. The general solution
manner
equation
General
nk
132V
yp
difference
"=i
[11]
follows
the
expressionsatisfythe
coefficients in this last
The
^1
given by
7.
yiv)
Z) X) 6","(p)-2(a"2p),
Tn
where
K.oip)
1,
^^V
h^.oip)
p-'ar
"o."(p)
fl(l-"rp)
^^
(m=l,2,
(n=
...),
1,2, """),
7=1
n
"m,"(p)
(m,n
Z) hi,o{a2p)bo,k{p)hrn-i.n-kiaia2p)
1,2, """)"
Proof.
Details
The
proof is similar
given in [11].
are
PSYCHOLOGY
MATHEMATICAL
IN
to that
used
model
beta
for the
equation.
Discussion

Analytical techniques applicable to a class of learning models have been presented. Functional equations for various statistics of two learning models, viz., Luce's nonlinear beta model and a linear commuting-operator model called the alpha model, have been derived from the branching processes defined by the models.

The
the
results
alpha model,
first and
second
number
last A^
the
the
[15].However,
power
power
series solutions
first and
functional
to the
of A
been
are
obtained
For
new.
for the
equations
the
and
responses
expanding
of
techniques
trial
and
Tatsuoka
by
the
in
functions
fails,as is illustrated by the fact that the
often
(obtainedby
second
had
occurs
model
beta
Luce's
of the total number
moments
series in the variable
for the
propertiesof
series solutions
power
which
at
Hosteller
stochastic
on
the functional
Tatsuoka
[14])to
of the
total number
moments
of Ai
equations
responses
for
the
one-absorbing-barrier(OAB) beta model are not valid for y " 1.

By investigatingtwo general equations, the problem of solving the
individual
functional
models was
simplified.The
equations for the OAB
functional
do not
have
the
in this paper,
could
be
the OAB
for
to
one
of the
models,
beta
Furthermore,
use
beta
model
methods
Some
method
some
beta
model
used
to
has also been
obtain
methods
upper
derivation
derived
bounds
failed for
for
in
a
bound
upper
[11],mainly
few
statistics
of the statistics.
number
of close
which
for statistics of
bounds
lower
presented in ([11],ch. 5). An
for the
remains
and
upper
find
to
been
These
model.
made
was
statistics
close bounds
lower
for the
bounds
to be found.
Empirical tests and comparisons of the beta model with other models
Luce
been presented by Bush, Galanter, and
[6] and Fey [10].The
of statistics such
parameters
Bush
attempt
an
easilycomputed.
alpha model have
illustrate the
have
requireadditional investigation.
complexity of the expressionsobtained for the
their solutions
statistic of the OAB
of the OAB
OAB
same
and
Because
of the OAB
beta model
sequential statistics of the OAB
boundary conditions as the generalequation presented
equations for the
and
and
as
those
the
in this paper
goodness of
[8],Bush, Galanter, and Luce
for measuring
Hosteller
derived
for the
fit has
been
[6]and
estimation
of
discussed
by others
by
(see[5]).
REFERENCES

[1] Adams, C. R. On the linear ordinary q-difference equation. Ann. Math., 2nd ser., 30, 1929, 195-205.
[2] Adams, C. R. Linear q-difference equations. Bull. Amer. math. Soc., 2nd ser., 37, 1931, 361-400.
THE ASYMPTOTIC DISTRIBUTION FOR THE TWO-ABSORBING-BARRIER BETA MODEL*

LAVEEN KANAL†

GENERAL DYNAMICS/ELECTRONICS
ROCHESTER, NEW YORK
For the two-absorbing-barrier specialization of Luce's beta learning model, the asymptotic distribution of the response probability has all its density at $p = 0$ and $p = 1$. The functional equation for the amount of the density at $p = 1$ is investigated in this paper.
beta
Luce's
is
case
learning
given by
"'""" '''*
(1)
"...
and
p"
for
/?2 "
where
p"/(l
1.
derived.
are
model
In
this
(TAB)
for the
two-reflecting-barrierbeta
the
For
of p"
has
all its
1 is
"
"
^,
"
density
useful
at
"particle" starting
at
functional
for
equation
(2)
is
for the
1, /Sz "
1 is
model
2.
,
the
in
absorbed
[3]statistics
when
/3i
1,
statistics
distribution
of the
amount
the
-foo,
at
"
[4].
asymptotic
f{v) is
If
Ai and
response
presented. Some
1. The
models.
eventually
0,
two-absorbing-barrier
considered
are
beta
0 and
"
paper
obtained
statistic
model
statistic for these
companion
model
beta
/Sj "
probabilitiesof
p"). In
two-absorbing-barrier
tingent
con-
p"
"
"
paper
arising when
beta
0 "
respectively the
y"
two-event,
two-response,
equations
one-absorbing-barrier (OAB)
the
the
p.
probability
are
p"
"
and
2
transition
P"bability
with
response
[5] for
W2Vn
where
model
the
density
at
that
probability
i.e.,at
1, the
j(v) is
Y^^mv),
YT~,i(M
1,
1,
where
0
"
"
*Abstracted
Pennsylvania,
Bush,
from
his
/3i "
00
June
from
1960.
dissertation
portion
The
author
supervisor,
/32 "
of
the
author's
is indebted
for
the
/(O)
doctoral
to Prof.
valuable
B.
help
lim
0,
f(v)
1.
of
dissertation, University
Epstein
and
and
to Prof.
encouragement
Robert
R.
received
them.
†Formerly at the Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia, Pa. The author is grateful to the Moore School for the support extended to him during his doctoral studies. He also wishes to thank D. Parkhill and N. Finkelstein of General Dynamics for their encouragement of his work.

This article appeared in Psychometrika, 1962, 27, 105-109. Reprinted with permission.
VEEN
LA
The
of
solution
(2) is
the
monotonicity of
of Bellman
[1].
Solution
For
the
subject of
solution
the
and
(3)
solution
of
log,î
fix)
The
methods
[4]by
two-absorbing-barriersymmetric
log"V
in
similar
to
those
for the symmetric model
^.
Let
Existence, uniqueness, and
this paper.
shown
are
377
KANAL
Then
model
1.
"
(2) becomes
+ 6) +
Y^rrJix
(3) is given by
beta
Theorem
rxT^-^^^
^)-
1.
1*.
Theorem
1)6]j
|-^[x
Zêxp|-^[a:-(/c
^)6r|
Z
%)
exp
(k -t-
Proof.
one
(3), letting g(x)
From
gets h(x)
-}- h)
h{x
"
Assuming
x.
f(x)
f{x
"
h(x)
Cq
h), h(x)
"
log, g(x),
and
-\-CaX^,
-\-CiX
tuting
substi-
gives
g(x)
where
p(x) is
fix +h)
periodicfunction
Kx)
g{x
-^,
(x
26
p{x) exp
of
period b.
Kx)
h),
6/2)^]
As
f(x +nh)
Y, 9(x +
Then
as
-^
f(x
^
,
Kx)
nb)
"
"
p{x)
kh).
1 and
|-^[x
f:exp
m"j-
(k-
Furthermore
Kx
and
lettingn
"
n6)
"
"
"
p(x)
Ê^
exp
|-^
[x +
(k
1)6]
gives
p(x)
(^-i)6]j
Eêxp|-^[:r
+
*Prof. B. Epstein pointed out

Theorem
1 presented by Bush
[2].
the
error
in taking limits
in
an
earlier version
of
378
SO
READINGS
PSYCHOLOGY
MATHEMATICAL
IN
that
|:exp|-^[:r
(fc-i)"]j
+
/(.r)
from
which
1 follows.
Theorem
problem mdicates.
Corollary
that
/(O)
of the
the symmetry
as
Q.E.D.
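The symmetric solution can be checked numerically. The sketch below is a re-derivation under stated assumptions, and its indexing may differ from the statement of Theorem 1: it takes $x = \ln v$ and $b = \ln \beta_1 > 0$ with $\beta_2 = 1/\beta_1$, so that the functional equation reads $f(x) = \frac{e^x}{1+e^x} f(x+b) + \frac{1}{1+e^x} f(x-b)$ with $f(-\infty) = 0$ and $f(+\infty) = 1$. Under those assumptions the ratio of Gaussian-type sums in the code satisfies the equation. The function name and the truncation `k_max` are hypothetical.

```python
import math


def absorb_at_one(x, b, k_max=400):
    """Amount of density at p = 1 for the symmetric TAB beta model.

    x is the natural log of the odds v, b is ln(beta1) > 0, and
    beta2 = 1 / beta1 is assumed. Both sums are truncated at k_max terms.
    """

    def term(k):
        return math.exp(-((x - (k - 0.5) * b) ** 2) / (2.0 * b))

    numerator = sum(term(k) for k in range(1, k_max + 1))
    denominator = sum(term(k) for k in range(-k_max, k_max + 1))
    return numerator / denominator


b = math.log(2.0)  # illustrative choice: beta1 = 2, beta2 = 1/2
half = absorb_at_one(0.0, b)  # v = 1, i.e. p = 1/2
```

At $x = 0$ the terms for $k$ and $1 - k$ are equal, so the ratio is one half, in agreement with the symmetry remark that follows the theorem.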
When
1.
f{x)
as
Note
^^
for large negative values of

i.e.,
"^,
-56 (^
p{x) exp
p(.t)is of period h and the
"Z'^']
'
/c
corresponding to
term
x,
0 dominates
in the
numerator.
Corollary
2.
fix)
Corollary
large positivex,
For
[-^(^
4 the
denominator
6 "
When
3.
p(x) exp
by performing
Corollary
for then
sum
fourier series
6 "
When
4.
and
constant
from
zero
to
given by
'
analysis.
4,
by Corollary 3, the denominator
by
the
1 is
of Theorem
J2^
_1
pix) ^\
obtained
W^fj-
the numerator
infinityby
may
be
closelyapproximated
approximated by replacing
integralfrom
an
1 is
of Theorem
"1/2
to
Using
infinity.
the transformation
Vb
[i-'-i]
gives the corollary.

Solution for the general TAB beta model

For the general case $\beta_1 > 1$, $\beta_2 < 1$, it is convenient to obtain the solution in terms of the solution for the symmetric model. Let the solution for the symmetric model given in Theorem 1 be denoted by $R(v)$. Then the solution for the general model is given by Theorem 2.
LAVEEN
Theorem
For
2.
/3i "
^\Pl)
1, /Sa "
it
m=-0
w/iere
5"(/3,)
(1
/32)
(n+l)/2
n
i
Proof.
Define
(1
"v)
the transform
F{s)
dv.
f(v)v-'-'
Jo
Writing (2) in
the form
')
mv)
(l i)/(.
+
and
applying the
transform
If
R{s)
gives
F{s +
F{s) +
is the transform
of
from
The
inverse
transform
of
transform
1)
the
of
that
[4]that
and
1).
denominator
terms
2.
in the
^^'''^rRi.s)
taking
being ^(/3i~*/3"i;),
F(s) gives Theorem
It is noted
in
numerator
+ ^'r''F{s+
filF{s)
R{v), it is shown
which, by expanding
product, one gets
^\Pl)
lm.v),
the
inverse
Q.E.D.
the coefficients in the series of Theorem
2 tend
to
zero
rather rapidly.
REFERENCES

[1] Bellman, R. On a certain class of functional equations. In T. E. Harris, R. Bellman, and H. Shapiro (Eds.), Functional equations occurring in decision processes. Research memo. RM-878, Rand Corp., Santa Monica, Calif., 1952.
[2] Bush, R. R. Some properties of Luce's beta model for learning. In K. J. Arrow, S. Karlin, and P. Suppes (Eds.), Proceedings of the First Stanford Symposium on mathematical methods in the social sciences. Stanford, Calif.: Stanford Univ. Press, 1960.
[3] Kanal, L. Functional equation analysis of two learning models. Psychometrika, 1962, 27, 89-104.
[4] Kanal, L. Analysis of some stochastic processes arising from a learning model. Unpublished doctoral thesis, Univ. Pennsylvania, 1960.
[5] Luce, R. D. Individual choice behavior. New York: Wiley, 1959.

Manuscript received 12/3/60
Revised manuscript received 7/23/61
SOME RANDOM WALKS ARISING IN LEARNING MODELS I

Samuel Karlin

Introduction
The present paper presents an analysis of certain transition operators arising in some learning models introduced by Bush and Mosteller [2]. They suppose that the organism makes a sequence of responses among a fixed finite set of alternatives and there is a probability $p_s^n$ at moment $n$ that response $s$ will occur. They suppose further that the probabilities $p_s^{n+1}$ are determined by the $p_s^n$, the response $s_n$ made after moment $n$, and the outcome or event $r_n$ that follows response $s_n$. We shall examine in detail the simplest form of the one-dimensional models which occur in their theory. These models can be described as follows: There exist two alternatives $A_1$ and $A_2$ and two possible outcomes $r_1$ and $r_2$ for each experiment. There exists a set of Markoff matrices $F_{ij}$ which apply where choice $i$ was made and outcome $r_j$ occurs. Let $p$ represent the initial probability of choosing alternative $A_1$, and $1 - p$ the probability of choosing $A_2$. Depending on the choice and outcome, the vector $(p, 1 - p)$ is transformed by the appropriate $F_{ij}$ into a new probability vector which represents the new probabilities of preference of $A_1$ and $A_2$, respectively, by the organism. The psychologist is interested in knowing the limiting form of the probability choice vector $(p, 1 - p)$.
into
which
"
mathematical
The
follows
as
two
impulses.If
and
(j){x),
"
behavior
of
change of
ulated
of the simplest
description
process of this type can be formwalk subject
to
the unit interval executes
a random
on
particle
it is located
-^
F^x
the
at
"
F-^x ax
pointx, then x
The
with
oca;
"ji{x).
probability
of
the nature
depends on
-^
The
"j"{x).
[1
dF
"t"{t)]
introduce
an
limiting
is givenby
particle
dF.
Jo
Jo
We
probability
actual
/'(:r-l + a)/a
rxjo
{TF) {x)
the
with
the
operator representing
transition
of
the position
describing
the distribution
additional
continuous
operator, actingon
functions, and
givenby
U7T{t)
[1
+ "t"{t)TT{\a
"p{t)]TT{at)
-
at).
conjugateto U; hence knowing the behavior of U one obtains

This
much
information
about
T.
considerably.The
interplayshall be exploited
it
does
pactness
is
continuous
nor
not weakly completely
possess any kind of comoperator T
theorems
of the classical ergodic
apply to this type [3].
property; thus none
the
of
on
The
r"F
behavior
assumptionsmade
depends very sensitively
limiting
It turns
out
that
T is
about the operators Fj and

the probabilities $\phi(x)$.

This article is from the Pacific J. Math., 1953, 3, 725-756. Reprinted with permission.
Section
to
IN
the
1 treats
where
case
absorbing states, and
be
MATHEMATICAL
thus
PSYCHOLOGY
(f)(x)
shown
the
knowledge of
the
of U^tt
convergence
and
out
H.
is
"
of the
any continuous
In this
between
of r"Fare
the convergence
the
different
entirely
proofs
are
obtain much
to
"
"
Additional
tt.
remark
finally
R.
are
needed
Bellman, T. Harris,
independently.
They did not point
The methods
theyused to establish
case
Tand
arguments
that
U.
paper in " 1 overlapswith theirs in some

results subsume
15; our
theirs,and their
from
and
Section
ours.
2 considers
the
case
is
"f"{x)
where
and
increasing
monotone
\"/"{x)
-cp{y)\
This
only
By examining
additional
knowledge.
concentrates
Our
probabilistic.
notably 6, 8, 9, 12,
theorems,
at these
continuouslydiflFerentiable then
1. It is worth
emphasizingthat
distributions
does not imply the uniform
we
operators
0 and
the initial distribution.
function
connection,
boundaries
times
Shapiro[1] have analyzed onlythis
N.
the connection
of the
each
convergence
for
for this conclusion.
if
that
For
example, we
(L'^"77)''''
converges uniformlyfor
the
causes
hmitingdistribution
dependson
points.However, the concentration
t/in detail,we
have been able
the corresponding
have
This
x.
the
leads
""
1.
"
the
situation,where the limiting

ergodicphenomonon, or steady-state
independentof the startingdistributions.
examine
the situation "^(.t) I
This corresponds
In " 3, we
to completely
x.
the ergodic
boundaries, and of course
phenomenon holds. Other interesting
reflecting
the
also
consider
in " 4 the case
where
We
of
are
(f)(x)
developed.
properties
operators
linear
and
monotonic
Section
further
where
5 introduces a
is
decreasing.
possibility
allow the particle
still with certain probability.
tically
This type has been statisto stand
we
examined
Flood
the
M.
M.
In
6
we
[5].
"
by
investigate generalergodictype
where
both abstract analysis
linear. The arguments here combine
^(x) is not necessarily
it is worth
and probabilistic
r
ecurrent
event
Furthermore,
theory.
reasoninginvolving
where
the
the
in
6
without
case
emphasizing, proofsgiven " apply
any modifications to
allow
of impulsesactingon
the particle.
In a future paper we
we
any finite number
shall present the extension
where
of this model
to the circumstance
changes in time
infinite
motion
of the particle
has a continuous
or
occur
continuouslyand the possible
to
distributions
are
"
range of values.
discrete
The
last section
studies
of the
some
of
properties
the
in the
distribution
limiting
in all circumstances
that the limiting
distribution is either
ergodictypes. It is shown
the value of
and
the
actual
form
+ a.
on
or
absolutelycontinuous,
singular
depends
models
where
of the analysis
carries over
Most
more
to higherdimensional
a
alternatives
allowed.
are
In
subsequent paper
that this
We
note
generalizations.
finally
and
it
is
analysis
probability; hoped that
of this type.
investigations
It has
[8],and
been
brought
[9] relate closelyto
to
my
the
represents a combination
paper
attention
content
present this theory with
shall
we
methods
the
by
will be
used
useful for future
of
the referee that the material

Their
of this paper.
other
of abstract
techniques
seem
[6],[7],
to
be
different.
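The conjugacy between the transition operator $T$ and the operator $U$ of the introduction can be illustrated numerically. The sketch below is not from the paper: it simulates a single impulse of the walk started at a fixed point $x_0$ and compares the Monte Carlo mean of a test function $\pi$ with $(U\pi)(x_0)$. The function names, the seed, and the choices $\phi(x) = x$ and $\pi(t) = t^2$ are assumptions made only for the illustration.

```python
import random

def step(x, sigma, alpha, phi, rng):
    """Apply one impulse: F_2 x = 1 - alpha + alpha*x with probability phi(x),
    otherwise F_1 x = sigma*x."""
    if rng.random() < phi(x):
        return 1.0 - alpha + alpha * x
    return sigma * x

def apply_U(pi, sigma, alpha, phi):
    """Return the function U pi, where
    (U pi)(t) = [1 - phi(t)] * pi(sigma t) + phi(t) * pi(1 - alpha + alpha t)."""
    def U_pi(t):
        return (1.0 - phi(t)) * pi(sigma * t) + phi(t) * pi(1.0 - alpha + alpha * t)
    return U_pi

def mean_after_one_step(pi, x0, sigma, alpha, phi, trials=200_000, seed=0):
    """Monte Carlo estimate of the mean of pi(X_1) when X_0 = x0,
    i.e. the integral of pi against T applied to a point mass at x0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += pi(step(x0, sigma, alpha, phi, rng))
    return total / trials

if __name__ == "__main__":
    phi = lambda x: x      # assumed choice: phi(x) = x
    pi = lambda t: t * t   # assumed test function
    print(apply_U(pi, 0.5, 0.5, phi)(0.4))
    print(mean_after_one_step(pi, 0.4, 0.5, 0.5, phi))
```

The two printed numbers should agree up to sampling error, which is the point-mass case of the identity $\int \pi \, d(TF) = \int U\pi \, dF$.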
1. A particle on the unit interval undergoes a random walk subject to the following law: If the particle is at $x$, then after unit time $x \to \sigma x$ with probability $1 - x$, and $x \to \alpha + (1 - \alpha)x$ with probability $x$, where $0 \le \sigma, \alpha \le 1$. If $F(x)$ represents the cumulative distribution describing the location of the particle at the beginning of the time interval
this operator $U$. To be complete, we should denote the operator by $U_{\sigma,\alpha}$, but we shall drop the subscripts where no ambiguity arises. Let $W$ denote the isometry $W\pi(t) = \pi(1 - t)$. Clearly $W^{-1} = W$. We now observe the identity

$$U_{1-\alpha,\,1-\sigma} = W\, U_{\sigma,\alpha}\, W^{-1}. \tag{3}$$
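Identity (3) can be checked numerically. The sketch below is illustrative only and rests on an assumption: the operator of this section is taken to be $(U_{\sigma,\alpha}\pi)(t) = (1 - t)\,\pi(\sigma t) + t\,\pi(\alpha + (1 - \alpha)t)$, the form implied by the transition law of this section and by equation (4). The helper names `make_U`, `W`, and `max_gap` are hypothetical.

```python
def make_U(sigma, alpha):
    """Assumed operator of this section (phi(t) = t), as a map on functions:
    (U pi)(t) = (1 - t) * pi(sigma t) + t * pi(alpha + (1 - alpha) t)."""
    def U(pi):
        return lambda t: (1.0 - t) * pi(sigma * t) + t * pi(alpha + (1.0 - alpha) * t)
    return U

def W(pi):
    """Reflection isometry: (W pi)(t) = pi(1 - t). Note that W is its own inverse."""
    return lambda t: pi(1.0 - t)

def max_gap(sigma, alpha, pi, grid=101):
    """Largest difference on a grid between W U_{sigma,alpha} W pi
    and U_{1-alpha,1-sigma} pi."""
    lhs = W(make_U(sigma, alpha)(W(pi)))
    rhs = make_U(1.0 - alpha, 1.0 - sigma)(pi)
    points = [k / (grid - 1) for k in range(grid)]
    return max(abs(lhs(t) - rhs(t)) for t in points)

if __name__ == "__main__":
    test_function = lambda t: t ** 3 - 2.0 * t + 0.5  # arbitrary smooth function
    print(max_gap(0.3, 0.6, test_function))
```

The printed gap should be at the level of floating-point rounding, reflecting that reflecting the interval exchanges the roles of the two impulses.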
The
of
(1
mapping {a, a)
mapping the triangleof
-*
other
located
triangle
restrict
attention
our
for the other

and
unit
to the case
where
"
"
above
otherwise, we
shall
that
The
two
theorems,
Theorem
2.
77?^
operator
Theorem
3.
The
operator
assume
which
a.
"
0,
the effect
ct=0
into the
"
"
enables
Corresponding theorems
easilyby virtue
deduced
are
From
in
on
now
us
to
valid
of (3)
" 2, unless explicitly
"0.
"
for
state
we
0.
"
cr
"
of this section.
at the end
byl
itself has
isomorphism property
"
"
into
parameter space
This
square.
circumstances, where
next
of the
a)
"
the unit square bounded
in the
will be summarized
stated
a, 1
"
from
immediate
are
completeness,
(2).
if ni{t)"
particular,
Theorem
4.
Proof.
If tt,
([/,,)(") (1
=
all t, then
7r""" "
...
0 and
1.
functions.
for
irît),
77',
at
tinuous
positiveconpositive; that is, it transforms
is
into positive
continuous
functions
In
the values
U preserves
0, then
t/77j"
U-n^.
Utt, (Urr)',
.
" 0.
(C/77)""'
simple calculation yields

0(T"77"")(aO
+ ti\
nil
a)"7T"")(a
+ (1
a)f)
a)"-i77""-i'(a
+ (1
a)/)
na''-^J''-^\ot).(4)
Since
"
at
conclude
we
assumption that \
that (UttY'^^
"
it follows
for 0
(y.
is monotonic
7j-*""^'(0
since
77("-i)(a
+ (1
The
"
a"
0.
The
(l
a.)t,
that
increasing
a)0
" 0.
77"''-l"(0r)
"
impliesthat
(1
conclusion
same
0^-1.
a)"-i "
and
argument
" 0,
77-"'*'(0
to
(UnY^^
apply
As
"/""-!.
In particular, $U$ transforms positive monotonic convex functions into functions of the same kind. Although in the proof of Theorem 4 we assumed the existence of derivatives, the argument can be carried through routinely at the expense of elegance, by use of the general definitions of convexity and monotonicity.
Theorem
0
"i
5.
and hence
"n
Proof.
true
for /
/ "
"
"
If c " tt^HO "

(WTrY^Xt) " Ki.
for
The
0.
1
"
"i
"
n,
proof is by induction.
By Theorem
have
established
the result
Suppose we
Equation (4) yields
2, the theorem
for
(t/77)""'(l)77"""(1) Ci(a)77""-l)(l)
+ [(1
C2(a)77("-l)(ff)
-
" Ki for
("/^7r)'^"(l)
then
the
is
trivially
/th derivative
")"
with
1]77"")(1),
(5)
where
Ci(a)and CgCc)are
depending only on
constants
and
c;
and
respectively,
on
n.
If
7r"""(l)
" M(a,
where
is
then
suflficiently
large,
constant
c),
a,
(5) yields
"
(t/77)""'(l)
77""'(1).
Since
and
c-^{a)
c^{a)do
depend
not
k, and
on
by
\{U^7tY-\x)\"
uniformlyin
and
find
we
x,
in
generalthat
the induction
hypotheses
M
becomes
(t/''77)""'(l)
when
largerthan
a, c), then
M(a,
" (f/^-77)"")(l).
((7^-l77)""'(l)
+
the
Consequently,
for k
(t/*^77)""*(l)
iterates
M(a,
This
impliesthe
trivially
The
conclusion
of the
proof
completeness.
it for
Theorem
77(0)=0a"^77(l)
Ci(a)M
exists at
77(g
we
deduce
77(0)
impliesthat
Theorem
uniformlyas
Proof.
and
^r-^^
=
ti^
Tig denote
of Urr
maximum
solutions
two
then
Tig;
"
present
for which
with
the prescribed
0.
77-q(0) 77q(1)
"
"
Let
?q
Since
+ to7r(cc
+ (1
t^jniatf,)
a)/o),
find
we
point. Iterating,
of 7r(r).A
value
77j^
7.
(1
0?o is ^^ô
0 is the maximum
which
that
solution
continuous
We
1.
(By contradiction.) Let

Put
boundary conditions.
ttq
point where tt-qachieves its maximum.
Bellman.
R.
to
originally
one
Proof.
be
5.
is due
most
by
c^{a)M.
of Theorem
theorem
next
There
6.
c)
a,
bounded
k^ are
"
similar
shows
argument
by continuitythat
that 0
"
min
7r(r),
n^.
"
For
any
function Tr{t)
"
t^
with
"
oo
"
I, U'"{t^) converges
co.
Clearlyt
"
"
p{t),where
'0,
for 0
"
"
/q;
pit)
for tn "t
1
and
?o is close
0 and
to
1 with
fixed.
find that
fixed, we
are
Since
t "
and
of
lim
U^t
theorem
hand,
VH
at
are
Bit)for every t. Since

uniformly bounded,
the convergence
if t^ is close
to
of U"t
1 then
to
Ut
is
by
convex
Theorem
4, and
the
values
at
Hence
Ut.
jjnf
"\;
^n
"
ijn^l
6(0 is
we
f "
convex,
conclude
6(t)is uniform.
0,
and
that
by
6{t) is continuous.
Obviously, Ud
(C^)'(l)"/''(!)(see the
5 the derivatives
Theorem
proof
of
6.
Theorem
By
On
Dini's
the other
5). Since
Theorem
Up
"
p,
hence
U^f
"
"
U^p,
convexityof Up,
therefore
U'^+'^p;
"
U^p
point,and therefore
fixed
of U"t
the
guarantees
and
that
lim
the
lim
Theorem
by
deduce
we
and
U'^p
(f"(t).
Again,
infer that
we
U"t^
is
"f"{t)
(f)(t) d(t). On
the
that
tinuous
con-
account
4"{t)with
0 is 0, it follows
slope at
being
convergence
uniform.
denote
We
unique fixed point of
this
whenever
or
by "f}^
by 4"it)
J^t),
no
biguity
am-
arises.
for any
77?^ iterates
8.
Theorem
The
functions
constant
fixed
are
uniformlyfor any
Theorem
7, U^q converges
by the functions (1, T). The
i|"/"||=1,
as
when
appliedto
The
uniformly
converges
functiontt).
continuous
Proof.
{that is,U^n
strongly
U'^ converge
by
actual
well-known
limit is
lim
easilyseen
U"qit)
function
to
of
of continuous
Banach,
spanned
functions.
over,
More-
strongly
U'^q converges
q(t).
be givenby
qilU.Jt)
Consequently by
U^.
q(t)in the linear space
in the space
theorem
function
continuous
any
is dense
set L
points of
q{0)[1
(6)
cf"^Jt)].
n-*oo
This
is
immediate
an
two
dimensional
two
functions
of the
consequence
fact that
space spanned by the function

ô which
q^ and
0 and
at
agree
points of
the fixed
1 and
1 have
^"
the
consist
of the
Equation (6) shows
^.
limit.
same
This
that
enables
us
to show:
Theorem
9.
IfqiO
bounded
is any
functioncontinuous
1, then
and
at
U^q
strongly.
converges
Let
Proof.
derivatives
q(t),in
0 and
at
1.
addition
Then
being continuous
to
clearlythere
exist
and
at
functions
continuous
two
finite
1, possess
hît)and
/?2(0with
Ihit)"q{t)
where
hîO)
argument
then
first
we
find
can
of the
part
theorem
/rgCO)and
"
of Theorem
"
/ "
any
proof
Theorem
0
for
follows
now
hîl)
a
with
by
\q{t) qjit)\"
"
standard
from
result
this
continuous
q{t)is
only
propertiesassumed
As
e.
the
about
0 and
1,
q{t)in the
of the
conclusion
1, the
||f/"||
using the
at
argument.
for
"
i "
Wrr^'Kt)]
then
m,
"
for
Ci
m.
The
proof is by induction.
and the constant
functions
preserves positivity,
Proof.
established
result for
the
f/TT-*"'* (1
--
\.
"
t)a'''7T'''^\at)
+ t{\
+
This
now
the
q^{t)satisfying
// W'\t)\"Ci
10.
conclude
h^{\). We
equation(6). If
7 and
hît),
"
m(l
We
note
For
are
0, the
result is trivial since
fixed pointsof U.
Suppose we
that
a)('"'77"")[a
+ (1
a)'"-i7T('"-i'[a
+ (1
a)r]
a)r]
mo'''-'^TT^"'-'^\at).
easilyyieldsthat
max
"
"*'(/)!
11/77"
max
have
U^'^Kt)]+
Cmax
^''"-I'COI
,
where
I
[(1
max
t{\
1)0"^ +
a)"]
1.
"
Therefore,
" Amax|(t/'=-i77)('")(0l
|(t/*77)("'(0l
+ Cmax
max
KC/^-V^-HOI
t
Amax|(C/"'^-i"77)"""'(0l
+ K
"
by
this
hypothesis.Iterating
induction
our
inequahty yields
last
k-l
"
|(t/'^77)""'(0l
^
max
establishes
This
^'K
"
|77"""(0I
A^max
i=0
M.
the theorem.
C"
Ifqit)belongsto
U.
Theorem
{n
then
derivatives),
continuous
[U'''q{t)T^
lim
m"^co
converges
uniformly
for 0
We
Proof.
On
the
prove
"
I.
"
only for
theorem
continuityof U"q^^K Thus we can

Ijmâ)are also uniformly bounded.
1, for the other
select
cases
are
similar.
{Lf^qf^^
impliesthe equi-
of
boundedness
10, the uniform
of Theorem
account
"
subsequence converginguniformly since
Let
T(0
t/"'^^*!'.
lim
i"*co
lim
Since
As
V^'q converges
d'(t)is independent of
uniformly to a unique limit 6{t),we

the subsequence chosen, the conclusion
obtain
^(t).
6'(t)
of the theorem
easily
follows.
Let
Proof.
^(0)
0, /7(1)
=
The
12.
Theorem
is analytic
for 0
fixedpoint (fi^,^
p(t)denote
1.
By
virtue
At
2
this
through
Theorem
12 for the
case
deduce
we
0 and
that
" 0.
("/"/?)"'"'
"l"i%
=
and
hence, by
desirable to summarize
pointit seems
Theorem
1 1 and
of Theorem
monotonic
is absolutely
"f)^".
Therefore
" 0.
"f"'Jl^
I with
"
"
diflferentiable with p^^\t)
infinitely
function
lim
"
where
"x
the
theorem,
well-known
analogous results
We
"
I.
0, \,2,
the
enumerate
is analytic.
of Theorems
ing
correspond-
theorems.
Theorem
(-iy-\U7ryiKt)
In
"
If (-ly-'^rr^^Kt)
for
same
Theorem
n,
and
0, then
7T(t)"
"0.
functions
particular,
positiveincreasingconcave
of the
then 0 "
4'.
are
transformed
into
tions
func-
kind.
5'.
If
"
Tr(t)"
" Ki, and

(-l)^-i(C/'"77)"^"(0)
and
hence
" 0 for 1
(-iy-'^TT^^\t)
" Ktfor 1 " i " n.
TT^'Kt)\
"
"
i "
n,
with
7 holds
Theorem
the
lie in the
providedonly they
a,
unchanged and
6 remains
Theorem
and
is valid
unit
open
modification
independentof
the conditions
on
interval.
of
the
proof where
p{t)is replacedby
function
concave
'
\,
for 1 "
"
?o
for 0
"
?o
pit)-
y.
-t,
"
^0
the
and
functions
replacedby 1
function, a familyof functions
of Theorem
to infer the validity
us
constant
enables
changes
are
in their statements
modifications
appropriate
solution
^5
which
for this situation,where
analytic.In
the
remainder
Theorem
The
13.
8.
cr
of
also constitute,with
the
in
9, 10, and
above
11, with
for Theorem
7.
completelymonotonic
the theorems
suitable
by simple
The
unique
and
established
are
the
C[0, 1]. This
established
reader, are
1, is
"
is dense
span
Theorems
for
of this section
the value
as to
specification
linear
that indicated
to
These
ty.
"
whose
leave
we
similar
(1
"
hence
without
any
(t.
functions
00
"^m(0
to
geometrically
converge
from
(6) that
U^m
tends
uniformlyto
by Theorem
Since
zero.
conclude
11
A "
Let
.
knQ
the derivative
that for
denote
the last
"
at
0 and
t)] " Xt{\
k
integer
2
t
T,(0
1 of
sufficiently
largethere
"
"
"i"J,t)
"^,,"(0
"
t)]
t/ô^l
with
t)]
0.
It is immediate
Proof.
U^m
m
0 is
"
exists
an
"
We
1 and
"1,
nîX)such
we
that
t)
for which
UV{\
/(I
kn^
t)] " Ck^
"
m.
obtain
" Cp",
Cp""o+i)fc
1=0
where
=
Theorem
converges
14.
We
first establish
simple calculation
shows
"
m,
we
obtain
the
result
for
specialfunctions
-t)"
U{t') -f"
continued
upon
Ct{l
of
application
conclusion
now
with
1 "
"
oo.
summation
that
t/'[r(l
-
t)] "
U%t')
"/""(r)"
i=m
The
U^[q(t)]
t).
and
-C
oo, then Urn
that
-Ctil
For
1.
is continuous, \q\\)\
"
"
Ifq{t)
ooand\q'(0)\
geometrically.
Proof.
A
?}nno+l) "
U\t{\
t)).
i=m
follows
from
Theorem
13.
The
generalfunction
satisfying
q{t),
Theorem
hypothesisof
the
and
Pi(0
PaCO
the first
this fact and
observe
We
14,
which
part
at 0 and
agree
of this
from
bounded
be
can
above
The
below
and
result
by two polynomials
from
directly
follows
now
proof.
easilythe identity
Ut
U and
Applyingsuccessively
={:^
\)t{\
t).
obtain
adding, we
00
"f",^, lim
C/-?
W"
the
consider
2 K^^
1)
(7)
0-
of calculation.
the dependence of (f)^,^

on
describing
remarks
Some
"-00
is useful for purposes
This
(a +
and
in order.
are
We
:
followingidentity
w
Ula
"
Uâ,.{U,,. U,;.'W::r\
Va-,0.'
=
(8)
"
1=0
If
/(O
function
is any
with
bounded
derivatives,then
the
by
mean-value
that
theorem
" 1(1
Ktâ.a f/a',a')/l
[/(Ô -/("t'0]+ ?[/("-^ (1
"
-/(a' +(1
C{\o
"
Applyingequation8
by
obtain
we
Theorem
2,
f(t)
to
-aOOll
a'\ +\oL
a'|)r(l
t).
that
^^^
remembering
"f"a',a',
"x)t)
are
preserved
inequalities
obtain
we
\U^^"l"a'
a'
4"a',J C{\a
"
ja
a'\ +
"
a'l) ^
U^td
t)).
i=0
Allowing "
to
go to
easilythat
have
we
oo,
" ^(k
\i"a,a 4"a\a'\
K(rj) is finite,providedthat
where
It is worthwhile
of the unit
square.
Next
^ff,a(^) ^=
observe
0 and
by
(f,(x)
=
at
0.
and
"
(7
I and
0, 0(1)
of the unit square

"
1 is
arbitraryand
only
1, is (f"(x) 1 for 0
"
TT
while when
"
0, 0
"
"
(1
^'"^)"
\, then
00
"â,a TT
=
a;
calculated
easily
are
00
the
boundary
1, then
L^'X,
have
we
"
1.
and
that
is continuous
(f"
provided
continuous
fixed point "f"
'f"a,l 1
when
1, then for
then
that
that
the
1 then
"
the solutions
a
0(1)
"^(0)=0(0"a?"l)
0(a;)
I.
t] "
"
(a, a) lyingon
for
^
verification
0 and
hence
"
+ x"f)(x).
x)(f"{ax)
"
point with "^(0)
fixed
when
Similarly,
satisfying0(0)
boundaries
If 0
is
and
(f"(Gx),
^"
"'!),
then
I;
"
of
direct
a'cr,o'
a,
nature
1/(1, (1
Therefore, if ^
r] "
"
the
discuss
to
First, we
let
^^'l+
"
On
the
turn
out
other
as
at
two
follows
L"
where
0
on
a")
(o-",
the
'Pa ,a (0)
(^^
0.
Also, for any

0
as
loss of
are
We
to
tend
to the
in
"
1, and
"
a;
any
"
0"x"l
we
^(t)uniformly,for
"
with
positive,
are
interval of the form
any
1.
(t"
"
over
equi-continuous
which
be
subsequence
may
^^
select
can
8 "
"
that 1
I, the first derivatives
"
"
I,
a^ "
1 otherwise.
(f"a,a(^)
assume
0"a:;"l
0,
limit
We
boundary.
(ap,0) with
-^
interval
this impliesthe
" 1,
"f"a",a^
(a, a)
generality may
monotonic
and
increasing
interval
Since
"
where
case
the
investigate
now
we
convex,
interior
"
"
As'
1.
get T(l)
1 and
T(0)
similarly
*F at
of
continuity
the
"f)"
^
"!"" convergingto
as
(5 "
we
the
subinterval,and
denoted
and
a.)x. Finallyfor
"
identitymapping.
is uniform
without
uniformlybounded.
are
5"
the
(1
pointwiseto 0 for 0
convergence
0); then
(o-Q,
Therefore
0.
"
a"
gives
studyingthe
to
-*
to
allow
we
as
"ô converges
Moreover,
Let
and
the
to
3,
that
show
we
0^
reduces
for definiteness
attention
and
operationL appUed
U
operator
dependenceof
our
the
I and
1 the
uniform
The
0.
of
convergence
guarantees
iââr
zero.
Put
Ur
the
consider
We
take
We
U^^^Q,and
"j"r
"i"a^,a^.
:
followingidentity
C/oT
Uq
Ua^ôL,,
(T
fixed
^,)
(t/,T
t/oT)
|/i| |T
trivially
\; then
"
{"i", t/,^0
^^| "
"
/i
when
/g.
is
sufficiently
large.Also
I/2I 10,
=
t/,T|
\U,4r
Ur'y]
1(1
+ (1
a:[";6,(a,
+
for
But
ic
aô "
xMria^x)
observe
fixed,we
that
a.^)x) T(a,
"
varies
oi.j)xq
"
(1
"
(1
a,
-T(cT,a;)]
in
"
an
a,)a;)]|.
interval
T
appliesto
convergence
for r large. Thus
S yieldsl/gl" e.
inside 0 " a; " 1
By construction, \I^\"
and
verification
for x
I.
T
for
"
0
direct
infer
the
a^
1,
"
we
t/^T
by
equality
T
T with T(0)
1 and
the fixed pointto the equationC/o^
0, T(l)
However,
"
"
as
a,
0, and
the
same
of "^,
uniform
Oj-x. The
"
continuous
is the
at 0 is
appliesto
the
furthermore
established
the
rj "
a,
"
(a, a)
The
I and
1 and
T(l)
and
hence
-^
0)
(o-p,
with
(1,a^) with
Summarizing,we
yieldto simpleranalysis.
Finally,a
word
values
lie
parameter
on
have
the following
satisfy
continuity
properties:
fixedpoints (ji^â
0
CTq "
a^
function
4"a",a"
converges
"
a,
a'
"
rj,
"
a)
If((y,
that
deduce
we
the hmit
Thus
is
" ^(^)[k
l"â,a "â',a'l
If{a, a)
1.
independent of CTq " 1- ^ similar analysis

of the
(1, a) (a " 0). The continuityproperties
boundaries
two
15.
a.' "
"
a;
that T
followingtheorem
Theorem
"
note
where
case
for the other
solution
1 for 0
subsequence of a"
for every
same
pointwise.We
IfO
^{x)
"
1, then
0, then
then
"
f^'l+ !"
"'!].
"
0 pointwisefor0 "
"f"a,r".(^)
1 pointwise
for 0
'i"a,rj"x)
-^
-^
concerningconvergence
the boundary. When
of
a
V^tt
=
0,
for
o-
"
tt
a;
"
1 and
"
a;
"
continuous
1, then
(f"aj^l)
=
1.
1.
V^tt
when
the
converges
model
2. In this second
is at
1
a:,.then
(1
-^
the random
a)x with
"
walk
is described
follows:
as
and
"f"ix)
probability
If the
particle
probability
with
ax
-*
where
"i"{x),
"
"ix"\.
\"f,{x) cf"{y)\
-
analogous transition operator
The
(1) becomes
to
/-(ai-aVd-a)
V/ff
(1
the same
In
this
are
section,
considered
operate
the
are
We
"
1 ; the
"
a,
but
operator
similar
T is
conjugateto
^(0
is monotonic
that
"f"{t)
where
case
A +
a)/].
The
(10)
values
spaces
for
and
which
on
Theorem
to
manner
they
obtain:
1, we
the operator U.
This
increasing.
A +
fxt,where
Let
boundary
of great interest.
not
Again, in
1.
assume
where
important
A "
The
further
now
"
+ "^(077[a+ (1
i"{t)]7riat)
handle
in
as
17.
case
then
to
easy
same
[l
take
we
is
Theorem
the
before.
understandingconcerningF applyingas
Un
(9)
Jo
with
4"(t)dF(t),
+
i"{t)){dF{t))
1 ; and
^" "
model
whenever
includes
A +
0.
Theorem
The
18.
operator
and
preserves positivity
positivemonotonic
creasing
in-
functions.
the
Since
the
0(1)
Theorem
converges
UiT
TT
This
complete the
Theorem
Proof.
functions
by
spans
The
Since
well-known
that
assume
if
can
"^(0)
"
treated
"
in
0,
analyze
analogous
we
an
exists and
"^'(0)
0, then
increasingbounded
monotonic
^(0)
or
be
and
is finite.
then
positive,
carried
operators U^n
\\U"\\
dense
theorem
out
subset
1, and
converge
the
space
uniformly
forany
of all monotonic
of the set of all continuous
continuous
function.
continuous
positive
follows
functions, the theorem
of Banach.
Theorem 21. For any distribution $F$, the distributions $T^n F$ converge as distributions to a unique distribution $G$ for which $TG = G$ and which is independent of $F$.
Proof.
Theorem
U^tt
constant.
easilyusing the techniquesemployed above.

fixed
(f){t)
easilyyieldsthe fact that the only continuous
in
functions.
similar
the
used
The
is
to
constant
are
proof
proof
fact directly
with the result of Theorem
21 below.
connects
First,
function
of
for
continuous
of
V^tt
77(0.
proof
any
convergence
20.
circumstance
other
now
proof can be
hypothesison
6.
Theorem
we
The
If Tr{t)is
uniformly
The
pointsof
19.
to
The
we
1.
"
Furthermore,
manner.
^(0 implieseither ^(1)
hypothesison
where
case
verification.
Direct
Proof.
1 6.
The
To
weak*convergence
complete the proof
of r"F
we
must
follows
from
directly
establish
that
Theorem
if lim
T'^F
20
and
and
lim
T^H
K, then
Indeed, let T
K.
denote
function.
continuous
any
have
We
that
(T,
K)
lim
91"
(T, T%F
H))
lim
"-C0
M"
(t/'^T,F
H)
]
"(1"^-/'
a{ \dF
\ dH
^0
"-00
(11)
as
F and
/f
distributions.
are
Hence
{
W{t)dF{t)
=
for any
function
continuous
It
fixed distribution.
Theorem
if{G^,a")
T"^,a"
T for
about
1, then
"
cr, a
K.
the
complete
it in
at
F^^^^^ F^^rt
-^
the
distribution
7t(/)denote
any
at
every
denote
We
function of
continuous
of this
nature
later section.
{a, a); by Helly'stheorem
-"
Let
T^,^.
determine
more
say
"
(cr",a")
Let
and
therefore
to
distribution F^r^ is
Fa^ ,0L" converging to
the
The
with
((T,a)
-^
shall
We
22.
Proof.
Fr
T, and
extremely difficult
seems
a,
unique
it by
that
a;
is,
point of continuity of
choose
a subsequence
point. Write Tj. for
continuity
we
every
can
fixed continuous
function.
consider
We
quantity
(tt,F
Since
F^
Now
-"
note
we
TF)
C/
(77,F^)
find for
(77,TF^)
(77,FF^
TF).
F
sufficiently
largethat [(77,
F^)! "
"
e.
that
U"^ ^^
F,)
we
distributions,
as
1(77,
Fr)
Since
(77,F
C/^77
converges
(77,TFr)\
f/
to
strongly
converges
U-n.
uniformly to
\{Ur-n
1(77,
F,F,)
"
(77,FF,)|
as
F^
it follows
verify,
we
distributions,
are
lt/^77
max
|(t/,77 C/77,F,)l.
is trivial to
U^^^,as
Whence,
Utt, Fr)\
"
Utt\
"
"
that
infer that
I
when
is chosen
largeenough. Evidently,with
1(77,
T{Fr
Therefore
Since
obtain
we
is any
77
Theorem
21.
for
function,
Consequently, as
22
is
any
FF)|
infer
we
limit
largewe
\iUn, Fr
F
largethat 1(77,
continuous
of Theorem
F))\
get as
F)\
"
distribution
hence
and
TF
of F"
that
e.
3e, and
"
before
(77,F)
therefore
must
(77,TF).
F^,^ by
be
F^ ^
In
this case
the
clusion
con-
immediate.
now
m
"
3. The model considered in this section is with $\phi(x) = 1 - x$; $\phi$ is monotonic decreasing. The operator $U$ becomes

$$U\pi(t) = t\,\pi(\sigma t) + (1 - t)\,\pi(1 - \alpha + \alpha t). \tag{12}$$

Note that we have replaced $\alpha$ by $1 - \alpha$. This is only for convenience in Theorem 28, and does not restrict any generality. In this model there is greater probability of moving back into the interior the closer the particle moves to the ends 0 and 1. The situation described here is of completely reflecting boundaries. Again it is easy to show that the only continuous fixed points $U\pi = \pi$ are the constant functions. Therefore, we shall find as in § 2 that the distributions describing the position of the particle converge to a limit distribution independent of the initial distribution. We first proceed to analyze convergence properties of $U^n \pi$. In this case it is no longer true that $U$ preserves the class of positive monotonic functions. Only positivity is conserved by the mapping $U$. However, a new quality described in Theorem 23 serves here as well. Throughout this section, in order to avoid trivial changes of proof and different results at times, we suppose that $0 < \sigma, \alpha < 1$.

Theorem 23. If $\pi(t)$ has a continuous derivative, then

$$\max |(U\pi)'(t)| \le \max |\pi'(t)|,$$

with equality holding if and only if $\pi(t)$ is linear.

Proof. By direct computation, we obtain

$$(U\pi)'(t) = t\sigma\,\pi'(\sigma t) + (1 - t)\alpha\,\pi'(1 - \alpha + \alpha t) + \pi(\sigma t) - \pi(1 - \alpha + \alpha t).$$

Hence, with the aid of the mean-value theorem we get

$$|(U\pi)'(t)| \le |t\sigma\,\pi'(\sigma t) + (1 - t)\alpha\,\pi'(1 - \alpha + \alpha t)| + |\pi(\sigma t) - \pi(1 - \alpha + \alpha t)| \le [t\sigma + (1 - t)\alpha + (1 - \alpha + \alpha t - \sigma t)] \max |\pi'(t)| = \max |\pi'(t)|. \tag{13}$$

If equality holds, then let $t_0$ denote a point where $\max |(U\pi)'(t)| = |(U\pi)'(t_0)|$. It follows easily from (13) that

$$\max |\pi'(t)| = |\pi'(\sigma t_0)| = |\pi'(1 - \alpha + \alpha t_0)| = \frac{|\pi(1 - \alpha + \alpha t_0) - \pi(\sigma t_0)|}{1 - \alpha + \alpha t_0 - \sigma t_0}. \tag{14}$$
This
yieldsthat 77(0
between
chord
a
ct/qand
"
is linear
a
crô"
Theorem
max^
Theorem
converges
Remark.
and
at^,
or
a)
ato
otherwise
than
greater magnitude
also
these points. Equation(14) shows
the
somewhere
slopeof
the
24.
The
proofis similar
25.
to
// 7t(?)possesses
uniformlyto a constant.
The
reason
so, will be
necessarily
that of Theorem
two
why the two

explainedlater.
"
be linear.
// 7r{t)belongsto C"* [7r(;')

possesses m
in
b
ounded
is uniformly
r (0
n for each
|(t/"77-)''"'(/)|
Proof.
U^TT
"
slope has
in (13) requires
7T{t)to
impliesthat equality
then
"
(1
that at^ and (1

by Tr(t)at
then
maximum
points of 7T'(t).Repeating this argument successively
subtended
a/g) are
for
at^ the
at(^
"
"
l^'(l
\TT'{t)\ |7r'(cTô)l
max
"
derivatives],
m).
10.
and
derivatives,
continuous
cases
continuous
"
and
are
5^ a,
then
distinguished,
In view
Proof.
of
L/"v
of functions.
We
Thus
thus
can
by
virtue
of Theorem
derivatives
second
(U^tt)' constitute
equicontinuous
subsequence rij such that U"'tt converges

that
It follows
trivially
uniformly to "l"'(t).
a
23,
KC/^'TrVI
max
and
U^tt
select
and
(U^'tt)' converges
uniformly to (j)(t),
ifrii+i^ tends uniformly to Ucf)and
Moreover,
24, the first and
Theorem
23 and
uniformlybounded.
are
famiUes
of Theorem
"
|(C/"'+M'l"
max
|([/"'+i77)'|.
max
(15)
Hence
lim
\{U'''tt)'\ lim
max
i"fCO
("""00
Therefore, by the uniform
|"^'(0I
23
i(t/20)'(O|.
max
secure
and
U4"{t)are linear. However, if a 5^ rr and "f"{t)
yieldsthat 4"{t)
forces
U4" is quadratic. This impossibility
t, then
4"{t)to be
that
Let / be chosen
sufficiently
largeso
with
term
derivatives, we
|((7"A)'(0l
max
InvokingTheorem
i""-oo
of the
Kf/^'+V)'!.
max
convergence
max
contains
Kf/^'+^Tr)'!lim
max
a constant.
identically
$|U^{n_i}\pi - c| \le \epsilon$. Then

$$|U^{n_i+1}\pi - c| \le t\,|U^{n_i}\pi(\sigma t) - c| + (1 - t)\,|U^{n_i}\pi(1 - \alpha + \alpha t) - c| \le \epsilon.$$

Repeating this argument shows that $|U^{n_i+p}\pi - c| \le \epsilon$ for any $p$. This establishes the result.

Theorem 26. If $\pi(t)$ is continuous and $\sigma \ne \alpha$, then $U^n \pi$ converges uniformly to a constant $c$.

Proof. The space of all functions with two continuous derivatives spans a linearly dense subset of the space of all continuous functions. Since $\|U^n\| = 1$, using Theorem 25 we obtain the result by the well-known theorem of Banach.

We next establish the uniform convergence of $U^n \pi$ for the case where $\sigma = \alpha$, with $1 > \sigma > 0$. We note the interesting fact that in this case $U$ applied to a polynomial does not increase its degree. Particularly,

$$U x^n = [\sigma^n - n\sigma^{n-1}(1 - \sigma)]\,x^n + P_{n-1}(x),$$

where $P_{n-1}(x)$ denotes a polynomial of degree $n - 1$.

Theorem 27. If $P(t)$ is any polynomial, then $U^k P$ converges uniformly to a constant $c$, and the convergence is geometric.

Proof. The proof is by induction on the degree of the polynomial. Suppose we have shown for any polynomial $P_{n-1}$ of degree $\le n - 1$ that the iterates $U^k P_{n-1}$ converge uniformly. To complete the proof, it is enough to verify that $U^k x^n$ converges uniformly. Let $\lambda = \sigma^n - n\sigma^{n-1}(1 - \sigma)$; then $|\lambda| < 1$ since $1 > \sigma > 0$. We obtain

$$U x^n = \lambda x^n + P_{n-1}(x).$$

Repeating, we get, for $k \ge 1$,

$$U^k x^n = \lambda^k x^n + \sum_{i=0}^{k-1} \lambda^i\, U^{k-1-i} P_{n-1}(x).$$

This last sum is of the form $c_k(x) = \sum_i a_i\, b_{k-i}(x)$. It is a well-known theorem that $\lim_{k \to \infty} c_k(x)$ exists uniformly whenever $\sum |a_i| < \infty$ and $\lim_{k \to \infty} b_k(x)$ exists uniformly, so that $U^k x^n$ converges uniformly. Thus, any polynomial converges uniformly to the fixed point, which must be a constant function. Finally we note that in the case where $\sigma = \alpha$ (the rate of learning, so to speak, is the same regardless of the outcome of the experiment), then $U^k P$ converges geometrically. The proof can be carried through by using induction.
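The action of $U$ on polynomials can be checked mechanically. The sketch below is an illustration rather than part of the paper. It assumes the operator of this section in the form $(UP)(t) = t\,P(\sigma t) + (1 - t)\,P(1 - \alpha + \alpha t)$ and represents a polynomial by a coefficient list, where `coeffs[k]` multiplies $t^k$; the function name `apply_U_poly` is hypothetical.

```python
from math import comb

def apply_U_poly(coeffs, sigma, alpha):
    """Coefficients of (U P)(t) = t*P(sigma t) + (1 - t)*P(1 - alpha + alpha t),
    where coeffs[k] is the coefficient of t**k in P."""
    out = [0.0] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        # Contribution of t * c * (sigma t)**k.
        out[k + 1] += c * sigma ** k
        # Contribution of (1 - t) * c * (1 - alpha + alpha t)**k,
        # expanded with the binomial theorem.
        for j in range(k + 1):
            term = c * comb(k, j) * (alpha ** j) * ((1.0 - alpha) ** (k - j))
            out[j] += term
            out[j + 1] -= term
    return out

if __name__ == "__main__":
    # With sigma = alpha the t**3 coefficient of U(t**2) cancels,
    # so the degree does not increase.
    print(apply_U_poly([0.0, 0.0, 1.0], 0.5, 0.5))
    # Iterating drives the polynomial toward a constant.
    p = [0.0, 0.0, 1.0]
    for _ in range(40):
        p = apply_U_poly(p, 0.5, 0.5)
    print(p[0], max(abs(c) for c in p[1:]))
```

For $\sigma = \alpha = 1/2$ and $P(t) = t^2$ the leading coefficient after one application is $\sigma^2 - 2\sigma(1 - \sigma) = -1/4$, which matches the formula for $\lambda$ with $n = 2$. When $\sigma \ne \alpha$ the coefficient of $t^{n+1}$ is $\sigma^n - \alpha^n$, so the degree grows.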
This
fact
yieldsthe
that
that
Proof.
Similar
We
note
V^TT
now
Theorem
unit
that
not
for the circumstance

for
a
1 "
the
U
1
(T
"
can
0,
return
Theorem
r
"
or
conclude
(7=0
when
particular,
77
is
"
0,
"
The
only
boundary
"
case
where
the
1 it is not
of f/" is trivial.
However,
V^r^-n converges
For
when
even
produces
"
for every
of the
and
77,
=
that
to show
hard
function
cr
of
argument
mention
cr
longer true
no
C/^"+^7r converge
We
traverse
dense.
1 and
0
and
continuous
now
29.
to
the
0
hypothesis
"
a,
then
If -nit)belongsto C",
"
(t/*77-)"^'(/)
converges
uniformly
for
m.
Proof.
This follows
easilyfrom
Theorems
f'x/o
TF=\
rix
dF(t)
Jo
This
is
continuous
convergence.
then
0 it is
formly.
uni-
converges
polynomialsis
otherwise.
convergence
of
lack
again a
1 "
the
V^tt
and
U'^'^tt
1 and
polynomial.The
for which
and
case
a,
for every
necessarily
converge
where
when
occurs
-n.
We
"
In
identityoperator
we
function
behavior
of all
periodicphenomenon
A
in this case
as the quantity
square for this model.

does
V^ry^TT
when
occurs
0, then
"
set
in this
geometricallyto
highermoments.
the
important example that
difficult convergence
that
26, since
It is easily
verified that
down
breaks
27
and
continuous
Theorem
to
the
converges.
separatelybut
other
is
If -rrit)
28.
expected positionconverges
similar results valid for
with
limitingexpectedposition
Theorem
the
represents the transition
ix
"
24, 26, and
28.
Let
D/a.
(1
/)dFit).
Jo
law
for the
distribution
the position
of
describing
the
for
particle
this model.
sections, we
Tand
between
those
to
theorems,
following
in the
employed
using the
preceding
conjugaterelationship
For any distribution $F$ the distributions $T^n F$ converge as distributions to a unique distribution $F_{\sigma,\alpha}$ for which $TF_{\sigma,\alpha} = F_{\sigma,\alpha}$ and which is independent of $F$.
in the
31.
of
sense
it
Again
The
distributions
Theorem
seems
F"^
constitute
tions
family of distribu-
continuous
11.
difficult to
very
determine
explicitinformation
more
any
iv,aThe
4.
model
at least 1 "
and
analogous
30.
Theorem
about
the
U.
Theorem
to
By arguments
establish
can
examined
Utt
Of course,
or
before,0
as
elementary in
this
"
The
//.
{Xx
is such
1.
"
cr
a,
that
he
"
bounded
immediate
An
to
-\- i.i,with
X +
fi
"\
"
which
ax).
V^tt
is
(16)
turn
out
to be
very
easilyproven.
then
derivative,
\Tr'{x)\
max
\.
"
/h)tt{\
"
with
he
the form
followingtheorem
"
\{Utt)'{x)\
max
4"{^)
"
Convergence questionsfor
of the
If Tr{x)has
(1
has
operator
iu)tt(ox)+
in view
case
32.
Theorem
"
here
0.
Let
the standard
we
way,
Theorem
of Theorem
consequence
denote
33.
distribution F^^^ which
the
obtain
For
is
transition
operator
{V^tt)'converges geometrically
is that
of distributions
for
this model.
In
distribution
any
a
32
the
distributions
functionofio,
continuous
a), and
T^F
TF"^
converge
to
the
F^,,^.Moreover,
is independentof F.
"F(j
o;
5. This section is devoted to some variations of the preceding models. The first is that in addition to the two impulses of motions towards the two fixed points 0 and 1 by the transformations $F_1 x = \sigma x$ and $F_2 x = 1 - \alpha + \alpha x$, we allow the added feature of the possibility of a third motion where the particle stands still with certain probability. These new models are particularly important in statistical learning problems, and much investigation on this type has been done by M. M. Flood [5]. They are referred to as the pure models.

The mathematical description of the first model of this type is as follows: A particle $x$ on the unit interval is subject to three random impulses: (1) $x \to \sigma x$ with probability $\pi_1(1 - x)$; (2) $x \to 1 - \alpha + \alpha x$ with probability $\pi_2 x$; and (3) $x \to x$ with probability $(1 - \pi_1)(1 - x) + (1 - \pi_2)x$, where $0 \le \pi_1, \pi_2 \le 1$. This is similar to model 1 where absorption takes place at the boundaries 0 and 1. The operator analogous to (2) becomes

$$U\pi = \pi_1(1 - x)\,\pi(\sigma x) + [(1 - \pi_1)(1 - x) + (1 - \pi_2)x]\,\pi(x) + \pi_2 x\,\pi(1 - \alpha + \alpha x). \tag{17}$$
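The three-impulse model can be written down directly. The following sketch is only an illustration under stated assumptions: the names `impulse_probabilities` and `apply_U17` are hypothetical, and the parameters `p1` and `p2` stand for the constants $\pi_1$ and $\pi_2$ (renamed to avoid a clash with the test function $\pi$). It checks nothing beyond what the formula itself implies: the three probabilities sum to one, and the operator leaves the values at 0 and 1 unchanged.

```python
def impulse_probabilities(x, p1, p2):
    """Probabilities at position x of the three impulses:
    move toward 0, move toward 1, stand still."""
    toward_zero = p1 * (1.0 - x)
    toward_one = p2 * x
    stand_still = (1.0 - p1) * (1.0 - x) + (1.0 - p2) * x
    return toward_zero, toward_one, stand_still

def apply_U17(pi, sigma, alpha, p1, p2):
    """Operator (17) as a map on functions of x."""
    def U_pi(x):
        q_zero, q_one, q_stay = impulse_probabilities(x, p1, p2)
        return (q_zero * pi(sigma * x)
                + q_stay * pi(x)
                + q_one * pi(1.0 - alpha + alpha * x))
    return U_pi

if __name__ == "__main__":
    test_function = lambda x: x * x  # arbitrary continuous function
    U_pi = apply_U17(test_function, 0.6, 0.7, 0.4, 0.9)
    print(sum(impulse_probabilities(0.25, 0.4, 0.9)))
    print(U_pi(0.0), U_pi(1.0))
```

With `p1 = p2 = 1` the stand-still term vanishes and the operator reduces to the two-impulse form $(1 - x)\,\pi(\sigma x) + x\,\pi(1 - \alpha + \alpha x)$.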
transition
operator
denote
Again, let T
into
particle
setup, and
increasingfunctions.
If '^,
Uv
over
changed
into
"
convexity,and
analogues of
and
methods,
obtain
we
[1
that
computation.
4 does
Theorem
to
noting that
for tt-^
773
not
1 the
carry
have
we
here
condition
on.
easilyextend to
converges uniformlyto a
U^n
this model
limit
the
by
given by
(18)
+ "â,a,.".3(^)^(l),
"^",a,.".,(^)]'^(0)
of concavity.
Moreover,
5, 6, 7, and
Theorems
that
verify
property of monotone
the
direct
obtain
we
so
0,
"
analogue
34.
" 2,
to
It is easy to
Theorem
ifand onlyif
"
through by
of the
compared
as
experiment.
U.
0 the property
tt' "
in Theorem
stated
of
preservation
The
and
-n
remainder
the
the condition
of " 1 for
same
that
remark
We
under
be carried
proofcan
The
Proof.
1)
+ TTaCa
(t)77i
the
locating
0, then {Un)"
"
distribution
of the
also preserves
obtain
we
n"
'^'^nd
with
preserves
3 and
2 and
(1
otherwise
the
maps
end
conjugateto
consequently
Furthermore,
34.
Theorem
T'\s
of Theorems
fulfillsthe conditions
and
which
at the
corresponding distribution
the
for this
is valid
the
0 and
fixed point of U"f"
(f"with ^(0)
unique continuous
of
function
of
"f){\) 1.
The theory of geometric convergence, continuity of $\phi$ as a function of $\sigma$, $\alpha$, $\pi_1$, and $\pi_2$, and the form of the limiting distribution of the particle established for the model of §1 remains valid with slight changes in the proofs. The general conclusion is that introducing a probability of standing still has no effect on the convergence or its limiting form, provided only the essential feature of absorbing boundaries still prevails.
Finally,in this connection we remark that for special
is
(ftaa,-^
,-n
where
The
the
parameters tt^ and

points; for example, n^
^2 the
of the
values
of the end
6.
Hê
"
but
those
in the unit interval.
includes
cases
and
so
The
on.
drift to
followinggeneralnonlinear
with
from
"f){x)
probability
to
this
function
The
ax.
that
case
the
examples
"
"f"{x)
types of
is
one
or
other
about
to
a;
0 and
"
discussed
models
the rate
one-dimensional
1
cc
"
""
2 and
4.
"
(f"ix)
in
ccx
with
only continuous
6 "
in
investigated
[1
"f"(t)]
dF{t)
""
the
d "
1 and
However,
of convergence
4-
jo
T is
become
3,
in
of derivatives,
transition operators become
TF=\
and
may
0.
stronger results
much
obtained
motion
TTg "
excludes
This
of the
subcases
some
we
0,
in this section, the
treat
The
particlemoves
learningmodel.
1
from
and with probability
"f"{x)
for
additional
important requirement
for all
entire
dF{t),
(20)
aO.
(21)
Jo
adjointto
(t/77)(0
=
(1
mX"^t)
^(0^(1
We shall show that $U^n\pi$ converges uniformly for any continuous function $\pi(t)$. The proof of this fact shall be based on the following highly intuitive proposition. Let an experiment be repeated with only two possible outcomes, success or failure at each trial. Suppose further that the probability of success $p_n$ at the $n$th trial depends on
a
Consequently,
as
/^
00,
-*
trials times
I\
that
of
run
success
lengthr
is certain to
0, since /g is bounded
K.
the
On
other
twice
by
in view
hand,
happen in finitetime. In particular

of no success
in n
run
probability
and equation(22) we
lemma
secure
the
of the
CA*-. Therefore,
"
iim |C/"77(a;) t/"7r(2/)|

" CA'-,
-
n"
which
be
can
exists for
for every
made
"CO
small
arbitrarily
as
shows
if
Hence,
oo.
lim
"/"77(2/)
=
lim
C/"77(x)
Since
x.
be found
subsequencecan
lim
one
^-
y, then
single
i"
for
and
for all x,
hence
If^'Trix)
"-oo
argument
an
that
so
used in the close of the
proofof
Theorem
25
that
lim
t/"77(x)
a.
W-"00
The
lemma
easilyimpliesthat
||"/"|| 1, we
can
35.
Theorem
to
the
up
Theorem
// "f"{t)
belongsto C",
uniformin
Theorem
of F
independent
This
37.
For
with
TF^^
Finally,
random
note
walks
states
at the
in
two
""
with
distributions F,
any
any
F^,^ and
ends
0 and
in
C", then
we
distribution Fg,^
to
a
converges
with respect to o, a.
T^F
of the
used
of
of T and U.
conjugaterelationship
be
can
employed to analyze
in this section
impulses.
(1
a,)w,-+
cc^x.
the
investigate
nature
In
the
case
where
the
of the
distribution
limiting
were
absorbing
boundaries
distribution is discrete and concentrates

limiting
1
distribution F and
at
weight
depends on the starting
find that the
The
1
.
is
nit)is
continuous
F(j,a
account
on
number
models.
5, we
and
that the method
various
1 and
uniformlyconverging
""
the present section
in the
as
exists
fact that
"oo
FiX
7. In
V^v
the
follows:
t.
follows
last theorem
we
lim
as
0
(C/'^7r)"""(/)
lim
obtained
then
Using
model
""**'
W-"
the
for this nonhnear
If v^t) is continuous,
36.
with convergence
is uniform.
convergence
the conclusions
limit.
constant
sum
given by
"1
i"g^^(x)dF(x),
J.
where
Many
is the
with ^(0)
uniquecontinuous fixed pointoi U"f) "f"
of "^"j,a
are
developed in those sections. In all the
properties
^^^
1.
0 and
^(1)
other
types the
ergodicproperty
Let
deal
us
and
the
with
-*
"
relevant
"
examine
Fj
case
and
(0, 1
(b) first. We
the
F^ applied to
a). Any
"
(ct^,
(1
intervals
note
that the union
unit
interval does
a)a) and
"
(a(l
the limit of the total set covered

Cantor
set
It is
C.
set
a), (1
"
by
F^^
and
overlap with
must
"
the
the
(/
givenby
Let
a.
"
us
F2[0, 1] of
subinterval
additional
this way,
its full
concentrate
with
aa;
are
open
two
2) in
is
by F^^^.
(b) ct
empty
the
walk
+
"
image sets ^^[0,1]
leave
Fg
operators
a)^).Proceedingin
"
random
denoted
"
of Fi
applications
that
seen
easily
be
of the
not
of F^ and
applications
two
The
the
with
ax
F^x
givenby
w
here
^ " j"{x)" (5 " 0. The
1
"^{x),
probability
distribution
equations(20) and (21). Let the limiting
We
1
two
"
now
cases:
(a)
distinguish
a;
independentof
limitingdistribution was
followinggeneraltype.
and x
(pix),
F^
probability1
to hold
seen
was
initial distribution.
401
KARLIN
SAMUEL
open
find
we
that
any arrangement is
on
probability
this
C.
let
Now
..
{X) =\
fl,ifx=fo
77-,
We
^
show
that
except
then
at most
neither
is zero
V^nfjx)converges uniformly to zero. Note that UTT^J^t)
value
of
if
Of
/
" t^
one
; namely, F^^/qor F~^?q.
course,
for
exists for that t^; and
inverse
lUntJ "
otherwise
1
[cf"(x),
max
exists and
only one
"l
cf"(x)]
every
\ "a,
"
cr
8.
L/^tt^ " (1
Similarly,
from
5)'*,
"
the probability
of
Consequently,
up,
have
we
established
distribution
We
of the two
one
denote
0 for
at
If
zero
{probability
now
any
this for
of F~^
application
operators obtained
every
t^
A "
"
"
F~^
F^^t^
a
total
of
so
{h
"
^q
"
Summing
Let
A"
F""
"
/;; then
1 "
^ "
i"ix)"6"0,
so
/^ in the
the
TT{t)"
that
or
same
we
F^^
get that
"
exists
way
order
nit)
^
order
specific
reverse
A",
for every
as
Consequently,
first that at least
that
"
singular
set.
least F~^
denotes
denote
note
^'^\X -y\
"
at
t^ from
F"
is
in the unit interval. Let
0). Since
construct
times.
F^,^
note
the unit interval
on
"
We
a.
"
for every
t^ to /". We
largethat
"
F^^t^, where
fny^
t^. We
by passingfrom
defined
t^
steps,obtainingt^
Choose
t^ with 0
that
observe
now
limitingdistribution
point)spread on a Cantor-like
F~^ is defined
or
"
obtain
or
We
the
then
(a) where
case
ipn^
where
a,
positivefunction
/(,(sayF^^),we
continue
at
mappings F~^
continuous
for any
/q is zero
at
a
"
examine
to
subinterval
some
"
turn
F^
follows.
the assertion
38.
Theorem
which
and
of
of the
TT{t)"
7? "
since
shown
thus
could
defined
Also, U^nf
{[1
max
has
spread
by
out
entire
the
unit
term
in
interval
and
involving
t/"
establish that
we
U^nt
converges
has at most
values
two
possible
U-n-f^
and
0
while
(f"{F~^tQ)
(f"(F~^to),
respectively, Utt^
"
that
four
at most
possiblevalues
and
the
maximum
value
that
"f"{F-\)"i"{F-\),
"f"iF^\)l
[1
cf"{F-\)mF-^F-\)
for the
"t"{F-\)[\ 4"{F-^F-\)]}
of
^^^ ^^
consider
the same
repeatedU^ttiq^
section.
The
conditional
of
set up
previous
probabilities
trial satisfy
the uniform
1 " 1 "??"/?""
success
^ " 0,
inequalities
pn at the "th
in
is
taken
be
of
the
the
where
this
to
to
case
an
success
application
impulseF^
particle.
that the probability
k {k "n)
of securing
It is readily
seen
by standard inequalities
To
secure
bound
observe
we
Again
is
U^tt,
'0
for
cf"(F-\m
before.
as
this end
To
zero.
achieved
be
the
/;]covers
let TTf (t) be
elsewhere.
is
"
givenby
F^^tQand -F^^ô
at
h, t^
"
a, the operator U is strictly

If " \
positive;that is,for each
that U^-n is strictly
t
here
exists
Tr(t)
an
n
so
function
dependingupon
positivecontinuous
positive.
uniformly to
39.
Theorem
Now
F-"[tQ
this initial interval which
on
have
We
F".
all
positivefor
t/"7r is
successes
converges
deduce
as
of F
into
measures
Let
We
Proof.
denote
-nit)
/, and
on
deduce
have
Fg, where
transition
Fj
either
function
F^
f/^-n-^
-^
Thus
t.
directlythat
the
and
We
cumulative
absolutely
into
singular
measures
F^. However,
F^ vanishes.
TF^
0.
continuous
absolutely
operator transforms
measures
or
as
fixed distribution
the
uniquedistribution F"j is either absolutely

in every open interval.
positivemeasure
^^
has
^
bounded
/' of /.
We
hence
Fj is
all the conclusions
demonstrated
subinterval
and
the
"
0 for all t.
UîTf
for every
+
F-^^
the
TF^
that
continuous
closed
6 "
C/"77- "
that
and
a, then
If " \
F
urthermore,
F^
singular.
or
for
bound
it follows
Moreover,
co.
-"
continuous
absolutely
find
40.
Theorem
continuous
"
probabilityzero
"
is continuous.
we
measures,
singular
is unique,we
that
has
F^
as
Fg is singular.Observing that
and
continuous
Let
zero
successes) is
that
before
the
uniformly to
of k
maXfc(probability
distribution
maximum
in
experimentmodel
By
by
virtue
and
of the theorem
zero
outside
39 there
of Theorem
an
but the last.

open
exists
an
interval
n
such
that
note
But
1dF,^^
"
and
the
proof of
close
We
the
with
continuous.
cT
1/2
"
theorem
the
An
a, where
is
(n, F,^^)"8"0,
complete.
conjecturethat
example where
F" J^x)
x.
=
when
cr
this is the
"
1
case
"
a,
then
solutely
F^^^ is always abby "f)(x) 1 12,
is furnished
REFERENCES

[1] R. Bellman, T. Harris, and H. N. Shapiro. Studies on functional equations occurring in decision processes. RM 878, RAND Corporation, July, 1952.
[2] R. R. Bush and C. F. Mosteller. A mathematical model for simple learning. Psych. Rev., 1951, 58, 313-323.
[3] J. L. Doob. Asymptotic properties of Markoff transition probabilities. Trans. Amer. Math. Soc., 1948, 63, 393-421.
[4] W. Feller. An introduction to probability theory and its applications. New York: Wiley, 1950.
[5] M. M. Flood. On game learning theory. RM 853, RAND Corporation, May 30, 1952.
[6] O. Onicescu and G. Mihoc. Sur les chaines de variables statistiques. Bull. Sci. Math., 1935, 2, 59, 174-192.
[7] W. Doeblin and R. Fortet. Sur les chaines à liaisons completes. Bull. Soc. Math. France, 1937, 65, 132-148.
[8] R. Fortet. (These) Sur l'iteration des substitutions algebrique lineaires à une infinite de variables. Revista, No. 424, Año 40, Lima, 1938.
[9] Ionescu Tulcea and G. Marinescu. Sur certaines chaines à liaisons completes. C.R. Acad. Sci. Paris, 1948, 227, 667-669.

Received December 19, 1952.
SOME ASYMPTOTIC PROPERTIES OF LUCE'S BETA LEARNING MODEL*

John Lamperti and Patrick Suppes
Applied Mathematics and Statistics Laboratories, Stanford University
statistics
This
given for the two-operator

reinforcement.
noncontingent
application
and
collaborators,Bush
which
the
and
of the
Some
learning
behavior
response
postulate
trial to the
next, with
of their
in
the
linear
behavior
the
shown
viewpoint
,"
response
the r"{i)are
of
that
of
of
probability
depending
no
on
of
do
not
the
that
overt
like that
of
which
models
from
response
the
beta
one
reinforcing event,
fication
general psychological justi-
more
certain
of
construct
learning
in
linear
theoretical
believe
Spence
his
model
not
are
trial. Both
T-maze
is evidence
there
empirical standpoint
an
models
and
stochastic
the
offer
they
research
Research
article
some
with
experiments
yield good
simple postulates
very
there
exists
probability of
trial
on
n.
transformed
on
process
*This
of Naval
and
learning
development
in terms
transformation
stochastic
preceding
the
Hull
this
as
tingent
con-
of
predictions
rats,
actual
ratio scale
over
[7]
the
choice
on
behavior,
with
of responses
set
that
is the
p,
stochastic
This
like
postulates. From
basis
property
of this
that
the
has
where
on
explained
far
of
[1,7].
On
Luce
so
totic
Asymp-
cases
trial to trial
from
the
be
experiments, particularly
some
that
the
in
unsatisfactory
are
From
considered
motivated
transformation
linear
theorists
best
may
strength.
response
have
considerations
have
response
probability of response
empirical
model.
Galanter, [1, 7]
probability of
in
changes
functions
model.
beta
simple learning situations, Luce
various
to
Luce's
four-operator
and
are
and
For
asymptotic properties of
studies
paper
laboratories
university
stanford
results
Suppes
Patrick
and
mathematics
applied
LUCE'S
OF
was
and
appeared
A^
Additional
supported
strengths
on
trial n, and
Vn{i)is
simple postulates
linearlyfrom
response
in part by
response
trial to
then
trial,and
determines
in part by the Group

Psychology
Rockefeller
Foundation.
lead
the
to
strength
result
the
this unobservable
stochastic
Branch
process
of the
Office
the
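*This research was supported in part by the Group Psychology Branch of the Office of Naval Research and in part by the Rockefeller Foundation.
This article appeared in Psychometrika, 1960, 25, 233-241. Reprinted with permission.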
in the
way
study
subject of
to
the
infer by
in connection
asymptotic
of the
means
This
probabilities.
course
mathematical
numerous
behavior
asymptotic
interest
determine
then
the
and
nearly
behavior
of the
have
is made.
Let p" be the
El be the
response
Luce's
if Aj and
of
event
situations
to
behavior
[7]and
Luce
the
on
the
responses,
Ai
trial n, and
let
behavior.
of two
Ai
response
E2 the
of
path
transformations
one
and
encounters
alternative
asymptotic
of the response
on
of
event
reinforcing
Ek occurred
trial n, then
on
for j
by the followingtransformations:
1, 2 and k
1, 2,
2n
^,k
0. Luce
"
1 and
/3,i "
want
it is assumed
most
example,
occurrences
it is
easilyshown
fact about
(1) is
first
63
that
9^ 1.
in the
of A2E1
(Generally,we
general formulation.
effects
of
reinforcement;
primary
"
"
"
/812
/322.)
Throughout
1821
/Sn
that 0 9^ pi
important
suppose
more
ordinarilyassumed
this paper
|S,-,(1 Pn)
/)" +
1, to reflect the
/3,2"
moreover,
The
[7]gives a
it is
that
trials there
occurrences
the operators commute.
61
are
of A1E2
64
For
of AiEi
occurrences
of A2E2
occurrences
62
; then
that
CX)
ri
Pi +
The
The
their
characterized
is then
Pno
where
for
"
beta model
(1)
beta
would
be
learning data
strengths v"(i)and
response
nonlinear
Ai
simplest
"
any
in which
reinforcingresponse
the
probabilities a
response
taken
probabilityof
that
seem
equation given above the

is pursued rather far by
restrict ourselves
A2
the
of
studying directly the properties of the

to obtain results on
probabilities
response
We
would
with
difficulties. We
it
probabilities.
Superficially,
response
to
aim
of the
present
paper
/3ii/32i/3i2/822(l
Pi)
-
is to study asymptotic properties of the beta model for certain probabilistic schedules of reinforcement. The standard methods of attack used by Karlin [4] and by Lamperti and Suppes [6] for linear learning models do not directly apply to the nonlinear beta model. The basis of our approach is to change the state space (the probability $p_n$ is the state) from the unit interval to the whole real line in such a way that the transformations (1) become simply translations. The noncontingent case (the next section) then reduces to sums of independent random variables; the contingent cases can also be studied by "comparing" the resulting random walks with the case of sums of random variables. The probabilistic tool for this is developed and applied in later sections. The general conclusion to be drawn from our results is that for all but one case of noncontingent reinforcement individual response probabilities are ultimately either zero or one, which is in marked contrast to corresponding results for linear learning models. Absorption at zero or one also occurs for many, but not all, cases of contingent reinforcement.
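The change of state space can be made concrete with a small numerical sketch. It assumes the beta-model operator has the form $p \to p/(p + \beta(1 - p))$ and takes $x = \log((1 - p)/p)$ as the coordinate on the real line; that particular coordinate is an assumption made here for illustration, as are the names `beta_update`, `to_line`, `from_line`, `mean_step`, and `simulate_line` and the parameters `pi`, `beta`, `gamma`. Under that choice each operator shifts $x$ by $\log \beta$, so a schedule that applies one of two operators independently on every trial produces a sum of independent steps.

```python
import math
import random


def beta_update(p, beta):
    """One beta-model operator: p -> p / (p + beta * (1 - p))."""
    return p / (p + beta * (1.0 - p))


def to_line(p):
    """Map p in (0, 1) to x = log((1 - p) / p) on the real line (assumed coordinate)."""
    return math.log((1.0 - p) / p)


def from_line(x):
    """Inverse of to_line, written to avoid overflow for large |x|."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def mean_step(pi, beta, gamma):
    """Expected shift per trial when beta is applied w.p. pi and gamma otherwise."""
    return pi * math.log(beta) + (1.0 - pi) * math.log(gamma)


def simulate_line(x0, n_trials, pi, beta, gamma, seed=0):
    """Random walk on the line: add log(beta) w.p. pi, log(gamma) w.p. 1 - pi."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_trials):
        x += math.log(beta) if rng.random() < pi else math.log(gamma)
    return x
```

When `mean_step` is negative the walk drifts toward minus infinity and `from_line` of the result is close to one; when it is positive the corresponding probability is close to zero. The balanced case `mean_step(...) == 0` is the one in which the walk has no drift.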
Noncontingent Reinforcement with Two Operators

If the probability of reinforcement is independent of response and trial number, we have what is called simple noncontingent reinforcement. Let $\pi$ be the probability of an $E_1$ reinforcement, and for simplicity let

$$\beta_{11} = \beta_{21} = \beta < 1, \qquad \beta_{12} = \beta_{22} = \gamma > 1. \tag{3}$$

We seek an expression for the asymptotic probability distribution of response probabilities in terms of the numbers $\pi$, $\beta$, and $\gamma$. The random variable $\eta_n$ is defined recursively as follows:

$$\eta_1 = \begin{cases} \beta & \text{with prob } \pi, \\ \gamma & \text{with prob } (1 - \pi); \end{cases} \qquad \eta_{n+1} = \begin{cases} \eta_n \beta & \text{with prob } \pi, \\ \eta_n \gamma & \text{with prob } (1 - \pi). \end{cases}$$

The random variable $X_n$ is defined as follows: $X_n = \log \eta_n$. Then

$$X_{n+1} = \begin{cases} X_n + \log \beta & \text{with prob } \pi, \\ X_n + \log \gamma & \text{with prob } (1 - \pi). \end{cases}$$

It is clear from (4) and what has preceded that $X_n$ is the sum of $n$ independent identically distributed random variables $Y_i$ defined by

$$Y_i = \begin{cases} \log \beta & \text{with prob } \pi, \\ \log \gamma & \text{with prob } (1 - \pi). \end{cases}$$

By the strong law of large numbers, with probability one, as $n \to \infty$,

$$X_n \to +\infty \ \text{ if } \ \pi \log \beta + (1 - \pi) \log \gamma > 0, \qquad X_n \to -\infty \ \text{ if } \ \pi \log \beta + (1 - \pi) \log \gamma < 0.$$

Define now for any real number $x$
where
0 "
with
(and
1
b, (fix),
a,
the
same
and
// for all x
Lemma.
then
Pr(X"
Pr(F"
Let
Proof.
{^"} be
uniformly distributed
"
type
same
the transition
as
6, and if Pr(F"
(p(x)"
oo)
-^
"
0,
"
6 and
M, "p{x)"
random
independent
{X"} process
[0, 1].The
0")
-^
if
0.
of
sequence
on
6 and
the other hand, for x
0, then Pr(X"
of the
process
(p(x).
"
has
M, one
0. //, on
be another
constants
"
)"
oo
oo)
b) but with
of (p(x)and
in place
probabilities
{F"}
(p(x).Let
"
variables,each
referred to {^"}
will be
by letting
(10)
'X" +
if
lX"
otherwise.
This
does
lead
Choose
Fo
that
F"+i
for all n, the
so
X"
F"
"
Pr(F"
n
on
-^
S, X"
of
{X"]
and
is
X"
the
6.
0 there
"
Xq
assume
the property
the
(10) to
(p(x)"
that
that
assumption
6 and
"
that
since
F"+i
F" +
F"
(p{x)
is
is
Yq
"
our
"
impossible,
set
in the
F"
-^
in
proved
Hence
oo.
F"
sample
Pr(X"
of the
F"
oo)
using the
"
"
0. The
{^"};
oo
second
construction
same
is
for all
sequence
in the set X"
-^
oo
,
space
is contained
similar way,
oo) is positive,so
Pr(-F"^4-
the event
for all n). But
as
b/{a
2. Let
exist. Then
if a
"
and
Pr(limsupX"
while if a
"
(")
(13)
c, and
"p{x)
=
fi "
+
and
6)
lim
(11)
since
part
linking
"
Pr
(14)
0
"
and
(X"
6 "
-^
1
.
lim
^(.r)
/3
c,
lim
oo
inf X"
jS "
"
that
suppose
and
(")c,
oo)
"
({X"|ts recurrent),
=1
then
Pr(X"-^-a"
Finally,if a
for some
"
{F"}.
Theorem
(12)
if ^"+i
since
therefore
m;
and
positiveprobability,and
of the lemma
only
of Xq
{F" } with
processes,
X"+i
F" and
"
if and
of
manner
increase.
considered
be
set
the
{F"}
The
seen.
F" is also valid for all n. This follows from
proof,note
"
F"
00
,
may
"S is
only
can
complete the
To
easily be
may
some
sequences
the
the transition
M;
"
a;
"linking"
as
value
Yq for
"
inequalityX"
construction
(9)
F" +
the
"
that for those
assert
now
law
Whatever
positiveprobabilitythat X"
for
"p{X"),
{X"} by referringit after
to
M.
"
"
transition
{^"},so
sequence
same
the
to
be linked
can
process
We
^,,,1"
X".,
(4-co))
1.
c,
4- oo)
5,
Pr
(Z"
-^
oo)
"
Proof.
be
d "
"
with
process
The
c.
Yn
transition
constant
{F" } process
J2Zi
Yo +
be
may
"
{F" } (as in
Let
c.
probabiUties
regarded as
where
Pr
(Z,-
a)
e.
and
the
where
variables
and
lemma)
1"6
of random
sums
instance,that
for
Suppose,
(15)
Pr(Z,E(Z,)
But
ad
cd)
Pr(F"
Pr(Z"-^ 4- oo)
I
Similarly,if a
-^
"
if j8 "
while
this
the
Consider
the
0, since
"
of
law
large
"
this
c;
the
From
numbers.
that
impUes
lemma,
0.
"
it follows that
for convergence
obtain in the
6),we
"
^)
by
also holds
6(1
-h)
to
when
"
"
"
and
"p
that /3 "
way
probabilityis
case
(with
oo
"
same
Pr(X"
replaced by
makes
the lemma
0. Since
"
Pr(X"
-^
"
ex?)"
and
/? "
c; there is then
positive probability
-^
oo
0"
"
-^
the
remains
to
in
But
whose
oo
"
N. Now
0,
zero.
It is not hard to see that X,

but not at +
absorption at
+
with probabilityone; the idea is roughly as follows. Since X"
close to 1 for
often with probabilityarbitrarily
have Xn " N infinitely
of
and
(p
"
probabilitythat from
the left ol N
"
or
infinite sequence
probability on each
walk
some
and
goes
) " 0.
positivesince Pr(X"
event
an
necessarilyindependent trials,
be
must
of not
an
the random
to the left of N
we
oo
"
trial is bounded
from
away
zero
"
"
oo
is certain
to
and remain
walk will eventuallybecome
M, the random
with probabilityarbitrarily
to the left oi N
M, and therefore X"
other
The
close to 1 (and so equal to one).
are
cases
similar;one can think
is an absorbing or
the conditions under which
+
of a "
"
c as
c or
reflecting
barrier,etc., and the process behaves accordingly.
for any
Hence
occur.
-^
"
oo
"
oo
The
{X"}
be
the four-operator
generalizationto
real Markov
(17)
X"+i
where
.Oi
"
aa
0 "
lim
(18)
and
"pXx)
(Pi(x)"
0.
then
and
v',(a;),
"pi{x) /S,=
let
M+
methods
it is
Let
""03
I"
By
be described.
Suppose
lim
and
ai
now
x, then
with prob
a,
"+a3
2"
exist,and
a:
a;,, at
that if X"
such
process
will
case
entirely similar
o,q:,
the
Theorem
the process
m-
used
to those
possibleto prove
3. For
and
Zl
above,
(îî
but
"
rather
more
involved,
following.
(12)holds;ifijl+ " (")0

"
0, (14) is valid.
M-
and
[X"}
n-
"
described
(")0
above,if /i+ " 0 and /x_

(13)applies;while if m+
then
"
"
Contingent Reinforcement with Two Operators

If the probability of reinforcement depends only on the immediately preceding response (on the same trial), one has (simple) contingent reinforcement. Let $\Pr(E_1 \mid A_1) = \pi_1$ and $\Pr(E_1 \mid A_2) = \pi_2$, and let the two operators $\beta$ and $\gamma$ be specified as in (3). Using (6), define the random variable $X_n$ recursively. (Note that $\log \gamma$ appears first, since $\log \gamma > 0$ and $\log \beta < 0$, in order most directly to apply Theorem 2.)

$$X_{n+1} = \begin{cases} X_n + \log \gamma & \text{with prob } F_{X_n}(p_1)(1 - \pi_1) + (1 - F_{X_n}(p_1))(1 - \pi_2) = \varphi(X_n), \\ X_n + \log \beta & \text{with prob } [1 - \varphi(X_n)]. \end{cases} \tag{19}$$
that
Observe
lim
(20)
(p{x)
2,
with
then
one
"
Ti
and
lim sup
has
"
if 1
(iii)
"
tti
"
immediately
Theorem
4.
of the two-operator model, let c
then
and
Pn
"
tti
"
lim inf p"
(ii)if 1
"p{x)
one
probability
"
lim
the contingent case
log /3/log{y/fi)Then
(i) if \
and
7r2
"
Theorem
4. For
Theorem
(20) and
Combining
"
with prob
log 7
0,
T2
"
and
^2
"
and
tti
"
tti
"
"
then p^
"
then p^
1,
0.
Moreover,
(iv)if \
TT2
"
in (i)and
then
are
both
intuitive
and
1)
-^
an
"
î
(p"-^ 0)
and
if 1
5.
ttz "
"
xa
"
and
and
of
probabilityone
"
5 "
the results
between
be clear. If 1
response
reflecting
barriers,whereas
5 with 0 "
then for some
Pr
5,
should
(iv)of this theorem

of
tti
"
of the distinction
character
probability zero
both
(p"
Pr
The
"
"
expressed
"
^1
an
xi
"
or
c,
tti
"
c,
response
they
are
absorbing barriers.
It is also to be
Theorem
noticed
that
except when
"
tti
ttj
"
for the
c,
contingent case. It can

shown
methods
that
1
be
if
c) then
c (or 1
[5]by deeper
7r2
tti
is again a reflecting
probabilityone
(respectivelyzero) of an A^ response
with those given by Luce
barrier. These
results agree
([7],p. 124) and in
Detailed
addition settle most
of the open questionsin his Table 6.
comparison
differs considerablyfrom ours
is tedious because
his classification of cases
as
4
covers
all values
of jS,7,
tti
and
7r2
"
given
in the above
theorem.
"
Contingent Reinforcement with Four Operators

We want finally to apply Theorem 3 to the contingent case of the general four-operator model formulated in (1). Analogous to (19),

$$X_{n+1} = \begin{cases} X_n + \log \beta_{11} & \text{with prob } \pi_1 F_{X_n}(p_1) = \varphi_{11}(X_n), \\ X_n + \log \beta_{21} & \text{with prob } \pi_2 (1 - F_{X_n}(p_1)) = \varphi_{21}(X_n), \\ X_n + \log \beta_{12} & \text{with prob } (1 - \pi_1) F_{X_n}(p_1) = \varphi_{12}(X_n), \\ X_n + \log \beta_{22} & \text{with prob } (1 - \pi_2)(1 - F_{X_n}(p_1)) = \varphi_{22}(X_n). \end{cases} \tag{21}$$
Also,
(P22ix)
lim
1"1:2
lim
"P22
lim
(pi2ix)
lim
^21(2^)
0,
lim
"Pu(.x)
TTi
0,
I-4+CO
(Pj2(x) 0,
lim
TTi
"
(22)
lim
^21(3-)
TTa
"pn{x)
0,
lim
Then
(23)
M+
M-
2 log /3,i lim tpikix)
TTz
log /321+
(1
S log /3yfclim "Pik{x)
tt,
log 13^
(1
ttz)log /322
i
and
(24)
To
apply Theorem
On
this
and
assumption,
"Theorem
For
5.
assumes
that
utilizing(23)
and
also
one
ttJ log /3i2
^^22 /3i2 "
"
the contingent
1821 /?n
5.
of the four-operator model,
case
0.
"
infer Theorem
(24), we
with
one
probability
and
"
0 and
(ii)if n+
"
0 and
/x_
"
0 then 7)0=
(iii)
if /z+
"
0 and
m-
"
0 then
0 and
m-
if m+
"
(iv)Pr(p"
I
0 then lim sup
(i) if n+
-^
Specializationof
1)
"
"
n-
pa"
0, then for some

5, Pr(p"
this theorem
to
-^
0)
cover
and
"
lim
inf p"
0,
1,
0;
Pn
the
with 0
"
"
5.
noncontingent
case
is immediate.
REFERENCES

[1] Bush, R. R., Galanter, E., and Luce, R. D. Tests of the "beta model." In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford: Stanford Univ. Press, 1959. Ch. 18.
[2] Chung, K. L. and Fuchs, W. H. J. On the distribution of values of sums of random variables. Mem. Amer. Math. Soc., 1951, 6, 1-12.
[3] Hodges, J. L. and Rosenblatt, M. Recurrence time moments in random walks. Pac. J. Math., 1953, 3, 127-136.
[4] Karlin, S. Some random walks arising in learning models I. Pac. J. Math., 1953, 3, 725-756.
[5] Lamperti, J. Criteria for the recurrence or transience of stochastic processes I. J. math. Anal. Applications, (in press).
[6] Lamperti, J. and Suppes, P. Chains of infinite order and their application to learning theory. Pac. J. Math., 1959, 9, 739-754.
[7] Luce, R. D. Individual choice behavior. New York: Wiley, 1959.

Manuscript received 4/7/59
Revised manuscript received 11/10/59
CHAINS OF INFINITE ORDER AND THEIR APPLICATION TO LEARNING THEORY

John Lamperti and Patrick Suppes
used
a
The
Introduction.
1.
behavior
of
models
of
learning experiments.
as
Namely,
completes."
history of the
process,
in order
employ; however,
found
it necessary
for
the
and
details
quite in
the
The
form
simple.
trial he
of
makes
of
finite
response,
which
This
response
the
which
upon
models
may
Suppes
[6].
lines for
will
We
definitions
give
references
thorough
under
that
Received
Statistics
of
involving
Branches
linear functions
two
asymptotic
linear
20, 1958.
the
the
research
This
Office
of
in
properties.
Naval
has
of
type
Suppes
teractio
in-
[1].
in "3.
learning models
time
much
and
below
do not, except
and
Estes
along similar
constructed
given
pends
desuch
about
[4],and
Atkinson
the
of
functions
the
results
Many
ject's
sub-
the
is that
subjects and
and
are
of
[2], Estes
more
9]
form
models
or
above
of
after
the
occurred.
processes
treatment
of
are
[6, Section
these
reinforcement (again
trial
next
each
finite set
model
here
also study
from
are
on
the
Hosteller
and
trials, and
choice
models
these
of
has
mentioned
is, that
November
not
"linear
called
are
by
the
approximately stationary and
and
but
is followed
trial, where
in Bush
them
of
assumption
general conditions
behavior; that
some
[7] and
Harris
arguments,
tools
these
series of
consists
reinforcement
experiments
The
The
present
found
be
between
Precise
presented
probabilitieson
probabilitieson
and
aration
prep-
in
only
us
psychological standpoint
is
number).
response
of "2.
as
serve
by
papers
results
shall study with

a
subject
possible actions.
one
[3],
to
to
original with
it is
addition
we
From
4, and
tained
self-con-
results is the content
section is included
this
we
of learning models
cases
their hypotheses.
closely related
which
close to that
require.
we
processes
certain
past
rems
theo-
Such
past.
form
chastic
for sto-
entire
the
on
remote
in
additional
some
"
In
very
earning models."
very
of
extensions.
[3]
somewhat
that
theorems
[8] contain
Karlin
and
emphasize
should
We
relax
to
Fortet
accomodate
to
of these
discussion
and
Doeblin
given by
were
the
only slightlyon
but
liaisons
theorems
limit
transition probabilitiesdepend
whose
processes
"chaines
or
certain
employ
shall
we
order"
been
applying
this by
do
will
ptotic
asym-
have
which
processes
We
infinite
of
theory of so-called "chains
we
stochastic
of
the
study
is to
this paper
of
purpose
large
Suppes
Patrick
and
class
THEORY
LEARNING
TO
APPLICATION
THEIR
AND
ORDER
INFINITE
OF
very
special cases,
shall
We
exhibit
passed these
prove
"ergodic"
processes
come
be-
infiuence of the initial distributions

Received November 20, 1958. This research was supported by the Group Psychology and Statistics Branches of the Office of Naval Research under contracts with Stanford University.
This article appeared in the Pacific J. Math., 1959, 9, 739-754. Reprinted with permission.
This
is not
goes
to
used
in experimental
zero.
proved by
it.
Our
work,
method
our
theorems
the
case
but
it
in almost
this
to
for
all models
seems
as
all the
in which
cases
eifect, their
which
have
been
if ergodic behavior
proofs and
might
one
be
can
expect
corollaries
some
are
given in "4.
The
major
Karlin
[8], who
models.
do
work
obtains
However,
apply
not
to
of
far
so
many
the
detailed
linear model
is impractical arise
is selected
in the
estabhshed
treats
under
depend
is reinforced
right.
two
the
process
or
above.
On
the
on
rat's
probabilities. The
has
been
been
initial response
thoroughly studied
generalized
We
hope
Chains
2.
of
of
non-Markov
influenced
theorems
they
stochastic
this
given
type
here
weaker
hypotheses
it is in
[3], but the
has
also
that
his
subject.
studied
numerable
Let
the
integers
remote
proofs
chains;
point
chain); we
from
of
the
shall
out
/.
The
the
left turn
upon
have
form
shall
that
the
2.1
be
can
of
Fortet
2.2).
complicated than
T.
Harris
E.
remark
but
results
background
to
[3];
The
and
the
on
of
finite number
extended
to
the
de-
methods.
1 to
notation
"m"
and
restriction
from
subscript
his
references
theorems
2.1
more
use
convergence
and
affected.
much
not
the
original
(^Theorems
theory
probabilitiesare
Doeblin
to
future.
present
we
The
Lemma
not
are
we
integers
in the
other
chains."
order
transition
due
and
results
"infinite
section
the
are
change
much
use
left
goes
results
these
detailed
of
past.
the proof of
without
/ consist
of
we
this
generahzed
other
For
of this experiment
model
more
where
process
essential,and
case
of
make
these
Finally
is not
states
in
In
[7] gives additional
paper
he
it is depends
this development
to
processes
of
work
approach.
our
whether
2], and
ideas
the
order.
only slightly by the

for
are
further
infinite
both
that
possible using
contribute
to
these
[9].
comment
we
applications seem
(ii)
subject (a rat, say)
the
which
in [8, Section
by Kennedy
In conclusion
nearly 1, and
or
and
situations
ergodic behavior
appropriate linear model, the probability of
the
forcement
rein-
the
responses,
of
scope
trial regardless of
eventually is either nearly
the
hand, Karlin's
other
the
sentation
repreare
tion
representa-
which
Both
in which
experiment
each
such
of
paper
states
chains, and
outside
cases
when
previous
more
mentioned
is
starting point
whose
is
classes
Karlin's
of
probabilitieswith
restrictions.
(rewarded)
In
the
on
T-maze
Markov
certain
for
techniques
His
using infinite order
mild
consider
even
typical situations
interesting non-ergodic
example,
or
as
situations
many-person
theorems
the
interest.
(i)when
(and will)be studied
can
of
probabilities. Two
response
limit
the results and

cases
of learning models
limiting behavior
on
on
for
x",
(to represent
a
finite sequence
merely adds
the
the
states
ia,h,
"
"
"
specifica-
of
then
to
"m'
define
We
Proof.
eS,!^
course
12
and
in k
\v^^\x + x')
instead of
quantites ";,f^by using y"^*'
"""
uniformly
-*
the
as
conclusion
^^
of
lemma
the
in (2.4);
jPi
is equivalent
Now
co.
+ x") I
v')!'\x
Wi'~''\3 +
+ x')
x')'Pj{x
i"
pl*~'Xy +
"
+ a?")}
|
cc")pj(a;
a:
.7
S ^'X^'+ x')\v?~'\3+
contains
Suppose
that
estimate
is less than
but if j
"'-m~^\
than
of (2.3) and
(In
case
Uq
"^.
series
these
1,
be
be
can
of the
the
above
is less
first term
"";/'. Taking
to
obtain
we
8el^;,'"
+ (1
idea
not
improved
term
in the
account
estimate
S)""i-"
carried
out; the
details
are
more
given.)
iterated
obtain
to
of
estimate
an
in
"S,f'
of
terms
computation the result is
some
"'"f^
^
If the
el'^^ Ne,,
same
be
x")\
second
the
value
absolute
can
=
be
(2.7) can
After
that
will
and
this
ô
1, the
"
cumbersome
Now
j^
Then
times.
Ne,,,. The
assuming
(2.7)
j^
x')- vl'-'îj+
-^
Ne,Jza
8y
are
S)^
I ^)(1
'g'('
Ne^,,S'
extended
Ns.n,./"ii
+ i)(i
to
"
"
"
iVS^-^",..._,
.
true;
ing
call-
have
A^--^we
"
"
"
inequality remains
infinity,the
series A^, A^,

(infinite)
sy
e':^^Nê,,,,8Â,.
1
But
it
can
be
shown
A,+i
AijS.
Since
A^
"
A,
S''
(1
8)A^,,
obtain
we
Ai
and
8-'-'-'^\
hypothesis (2.5),the uniform
convergence
(2.8).
Lemma
2.2.
lim
(2.9)
and
hence
"^r ^S-S",,,.
(2.8)
RecalHng
that
diflficulty
much
without
A,,,
or
the convergence
is
\p'r\x') p\"\x")\
uniform
in
x' and
x"
.
of
"5"^follows
from
clarity
For
Proof.
shall
we
purely analytic rephrasing is

operating independently with
In
of
view
two
least
at
being
take
can
will
and
this
occupying
this proves
n
be
can
Theorem
The
by
both
to
less
two
than
"
at
of
state
s/2.
For
bilities
probae,
and
2.1
that
most
Lemma
j^.
state
have
processes
(2.3) and
lows
it fol-
sometime
"run"
differ by
from
see
includes
in
remain
simultaneous
which
will
there
if the
But
e/2.
most
at
not
x"
their probabilities
then
n,
processes
this
the
with
period which
time
probability
x"
.
shall
now
pl"'(a;)
tî
we
the
prove
first theorem:
quantities
lim
independent
are
with
that
such
processes
other
the
probability one,
in x' and
(2.10)
exist,
differ
that
preparation
2.1.
an
for
before
time
i at
uniformly
this much
With
is
therefore, the
n,
and
states
It is also easy
(2.9).
chosen
"
which
so
of
state
there
with
with
values
"
same
during
time
time
time
large enough
all greater
of
i at
to
sometime
(2.3) that
before
occur
the
ends
state
period of length
We
in
condition
from
be
and
up
any
stochastic
two
probabilitiesPiix),one
transition
occupied
times
of
io
have
Consider
hard.
history
2.1, for
Lemma
processes
ja
its past
x' for
sequence
although
probabilisticarguments,
use
not
of
satisfy X
and
x,
î
1/ ^^^^
"
is
convergence
in
uniform
Applying
Proof.
where
x.
(2.2) repeatedly,
have
we
+
+ x)"'
p,^(i,
Pi^_^{x)p,^_J,i^-,
"
x^
i^,i^,
"
"
"
"
in-^ + x)p'r\x,,+ x)
"
i^-^. Therefore
"
\p[^^'^"(x) p["\x)\
-
and
by
Lemma
absolute
value
Pi,^J^)
'
2.2, for
signs
+
^'m-i(^) îo(î
*
'
"
"
"
"
any
"
the
on
"
"
+ x)im-i + x)\p','^\x^
p'r\x)\
that
within
there
is
an
right
is
less
'^m-i+ ^)
"
+
Piji'î
"
sum
to
than
so
number
p'"\x)
has
(uniform
in
x) limit
".
"
tt^.
each
term
Since
the
Since
there
of states,
E
i
TTj
2 lim pl"\x)
i
m-"oo
lim "
n-""i
weights
have
we
one,
\p[''^"'\x)p["'ix)\"
and
such
p'^\x)
1
,
are
finite
and
IN
this completes the

Next
(2.11)
x,r,
If
joint probabilities.
is \,\,
x^
""",i^_j, let
v.S''') ^^::(^')
=
is, of
This
proof.
shall define
we
PSYCHOLOGY
MATHEMATICAL
starting with
probabilitityof
the
course,
past history x'
the
executing
We
define
can
of states
sequence
also
the
higher joint
probabilities:
+ ";').
pL"'(a;')S??Xa;')Pl."""(i
(2.12)
of
Analogues
the
Lemmas
arguments
same
Theorem
and
2.1
used
The
2.2.
independent
Remark.
stochastic
on
not
have
quantities
this
will
we
useful
the
past history
in
^^
imply
the
existence
"
prove
convergence
is itself
PzJ,^)is
Theorem
2.3.
the
The
Pi(x) for
the
at
state
time
variable, and
process
so
"moments"
idea is that
transition
if
we
bilities,
proba-
is i given the
it makes
sense
to
+ x)p^ (x)
p]'(r",
V,
Thus
"
al(m, x)
tz^ exists.
is the
We
as
same
shall
now
p["'\x).
prove
quantities
\m\.a\(m,x)
positive integer
is independent
data.
functions
lim")(m, x)
(2.15)
exist for every
stationary
certain
for
theorems
random
by (2.11).
that
The
measure
of members
formally, define
defined
states
This
idea
here.
with
process
probability
extended.
be
stationary
infinite sequences
in studying experimental
a]{m, x)
2.1
space
then
probability Pi(x,J that

x^
of
of
probabilities. The
define
to
convergence
prove
(2.14)
limit
the
further
study E{p]{x,J). More
Theorem
Xi
used
be
can
tt^
can
us
stochastic
where
1/ ^^^
satisfy
Pi{x) for transition
the
measure
concern
are
a
difficult to
theorems
two
with
process
Finally
which
it is not
quantities by
p':''{x') TT,
of x', and
"cylinder sets"
of /, and
need
These
the
the
for these
in x'.
uniform
is that
in this way
quantities
Urn
exist, are
be proved
can
already;
(2.13)
is
2.2
of
x.
v;
a)
convergence
is
uniform
in
and
the
LAMPERTI
JOHN
We
Proof.
use
to show
simple estimate
PATRICK
AND
that
is
a]{m, x)
Cauchy
sequence:
k +
\a'^{m-h
^vi +
^m
is chosen
(2.3) and
times
the
nothing
this proves
hmit
made
/c-"
oo,
we
Theorem
case
exist
ji
'
"
'
"
by Theorem
the
the
that
at
states
k, x')\^
limit
limit
once;
in
the
along much
of
a]{m
is the
for
same
"
"
all
"cross"
accordingly
"
k, x) exists
additional
some
h, and
for all
as
x.
moments
define
we
+ x)P:,J,x)
p]l(,x,,
.
2.3, which
generalization of Theorem
quantities
lim
uniformly
in
jk " /, and
the limits
used
x)
oc]'^'Zl
aj^\';.'j*(m,
=
all
non-negative integers
are
independent
for
of
2.3
in proving Theorem
kThe
only trivial changes, and
that moments
yields
1:
The
2.4.
argument
with
Since
is then
(2.17)
'
ing
carry-
0,
"
"
any
a}(m
"
consider
to
for
the
treats
by
many
is uniform
limit
estimate
+ x)
+ x)p;](,x,,
S p}jix,"
x)
a}Jf.'.]lim,
following theorem
The
k, x)
conclude
involving Pi{x,")for several

(2.16)
that
large.
desirable
exist; the
Another
Cauchy.
show
a]{n, x)\ is small
"
(2.15) must
to
are
can
also
It is
jo
in x^,; this
all x) if k is large enough,
fc, \al{n + h, x)
and
from
for
(and
\al{m
m
rewritten
be
may
arbitrarily
contains
sequence
last term
be
(resulting
the
be
can
conditions
all the indices except those
since a\{m, x) is uniformly

line
will
terms
the
long
two
than
ix)
I ^ S 1pC''(^) ^^'J^)
1
(x) p'^'Jîx))
xXp'J^J''^
that
provided
first
more
over
for all h
if
Thus
same
the
0, and that
-^
e^
summation
is small
2.2.
-^m +
high probability. The
IS p}{x^
which
pKa^,"+ x)\p^
large enough,
(2.5))that
with
out
k, x)\
k + li
involves
this
small;
nl{m
"
+ x)
S lP'(a^"+^
If
h, x)
need
not
involving several values
can
"
"
"
v^
all
and
x.
works
be repeated.
of
Vj
be
in this
Finally we
case
also
remark
considered, and
it
be
can
tion of Theorem
apply
to
trial the
followed
by
by
the
choice
the
value
actual
of
j ^
r,
and
i +
is
sequence
event
reinforcing
reinforcing
the
numbering
Our
single
Axiom
(3.1)
where
We
0 ^
ing
representtrial
each
responses
pairs of integers
thus
represents
the
of
is
theory
variable
random
for
these
is from
here
our
the
on
to
when
inforcement
re-
is the
If En
=
"
of
in
sequences
to
present,
in
Chapter
different,
not
j\x,)
(1
1 and
P{x")
e,)P{A"
S ^jk
linear model
"
ments.
reinforce-
a;".
Thus
order, but
note
notation
the
as
reverse
in
then
j\x,^,)+ e,Xj,
studied
intensitivelyin [6] by setting:

fc 9^ 0
1^
for
1^
for /c
==
\^jj
=
Xj^
t
"
foYJ-^k
0
r
1-
trial
and
responses
this
on
response
following linearityassumption:
k and
[6].
the
use
we
and
responses
is somewhat
Suppes
sequence
formulation
the
probability of
the
past
with
notation
and
Estes
preceding
write
to
6^, Xj,,^
the
general
and
aim
model
following
are
[2], although
entire
P{A",,
obtain
(3.2)
we
[4] and
Estes
axiom
L.
on
distributions, is imposed
this preceding sequence
(It is convenient
data
these
response
linear
general
the
events
given the
For
literature:
j representing the
a
variables
general
class of
theory is formulated
The
of
sequence
The
the
number
in
envisage
random
of the
in
pair (j, k) of integers with
ordered
Any
is
variables, where
relevant
The
is, we
t, that
is
which
represented
be
may
number
of E"
n.
an
response,
variable.
Hosteller
and
being closer to
by
established
is
of trials.
sequence
of random
")
"
the value
outcome.
dealing with
of Bush
"
conventions
A"
makes
consider
we
experiment
an
probability distribution of the

random
t -\- 1
generahza-
models
consists of
A^, E^,
"
trial
particular distribution,or
In
"
events.
of values
possible experimental
"
on
k ^
0 ^
Thus
and
represented
and
predict the
provides
The
experiment
variable
trial n,
on
be
1 ^
the
follows
random
the
reinforcing
then
may
letters
response
the
of
subject
(î,E^, A., E^,
of
This
learning models.
reinforcing event.
sequence
of linear
experimental situation which
an
each
On
exist also.
2.2.
Definition
3.
PSYCHOLOGY
MATHEMATICAL
their Hmits
that
shown
IN
that
"2.)
linear model
satisfying (3.2) we
models
be
such
(3.1) may
shall term
3\Xn)
(1
0)P{A"
j\x"-,) +
ilx"_0
HE,
if
(p(A"ib"-0
Hosteller.
formulation
also
interest
The
and
dependence
whose
(3.4)
And
a),,
if the
S P^(A"
a)
The
they
in
enter
natural
easily observed
P{A"+i
to
a
j, An
We
j). (For other
in
situation
of
ii ^
represents
We
where
in Estes
1,
""-,
[5]
and
1 and
tests
in
s,
(1
on
Atkinson
"
"
and
subjects
as
his
on
and
inforcements
re-
prior
own
be
then
trial may
for
event
responses
in
presented
re-
j\,k^) of
integers
with
sequence
of such
tuples
any
Let
variables
if El^^
E^n^ be the
A'n^ and
for the
ith. subject
on
trial
to:
and
P(a;")"
then
0^;')P{A^:' j\xn-,)+ e^^y^fi

=
2 ^jV
of Axiom
each
outcome.
random
î
s,
well
as
2s-tuple {ji,k^,
ior
j\x")
6';*\X'A^^
Experimental
preceding
on
data
The
have
we
model
linear
the
particular reinforcing
subjects
"
of
that
suppose
generalize Axiom
For
P{A^:i,
0 ^
[6].)
see
extensions
in general
reinforcement
M.
(3.6)
ki ^ti,
then
may
quantities which are

joint probability
instance, the
examples,
possible experimental
Axiom
other
however,
way;
expression of
for
may
depend
ordered
an
and
response
the
0 ^
Ti,
We
unsymmetrical
studying
reinforcements.
by
1 ^
will
subject
responses
define
an
that the probability of
such
and
in
"
are:
the
experimentally
also interested
are
one
in
way
multiperson situations.
any
n.
formed
(3.4) are
moments
investigate subsequently.
we
lim a)^"
experimental
of
are
j 1a;"_OP(x"-i)
appropriate limits exist, we
(3.5)
which
probabilitiesat trial
response
a;".
sequence
moments
properties
asymptotic
a)^, of the
moments
the
and
Bush
of
(3.1) essentiallytheir general

plicitly
obtained, although they do not ex-
is
on
0.
in
certain
here
define
by
model
linear
the
of
indicate
We
replacing
Upon
kiÔ, ki^j
E^^k,
condition
classes
combining
satisfies the
^3
if^"
Axiom
for
and
Model,
Estes
an
replaced by the simpler condition:
f{l-e)P{A"
(3.3) P(^".,
and
1for two-person
Suppes
[1],
situations
Let
are
be
xl,'Li
reported
just
the
of first
sequence
a
Axiom
of
consequence'
it is in terms
and
their
Axiom
all the
the
in the
exist.
trial
on
with
these
reasonable
and
responses
ments
reinforce-
statistically
independent:
are
work
additional
the
preceding
logous
ana-
joint moments
To
need
we
"
responses
If P(x"_i) "
I.
exactly
(x['j,n
moments
interested
of Axiom
when
given,
define
we
if they
J]^....,J
in terms
that
assumption
are
that
xl^'l^
asymptotes
latter moments
that
shall also be
We
(3.4).
to
and
of
It is
of subject i.
reinforcements
and
1 responses
"
PSYCHOLOGY
MATHEMATICAL
IN
then
P\An
.?!"
^=
'
"
"
"
An
Js\^n-l)
iA
^^
restriction
experimental
The
4.
of notation,
matters
some
of
asymptotic
broad
In
it will be
we
write
may
to
indicate
Xm
x' is
for counting back

To
interested
are
just the
notion
of the
on
all x^, x' and
the
other
1
Proof
if, and
/Ck"
that
It is to
of
place
of
in
of
terms
We
Xn-i.
given trial
probability of
of
P(A"
the
reserve
ilx"-i)
"sum"
The
x^-i.
of
notation
the
subscript
n.
to
define
in
reinforcing
an
exact
pending
de-
event
and
trial outcomes
past
has
only
reinforcement schedule
if,for all
x^
be
X')
includes
noticed
P{En,
the
k, n
that
fact
is
analogous
/CK
and
n' with
the
of
use
of
to that of Theorem
X")
pendent
inde-
with
past
n, n' "
precedes E^.,^^
Aj," which
response
side of (4.1) yields independence

of this
the
x"
P{En
trial n.)
some
it is desirable
conditional
linear model
dependence of length
(It is understood
some
trial number.
Definition.
(4.1)
use
x')
last
finite number
to
sequence
trials from
of the
only
on
and
j\x^
in the
combined
give
begin with
We
L.
clarify the general theorem
the
way
we
existence
the
on
theorems
ergodic behavior.
convenient
P(A"
dealing with
After
the
of
hypotheses
guarantee
satisfied in
been
theorems
general
satisfying Axiom
this section
Thus
"2,
which
models
one-person
The
moments.
conditions
has
learning models.
state
we
"
linear model.
the
for
theorems
Asymptotic
Ji\^n-l)
Axiom
implied by
multiperson studies employing
the
^\^n
l
on
one
side and
trial number.
4.8 of
[6].
The
n'
on
term
(2.5),consider
establish
To
|P(A",.i
(4.3)
j,E",
=:
where
contains
least
at
i|^",
1P(A",.,
(1
do
We
not
the
obtain
6,)\P{A,,
6"
factor
(1
6^*)at
(1
inequality is
for
"
least
e,r\P{An,-u
length
not
of
times,
1,
that from
so
equality
get, ignoring
x")\
k, x
x")\
repeatedly,
j, E,"
P{A".-Ax")\
diiference
The
we
that
so
P(A"",i
sequence
The
x")
apply Axiom
we
J\^')
x\
than
more
k\x
j\x
x")\
of (i)
j\E^"
P(A""
as
/clx + X')
i, "'",
is the
where
the
x")\
k, x
m*.
"
P{E,,"
P(A"",i
X')
0, but
that
"
x')
j\x
know
(4.4) 1P(A".,,
^
A;,a;
with
by virtue
k\x
right-hand side of (4.3) we
the
to
once
where
inequalities:
=^
j\E^"
and
x,
k*,
for
k\x-\-x')
Axiom
of
E",,
3,
P(A""+i
of
terms
hypothesis,
;r,.,^^, P{E",
Applying
m*
P{A"".,
x')
occurrences
(i)of the
from
follows
last
the
means
x,^.
X')
k,x
=:
following equalitiesand
the
k\x
7r,,,JP(A",îj\E",
PSYCHOLOGY
MATHEMATICAL
IN
term
k\x
x")\
on
the
the
obtain
(4.4) we
right of this
estimate
m*
(1
"", ^
e,.r
whence
is (2.5).
which
On
so
for
the
ting
the
order
are
allows
trials without
can
chain
that
I:P(A"
^f
which
they
original condition
inclusion
are
of
independent
of
cases
reinforcement).
where
as
insures
do
not
to
be
existence
the
depend
upon
about
made
of the
of the
sums
given in [3] would

some
k\xn-,)
i,"'"_i
be expressed
remarks
several
If all 0* -V= 0, the
(2.5)
a)^"
(3.5) and
moments
There
ilx"_,)
moments
infinite
jS^" exist and
2.4 that
Theorem
from
know
the
the
initial
But
of responses.
P(A"
(2.5) we
of
cross-moments
distribution
and
(2.3) and
of
basis
the
asymptotic
e^
be
are
cross-moments
of
the
limit-
initial conditions.
the
theorem
satisfied;our
0 (i.e.where
just
weaker
there
dition
concan
be
JOHN
First,
proved.
observe
we
that
min
(4.5)
on
Contingent
k\An-"
x)
3,
for
with
case
that
(i)of
need
(4,5) we
An
and
responses
cases
(ii)aof
trials which
general apply
On
the
that
the
2.2, conclude
Theorem
(I) and
satisfy
Experimental
their
is
that
they
method
(I) excludes
of
if
(I) only
may
not
in
virtue
of
0.
4.1
[8] do
Karlin
are
moments
response
constructive
in
Theorem
proof of
0.
It
computational difficulties.)
theorems
to
j', tt^-jj,^
[5] test
Estes
(ii)aand
for
although
known
no
(The
the
apply
satisfied,and
asymptotic
is
(II)
4.1, there
Ti^jjr
for all j and
0 and
convergence
basis of the
To
0.
"
P("'"
"
immediately
cause
(II),and
to
j, x)
for all j.
j',x)
j',x)
actual asymptotes.
the
noted
be
^^
about
fact
P(^"-"
let
model
Estes
[5].
A"_i
3,
j, A^-^
that
the
Let
basis of Theorem
the
on
non-reinforced
we
by
may,
of successive
the asymptotic jointprobabilities
that
also exist:
responses
Corollary
for
positive
(4.5),(i)and
of
k, 7Zj^{v)
i^ 0
Estes
experimentally and
test
to
that
(4.1) is
such
interesting
such
case.
Theorem
computing
also
P(A"
in
given
are
contingent
such
Then
for
has
interesting experimental
In
some
P{E^ =^k\A^
exist
of
x^*
in terms
v.
lag
only that for

0, 1, 2
Double
II.
simple
sequence
of
described
for all
Tt^^{v),
"
need
(4.5),we
all
what
k*
event
4.1.
I.
for
reinforcing
number
be
can
the
matter
no
preceded.
model
hnear
Theorem
data
trial
every
reinforcements
the
dition
con-
7t,c'.x.^
interpretation of (4.5) is that
probabihty
of
simple sufficient (but not necessary)
(ii)bis
for
The
every
1.
//
limit
the
-i
as
(-^n+m
hypothesis of
the
^^
^-
Theorem
4.1
is
satisfied, then
of
co
3mf "'^n
+ m-l
"
3m-l"
'
'
'
"
-^n
"
3o)
exists.
We
random
1, and
of
the
may
regard
the
probability vector
distribution
existence
probabilities.
F^
of
on
the
quantities P{A"
with
trial
moments
an
n.
jlx^.^), for
1 "
arbitrary joint distribution

The
following corollary is
a) independent
of
the
j ^
F^
r
on
as
trial
consequence
initial response
Corollary
is
there
IN
the
//
2.
PSYCHOLOGY
MATHEMATICAL
hypothesis of Theorem
unique asymptotic distribution
is satisfied,then
4.1
F", independent
of F^
which
to
the distributions
have
of
F" converge.
multiperson situation characterized
the
For
analogous
theorem
this theorem
we
such
have
and
schedule
all x"" x' and
^.(1)...,.(.o,
Theorem
(i)
exactly
m,
For
in the
use
of
reinforcement
as
we
i ^
if for all /c, 1 ^
in
did
s, all
we
hypothesis
schedule
with
(4.1), namely,
and
M,
n' with
we
n, n' "
x"
PiEir
P(Ell'
Let
4.2.
has
^//
4.1.
notion
the
of length
past dependence
Theorem
to
define
/ and
by Axioms
k"'\
k^'"\x^ + X')
/c^')\x,,+
k^'\ "',"[:'
be
^//
linear
s-person
an
schedule
reinforcement
Eir
...,
X")
model
with
that
such
dependence
past
of
length m*,
(ii) there
(a)
^1H,*^0,
(b)
there
all integers
is
8*
and
an
i ^
1 ^
integers k^^^\ for
are
that
for all
that
such
m^
such
s,
the
of the initial distribution

The
Proof.
"
"
",
that
k^'^ is the
Using
for which
establish
we
to
(2.3) and
P{Al,'l, i",
X)
Pn.AJ''"\k,
we
PiAi'l,
omit
the
verify (2.3) we
applying
"
"
",
subject
"
"
preceding
be such
j'^'"*
the
state
notation, it
simpHfy
(2.5). To
let
as
k^''"*)
",
by the ith
the
on
hypothesis,
k^'^*,
i^'^*,
2s-tuples
as
made
response
that
k'-'^*of the
(j^'"\
are
is
jo
venient
con-
define:
P..,{j,k\x)
To
take
We
defined
now
are
for
and
responses.
j^'" is the
reinforcement
reinforcements
the
^ 0.
^^(^,.^(0-
Moreover,
chain
the
j^'\ k'^^\""",A;("), where
subject and
trial.
of
states
of
ctZi exist
7\i). (3),...,,(s)of -/^
moments
"
asymptotic
independent
(i*^\
and
^ r
P{E\r.n., k^'^\ "-,E\:U,^ k^'^*\x,,)
Then
sequences
now
Axioms
.-.,
j^^\ E^:'
A'"n,
i^^^l^^''
-
k^'\
superscript notation
proceed exactly
I and
instead
as
Elp
.,
from
in
A;",
the
of L, and
""",
k^^",x)
and
^^'^
X.
proof of Theorem
obtain
we
k\x") ^n
mo+l(y,
6*^(0
A^(f)*^(t).5*
.
k^''\x),
2?n+
that
4.1,
JOHN
For
first observe
(2.5),we
\Vn'î{3,k\x+ x')
AND
that
PATRICK
the
hypothesis
riVn^'ÂPAk,
a^')
a;
notice
the
-np""UJ^'^k,x
x")\
Up^"^,{j^'^\k,x
+ x')-p^.,,,{j^'^k,x
+ x")\
x")\p".,,{j^'"\k,x
development,
same
x
E \Pn'.AJ^'^\k,
X')
Continuing this
x")\
right-hand side is
+ x')\Up",^,iJ^'"
\k,x
W^\k,
^^.r.*\pn'^,
i=i
that
next
x")\
1=1
(i)of
of
by virtue
Vn"AJ, k\x
II Vn'AP^\k,
TT,,"*!
We
Axiom
and
LAMPERTI
obtain:
we
x +
x")\
x')~ Pr,"^.,U'''Ak,
1=1
And
Hne
by the
contains
sequence
reasoning used
of
in the
proof
of
at
least
(i*^^^',", A:"^'^*)
state
"
4.1, if the
Theorem
"
times
of
""j
last
the
is
quantity
I
i
Provided
conclude
we
follows
then
A
w*
"
pair
(2.5) holds.
from
the
Finally, we
which
want
of
this
linear functions
these
in the
as
from
remark
of
the
Theorem
section
are
from
asymptotic
of Theorem
case
which
moments
Q.E.D.
4.1.
which
just proved
are
4.1.
Axiom
that
the
theorem
apply
involves
diminishing, i.e., have
asymptotic results
which
to
estimate
an
existence
theory of "2
given after
two
distance
are
The
corollaries follow
like the
exactly
this inequality yields
that
of
'^
=
slope less
to
linear functions
than
The
one.
learning models
many
replaced by non-linear
functions
in
having
this property.
References

1. Richard C. Atkinson and Patrick Suppes, An analysis of two-person game situations in terms of statistical learning theory, J. of Experimental Psychology, 55 (1958), 369-378.
2. Robert R. Bush and Frederick Mosteller, Stochastic Models for Learning, New York, 1955.
3. W. Doeblin and R. Fortet, Sur des chaînes à liaisons complètes, Bull. Soc. Math. France, 65 (1937), 132-148.
4. W. K. Estes, Theory of learning with constant, variable, or contingent probabilities of reinforcement, Psychometrika, 22 (1957), 113-132.
5. W. K. Estes, Of models and men, Amer. Psychologist, 12 (1957), 609-617.
6. W. K. Estes and Patrick Suppes, Foundations of Statistical Learning Theory, I. The Linear Model for Simple Learning, Technical Report No. 16, Contract Nonr 225(17), Applied Mathematics and Statistics Laboratory, Stanford University, 1957. An abridged version appears as Chapter 8 in Studies in Mathematical Learning Theory, edited by R. R. Bush and W. K. Estes, Stanford University Press, 1959.
7. T. E. Harris, On chains of infinite order, Pacific J. Math., 5 (1955), 707-724.
8. Samuel Karlin, Some random walks arising in learning models I, Pacific J. Math., 3 (1953), 725-756.
9. Maurice Kennedy, A convergence theorem for a certain class of Markoff processes, Pacific J. Math., 7 (1957), 1107-1124.
FINITE MARKOV PROCESSES IN PSYCHOLOGY*

George A. Miller

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Finite Markov processes are reviewed and considered for their usefulness in the description of behavioral data. The various alternative responses in an experimental situation define a vector space, and changes in the probabilities of these alternatives are represented by movements in this space. Methods of fitting the theory to experimental data are considered. The simplest process, with a constant matrix of transitional probabilities that is applied repeatedly to represent the effect of successive trials, seems inadequate for most learning data. A matrix function that may be useful for learning theory is presented.
the
in
In
as
the
general
two
where
areas
probabilistic considerations
of
these
two
in time.
effects
This
areas,
The
of
basic
based
model
the
the
has
very
of
misrepresent
statistical
the
of
assumption
probabilities are
are
examined
possible
familiar
use
because
ignored
sequential
randomized.
or
probability
of
For
and
ask
failures
models
help
their
to
attitude
In
either
fail
rejection
abandon
is to
had
in
pendent;
inde-
of
the
from
dependent
incorporating
dependent
be
can
models
processes.
and
may
not
must
lead
familiar
it is intrinsic
are
variables
proper
what
Markov
usefulness
example,
measurements
independent
more
this
psychology, however,
mathematical
finite
for their
of
Such
process.
simplest
the
to
relatively invariant
length
be
construction,
test
It is characteristic
are
at
can
successive
inadequate;
as
and
variables.
that
independence
probabilities. The
it
theory
basic
concepts
and
profitable results.
to
use
explored
relatively successful
worth.
observations
problems
learning
to
attempts
or
led
often
not
be
random
dynamic
more
notion
can
makes
independent
upon
With
the
secondary
are
situation
their
proved
ago
that
parameters
measurement
fortunate
long
however,
been
psychology
quantitative science, i.e.,sensory
has
psychology
this
limitations
such
paper
for
processes
describing psychological
data.
*This article appeared in Psychometrika, 1952, 17, 149-167. Reprinted with permission. This article was written at the Institute for Advanced Study in Princeton, New Jersey, while the author was on sabbatical leave from Harvard University.

1. Simple Markov Chains with Two Alternatives. The data from psychological experiments usually come in the form of sequences of alternative choices embedded in the time continuum. Often it is possible to ignore the temporal order in which choices occur. The purpose of this discussion, however, is to examine situations in which the temporal sequence should not be ignored. We shall adopt the Markovian model of dependent probabilities to discuss such sequences. We begin, therefore, with the simplest possible example of a Markov chain.
Consider
and
letters A
the
produce
are
its future
We
shall
the
words,
where
.
the
might
durations
this sequence
is
produced
of
of the
state
present
and
number
of the trial: 0, 1, 2,
the two
alternative
p*"^(^) probabilityof
p(A)
asymptotic
d"
the
system
governs
plT^B)
at
at n, the
conditional
probabilityoi
at
2, 3,
on
as
trial n,
or
of two
1.
-\-m,
probabilities.
of the matrix
roots
it follows
ways.
at trial
occur
can
"
of transitional
X,-
1 in either
trial n, considered
at
probabilities
of
probability
characteristic
an
n.
"""
"
conditional
at trial
at n, the
matrix
Alternative
p'"'
(^4)as
of
set of absolute
given A
given A
Va{B)
responses.
alternative
value
...
vector; [p'"'(^),p'"'(B)].
assume
of trials
sequence
.
are
responses
alternatives. If the
two
development.
adopt the followingnotation:
at
that
the distribution
that
i.e.,
In other
n.
follows
one
ABBAAABA
of responses
process;
of trial
outcome
alternative
two
of these
choices,then
these
ignored. We
only
of
at trial n + 1
probabilities
of trial n. However, the knowledge of outcomes
the outcome
the
does not change our
descriptionof the system if we know
by a Markov
depends upon
prior to
choice
designate
sequence
latencies
and
of
which
in
experiment
an
trial consists
possible.A
chain.
Markov
1 in either
trial
on
n.
of two
Either
ways.
Similarly,B
fact leads
obvious
This
T.
to
can
the
it
occur
following
equations:
\[
\begin{aligned}
p^{(n)}(A)\,p_A(A) + p^{(n)}(B)\,p_B(A) &= p^{(n+1)}(A),\\
p^{(n)}(A)\,p_A(B) + p^{(n)}(B)\,p_B(B) &= p^{(n+1)}(B).
\end{aligned}
\tag{1}
\]

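In matrix notation these equations can be written

\[
\begin{pmatrix} p_A(A) & p_B(A)\\ p_A(B) & p_B(B) \end{pmatrix}
\begin{pmatrix} p^{(n)}(A)\\ p^{(n)}(B) \end{pmatrix}
=
\begin{pmatrix} p^{(n+1)}(A)\\ p^{(n+1)}(B) \end{pmatrix}.
\tag{2}
\]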
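Equations (1) and (2) amount to one matrix-vector product per trial. The following minimal sketch shows that step in Python. The function name `step` and the numerical values of `T` and `d0` are hypothetical, chosen only for illustration; the only convention taken from the text is that each column of the matrix holds the conditional probabilities given one alternative, so each column sums to one.

```python
def step(T, d):
    """Apply one trial: return T d, where T[i][j] = P(next is i | current is j)."""
    return [sum(T[i][j] * d[j] for j in range(len(d))) for i in range(len(T))]

# Hypothetical transitional probabilities (illustration only).
# Column 0 is "given A", column 1 is "given B"; each column sums to 1.
T = [[0.8, 0.4],   # p_A(A), p_B(A)
     [0.2, 0.6]]   # p_A(B), p_B(B)

d0 = [0.5, 0.5]    # [p^(0)(A), p^(0)(B)], also hypothetical

d1 = step(T, d0)   # distribution of responses on the next trial
print(d1)
```

Because every column of `T` sums to one, the components of `d1` again sum to one, so the result is still a probability distribution.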
The reader is assumed to be familiar with the elements of matrix theory. If the distribution of probabilities on trials n and n + 1 is regarded as the vectors $d_n$ and $d_{n+1}$ in two-dimensional space, then the square matrix of
similarlyfor
convenient
det
The
{T
\I) =\'
of this
equation
equation
and
of T
y,-
we
obtain
the
(r
roots
PsiA)]
Pb(A).
that
note
into Tvi
the
columns
X.y,-and
S,
solving
[1, Pa{B)/pb(A)]
vectors
of
unity is always
and
so
from
Eq.
+ ^=(^)
-if)owA)-p.u)rf^^(^)
PsiA)
be written
Eq. (6) can
Pb(A)
(6)
conveniently
more
\pb{A)pBiA)}
,
__
(pAB)
Pa{B)+Pb{A)
pAB))
[pM)
pb{A)T ^ ^^^^^
i-pAB)
pAB)+pBiA)
-
|Pa{A)
Since
to
zero
as
With
A
0.
of the matrix:
(1, "1). These vectors comprise

(5) we obtain,after invertingS,
"
unity,we
are
and
rpn
in the
be written
can
roots
Pa(A)
Substitutingthese
vectors,
[pM)
Pb(5)]X+
Xa
of all the columns
sums
the characteristic
are
characteristic
for the
\pM)
of these matrices.
root
determinantal
the
subscripts),
Xi
PSYCHOLOGY
form
roots
Since the
MATHEMATICAL
IN
READINGS
on
00
J
the
so
Eq. (7) we
successive
p^B)
can
PBiA)
right of Eq. (7) goes

first term
represents the asymptotic form of T".
and so obtain the probabilityof
calculate T"do
Pb(A) \ "
"
^'^M. (7)
1, the second
term
on
the
trials:
Pb(A)
p''\B)Pb(A)
,.,^1n?"^"^(A)?".(^)
-
The
value
r^/,^
of
Pa(B) + PsiA)
It is apparent that Eq. (8) can be written

\[
p^{(n)}(A) = a\,(1 - b\,e^{-cn}),
\tag{9}
\]

where

\[
a = \frac{p_B(A)}{p_A(B) + p_B(A)}, \qquad
b = \frac{p^{(0)}(B)\,p_B(A) - p^{(0)}(A)\,p_A(B)}{p_B(A)}, \qquad
c = -\ln\,[\,p_A(A) - p_B(A)\,].
\]

Eq. (9) is an exponential growth function of the form frequently used to describe the data from learning experiments. It should be noted, however, that while the average may follow such a learning function, the individual subjects are generating stationary time series that do not represent learning. The term "learning" probably should be reserved for those cases in which the matrix operator changes on successive trials.

We shall illustrate the use of the Markov chain with a numerical example. Suppose that the two alternative responses are called right (R) and wrong (W), that $p^{(n)}(R)$ and $p^{(n)}(W)$ are measured by the percentage of subjects in a large sample that choose R and W on trial n, and that the transitional probabilities observed on successive pairs of trials are constant. Assume the following numerical values for $T$ and $d_0$:

\[
T = \begin{pmatrix} .97 & .27\\ .03 & .73 \end{pmatrix}, \qquad
d_0 = \begin{pmatrix} 0\\ 1 \end{pmatrix}.
\]

A right response is followed by another right response 97 per cent of the time; wrong follows wrong 73 per cent of the time. From Eq. (8) we calculate that the successive values of $p^{(n)}(R)$ are 0, .27, .46, .59, .68, etc., approaching the asymptote of .90. The equation is

\[
p^{(n)}(R) = .9\,(1 - .7^{n}) \qquad (n = 0, 1, 2, \ldots).
\]

If we know that on a particular trial a W occurred, this equation gives the probability of R on the nth succeeding trial.

2. Autocorrelation Function. The next simple parameter of such Markov chains that we wish to consider is the autocorrelation function. We will mention the autocorrelation function now because for the more complex cases it is either not defined or is most tedious to compute from the matrix of transitional probabilities.
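As a check on the right/wrong numerical example that closes Section 1, the sketch below iterates the matrix directly and compares the result with the closed form of Eq. (9). The matrix entries .97, .03, .27, .73 and the starting vector (0, 1) are taken as read from that example. The expressions for `a`, `b`, and `c` are an assumption: they are the constants obtained by rearranging the two-alternative solution into the form a(1 - b e^(-cn)), and may differ in appearance from the printed ones.

```python
import math

def step(T, d):
    """One trial: return T d, with T[i][j] = P(next is i | current is j)."""
    return [sum(T[i][j] * d[j] for j in range(len(d))) for i in range(len(T))]

# Right/wrong example: column 0 is "given R", column 1 is "given W".
T = [[0.97, 0.27],
     [0.03, 0.73]]
d = [0.0, 1.0]   # all subjects start with a wrong response

# Constants of Eq. (9), derived here under the assumption stated above.
a = T[0][1] / (T[1][0] + T[0][1])               # asymptote of p^(n)(R)
b = (d[1] * T[0][1] - d[0] * T[1][0]) / T[0][1]
c = -math.log(T[0][0] - T[0][1])                # c = -ln[p_R(R) - p_W(R)]

iterated, closed = [], []
for n in range(5):
    iterated.append(d[0])                        # p^(n)(R) by direct multiplication
    closed.append(a * (1.0 - b * math.exp(-c * n)))
    d = step(T, d)

print([round(x, 2) for x in iterated])
print([round(x, 2) for x in closed])
```

Both lists should agree with the sequence 0, .27, .46, .59, .68 quoted in the example.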
The autocorrelation function is the correlation of a time series with itself displaced 0, 1, 2, ... steps. With zero displacement the correlation of the series with itself is, of course, +1. With a displacement of one step, the responses on trials 1, 2, 3, ... are correlated with the responses on the trials 2, 3, 4, ... . If the series of binary choices is fairly long, the autocorrelation after a displacement of one step is given by

\[
r_1 = p_A(A) - p_B(A).
\tag{10}
\]

We note that $r_1$ is a characteristic root of the matrix of transitional probabilities. More generally,

\[
r_n = p_A^{(n)}(A) - p_B^{(n)}(A),
\tag{11}
\]

where $p_A^{(n)}(A)$ and $p_B^{(n)}(A)$ are elements of $T^n$. From Eq. (7) we observe that these elements are

\[
p_A^{(n)}(A) = \frac{p_B(A) + p_A(B)\,[\,p_A(A) - p_B(A)\,]^{n}}{p_A(B) + p_B(A)}
\]

and

\[
p_B^{(n)}(A) = \frac{p_B(A) - p_B(A)\,[\,p_A(A) - p_B(A)\,]^{n}}{p_A(B) + p_B(A)}.
\]

When these values are substituted in Eq. (11), we obtain

\[
r_n = [\,p_A(A) - p_B(A)\,]^{n} = r_1^{\,n}.
\tag{12}
\]

In short, for a simple Markov chain, the autocorrelation function is the nth power of the autocorrelation between successive positions. If $|r_1| < 1$, then $|r_n|$ declines monotonically toward zero.

A simple example is provided by the sequence of consonants (C) and vowels (V) in written Samoan. E. B. Newman has noted that Samoan writing is adequately described as a Markov chain with the following matrix of transitional probabilities:

\[
\begin{pmatrix} p_C(C) & p_V(C)\\ p_C(V) & p_V(V) \end{pmatrix}
=
\begin{pmatrix} 0 & .49\\ 1 & .51 \end{pmatrix}.
\]

Consonants never follow consonants in the Samoan language. The autocorrelation function is easily computed from this matrix. For successive displacements of one letter position the correlation coefficient is 1, -.49, .24, -.12, .06, -.03, etc.

The value of the autocorrelation function for this simple process can also be described as the determinant of $T^n$. Thus $r_0$ is the determinant of $I$, $r_1$ is the determinant of $T$, $r_2$ is the determinant of $T^2$, etc.

When the distribution of probabilities at n + 1 depends upon events prior to n as well as upon n itself, Eq. (10) still holds as a definition of the autocorrelation function, but Eq. (11) does not hold. When more than two unscaled alternatives are used, the autocorrelation function is not defined.
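The three descriptions of the autocorrelation function given in this section (the difference of two elements of T^n, the nth power of r_1, and the determinant of T^n) can be compared numerically. The sketch below does so for the Samoan consonant/vowel chain. It assumes the matrix entries p_C(C) = 0, p_V(C) = .49, p_C(V) = 1, p_V(V) = .51, which are the values consistent with the coefficients 1, -.49, .24, ... quoted above; the helper name `matmul` is illustrative.

```python
def matmul(X, Y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Samoan chain: column 0 is "given C", column 1 is "given V" (assumed values).
T = [[0.0, 0.49],
     [1.0, 0.51]]

P = [[1.0, 0.0], [0.0, 1.0]]        # T^0 = I
r, dets = [], []
for n in range(6):
    r.append(P[0][0] - P[0][1])                          # Eq. (11)
    dets.append(P[0][0] * P[1][1] - P[0][1] * P[1][0])   # determinant of T^n
    P = matmul(T, P)

r1 = T[0][0] - T[0][1]              # Eq. (10)
powers = [r1 ** n for n in range(6)]  # Eq. (12)

print([round(x, 2) for x in r])
```

The three lists `r`, `powers`, and `dets` should coincide up to rounding error, and the alternating signs reflect the fact that a consonant is always followed by a vowel.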
3. Extension to More than Two Alternatives. The extension of the matrix equations to experiments involving more than two alternative responses is straightforward. Designate the alternatives A, B, C, ..., N. Then we have

\[
\begin{pmatrix}
p_A(A) & p_B(A) & \cdots & p_N(A)\\
p_A(B) & p_B(B) & \cdots & p_N(B)\\
\vdots & \vdots & & \vdots\\
p_A(N) & p_B(N) & \cdots & p_N(N)
\end{pmatrix}
\begin{pmatrix} p^{(n)}(A)\\ p^{(n)}(B)\\ \vdots\\ p^{(n)}(N) \end{pmatrix}
=
\begin{pmatrix} p^{(n+1)}(A)\\ p^{(n+1)}(B)\\ \vdots\\ p^{(n+1)}(N) \end{pmatrix}.
\tag{13}
\]

General solutions are known for certain types of operators. These are of considerable interest in physics and genetics, where the elements of T and the initial distribution are given by theory. The present use of such operators is almost purely descriptive, however, for we do not know what special types of matrices will be of the greatest psychological interest.

It is not always necessary to find a general solution. A qualitative understanding of an experimental situation is often provided by simply transforming the initial distribution five or ten steps by direct matrix multiplication. For example, a learning situation might be analyzed into three kinds of responses: correct (C), slightly wrong (S), and grossly wrong (G). During the course of learning a subject begins by making gross mistakes, then slight mistakes, and finally manages to make correct responses. Such a situation could produce a matrix equation like the following:

\[
\begin{pmatrix}
p_C(C) & p_S(C) & p_G(C)\\
p_C(S) & p_S(S) & p_G(S)\\
p_C(G) & p_S(G) & p_G(G)
\end{pmatrix}
\begin{pmatrix} p^{(0)}(C)\\ p^{(0)}(S)\\ p^{(0)}(G) \end{pmatrix}
=
\begin{pmatrix}
.9 & .3 & 0\\
.1 & .6 & .3\\
0 & .1 & .7
\end{pmatrix}
\begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}.
\]

It is tedious to find the general solution of $T^n$, and it is easy to see by direct multiplication what happens. The proportion of grossly wrong responses on successive trials declines steadily: 1, .7, .52, .40, .32, .26, ..., .08. The proportion of small errors first increases, then decreases: 0, .3, .39, .40, .38, .35, ..., .23. The proportion of correct responses gives a roughly S-shaped function: 0, 0, .09, .20, .30, .38, .45, ..., .69. This situation is analogous to pouring water from one vessel into a second, which in turn pours the water into a third. The asymptotic distribution can always be found by solving the equation $T d_\infty = d_\infty$.
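The direct multiplication described for the correct / slightly wrong / grossly wrong example is easy to reproduce. The sketch below assumes a transition matrix whose columns (given C, given S, given G) are (.9, .1, 0), (.3, .6, .1), and (0, .3, .7), with all subjects starting in G; these are the values that reproduce the proportions quoted above. The asymptote is approximated here simply by iterating many times, which is one way (not necessarily the author's) of solving T d = d.

```python
def step(T, d):
    """One trial: return T d, with T[i][j] = P(next is i | current is j)."""
    return [sum(T[i][j] * d[j] for j in range(len(d))) for i in range(len(T))]

# Rows and columns are ordered C, S, G (assumed values, see lead-in).
T = [[0.9, 0.3, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.1, 0.7]]
d = [0.0, 0.0, 1.0]      # every subject begins with gross errors

history = [d]
for _ in range(6):
    d = step(T, d)
    history.append(d)

correct = [round(v[0], 2) for v in history]
slight = [round(v[1], 2) for v in history]
gross = [round(v[2], 2) for v in history]

# Approximate the asymptotic distribution by continuing the iteration.
limit = history[-1]
for _ in range(500):
    limit = step(T, limit)

print(correct, slight, gross)
print([round(x, 2) for x in limit])
```

The printed columns should follow the three sequences given in the text, and `limit` should approach the proportions 9/13, 3/13, 1/13, that is, roughly .69, .23, .08.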

The form of the general solution can be indicated, for finite matrices with distinct roots, as follows. Let $\lambda_i$ represent the N characteristic roots of the polynomial $\det(T - \lambda I)$. We define a set of matrices $f_i(T)$ by

\[
f_i(T) =
\frac{(T - \lambda_1 I)(T - \lambda_2 I)\cdots(T - \lambda_{i-1} I)(T - \lambda_{i+1} I)\cdots(T - \lambda_N I)}
{(\lambda_i - \lambda_1)(\lambda_i - \lambda_2)\cdots(\lambda_i - \lambda_{i-1})(\lambda_i - \lambda_{i+1})\cdots(\lambda_i - \lambda_N)}.
\tag{14}
\]

In terms of these matrices, T can be expressed

\[
T = \lambda_1 f_1(T) + \lambda_2 f_2(T) + \cdots + \lambda_N f_N(T).
\tag{15}
\]

If $g(\lambda)$ is a rational scalar polynomial, then

\[
g(T) = g(\lambda_1) f_1(T) + g(\lambda_2) f_2(T) + \cdots + g(\lambda_N) f_N(T).
\tag{16}
\]

In particular, if $g(\lambda) = \lambda^n$, we have

\[
T^n = \lambda_1^n f_1(T) + \lambda_2^n f_2(T) + \cdots + \lambda_N^n f_N(T).
\tag{17}
\]

The 2 x 2 transformation is expressed in this form in Eq. (7). Concerning the roots $\lambda_i$ we know that $\lambda_1$ can be assigned the value 1, and that all the other roots fall between -1 and +1. Thus the asymptotic value of $T^n$ is given by $f_1(T)$.

The solution for a particular matrix can always be obtained by (a) finding the roots of the characteristic polynomial, $\det(T - \lambda I)$; (b) determining the $f_i(T)$ according to Eq. (14); (c) substituting into Eq. (17); and (d) solving $T^n d_0$ for the given boundary conditions of $d_0$. This procedure has the advantage of avoiding the problem of inverting a large matrix, but if two or more roots are nearly the same, the computations may be quite difficult.
the roots
of the characteristic
can
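Steps (b) and (c) of this procedure can be sketched as follows. The sketch assumes the roots are already known and distinct; the helper names (`components`, `power_from_roots`) and the $2 \times 2$ example matrix with roots 1 and 0.5 are illustrative assumptions, not taken from the text.

```python
def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def components(T, roots):
    """Return the matrices f_i(T) of Eq. (14) for distinct roots."""
    n = len(T)
    fs = []
    for i, li in enumerate(roots):
        F = identity(n)
        for j, lj in enumerate(roots):
            if j == i:
                continue
            # Factor (T - lambda_j I) / (lambda_i - lambda_j).
            factor = [[(T[r][c] - (lj if r == c else 0.0)) / (li - lj)
                       for c in range(n)] for r in range(n)]
            F = matmul(F, factor)
        fs.append(F)
    return fs


def power_from_roots(T, roots, n):
    """Evaluate T^n as the sum of lambda_i^n f_i(T), as in Eq. (17)."""
    fs = components(T, roots)
    size = len(T)
    return [[sum(l ** n * F[r][c] for l, F in zip(roots, fs))
             for c in range(size)] for r in range(size)]


if __name__ == "__main__":
    T = [[0.9, 0.4], [0.1, 0.6]]   # assumed example; its roots are 1 and 0.5
    print(power_from_roots(T, [1.0, 0.5], 5))
    print(components(T, [1.0, 0.5])[0])  # f_1(T), the asymptotic value of T^n
```

For this example `components` returns an $f_1(T)$ whose columns are both (.8, .2), the stable distribution of the assumed matrix.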
"
The
autocorrelation
because
alternatives,
the
to
various
alternatives.
2X2
much
case
same
inconvenience
the
prior
to
as
to N
that
of the
state
of T",
as
coefficient varies
for
predictingthe
What
case.
system
in order
we
to
to be
no
outcome
We
is to
such
is
must
known,
1. We
are
"1,
possibleusefulness
explored.
trial
at
do
must
make
and
+1
in
in
periodicities
memory.
of the
outcome
coefficient
psychologicalpurposes
have
processes
The
abilities
prob-
and
coefficient,
reveal
can
For
Responses.
different
the
to
of n, lies between
needs
according
of transitional
matrix
function.
transformations
that, if the
values
and
processes,
unordered
two
autocorrelation
function
than
more
correlation
the
autocorrelation
Markov
irrelevant
of
and
Markov
Compound
to
of the
characteristics
an
the non-Markovian
of
for
of the correlation
determinant
restriction
are
the
way
4. Extension
an
defined
determinant
of the
0 for the
of this extension
remove
the
determinant
toward
the
is not
possible assignments of numerical
many
identical. The
declines
the value
However,
has
the
function
expand
must
now
events
sider
con-
definition
the
Markovian
systems
it is
in
largerspace.
and
If the
trial
of events
"
at
probabilities
but
1,
knowledge
for
to
be
Markovian
the
we
-f- 1,
we
have
by changing
state
of the
priorto
the
system.
definition
by
characterize it by pairs of responses.
the
1 does
"
non-Markovian
system
the outcomes
-\- 1 depend upon
of
an
are
This
of
two
of trials
change
our
system
is made
Instead
event.
occurrence
If there
not
diction
pre-
acterizing
of char-
single response,
alternatives,
atomic
A.
and
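B, in the original system, then there are four compound alternatives, AA, AB, BA, and BB, in the new system. Thus we must define a distribution $d_n$ over four alternatives, and $T$ is a square matrix of fourth order:

$$
T d_n =
\begin{pmatrix}
p_{AA}(AA) & 0 & p_{BA}(AA) & 0 \\
p_{AA}(AB) & 0 & p_{BA}(AB) & 0 \\
0 & p_{AB}(BA) & 0 & p_{BB}(BA) \\
0 & p_{AB}(BB) & 0 & p_{BB}(BB)
\end{pmatrix}
\begin{pmatrix}
p^{(n)}(AA) \\ p^{(n)}(AB) \\ p^{(n)}(BA) \\ p^{(n)}(BB)
\end{pmatrix}
=
\begin{pmatrix}
p^{(n+1)}(AA) \\ p^{(n+1)}(AB) \\ p^{(n+1)}(BA) \\ p^{(n+1)}(BB)
\end{pmatrix}
= d_{n+1} .
\tag{18}
$$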
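The structure of Eq. (18) can be illustrated with a small sketch that builds the fourth-order matrix from the probability that A follows each pair. The function name `compound_matrix` and the numerical probabilities are hypothetical; only the pattern of zeros comes from the equation.

```python
STATES = ["AA", "AB", "BA", "BB"]


def compound_matrix(p_next_a):
    """Build the 4 x 4 matrix of Eq. (18).

    p_next_a[s] is the (assumed) probability that A follows the pair s.
    Columns are "from" pairs and rows are "to" pairs, both in STATES order.
    """
    T = [[0.0] * 4 for _ in range(4)]
    for col, pair in enumerate(STATES):
        last = pair[1]                      # the new pair must start with this letter
        pa = p_next_a[pair]
        T[STATES.index(last + "A")][col] = pa
        T[STATES.index(last + "B")][col] = 1.0 - pa
    return T


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    T = compound_matrix({"AA": 0.2, "AB": 0.7, "BA": 0.4, "BB": 0.9})
    for row in T:
        print(row)
```

Half of the entries are zero by construction, because a pair can only be followed by a pair that begins with its own second member.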
Note that for the system of Eq. (18) many of the transitional probabilities are zero; it is not possible to move from some states to others in a single step. For example, the system cannot move from state AA to state BB in less than two steps: AA to AB to BB, as in the sequence AABB. As before, the transformation $T$ can be applied iteratively to carry any possible distribution into a final, unique, stable distribution.

Tabulations of sequences of consonants (A) and vowels (B) in written Hebrew have been made by E. B. Newman. The sequence of vowels and consonants can be adequately represented by a matrix of the form of Eq. (18).
This
merit. For
to
seem
requires
tray. In
animal
an
order
define
could
responses
of the Markov
extension
and
to
be
similar
of the
approaches
manner.
system
of
to
length m
the transformation
The
include
into account
far
verbal
case
is
as
much
of the past
so
possible
there
would
be of order
would
in human
verbal
all the
-\- \. Thus
adequately discussed in this paper.

In principle it is possibleto extend
to take
as
as
the data
conditioning
example, fixed-ratio reinforcement
then
times in one
to respond m
approach the food
way,
we
keep track of the sequential aspects of this behavior
sequential dependencies arise

in
be carried
in operant
state
states,and
can
process
behavior
2'"*\ More
and
complex, however,
the
Markov
sequences
complex
be treated
can
that
definition
historyof the system
of
be 2'"'^âlternative
as
it cannot
indefinitely
one
desires.
Cases are known, however, in which the extension would need to be carried infinitely far into the past in order for the Markov model to summarize all the information. Such cases are better handled in other ways. At present, it seems likely that most learning situations will need to be described by these other methods, and that Markov processes using a single matrix of transitional probabilities are most valuable when the behavior has settled into a relatively stable pattern.
5. Least-Squares Fit to Data. Under the assumption that a single transformation describes the behavior, every trial can be considered one measurement of the single transformation $T$. We wish to find the least-squares solution that will give the best estimate for $T$ from the available data. The following procedures may not be the most efficient for Markov processes, but they represent a fairly natural extension of the procedures used with more familiar statistical problems.

We introduce matrices $M$ and $N$ to represent the observed data. The matrix $M$ is formed by placing in successive columns the observed distributions on successive trials, from trial 0 through trial $n - 1$. If each distribution contains $a$ alternative quantities, and there are $n$ such trials, then $M$ is a matrix with $a$ rows and $n$ columns. The matrix $N$ is formed analogously by placing in successive columns the observed distributions from trial 1 through trial $n$. Thus $N$ is also a matrix with $a$ rows and $n$ columns, and the elements of $M$ and $N$ are known.

We wish to determine the corrections that must be added to the observed values in $N$ to give the best estimate of the successive distributions:

$$\hat{N} = N + C, \tag{19}$$

where $\hat{N}$ represents the best estimate of the successive distributions and $C$ is the matrix of corrections. On the assumption of a single operator throughout learning, we have

$$\hat{N} = TM, \tag{20}$$

where $T$ is the best estimate of the transformation. From Eq. (20) we now obtain an expression for $C$:

$$C = -N + TM. \tag{21}$$

For the least-squares solution, $CC'$ must be a minimum. This is obtained by putting the partial derivative with respect to $T$ to zero:

$$\frac{\partial (CC')}{\partial T} = 0. \tag{22}$$

We substitute Eq. (21) into Eq. (22) and obtain

$$M(-N + TM)' = 0,$$

$$-MN' + MM'T' = 0,$$

so that

$$T = NM'(MM')^{-1}.$$
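As a check on the least-squares solution, the following sketch estimates $T = NM'(MM')^{-1}$ from a single sequence of 0s and 1s. The function name `estimate_T`, the row order (response 1 first, response 0 second), and the sample sequence are assumptions made for illustration only.

```python
def estimate_T(seq):
    """Least-squares estimate T = N M' (M M')^{-1} for a 0/1 response sequence.

    M holds indicator columns for every trial but the last, N for every trial
    but the first. Row 0 marks response 1 and row 1 marks response 0.
    """
    prev, nxt = seq[:-1], seq[1:]
    M = [[1.0 if s == 1 else 0.0 for s in prev],
         [1.0 if s == 0 else 0.0 for s in prev]]
    N = [[1.0 if s == 1 else 0.0 for s in nxt],
         [1.0 if s == 0 else 0.0 for s in nxt]]
    cols = range(len(prev))
    NMt = [[sum(N[i][t] * M[j][t] for t in cols) for j in range(2)] for i in range(2)]
    MMt = [[sum(M[i][t] * M[j][t] for t in cols) for j in range(2)] for i in range(2)]
    det = MMt[0][0] * MMt[1][1] - MMt[0][1] * MMt[1][0]
    inv = [[MMt[1][1] / det, -MMt[0][1] / det],
           [-MMt[1][0] / det, MMt[0][0] / det]]
    return [[sum(NMt[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    # Hypothetical sequence of responses on eleven trials.
    print(estimate_T([0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0]))
```

For indicator data of this kind $MM'$ is diagonal, so each estimated entry is simply the number of times one response followed another divided by the number of times the earlier response occurred.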
fit of this
IN
to the observed
function,p^"'(i2),
fitfor the transformation

From
7^,^
Eq. (21)we
can
calculate the corrections that
added
are
to A^:
.708
.708
.814
.814
.814
.761
.814
.867
(.345
.239
.292
.292
.186
.186
.186
.239
.186
.133
.814
.761
.814
.867
.920
.920
.814
.814
.867)
.186
.239
.186
.133
.080
.080
.186
.186
.133)
.114
-.039
.045
.086
.108
.161
-.092
-.108
-.161
.067
-.086
.092
.114
-.067
.014
-.039
-.014
-.014
-.114
-.086
-.133
-.080
.039
-.114
.014
.086
.014
-.120
squared deviations
are
of the
-.086
-.014
.086
-.033/
.033)
givenby
(-.144
best estimate
.039
.080
.133
.120
The
least-squares
(.655.761
(-.045
The
have
data; we
T.
.144)
dispersionof the calculated from
the observed
values is
""'
=
/
In
"
matrix
The.variance-covariance
V
From
Eq. (25) we
p^{A) and VeiB):
compute
same
procedurecan
-092.
(24)
-^H-
(25)
17
'"''
1-2.91
the standard
data matrices M
\fW
V is givenby
a[p,{B)]
The
1
1
AMMr'=^\
c[pM)]
The
"
be
.092
J~~
.092
^^
11.99)
deviations
of the estimates
.04
.132.
appliedto the data from
and A'^then have either 0
of
or
on
singleanimal.
trials;
e.g.,
successive
GEORGE
^^jlOl
441
MILLER
A.
11001011
(01000110100
(O
N
(10
In order
110
to solve for T
10
determine
we
h^^'^^"^^^M, MM'
NM'=
(m(l,0)m(0,0))
The
symbol m{i,j) represents
i^j;m{i)
represents the
1, where
"
of
the number
of
number
of i; and
m(0)
invert MM'
and
occurrences
of trials. Next
is the number
of the ordered
occurrences
we
(mil,l)m(0,l)^
(_}_
NM'iMM'Y"
solve for T:
^
\
\hri{l)
pair
m{l)
(m(l,0)
m(0,0);
^)
(26)
m(0,l)\
/m(M)
m(l)
(^
m(0)
m(0,0) i
)m(l,0)
is the
Eq. (26)
transitional
would
that
result
m(0)
m(l)
be
from
expected
to
estimate
the
fM
m(0,l)
) "^^^^ "^^^^
m(0,0)
)m(l,0)
m(l,l)\
m(l,l)
"^^^^
"^^^^
m{\,G)i
m(l,0)
TM-N
'
"
m(l)
m(l)
-m(0,0)
m(l,l)
) "^^^^ "^^^^
)-m(l,l)m(0,0)
-m(l,0)
"^^^^
-m(l,0)\
'
squared
deviations
m(0)
mil)
are
given by
CC
"
'
"^^^^
V
m{\,0)i
m(l,0)
'
The
find
Eq. (21) we
/
m(0)
m{\)
'.
"
'
of the
calculate
dispersionwe
/m(l,l)
from
definition
the
probabilities.
In order
Then
m(l)
'
"
m(l)
where
4"^]
m(l
[m(l,0) +
The
m(l)
m(l
M^^î
M'm]
[m(l,l)m(l,0)']
+ [m(0,0) +
î
'
m(l)
w(l)
w(0)
m(0)
m(l,0)1 ".^Jm(0,l)
m(0,0)1
[^(1,1)
+ m(0)
L
"
m(l)
m(l)
"^
'
^1
m(0)
m(0)
J'
dispersionis,therefore,
m(l)
\n
"
"
m(l,l)
m(l,0)
w(l)
m(l)
"
r m((
(0)
+
The
"m(0,l)m(0,0)1
m(0,l)]
variance-covariance
matrix
"
Ln
"
m(0,l)
m(0)
"
m(0,0)iy^
(27)
m(0)
Jj
is

$$
V = \sigma^2 (MM')^{-1} = \sigma^2
\begin{pmatrix} \dfrac{1}{m(1)} & 0 \\ 0 & \dfrac{1}{m(0)} \end{pmatrix},
$$

and from this matrix we compute

$$
\sigma[p_A(A)] = \frac{\sigma}{\sqrt{m(1)}} \quad \text{and} \quad \sigma[p_B(B)] = \frac{\sigma}{\sqrt{m(0)}} .
\tag{28}
$$

Although these examples are worked out for the Markov case with two alternatives, the same procedures can be used with more than two alternatives or with Markov processes defined for compound responses. It should be stressed, however, that the statistical properties of Markov chains are neither simple nor well understood. Better techniques will undoubtedly develop as the Markov process becomes more widely applied.
6. Variable Transformations. Up to this point we have made the explicit assumption that a single transformation could describe the successive changes in the probabilities of the alternative responses or alternative sequences of responses. This assumption greatly simplifies the theoretical landscape and should be made whenever the data hint that it might be true. Simplicity is not, however, an intrinsic property of the behavior of living organisms, and so we must be prepared to deal with situations that obviously violate the assumption.

The assumption that a single transformation is adequate means that the transitional probabilities are fixed from the first through the last trial. Since the transitional probabilities determine the sequences of responses that are probable or improbable, we are assuming that the animal's course of action or strategy is fixed throughout the experiment. In a certain sense, therefore, the assumption means that there is no learning at all; as soon as such an experimental situation is encountered for the first time, the subject adopts the set of transitional probabilities that will later describe the statistical properties of his behavior after he has had long experience in the situation.
The
would
be justified,
for exassumption of a singletransformation
ample,
after a long series of alternate conditioningand extinction. In this
for the reexperiment the subject is able to evolve a singletransformation
inforcement
The
conditions
animal
has
adopted
distracted
temporarily
is removed
But
of the
in most
priori reason
there
several
are
In
to
order
to
and
stable
in
situations
to
reasons
consider
The
same,
and
from
follow
situation
normal
to
Or
and
when
if
an
then
the
is
pediment
im-
singletransformation.
that are
studied experimentally there is no
a
will be adequate, and
singletransformation
illustrate what
data
expected
to
expect that it will
has
in
his return
conditions.
is involved
been
another
10 rats
the
to
assumption
show
one
assumption
consecutive
20
on
be.
in the
prepared
where
not
where
case
is wrong.
choices
of
in
Once
a
single
the
more
T-maze.
choice,and 0 represents an incorrect choice.

lA and IB the numbers
of rats making the correct
choice are
the
the
both are the same
in
the
fitted
section.
as
example
preceding
symbol
In Tables
the
extinction
of behavior
way,
be
expect that
for the
mode
some
might
transformation, Table I
assumption is correct and
we
another
represents a
correct
TABLE
Hypothetical Data
for Ten
Rats
1
on
Twenty
Trials in
T-Maze
TABLE
(Continued)
Variable Transformation
IB.
Trial
Rat
12
successive
lA
There
to be
seems
secure
no
trend
more
trials by
10
we
can
11
12
13
14
15
16
17
18
19
20
10
10
clear trend
IB
in IB
for
Po(0)
Pi(l)
Trial
pi(l)to
pi(l) is observable in
reliable estimates,we get
for
Pi(l)and Po(0)
of
the values
estimate
[m(i,j)]/m{i):
Po(0)
Pi(l)
whereas
trials,
to
in Table
pairsof
Trial
by fives
the data
From
on
increase
lA. If
we
on
group
successive
the trials
GEORGE
Comparisons
transformation
lA
and
IB
such
these
as
that
show
identical in this respect. The
are
analysis of short
constant
distributions alone, for
assumption
is
if
justified
relativelyconstant
of trials shows
sequences
of
assumption
the
by the successive
be checked
cannot
445
MILLER
A.
the
transitional
definite trend,
in lA. If the transitional frequenciesshow
as
a
frequencies,
in
the
as
IB,
assumption is not justified.
The question is what to do when we face variable transformations. Whatever we do, the situation will not be simple. PQRST cannot be translated into TTTTT, and if we do the matrix products they may get quite complex. If we could choose P, Q, R, S, T as commutative matrices, it would be possible to find a simultaneous solution for all of them; all matrices would have the same characteristic vectors but different characteristic roots. Unfortunately, however, it does not seem possible in general to choose commutative matrices with the properties demanded by the data.

If the complexity of the problem is admitted as inevitable, we can still look for a matrix function of $n$, $T(n)$, that changes in some reasonable way on successive trials. The following argument illustrates one possible approach. We assume that at the beginning of the experiment the subjects are equipped with transitional preferences given by the matrix $U$. After long experience in the situation the subjects develop transitional preferences given by the matrix $V$. As the experiment progresses the tendencies represented by $U$ are slowly extinguished and those represented by $V$ are slowly strengthened. Consider the following sequence of equations:

$$
\begin{aligned}
T(0) &= U \\
T(1) &= wT(0) + (1 - w)V \\
T(2) &= wT(1) + (1 - w)V \\
&\;\;\vdots \\
T(n) &= wT(n - 1) + (1 - w)V,
\end{aligned}
\tag{29}
$$

where $0 < w < 1$. The rationale for this set of equations is that $w$ represents the perseveration of the tendencies on the preceding trial, and $(1 - w)$ represents the ability to adopt the new mode of response symbolized by $V$. If the extinction of the old pattern of responses is slow, $w$ is near unity; if the old pattern extinguishes rapidly, $w$ is near zero. Eq. (29) can be written in terms of $U$ and $V$:

$$
\begin{aligned}
T(0) &= U = w^0(U - V) + V \\
T(1) &= wU + (1 - w)V = w(U - V) + V \\
T(2) &= w^2U + (1 - w^2)V = w^2(U - V) + V \\
&\;\;\vdots \\
T(n) &= w^nU + (1 - w^n)V = w^n(U - V) + V.
\end{aligned}
\tag{30}
$$
In this form it is clear that, since $0 < w < 1$, $T(n)$ approaches $V$ as $n$ increases. The importance of $U$ becomes progressively smaller as the subject has more and more experience in the experimental situation. This formulation has the advantage that it is relatively easy to compute the successive values of $T(n)$, given $U$ and $V$. The initial and final matrices, $U$ and $V$, can be given theoretically or can be determined from data obtained prior to the first trial and after the learned behavior has stabilized again in the new course of action.

For illustrative purposes, assume that

$$
U = \begin{pmatrix} .5 & .5 \\ .5 & .5 \end{pmatrix}
\quad \text{and} \quad
V = \begin{pmatrix} .9 & .4 \\ .1 & .6 \end{pmatrix}
$$

are known and that the weight $w$ is calculated to be 0.8. Then Eq. (30) gives on successive learning trials:

n:        0    1    2     3     4     5     6     7     8     9     10   . . .
p_A(A):   .5   .58  .644  .695  .736  .768  .796  .816  .832  .846  .857
p_B(B):   .5   .52  .536  .549  .559  .567  .574  .579  .583  .587  .589
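The tabulated values can be recomputed from Eqs. (29) and (30). The sketch below assumes the matrices $U$ and $V$ and the weight $w = 0.8$ of the illustration; the function names are hypothetical. Its output should agree with the table to within about .001.

```python
U = [[0.5, 0.5], [0.5, 0.5]]
V = [[0.9, 0.4], [0.1, 0.6]]
W = 0.8


def T_recursive(n, U=U, V=V, w=W):
    """Apply T(i) = w T(i-1) + (1 - w) V starting from T(0) = U, as in Eq. (29)."""
    T = [row[:] for row in U]
    for _ in range(n):
        T = [[w * T[r][c] + (1.0 - w) * V[r][c] for c in range(2)] for r in range(2)]
    return T


def T_closed(n, U=U, V=V, w=W):
    """Closed form T(n) = w^n (U - V) + V of Eq. (30)."""
    return [[w ** n * (U[r][c] - V[r][c]) + V[r][c] for c in range(2)] for r in range(2)]


if __name__ == "__main__":
    for n in range(11):
        T = T_closed(n)
        # T[0][0] is p_A(A) and T[1][1] is p_B(B).
        print(n, round(T[0][0], 3), round(T[1][1], 3))
```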
Next we calculate the proportions of right and wrong responses on successive trials. This is given by the equation:

$$
\begin{aligned}
d_1 &= T(0)d_0 \\
d_2 &= T(1)d_1 = T(1)T(0)d_0 \\
d_3 &= T(2)d_2 = T(2)T(1)T(0)d_0 \\
&\;\;\vdots \\
d_{n+1} &= T(n)d_n = \prod_{i=0}^{n} T(i)\, d_0 .
\end{aligned}
\tag{31}
$$

It is assumed that $U$ and $d_0$ are known from preliminary experimentation. Assume the boundary condition $d_0' = (.5, .5)$. Then direct computation gives the values

p(R):   .5   .53  .559  .587  .614  .639  .662  .683  .700  .716  . . .  .800

Considerable care must be taken with such iterated computation, for the errors are cumulative. It should be noted that if $w = 0$, the variable case reduces to the constant case.
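The iteration of Eq. (31) can be sketched in the same way. The values of $U$, $V$, $w$, and $d_0$ are those of the illustration; the names `T_of` and `distributions` are hypothetical.

```python
U = [[0.5, 0.5], [0.5, 0.5]]
V = [[0.9, 0.4], [0.1, 0.6]]
W = 0.8


def T_of(n):
    """T(n) = w^n (U - V) + V, the closed form of Eq. (30)."""
    return [[W ** n * (U[r][c] - V[r][c]) + V[r][c] for c in range(2)] for r in range(2)]


def distributions(d0, n_trials):
    """Return d_0, d_1, ..., using d_{n+1} = T(n) d_n as in Eq. (31)."""
    out = [list(d0)]
    for n in range(n_trials):
        T, d = T_of(n), out[-1]
        out.append([T[0][0] * d[0] + T[0][1] * d[1],
                    T[1][0] * d[0] + T[1][1] * d[1]])
    return out


if __name__ == "__main__":
    for n, d in enumerate(distributions([0.5, 0.5], 12)):
        print(n, round(d[0], 3))
```

Because each $d_{n+1}$ is built from the previous one, rounding at any step is carried forward, which is the cumulative error the text warns about; the sketch keeps full floating-point precision and rounds only when printing.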
ON THE MAXIMUM LIKELIHOOD ESTIMATE OF THE SHANNON-WIENER MEASURE OF INFORMATION

George A. Miller and William G. Madow

The limiting form and the first two asymptotic moments of the sampling distribution of the maximum likelihood estimate of the Shannon-Wiener measure of amount of information per observation drawn from a multinomial distribution are determined. Also, approximations to the bias and mean square error of the estimate are given.
This article is from the Operational Applications Laboratory, Air Force Cambridge Research Center, Air Research and Development Command, Bolling Air Force Base, 1954, AFCRC-TR-54-75, contract AF 18(600)-322. Reprinted with permission.

Preface

The statistic defined by Shannon (3) and by Wiener (4) to measure the amount of information in an event drawn from a multinomial distribution has been adopted by psychologists to measure certain aspects of stimulus and response events in psychological experiments (2). In these applications, however, the psychologist is usually forced to work with relatively small samples and the sampling distribution of the measure becomes of real interest. In the present paper the asymptotic distribution and the first two moments are derived and the bias of the statistic for small samples is explored.

1. Limiting Distribution of the Maximum Likelihood Estimate of Amount of Information

If an experiment or operation has $k$ possible results, the $i$th of which has probability $p_i > 0$, $i = 1, \ldots, k$, the Shannon-Wiener measure of the amount of information per performance of this experiment or operation is

$$H = -\sum_{i=1}^{k} p_i \log_2 p_i .$$

We propose to consider the properties of the maximum likelihood estimate $H'$ of $H$ obtained from $n$ independent performances of the operation. Since $H$ is a continuous and differentiable function of the $p_i$ for all positive values of the probabilities, it follows that the maximum likelihood estimate, $H'$, is

$$H' = -\sum_{i=1}^{k} \frac{n_i}{n} \log_2 \frac{n_i}{n},$$

where $n_i$ is the frequency with which the $i$th of the $k$ possible outcomes occurs in the $n$ performances, and where, if $n_i = 0$ for one or more values of $i$, we define the corresponding terms $(n_i/n) \log_2 (n_i/n)$ of $H'$ to be 0.
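The two definitions can be illustrated directly. In this sketch the function names are hypothetical, and terms with $n_i = 0$ are taken as 0, following the convention stated for $H'$.

```python
import math


def shannon_entropy(probs):
    """H = -sum p_i log2 p_i, with zero-probability terms contributing 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def ml_entropy(counts):
    """Maximum likelihood estimate H' obtained by substituting n_i / n for p_i."""
    n = sum(counts)
    return shannon_entropy([c / n for c in counts])


if __name__ == "__main__":
    print(shannon_entropy([0.5, 0.5]))  # one bit for two equally likely outcomes
    print(ml_entropy([3, 1, 0]))        # the empty category contributes nothing
```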
We will now show: (a) if the $p_i$ are not all equal, then $H'$ has a normal limiting distribution; and (b) if $p_i = 1/k$, $i = 1, \ldots, k$, then $H'$ has a chi-square limiting distribution with $k - 1$ degrees of freedom. As a preliminary, we obtain a form for $H - H'$ that further simplifies the calculations.

Lemma. The difference $H - H'$ is given by the following equations: Let

$$
U_n = \sum_{i=1}^{k} \frac{n_i}{n} \log_2 \frac{n_i}{np_i}
\quad \text{and} \quad
V_n = \sum_{i=1}^{k} \left( \frac{n_i}{n} - p_i \right) \log_2 p_i .
\tag{1}
$$

Then

$$H - H' = U_n + V_n,$$

where $p_i > 0$, $i = 1, \ldots, k$. Terms in $U_n$ that have $n_i = 0$ are defined to vanish, but terms in $V_n$ that have $n_i = 0$ still yield $-p_i \log_2 p_i$.

Proof. All we need to do is expand $H'$. By simple substitutions we can verify that

$$U_n = \sum_i \frac{n_i}{n} \log_2 \frac{n_i}{n} - \sum_i \frac{n_i}{n} \log_2 p_i = -H' - \sum_i \frac{n_i}{n} \log_2 p_i$$

and

$$V_n = \sum_i \frac{n_i}{n} \log_2 p_i - \sum_i p_i \log_2 p_i = \sum_i \frac{n_i}{n} \log_2 p_i + H,$$

so that if we combine the values of $U_n$ and $V_n$, we verify that $H - H' = U_n + V_n$, as follows from the definitions of $H$ and $H'$ as stated.

Theorem 1. a. If the $p_i$ are not all equal, then $\sqrt{n}(H - H')$ has a normal limiting distribution with mean 0 and variance

$$\sigma^2 = \sum_i p_i (\log_2 p_i)^2 - H^2 .$$

b. If $p_i = 1/k$, $i = 1, \ldots, k$, then $(2n/\log_2 e)(H - H')$ has a chi-square limiting distribution with $k - 1$ degrees of freedom.

Proof. We first note that the first part of Theorem 1 holds for maximum likelihood estimates in any case, since maximum likelihood estimates are almost without exception asymptotically normal and efficient (e.g. [1], p. 500). Also, most of the calculations made if the $p_i$ are not all equal would then be needed for the asymptotic moments. We will prove both parts of the theorem.

Because of the preceding lemma, the problem of evaluating the limiting distribution of $\sqrt{n}(H - H')$ can be replaced by the equivalent problem of evaluating $\sqrt{n}\,U_n$ and $\sqrt{n}\,V_n$. The random variables $\sqrt{n}\,(n_i/n - p_i)$, $i = 1, \ldots, k$, have a $(k - 1)$-variate normal limiting distribution with mean values 0, variances $p_i q_i$ ($q_i = 1 - p_i$), and covariances $-p_i p_j$, $i, j = 1, \ldots, k$ ($i \neq j$). Since the $\log_2 p_i$ are constant weights applied to these random variables, it is clear that $\sqrt{n}\,V_n$ is a linear combination of the random variables. Therefore, $\sqrt{n}\,V_n$ has a normal limiting distribution with mean value

$$\sqrt{n}\,EV_n = \sqrt{n} \sum_i \log_2 p_i \, E\left( \frac{n_i}{n} - p_i \right) = 0,$$

and variance

$$
\begin{aligned}
\sigma^2 &= \sum_i (\log_2 p_i)^2 \operatorname{Var}\left[ \sqrt{n}\left( \frac{n_i}{n} - p_i \right) \right]
+ \sum_{i \neq j} (\log_2 p_i)(\log_2 p_j) \operatorname{Cov}\left[ \sqrt{n}\left( \frac{n_i}{n} - p_i \right), \sqrt{n}\left( \frac{n_j}{n} - p_j \right) \right] \\
&= \sum_i p_i q_i (\log_2 p_i)^2 - \sum_{i \neq j} p_i p_j (\log_2 p_i)(\log_2 p_j) \\
&= \sum_i p_i (\log_2 p_i)^2 - \Big( \sum_i p_i \log_2 p_i \Big)^2 \\
&= \sum_i p_i (\log_2 p_i)^2 - H^2 .
\end{aligned}
$$

We next show that $\sqrt{n}\,U_n$ converges in probability to zero as $n$ increases, and that $2nU_n/\log_2 e$ has a chi-square limiting distribution with $k - 1$ degrees of freedom. Let us define

$$x_i = \frac{n_i - np_i}{np_i}, \quad i = 1, \ldots, k.$$
Then
if 1
npi
2 Ml
and
1, it follows
since /7j "

A.2*
Lemma
and
that
Xi "
^t)log2(1
.-c.^=
(2)
1.
"
Hence
we
apply
can
obtain
we
Un
l/np^"
1 +
"
î),
"i
log:
4r i-'^y iî
"Pi
=1
np"i
Pi
f^P
K^
1) \
^;+i
(3)
"P
where
j+1
"/'i
\ni
Pi
""^"^^-JiyxTn)
"/'i
npi\J+^
(npiY
ijXj+l)
since
Furthermore,
2
i
("t
npi)
0,
have
we
(4)
log2^
'2v{v
(npi)
l)iî
where
*
J^'Ui"l
1) iti
y(y +
It follows
rtj
in the
from
(2) that
do
we
|",
not
"/7,|^+l
(npiY-^
need
of terms
specialtreatment
any
(3), since
the
with
of "j
a
as
approximations to t/" yieldedby
appearance
of (2) to vanish
the corresponding term
when
of terms
it possiblefor the re0 has made
elimination
mainder
involving",
did not requirerif logg"; to vanish
when
to be bounded, for if we
0,
"j
will automaticallycause
multiplier
Hi
"
The
0.
terms
it would
follow
that
Furthermore,
there
from
would
be
that
positive
probability
and
B.l
Lemma
C.2
Lemma
it
1
"
Pr(R';+,
e)=0
be
can
would
H'
seen
be
minate.
indeter-
that
1
^
,2;-2-j-l /
\ fjj-3
and
ER''
where
O,
rj
Actually,it is
easy to
(-iv+^
î+l
from
see
4, (",
(y + i)/\=i
The
Appendix
A.
letter "A"
in
(4) that
we
"/;,v+i
itjis odd
"
ifyis even.
have
(-1P+2
(".
(j + 2)(j+l)if,
(my
"Lemma
"
A.2"
indicates
that
this
np,y+^
{npd^
lemma
will
be
found
in
452
and
symbolically,
hence,
^;^:=o(jzj)
o(i)o(;l5)=o(;lj).
Eq. (5) shows that
unnecessarily
large,but
0
converge
fast
as
that
0(l/"-'"^)
the upper
bound
of
the
device
is sufficient to
Thus
to
(5)
above
have
we
that
prove
found
is
R'j^^
ER'I^^
for
and
R'^+i
as
(npiy
i=i
and
ît'i (my
Now the first term of $2nU_n/\log_2 e$ is

$$\sum_{i=1}^{k} \frac{(n_i - np_i)^2}{np_i},$$

which is well known to have a chi-square limiting distribution with $k - 1$ degrees of freedom, whereas all other terms of $2nU_n/\log_2 e$ converge in probability to zero by Lemma C.2. Hence $2nU_n/\log_2 e$ has a limiting chi-square distribution. On the other hand, since $\sqrt{n}\,U_n$ is the product of $(\log_2 e)/(2\sqrt{n})$ and a random variable that has a limiting distribution, it follows that $\sqrt{n}\,U_n$ converges in probability to 0. Hence $\sqrt{n}(H - H')$ is the sum of two random variables, one of which converges in probability to zero, whereas the other has a normal limiting distribution. Thus, if the $p_i$ are not all equal, $\sqrt{n}(H - H')$ has the same limiting distribution as $\sqrt{n}\,V_n$. On the other hand, if the $p_i$ are all equal, then $V_n = 0$ and $[2n/(\log_2 e)](H - H')$ has the same limiting distribution as $[2n/(\log_2 e)]U_n$, namely, chi-square with $k - 1$ degrees of freedom.
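2. The Limiting First Moment of $H - H'$

By (1),

$$E(H - H') = EU_n + EV_n .$$

Since $EV_n = 0$, it follows that $EU_n$ is the bias of $H'$. In order to evaluate this bias we now approximate $EU_n$. From (4) we have

$$
\frac{U_n}{\log_2 e} = \frac{1}{2n} \sum_i \frac{(n_i - np_i)^2}{np_i}
- \frac{1}{6n} \sum_i \frac{(n_i - np_i)^3}{(np_i)^2}
+ \frac{1}{12n} \sum_i \frac{(n_i - np_i)^4}{(np_i)^3}
- \frac{1}{20n} \sum_i \frac{(n_i - np_i)^5}{(np_i)^4} + \cdots .
$$

From Lemma B.1 we see that

$$
\frac{EU_n}{\log_2 e} = \frac{1}{2n} \sum_i \frac{np_iq_i}{np_i}
- \frac{1}{6n} \sum_i \frac{np_iq_i(q_i - p_i)}{(np_i)^2}
+ \frac{1}{12n} \sum_i \frac{3n^2p_i^2q_i^2 + np_iq_i(1 - 6p_iq_i)}{(np_i)^3}
- \frac{1}{20n} \sum_i \frac{10n^2p_i^2q_i^2(q_i - p_i) + np_iq_i(q_i - p_i)(1 - 12p_iq_i)}{(np_i)^4} + \cdots ,
$$

or, combining terms, we have

$$
EU_n = \log_2 e \left[ \frac{k - 1}{2n} + \frac{1}{12n^2} \Big( \sum_i \frac{1}{p_i} - 1 \Big) \right] + O\!\left( \frac{1}{n^3} \right).
$$

Thus, we have proved the following theorem.

Theorem 2. Under the stated conditions

$$
H - EH' = \log_2 e \left[ \frac{k - 1}{2n} + \frac{1}{12n^2} \Big( \sum_i \frac{1}{p_i} - 1 \Big) \right] + O\!\left( \frac{1}{n^3} \right).
$$

Furthermore, if we let

$$H'' = H' + (\log_2 e)\, \frac{k - 1}{2n}$$

and let

$$H''' = H'' + (\log_2 e)\, \frac{1}{12n^2} \Big( \sum_i \frac{1}{p_i} - 1 \Big),$$

then

$$EH' - H = O(1/n), \quad EH'' - H = O(1/n^2), \quad EH''' - H = O(1/n^3).$$

Hence, $H''$ is an estimate of $H$ whose bias is of order $1/n^2$, and $H'''$ is an estimate of $H$ whose bias is of order $1/n^3$.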
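The size of the bias in Theorem 2 can be checked numerically for the binomial case ($k = 2$), where the expected value of $H'$ can be computed exactly by summing over all possible counts. This is a sketch under that assumption; the function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math


def ml_entropy(counts):
    """H' with the convention that empty categories contribute 0."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def exact_bias_binomial(n, p):
    """H - E(H') for k = 2, summing over every possible count x."""
    q = 1.0 - p
    H = -(p * math.log2(p) + q * math.log2(q))
    expected = sum(math.comb(n, x) * p ** x * q ** (n - x) * ml_entropy([x, n - x])
                   for x in range(n + 1))
    return H - expected


def approx_bias(n, probs, order=2):
    """Leading terms of Theorem 2: order 1 keeps (k-1)/2n, order 2 adds the 1/n^2 term."""
    k = len(probs)
    total = (k - 1) / (2 * n)
    if order >= 2:
        total += (sum(1.0 / p for p in probs) - 1.0) / (12 * n * n)
    return math.log2(math.e) * total


if __name__ == "__main__":
    n, p = 20, 0.5
    print(exact_bias_binomial(n, p))
    print(approx_bias(n, [p, 1 - p], order=1))
    print(approx_bias(n, [p, 1 - p], order=2))
```

For a sample of this size the two-term approximation should lie closer to the exact bias than the one-term approximation does, which is what the ordering $EH'' - H = O(1/n^2)$ versus $EH''' - H = O(1/n^3)$ suggests.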
Theorem
H"
has
2 enables
us
n"^,namely, (/c
of order
bias of lower
are
{(Îjpd l]/12rt^
"
of order
of course,
""^ for all

so
that
make
order
both
than
or
H'
not
depend
for all values
on
EH"
of the /?,. (Terms
may
be
greater than
about
the bias:
(1)the
term
the
of the
H'
quantities,
positive
possiblevalues
EH"
several observations
does
l)/2rt,
"
to
probabilities
/?, and hence
Since
{k
\)j2nand
/?,;.(2)
is biased
"
downward
of
higherorder
for small
to
even
may
values
be
of
terms
negative,
n.) (3) An
TABLE
when
Case
increase
omits
in bias
pi
results
H', H", and
of the Estimators
ExpectedValues
=
and
0.50
if one
(4) When
(L\jpi)ll2n^.
uses
when
[(k
pi
all the pi
"
are
/k
equal,H"
for the
H'
Binomial
Sample Sizesupto
1/12"^
l)/2"]
"
for
0.05
as
an
overall
20
correction
and
becomes
k^ -l\
-I
^'4-log,e^^^+-^^j,
which
is
lower
In order
case,
we
state
illustrate the
to
the
to
terms
Ifk
of order
=
(that is
use
of the
say, if the pi
to
bias
corrections
are
of
" k^).
unequal,21//?^
Theorem
2 for a simple
:
following
Corollary.
H
for H'"
bound
2 and
For
the binomial
case,
2,
we
estimates
obtain the following
"~^:
Pi
0.5, then
//"'=//-+
e)(i+^).
(log,
of
PSYCHOLOGY
Hence
B. 1 it is clear that
1
Ei,ni-np,r
(7)
to
all terms
retain
consider
need
we
shall want
We
(-ir
+ /f
10g2/7,:
logPi
Pi
logaA
onlyterms
^n^pW
npiqiiqi p,)
to
4.
Hence,
"/^/^Xl 6/?,^,)
lower.
(7) in order (!/")or
of
we
From
Lemma
beginby evaluating
XOn^plq^qjpd
_
_
'~
2
omit
the
where
we
order
l/"^.Then
second
in (7) we
By substituting
of "j since it will
yielda
term
of
+4-(^^^-^^^+4^-
obtain
P. (log./.,
,.2
+
"="?
12
of the fifth moment
term
^3^
n^pl
npi
")
g;;
^4iaog.,,^)+|-i!2iiiî"^,
+
(8)
since
2 A0og2/7i
i
c.
first three
the
details
(nj
are
npif
(";"
-
6jfi
npi
very tedious,
they have
"/?,)'^
J_
(tti
-
(^/î)'
(/7/7,)2
been
put
in
npif
Appendix
D.
Here
result.
Theorem
3.
terms
Including
nUr,^
k^-\
Vlogae/
and
?7"givenby (4)will be used,namely,
approximationto
2ifi
loga^
the
of the
terms
nUn
Since
=0.
+if)
of EU^
Evaluation
The
ifk
=2,
p^
=}/2
we
of order \jn we
Ik-WjL
have
\
ii^pi
12"
have
"(:^T-?+'
logaej
An
91c'
10k
Un
we
state
GEORGE
Finally,from
(6),(8), and
4.
Theorem
E{H
MILLER
AND
Theorem
Hf
114,
(log2e)2
/I
equal,then
are
H'f
1)
(log2e)2(A:2
^'
i\og,ef
E{H
^î(log2/7,
H)
i\og,e)W
21ogae|^log2/7,
(lopêflk
the Pi
457
MADOW
obtain
ifall
G.
In (general
6"
but
WILLIAM
3, we
=\tpiaog2/',
H'f
A.
'2
^
=
.T
("7^
ll)'^'
(logse)^
Furthermore,
and
ifk=2
p^
Theorem
For
any
random
have
we
variable
H'
3(log2e)7"+ 1\
where
o\' is
the
variance
{EH'
"
is
H)
the
H'f
where
E(H
the
mean
E(H'
Theorem
given by
2,
By
Then
we
way
have
mean
the
H.
Hf
EH'f.
we
error
E{H
square
error
approximate a^- by using
can
4.
\/
1
+
1 2"^
4n^
H'f
"
is the
of illustration,consider
3
2
(log2ef
square
estimatingH, the
about
H'
of H', i.e.
H'f
of
error
square
g\, + {EH'
'{k -if
a^,
mean
have
we
a\.
Since
^/l
,
approximated
E(H
3^, //zen
In
will
binomial
Theorem
4.
For
quantity.
where
case
from
fundamental
more
the
^'f
1 p
obtained
be
0,
2 and
p^
0.5.
approximation
(logaef/n
(loggef
1\
1\
(n +
{ôg^/n
^'
4/72
Appendix
We
(1
begin with
an
A.
4"2
Expansion for (1
An
expansionof log (1
2n^
a;)log (1 + x)
and
x)
(-1)^-1
then
derive
the
expansion
of
x)\og{\ +x).
Lemma
A. 1.
"1
Let
log (1
^)
Then
"Xq"x.
a;
"
"
"
i?,.+i,
(A.l)
PSYCHOLOGY
where
and hence, ifx^ "
while,ifXq
"
0,
1^13+1
l","l
Proof.
and, if we
If
1 "
expand 1/(1 + t),we
Hxq
(A.l) and (A.2) hold.

"
Lemma
=2
Then
(A.4) follows from

A.2.
(A.4)
obtain
0 and
(-^)
x, then
logo +:r)
Thus
"
Let
(-1)^-1
the fact that
c//.
"
'
the fact that
1/(1 + ?) "
1/(1 + t) " 1/(1+ x^)
0 if a;,,"
0.
Then
"x.
(1 +x)\og{\
"
Jo
(A.3) follows from
1 "Xq
"
+(-iy
-vj^7-"^
+x)=x
R]+v
^-"dudt,
) 1
fl"c/hence, ifx^ "
(A.6)
0, ?Ae"
w/f//^,
//"Q " 0, then
Proof.
If
"
1 "x,
then
1 +
Jo
and
I +t
also,integrating
by parts,
I Y^t
"^^
so
^
^
dt
^^^ "^ ^^^"^^^
"^
'^^"
"
f'^"^
^^
"^
'^'^^'
that
(1
From
Lemma
a;)log(1
a;)
A.l, it follows that if a;
Jo
logd
+t)dt=2
log(1
t)dt.
"1, then
"
7-^-+
0
1"
-
i=2
(-1)'
jo
(A.5)
IJ'
""dudt,
jo
1 +
"
GEORGE
and
hence
and
(A.5)
A.
MILLER
(A. 6) hold.
AND
If $x_0 < 0$ then, from (A.3), it follows that the remainder is bounded as stated in (A.7), so that (A.7) is proved. Then (A.8) follows in similar fashion from (A.4).

Appendix B. Multinomial Moments

Let an operation having $k$ possible outcomes be independently performed $n$ times and let $n_i$ be the number of occurrences of the $i$th of the possible outcomes in the $n$ performances. Then,

$$\frac{n!}{n_1!\, n_2! \cdots n_k!}\; p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$$

is the probability of obtaining any specified values of $n_1, \ldots, n_k$, where $n_i \geq 0$, $n_1 + \cdots + n_k = n$, $p_1 + \cdots + p_k = 1$, and $p_i$ is the probability of the occurrence of the $i$th of the possible outcomes in each operation, $i = 1, \ldots, k$. Then, it is possible, by easy but tedious calculations, to prove the following lemma.

Lemma B.1. The first six moments of $n_i$ are given by the following equations (where $q_i = 1 - p_i$).

$$
\begin{aligned}
En_i &= np_i \\
E(n_i - np_i)^2 &= np_iq_i \\
E(n_i - np_i)^3 &= np_iq_i(q_i - p_i) \\
E(n_i - np_i)^4 &= 3n^2p_i^2q_i^2 + np_iq_i(1 - 6p_iq_i) \\
E(n_i - np_i)^5 &= 10n^2p_i^2q_i^2(q_i - p_i) + np_iq_i(q_i - p_i)(1 - 12p_iq_i) \\
E(n_i - np_i)^6 &= 15n^3p_i^3q_i^3 + 5n^2p_i^2q_i^2[5 - 26p_iq_i] + np_iq_i[1 - 30p_iq_i + 120p_i^2q_i^2].
\end{aligned}
$$

In general, if $m$ is an integer, then

$$E(n_i - np_i)^{2m} = O(n^m)$$

and

$$E(n_i - np_i)^{2m+1} = O(n^m).$$

The proof of Lemma B.1 is omitted.

We shall need not only the moments of $n_i$ about its mean but also certain of the product moments of $(n_i - np_i)$ and $(n_j - np_j)$. The following lemma will be helpful in deriving these moments. Its usefulness results from the fact that the conditional moments needed will be easily obtainable from Lemma B.1.
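Since the proof of Lemma B.1 is omitted, a numerical check may be useful. The sketch below compares the stated formulas with central moments computed exactly from the binomial distribution of a single $n_i$; the function names and the test values $n = 7$, $p = 3/10$ are arbitrary assumptions.

```python
from fractions import Fraction
from math import comb


def central_moment(n, p, r):
    """Exact r-th central moment of a binomial(n, p) count, using rationals."""
    q = 1 - p
    mean = n * p
    return sum(comb(n, x) * p ** x * q ** (n - x) * (x - mean) ** r
               for x in range(n + 1))


def lemma_b1(n, p):
    """Central moments of orders 2 to 6 as stated in Lemma B.1."""
    q = 1 - p
    pq = p * q
    return {
        2: n * pq,
        3: n * pq * (q - p),
        4: 3 * n ** 2 * pq ** 2 + n * pq * (1 - 6 * pq),
        5: 10 * n ** 2 * pq ** 2 * (q - p) + n * pq * (q - p) * (1 - 12 * pq),
        6: (15 * n ** 3 * pq ** 3 + 5 * n ** 2 * pq ** 2 * (5 - 26 * pq)
            + n * pq * (1 - 30 * pq + 120 * pq ** 2)),
    }


if __name__ == "__main__":
    n, p = 7, Fraction(3, 10)
    for r, value in lemma_b1(n, p).items():
        print(r, value, central_moment(n, p, r) == value)
```

Exact rational arithmetic is used so that agreement is a true equality rather than a match within floating-point tolerance.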
Lemma
E{x'
helpfulin derivingthese
will be
the
npiînj
ExJ
Proof.
B.2.
Let
be
random
variable
and
let A' be
'
y
^
In
x'
E{\E{x'
IA')
is any
random
Ex'Y'^'EiVx'
random
event.
Then
E{x' I A')Y 1A')}. (B.l)
^,
if u
general,
Eu
E{E{u
variable,then
IA')}
(B.2)
460
where
\s
A'
random
generalformula
and
event
put
we
"
Cx'
Ex'y
a!
Since
E[{x'
E[{x'
Let
us
("
a)!
A')
[E{x' I
'
will
apply(B.l) for
IA']
Ex'Y
for
t-
E{x' I ^')f
Ex'Y-'îx'
"
\A')-
E{x'
Ex'f IA']
[E(x' IA')
Ex'f IA'\
[E(x' IA')
evaluate
now
the random
event
E[{nj
will be
IVe
in
Ex'
"
Ex'Y''',
(B.l).
1, 2, 3 in the
Lemma,
following
we
write
now
(B.3)
Ex'
Ex'f
E{[x'
Ex'f
3[E(x' \A')
E{[x'
for
jointmoments
some
B.3.
Lemma
i*
x'
{E{x' \A')
1, 2, 3.
I^']
Ex')
we
Ex']'-''\A'}
for
by substituting
follows
Lemma
E[(x'
"
\A')
E{[E{x'
E[{x'
that
note
since
Then,
out
To applyto
expectation.
conditional
denotes
and
Ex'y
"
^
"
^^0
the
"
I''
"
the
(x'
""j
has
assume
Ex']E{[x'
multinomial
E(x' \A')f \A'}
IA')f IA'}.
E(x'
(B.4)
(B.5)
In all cases,
distribution.
value."
specified
population and
multinomial
\A')f \A'}
E{x'
suppose
Then
j^j.
Pi
E(nj I",)
npj)I"J
(Hi
npj
npi)
Pi
("i
np,)
Hi
E\{nj
np,f
np,fI",] =-\{ni-
"[("," npjfI"J
-
np,)+
Hi
(",
-^
Hi
Hi
npif
(",;
-
Pi
PfJHi Pi)
^
3
"
.2
np,f
{rii
-
/^|(^"Pi)t
-
+,
3"
^Pi^Hi Pi)iHi ^Pi)
PiiHi Pi)(Hi ^Pi)
("i
^
H
f^Pi)
size of
sample is "
",)
"
np.
Hi
Pi
{rii
Hi
Hi
Pi
n,)
(rt^
npj
Hi
npj)I",] =(n
"j and
the conditional
Pi
Pi
"
is pjiqi.Hence
outcome
/th possible
E[inj
"
Hi
If ", is fixed the conditional
E(nj ni) ={n
Hi
of the
probability
np,)
Hi
Hi
Proof.
(","
npi).
npi).
Also
E{nj I",)
Erij
{tii
np,).
1i
Hence
E[(n,
^'^^' ^'''
npjfIAz,]
=
("^
("t
"
Wif
("
"^")
"
"Pi)
"
rii
nqi
(jti
"
"
("i
"ji
since
"Pi) +
qi
qi
npi).Finally,
qi
qi
qi
.Pj(qi-Pi\(qi-^p^
,r
("i
"Pi)
("i
qi
pfiqi Pj)
Pl
=
qi \
3
"PiT
qi
pfiqt-Po)
("^
3rt
"Pi)
qi
pAqi
Po)(qi ^Pi)
(rtj
npi)
npAq-; Pj)(qi ^pi)

-
+
Hence
E(ni
npi)(nj np^)
E{ni
np,)
qi
Pnpiqt
E(ni
-npjCii
npifitij npjf
-
'P'
np,f % (Hi
J
Eiiti
pMi -Pi)
,2
npiY
qt
"PMi -Pi)
.
("i
"Pi)+
^
qi
qi
qi
pMi
Pi)
"Piqiiqi Pi)
"
"Pi^qi~Pi)
,
"Piqi
"
qi
qi
npip]iy epiqd
-
^"Yiq! +
npipj
(qi-pi)iqi-pi)
qi
(-"i
-
V
2
i ^j
^J,
"^PipAqi Pi)'
-
"Pif("j "Pi)^
-
-î^rz
"
PiPi
+ (qi
2 n[î/'i^r
-
M
Pi)^
i "=j
2
~jL
"/!
"qi
^Piqi) (qi Pi)(qi Pi)

-
"qi
Now
Pj
Pi
2 ^"Ji Po)
qi,
(^
=(k
l)î-qi
l)qi.
Hence
np,)\nj npjf
Xrii
IE'
n'^PiPi
i"=j
^lPiqi +(^
3piqi+ik
-1X^-2)
epiqi
(k
2)(qi Pi)
-
2)qi +
k-62piqi-(k-2f
+-

Appendix C. Order of Convergence and Convergence in Probability

If $\lim_{n \to \infty} n^{\alpha} f(n)$ is bounded, we say that $f(n)$ is at most of order $1/n^{\alpha}$ and write $f(n) = O(1/n^{\alpha})$. If $\lim_{n \to \infty} n^{\alpha} f(n) = 0$, we say that $f(n)$ is of lower order than $1/n^{\alpha}$ and write $f(n) = o(1/n^{\alpha})$.

A sequence of random variables $u_1, u_2, \ldots$ converges in probability to $u$ if, for every $\epsilon > 0$,
$$\lim_{n \to \infty} P(|u_n - u| > \epsilon) = 0.$$

Lemma C.1. If $2\alpha > \beta$, then $|n_i - np_i|^{\beta} / n^{\alpha}$ converges in probability to 0 as n becomes infinite.

Proof. Using the Tchebycheff inequality and some simple manipulations, we have
$$P\left\{ \frac{|n_i - np_i|^{\beta}}{n^{\alpha}} > \epsilon \right\} = P\left\{ |n_i - np_i| > \epsilon^{1/\beta} n^{\alpha/\beta} \right\} \le \frac{p_i q_i}{\epsilon^{2/\beta}\, n^{(2\alpha/\beta) - 1}},$$
so that convergence in probability occurs if $(2\alpha/\beta) - 1 > 0$, i.e., if $2\alpha > \beta$.
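The Tchebycheff bound used in the proof of Lemma C.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $n_i$ to be binomial with parameters n and $p_i$, writes $q_i = 1 - p_i$, and uses the bound $p_i q_i / (\epsilon^{2/\beta} n^{(2\alpha/\beta)-1})$. The function names and the parameter values are hypothetical and do not come from the text.

```python
from math import comb


def binom_pmf(n, p, x):
    """Probability that a Binomial(n, p) count equals x."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def tail_prob(n, p, alpha, beta, eps):
    """Exact P(|X - n p|**beta / n**alpha > eps) for X ~ Binomial(n, p)."""
    return sum(
        binom_pmf(n, p, x)
        for x in range(n + 1)
        if abs(x - n * p) ** beta / n ** alpha > eps
    )


def tchebycheff_bound(n, p, alpha, beta, eps):
    """Upper bound p q / (eps**(2/beta) * n**(2*alpha/beta - 1))."""
    return p * (1 - p) / (eps ** (2 / beta) * n ** (2 * alpha / beta - 1))


if __name__ == "__main__":
    p, alpha, beta, eps = 0.3, 2, 3, 0.05   # illustrative values with 2*alpha > beta
    for n in (10, 100, 1000):
        print(n, tail_prob(n, p, alpha, beta, eps), tchebycheff_bound(n, p, alpha, beta, eps))
```

With $2\alpha > \beta$ the bound shrinks as n grows, which is the sense in which the quantity converges in probability to zero.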
Now
the first term
since
that in
it follows
terms
that
From
Lemma
E[{nh
B.3
obtainingthe expectedvalues
d
-^
("i
np,Y
this in the
("j
qh
so
of
of the terms
shall do
We
^2.
(6) we
can
ignoreall
followingevaluations.
have
we
npnf \ni\
of, say,
0(l/"^)where
are
np,)+
qi
qt
that
nyipuqi
nyiPhqt
+
^("i
2"
-T^ V"nYiq1+
"Pi)
npiqlX
epiq,)-]
P^Hi
"
+
/'.O]
["/'^"9^"(^^
"/'^^^
"
i"
2-
""
and
hence,
"^
Â2W'z2
H
3/?^^^"
V (K
Ijq^.
Also,
'^
PiPhq
"
I
PiPh qi
nPhiRi-Ph)
+
iqi-Pn)
"
""
"
"Pi)'
.]
"2"2^
"
^("t
npiqlqi p"),
where
2^2 +
"2,
^3^
n^îPiPh
stands
for
that
quantities
I0p^(qi-pi)
""^
will
Oiljn^)or higher.Hence,
yieldterms
3(qi-p^)
(qi Pn)iqi Pi)

-
'^
npi
and
E
"^mî^
3ik
lOqiiqi-pi)
;;
n
(k
2)qi
2)qi{qi pd
-
+
n
npi
Also,
Pl
Ew^^Wii
Phiqi -Ph)
E{ni
=
o
npif
'^'''n^plPuq'
3^
.3
E(ni
Ph
"
"
npd
nPiPnqi
[l5n-pU +
"""]-
%:^
["""]+
npiY
%J;^ \ln-pU
ny-qi
f^pm
nyiqi
Eirii
r^
npi
Hence,
\5q1
3(k
"
l)q\
Wi
Finally,
i^h
Ew,,-"Wi^
Whfiî
Wi^^
and
First,we
that
note
Ewf^
3qt +
npi
and
that
1
E
Z
h
^mî2
^Piqi+ik
2)qi +
6piqi-{k
-2) (qi
Pi)
:
"
"
"]
that
so
32 ^1 +
3(^
1)
32 ^1 +
(A:
2)(A:
1)
1
+
62^1
î
;=1
ir
{k
\)ik
1)
1)
6(/c
62 9l
i=l
2)^
(A:
+k
-6{k
-If
-{k
-\)
n\_i iPi
=
k^ -\
k^ -2k
+2
n\_i=iP
for
Similarly,
the second
term,
^12^13
^mîS
npi
-pi) -(k
lOqiiqi
U=i
(,^",
nPi
^^ptq,
iiAnpi
2
=
-/'D =2
-f-(^f
fiPi
i
(^^-p^)"
"
^/'i
term,
k
"
/15^
^hzî^
îi
ii,,^s,_3
i=l
^1
term,
U'
the final
"/'f
+ 2
2 M^fa
For
2)qi{4pi qj)
(,^B),|-(4.+2),.,,
^1
the third
npi
_y
For
2)qlqi p,)
1=1
i=
(k
Kk-2)qi
^^qlqi-Pi)
I0qi(qi-pi)
îi
2
Z^h2-
îi
\5qf I5q!
L "Pi
ijth
^'[%
2
i
2)ql
"
f^Pi
^-
0^2
k
=
3(k
,
Wi
5a
Kk
2)]
=2
1
^^2
^(3^ !)"
-
"/'i
k^ -2k
2
6"
A: +2
15
(/c
1) +
36//
4
1^1
+
7-"
A: +
5(A: + 2){k
6/2
6/2
_"_
l^^l3(3^-1)
/?,"
z-
6"
i=i/7i
^1
^/7/
12/2
12/2iri/?i
1 2//
1^^
A:+8^1
1)
rfi
fC
1)
j
77
4/2
4"i'î/?j
+
12//
~k^
-2k
1)
36/2
/7j
3(3A:
(^
"
^
/?i
1=1
15
^
"
6"
yc2-l
(k +8)
"
ii=iPi
Finally,
A:2
/2t/"\2
"L-^
="
logae/
/ 4^
1
V
+Pr
4^
(3 -2^-16+9^+2)
'
12//\,fi/?
-3k^
2^2
5A: +
5 +
[6 -ek
igyr. ^
jQyr,2
12/2
lOA:
A:2
check, E\
r-
18A:]
12/2
iPi
computed
was
"
5A:
9A:2-20A:+7
"
-^-
12/2
As
7^-11^
20
for
specialcase.
Let
=2so
that
\log2ej
n-i^+
n2
np2
"
n^
"
n^
n,
til
"
np^
ii^,
"
-("i
"jPi).
"
Hence,
l(/2i-np^fl
nU^
_
log2e
/2
^2
l/î ^1
1
l(/2i -/2/;i)7
6
//^
Z'?
1
(/2i-/2/7i)V1
l;,3^3
_
12
/23
and
1
("!
np^'^
("i
np-^y
nY^q\
'^'
77*/7^^f
36
1
("i
"Tl^l
4:j-(î+/'i)-
"ViM
12
_...2"2.2
6nY^q\
l5nY,ql{cjl+p'^
nnYq\
6/^1^1 5{q^-p^f
[3
1 /
log,e/ ^4
3
qi
20{q^ p^f
-
15(9^+ p%
15
"^T
3^V
/12
""4
Pi
4"/7i^l
that
1/2,so
2 and
3"/7i^l
\Sp,q,+ 5iq, p,f
^3
12/7/7iî
12"/7iî
^1
Let/?!
5(ql+ pD
p,f
5(q,
_
4/7/7iî
If k
j"(.qi -pi)
"3-3
np^)
".^!^f
(/7i /7/?i)
iqi-pxf^^'
18
15
3^
An
'
1/2,
1/
4+"ll-4
logse
(from square of
1st
term)
of 2nd
(from square
term)
(from productof
1st
by 2nd)
(from product of
1st
by 3rd).
5
4n
From
generalformula
3
1st term
2nd
term
squared
15/
1/4
15
15
15
36
ITr
1/2
36n
36"
36/7
10
product of
1st
product of
1st
by
2nd
by
3rd
6"
6n
each
part checks.
1\
term
12"
5x4
3x5/1
Hence
4+4^^4-
squared
\2
^2/
15
_
5
_
^\2n^4n
REFERENCES

[1] Cramer, H. Mathematical methods of statistics. Princeton: Princeton University Press, 1946.
[2] Miller, G. A. What is information measurement? Amer. Psychologist, 1953, 8, 3-11.
[3] Shannon, C. E. A mathematical theory of communication. Bell System Tech. J., 1948, 27, 379-423.
[4] Wiener, N. Cybernetics. New York: Wiley, 1948.
A STATISTICAL DESCRIPTION OF VERBAL LEARNING*

George A. Miller and William J. McGill
MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Free-recall verbal learning is analyzed in terms of a probability model. The general theory assumes that the probability of recalling a word on any trial is completely determined by the number of times the word has been recalled on previous trials. Three particular cases of this general theory are examined. In these three cases, specific restrictions are placed upon the relation between probability of recall and number of previous recalls. The application of these special cases to typical experimental data is illustrated. An interpretation of the model in terms of set theory is suggested but is not essential to the argument.

The kind of verbal learning considered in this paper is the kind observed in the following experiment: A list of words is presented to the learner. At the end of the presentation he writes down all the words he can remember. This procedure is repeated through a series of trials. At the present time we are not prepared to extend the statistical theory to a wider range of experimental procedures.
General Model

We shall assume that the degree to which any word in the test material has been learned is completely specified by the number of times the word has been recalled on the preceding trials. When a word has been recalled exactly k times on the preceding trials, we shall say that the word is in state $A_k$. (Symbols and their meanings are listed in the Appendix at the end of the paper.) Let the probability that a word in state $A_k$ will be recalled on the next trial be symbolized by $\tau_k$. Then the probability of failing to recall a word in state $A_k$ is $1 - \tau_k$. In other words, the probability of recall after any trial is a function of k, the number of previous recalls. When a word in state $A_k$ is recalled, it passes to state $A_{k+1}$.

Ideally, on the first trial all the words are in state $A_0$; that is to say, they have been recalled zero times previously. On the first trial the proportion $\tau_0$ of these words is recalled and so passes from state $A_0$ to state $A_1$. The proportion $1 - \tau_0$ is not recalled and so remains in state $A_0$. On the second trial the proportion

*This research was facilitated by the authors' membership in the Inter-University Summer Research Seminar of the Social Science Research Council, entitled Mathematical Models for Behavior Theory, held at Tufts College, June 28-August 24, 1951. The authors are especially grateful to Dr. F. Mosteller for advice and criticism that proved helpful on many occasions.

This article appeared in Psychometrika, 1952, 17, 369-396. Reprinted with permission.
The general solution of (1) when all the $\tau_k$ are different is (see Appendix A):

$$p(A_0, n) = (1 - \tau_0)^n,$$
$$p(A_k, n) = \tau_0 \tau_1 \cdots \tau_{k-1} \sum_{i=0}^{k} \frac{(1 - \tau_i)^n}{\prod_{j=0,\, j \ne i}^{k} (\tau_j - \tau_i)}, \quad \text{for } k > 0. \tag{2}$$

The denominator of each of the fractions in the summation includes all differences of the form $(\tau_j - \tau_i)$ except for the difference $(\tau_i - \tau_i)$, which is zero.

The expected number of times a word is recalled, all told, up to and including trial n, is, by definition,

$$E(k, n) = \sum_{k=0}^{n} k\, p(A_k, n). \tag{3}$$
The difference, $E(k, n+1) - E(k, n)$, between the values of the cumulative recall score on successive trials is the expected proportion of words recalled on trial n + 1. This difference is the theoretical recall score, and we symbolize it by $\rho_{n+1}$. Thus we have the general relation that

$$\rho_{n+1} = E(k, n+1) - E(k, n), \quad \text{for } n \ge 0. \tag{4}$$

An alternative expression for $\rho_{n+1}$ can be obtained as follows. On trial n the probability that a word is in state $A_k$ is $p(A_k, n)$. The probability of recall in state $A_k$ is $\tau_k$. The probability that a word will both be in $A_k$ on trial n and also be recalled on trial n + 1 is, therefore, the product $\tau_k\, p(A_k, n)$. If these joint probabilities are summed over all the states $A_k$ from k = 0 to k = n, we have the total probability of recall on trial n + 1. That is to say,

$$\rho_{n+1} = \sum_{k=0}^{n} \tau_k\, p(A_k, n), \quad \text{for } n \ge 0. \tag{5}$$

The two expressions (4) and (5) are equivalent, which can be shown as follows. From (3) and (4) together we have

$$\rho_{n+1} = \sum_{k=0}^{n+1} k\, p(A_k, n+1) - \sum_{k=0}^{n} k\, p(A_k, n).$$

The first summation on the right can be rewritten by substituting for $p(A_k, n+1)$ according to (1):

$$\sum_{k=0}^{n+1} k\, p(A_k, n+1) = \sum_{k=0}^{n} k\, p(A_k, n)(1 - \tau_k) + \sum_{k=1}^{n+1} k\, p(A_{k-1}, n)\, \tau_{k-1}$$
$$= \sum_{k=0}^{n} k\, p(A_k, n) - \sum_{k=0}^{n} k\, p(A_k, n)\, \tau_k + \sum_{k=0}^{n} (k+1)\, p(A_k, n)\, \tau_k.$$

When this result is substituted into the expression for $\rho_{n+1}$ we have

$$\rho_{n+1} = -\sum_{k=0}^{n} k\, p(A_k, n)\, \tau_k + \sum_{k=0}^{n} (k+1)\, p(A_k, n)\, \tau_k = \sum_{k=0}^{n} \tau_k\, p(A_k, n),$$

which is the desired result.
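The equivalence of the two expressions for $\rho_{n+1}$ can also be checked numerically. This is a minimal sketch, assuming the same one-step recursion for $p(A_k, n)$ that the proof substitutes from (1); the helper names and the $\tau_k$ values are hypothetical.

```python
def state_probs(tau, n):
    """Return [p(A_0, n), ..., p(A_n, n)] by iterating the one-step recursion."""
    probs = [1.0]
    for _ in range(n):
        nxt = [0.0] * (len(probs) + 1)
        for k, pk in enumerate(probs):
            nxt[k] += pk * (1.0 - tau[k])   # word not recalled, stays in A_k
            nxt[k + 1] += pk * tau[k]       # word recalled, moves to A_{k+1}
        probs = nxt
    return probs


def expected_recalls(tau, n):
    """E(k, n): expected cumulative number of recalls through trial n, as in (3)."""
    return sum(k * p for k, p in enumerate(state_probs(tau, n)))


def rho_from_scores(tau, n):
    """rho_{n+1} as the difference of successive cumulative scores, as in (4)."""
    return expected_recalls(tau, n + 1) - expected_recalls(tau, n)


def rho_from_states(tau, n):
    """rho_{n+1} as the sum of tau_k * p(A_k, n) over states, as in (5)."""
    return sum(tau[k] * p for k, p in enumerate(state_probs(tau, n)))


if __name__ == "__main__":
    tau = [0.15, 0.30, 0.45, 0.55, 0.65, 0.72, 0.78]   # illustrative values only
    for n in range(6):
        print(n + 1, round(rho_from_scores(tau, n), 6), round(rho_from_states(tau, n), 6))
```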
The asymptotic behavior of the model as n increases without limit can be deduced from the general solution (2). First consider the case in which one or more of the transitional probabilities $\tau_k$ is zero. All the words start in state $A_0$ and have positive probability of moving along from $A_0$ to $A_1$, to $A_2$, etc., up to the first state, $A_h$, with zero transitional probability, $\tau_h = 0$. There the words are trapped; eventually all the words are recalled exactly h times and cannot be recalled again. This fact can be seen from (2): If $\tau_i > 0$ for $i < h$, then all the terms in $(1 - \tau_i)^n$ in (2) go to zero as $n \to \infty$. Thus $p(A_k, n) \to 0$ for $k < h$. For $k > h$, the product in front of the summation must include $\tau_h = 0$, and so $p(A_k, n) = 0$ for $k > h$. When $k = h$, however, (2) does not go to zero. Instead, since $\tau_h = 0$ and this term in the summation is $(1 - 0)^n = 1$,

$$\lim_{n \to \infty} p(A_h, n) = \frac{\tau_0 \tau_1 \cdots \tau_{h-1}}{(\tau_0 - \tau_h)(\tau_1 - \tau_h) \cdots (\tau_{h-1} - \tau_h)} = 1.$$

The probability at the asymptote is concentrated at state $A_h$; from (5), then,

$$\lim_{n \to \infty} \rho_{n+1} = \sum_{k} \tau_k \left[\lim_{n \to \infty} p(A_k, n)\right] = \tau_h = 0.$$

This case is of little interest for an acquisition theory, since the learning curve of the recall score, $\rho_{n+1}$, approaches an asymptote at zero. Therefore, in what follows, we shall be concerned only with the case in which all the $\tau_k$ are different and greater than zero.

If all the transitional probabilities $\tau_k$ are greater than zero, then from (2) we see that as n approaches infinity all the terms in the summation go toward zero for all finite values of k. Consequently the sum of the $p(A_k, n)$ can be made as near zero as we please for any finite k by selecting a large enough value of n. Since the sum of the $p(A_k, n)$ must equal unity, almost all the probability comes to be concentrated in state $A_\infty$. In the limit, therefore, the probability of any finite number of recalls is zero, and we have for the limit when all $\tau_k > 0$,

$$p(A_k, \infty) = 0 \ \ (k \text{ finite}), \qquad p(A_\infty, \infty) = 1.$$
We are now able to show that a word in state $A_k$ has probability one of moving to state $A_{k+1}$ if the learning process is continued indefinitely. This happens because almost all words eventually reach state $A_\infty$. Thus we can write, for the probability of leaving state $A_k$ on some trial,

$$\sum_{n=k}^{\infty} \tau_k\, p(A_k, n) = 1,$$

or,

$$\sum_{n=k}^{\infty} p(A_k, n) = \frac{1}{\tau_k}.$$

In all the cases we shall consider in this paper the value of $\tau_k$ will approach an asymptote as $k \to \infty$. We are interested in placing the following restrictions on the $\tau_k$:

$$\tau_k > 0, \qquad \tau_k \ne \tau_j \ (k \ne j), \qquad \lim_{k \to \infty} \tau_k = m \le 1.$$

The first two conditions insure that $p(A_k, n)$ goes toward zero for finite k and large n. The third condition provides the asymptotic value of $\tau_k$ for infinite k. In the summation (5) all terms are zero out to infinity, and so for the limiting value of $\rho_{n+1}$ we have

$$\lim_{n \to \infty} \rho_{n+1} = m\, p(A_\infty, \infty) = m. \tag{5'}$$

In other words, if we assume that m is the asymptotic value of $\tau_k$, then m is also the asymptotic value of $\rho_{n+1}$.

In the special cases discussed below, a restriction is placed upon the $\tau_k$ in the form of the linear difference equation,*

$$\tau_{k+1} = a + \alpha \tau_k. \tag{6}$$

The limits for a and $\alpha$ have been chosen so that $\tau_{k+1}$ is bounded between zero and one, $0 \le \tau_{k+1} \le 1$, and, since we are interested in acquisition, so that $\tau_k \le \tau_{k+1}$ as $k \to \infty$.

Consider the following development of (5):

$$\rho_{n+2} = \sum_{k=0}^{n+1} \tau_k\, p(A_k, n+1).$$

*We have tried to observe the convention that parameters are represented by Greek letters and statistical estimates are represented by Roman letters. In the case of a and m, however, we have violated this convention in order to make our symbols coincide with those used by other workers. The symbols m, a, $\alpha$, and p were originally proposed by Bush and Mosteller.
Now substitute for $p(A_k, n+1)$ according to (1):

$$\rho_{n+2} = \sum_{k=0}^{n} \tau_k\, p(A_k, n)(1 - \tau_k) + \sum_{k=1}^{n+1} \tau_k\, p(A_{k-1}, n)\, \tau_{k-1}$$
$$= \rho_{n+1} - \sum_{k=0}^{n} \tau_k^2\, p(A_k, n) + \sum_{k=0}^{n} \tau_{k+1} \tau_k\, p(A_k, n).$$

Next we substitute for $\tau_{k+1}$ according to (6):

$$\rho_{n+2} = \rho_{n+1} - \sum_{k=0}^{n} \tau_k^2\, p(A_k, n) + \sum_{k=0}^{n} (a + \alpha \tau_k)\, \tau_k\, p(A_k, n)$$
$$= (1 + a)\rho_{n+1} - (1 - \alpha) \sum_{k=0}^{n} \tau_k^2\, p(A_k, n)$$
$$= (1 + a)\rho_{n+1} - (1 - \alpha)\, E(\tau_{k,n+1}^2), \tag{7}$$

where $E(\tau_{k,n+1}^2)$ is the second raw moment of the $\tau_k$ for trial n + 1 (as $\rho_{n+1}$ is the first raw moment).

Restriction (6) brings the system into direct correspondence with a special case of the theory developed by Bush and Mosteller. In their terminology, an operator $Q_1$ is applied to the probability of response, p, to give $a_1 + \alpha_1 p$ as the new probability whenever a trial is successful. A second operator $Q_2$ is applied to give $a_2 + \alpha_2 p$ whenever a trial is unsuccessful. In the present application of the general theory, $Q_1$ is preserved intact by restriction (6), but $a_2$ is assumed to be zero and $\alpha_2$ unity, so $Q_2$ is assumed to be the identity operator. That is to say, $Q_2 p = p$. In the present application, an unsuccessful trial consists of the omission of the word during recall. It seems reasonable to assume that the non-occurrence of a word has no effect upon its probability of occurrence on the next trial. How reasonable this simple assumption is will be seen when we examine the data.
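The moment relation $\rho_{n+2} = (1 + a)\rho_{n+1} - (1 - \alpha)\sum_k \tau_k^2\, p(A_k, n)$ that follows from the linear restriction $\tau_{k+1} = a + \alpha\tau_k$ can be checked numerically. The sketch below is illustrative only: it assumes the one-step recursion for $p(A_k, n)$ used throughout the paper, a starting value $\tau_0 = p_0$, and hypothetical function names. The parameter values are the ones quoted later for the three-parameter example.

```python
def linear_tau(p0, a, alpha, kmax):
    """tau_0 = p0 and tau_{k+1} = a + alpha * tau_k, for k = 0..kmax."""
    taus = [p0]
    for _ in range(kmax):
        taus.append(a + alpha * taus[-1])
    return taus


def state_probs(tau, n):
    """Return [p(A_0, n), ..., p(A_n, n)] by iterating the one-step recursion."""
    probs = [1.0]
    for _ in range(n):
        nxt = [0.0] * (len(probs) + 1)
        for k, pk in enumerate(probs):
            nxt[k] += pk * (1.0 - tau[k])
            nxt[k + 1] += pk * tau[k]
        probs = nxt
    return probs


def rho(tau, n):
    """rho_{n+1}: first raw moment of the tau_k on trial n + 1."""
    return sum(tau[k] * p for k, p in enumerate(state_probs(tau, n)))


def second_moment(tau, n):
    """Second raw moment of the tau_k on trial n + 1."""
    return sum(tau[k] ** 2 * p for k, p in enumerate(state_probs(tau, n)))


if __name__ == "__main__":
    p0, a, alpha = 0.13, 0.12, 0.83
    tau = linear_tau(p0, a, alpha, 20)
    for n in range(5):
        lhs = rho(tau, n + 1)
        rhs = (1 + a) * rho(tau, n) - (1 - alpha) * second_moment(tau, n)
        print(n, round(lhs, 8), round(rhs, 8))
```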
Analysis of the Data

At the end of the experiment the experimenter has collected a set of word lists: the words recalled by the learner on successive trials. These recall lists will usually contain a small number of words that did not occur in the presentation. These spontaneous additions by the learner are of some interest in themselves, but we shall ignore them in the present discussion.
We
estimate
we
the
would
of
suppose,
like to
p"+i
to all of the
transitional
the data
contained
in the word
lists to obtain
an
There
are,
(5). We shall refer to the estimate as r"+i
material
in
learning
words
provided by the experimenter as
in
experiment.
these
use
words
words
It
seems
are
reasonable
A^
we
be considered
may
probabilityof recall,t^
that
assume
By this
homogeneous.
in state
to
imply
as
under
that
estimates
certain
ditions
con-
the responses
of the
same
We
define
then
can
convenient
statistic,
numbers, Xi^^.n+i,
the same
meaning
The
have
indicate
that
we
words, with
recall
occurs
various
are
zero
or
that
(8)
"
The
one.
attached
event
an
have
we
first summation
that
determine
them
occurs
zero
trial
This
summation
on
1 for any
word
only
goes
correct
in state
to
up
is trial
are
Ak when
Ak
provided
that
k,
over
our
trial
on
the
reference
rules determine
These
n.
state.
straightforward.
summing
occur
because
the A^
to
responses
one
extends
summation
of states
number
or
in each
if a recall fails to
second
The
1.
in state A^
zero
word
experimental
of words
in state
not
1 to
i, the
over
Xi^k.n+iis
an
for all words
previously. They
trial
on
out
the number
count
we
and
subscriptsk
to
is carried
whether
are
X,-,i."+i
states.
experimental
words
on
1.
To
show
that r"+i is unbiased
E(r..O
the
either
that
point for determining the

r"+i as the proportion of
The
iî
They are zero for any word

1. Lastly the X,,i,"+iare
i.
trial
icô
The
on
iV
l~f
k fixed to show
rules that
The
2^ Xi^k.n
looking at
are
The
Aj,
in state
2^
AT
J
^n
expectation of
sum
in the
any
t Z.,
[e{
"
Thus
the expectation of
Xi^^.n+iin state A^ is t^
is N-Tk-p{Ak
n). Substituting this into the ex~
,
.
brackets
we
pression for "'(r"+i),
i
=
that
observe
we
find
n
$$E(r_{n+1}) = \sum_{k=0}^{n} \tau_k\, p(A_k, n) = \rho_{n+1}. \tag{9}$$

The sampling variance of $r_{n+1}$ around $\rho_{n+1}$ is determined by the variances of the various $X_{i,k,n+1}$ around the transitional probabilities, $\tau_k$:

$$\mathrm{Var}(r_{n+1}) = \frac{1}{N^2} \sum_{k} \mathrm{Var}\left[\sum_{i} X_{i,k,n+1}\right].$$

The variance of any $X_{i,k,n+1}$ in state $A_k$ is binomial and is given by $\tau_k(1 - \tau_k)$. The variance of $\sum_i X_{i,k,n+1}$ thus becomes $N\, p(A_k, n)\, \tau_k (1 - \tau_k)$. Substituting this into the expression for $\mathrm{Var}(r_{n+1})$, we obtain

$$\mathrm{Var}(r_{n+1}) = \frac{1}{N} \sum_{k=0}^{n} p(A_k, n)\, \tau_k (1 - \tau_k). \tag{10}$$

It should be noted that this variance is never larger than the binomial variance, $\frac{1}{N}\rho_{n+1}(1 - \rho_{n+1})$, since the binomial variance includes in addition to (10) a term that depends on the variance of the $\tau_k$ around $\rho_{n+1}$:

$$\mathrm{Var}(r_{n+1}) = \frac{1}{N}\left[\rho_{n+1}(1 - \rho_{n+1}) - \sum_{k=0}^{n} p(A_k, n)(\tau_k - \rho_{n+1})^2\right]. \tag{10'}$$

In order to apply the general theory we must now obtain estimates of the transitional probabilities, $\tau_k$. $\tau_k$ is the probability of moving from state $A_k$ to $A_{k+1}$ and is assumed to remain constant from trial to trial. Therefore, on every trial we obtain an estimate of $\tau_k$. After trial n, a certain number of words, $N_{k,n}$, are in state $A_k$. Of these $N_{k,n}$ words, some go on to $A_{k+1}$ and some remain in $A_k$ on trial n + 1. The fraction that moves from $A_k$ to $A_{k+1}$ provides an estimate, $t_{k,n+1}$, of $\tau_k$ on trial n + 1. If $N_{k,n}$ is zero, no estimate is possible.

Next we wish to combine the $t_{k,n+1}$ to obtain a single estimate, $t_k$, of the transitional probability, $\tau_k$. The least-squares solution, obtained by minimizing $\sum (t_{k,n+1} - \tau_k)^2$, is the direct average of the $t_{k,n+1}$. This estimate is unbiased, but it has too large a variance because it places undue emphasis upon the $t_{k,n+1}$ that are based on small values of $N_{k,n}$. We prefer, therefore, to use the maximum-likelihood estimate, which respects the accuracy of the various $t_{k,n+1}$.

For example, after trial 7 there may be 10 words in state $A_3$. Of these 10, 6 are recalled on trial 8. This gives the estimate $t_{3,8} = 6/10$. Every trial on which words are in state $A_3$ provides a similar estimate, $t_{3,n+1}$. The final estimate of $\tau_3$ is obtained by weighting each of these separate estimates according to the size of the sample on which it is based and then averaging. This procedure is repeated for all the $\tau_k$ individually as far as the data permit.

The $t_{k,n+1}$ are also useful to check the basic assumption that $\tau_k$ is independent of n. If the $t_{k,n+1}$ show a significant trend, this basic assumption is violated.
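The weighting procedure described above amounts to pooling, for each state, the number of words recalled and the number of words that were in that state. A minimal sketch follows, assuming the data are available as one list of recall indicators per word (1 if the word was recalled on a trial, 0 otherwise); this data layout and the function names are hypothetical, not the authors' own tabulation.

```python
from collections import defaultdict


def pool(counts):
    """Sample-size-weighted average of per-trial estimates.

    counts is a list of (number recalled, number of words in the state) pairs;
    trials on which no word was in the state carry no information and are skipped.
    """
    recalled = sum(r for r, size in counts if size > 0)
    exposed = sum(size for _, size in counts if size > 0)
    return recalled / exposed


def estimate_tau(recalls):
    """Pooled estimate t_k for every state k that was observed.

    recalls[i][n] is truthy if word i was recalled on trial n + 1.
    """
    exposed = defaultdict(int)   # word-trials spent in state A_k
    moved = defaultdict(int)     # of those, the ones that ended in a recall
    for word in recalls:
        k = 0                    # every word starts in state A_0
        for was_recalled in word:
            exposed[k] += 1
            if was_recalled:
                moved[k] += 1
                k += 1           # a recall moves the word to A_{k+1}
    return {k: moved[k] / exposed[k] for k in sorted(exposed)}


if __name__ == "__main__":
    print(pool([(6, 10)]))   # the single-trial example in the text: 6 of 10 words
    toy = [[1, 1, 0], [0, 1, 1], [0, 0, 0], [1, 0, 1]]   # made-up data
    print(estimate_tau(toy))
```

Pooling counts in this way is the same as averaging the per-trial fractions with weights equal to the sample sizes on which they are based.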
The Simplest Case: One Parameter

The computation of $p(A_k, n)$ from (2) for the general case is exceedingly tedious as n and k become moderately large. We look, therefore, for a simple relation among the $\tau_k$ of the form of restriction (6). The first case that we shall consider is

$$\tau_0 = a, \qquad \tau_{k+1} = a + (1 - a)\tau_k. \tag{12}$$

In this form the model contains only the single parameter, a. The solution of the difference equation (12) is

$$\tau_k = 1 - (1 - a)^{k+1}. \tag{13}$$

The interpretation of (13) in set-theoretical terms runs as follows: On the first presentation of the list a random sample of elements is conditioned for each word. The measure of this sample is a, and it represents the probability, $\tau_0$, of going from state $A_0$ to state $A_1$. If a word is not recalled, no change is produced in the proportion of conditioned elements. When a word is recalled, however, the effect is to condition another random sample of elements, drawn independently of the first sample, of measure a to that word. Since some of the elements sampled at recall will have been previously conditioned, after one recall we have (because of our assumption of independence between successive samples):

$$\tau_1 = \begin{pmatrix}\text{Elements conditioned}\\ \text{during presentation}\end{pmatrix} + \begin{pmatrix}\text{Elements conditioned}\\ \text{during the recall}\end{pmatrix} - \begin{pmatrix}\text{Common}\\ \text{elements}\end{pmatrix} = a + a - a^2 = 1 - (1 - a)^2.$$

This quantity gives us the transitional probability $\tau_1$ of going from $A_1$ to $A_2$. When a word is recalled a second time another independent random sample of measure a is drawn and conditioned, so we have

$$\tau_2 = [1 - (1 - a)^2] + a - a[1 - (1 - a)^2] = 1 - (1 - a)^3.$$

Continuing in this way generates the relation (13). With this substitution the general difference equation (1) becomes

$$p(A_k, n+1) = p(A_k, n)(1 - a)^{k+1} + p(A_{k-1}, n)[1 - (1 - a)^k].$$

The solution of this difference equation can be obtained by the general method outlined in Appendix A or by the appropriate substitution for $\tau_k$ in (2). The solution is

$$p(A_0, n) = (1 - a)^n,$$
$$p(A_k, n) = (1 - a)^{n-k} \prod_{i=0}^{k-1} [1 - (1 - a)^{n-i}]. \tag{14}$$

From definition (5) it is possible to obtain the following recursive ex-
reach an asymptote somewhat below the theoretical asymptote at unity. The introduction of an asymptote with value less than unity will be discussed in connection with the three-parameter case.

Figure 2. Comparison of Theoretical and Observed Values of $p(A_k, n)$ for the One-Parameter Case. (a = 0.22; abscissa: trial number.)

As a further check on the correspondence of theory and data, Figure 2 shows the predicted and observed values of $p(A_k, n)$ as a function of n, for k = 0, 1, 2, 3.

Second Case: Two Parameters

In the one-parameter theory it is assumed that the proportion of elements sampled during the presentation of the list is the same as the proportion sampled during each recall. Most data are not adequately described by such a simple model. At the very least, then, it is necessary to consider the situation when these two sampling constants are different. In order to introduce the second parameter, we phrase restriction (6) in the following form:

$$\tau_0 = p_0, \qquad \tau_{k+1} = a + (1 - a)\tau_k, \tag{17}$$
where $p_0$ is the proportion of elements conditioned during presentation. The solution of this difference equation is

$$\tau_k = 1 - (1 - p_0)(1 - a)^k. \tag{18}$$

On the first presentation of the list a random sample of elements of measure $p_0$ is conditioned to every word. When one of the words is recalled, a random sample of measure a is drawn and conditioned. After one recall, therefore, the measure of conditioned elements is

$$\tau_1 = p_0 + a - a p_0 = 1 - (1 - p_0)(1 - a).$$

After two recalls the measure of conditioned elements can be written

$$\tau_2 = [1 - (1 - p_0)(1 - a)] + a - a[1 - (1 - p_0)(1 - a)] = 1 - (1 - p_0)(1 - a)^2.$$

Continuing in this way generates the relation (18). With this substitution the general difference equation (1) becomes

$$p(A_k, n+1) = p(A_k, n)(1 - p_0)(1 - a)^k + p(A_{k-1}, n)[1 - (1 - p_0)(1 - a)^{k-1}]. \tag{19}$$

The solution of (19) is

$$p(A_0, n) = (1 - p_0)^n,$$
$$p(A_k, n) = (1 - p_0)^{n-k} \prod_{i=0}^{k-1} \frac{[1 - (1 - p_0)(1 - a)^i]\,[1 - (1 - a)^{n-i}]}{1 - (1 - a)^{i+1}}. \tag{20}$$

When $p_0 = a$, (20) reduces to (14). The recursive form for the recall is

$$\rho_{n+1} = p_0 + (1 - p_0)[1 - (1 - a)^n]\,\rho_n. \tag{21}$$

The variance of $r_{n+1}$ now becomes (see Appendix B)

$$\mathrm{Var}(r_{n+1}) = \frac{1}{aN}(\rho_{n+2} - \rho_{n+1}). \tag{22}$$

In order to illustrate the application of these equations we have selected two sets of data. The first set was collected by Bruner and Zimmerman. A list of 32 monosyllabic words was read aloud. At the end of each reading the subject wrote all of the words he could remember. The order of the words was scrambled before every reading. A total of 32 presentations of the list was given. From the analysis of the data for this particular subject it was found that a = 0.10 and $p_0$ = 0.27 gave a good description of the data. In Figure 3 the values of $\rho_{n+1}$ computed from (21) are shown by the solid function. The observed values calculated for this subject are given by the open circles. The dotted lines are drawn ± one standard deviation from $\rho_{n+1}$, as computed from (22). As a further check, Figure 4 shows the predicted and observed values of $p(A_k, n)$ as a function of n, for k = 0, 1, 2, 3.

The distribution of cumulative recalls on any given trial provides still another way of viewing the data. In Figure 5, the cumulative distribution of k, the number of recalls, is shown for trials 5, 10, 15, 20. The proportion of test words recalled k times or less is plotted for comparison on each trial.

The second set of data was collected by M. Levine. He read aloud a 100-word anecdote. At the end of the reading, the subject wrote down all he could remember. Four such trials were given. The order of the words was not scrambled during the interval between trials. From the analysis of the data for this particular subject it was found that a = 0.87 and $p_0$ = 0.61 gave a good description of the results. Figure 6 shows the comparison of theory and experiment both for $\rho_{n+1}$ and for $p(A_k, n)$ for k = 0, 1, 2.
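For the two-parameter case, the recall curve and its sampling variance can be computed in two ways and compared. The sketch below is offered under stated assumptions: it takes $\tau_k = 1 - (1 - p_0)(1 - a)^k$, the one-step recursion for $p(A_k, n)$, the recursive form $\rho_{n+1} = p_0 + (1 - p_0)[1 - (1 - a)^n]\rho_n$, and the variance forms $\frac{1}{N}\sum_k p(A_k, n)\tau_k(1 - \tau_k)$ and $(\rho_{n+2} - \rho_{n+1})/(aN)$. Function names are hypothetical; the parameter values a = 0.10, $p_0$ = 0.27, N = 32 are those quoted for the Bruner and Zimmerman subject.

```python
def tau(k, p0, a):
    """Two-parameter transitional probability after k recalls."""
    return 1.0 - (1.0 - p0) * (1.0 - a) ** k


def state_probs(p0, a, n):
    """Return [p(A_0, n), ..., p(A_n, n)] by iterating the one-step recursion."""
    probs = [1.0]
    for _ in range(n):
        nxt = [0.0] * (len(probs) + 1)
        for k, pk in enumerate(probs):
            t = tau(k, p0, a)
            nxt[k] += pk * (1.0 - t)
            nxt[k + 1] += pk * t
        probs = nxt
    return probs


def rho_direct(p0, a, n):
    """rho_{n+1} as the sum over states of tau_k * p(A_k, n)."""
    return sum(tau(k, p0, a) * p for k, p in enumerate(state_probs(p0, a, n)))


def rho_recursive(p0, a, n_trials):
    """[rho_1, ..., rho_{n_trials}] from the recursive form."""
    out, rho = [], 0.0
    for n in range(n_trials):
        rho = p0 + (1.0 - p0) * (1.0 - (1.0 - a) ** n) * rho
        out.append(rho)
    return out


def variance_direct(p0, a, n, n_words):
    """Var(r_{n+1}) from the state probabilities."""
    probs = state_probs(p0, a, n)
    return sum(p * tau(k, p0, a) * (1.0 - tau(k, p0, a)) for k, p in enumerate(probs)) / n_words


def variance_from_rho(p0, a, n, n_words):
    """Var(r_{n+1}) from two successive values of the recall curve."""
    return (rho_direct(p0, a, n + 1) - rho_direct(p0, a, n)) / (a * n_words)


if __name__ == "__main__":
    p0, a, n_words = 0.27, 0.10, 32
    for n, rho in enumerate(rho_recursive(p0, a, 10)):
        sd = variance_direct(p0, a, n, n_words) ** 0.5
        print(n + 1, round(rho, 4), round(sd, 4))
```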
As a general observation, we have noted that when the order of the words is not scrambled between trials, the parameter a is relatively large. This is to say, when the words are not scrambled, there is a much higher probability that the same words will be recalled on successive trials. This effect is related to the serial-position curve. The subject recalls words at the beginning and at the end of the list. If these words remain in their favored positions, they continue to be recalled. New words are added to those recalled at the ends at the rate determined by $p_0$, so the learning works from the two ends toward the middle, which is the last to be learned. This effect has been noted with lists of randomly selected English words as well as with anecdotes.

Third Case: Three Parameters

In the one- and two-parameter cases we have assumed that after sufficient practice the subject should eventually reach perfect performance. Some data, however, seem to evade this simple assumption and so it is necessary to consider what happens when a lower asymptote is introduced. Such a parameter may be necessary when, for example, the period of time allowed for recall is limited.

To introduce the third parameter we adopt the general restriction (6):

$$\tau_0 = p_0, \qquad \tau_{k+1} = a + \alpha \tau_k, \quad \text{where } 0 < a \le 1 - \alpha < 1. \tag{23}$$

The solution of (23) can be written

$$\tau_k = \frac{a}{1 - \alpha} - \left(\frac{a}{1 - \alpha} - p_0\right)\alpha^k. \tag{24}$$

When $\alpha = 1 - a$, (24) reduces to (18). From (24) we see that as k increases without limit, $\tau_k$ approaches $a/(1 - \alpha)$ as an asymptote. From (5') we know
Figure 3. Comparison of Theoretical and Observed Values of $\rho_n$ for a Two-Parameter Case. Dotted line is drawn ± one standard deviation from $\rho_n$. (32 monosyllables; a = 0.10, $p_0$ = 0.27; abscissa: trial number.)

Figure 4. Comparison of Theoretical and Observed Values of $p(A_k, n)$ for Two-Parameter Case. (Abscissa: trial number.)

Figure 5. Comparison of Theoretical and Observed Cumulative Distribution of Number of Recalls on Four Different Trials in Two-Parameter Case.

Figure 6. Comparison of Theoretical and Observed Values of $\rho_n$ and $p(A_k, n)$ for Two-Parameter Case. (100-word anecdote; a = 0.87, $p_0$ = 0.61; abscissa: trial.)

that $\tau_k$ and $\rho_{n+1}$ approach the same asymptotic value, m. So we have the equation
a
lim
Since
"
m
0,
"
"
"
cannot
po
a,
be
for if po "
cannot
we
In
obtain
(25)
unity; and since both a " 0 and

general,we are interested in cases
forgettingrather than acquisition.
exceed
negative.
w,
p"+i
"
where
set-theoretical
of the
conditioned
material
Of
remainder,
as
before,but
1
now
the
of
sample of
measure
measure
of the elements
measure
"
"
sentation
pre-
is
po
is conditioned
{1
extinguishedduring recall,i.e.,
and
measure
"
the
and
the conditioned
add
We
the
elements
conditioned
a) po.
Thus
have
the second
recall the
Continuing
When
its solution
First,we
take
to
variable
new
(18), Math
is
"
"
into
"
"
Tk/m
that
of
(1
the
now
our
(1
d)po
"
is
"
(m
repeated:
a) Ti
"
po)a
"
obtain
the
solution
same
as
appropriate difference
the
is
case
simplest
hardly
to
way
"
forpo
in the
for po and
of two
case
a
less cumbersome
with
work
of the two-parameter
probability,ri
Po/m)a,
po/m
of
know
(1),we
transitional
new
for (1
"
such
these
case.
that
m.'
(26)
parameters given in
a). Therefore, from
that
(1
"
0,T\
appear
substitution
(20),we
"
Po)(x.
"
aTi
advantage
introduce
ri
{I
for the three-parameter
(2). It would
equations is
(2) and
(24) is substituted
than
"
generates the relation (24).
in this way
equation, but
This
(m
~\~O,
apo
"
sampling procedure
Tl
"
same
T2
Po
ToT]
first recall
the
On
follows.
as
of elements
sample
subtract
must
we
Ti
At
At
runs
(24)
elements, a portion of
extinguished.
a, are
"
during presentation
we
word.
these
"
for
random
rationalization
for every
is drawn.
tQ"
2^
Ta_
A-l
1=0
ml
p'iAk
When
Thus
we
a^'
n).
,
ri is substituted
of the summation
to
(2), the factor m'
into
in the
cancels the factor m'' in the denominator

know
the
in front
tion.
summa-
that
1^
~
p(A,
product
under
n)
T'or[
"
"
"
tL.
j=0
''^"
(28)
,
which
is the
same
summation.
(1
This
T,)"
[(1
(1
p\Ak
as
m) +
my
When
term.
consider
we
m)""'w(l
this sequence
substitute
we
"
(1
Z-J
Tk-1
"
m"(l
in
(29)
term
sum
by
have
we
'î)
J.
'""
tQ".
(28) and
of this sequence
/
"
rd' +
for the numerator
,
'
tQ
the last term
TqTi
the
T^T
n(l
under
be written
can
m(l
for the numerator
(27) except
my-Va
(2)(1
Now
in
n)
numerator
n (r;
r!)
j=0
which
know
we
from
(27) is equal
to
m"p' (A^
n).
The
to last term
next
gives
A
n(l
r[r'
m)"-'(l
(r;
rO
1=0
which
know
we
in this
we
know
from
manner
the term
p{Ak
n)
(27) is equal
ni'p'{Ak n) + n(l
the
parameter
We
in
m)w"~^ p'(^-t
"
brings us eventually to the case where

is zero.
Consequently, we can write
asymptote
"
m)m'~^p'{Ak
"
k, and
"
1) +
Wm''(l^)"~y(^^
-
is
unity (m
ing
Proceed-
1).
"
[n-k}^^ "^y'^^'ViA,
When
n{\
to
"
"
"
k)
(30)
-i)-
(30) reduce
1), (29) and
then
to
the
two-
case.
recall that
(1),(30) can
of the
because
be written
viA, ,n)
Z
i=o
way
in which
our
were
probabilities
as
('')m\l
my-y(A,
-
\1/
i).
fined
de-
It is of interest to observe that when the limiting value, m, is substituted in (33) for $\rho_{n+1}$ and $\rho_{n+2}$, the limiting variance is found to be binomial. That is,

$$\lim_{n \to \infty} \mathrm{Var}(r_{n+1}) = \frac{m(1 - m)}{N}.$$

This reflects the fact, established earlier in (5'), that as n grows very large the variance of the $\tau_k$ around $\rho_{n+1}$ goes to zero.

In order to obtain a numerical example, we have taken the data from another subject in the experiment by Bruner and Zimmerman. Sixty-four monosyllabic English words were read aloud and the order of the words was scrambled before every presentation. A visual inspection of the data led us to choose an asymptote in the neighborhood of 0.7. This asymptote is drawn on the plot of the $t_k$ in Figure 7 and on the plot of the $r_n$ in Figure 8. Then we estimated $p_0$ = 0.13 by considering all the trials on which words were in state $A_0$ and calculating $p_0$ as the weighted average of the $t_{0,n+1}$ for all those trials. Next we estimated the sampling parameter $\alpha$ = 0.83. This was done by obtaining the estimates, $t_k$, for successive values of k; these estimates, together with (24), give us a set of equations estimating $\alpha$. We used the weighted average of these estimates (ignoring negative values). Then we obtained a = 0.12 from the equation $a = m(1 - \alpha)$. We shall comment on the estimation problems later.

Figure 7. Transitional Probability of Recall, $\tau_k$, as a Function of Number of Recalls in the Three-Parameter Case. Values of $t_k$ are indicated by open circles. The curve fitted to the $t_k$ is $\tau_k = 0.7 - 0.57\,(0.83)^k$. (Abscissa: cumulative number of recalls, k.)
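The arithmetic linking the three estimates can be retraced in a few lines. The sketch below assumes the three-parameter form $\tau_k = m - (m - p_0)\alpha^k$ with $m = a/(1 - \alpha)$, equivalently $\tau_{k+1} = a + \alpha\tau_k$ with $\tau_0 = p_0$, and uses the quoted values m = 0.7, $p_0$ = 0.13, $\alpha$ = 0.83; the function names are hypothetical.

```python
def tau_closed(k, p0, m, alpha):
    """Three-parameter transitional probability, closed form."""
    return m - (m - p0) * alpha ** k


def tau_iterated(k, p0, a, alpha):
    """The same quantity obtained by iterating tau_{k+1} = a + alpha * tau_k."""
    t = p0
    for _ in range(k):
        t = a + alpha * t
    return t


if __name__ == "__main__":
    m, p0, alpha = 0.7, 0.13, 0.83
    a = m * (1 - alpha)          # about 0.119, quoted in the text as 0.12
    print("a =", round(a, 3), " m - p0 =", round(m - p0, 2))
    for k in (0, 1, 2, 4, 8, 16):
        print(k, round(tau_closed(k, p0, m, alpha), 4), round(tau_iterated(k, p0, a, alpha), 4))
```

The difference m − $p_0$ = 0.57 is the coefficient that appears in the fitted curve for $\tau_k$.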
Figure 8. Comparison of Theoretical and Observed Values of $\rho_n$ for Three-Parameter Case. (64 monosyllables; a = 0.12, $\alpha$ = 0.83, $p_0$ = 0.13; abscissa: trial number.)

Figure 9. Comparison of Theoretical and Observed Values of $p(A_k, n)$ for Three-Parameter Case. (Abscissa: trial number.)
490
When these parameter values were substituted into (24) we obtained the function for $\tau_k$ shown in Figure 7. When they were substituted into (28) for $k = 1, 2, 3, 4$, we obtained the functions for $p(A_k, n)$ shown in Figure 9. When the values were substituted into (31) we obtained the function for $p_n$ shown in Figure 8. In Figure 8 the dotted lines are drawn $\pm$ one standard deviation from $p_n$, as computed from (33). A comparison of the values of $p_n$ computed from (31) and from (32) is given for the first eighteen trials in Table 1. With this choice of parameters the Bush-Mosteller approximation seems highly satisfactory.
TABLE 1
Comparison of Exact and Approximate Values of $p_n$ for First 18 Trials
Discussion

In the preceding pages we have made the explicit assumption that the several words being memorized simultaneously are independent, that memorizing one word does not affect the probability of recalling another word on the list. The assumption can be justified only by its mathematical convenience, because the data uniformly contradict it. The learner's introspective report is that groups of words go together to form associated clusters, and this impression is supported in the data by the fact that many pairs of words are recalled together or omitted together on successive trials. If the theory is used to describe the behavior of 50 rats, independence is a reasonable assumption. But when the theory describes the behavior of 50 words in a list that a single subject must learn, independence is not a reasonable assumption. It is important, therefore, to examine the consequences of introducing covariance.
The difference between the independent and the dependent versions of the theory can best be illustrated in terms of the set-theoretical interpretation of the two-parameter case. Imagine that we have a large ledger with 1000 pages. The presentation of the list is equivalent to writing each of the words at random on 100 pages.
at random.
are
On
the
sure
the
that
word
for
written
and
could
simply make
words. A, B, and C, on
recalled again it would
we
The
the
rule is that
Thus
on
for C.
With
sample of
same
be
A, B,
each
a
and
of these
50/1000
likelythat
random
at
50 pages.
and
C.
page
These
words
must
0.05.
With
and
make
dependen
in-
pages
dependent model, however,
selection of 50 pages
the
words
select
we
first select 50 pages

at random
all of them, then select 50 more
more
one
probabilitythat
of the elements
same
and
Then
C would
does
the theoretical
of the estimates
give a
paid to
of
fair
the
write all three
whenever
also be recalled
was
at the
the
subjects
The
may
to
ledger on
written
are
which
describe
each
combined
surprisingthat
even
though no
the equations
of pairs of words.
tive
Associa-
not
scores
attention
the rate, of memorization.
from
word
the linear difference
in the
to estimate
word,
to
is
p"+i
the
on
tion
equa-
list.
the various
only
Thus
r*
from
If the parameters
approximation
an
data
of the
mean
by averaging the recall probabilities of all
be expected
Similarly,the expressions given for p"+i cannot
result of averaging several subjects'data together unless
to have
general theory,
(6). The data
for
applicable,though
the
the
or
is not
theory
For
xt
values
same
of course,
of
oi
recall
obtained
measure
in this way
does not
effect is to increase the variance
not
variability,
and
a, po
word
functions
values
affect the
be
known
are
the form
the
of recall determined
words.
other words
only
words, it is
the
upon
in the
of covariance
The
assumed
vary
introduction
descriptionof
from
probabihty
what
upon
depends
of pages
of jointoccurrences
probabilities
different words
describe
depend
not
In other
p"+i
parameters
(6), are
(the number
recall,
p"+i
clusteringshould
The
will be recalled
to it
Therefore, the
pages.
change
word
conditioned
it is inscribed)and
the
50
Now
0.1.
491
MCGILL
time.
same
was
The
would
J.
100/1000
find written
we
we
was
B,
po
WILLIAM
selected at random.
model
independent
AND
first trial.
50 pages
on
MILLER
Thus
this page
on
responses
be written
A.
to use, and
p{Ak n).
a
descriptivemodel
limited
us
the
will enable
linear
to
force
cases
tedious
of the parameters.
may
all such
to
all
restrictions
consider
more
of
plicated
com-
general solution
to
us
to
(2)
compute
the necessary
used
tease
is
Once
parameters, the
necessary
and
to
step, however,
the
data.
As
yet
we
have
has
been
the
vary
these parameters.
efficient methods
need
we
step is to
next
the effects upon
observe
of this sort
found
no
to
experimental
In order
to take
the
out
conditions
this next
of
estimating the parameters from

to the estimation
satisfactoryanswers
problem.
There
functions
a
is made
example
is
p(Ak
at
of
sizeable amount
,
n) and
p"
If
.
the outset, it takes
in the
computation
choice
poor
several
preceding section,
we
hours
estimated
involved
in
determiningthe
of the parameters
to discover
the
the parameters
a, po
fact.
and
,
In the
successively
492
READINGS
used
and
had
been
too
low.
not
different
parts of the data
computed it seemed
particularlygood
to
ourselves
that the
problem
leave the estimation

with
problem
estimates
our
have
used
have
considered
beyond
with
the
of po and
to fit the
all parameters
was
the mathematical
that
us
different estimates.
we
one.
to estimate
for the
We
Clearly,the method
all of the data
use
PSYCHOLOGY
MATHEMATICAL
IN
pious hope
is
in order
simultaneously. We
to
convinced
Consequently, we
that it will
p"
both
were
to the data
least squares
abihties.
our
theory
After
appeal
to
must
one
some-
to solve it.
competence
Appendix A
Solution for $p(A_k, n)$ in the General Case

The solution of equation (1) with the boundary conditions enumerated below has been obtained several times in the past (4, 5). We present our own method of solution here because the procedures involved may be of interest in other applications.

Equation (1) may be written explicitly as follows:

$$p(A_0, n+1) = (1 - \tau_0)\,p(A_0, n)$$
$$p(A_1, n+1) = \tau_0\,p(A_0, n) + (1 - \tau_1)\,p(A_1, n)$$
$$p(A_2, n+1) = \tau_1\,p(A_1, n) + (1 - \tau_2)\,p(A_2, n)$$
$$\cdots$$

This system of equations can be written in matrix notation as follows:

$$\begin{pmatrix} 1-\tau_0 & 0 & 0 & \cdots \\ \tau_0 & 1-\tau_1 & 0 & \cdots \\ 0 & \tau_1 & 1-\tau_2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} p(A_0, n) \\ p(A_1, n) \\ p(A_2, n) \\ \vdots \end{pmatrix} = \begin{pmatrix} p(A_0, n+1) \\ p(A_1, n+1) \\ p(A_2, n+1) \\ \vdots \end{pmatrix}$$

The infinite matrix of transitional probabilities we shall call $T$, and the infinite column vectors made up of the state probabilities on trial $n$ and $n + 1$ we shall call $d_n$ and $d_{n+1}$. So we can write

$$T d_n = d_{n+1}.$$

The initial distribution of state probabilities, $d_0$, is the infinite column vector $\{1, 0, 0, 0, \cdots\}$. The state probabilities on trial one are then given by

$$d_1 = T d_0.$$

The state probabilities on trial two are given by $d_2 = T d_1$, so by substitution,

$$d_2 = T(T d_0) = T^2 d_0.$$

Continuing this procedure gives the general relation

$$d_n = T^n d_0.$$

Therefore, the problem of determining $d_n$ can be equated to the problem of determining $T^n$.
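The relation $d_n = T^n d_0$ can be tried numerically on a truncated version of the chain. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, the infinite matrix is cut off after a finite number of states, and the last retained state is made absorbing (an assumption introduced here so that the probabilities still sum to one). The $\tau_k$ values are taken from the fitted curve $\tau_k = 0.7 - 0.57(0.83)^k$ quoted for the three-parameter example.

```python
def fitted_tau(k):
    """Fitted transitional probability from the three-parameter example."""
    return 0.7 - 0.57 * 0.83 ** k


def transition_matrix(taus):
    """Truncated T for states A_0..A_K, K = len(taus).

    Assumption (not in the source): the last state A_K is absorbing,
    so that truncating the infinite matrix loses no probability mass.
    """
    size = len(taus) + 1
    T = [[0.0] * size for _ in range(size)]
    for k, tau in enumerate(taus):
        T[k][k] = 1.0 - tau      # word not recalled: stays in A_k
        T[k + 1][k] = tau        # word recalled: moves from A_k to A_{k+1}
    T[size - 1][size - 1] = 1.0
    return T


def state_probabilities(taus, n_trials):
    """Return d_n = T^n d_0 by repeated multiplication, d_0 = {1, 0, 0, ...}."""
    T = transition_matrix(taus)
    d = [1.0] + [0.0] * len(taus)
    for _ in range(n_trials):
        d = [sum(T[i][j] * d[j] for j in range(len(d))) for i in range(len(T))]
    return d


taus = [fitted_tau(k) for k in range(30)]
d10 = state_probabilities(taus, 10)
print([round(x, 4) for x in d10[:5]])
```

With 30 retained states and 10 trials the truncation has no effect, because a word can be recalled at most once per trial.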
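Since $T$ is a semi-matrix, we know that it can be expressed as

$$T = S D S^{-1},$$

where $D$ is an infinite diagonal matrix with the same elements on its diagonal as are on the main diagonal of $T$ (e.g., 2). The diagonal elements of $S$ are arbitrary, so we let $S_{ii} = 1$. Now we can write $TS = SD$:

$$\begin{pmatrix} 1-\tau_0 & 0 & 0 & \cdots \\ \tau_0 & 1-\tau_1 & 0 & \cdots \\ 0 & \tau_1 & 1-\tau_2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & \cdots \\ S_{21} & 1 & 0 & \cdots \\ S_{31} & S_{32} & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & \cdots \\ S_{21} & 1 & 0 & \cdots \\ S_{31} & S_{32} & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} 1-\tau_0 & 0 & 0 & \cdots \\ 0 & 1-\tau_1 & 0 & \cdots \\ 0 & 0 & 1-\tau_2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

Now it is a simple matter to solve for $S_{ij}$ term by term. For example, to solve for $S_{21}$ we construct (from row 2 and column 1) the equation

$$\tau_0 + (1 - \tau_1) S_{21} = S_{21}(1 - \tau_0),$$

which gives

$$S_{21} = \tau_0 / (\tau_1 - \tau_0).$$

To solve for $S_{31}$ we use the equation

$$\tau_1 S_{21} + (1 - \tau_2) S_{31} = S_{31}(1 - \tau_0),$$
$$S_{31} = \tau_1 S_{21} / (\tau_2 - \tau_0) = \tau_0 \tau_1 / [(\tau_1 - \tau_0)(\tau_2 - \tau_0)].$$

Proceeding in this manner gives the necessary elements of $S$, and we have

$$S = \begin{pmatrix} 1 & 0 & 0 & \cdots \\ \dfrac{\tau_0}{\tau_1 - \tau_0} & 1 & 0 & \cdots \\ \dfrac{\tau_0 \tau_1}{(\tau_1 - \tau_0)(\tau_2 - \tau_0)} & \dfrac{\tau_1}{\tau_2 - \tau_1} & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

The elements of $S^{-1}$ can be obtained term by term from the equation $S S^{-1} = I$. For example, row two times column one gives the element $S'_{21}$ of $S^{-1}$:

$$\tau_0 / (\tau_1 - \tau_0) + S'_{21} = 0.$$

Continuing in this way we have

$$S^{-1} = \begin{pmatrix} 1 & 0 & 0 & \cdots \\ \dfrac{\tau_0}{\tau_0 - \tau_1} & 1 & 0 & \cdots \\ \dfrac{\tau_0 \tau_1}{(\tau_0 - \tau_2)(\tau_1 - \tau_2)} & \dfrac{\tau_1}{\tau_1 - \tau_2} & 1 & \cdots \\ \dfrac{\tau_0 \tau_1 \tau_2}{(\tau_0 - \tau_3)(\tau_1 - \tau_3)(\tau_2 - \tau_3)} & \dfrac{\tau_1 \tau_2}{(\tau_1 - \tau_3)(\tau_2 - \tau_3)} & \dfrac{\tau_2}{\tau_2 - \tau_3} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

These matrices permit a simple representation of the powers of the matrix $T$. Thus,

$$T^2 = (S D S^{-1})(S D S^{-1}) = S D (S^{-1} S) D S^{-1} = S D^2 S^{-1},$$

and in general,

$$T^n = S D^n S^{-1}.$$

Since $D$ is a diagonal matrix, $D^n$ is obtained by taking the nth power of every diagonal element.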
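The term-by-term construction of $S$ and $S^{-1}$ can be checked on a finite leading block of $T$. This is a sketch under stated assumptions: the function names are hypothetical, the $\tau_k$ must all be distinct, and the truncated block is used as it stands (so probability leaks out of the last retained state, which does not affect the identity $T^n = S D^n S^{-1}$ being tested).

```python
def eigen_matrix(taus):
    """Lower-triangular S with unit diagonal, built column by column from TS = SD."""
    size = len(taus)
    S = [[0.0] * size for _ in range(size)]
    for j in range(size):
        S[j][j] = 1.0
        for i in range(j + 1, size):
            # tau_{i-1} S[i-1][j] + (1 - tau_i) S[i][j] = S[i][j] (1 - tau_j)
            S[i][j] = taus[i - 1] * S[i - 1][j] / (taus[i] - taus[j])
    return S


def unit_lower_inverse(S):
    """Inverse of a unit lower-triangular matrix, term by term from S S^-1 = I."""
    size = len(S)
    inv = [[0.0] * size for _ in range(size)]
    for j in range(size):
        inv[j][j] = 1.0
        for i in range(j + 1, size):
            inv[i][j] = -sum(S[i][m] * inv[m][j] for m in range(j, i))
    return inv


def state_probs_closed(taus, n):
    """First column of S D^n S^-1, i.e. p(A_k, n) starting from d_0 = {1, 0, ...}."""
    S = eigen_matrix(taus)
    inv = unit_lower_inverse(S)
    size = len(taus)
    return [sum(S[k][j] * (1.0 - taus[j]) ** n * inv[j][0] for j in range(size))
            for k in range(size)]


def state_probs_iterated(taus, n):
    """Same quantities obtained by applying equation (1) n times."""
    d = [1.0] + [0.0] * (len(taus) - 1)
    for _ in range(n):
        d = [(1.0 - taus[k]) * d[k] + (taus[k - 1] * d[k - 1] if k else 0.0)
             for k in range(len(taus))]
    return d


taus = [0.13, 0.30, 0.45, 0.60]   # illustrative, distinct values
print(state_probs_closed(taus, 5))
print(state_probs_iterated(taus, 5))
```

The two printed vectors should agree to rounding error, which is the content of $T^n = S D^n S^{-1}$ for the first column.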
496
READINGS
The
right side
of this
PSYCHOLOGY
MATHEMATICAL
IN
(5) and
equation is,from
The
(18),p"+,
left side
can
be rewritten
LI
k=i
which
becomes
trial
on
\^
O,)
know
we
subtractingp(Ao
we
ft-O
A-O
")V(^* n)
pn
[1
(1
ay]pn
(1
Po)
Z (1
")V(^* ^);
obtain
Rearranging
(1 -Po){l
[1
(1 -a)"]p"}.
gives
terms
Pn.i
is the desired
From
a)V(^* ,n)
P".i
which
n),
~l
23 (1
so
"
that
Pn
and
''".
"
2Z Piîc,n)Z (1
Now
and
"
1),
"
[1I [II a|.

?('!".")]
have, by adding
now
a)
"
(with n
g
We
(1
"
(1
po
Po)[l
(1
(21)
a)"]p"
result.
this result
(15) is
obtained
directlyby equating
Appendix C
List of Symbols and Their Meanings

$a$: parameter.
$\alpha$: parameter.
$A_k$: state that word is in after being recalled $k$ times.
$d_n$: infinite column vector, having $p(A_k, n)$ as its elements.
$D$: infinite diagonal matrix similar to $T$.
$k$: number of times word has been recalled.
$m$: asymptotic value of $\tau_k$ and $p_n$.
$n$: number of trial.
$N$: total number of test words to be learned.
$N_{k,n}$: number of words in state $A_k$ on trial $n$.
$p_0$: probability of recalling a word in state $A_0$.
$p(A_k, n)$: probability that word will be in state $A_k$ on trial $n$.
$p_n$: probability of recall on trial $n$.
$r_n$: observed recall score on trial $n$; estimate of $p_n$.
$S_{ij}$: elements of $S$.
$S'_{ij}$: elements of $S^{-1}$.
$S$: infinite matrix used to transform $T$ into similar diagonal matrix.
$t_k$: estimate of $\tau_k$.
$t_{k,n}$: observed fraction of words in state $A_k$ that are recalled on trial $n$.
$\tau_k$: probability of recalling word in state $A_k$.
$T$: infinite matrix of transition probabilities $\tau_k$.
$\mathrm{Var}(r_n)$: variance of the estimate of $p_n$.
$X_{i,k,n+1}$: random variable equal to 1 or 0.
REFERENCES

1. Bush, R. R., and Mosteller, Frederick. A linear operator model for learning. (Paper presented to the Institute for Mathematical Statistics, Boston, December 27, 1951.)
2. Cooke, R. G. Infinite matrices and sequence spaces. London: MacMillan, 1950.
3. Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57, 94-107.
4. Feller, W. On the theory of stochastic processes with particular reference to applications. Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, 1949, 403-432.
5. Woodbury, M. A. On a probability distribution. Ann. math. Statist., 1949, 20, 311-313.

Manuscript received 3/11/52
ULTIMATE CHOICE BETWEEN TWO ATTRACTIVE GOALS: PREDICTIONS FROM A MODEL*

Frederick Mosteller†
HARVARD UNIVERSITY
AND
Maurice Tatsuoka
UNIVERSITY OF HAWAII

A mathematical model for two-choice behavior in situations where both choices are desirable is discussed. According to the model, one or the other choice is ultimately preferred, and a functional equation is given for the fraction of the population ultimately preferring a given choice. The solution depends upon the learning rates and upon the initial probabilities of the choices. Several techniques for approximating the solution of this functional equation are described. One of these leads to an explicit formula that gives good accuracy. This solution can be generalized to the two-armed bandit problem with partial reinforcement in each arm, or the equivalent T-maze problem. Another suggests good ways to program the calculations for a high-speed computer.
The immobility of Buridan's ass, who starved to death between two haystacks, has always seemed unreasonable. No doubt the story was invented to mock an equilibrium theory of behavior. In approach-approach situations, one expects that any such equilibrium will be unstable and that one of the attractive goals will be chosen. In this paper some properties that flow from a mathematical model for repetitive approach-approach behavior are discussed. In the model for these choice situations, an organism initially shifts its choices from one to another, but after a while settles upon a single choice. Thus in the early part of the learning the theoretical organism may give expression to the notion of equilibrium by making different choices on different trials, but eventually this behavior vanishes for the single organism. On the other hand, some organisms may ultimately choose one goal and others another, so that a notion of equilibrium or balance could be recaptured across a population of organisms. The quantitative aspects of a model for such behavior are investigated. The model employed is one discussed by Bush and Mosteller [1].

*Support for this research has been received from the National Science Foundation (Grant NSF-G2258), the National Institute of Mental Health (Grant M-2293), and the Laboratory of Social Relations, Harvard University.

†We wish to acknowledge and express our appreciation for the cooperation and assistance given by Phillip J. Rulon, Albert Beaton, Wai-Ching Ho, and Donald Spearritt, who set up, programmed, and executed numerous calculations for the linear equations method of solution, and by Cleo Youtz for extensive calculations connected with every stage of the work. We also wish to thank Ray Twery and Robert R. Bush for permission to use in Table 3 some unpublished results of their calculations. Those calculations were made on the Illiac through the cooperation of the Digital Computer Laboratory of the University of Illinois, Dr. John P. Nash, Director.

This article appeared in Psychometrika, 1960, 25, 1-18. Reprinted with permission.
A simple situation will be discussed first, then the mathematical problem encountered there will be related to the more complicated two-armed bandit problem with partial reinforcement on each arm. Suppose that on each trial of an infinite sequence an organism may respond (or choose) in one of two ways. For purposes of exposition, specify the ways as R and L (for right and left, say), so that for concreteness one can think of a rat choosing the left-hand or right-hand side in a T-maze, or a person choosing the left-hand or the right-hand button in a two-armed bandit situation. However, R and L are intended to stand for a general pair of attractive objects or responses, mutually exclusive and exhaustive, which lead to attractive goals.
Suppose that on a given trial the probability of choosing R is $p$, and that of choosing L is $1 - p$, where as usual $0 \le p \le 1$. If R is chosen, then the probability of choosing R next time is increased to $\alpha_1 p + 1 - \alpha_1$, but if L is chosen, the probability of choosing R next is reduced to $\alpha_2 p$, where $0 \le \alpha_1 \le 1$, $0 \le \alpha_2 \le 1$. The point is that when a choice is made, that choice has an increased probability of being chosen next time, and both R and L are regarded as reinforcing. The asymmetry in the formulas comes from the fact that the notation uses the probability of choosing R, and not the probability of choosing the particular side chosen on each trial. The operators used to change the probabilities are discussed by Bush and Mosteller ([1], p. 154 ff.).
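The two operators just described can be written out directly. The following is a minimal sketch; the function names, the seed, and the use of Python's pseudo-random generator are illustrative assumptions, not part of the original article.

```python
import random


def update(p, chose_r, alpha1, alpha2):
    """Probability of choosing R on the next trial.

    If R was chosen, p moves toward 1:  alpha1 * p + 1 - alpha1.
    If L was chosen, p moves toward 0:  alpha2 * p.
    """
    if chose_r:
        return alpha1 * p + 1.0 - alpha1
    return alpha2 * p


def run_trials(p, alpha1, alpha2, n_trials, rng):
    """Simulate one organism; return the list of (choice, p before the choice)."""
    history = []
    for _ in range(n_trials):
        chose_r = rng.random() < p
        history.append(("R" if chose_r else "L", p))
        p = update(p, chose_r, alpha1, alpha2)
    return history


rng = random.Random(1)   # fixed seed only so that the run is repeatable
for choice, p_before in run_trials(0.5, 0.75, 0.80, 10, rng):
    print(choice, round(p_before, 4))
```

Whichever side is chosen, the probability of choosing that same side again goes up, which is the sense in which both R and L are reinforcing.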
Suppose the organism continues making the choices and that his probabilities are adjusted after every trial according to the rules just given. Then it can be shown that sooner or later the organism stops making one of the choices and thereafter chooses only the other. An extreme example occurs if both $\alpha_1$ and $\alpha_2$ are zero; then the organism chooses forever what he chooses first (one-trial learning).

One mathematical problem is to discover the probability that the organism eventually chooses R rather than L all the time. If he does choose R all the time, then he is said to be "ultimately attracted by R," or R is "ultimately attracting." The desired probability should be expressible as a function of the initial probability $p$ and of the attractiveness coefficients $\alpha_1$ and $\alpha_2$ (the smaller an $\alpha$, the more attractive the side). For convenience, this will be called the simple approach-approach problem, in contrast to the more complicated partial reinforcement problems.
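The probability of ultimate attraction by R can be estimated by simulating many independent organisms. This is a rough Monte Carlo sketch, not a method used in the article: the function names, the stopping threshold `eps`, the trial cap, and the sample size are all assumptions made for illustration. An organism is counted as attracted once its probability of R comes within `eps` of 0 or 1, which ignores the very small chance of a later reversal.

```python
import random


def attracted_to_r(p, alpha1, alpha2, rng, eps=1e-9, max_trials=5000):
    """Follow one organism until p is within eps of 0 or 1 (assumed absorbed)."""
    for _ in range(max_trials):
        if p <= eps:
            return False
        if p >= 1.0 - eps:
            return True
        if rng.random() < p:
            p = alpha1 * p + 1.0 - alpha1   # R chosen
        else:
            p = alpha2 * p                  # L chosen
    return p >= 0.5                         # fallback if the cap is reached


def estimate_attraction(p, alpha1, alpha2, n_organisms=20000, seed=0):
    """Fraction of simulated organisms ultimately attracted by R."""
    rng = random.Random(seed)
    hits = sum(attracted_to_r(p, alpha1, alpha2, rng) for _ in range(n_organisms))
    return hits / n_organisms


print(estimate_attraction(0.25, 0.75, 0.80))
```

With equal coefficients the estimate should fall close to the initial probability itself, and with both coefficients zero it reduces to one-trial learning.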
Consider now as an example a T-maze experiment with paradise fish described by Bush and Wilson [2]. On each trial of this experiment a fish started at one end of a tank and swam to the other, where the left or right side could be chosen. When the right-hand side was chosen, the fish was rewarded on 75 percent of the trials. When the left side was chosen, the fish was rewarded on 25 percent of the trials. The operation was to place the reward on one side or the other every time. In one group a transparent divider was used, so that the fish was able to see the reward when he chose the unrewarded side. In the other group an opaque divider was used. The data from these groups showed that the fish tended to stabilize on one side or the other.

Within the framework of the operators described earlier in this paper, if $p$ is the probability of choosing the right-hand side on a given trial, and if the right-hand side is chosen and rewarded, the new probability of choosing the right-hand side might be expressed as $\alpha_1 p + 1 - \alpha_1$. If the left-hand side were chosen and rewarded, the new probability of choosing the right might be reduced to $\alpha_2 p$. The parallel with the previous descriptions is very close.

But suppose the side chosen is not rewarded. Then, essentially, three possibilities exist.

(a) The side chosen is more likely to be chosen than it was before. The explanation might be, for example, that the organism is building up a habit pattern, or that he is secondarily reinforced for being in a place that earlier was rewarding.

(b) The side chosen is less likely to be chosen than before. The explanation might be, for example, that information has been received that this side is not paying off.

(c) The probability is unchanged by a nonreward; then everything depends upon the rewarded trials.

Whatever the explanation may be, the models corresponding to (a) and (b) make quite different predictions. The model for (a) says that the probability associated with the side chosen is always increased whether reward is given or not. This ultimately implies, for the operators described here, that one side is chosen every time, that is, that eventually the organism stabilizes on one side. On the other hand, the model for (b) would imply that the organism does not stabilize. To see this, suppose that an organism is certain ($p = 1$) to choose the right-hand side, that is, he has stabilized on the right. Then because of partial reinforcement the organism will experience some nonrewarded trials on the right-hand side. These will reduce the probability of choosing the right-hand side, and so the left-hand side will be chosen sometimes. A similar argument shows that the organism cannot stabilize on the left. Thus under partial reinforcement, a model for assumption (b) would typically have asymptotic instability. A subject does not become attracted by one side or the other, nor does he finally acquire a fixed probability $p$ of choosing R. Instead, his value of $p$ drifts up and down, though in a stochastically stable way. Thus model (a) has attracting and absorbing barriers, while model (b) has reflecting barriers.

In the paradise fish the data suggest model (a). In this paper we shall deal with the type (a) model. On the basis of the model, we would like to know (in terms of the learning rates, the initial probabilities, and the probabilities of reward on the two sides) what fraction of the organisms will stabilize on a given side.
Because the general problem has turned out to be rather troublesome and is time-consuming in numerical work, and because the research worker with an experiment will want to know what ground has already been plowed, we will sketch some of the various solutions that have been tried. Each of them has its interest, as shown by previous work in development and testing.

Previous Work

To facilitate discussion of previous work, a functional equation for the probability that an organism is ultimately attracted to R will be derived for the simple approach-approach problem. Let $f(p_1; \alpha_1, \alpha_2)$ be the probability that an infinite sequence of trials ends in choices of R. Here, $p_1$ is the initial probability of R. The transition rules are: if $p_n$ is the probability of choosing R on trial $n$, then the probability of R on the next trial is

$$p_{n+1} = \begin{cases} \alpha_1 p_n + 1 - \alpha_1 & \text{if R is chosen on trial } n, \\ \alpha_2 p_n & \text{if L is chosen on trial } n. \end{cases} \qquad (1)$$
In the sequel there is usually no advantage in referring to the trial number associated with $p$, so the subscript on $p_1$ is dropped and $p$ stands for the initial probability. Similarly it is always to be understood that the desired function $f$ depends upon $\alpha_1$ and $\alpha_2$; so except when the full notation is needed, the notation $f(p)$ will be used.

The quantity $f(p)$ may be composed of two parts corresponding to the choice of R or of L on the initial trial. Assume that each member of a large population has the same initial probability $p$ of choosing R and is faced with the simple approach-approach problem. Then, on the first choice the fraction $p$ of the individuals choose R, and the new probability of R is $\alpha_1 p + 1 - \alpha_1$ for any member of this group. This means that in this group, the probability of being ultimately attracted by R is $f(\alpha_1 p + 1 - \alpha_1)$. Consequently this group contributes the portion $p\,f(\alpha_1 p + 1 - \alpha_1)$ to $f(p)$. In the same manner those organisms choosing L first contribute $(1 - p)\,f(\alpha_2 p)$ to $f(p)$. Thus one derives the basic functional equation for the simple approach-approach problem:
$$f(p) = p\,f(\alpha_1 p + 1 - \alpha_1) + (1 - p)\,f(\alpha_2 p). \qquad (2)$$

The boundary conditions are $f(0) = 0$ and $f(1) = 1$. These conditions hold because if $p = 0$, then L occurs, and the new probability for R is $\alpha_2 \cdot 0 = 0$. Therefore L is always chosen. Thus $f(0) = 0$. Similarly if $p = 1$, then R occurs, and the new probability for R is $\alpha_1 \cdot 1 + 1 - \alpha_1 = 1$. Therefore R is always chosen. Thus $f(1) = 1$. These conditions are needed because (2) only determines $f$ to within a linear transformation. (If $f$ satisfies (2), direct substitution shows that $Af + B$ also satisfies it; $A$ and $B$ are constants.)

Equation (2) could have had four terms if we had related the desired probability to the probability occurring after two trials, or more generally $2^n$ terms after $n$ trials. These equations are all equivalent, but they can all be derived by successive applications of (2) to the $f$'s appearing on the right-hand side.

i. Nature of the solution. The properties of $f(p)$ have been studied by Bellman and by Shapiro ([3], Parts II and III), and by Karlin [4] (c.f. [1], p. 163-4). Since not all of their results are readily accessible, those properties of $f(p)$ especially useful here are given below. Equation (2) has a unique, monotone, analytic solution once the boundary conditions are given. With our boundary conditions the solution is concave for $\alpha_1 < \alpha_2$ and convex for $\alpha_1 > \alpha_2$. The monotonicity is consistent with the probability interpretation given by the learning model: given $\alpha_1$, $\alpha_2$, the larger the probability of choosing R initially, the more likely that R is ultimately attracting.

ii. Solutions under special conditions. In what follows, suppose the relevant boundary conditions $f(0) = 0$ and $f(1) = 1$ to hold. The special conditions have to do with the values assumed by one or both of the $\alpha$'s.

(a) $\alpha_1 = \alpha_2 \ne 1$. The solution is $f(p) = p$, as implied by the fact that $f(p)$ is both convex and concave.

(b) $\alpha_1 = \alpha_2 = 1$. The function $f$ is not defined in our problem unless $p = 0$ or 1, as in our boundary conditions, because $p$ never changes and no attraction occurs.

(c) $\alpha_1 = 1$, $\alpha_2 \ne 1$. Because the occurrence of R leaves the probability of choosing R unchanged ($\alpha_1 p + 1 - \alpha_1 = p$), the process can only move toward choosing more L's unless $p = 1$. Thus $f(p; 1, \alpha_2) = 0$ for $\alpha_2 \ne 1$, $p \ne 1$, and $f(1; 1, \alpha_2) = 1$.

(d) $\alpha_2 = 1$, $\alpha_1 \ne 1$. Similarly $f(p; \alpha_1, 1) = 1$ for $\alpha_1 \ne 1$, $p \ne 0$, and $f(0; \alpha_1, 1) = 0$.

(e) $\alpha_1 = 0$. Here, the only way to be ultimately attracted to L is always to choose L. The probability of the latter behavior is

$$g(p, \alpha_2) = (1 - p)(1 - \alpha_2 p)(1 - \alpha_2^2 p) \cdots = \prod_{i=0}^{\infty} (1 - \alpha_2^i p). \qquad (3)$$

Therefore the probability of ultimate attraction by R is

$$f(p; 0, \alpha_2) = 1 - g(p, \alpha_2). \qquad (4)$$
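The infinite product for the case $\alpha_1 = 0$ converges quickly and is easy to evaluate. The sketch below is illustrative only: the function names, the truncation tolerance, and the term cap are assumptions. It also checks numerically that the result satisfies the functional equation (2), which for $\alpha_1 = 0$ reads $f(p) = p\,f(1) + (1 - p)\,f(\alpha_2 p)$.

```python
def prob_always_l(p, alpha2, tol=1e-16, max_terms=100000):
    """Product over i >= 0 of (1 - alpha2**i * p), truncated once terms are ~1."""
    product = 1.0
    term = p                      # alpha2**i * p, starting at i = 0
    for _ in range(max_terms):
        product *= 1.0 - term
        term *= alpha2
        if term < tol:
            break
    return product


def f_alpha1_zero(p, alpha2):
    """Probability of ultimate attraction by R when alpha1 = 0."""
    return 1.0 - prob_always_l(p, alpha2)


alpha2 = 0.80
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    lhs = f_alpha1_zero(p, alpha2)
    rhs = p * 1.0 + (1.0 - p) * f_alpha1_zero(alpha2 * p, alpha2)
    print(p, round(lhs, 6), round(rhs, 6))
```

The two printed columns should agree, because dropping the first factor of the product is the same as replacing $p$ by $\alpha_2 p$.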
close to the true ones. In the remainder of this paper, several techniques for approximating $f(p)$ are considered. A method designed for high-speed calculation will be considered first, and an excellent approximation obtained from a differential equation will then be provided; then that result will be extended to the two-armed bandit problem. Finally, brief mention of some other methods of approximating this functional equation will be given.

Approximation by Simultaneous Equations

Consider a grid of numbers $p_0\ (= 0), p_1, p_2, \cdots, p_n, p_{n+1}\ (= 1)$ in the unit interval, and write the functional equation (2) as it applies to each of these values. (Lest confusion with earlier notation develop, note that $p_i$ still refers to probabilities, but the subscripts no longer correspond to trials as they did in earlier sections.) Then one has the set of equations

$$\begin{aligned} f(p_0) &= 0 + f(0), \\ f(p_1) &= p_1 f(\alpha_1 p_1 + 1 - \alpha_1) + (1 - p_1) f(\alpha_2 p_1), \\ f(p_2) &= p_2 f(\alpha_1 p_2 + 1 - \alpha_1) + (1 - p_2) f(\alpha_2 p_2), \\ &\;\;\vdots \\ f(p_n) &= p_n f(\alpha_1 p_n + 1 - \alpha_1) + (1 - p_n) f(\alpha_2 p_n), \\ f(p_{n+1}) &= f(1) + 0. \end{aligned} \qquad (7)$$

The first and last members of this set of equations are, of course, tautologies; there are only $n$ nontrivial equations.
The right-hand sides of the $n$ nontrivial equations of the set (7) each involves the values of $f(p)$ at points that do not ordinarily coincide with any of the chosen grid points. However, by using an interpolation formula, both $f(\alpha_1 p_i + 1 - \alpha_1)$ and $f(\alpha_2 p_i)$, $i = 1, 2, \cdots, n$, may be approximated by linear combinations of the values of $f(p)$ at two or more consecutive grid points $p_j, p_{j+1}, \cdots$. The number of grid points required depends upon whether one uses linear interpolation (two grid points), interpolation with second differences (three points), third differences (four points), and so forth. Whatever the number of points may be, each equation of the set (7) can be replaced by an approximate equality involving as unknowns just the values of $f(p)$ at several predetermined grid points, and these unknowns occur only linearly. Thus a system of $n$ linear equations is obtained, approximately satisfied by the $n$ unknown quantities, $f(p_1), f(p_2), \cdots, f(p_n)$. The idea of deriving a system of linear equations whose roots approximate $f(p_i)$, $i = 1, 2, \cdots, n$, was first suggested to us by J. Arthur Greenwood in an unpublished memorandum, in which linear interpolation was used to approximate $f(\alpha_1 p_i + 1 - \alpha_1)$ and $f(\alpha_2 p_i)$.
In this and following sections a standard example in which $\alpha_1 = .75$, $\alpha_2 = .80$ is used to illustrate the numerical precision attained from various methods. This example has the advantage of being easily displayed; further, numbers are fairly easy to compute for it. It has the disadvantage of being relatively easy to fit, so the reader should not be misled into thinking that such precision is always obtainable.

Example. The method just described is illustrated for a grid of five equally spaced points, using the standard example, $\alpha_1 = 0.75$, $\alpha_2 = 0.80$. Here, the functional equation is

$$f(p) = p\,f(0.75p + 0.25) + (1 - p)\,f(0.80p). \qquad (8)$$

Taking $p_1 = 0.25$, $p_2 = 0.50$, $p_3 = 0.75$, and writing $f(p_i) = f_i$ for short, in accordance with equations (7),

$$\begin{aligned} f_1 &= 0.25\,f(0.4375) + 0.75\,f(0.20), \\ f_2 &= 0.50\,f(0.6250) + 0.50\,f(0.40), \\ f_3 &= 0.75\,f(0.8125) + 0.25\,f(0.60). \end{aligned} \qquad (9)$$

First, linear interpolation will be used to approximate $f(0.4375)$, $f(0.20)$, $f(0.6250)$, etc., by means of linear combinations of the five $f$'s: $f_0\ (= 0)$, $f_1$, $f_2$, $f_3$ and $f_4\ (= 1)$. Thus,

$$f(0.4375) \cong \frac{0.5000 - 0.4375}{0.25} f_1 + \frac{0.4375 - 0.2500}{0.25} f_2 = 0.25 f_1 + 0.75 f_2,$$
$$f(0.20) \cong \frac{0.25 - 0.20}{0.25} f_0 + \frac{0.20}{0.25} f_1 = 0.80 f_1,$$

and, similarly,

$$\begin{aligned} f(0.6250) &\cong 0.50 f_2 + 0.50 f_3, & f(0.40) &\cong 0.40 f_1 + 0.60 f_2, \\ f(0.8125) &\cong 0.75 f_3 + 0.25 f_4, & f(0.60) &\cong 0.60 f_2 + 0.40 f_3. \end{aligned}$$

Substituting these approximate expressions for the several functional values in the right-hand sides of (9) and collecting all terms involving the unknowns into the left-hand sides, one obtains

$$\begin{aligned} 0.3375 f_1 - 0.1875 f_2 &\cong 0, \\ -0.2000 f_1 + 0.4500 f_2 - 0.2500 f_3 &\cong 0, \\ -0.1500 f_2 + 0.3375 f_3 &\cong 0.1875. \end{aligned} \qquad (10)$$
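The linear-interpolation system can be assembled and solved mechanically for any grid. The sketch below is not the program used by the authors: the function names are hypothetical, and a small Gauss-Jordan elimination is written out by hand so that the example needs no external library. For the five-point grid and $\alpha_1 = 0.75$, $\alpha_2 = 0.80$ it builds the same three equations as (10).

```python
def linear_weights(x, grid):
    """Weights of f(x) on the two grid points that bracket x."""
    for j in range(len(grid) - 1):
        if grid[j] <= x <= grid[j + 1]:
            t = (x - grid[j]) / (grid[j + 1] - grid[j])
            return {j: 1.0 - t, j + 1: t}
    raise ValueError("x lies outside the grid")


def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def approximate_f(grid, alpha1, alpha2):
    """Approximate f at the interior grid points, with f(0) = 0 and f(1) = 1."""
    last = len(grid) - 1
    n = last - 1
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(1, last):
        p = grid[i]
        A[i - 1][i - 1] += 1.0
        for x, coeff in ((alpha1 * p + 1.0 - alpha1, p), (alpha2 * p, 1.0 - p)):
            for j, w in linear_weights(x, grid).items():
                if j == last:
                    b[i - 1] += coeff * w          # known value f(1) = 1
                elif j != 0:                       # f(0) = 0 drops out
                    A[i - 1][j - 1] -= coeff * w
    return solve(A, b)


print(approximate_f([0.0, 0.25, 0.50, 0.75, 1.0], 0.75, 0.80))
```

Passing a finer or unequally spaced grid changes nothing in the code, which is one way to explore how the error depends on the spacing.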
Replacing the $\cong$ by $=$ in the set of approximations (10) and solving the resulting equations, one obtains the following approximations to $f_i$. (The best available values are also shown for comparison.)

The agreement with the best available values is only fair.

Now use second-order interpolation for approximating the non-grid-point values of $f(p)$ that occur in the right-hand sides of (9). The general formula (with equally spaced grid points) is

$$f(x_i + \epsilon) \cong \frac{1}{2}\frac{\epsilon}{\Delta x}\left(\frac{\epsilon}{\Delta x} - 1\right) f(x_{i-1}) + \left[1 - \left(\frac{\epsilon}{\Delta x}\right)^2\right] f(x_i) + \frac{1}{2}\frac{\epsilon}{\Delta x}\left(\frac{\epsilon}{\Delta x} + 1\right) f(x_{i+1}), \qquad (11)$$

where $\Delta x = x_{i+1} - x_i$. Note that (11) gives the interpolated value as a weighted average of the three adjacent tabled values instead of using differences.

Applying (11) to the problem at hand and substituting these approximate expressions into the right-hand sides of (9), one obtains the following system of approximations,

$$\begin{aligned} 0.2410 f_1 - 0.1744 f_2 + 0.0235 f_3 &\cong 0, \\ -0.1400 f_1 + 0.3925 f_2 - 0.3150 f_3 &\cong -0.0625, \\ -0.0497 f_2 + 0.1369 f_3 &\cong 0.0872, \end{aligned} \qquad (12)$$

whose roots yield the following approximations.

These results are a definite improvement over those obtained by linear interpolation.

The above example seems to indicate that a considerable improvement of the approximation can be expected when higher differences are used in the interpolation formula for expressing the non-grid-point values of $f(p)$ in terms of the grid-point values. However, the interpolation formulas become more and more cumbersome to work with numerically as higher differences are included. It therefore is pertinent to see how much improvement can be gained by increasing the number of grid points alone.
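The three-point (second-order) interpolation weights are simple enough to compute directly. The sketch below assumes the usual Lagrange form on equally spaced points, with `theta` standing for the offset from the central grid point in units of the grid spacing; the function name is hypothetical. The last lines rebuild the coefficient of $f_1$ in the first second-order equation of the standard example, where $f(0.4375)$ is centered on $p = 0.50$ and $f(0.20)$ on $p = 0.25$.

```python
def second_order_weights(theta):
    """Weights on f(x_{i-1}), f(x_i), f(x_{i+1}) for the point x_i + theta * dx."""
    return (0.5 * theta * (theta - 1.0),
            1.0 - theta * theta,
            0.5 * theta * (theta + 1.0))


# f(0.4375): centered at 0.50, theta = (0.4375 - 0.50) / 0.25 = -0.25
w_upper = second_order_weights(-0.25)   # weights on f_1, f_2, f_3
# f(0.20): centered at 0.25, theta = (0.20 - 0.25) / 0.25 = -0.20
w_lower = second_order_weights(-0.20)   # weights on f_0, f_1, f_2

# f_1 = 0.25 f(0.4375) + 0.75 f(0.20); move the f_1 terms to the left-hand side
coefficient_f1 = 1.0 - (0.25 * w_upper[0] + 0.75 * w_lower[1])
print(w_upper, w_lower, round(coefficient_f1, 4))
```

The weights always sum to one, so the interpolated value is a weighted average of the three neighbouring grid values, although one or two of the weights may be negative.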
TABLE 1
Improvement Obtained by Increasing the Number of Points in Grid, using Linear Interpolation only; Entries are Approximate Values of $f$.

Using only linear interpolation, approximations from grids of 4, 5, 6, 11, and 21 points were obtained. These points were not equally spaced because it was hoped that better results would be obtained by spacing the grid so that the functional values would be approximately equally spaced. Information needed for such spacing was available from other methods described later. Linear interpolations were made in the results for the five grids described above to obtain approximate values at 0.10, 0.25, 0.50, 0.75, 0.90. The numbers are shown in Table 1, together with the best known values.

Using the best value and the cell entry for a given $p_i$ as a measure of error, it will be noted that, very roughly, the error decreases linearly with the spacing. On the other hand, with a five-point grid, changing from linear to second-order interpolation gives improvement roughly equivalent to that given by increasing the number of points to 21 and using linear interpolation only. Since simultaneous equations are expensive to solve, it appears that second-order interpolation is well worth the effort, contrary to usual advice.

Calculations, with the aid of an electronic computer, using 21 grid points and second-difference as well as third-difference interpolation have been made. The results are summarized in Table 2. The results obtained using third-order differences are hardly distinguishable from those obtained by using second-order differences, though in a more sharply curved example they could be more useful. The numbers in the third-order interpolation column provided the numbers labeled "best values" throughout this paper.

In principle, any desired degree of accuracy can be attained by using finer grids, but the cost of the calculations increases roughly as the square of the number of grid points used. A high-speed computer could be programmed to write its own equations and solve them, but such a program was not written. If good accuracy is required, the techniques proposed in this section are recommended.

TABLE 2
Approximations Using 21 Grid Points With Second-Order and Third-Order Interpolations, and Approximation By Differential Equation

Approximation by Differential Equation

An essential feature of the simultaneous-equations approximation discussed in the preceding section was the replacement of non-grid-point values of $f(p)$ by linear combinations of grid-point values. The continuous variable analogue of this procedure is the expansion of $f(\alpha_1 p + 1 - \alpha_1)$ and $f(\alpha_2 p)$ as Taylor's series in the neighborhood of $p$. This approach will now be used to derive a differential equation whose solution yields an approximation to the desired function, $f(p)$.

Rewriting $f(\alpha_1 p + 1 - \alpha_1)$ as $f(p + (1 - \alpha_1)(1 - p))$, and expanding the latter as a Taylor's series,

$$f(p + (1 - \alpha_1)(1 - p)) = f(p) + (1 - \alpha_1)(1 - p)\,f'(p) + \frac{(1 - \alpha_1)^2 (1 - p)^2}{2!}\,f''(p) + \cdots, \qquad (13)$$
FREDERICK
where
/' and /"
MOSTELLER
the first and
are
Kv
(1
509
TATSUOKA
derivatives
of
with
respect
to p.
follows:
a,)v)
MAURICE
second
Similarly,expand /(aap) as
/(a.p)
AND
f(p)
(1
a,)pf(p)
(14)
^^^^np)-
Using only through the term in f'(p) in the two series (13) and
substitute these expressionsfor the functions in the right-hand side
functional
equation (2).The result is a differential equation
^^^
(15)
^f^^^^+
^^
(1
in
By rearranging terms
M(l
ai)'
[(1
"^^^^
+ ^(^
P'^^'^P^
p)[f(p)
(1
"')'(! P)'/"(P)]
+ Ki
a,)pr(j))
(14),
of the
a,)yr'(p)].
(15),
"i)'
(1
a,y]p}r'(p)+ {a,
aOfip)
0.
Hence,
f'ip)
(16)
2(a,
I'iv)
which
is
"2
"i)'
r/1
[(1
constant
Integratingboth
of
Ci and
sides of
C2
are
aO^
1, the final form
(19)
is an
abbreviation
for
+ a2)/2.
{oci
(17),
"]l+l/(l-a)
^2
of
C2 from
of
"|i/(i-a)
^^
/v
"
constants
new
Determining Ci and
=
and
integration,
[n
where
x2
,-,
(1
..
integratedto yield
Ci is a
where
^2^"
cc,Y]p
(1
[n
/(I)
gQ
_
-
^//^^
integration.
the
boundary
conditions
f{p) is
/(?")"."
A^
,.a.
..
{A
ly
where
(1
^=.
(1
a,r
-^y
(1
",)=
and
^
1
(ai
+a.)/2"^^-
/(O)
0 and
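As a hedged illustration of the closed form (19), the short sketch below evaluates it numerically. The function name `f_closed` is hypothetical, and the sketch assumes α1 < α2 so that A > 1 and the fractional powers are real.

```python
def f_closed(p, alpha1, alpha2):
    """Differential-equation approximation (19); assumes alpha1 < alpha2."""
    d = (1 - alpha1) ** 2 - (1 - alpha2) ** 2
    a = (1 - alpha1) ** 2 / d                      # the constant A in (19)
    beta = 1 + 1 / (1 - (alpha1 + alpha2) / 2)     # exponent 1 + 1/(1 - alpha_bar)
    return (a ** beta - (a - p) ** beta) / (a ** beta - (a - 1) ** beta)


# Constants for alpha1 = 0.75, alpha2 = 0.80
A = 0.25 ** 2 / (0.25 ** 2 - 0.20 ** 2)
BETA = 1 + 1 / (1 - 0.775)
values = {p: f_closed(p, 0.75, 0.80) for p in (0.25, 0.50, 0.75)}
```

The boundary conditions f(0) = 0 and f(1) = 1 hold by construction, so any discrepancy from the grid methods shows up only at interior points.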
Example: Taking α1 = 0.75, α2 = 0.80, as before, calculate the constants occurring in (19):

$$A = \frac{(0.25)^2}{(0.25)^2 - (0.20)^2} = 2.7778, \qquad 1 + \frac{1}{1-\bar\alpha} = 5.4444.$$

Hence, from (19),

$$f(p) = \frac{260.42 - (2.7778 - p)^{5.4444}}{260.42 - (1.7778)^{5.4444}}. \qquad (20)$$

Using (20), calculate the values of f(p) for p = 0.25, 0.50, and 0.75. Values of f(p) in intervals of 0.05 for p are shown in Table 2, where they may be compared with the best values so far obtained. Among the various approximate methods discussed, the differential equation method yields results in closest agreement with those obtained by the simultaneous equations using 21 grid points and third-difference interpolation. The differential equation method can be easily carried out on desk calculators.

Two-Armed Bandit

The differential equation approach can equally easily be applied to the more general model appropriate to the two-armed bandit problem with partial reinforcement on each arm (or the equivalent T-maze experiment). Suppose that there are two responses, R and L, and that reward or nonreward follows whichever occurs, as follows. If R occurs, reward follows with probability π1; if L occurs, reward follows with probability π2. If p is the probability of R on a given trial, the new probability for R is as follows:

    New probability for R     Event                  Probability of happening
    α1 p + 1 − α1             if R and reward        π1 p
    α2 p + 1 − α2             if R and nonreward     (1 − π1) p
    α1 p                      if L and reward        π2 (1 − p)
    α2 p                      if L and nonreward     (1 − π2)(1 − p)

These operators are a special case of those presented in [1] (pp. 118, 286) and discussed briefly in the paragraph following equation (13.22) on p. 287 of [1].
512
READINGS
of
Twery and Bush

/(0.50)for two-armed
made
bandit
combinations
used
calculate
the value
the stated
parameter
For
for Two-Armed
series of Monte
experiments
The
of a-values.
various
to
PSYCHOLOGY
Calculations
Carlo
Monte
MATHEMATICAL
IN
Carlo
with
of
case
/(0.50)from
of
Bandits
ai
ti
calculations
0.75, ^2
0.90, 0:2
"
on
=
lUiac
0.25 for
0.95 will be
(23).
values.
(0.75)(0.10)^
+ (0.25)(0.05)^
=
(0.50)[(0.10)' (0.05)']
2.1667,
+
1
Hence, (23) in
this
65015.7
From
f(p)
this
(2.1667
65006.6
formula,
0.977,
The
of alpha
p)"'^'
/(0.50)
compared
14.3333.
becomes
case
(24)
(1.85/2)
with
values
values
Twery and Bush's result,0.970.
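The arithmetic of this example can be re-traced with a few lines of code. The sketch below is an assumption-laden check, not the authors' formula (23) itself: it takes the displayed ratio as the constant A, takes the exponent as 1 + 1/(1 − 1.85/2), and assumes the solution has the same shape as (19), which is what the constants 65015.7 and 65006.6 in (24) suggest.

```python
alpha1, alpha2 = 0.90, 0.95

# Constant displayed in the text: 2.1667
A = (0.75 * 0.10 ** 2 + 0.25 * 0.05 ** 2) / (0.50 * (0.10 ** 2 - 0.05 ** 2))
# Exponent displayed in the text: 14.3333
BETA = 1 + 1 / (1 - (alpha1 + alpha2) / 2)


def f24(p):
    """Numerical form (24), assuming the same structure as (19)."""
    return (A ** BETA - (A - p) ** BETA) / (A ** BETA - (A - 1) ** BETA)


f_half = f24(0.50)   # to be compared with 0.977 in the text
```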

of /(0.50),calculated from
(23) for the various combinations
shown
in Table
and Bush, are
3 along with
used by Twery
3
TABLE
Comparison
of
With
Obtained
At
Those
from
the
Differential
800th
of
the
Twery
And
Level
Probability
Mean
Trial
{second
Bush
and
for
IT,
Various
1
a,,
"1'
-
(first
Results
Equation
of
a,,
"2'
0.75
entry)
entry)
100
for
Sequences
p
0.5
FREDERICK
the
Monte
obtained
were
in
run
The
100
result
AND
obtained
by
random
numbers.
The
trial 800.
at
sequences
MAURICE
100
it has
The
between
agreement
the Monte
numbers
of 800
sequences
were
trials each
of p
value
average
variation
random
some
pre-asymptotic to the extent that 800 trials

agreement is quite encouraging for the use
method.
Their
entry itself is the
Thus
513
TATSUOKA
authors.
these
in which
pseudo-experiment
with
for the
is
Carlo
MOSTELLER
and
is not
an
of the
differential-equation
Carlo
infinite number.
results and
the differential
used
equation is surprisinglyclose,consideringthat only 100 sequences
were
that the differential equation is only an
approximation. On the other
hand, both learning parameters are near
unity in these examples; in that
neighborhood the differential equation should be quite a good approximation.
and
T-maze
Experiment
In
the
Wilson
first section
[2]using
for response
and
and
a
was
when
in which
they
(estimatedfrom
probabiHty for response R

varied considerablyfrom
trials)
0.496,
of p approximately
nearly 0.50. Bush
or
followed
(25)
7/
This
attracted
by
R. The
relative
Wilson
the
=
the
of reward
rate
of
to be
results
on
symmetrical
Beta
0.75
0.75
0.916
through
initial
side. The
the first 10 of the 140
another, the
report that
ai
the reward
see
unrewarded
and
was
model, tti
our
estimated
were
fish to
one
and
The
the fish could
chose
Bush
experiment by
L. In the notation
learning-rate
parameters
divider
transparent
T-maze
described.
0.25 for response
for the group
0.942
ttz
R and
Fish
of this paper,
paradise fish
0.25. The
TTj
with Paradise
value
average
the
being
distribution
3.61[p(l -p)f-\
was
areas
used
under
to
calculate
the
curve
the
expected
(25) in
the ten
fraction
intervals
[0,0.1],[0.1,0.2],"".,[0.9,1.0]
found, the
were
and
In the
to
f(p) at
the
midpoints
was
obtained.
Wilson
found
of these intervals
The
15
result
was
of the 22
lated,
calcu-
were
f{p)
0.800.
fish in the
perimental
ex-
This
after about
100 trials,
making nearly all R responses
for
0.68
the proportion ultimately attracted to the R
result is only about one
standard
from the fitted
error
away
the estimate
response.
value
of
weighted average
experiment, Bush and
group
leads
values
their
That
0.80. That
of the
unreliability
small
deviation
does
originalestimates
not
even
take
any
account
of the
of the a's.
Other Methods

Several other methods of approximating the function have been explored. One that was rather successful employed the function f(p; α, 0) or f(1 − p; 0, α), choosing a value of α that made the iterate change very little. This method was superior to an iteration technique beginning with f_0(p) = p.

Since one knows exactly the solution to the functional equation in the special case α1 = α2, the notion of expanding f(p; α1, α2) as a power series in α2 in the neighborhood of α1 suggests itself. Robert R. Bush, in an unpublished note, developed such a technique.
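For comparison, the plain iteration technique beginning with f_0(p) = p can be sketched in a few lines. This is an illustrative sketch only: it assumes the functional equation f(p) = p f(α1 p + 1 − α1) + (1 − p) f(α2 p), the form implied by the expansion (15), and the function name `iterate` is hypothetical.

```python
def iterate(p, k, alpha1, alpha2):
    """k-th iterate f_k(p) of the assumed functional equation, f_0(p) = p."""
    if k == 0:
        return p
    up = iterate(alpha1 * p + 1 - alpha1, k - 1, alpha1, alpha2)
    down = iterate(alpha2 * p, k - 1, alpha1, alpha2)
    return p * up + (1 - p) * down


f8 = iterate(0.5, 8, 0.75, 0.80)
f16 = iterate(0.5, 16, 0.75, 0.80)   # cost doubles with every extra iterate
```

Each additional iterate doubles the number of function evaluations, and the iterates creep toward the limit slowly, which is consistent with the remark that a better starting function is worth having. In the special case α1 = α2 the starting function f_0(p) = p is already the exact solution.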
REFERENCES

[1] Bush, R. R. and Mosteller, F. Stochastic models for learning. New York: Wiley, 1955.
[2] Bush, R. R. and Wilson, T. R. Two-choice behavior of paradise fish. J. exp. Psychol., 1956, 51, 315-322.
[3] Harris, T. E., Bellman, R., and Shapiro, H. N. Studies in functional equations occurring in decision processes. Res. Memo. P-382, RAND Corp., Santa Monica, Calif., 1953.
[4] Karlin, S. Some random walks arising in learning models I. Pacific J. Math., 1953, 3, 725-756.

Manuscript received 1/9/59
Revised manuscript received 6/29/59
THEORY OF DISCRIMINATION LEARNING

FRANK RESTLE
Stanford University*

This article appeared in Psychol. Rev., 1955, 62, 11-19. Reprinted with permission. This paper is adapted from part of a Ph.D. dissertation submitted to Stanford University. The author is especially indebted to Dr. Douglas H. Lawrence and to Dr. Patrick Suppes for encouragement and criticism. Thanks are also due Dr. W. K. Estes who loaned prepublication manuscripts and Dr. R. R. Bush who pointed out some relations between the present theory and the Bush-Mosteller model (3).
* Now at the Human Resources Research Office, The George Washington University.

This paper presents a theory of two-choice discrimination learning. Though similar in form to earlier theories of simple learning by Estes (5) and Bush and Mosteller (2, 3), this system introduces a powerful new assumption which makes definite quantitative predictions easier to obtain and test. Several such predictions dealing with learning and transfer are derived from the theory and tested against empirical data.

The stimulus situation facing a subject in a trial of discrimination learning is thought of as a set of cues. A subset of these cues may correspond to any thing, concrete or abstract, present, past, or future, of any description, to which the subject can learn to make a differential response. In this definition it does not matter whether the subject actually makes a differential response to the cues as long as he has the capacity to learn one. Informally, the term "cue" will occasionally be used to refer to any set of cues, all of which are manipulated in the same way during a whole experiment.

An individual cue is thought of as "indivisible" in the sense that different responses cannot be learned to different parts of it. In problems to be analyzed by this theory, every individual cue is either "relevant" or "irrelevant." A cue is relevant if it can be used by the subject to predict where or how reward is to be obtained. For example, if food is always found behind a black card in a rat experiment, then cues aroused by the black card are relevant. A cue aroused by an object uncorrelated with reward is "irrelevant." For example, if the reward is always behind the black card but the black card is randomly moved from left to right, then "position" cues are irrelevant. These concepts are discussed by Lawrence (6).

In experiments to be considered, the subject has just two choice responses. No other activities are considered in testing the theory. Any consistent method of describing these two responses which can be applied throughout a complete experiment is acceptable in using this theory.

Theory

In solving a two-choice discrimination problem the subject learns to relate his responses correctly to the relevant cues. At the same time his responses become independent of the irrelevant cues. These two aspects of discrimination learning are represented by two hypothesized processes, "conditioning" and "adaptation."

Intuitively, a conditioned cue is one which the subject knows how to use in getting reward. If k is a relevant cue and c(k,n) is the probability that k has been conditioned at the beginning of the nth trial, then

$$c(k,n+1) = c(k,n) + \theta\,[1 - c(k,n)] \qquad [1]$$
is the probability that it will be conditioned by the beginning of the next trial. On each trial of a given problem a constant proportion, θ, of unconditioned relevant cues becomes conditioned. To the extent that an unconditioned relevant cue affects performance, it contributes equally to a correct and to an incorrect response, whereas a conditioned cue contributes only to a correct response.

Intuitively, an adapted cue is one the subject does not consider in deciding upon his choice response. If a cue is thought of as a "possible solution" to the problem, an adapted cue is a possible solution which the subject rejects or ignores. If a(k,n) is the probability that irrelevant cue k has been adapted at the beginning of the nth trial, then

$$a(k,n+1) = a(k,n) + \theta\,[1 - a(k,n)] \qquad [2]$$

is the probability that it will be adapted by the beginning of the next trial. On each trial of a given problem a constant proportion of unadapted irrelevant cues becomes adapted. An adapted cue is nonfunctional in the sense that it contributes neither to a correct nor to an incorrect response.

It will be noticed that the same constant θ appears in both equations 1 and 2. The fundamental simplifying assumption of this theory deals with θ. This assumption is that

$$\theta = \frac{r}{r + i} \qquad [3]$$

where r is the number of relevant cues in the problem and i is the number of irrelevant cues. Thus, θ is the proportion of relevant cues in the problem. This proportion is the same as the fraction of unconditioned cues conditioned on each trial, and the fraction of unadapted cues adapted on each trial.

The performance function p(n), representing the probability of a correct response on the nth trial, is in accord with the definitions of conditioning and adapting given above. The function is in the form of a ratio, with the total number of unadapted cues in the denominator and the number of conditioned cues plus one-half times the number of other cues in the numerator. Thus conditioned cues contribute their whole effect toward a correct response, adapted cues contribute nothing toward either response, and other cues contribute their effect equally toward correct and incorrect responses. Formally,

$$p(n) = \frac{\sum^{r} c(k,n) + \tfrac12 \sum^{r}\,[1 - c(k,n)] + \tfrac12 \sum^{i}\,[1 - a(k,n)]}{r + \sum^{i}\,[1 - a(k,n)]} \qquad [4]$$

Here $\sum^{r}$ is the sum taken over the r relevant cues and $\sum^{i}$ is the sum taken over the i irrelevant cues.

Some Consequences Regarding Simple Learning

If the subject is naive at the beginning of training, so that for any relevant cue k, c(k,1) = 0, and for any irrelevant cue k, a(k,1) = 0, and if he receives n trials on a given problem, it can be shown by mathematical induction that if k is relevant,

$$c(k,n+1) = 1 - (1-\theta)^{n} \qquad [5]$$

and if k is irrelevant,

$$a(k,n+1) = 1 - (1-\theta)^{n}. \qquad [6]$$

Under these circumstances we can substitute equations 5 and 6 into equation 4 and, taking advantage of
the simplifying effects of equation 3, we have

$$p(n) = 1 - \frac{\tfrac12\,(1-\theta)^{n-1}}{\theta + (1-\theta)^{n}}. \qquad [7]$$

Plotting equation 7 shows that p is an S-shaped function of n with an asymptote (for θ > 0) at 1.00. Also, p(1) = 1/2. Since p(n) is a monotonic increasing function of θ we can estimate θ from observations of performance. If we want to know the theoretical proportion of relevant cues in a problem for a particular subject, we have the subject work on the problem, record his performance curve, and solve equation 7 for θ. This result depends directly upon the simplifying assumption of equation 3.

Since the instability of individual learning curves makes it difficult to fit curves to them, it is fortunate that θ can be determined in a different way. Suppose that a subject makes E errors in the course of solving the problem to a very rigorous criterion and it is assumed for practical purposes that he has made all the errors he is going to make. Theoretically, the total number of errors made on a problem can be written

$$E = \sum_{n=1}^{\infty}\,[1 - p(n)].$$

Under the conditions satisfying equation 7, this can be evaluated approximately by using the continuous time variable t in place of the discrete trial variable n, and integrating. The result of this integration is that

$$E \cong \frac12 + \frac12\,\frac{\log\theta}{(1-\theta)\log(1-\theta)}. \qquad [8]$$

By equation 8, which relates θ to the total number of errors made, it is possible to make stable estimates of θ.

An Empirical Test of the Simple Learning Theory: Combination of Cues

Consider three problems, S1, S2, and S3, all of which involve the same irrelevant cues. Two of the problems, S1 and S2, have entirely separate and different relevant cues, while in problem S3 all the relevant cues of S1 and S2 are present and relevant. Then r3 = r1 + r2 and i1 = i2 = i3. If we know θ1 and θ2 we can compute θ3, since by equation 3

$$\theta_1 = \frac{r_1}{r_1 + i}, \qquad \theta_2 = \frac{r_2}{r_2 + i}, \qquad \theta_3 = \frac{r_1 + r_2}{r_1 + r_2 + i}.$$

Solving these equations for θ3 in terms of θ1 and θ2 we get

$$\theta_3 = \frac{\theta_1 + \theta_2 - 2\theta_1\theta_2}{1 - \theta_1\theta_2}. \qquad [9]$$

This theorem answers the following question: Suppose we know how many errors are made in learning a problem in which cue X is used, and how many errors are made in learning to use cue Y; then how many errors will be made in learning a problem in which either X or Y can be used (if X and Y are entirely discrete)?

Eninger (4) has run an experiment which tests equation 9. Three groups of white rats were run in a T-maze on successive discrimination problems. The first group learned a visual discrimination, black-white, the second group learned an auditory discrimination, tone-no-tone, and the third group had both cues available and relevant. Since a rigorous criterion was used, total error scores are used to estimate θ1 and θ2 by equation 8.[4] The values estimated are θ1 = .020, based on an estimated average of 98.5 errors made on the auditory-cue problem, and θ2 = .029, based on an estimated average of 64.5 errors on the visual-cue problem. Putting these two values into equation 9 we get

$$\theta_3 = \frac{.020 + .029 - 2(.020)(.029)}{1 - (.020)(.029)} = .048.$$

This value of θ3 substituted into equation 8 leads to the expectation of about 33 total errors on the combined-cues problem. In fact, an average of 26 errors was made by the four subjects on this problem. The prediction is not very accurate. However, only 14 animals were employed in the entire experiment, in groups of five, five, and four. Individual differences among animals within groups were considerable. If account is taken of sampling variability of the single-cue groups and of the combined-cue group of subjects, the prediction is not significantly wrong. Further experimentation is needed to determine whether the proposed law is tenable.

It is easily seen that θ3 will always be larger than θ1 or θ2 if all three problems are solved. Learning will always be faster in the combined-cues problem. Eninger (4) in his paper points out that this qualitative statement is a consequence of Spence's theory of discrimination. However, Spence's theory gives no quantitative law.

Transfer of Training

To apply this theory to transfer-of-training experiments in which more than one problem is used, certain assumptions are made. It is assumed that if a cue is conditioned in one problem and appears immediately thereafter as a relevant cue in a new problem, it is still conditioned. Likewise, an adapted cue appearing as an irrelevant cue in a new problem is adapted. However, if a conditioned cue is made irrelevant it is obviously no longer conditioned, since it cannot serve as a predictor of reward. Similarly, it is assumed that if an adapted cue is made relevant in a new problem, it becomes unadapted and available for conditioning.

According to the present definition of conditioning, a conditioned cue contributes to a correct response. Therefore the above assumptions will not hold if the relation between a cue and a reward is reversed in changing the problem. This theory cannot be used to analyze reversal learning, and is applicable only in cases in which relevant cues maintain an unchanging significance.

If two problems are run under the same conditions in the same apparatus, and differ only in the degree of difference between the discriminanda (as where one problem is a black-white discrimination and the other is a dark gray-light gray discrimination), it is assumed that both problems involve the same cues; but the greater the difference to be discriminated, the more cues are relevant and the less are irrelevant.

Empirical Tests of the Transfer-of-Training Theory

As Lawrence (7) has pointed out, it seems that a difficult discrimination is more easily established if the subjects are first trained on an easy problem of the same type than if all training is given directly on the difficult discrimination. The experimental evidence on this point raises the question of predicting transfer performance

[4] Total error scores do not appear in Eninger's original publication and are no longer known. However, trials-to-criterion scores were reported. Total error scores were estimated from trials-to-criterion scores by using other, comparable data collected by Amsel (1). Dr. Amsel provided detailed results in a personal communication.
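The simple-learning consequences of the theory (equations 1 through 9) lend themselves to a small numerical check. The sketch below is illustrative only: the function names are hypothetical, equation 7 is taken in the form p(n) = 1 − ½(1 − θ)^(n−1) / [θ + (1 − θ)^n], which follows from substituting equations 5 and 6 into equation 4, and equation 8 is taken as E ≈ ½ + ½ log θ / [(1 − θ) log(1 − θ)].

```python
import math


def p_closed(n, theta):
    """Equation 7: probability of a correct response on trial n."""
    return 1 - 0.5 * (1 - theta) ** (n - 1) / (theta + (1 - theta) ** n)


def p_from_cues(n, r, i):
    """Equation 4, with c and a advanced trial by trial via equations 1 and 2."""
    theta = r / (r + i)                      # equation 3
    c = a = 0.0                              # naive subject: c(k,1) = a(k,1) = 0
    for _ in range(n - 1):
        c += theta * (1 - c)
        a += theta * (1 - a)
    numerator = r * c + 0.5 * r * (1 - c) + 0.5 * i * (1 - a)
    return numerator / (r + i * (1 - a))


def expected_errors(theta):
    """Equation 8 (continuous-time approximation to total errors)."""
    return 0.5 + 0.5 * math.log(theta) / ((1 - theta) * math.log(1 - theta))


def summed_errors(theta, tol=1e-12):
    """Direct sum of 1 - p(n) over trials, for comparison with equation 8."""
    total, n = 0.0, 1
    while True:
        term = 1 - p_closed(n, theta)
        total += term
        if term < tol:
            return total
        n += 1


def combine(theta1, theta2):
    """Equation 9: theta for a problem in which both sets of cues are relevant."""
    return (theta1 + theta2 - 2 * theta1 * theta2) / (1 - theta1 * theta2)


theta3 = combine(0.020, 0.029)               # Eninger's two single-cue estimates
errors3 = expected_errors(theta3)            # predicted errors, combined cues
```

Comparing `summed_errors` with `expected_errors` gives a feel for how much is lost by replacing the discrete sum with an integral.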
520
ance
relativelyaccurate, though performis higher than predicted.
also considered the possibility
Lawrence
that
TABLE
Prediction
Rats
Transfer
of
After
Series
Performance
of
of
Pretraining
Problems*
from
gradual transition
PSYCHOLOGY
MATHEMATICAL
IN
READINGS
lems
through successivelyharder probin
result
would
rapid mastery of
He tested this
the difificultproblem.
another
by
giving
group
proposition
of
three
series
of subjects a
pretest
easy
problems before the final test problem.

ing
problems in order of ease of learn-
The
the problem learned by

first,
were,
1 with
No.
ATG
î
.14,
which
problem
mediate
inter-
an
wise
other-
not
was
Data
Lawrence
from
(7).
used, the difficultpretest problem

with
6z
problem
with
di
test
.04.
02 in Lawrence's
estimate
To
finally the
.07, and
ing procedure, subjects,

equation adopted is
ment
experi-
was
problem ^2 never
separately in simple learning,
.09881ogio(.4J)
The
etc.
[11]
where
used
discriminanda
between
whose
Si
problems
is
are
6 is
data, made
available
with
and
cues
It
zero.
that this assumption, along
found
was
foot-candles,there
zero
relevant
no
are
the stimulus
properly controlled,and
difference
Si, S3,
known.
are
if the
that
know
We
values
it
possible
tentative empirical function

6
relating to the difference between
to write
foot-candles.
in
discriminanda
This
holds only in the

apparatus, train-
equationpresumably
of Lawrence's
case
TABLE
The
Relation
STIMtTLl"
AND
2
Between
"Difference
of
VaLUE
OF
in foot-candles.
that
in apparent
for problems
foot-candles
and
d is the difference between
ences
differ-
the relation of d to
notice
we
where
PROBLEM*
theoretical
criminanda
dis-
It is
this equation has
and
significance
phasized
em-
no
is
merely
equation 11 it is
expedient. From
the 6 value of
possible to determine
the intermediate
pretraining problem
by interpolation. Table 2 gives the
data and results of this interpolation.
Ten trials were
given on each of the
first three problems and
fifty trials
the final test problem.
on
Using the
2 it is possible to
in Table
6 values
predict the test problem performance
have
of subjects who
through
gone
gradual transition pretraining.^ This
predictionis compared with observed
performance
in Table
noted
the
that
case
very
be
It may
tween
be-
correspondence
predictionand
in this
3.
close.
observation
Again,
is
ever,
how-
the prediction is consistentlya

than observed
little lower
ance.
perform-
The
generalpredictionfor transfer through
series of problems which

get successively
be derived by following
difficult can
more
*
**
Data
from
Estimated
Lawrence
(7).
by interpolation from
empirical
16.
t Theoretical
"
see
text
for
explanation.
tion
equa-
note
through and repeating the reasoning in foot4.
Since the resultingequations are
rather
be derived
extremely large and can
easily,they are not given here.
FRANK
worked
Data
New
521
RESTLE
to
we
theory has thus far been tested

Its
of rats.
against the behavior
tested with college
generahty is now
in a simple discrimination
students
learning task.
The
high criterion
assume
can
in
pretraining,
p{n) is
that
the
at
negligibly different from one
of pretraining. Then
end
by equation
7 we
that (1
see
0i)"~îs small,
and equation 10 simplifiesto
"
Subjects and procedure. The subjects in

p{n+j)
.[12]
in the ele23 students
mentary
02+(l-^2)'-H^l-02)
experiment were
versity.
Uniat Stanford
psychology course
This theoretical function of j is compared
The
seated at one
end of a
5 was
=
this
table
"5".
or
On
singlestimulus, which
white
circular
used
squares
In
size.
could
responses
each
trial 5 saw
his
told that
and
either "^"
problem
the
Si
on
two
trials differed in
at
For half the

the
problem
"5"
and
square
^s had
to
was
the
the
Stimuli
A
ten
trials and
the
correct
The
law
well
as
as
then transferred
the
the
same
to
The
he
what
problem
Transfer
"Easy-Hard
5s
The
other
then
transferred
11
^s
of task.
was
thought
was,
of the
and
Certain
sort.
same
in
relevant
are
This
si.
52
responses
and run
to
made
up
called
Group"
trained
were
to
correct
problem
These
criterion.
first
was
on
the
the
EH.
S2
the
and
which
These
one.
in
the
hard
to
be
The
Group" called HE.
were
approximately equated for
known
specialvisual skills.
two
groups
sex, and
age,
proportion
of
estimated
group,
relevant
the
age
aver-
cues,
di,
.254
by equation 8.
the
Using
pretraining performance of
the HE
the average
proportion
group,
of relevant
in problem 52 was
cues
was
estimated
The
at
at
62
.138.
transfer performance of group

first learned the easy and
EH,
which
then
the
hard
by equation
problem,
10.
Since
is
be
harder
identified
transfers from
be
must
cues
when
Therefore,
Prediction
Using the pretrainingperformance

EH
the
problem. For performance

perfect in the easier problem
all relevant
Easier
of the
in
cannot
cues
problem
easy
irrelevant
were
cues
the
the hard
to
fied.
identi-
subject
the easier
"Hard-
Easy Transfer
Results.
be applied
performance
can
Using the line of reasoning which

can
developed
equation 10 we
duce
prosize discrimination.
a
to
an
equation
predict transfer
alternated
domly.
rancalled after each
performance from hard to easier problems
criterion of 15 successive
and
the
other
curred
possible solutions which had ocof questioning
This method
to him.
is a modification of Prentice's method
(8).
5s were
trained first on problem 5i
Twelve
a
on
confirmation
rat
outline
to
to
based
rence's
predicted Law-
This
that the
this type
the smaller
to
error.
is
also
data.
rat
on
asked
to
which
"A"
were
was
solution
formula
experimental group,
period was
rest
negligibleconstant
This
prediction
human
largerone.
problem.
problem was
converse
to
the
to
told that
never
say
formance
per-
that
seen
the correspondence is quite close with
suggests
6 ft.
5s in each
transfer
It is
4.
viewed
were
squares
of about
distance
The
3 in.
was
observed
in Table
differed in
squares
height by \ in., in problem 52 they differed

height of each pair of
by \ in. The mean
squares
with
square
The
background.
alternate
on
black
was
be
predictable
these subjects
of
TABLE
Transfer
of
to
Human
Harder
Training
Problem
Subjects
from
in
522
READINGS
TABLE
Prediction
to
MATHEMATICAL
Transfer
of
Harder
IN
of
Easier
Training
Problem
from
these elements need not be of the

of "pointsof color" or "elementary
nature
tones."
in
Subjects
Human
PSYCHOLOGY
learn
If
subject can
consistent response
to a certain
configurationdespite changes in its

constituents, then
by definition
the
configuration
separate from
its constituents. The intention is to
strated
accept any cue which can be demonis
to
be
cue
possiblebasis
for
differentialresponse.
of conditioningdescribed
process
is formally
in this paper
The
problem
number
small
should expect some
of errors
On the
to be made.
we
similar
to
the
processes
of
tioning
condi-
(5) and Bush and

(2,3). In the present
of Estes
assumption that the hard problem Mosteller

was
completelylearned in pretraining, theory conditioning takes place at
not
only on "reinforced"
the formula for transfer performance each trial,
the easy
on
conditiontrials. In earlier theories ing

is said to occur
forced
only on such rein-
problem is
e2-\-(di-e2)(i-diy-'
=
where
cues
nation
trials. In two-choice discrimi-
[13]
P(n-\-j)
di is the proportionof relevant

problem and 62 is
easy
in the
the proportionof relevant cues in the

harder problem. The
proof of this
theorem is similar to that of equation
the incorrect response

has a
cause
behigh initialprobability
(one-half)
of the physical
of the nature
the way
of recording
Therefore, a theory of
for
learningmust account
situation and
responses.
two-choice
the consistent weakening of such responses

above, and is not given here.
through consistent nonreinEquation 13 yields the prediction forcement.
for transfer performance of the HE
The notion of adaptationused here
subjects. In Table 5 the prediction is formallyanalogous to the operation
12
compared with
performance.
is
Despite the
very
observed
transfer
and
of Bush
Mosteller's Discrimination
Operator
small
"Z""
(3). However,
frequencies whereas
Bush and Mosteller's operator

tion
predictedand observed, the predicis appliedonly on trials in which the
is quite accurate.
In all,seven
reward condition is reversed for a cue,
made by eleven subjects,
errors
were
the present theory indicates that this
whereas a total of eightwere
expected.
takes place each trial. In
process
This is an
of .64 errors
average
per
while
the Discrimination
addition,
subjectobserved,and .73 predicted.
Operator and the

both
are
Discussion
The
definition of
Bush
and
process
of adaptation
exponentialin form.
Mosteller introduce
new
k for this purexponentialconstant

pose
in terms
the
and
the
uses
theory
present
is selected because
"cue"
of possibleresponses
the theoretical results do
not
conditioningconstant 6.
the
The major point differentiating
earlier
from
similar
present theory
the nature
of
depend critically
upon
the stimulating agent.
While
cues
are
thought of as stimulus elements, theories
is the
use
of the strong sim-
FRANK
plifying
assumption
exponential
identifying
of
relevant
may
with
constant
the
portion
pro-
This
the
of
sixth
range
within
was
reasonable
the
tion.
devia-
sampling
sumption
as-
intuitively
appear
and
rate,
the
cues.
523
RESTLE
likely,
un-
REFERENCES
but
further
if
should
it
experiment
predictive
be
be
to
of
power
shown
by
tenable,
the
1.
a.
Amsel,
of
learning
is
theory
enhanced.
be
to
useful
so
no
discriminanda
assumption
an
results
abandoning
unless
2.
Bush,
R.
R.,
1952,
model
Psychol.
3.
Bush,
R.
for
J.
1951,
stimulus
simple
58,
matical
mathe-
learning.
313-323.
F.
Mosteller,
"
comp.
341-346.
F.
for
Rev.,
R.,
function
45,
Mosteller,
"
mental
experi-
it.
require
Psychol,
visual
as
durations.
There
for
reason
learning
discrimination
physiol.
seems
of
Rate
brightness
discrimination
generalization
model
crimination.
dis-
and
Summary
two-choice
4.
theory
earlier
of
Mosteller
been
but
to
and
summation
U.
learning
physiol.
5.
Bush
Estes,
of
differs
M.
selective
similar
(5)
Estes
(3)
Eninger,
presented.
formally
is
theories
and
Habit
1951,
58,
tion
discrimina-
has
learning
The
Rev.,
413-423.
of
theory
Psychol.
Psychol.,
W.
K.
1952,
Toward
learning.
in
problem.
/.
45,
511-516.
statistical
Psychol.
comp.
Rev.,
theory
1950,
57,
what
some-
94-107.
in
basic
new
From
laws
uses
6.
D.
of
three
theory
derived
empirical
with
dealing
one
in
exp.
the
7.
combination
Lawrence,
assumption.
this
are
and
concepts
simplifying
of
relevant
and
cues,
H.
Lawrence,
Selective
stimulus
constant
1950,
Psychol,
D.
H.
of
with
quantitative
of
type
The
laws
transfer
four
groups
of
permitted
rats
these
six
human
predictions
and
subjects.
were
of
crimination
dis-
continuum.
havior
be-
Psychol,
8.
Prentice,
W.
C.
learning.
1952,
Continuity
H.
/.
two
Five
quite
physiol.
J.
45,
511-
516.
the
39,
of
groups
transfer
of
predictions
/.
175-188.
two
comp.
These
training.
of
special
association
situation.
40,
along
dealing
ness
distinctive-
Acquired
II.
cues:
exp.
187-194.
of
accu-
(Received
January
14,
1954)
Psychol,
in
man
hu-
1949,
Part
L.
'
LEARNING
DISCRIMINATION
BY
IN
RESPONSES
OBSERVING
OF
ROLE
THE
I
WYCKOFF,
BENJAMIN
JR.
University of Wisconsin
¹ This paper is submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, in the Department of Psychology, Indiana University. The writer wishes to express his appreciation to Dr. C. J. Burke for his invaluable guidance and stimulation. This article appeared in Psychol. Rev., 1952, 59, 431-442. Reprinted with permission.

Theorists in the area of discrimination learning have often had occasion to refer to a set or predisposition of S to learn differential responses to a pair of stimuli. Such predisposition has often been attributed to some reaction of S such as an attending response, orienting response, perceiving response, sensory organizational activity, etc. To implement the discussion of the role of such reactions in discrimination learning we shall adopt the term "observing response" ($R_o$) to refer to any response which results in exposure to the pair of discriminative stimuli involved. The probability of occurrence of an observing response will be denoted $p_o$. These responses are to be distinguished from the responses upon which reinforcement is based; that is, running, turning right or left, lever pressing, etc., which, for convenience, we shall term "effective responses."

Spence (19) has proposed a theory of discrimination which is specifically intended to deal with situations where no observing response is required of S, that is to say, to situations in which S is certain to be exposed to the discriminative stimuli on each trial or prior to each effective response ($p_o = 1$). The fact that in some experiments this condition has not been satisfied has become an issue in the literature, largely because it became necessary to clearly delimit the situations to which a particular theory is intended to apply.

Spence's theory of discrimination states that stimulus-response connections are strengthened or weakened during discrimination training in essentially the same way as these changes would occur during conditioning or extinction. When a response is reinforced the connections between it and all aspects of the stimulus situation impinging on S at the time the response occurred will be strengthened. These connections are weakened when the response is not reinforced. Certain implications of this theory were questioned by Krechevsky (11) and other theorists, and became the subject matter of the "continuity-discontinuity" controversy. This material has been reviewed a number of times (2, 5) and need not be repeated in detail here.

One aspect of the controversy is pertinent to the present discussion. Krechevsky (12) presented experimental findings which indicated that rats learned nothing with respect to two stimulus patterns during the first 20 trials of a discrimination experiment even though they were systematically reinforced for approaching a particular pattern during this interval. Failure to learn was established by showing a lack of interference when Ss were tested on a reversed discrimination. These findings were in apparent disagreement with the data obtained by McCulloch and Pratt (13) in a similar experiment in which differing weights were used as discriminative stimuli. Here some cumulative interference was obtained, indicating that learning had occurred in the early portion of the experiment.

In interpreting these results, Spence (20, p. 277) argued that the stimuli (patterns) used by Krechevsky were not sufficiently
conspicuous to provide a legitimate test of his theory. He suggested that Ss had not learned to orient toward the stimuli within the first 20 trials. He points out that in such cases, "... the animal must learn to orient and fixate its head and eyes so as to receive the critical stimuli." He then suggests a way in which this learning may occur. "These reactions are learned because they are followed within a short temporal interval by the final goal response."

This interpretation was put to an experimental test by Ehrenfreund (5). In his experiment the likelihood of S's receiving the critical stimuli was manipulated by changing the position of the stimuli (upright and inverted triangles) with respect to the landing platform of a jumping stand. The design of the experiment was essentially the same as Krechevsky's. The results conform to Spence's interpretation. When the stimuli were placed relatively high, no learning occurred within the first 40 trials, whereas when the stimuli were placed closer to the landing platform learning did occur. Learning was again measured in terms of interference in the learning of a subsequent reversed discrimination.

An analysis of discrimination situations in which some observing response is required is of interest for several reasons. First, discrimination learning in situations other than laboratory experiments, such as human learning in the course of every day events, is largely of this kind. Secondly, even in the most closely controlled laboratory experiments it is seldom, if ever, possible to say with certainty that S is exposed to the discriminative stimuli prior to each effective response. In the case of pattern discriminations it has been demonstrated by Ehrenfreund (5) that relatively small differences in the position of the discriminative stimuli will affect discrimination learning, indicating that relatively precise fixation of the stimulus is required.

In the present paper an attempt will be made to develop a more extensive theory of discrimination which will include situations in which some observing response (hereafter referred to as $R_o$) is required before S is exposed to the discriminative stimuli. An example of such a situation would be an experiment in which stimulus cards were placed overhead. (In this case the response of raising the head would be the $R_o$.)

If we accept the notion that changes in $p_o$ can be accounted for within the framework of reinforcement learning theory, it should be possible to devise a theory of discrimination which will include those cases where some $R_o$ is necessary. The purpose of this paper is to outline such a theory. We shall see that by analyzing discrimination learning in this way it will be possible to account for stimulus generalization and also changes in generalization during discrimination learning without postulating any direct interaction between stimuli. Several hypotheses will be derived from this theory which have been tested in an experiment by the author presented in detail elsewhere (22). Finally we shall outline a way in which the present theory can be integrated with existing quantitative theories of conditioning and extinction to form a quantitative theory of discrimination.

To simplify this discussion let us consider a hypothetical experiment using a situation similar to that used by Wilcoxon, Hays, and Hull (21), and later used by Hull (10) for a discrimination experiment. In this experiment a rat was placed in a small compartment with a single exit through a door into a goal compartment. A measure of the latency of the response of running through this door was obtained. The discriminative stimuli consisted of a black or a white door, either one of which was
present on each trial. During training the running response was reinforced with food when one color was present, whereas reinforcement was withheld when the other color was present. Each stimulus was present on an average of 50 per cent of the trials.

For purposes of the present discussion let us consider a slightly different discrimination situation in which the discriminative stimuli are placed overhead rather than directly in front of S. In this case an observing response, raising the head, will be necessary if S is to be exposed to the discriminative stimuli. On each trial, when S is placed in the apparatus, there will be a certain probability that the $R_o$ of looking up will occur. If $R_o$ does occur S will be exposed to either the black or the white card. When $R_o$ fails to occur S will not be exposed to either card, but rather to a neutral population of stimuli (walls, floor, etc.). Note that in this situation S does not improve its chances of ultimate reinforcement by making the $R_o$. The food is placed in the goal compartment whenever the white card is present whether S actually looks up or not. In a sense then, S gains only information by making the $R_o$.

We are now in a position to examine the relation between observing responses and stimulus generalization. In general it is apparent that if $p_o$ has a low value, S will seldom be exposed to the discriminative stimuli (the black and white cards). S therefore, will have minimum opportunity to learn discrimination and to manifest any discrimination already learned. On the other hand, if $p_o$ has a high value, the opportunity to learn or manifest discrimination will be large. Stimulus generalization is usually defined either in terms of S's tendency to respond similarly to two stimuli, or in terms of failure to learn differential responses readily. Thus we can see that stimulus generalization will decrease as $p_o$ increases.

If we assume that $p_o$ changes as a result of learning processes we can see that these changes would give rise to changes in generalization between the stimuli involved. More specifically, if we assume that $p_o$ will increase during discrimination learning (differential reinforcement), generalization between the discriminative stimuli will decrease. Similarly, we might assume that $p_o$ will decrease if we introduce a procedure in which the subject is reinforced equally often in the presence of either stimulus (non-differential reinforcement). This decrease in $p_o$ would give rise to an increase in generalization between the stimuli.

In the case of the hypothetical experiment suggested above, generalization will be shown in a "crossover" effect between positive and negative trials. Reinforcements on positive trials (positive stimulus card present but not necessarily observed) will tend to strengthen the effective response on negative trials, while unreinforced responses on negative trials will tend to weaken the effective response on positive trials. If S's tendency to look up increases during differential reinforcement, this "crossover" effect will decrease. If during non-differential reinforcement the tendency to look up decreases, the "crossover" effect will increase.

It should be emphasized that these statements regarding increases and decreases in $p_o$ are, at this point, assumptions which may or may not be true in any particular experimental situation. We shall present experimental findings which suggest that these assumptions are quite generally true below.
... is exposed to the neutral population, since, on positive trials, the running response is reinforced even though S does not look up. The effective response is reinforced most consistently when S is exposed to the positive stimulus. Therefore, it is still plausible to postulate that exposure to the positive and negative stimuli will have a net reinforcing effect on $R_o$.

It is true of both of these mechanisms that, before any increase in $p_o$ can be expected to occur, S must learn differential effective responses, that is to say, S must learn to respond differently to the two discriminative stimuli. In the case of the "jumping stand" experiment, if S does not have differential jumping tendencies toward the discriminative stimuli, the probability of reinforcement will always be 50 per cent, and will not be improved by the occurrence of $R_o$. When we apply the secondary reinforcement principle we can see that the positive stimulus must appear in the proper temporal relation to reinforcement a number of times before this stimulus will acquire secondary reinforcing properties. In terms of Schoenfeld, Notterman, and Dinsmoor's interpretation it will be necessary for S to learn differential effective responses to the discriminative stimuli before secondary reinforcing properties are acquired by these stimuli.

In view of these considerations we introduce the following general hypothesis: Exposure to discriminative stimuli will have a reinforcing effect on the observing response to the extent that S has learned to respond differently to the two discriminative stimuli.

Hereafter we shall refer to the magnitude of the difference between Ss' tendencies to respond to the two discriminative stimuli as the "degree of discrimination."

Earlier it was pointed out that the probability of occurrence of $R_o$ is one of the factors determining the rate of formation of discrimination. According to the present hypothesis the opposite relationship is also true. The resulting picture is one of a circular interrelationship, in which $R_o$ affects the formation of discrimination because of its effect on exposure to discriminative stimuli, while the degree of discrimination affects $R_o$ through another mechanism involving either secondary reinforcement or changes in the probability of reinforcement.

We now present four propositions implied by this general hypothesis. The hypothesis was formulated partly on the basis of experimental evidence already available, which suggested that these propositions were true (22). At present we shall consider them as specific hypotheses. The first two of these have already been introduced as assumptions.

1. $p_o$ will increase (or remain high) under conditions of differential reinforcement.

2. $p_o$ will decrease (or remain low) under conditions of non-differential reinforcement.

It is apparent that these hypotheses are consistent with the general hypothesis since the degree of discrimination will tend to increase (or remain high) under differential reinforcement, while it will tend to decrease (or remain low) under non-differential reinforcement. In other words, S will learn to respond differently to the two stimuli under differential reinforcement, but will learn to respond to them in the same way under non-differential reinforcement.

Additional hypotheses of interest can be derived from this general hypothesis.
3. When a well established discrimination is reversed $p_o$ will decrease temporarily and then return to a high value. We shall expect this change in $p_o$ because, following a reversal, the degree of discrimination will decrease as the original discrimination vanishes. It will then increase as the new discrimination is formed.

4. If at some point in an experiment the degree of discrimination is low and at the same time $p_o$ is low (but greater than zero), we shall expect the formation of discrimination to be retarded for some interval, but finally to occur quite rapidly. This hypothesis arises from the fact that increases in the degree of discrimination, and increases in $p_o$, are dependent upon each other. Early in the process S will be exposed to the discriminative stimuli only a small proportion of the time and hence the degree of discrimination cannot increase rapidly. At the same time $p_o$ will not increase because of the low degree of discrimination. Then, as the degree of discrimination becomes sufficiently great to bring about an increase in $p_o$ the entire learning process will be accelerated.

Krechevsky (11) presents data obtained in experiments in a jumping stand situation which correspond in some respects to the predictions of the present formulation. Curves for individual Ss show relatively abrupt discrimination formation. In general the curves also show slight improvement in discrimination prior to the abrupt change. A curve presented for discrimination reversal shows a rapid decrease in discrimination to the chance level, followed by an interval during which improvement was much less rapid. Finally the process accelerated as the reversed discrimination formed. Krechevsky also noted that during the interval while S was responding approximately according to chance with respect to the discriminative stimuli, he showed a strong position preference. These findings are in complete agreement with hypotheses 3 and 4 in the present formulation.

The four hypotheses presented so far were tested in an experiment by the writer (22) which is presented in detail elsewhere. In this experiment direct measures of an $R_o$ were obtained during differential reinforcement, non-differential reinforcement and discrimination reversal. Pigeons were used in the Skinner-box situation in which the effective response was striking a single translucent key. The discriminative stimuli were colored lights (red and green) projected on the back of the key one at a time. The colored lights were withheld and the key was lighted white until the $R_o$ occurred. The $R_o$ consisted of stepping on a pedal on the floor of the compartment. The reasons for using this response as an observing response are discussed in detail elsewhere (22). Here it will suffice to say that this response falls within our definition of an observing response in that it resulted in exposure to the discriminative stimuli. As in the case of the hypothetical experiment discussed above, the observing response had no effect on the probability of reinforcement at any given moment.

The above hypotheses were supported by the results of this experiment. Concerning the first three hypotheses, $p_o$ was higher under differential than under non-differential reinforcement. When Ss were shifted from differential to non-differential reinforcement a marked decrease in $p_o$ occurred. All of these differences were significant at a 5 per cent level of confidence or better. The fourth hypothesis does not
apply unless at some point in the experiment the degree of discrimination and $p_o$ are both low. This condition was not satisfied consistently since the operant (or base) level of the pedal response turned out to be relatively high for several Ss. However, in some cases this condition was satisfied and in these cases the results conformed to the hypothesis.

We can now illustrate some ways in which this theory might be useful in interpreting behavior in other experiments.

1. If this theory is applied to situations in which more than one pair of discriminative stimuli is involved we can make some predictions regarding changes in the readiness of S to form discriminations with respect to a particular pair of stimuli.

2. It has been demonstrated that when a discrimination is reversed repeatedly Ss tend to learn the reversed discrimination more and more rapidly (15, 8). According to the present theory, during discrimination reversal the observing response is partially extinguished and reconditioned. Thus, during repeated reversals, the $R_o$ is, in effect, reinforced intermittently. Studies of intermittent reinforcement have indicated that when a response is intermittently extinguished and reconditioned, the strength of the response tends to attain a relatively constant high value (18). On the first reversal $p_o$ might drop to a low value, and recover slowly, but with repeated reversals we would expect this drop to become less prominent, and finally, $p_o$ would remain high throughout the reversal. It is apparent that if $p_o$ remained high, a reversed discrimination would be learned more rapidly than otherwise.

In the preceding discussion we have examined some of the ways in which discrimination learning may be affected when some observing response is required of S. We shall now derive some quantitative statements to supplement the above analysis. We shall attempt to set down the relationships involved in such a way that the present theory can be readily integrated into existing quantitative theories of learning such as Hull's (9), Estes' (6) or Bush and Mosteller's (3). The potential applications of this development could proceed along two different lines.

First, we could attempt to state the relationships between observing responses and effective responses in such a way that $p_o$ could be estimated based on measurable aspects of the effective responses in situations where direct measurement of $R_o$ is not feasible. This might be the case, for example, if the $R_o$ involved focusing of the eye. If we apply the present development in this way, $p_o$ would become an intervening variable, which could be used to account for and predict behavior in situations where (1) the apparent generalization between stimuli changes, or (2) where ease of formation of discrimination changes as a function of training. Berlyne (1) suggests that "attention" could be treated in a similar way.

Secondly, we could predict discrimination learning functions by adopting some set of assumptions regarding the component learning processes involved. These assumptions could be adopted from some existing theory which treats the simpler processes of conditioning and extinction. The main obstacle to this endeavor at the moment is the absence of any quantitative function for predicting changes in $p_o$. However, we shall be able to set down the relationships involved in such a way that any acceptable function can immediately be inserted.
Quantitative Analysis

For purposes of this analysis let us return to consideration of the hypothetical experiment discussed above. There it was pointed out that we must take into consideration three different stimulus populations which may affect S's behavior. We shall adopt the following notation to represent these stimulus populations. Let $S_1$ represent the population to which S is exposed on trials when the positive stimulus card (white) is present, and the $R_o$ occurs; $S_2$ the population to which S is exposed on trials when the negative stimulus card (black) is present, and the $R_o$ occurs; and $S_3$ the stimulus population to which S is exposed when the $R_o$ fails to occur.

In this analysis we shall use the symbol $p$ to represent the probability of occurrence of the effective response at any given moment during a trial. This variable can be related to the variable of response latency as follows. Estes (6) has shown that if a response occurs with a given probability at any moment during a trial, the mean latency of the response can be expected to be proportional to the reciprocal of the probability; that is to say, $L = k/p$, where $L$ is the mean latency, $p$ the probability, and $k$ a constant of proportionality which will depend on the units of measurement used.

In the present case we must take account of the value of $p$ for each of three stimulus populations. Let us adopt the symbols $p_1$, $p_2$, and $p_3$ to refer to the value of $p$ when S is exposed to $S_1$, $S_2$, and $S_3$, respectively. We also wish to consider the net probability of occurrence of the effective response on a given trial, taking into account that S may be exposed to different stimuli during the trial depending on the occurrence or non-occurrence of $R_o$. We shall use the symbols $p_+$ and $p_-$ to represent the net probability on trials when the positive or negative stimuli are present.

To summarize:

$S_1$ = the population of stimuli to which S is exposed if (1) the positive stimulus is present and (2) the $R_o$ occurs.
$S_2$ = the population of stimuli to which S is exposed if (1) the negative stimulus is present and (2) the $R_o$ occurs.
$S_3$ = the population of stimuli to which S is exposed if the observing response fails to occur.
$p$ = the probability of occurrence of the effective response at any given moment during a trial ($= k/L$).
$p_1$ = the value of $p$ when S is exposed to $S_1$.
$p_2$ = the value of $p$ when S is exposed to $S_2$.
$p_3$ = the value of $p$ when S is exposed to $S_3$.
$p_o$ = the probability of occurrence of the $R_o$ at any given moment during a trial.
$p_+$ = the net value of $p$ for a trial on which the positive stimulus is present.
$p_-$ = the net value of $p$ for a trial on which the negative stimulus is present.

We shall now express certain functional relationships among these variables. First we shall express $p_+$ and $p_-$ as functions of the variables $p_1$, $p_2$, $p_3$, and $p_o$. $p_+$ and $p_-$ are two variables which can be evaluated from experimental measures, such as latency of the effective response, without reference to direct measures of the $R_o$. They correspond to the measures of response tendency usually obtained in discrimination experiments.
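The reciprocal relation between response probability and mean latency, $L = k/p$, can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: a trial is divided into discrete moments of equal length, the response occurs independently with probability `p` at each moment, and `k` is one moment. The function names are hypothetical and do not come from the source.

```python
import random


def mean_latency(p, k=1.0):
    """Theoretical mean latency L = k / p for per-moment probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return k / p


def simulated_mean_latency(p, trials=20000, seed=0):
    """Monte Carlo estimate: count moments until the response first occurs.

    Assumption (not from the source): moments are discrete and independent,
    so the latency on each trial follows a geometric distribution.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    rng = random.Random(seed)
    total_moments = 0
    for _ in range(trials):
        moments = 1
        while rng.random() >= p:  # response did not occur at this moment
            moments += 1
        total_moments += moments
    return total_moments / trials


if __name__ == "__main__":
    for p in (0.1, 0.25, 0.5):
        print(p, mean_latency(p), round(simulated_mean_latency(p), 2))
```

Under these assumptions the simulated mean latency stays close to `1 / p`, which is the sense in which a latency measure can stand in for the momentary probability of the effective response.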
However, in the present framework $p_+$ and $p_-$ are assumed to be the net result of the operation of the variables $p_1$, $p_2$, $p_3$, and $p_o$. Our task will be to express this dependence as a pair of functional relationships. This can be done as follows.

Consider a selected moment during a positive trial. At this moment S will be exposed to either $S_1$, with a probability of $p_o$, or to $S_3$, with a probability of $(1 - p_o)$. If S is exposed to $S_1$ he will make the effective response with a probability of $p_1$. If the effective response and $R_o$ are independent of each other the probability that both $R_o$ and the effective response will occur will be the product $p_o p_1$. If S is exposed to $S_3$ he will make the effective response with a probability of $p_3$, and the probability that both will occur will be the product $(1 - p_o)p_3$. The total probability that the effective response will occur at this moment will be the sum of these products. Thus

$$p_+ = p_o p_1 + (1 - p_o)\,p_3. \tag{1}$$

By exactly parallel reasoning with respect to a selected moment during a negative trial we obtain

$$p_- = p_o p_2 + (1 - p_o)\,p_3. \tag{2}$$

The next step will be to derive expressions for predicting the values of $p_1$, $p_2$, and $p_3$ on the basis of learning functions for the simpler processes of conditioning and extinction. It will be possible to predict changes in the values of $p_1$, $p_2$, and $p_3$ if the reinforcement contingencies for the effective response in the presence of $S_1$, $S_2$, and $S_3$ can be readily ascertained, and if we assume that learning with respect to each of these stimuli proceeds independently of learning with respect to the others. This assumption implies that interaction between stimuli will have a negligible effect. In making this assumption we do not forfeit the ability to handle stimulus generalization within the present framework, since, as we have already pointed out, stimulus generalization can be accounted for without postulating any such direct interaction.

In the present paper we do not adopt a particular set of functions for conditioning and extinction, but attempt to set down the relationships in such a way that any acceptable set of functions can be immediately inserted.

The assumption of "negligible direct interaction" implies that changes in the probability of occurrence of the effective response with respect to a particular stimulus population $S_i$ will occur only during the time in which S is exposed to $S_i$, and that the rate of change with respect to time will depend on:

1. Whether or not the effective response is reinforced.
2. The value of $p_i$ at the time.

If we let $r_i$ represent the proportion of the time during which S is exposed to $S_i$, the rate of change of $p_i$ ($i = 1, 2,$ or $3$) will be approximated by the two expressions

$$\frac{dp_i}{dt} = r_i f_c(p_i) \quad \text{if the effective response is reinforced,} \tag{3}$$

$$\frac{dp_i}{dt} = r_i f_e(p_i) \quad \text{if the effective response is not reinforced.} \tag{4}$$

The functions $f_c$ and $f_e$ represent any acceptable set of analytic functions which approximate the rate of change of probability of occurrence of an effective response during conditioning and extinction, respectively. It will be noted that if we assign the value of 1 to $r_i$ we will obtain expressions for simple cases of conditioning or extinction.
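Equations 1 and 2, $p_+ = p_o p_1 + (1 - p_o)p_3$ and $p_- = p_o p_2 + (1 - p_o)p_3$, can be evaluated directly. The following minimal sketch uses a hypothetical function name and illustrative numbers that are not taken from the source; it also shows that the difference $p_+ - p_-$ equals $p_o(p_1 - p_2)$, so the observable degree of discrimination shrinks as $p_o$ falls.

```python
def net_probabilities(p_o, p1, p2, p3):
    """Return (p_plus, p_minus) as given by equations (1) and (2).

    p_o : probability that the observing response occurs
    p1  : probability of the effective response under S1 (positive, observed)
    p2  : probability of the effective response under S2 (negative, observed)
    p3  : probability of the effective response under S3 (neutral population)
    """
    for name, value in (("p_o", p_o), ("p1", p1), ("p2", p2), ("p3", p3)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    p_plus = p_o * p1 + (1.0 - p_o) * p3
    p_minus = p_o * p2 + (1.0 - p_o) * p3
    return p_plus, p_minus


if __name__ == "__main__":
    # Illustrative values only: S responds very differently to S1 and S2,
    # but looks up on just one moment in five.
    p_plus, p_minus = net_probabilities(p_o=0.2, p1=0.9, p2=0.1, p3=0.5)
    print(p_plus, p_minus, p_plus - p_minus)
```

With these illustrative values the net probabilities are about 0.58 and 0.42, a much smaller difference than the 0.9 versus 0.1 that would be observed if S always made the observing response.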
In the present model the values of $r_i$ can be expressed as functions of $p_o$. The positive and negative stimuli are each present 50 per cent of the time. During this time the subject will be exposed to $S_1$ or $S_2$ with a probability of $p_o$. Hence $r_1 = r_2 = .5p_o$. S will be exposed to $S_3$ with a probability of $(1 - p_o)$. Hence: $r_3 = (1 - p_o)$.

We also know that all effective responses in the presence of $S_1$ are reinforced, effective responses in the presence of $S_2$ are not reinforced, and effective responses in the presence of $S_3$ are reinforced on an average of one-half of the time. Using the above values of $r_i$ and the appropriate functions for reinforced and non-reinforced responses we obtain

$$\frac{dp_1}{dt} = .5\,p_o f_c(p_1), \tag{5}$$

$$\frac{dp_2}{dt} = .5\,p_o f_e(p_2), \tag{6}$$

$$\frac{dp_3}{dt} = .5(1 - p_o) f_c(p_3) + .5(1 - p_o) f_e(p_3). \tag{7}$$

We can now outline the steps which would be necessary to predict $p_+$ and $p_-$ (measurable aspects of effective responses) if we can predict the values of $p_o$ as a function of time. Such a function could be derived empirically or through some theoretical statement regarding the factors which bring about changes in $p_o$. If $p_o$ can be expressed as a function of time we can rewrite equations 5, 6, and 7 to obtain expressions involving only $dp_i$, $dt$, $p_i$ and $t$. If these differential equations can be solved we will obtain $p_1$, $p_2$, and $p_3$ as functions of time, $p_i = f_i(t)$. Thus we can obtain values of $p_1$, $p_2$, $p_3$, and $p_o$ for any point in time. These values can be substituted in equations 1 and 2 to give the desired prediction of $p_+$ and $p_-$.

On the other hand if we wish to estimate the values of $p_o$ from known values of $p_+$ and $p_-$, we can proceed as follows. Equations 1 and 2 state:

$$p_+ = p_o p_1 + (1 - p_o)\,p_3, \tag{1}$$

$$p_- = p_o p_2 + (1 - p_o)\,p_3. \tag{2}$$

Differentiating with respect to time we obtain

$$\frac{dp_+}{dt} = p_o\frac{dp_1}{dt} + p_1\frac{dp_o}{dt} + (1 - p_o)\frac{dp_3}{dt} - p_3\frac{dp_o}{dt}, \tag{8}$$

$$\frac{dp_-}{dt} = p_o\frac{dp_2}{dt} + p_2\frac{dp_o}{dt} + (1 - p_o)\frac{dp_3}{dt} - p_3\frac{dp_o}{dt}. \tag{9}$$

Substituting values for $dp_1/dt$, $dp_2/dt$ and $dp_3/dt$ from equations 5, 6, and 7 and rearranging terms we obtain:

$$\frac{dp_+}{dt} = .5\,p_o^2 f_c(p_1) + .5(1 - p_o)^2\left[f_c(p_3) + f_e(p_3)\right] + (p_1 - p_3)\frac{dp_o}{dt}, \tag{10}$$

$$\frac{dp_-}{dt} = .5\,p_o^2 f_e(p_2) + .5(1 - p_o)^2\left[f_c(p_3) + f_e(p_3)\right] + (p_2 - p_3)\frac{dp_o}{dt}. \tag{11}$$

Equations 1, 2, 10, and 11 represent four simultaneous equations. By combining these equations we can obtain the single expression:

$$\frac{dp_o}{dt} = G\!\left(p_+,\; p_-,\; \frac{dp_+}{dt},\; \frac{dp_-}{dt},\; p_o\right), \tag{12}$$

where the function $G$ will depend on the functions $f_c$ and $f_e$ adopted for the conditioning and extinction functions.

Now, if the curves representing the values of $p_+$ and $p_-$ are determined experimentally, we can express these variables as analytic functions of time. We can also obtain $dp_+/dt$ and $dp_-/dt$ as functions of time.
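The forward route outlined above (choose $p_o$ as a function of time, integrate equations 5, 6, and 7, then substitute into equations 1 and 2) can be sketched numerically. The source deliberately leaves $f_c$ and $f_e$ open, so the linear forms used here, $f_c(p) = a(1 - p)$ and $f_e(p) = -bp$, are assumptions made only for illustration, as are the function name, the rate constants, the starting values, and the simple Euler integration.

```python
def simulate_discrimination(p_o_of_t, p1=0.5, p2=0.5, p3=0.5,
                            a=0.2, b=0.2, dt=0.01, t_end=50.0):
    """Integrate equations (5)-(7) by Euler steps and return (p_plus, p_minus).

    p_o_of_t : callable giving p_o at time t (supplied by the user)
    a, b     : hypothetical conditioning and extinction rate constants
    """
    def f_c(p):
        # Assumed conditioning function: growth toward 1.
        return a * (1.0 - p)

    def f_e(p):
        # Assumed extinction function: decay toward 0.
        return -b * p

    steps = int(round(t_end / dt))
    t = 0.0
    for _ in range(steps):
        p_o = p_o_of_t(t)
        dp1 = 0.5 * p_o * f_c(p1)                      # equation (5)
        dp2 = 0.5 * p_o * f_e(p2)                      # equation (6)
        dp3 = 0.5 * (1.0 - p_o) * (f_c(p3) + f_e(p3))  # equation (7)
        p1 += dp1 * dt
        p2 += dp2 * dt
        p3 += dp3 * dt
        t += dt

    p_o = p_o_of_t(t)
    p_plus = p_o * p1 + (1.0 - p_o) * p3   # equation (1)
    p_minus = p_o * p2 + (1.0 - p_o) * p3  # equation (2)
    return p_plus, p_minus


if __name__ == "__main__":
    for level in (0.0, 0.5, 1.0):
        print(level, simulate_discrimination(lambda t, level=level: level))
```

Under these assumed functions a constant $p_o$ of zero leaves $p_+$ and $p_-$ equal, while larger constant values of $p_o$ give a larger difference between them after the same training time. A time-varying $p_o$ can be supplied in the same way once a function for it is chosen.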
Substituting the functions for $p_+$, $p_-$, $dp_+/dt$ and $dp_-/dt$ in equation 12 we obtain

$$\frac{dp_o}{dt} = G'(t, p_o). \tag{13}$$

If this differential equation can be solved we obtain

$$p_o = f_o(t). \tag{14}$$

This equation will give us the desired value of $p_o$ for any point in time during the experiment.

Summary

In many discrimination learning situations some response, such as an orienting response, will be required of S before he is exposed to the discriminative stimuli. We call these responses "observing responses" ($R_o$), and indicate their probability of occurrence as $p_o$. Increases in $p_o$ will result in increased exposure to the discriminative stimuli, and hence increased opportunity for S to learn or manifest discrimination. Decreased $p_o$ will have the opposite effect. These results are operationally equivalent to decreases or increases in stimulus generalization. The present formulation may be useful for interpreting behavior in cases where changes in stimulus generalization occur, and where the ease of formation of discrimination on a particular set of stimuli changes as a function of training.

A general hypothesis regarding changes in $p_o$ can be derived from the principle of secondary reinforcement.

Hypothesis: Exposure to discriminative stimuli will have a reinforcing effect on the observing response to the extent that S has learned to respond differently to the two discriminative stimuli.

From this general hypothesis we derive the following specific hypotheses:

1. $p_o$ will increase (or remain high) under conditions of differential reinforcement (discrimination training);
2. $p_o$ will decrease (or remain low) under conditions of non-differential reinforcement;
3. When a well established discrimination is reversed, $p_o$ will decrease temporarily and then recover;
4. If the degree of discrimination and $p_o$ are both low, the formation of discrimination will be retarded for some interval but will finally occur quite rapidly.

Evidence in support of these specific hypotheses was obtained in an experiment in which an $R_o$ was measured directly.

Ss learn reversed discriminations more and more rapidly if reversals are presented repeatedly. The present formulation offers a relatively simple and readily testable interpretation of this phenomenon.

This formulation lends itself to precise quantitative statement. A quantitative analysis could be used in two ways: (1) to make quantitative predictions of behavior based on some set of theoretical statements regarding the component learning processes, and (2) to evaluate $p_o$ from observations of measurable aspects of effective responses. The steps required for such an analysis are outlined.

REFERENCES

1. Berlyne, D. E. Attention, perception and behavior theory. Psychol. Rev., 1951, 58, 137-146.
2. Bitterman, M. E., & Coate, W. B. Some new experiments on the nature of discrimination learning in the rat. J. comp. physiol. Psychol., 1950, 43, 198-210.
3. Bush, R. R., & Mosteller, F. A mathematical model for simple learning. Psychol. Rev., 1951, 58, 313-323.
4. Dinsmoor, J. A. A quantitative comparison of the discriminative and reinforc-

