
1 Modeling phoneme classification data: Markov Chains
In a phoneme classification task, words and pseudowords are presented to participants. The task of the participants is to decide which of two phonemes they perceived at a certain location in the word. For example, participants are presented with the string of phonemes /?/, /o:/, /t/, where the first phoneme is a sound in between the range with endpoints /p/ and /b/. To create this range, the voicelessness of the sound is artificially varied. The task of the participant is to decide whether they heard the word 'boot' or 'poot', which in Dutch are both existing words. The stimuli can also be chosen such that one option is an existing word while the other is a non-existing word, or such that both options are non-existing words in a particular language.
Several characteristic plots can be derived from the data obtained in
phoneme classification experiments. First, a sort of psychometric function is
found for the percentage of decisions for one of the phonemes. An illustration
of such a function is presented in Figure 1. The variable on the x-axis is in this case the amount of "voice" present in the signal.
Figure 1: Example of a psychometric function found in phoneme decision experiments. (Axes: voicelessness vs. percentage of /p/ decisions, from 0% to 100%.)

Second, for the response times, a typical distance effect is found: the response times near the response criterion are slower. This effect is, for example, also found in a task where participants are presented with a digit, and where their task is to decide as quickly as possible whether this digit is larger or smaller than a certain reference value. Digits with values close to the reference value typically show slower response times than digits with remote values. An illustration of a distance effect function is shown in Figure 2.
Figure 2: Illustration of a distance effect function. (Axes: voicelessness vs. response time.)

Each point in the distance effect function is a mean of a distribution of response times for that condition. The distribution of response times also has a particular shape, illustrated in Figure 3. Typically the distribution is right-skewed: most of the density is at the left side of the distribution. Several distributions have been proposed to describe response time distributions (for an overview, see Luce (1986), Chapter 1). A model that explains the psychometric function and the distance effect function should also be able to account for the skewed response time distributions.
Two models that have often been used to describe data obtained in decision experiments are investigated here. These models can be described using Markov Chains, which have the property that the probability of being in a certain state at a certain point in time depends only on the state that was occupied one time step earlier. The two models are known as the random walk model and the accumulator model. Their discrete-time versions are investigated here. The properties of the continuous-time versions can easily be studied by letting the number of steps increase while decreasing the time step size.

Figure 3: Illustration of a response time density function. The data are sampled from an ex-Gaussian distribution. (Axes: response time, 0–1200, vs. proportion of observations, 0–0.12.)

1.1 The random walk


A random walk starts at a point in between two boundaries and steps in the
direction of one of the two boundaries at each time step. The step direction is
selected at random, with a certain probability to go in one direction, and one
minus that probability in the other direction. An illustration of a random
walk is shown in Figure 4. The simple random walk has four parameters:
The probability to go one step up (p), the probability of going one step down
(q), the distance to the upper boundary (a), and the distance to the lower
boundary (b). Often the number of parameters is reduced by setting a = b, and p + q = 1.
The random walk can be put into the Markov Chain format, which makes
the use of results from Markov Chain theory possible. Markov Chains can be
described using a matrix and a vector. The matrix contains the probabilities
to go from one state to the other state in a certain time step. The vector
contains the probabilities to start the walk in a certain state. The states for
the random walk are the numbers of steps from the starting point. The initial

Figure 4: Illustration of a random walk. (X-axis: response time.)

probability vector of the random walk contains a one for the zero point of
the walk, and zeros for the remaining states. The definition of the matrix is
a little more complicated. It helps to rearrange the states in this matrix, so
that the states at which the walk ends (the boundaries) are separated from
the other states.

1.1.1 The random walk in more detail


Suppose one has a random walk with seven states, five of which are non-boundary positions, as illustrated in Figure 5.
Figure 5: A random walk with 7 states (m = 1, …, 7). The walk starts at m = 4; m = 7 corresponds to Response A and m = 1 to Response B.

4
The walk in these 7 states can be represented by the Markov chain shown in Figure 6. Because states 1 and 7 are absorbing states, no arrows depart from these states to other states. Only one-step transitions are possible.

Figure 6: A Markov chain with 7 states. Each transient state m has transitions P_{m,m−1}, P_{m,m}, and P_{m,m+1}; the absorbing states 1 and 7 have only the self-transitions P_{11} and P_{77}.

The transition matrix for this chain can be represented by the following block matrix:

P = \begin{pmatrix} P_1 & 0 \\ R & Q \end{pmatrix}

where

P_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

P1 is the transition matrix of the absorbing states: 1 and 7. So the
probability of making the transition from state 1 to state 1, that is, staying
in state 1, equals 1. The probability of entering state 7 from state 1 equals
0.
 
Q = \begin{pmatrix}
p_{22} & p_{23} & 0 & \cdots & 0 \\
p_{32} & p_{33} & p_{34} & \cdots & 0 \\
0 & p_{43} & p_{44} & p_{45} & \cdots \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & p_{m-2,m-3} & p_{m-2,m-2} & p_{m-2,m-1} \\
0 & \cdots & 0 & p_{m-1,m-2} & p_{m-1,m-1}
\end{pmatrix}

So in this particular example Q equals:

Q = \begin{pmatrix}
p_{22} & p_{23} & 0 & 0 & 0 \\
p_{32} & p_{33} & p_{34} & 0 & 0 \\
0 & p_{43} & p_{44} & p_{45} & 0 \\
0 & 0 & p_{54} & p_{55} & p_{56} \\
0 & 0 & 0 & p_{65} & p_{66}
\end{pmatrix}

In the case of the random walk, p_{ij} is defined to be p if j = i + 1, q if j = i − 1, and 1 − p − q if i = j (such that p + q ≤ 1); all other entries are zero. Q therefore equals:

Q = \begin{pmatrix}
1-p-q & p & 0 & 0 & 0 \\
q & 1-p-q & p & 0 & 0 \\
0 & q & 1-p-q & p & 0 \\
0 & 0 & q & 1-p-q & p \\
0 & 0 & 0 & q & 1-p-q
\end{pmatrix}

Matrix R contains the transition probabilities from the transient to the absorbing states:

R = \begin{pmatrix}
p_{21} & 0 \\
0 & 0 \\
\vdots & \vdots \\
0 & 0 \\
0 & p_{m-1,m}
\end{pmatrix}
In this example R equals:

R = \begin{pmatrix}
q & 0 \\
0 & 0 \\
0 & 0 \\
0 & 0 \\
0 & p
\end{pmatrix}

The remaining submatrix (the upper-right block) contains only zeros. This results in the following transition matrix, with the states ordered 1, 7, 2, 3, 4, 5, 6, so that the absorbing states come first:

P = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
q & 0 & 1-p-q & p & 0 & 0 & 0 \\
0 & 0 & q & 1-p-q & p & 0 & 0 \\
0 & 0 & 0 & q & 1-p-q & p & 0 \\
0 & 0 & 0 & 0 & q & 1-p-q & p \\
0 & p & 0 & 0 & 0 & q & 1-p-q
\end{pmatrix}

With the transition matrix defined, the behavior of the random walk can be investigated using the following equation:

Z(n) = Z(0) · P^n

where Z(n) is the vector of probabilities of being in each of the states after n steps, Z(0) is the vector of probabilities of being in each of the states at the start of the walk, and P is the transition matrix.
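As a concrete illustration, the equation above can be evaluated numerically. The sketch below is not part of the original text: it builds the 7-state transition matrix in the natural state order 1, …, 7 (equivalent to the rearranged ordering up to a permutation of rows and columns), uses assumed values p = 0.6 and q = 0.4, and computes Z(n) after n = 50 steps.

```python
import numpy as np

p, q = 0.6, 0.4  # assumed step probabilities; p + q = 1, so the walk never stays put
m = 7            # states 1..7 (indices 0..6); states 1 and 7 are absorbing

# Transition matrix in the natural state order.
P = np.zeros((m, m))
P[0, 0] = 1.0        # lower boundary: absorbing
P[-1, -1] = 1.0      # upper boundary: absorbing
for i in range(1, m - 1):
    P[i, i - 1] = q          # one step down
    P[i, i + 1] = p          # one step up
    P[i, i] = 1.0 - p - q    # stay (zero here, since p + q = 1)

Z0 = np.zeros(m)
Z0[3] = 1.0          # the walk starts in the middle state (m = 4)

n = 50
Zn = Z0 @ np.linalg.matrix_power(P, n)   # Z(n) = Z(0) . P^n
print(Zn)            # after 50 steps almost all mass sits on the two boundaries
```

With these values, nearly the entire probability mass has reached one of the absorbing states after 50 steps, and the entry for state 7 approximates the probability of responding at the upper boundary.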

1.1.2 Predictions by the random walk


A first indication of the predictions of the random walk model can be obtained by filling in plausible values for each of the parameters and computing the predicted functions, which can then be compared to the observed functions.
The following assumptions will be made:

• p + q = 1. That is, at each time point a step will be made.

• a = b. That is, no bias is assumed in the response of participants.

• The number of states equals 7. This means that a = b = 3.

• The transition matrix does not vary with time. This is only plausible if both response alternatives are nonwords. For words it is known that the influence of the lexicon increases over time; probably the frequency of the words will also have an influence.

First it is investigated whether the shape of the psychometric function


can be obtained. This is done by varying the parameter p, and keeping the
other parameters constant. The probability of deciding for the option repre-
sented at the upper boundary can be obtained by computing the stationary
distribution of the chain. An easy way to do this is by taking steps in the walk until the probability of not being at a boundary equals zero. The result of this procedure is shown in Figure 7. As can be seen, a nice-looking psychometric function is obtained.

Figure 7: The psychometric function resulting from the random walk. (Axes: probability of going one step up, 0.1–1, vs. probability of absorption at the upper boundary, 0–1.)
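Besides iterating until the walk is absorbed, the absorption probabilities can be computed in closed form: for an absorbing chain with transient block Q and absorption block R as defined in the previous section, the matrix (I − Q)^{-1} R holds the probability of ending at each boundary from each transient state (a standard result for absorbing Markov chains). A minimal sketch, with an assumed helper name and default values:

```python
import numpy as np

def p_upper(p, n_transient=5):
    """Probability of absorption at the upper boundary for a simple random
    walk with q = 1 - p, starting in the middle of n_transient transient
    states (n_transient = 5 corresponds to a = b = 3 as in the text)."""
    q = 1.0 - p
    m = n_transient
    Q = np.zeros((m, m))         # transitions among transient states
    for i in range(m):
        if i > 0:
            Q[i, i - 1] = q
        if i < m - 1:
            Q[i, i + 1] = p
    R = np.zeros((m, 2))         # transitions to [lower, upper] boundary
    R[0, 0] = q
    R[-1, 1] = p
    B = np.linalg.solve(np.eye(m) - Q, R)   # absorption probabilities
    return B[m // 2, 1]          # start in the middle transient state

for p in (0.3, 0.5, 0.7):
    print(p, p_upper(p))
```

Varying p over a grid with this function traces out the same S-shaped curve as in Figure 7.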
The effect of the transition probability p on the expected response time
can be computed using the equation for the expected value of a random
variable:
E(T) = \sum_i i \cdot f(i)

where f(i) is the probability of the walk ending in exactly i steps. The sum
must be taken over all positive values of i, but since the probability of i being
equal to 100 or higher is almost zero for the current parameter values, values
of i larger than 100 are not taken into account. The plot of the expected
response time as function of the transition probability p is shown in Figure 8.
In order to obtain the distance effect, it must be assumed that the distance
to the criterion is monotonically related to the transition probability p, with

p equal to 0.5 at the criterion.
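The expected number of steps can be computed by accumulating n · f(n), where f(n) is the probability mass absorbed at exactly step n. A sketch under the same assumptions as before (the helper name is illustrative; the sum is truncated at 100 steps as in the text):

```python
import numpy as np

def expected_steps(p, max_steps=100):
    """E(T) for the simple random walk with q = 1 - p and a = b = 3,
    truncating the sum at max_steps (the tail beyond 100 is negligible)."""
    q = 1.0 - p
    m = 7
    P = np.zeros((m, m))
    P[0, 0] = P[-1, -1] = 1.0
    for i in range(1, m - 1):
        P[i, i - 1], P[i, i + 1] = q, p
    Z = np.zeros(m)
    Z[m // 2] = 1.0                  # start in the middle
    absorbed_prev = 0.0
    ET = 0.0
    for n in range(1, max_steps + 1):
        Z = Z @ P
        absorbed = Z[0] + Z[-1]                  # total mass absorbed so far
        ET += n * (absorbed - absorbed_prev)     # n * f(n)
        absorbed_prev = absorbed
    return ET

print(expected_steps(0.5))   # slowest near the criterion (p = 0.5)
```

For p = 0.5 this gives 9 steps, consistent with the classical k(N − k) result for a symmetric walk starting k = 3 steps from each of two boundaries N = 6 apart; values of p away from 0.5 yield shorter expected times, which is the distance effect.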

Figure 8: The distance effect function resulting from the random walk. (Axes: probability of going one step up, 0.1–1, vs. expected response time.)

Similarly, the response time histogram can be obtained by computing the probability of the random walk ending after exactly n steps. The result for p = 0.5 is shown in Figure 9. The histograms for other parameter values look similar.
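The same iteration also yields the full first-passage distribution f(n). A sketch with p = q = 0.5 and a = b = 3; the differences in absorbed mass between consecutive steps give the histogram of Figure 9:

```python
import numpy as np

# f(n): probability that the walk (p = q = 0.5, a = b = 3) ends at exactly step n.
m = 7
P = np.zeros((m, m))
P[0, 0] = P[-1, -1] = 1.0
for i in range(1, m - 1):
    P[i, i - 1] = P[i, i + 1] = 0.5
Z = np.zeros(m)
Z[3] = 1.0                      # start in the middle state

f = []
absorbed_prev = 0.0
for n in range(1, 81):
    Z = Z @ P
    absorbed = Z[0] + Z[-1]
    f.append(absorbed - absorbed_prev)
    absorbed_prev = absorbed

# The fastest possible response takes a = 3 steps (three moves in one
# direction), so f(1) = f(2) = 0 and f(3) = 2 * 0.5**3 = 0.25.
print(f[:6])
```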
So the random walk, with only three parameters, can account for the shapes of the three characteristic functions found in phoneme decision experiments. Another model, which is also a Markov model, will now be investigated for its ability to account for the three functions. This model is called the accumulator model. Instead of one counter that gathers evidence for one or the other alternative, this model has two counters, one for each alternative. These two counters sample evidence for the corresponding alternative until a threshold is reached. The counter that reaches its threshold first will cause the response.

Figure 9: The response time distribution resulting from the random walk, for p = 0.5. (Axes: response time in steps, 0–80, vs. probability, 0–0.3.)

1.2 The accumulator model


The accumulator model has several counters to which evidence for a particular alternative is added. Suppose there are two alternatives: A and B. The discrete-time version of the accumulator model has four free parameters: r_a, the probability that one unit of evidence is added to counter A in one time step; r_b, the one-step probability for counter B; t_a, the amount of evidence needed for response A; and t_b, the amount needed for response B.
If it is assumed that the sampling of evidence for alternative A and B
occurs independently, the process can be restated in terms of a Markov Chain.
The states in this Markov Chain will be of the type: ”Counter A has counted
5 samples, and counter B 3 samples”. The states for the situation in which
counter A has its threshold at 3 and counter B at 2 are shown in Figure 10. The states in which counter A equals 3 or counter B equals 2 are absorbing states: the process remains in that state with probability 1. The absorbing states represent possible end positions of the process. The number of states, if A has its threshold at m and B its threshold at n, equals n · m, and the transition matrix will be of size (n · m) × (n · m).

Figure 10: The Markov chain corresponding to an accumulator model with thresholds equal to 3 (counter A) and 2 (counter B). States are pairs (a, b) with a = 1, 2, 3 and b = 1, 2; the absorbing states carry self-transitions with p = 1.

The transition probabilities not equal to one can be derived from the probabilities r_a and r_b. For example, the probability to go from state S_{a=1,b=1} to S_{a=2,b=2} equals the probability that both counters add one sample of evidence in the next time step, which equals r_a · r_b. The probability of the two counters reaching their thresholds at the same time is small, but not equal to zero. One has to define what the response will be in this situation. In the simulations it is assumed that response A will be given in S_{a=m,b=n}. One way to avoid this situation is by using a chain in continuous time. This option will be discussed after the predictions of the discrete-time chain have been presented.
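The construction just described can be sketched in code. The example below is an illustration, not from the original text: it enumerates the states (a, b) for the thresholds of Figure 10 (counter A at 3, counter B at 2), fills in the four one-step transition probabilities of each transient state, and iterates the chain with assumed rates r_a = 0.6 and r_b = 0.4.

```python
import numpy as np
from itertools import product

def accumulator_chain(ra, rb, ta=3, tb=2):
    """Transition matrix for the two-counter accumulator model.
    States are pairs (a, b) with 1 <= a <= ta and 1 <= b <= tb; any state
    with a == ta or b == tb is absorbing (a response has been given)."""
    states = list(product(range(1, ta + 1), range(1, tb + 1)))
    index = {s: i for i, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for (a, b), i in index.items():
        if a == ta or b == tb:
            P[i, i] = 1.0        # absorbing: the process stays put
            continue
        # the two counters sample independently in each time step
        P[i, index[(a + 1, b + 1)]] = ra * rb      # both count
        P[i, index[(a + 1, b)]] = ra * (1 - rb)    # only A counts
        P[i, index[(a, b + 1)]] = (1 - ra) * rb    # only B counts
        P[i, i] = (1 - ra) * (1 - rb)              # neither counts
    return P, index

P, index = accumulator_chain(0.6, 0.4)
Z = np.zeros(P.shape[0])
Z[index[(1, 1)]] = 1.0           # the process starts in S_{a=1,b=1}
Zn = Z @ np.linalg.matrix_power(P, 200)
print(Zn)                        # essentially all mass on absorbing states
```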

1.2.1 Predictions of the accumulator model


The transition matrix is defined for the accumulator model according to the description in the previous paragraph. The initial vector of state occupation probabilities is such that the probability of starting the process in state S_{a=1,b=1} equals 1 and all other probabilities equal zero. In addition, the rate r_b is linked to r_a, such that r_b = 1 − r_a. That is, the faster evidence

accumulates for alternative A, the slower the accumulation for alternative B.
The thresholds of both counters are set equal to 7.
The psychometric function that is predicted by the accumulator model is shown in Figure 11. This function is obtained by varying r_a from 0.1 to 0.9 in steps of 0.01. The parameter r_b is therefore varied from 0.9 to 0.1. As can
be seen, a similar shape of the psychometric function is obtained as for the
random walk model.
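Under the constraint r_b = 1 − r_a and thresholds of 7, the chain yields the predicted proportion of A responses directly. A sketch (the helper name, the step count of 2000, and the tie-goes-to-A rule taken over from the text are the assumptions):

```python
import numpy as np
from itertools import product

def p_response_a(ra, t=7, n_steps=2000):
    """P(response A) for the accumulator model with rb = 1 - ra and both
    thresholds at t; ties are counted as response A, as in the text."""
    rb = 1.0 - ra
    states = list(product(range(1, t + 1), repeat=2))
    index = {s: i for i, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for (a, b), i in index.items():
        if a == t or b == t:
            P[i, i] = 1.0                        # absorbing
            continue
        P[i, index[(a + 1, b + 1)]] = ra * rb    # both counters count
        P[i, index[(a + 1, b)]] = ra * (1 - rb)  # only A counts
        P[i, index[(a, b + 1)]] = (1 - ra) * rb  # only B counts
        P[i, i] = (1 - ra) * (1 - rb)            # neither counts
    Z = np.zeros(len(states))
    Z[index[(1, 1)]] = 1.0
    Z = Z @ np.linalg.matrix_power(P, n_steps)   # iterate until absorbed
    # response A: counter A reached its threshold (ties included)
    return sum(Z[i] for (a, b), i in index.items() if a == t)

for ra in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(ra, p_response_a(ra))
```

Sweeping r_a over a fine grid with this function reproduces the S-shaped curve of Figure 11; at r_a = 0.5 the predicted proportion lies slightly above 0.5 because ties are assigned to response A.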

Figure 11: The psychometric function predicted by an accumulator model with thresholds equal to 7. (Axes: r_a, 0.1–1, vs. proportion of A responses, 0–0.9.)

The distance effect function is obtained in a similar way as for the random walk. The parameter r_a is varied from 0.1 to 0.9 in steps of 0.1 (larger steps are taken because of computational complexity). The distance effect function for the accumulator model is shown in Figure 12. The shape of the function is similar to the one obtained for the random walk.
In order to obtain a response time distribution, the parameter r_a is set to 0.3 and r_b to 0.7. The resulting distribution is shown in Figure 13. The

Figure 12: The distance effect function predicted by an accumulator model with thresholds equal to 7. (Axes: rate of accumulation for stimulus A, 0.1–1, vs. expected response time, roughly 6.5–10.5 steps.)

distribution function of the accumulator model looks more like a real response time function, with some density to the left of the mode of the distribution.
The accumulator model seems to predict the observed functions a little better than the random walk model. The accumulator model is also more plausible with respect to neurophysiological findings: in the brain, cells have been found that function like the counters in the accumulator model, but no cells have been found that function like the single counter of the random walk.

Figure 13: The response time distribution predicted by an accumulator model with thresholds equal to 7 and r_a equal to 0.3. (Axes: response time, 0–700, vs. probability, 0–0.6.)
