Judgments of Learning and Recall

How Many Dimensions Underlie Judgments of Learning and Recall?
Evidence From State-Trace Methodology

Yoonhee Jang and Thomas O. Nelson
University of Maryland
The authors used state-trace methodology to investigate whether a single dimension (e.g., strength) is
sufficient to account for recall and judgments of learning (JOLs) or whether multiple dimensions (e.g.,
intrinsic and extrinsic factors) are needed. The authors separately manipulated the independent variables
of intrinsic and extrinsic cues, determining their state traces for recall and JOLs. In contrast to the
supposition that intrinsic cues have similar effects on both recall and JOLs whereas extrinsic cues affect
JOLs less strongly than recall (i.e., 2 dimensions underlying recall and JOLs), the authors found repeated
support for the sufficiency of a single dimension for both recall and JOLs (not only immediate JOLs but
also delayed JOLs) across a variety of intrinsic and extrinsic cues.
Keywords: metacognition, judgments of learning (JOLs), state-trace analysis, intrinsic and extrinsic cues,
single-dimensional versus multidimensional theories of JOLs
This research concerns a kind of metacognitive monitoring
known as judgments of learning (JOLs), which are judgments
that occur during or after acquisition and are predictions about
future test performance on recently studied items (Nelson &
Narens, 1994, p. 16). JOLs are one of the most frequently
investigated self-monitoring judgments and have been investigated
across diverse areas of psychology (reviewed in Schwartz, 1994).
In spite of the large amount of research on JOLs, there is little
consensus about what kind of theoretical structure underlies
them. In this research, we take an initial step toward under-
standing this structure by asking whether a single dimension
provides a sufficient account of JOLs or whether multiple
dimensions are needed. The answer to this question will facil-
itate development of a parsimonious and consistent theory of
JOLs and, more generally, of metacognitive monitoring. Al-
though this particular question has not been addressed previ-
ously, several theories have been proposed that bear on this
issue.
Single-Dimensional Theories of JOLs
Direct-access (i.e., trace-access) theory proposes that people
monitor memory content, assessing the magnitude along some
underlying single dimension (e.g., memory strength in Busey,
Tunnicliff, Loftus, & Loftus, 2000; Loftus, Oberg, & Dillon,
2004). According to direct-access theory, a strong correspon-
dence is expected between recall and JOLs because recall and
JOLs are affected by the same underlying factor. The observed
correlation between JOLs and recall performance may be less
than unity because of noise, limited degrees of fineness in the
JOLs, and other stochastic issues that can be accounted for by
probability theory.
Other ideas for a single dimension underlying JOLs include
ease of processing (e.g., Begg, Duft, Lalonde, Melnick, &
Sanvito, 1989) or retrieval fluency (e.g., Benjamin, Bjork, &
Schwartz, 1998), although it has not been claimed specifically
that these mechanisms are solely responsible for JOLs. Instead,
the claim is that one of these mechanisms may underlie both
recall and JOLs, but this will happen only to the extent that
recall relies on this mechanism (e.g., retrieval fluency may
sometimes mislead people into giving inappropriately high
JOLs when it has a greater effect on JOL magnitude than on
recall; Benjamin et al., 1998).
Multidimensional Theories of JOLs: Intrinsic and
Extrinsic Cues From the Cue-Utilization Framework
Koriat's (1997) cue-utilization framework proposes that people
evaluate several different cues that are differentially predictive
of subsequent recall and that JOLs are based on heuristics
that attempt to forecast the likelihood of recall. For
instance, research by Begg et al. (1989; replicated and extended
by Wixted, 1992) found that although JOL magnitude was
greater for high-frequency words than for low-frequency words,
Yoonhee Jang and Thomas O. Nelson, Department of Psychology,
University of Maryland.
This research was partially supported by Cognition and Student
Learning (CASL) Research Program Grant R305H030283 from the
Institute of Education Sciences of the U.S. Department of Education.
We thank Donald Bamber, Morris Goldsmith, David Huber, Thomas
Wallsten, and Michael Dougherty for their many valuable comments
and Geoffrey Loftus for useful information about his recent studies.
Yoonhee Jang is deeply indebted to Thomas O. Nelson, who passed
away January 14, 2005, for his careful guidance throughout her grad-
uate career.
Correspondence concerning this article should be addressed to Yoonhee
Jang, Department of Psychology, University of Maryland, College Park,
MD 20742. E-mail: yjang@psyc.umd.edu
Journal of Experimental Psychology: General, 2005, Vol. 134, No. 3, 308–326
Copyright 2005 by the American Psychological Association 0096-3445/05/$12.00 DOI: 10.1037/0096-3445.134.3.308
subsequent recognition performance was greater for low-
frequency words than for high-frequency words. This crossover
interaction¹ was interpreted as establishing that more than one
factor is required to explain both JOLs and memory perfor-
mance. However, recent research by Benjamin (2003) revealed
that JOLs correctly predict both recognition (better perfor-
mance for low-frequency words) and recall (better performance
for high-frequency words), although that did not occur in the
initial study/test sequence, suggesting that the single-
dimensional account is contingent on metacognitive knowledge
acquisition.
The cue-utilization framework assumes that intrinsic cues in-
volve characteristics of the study items that are perceived to
disclose the items' a priori ease or difficulty of learning (Koriat,
1997, p. 350), whereas extrinsic cues are factors that pertain
either to the conditions of learning or to the encoding operations
applied by the learner (p. 350). For instance, Koriat (1997)
proposed item difficulty and item relatedness as prototypical ex-
amples of intrinsic cues and number of study presentations and
study duration as prototypical examples of extrinsic cues. On the
basis of scale-dependent interactions (which are potentially prob-
lematic, as discussed below), Koriat speculated that intrinsic cues
have approximately the same effects on recall as they have on
JOLs, whereas extrinsic cues have greater effects on recall than on
JOLs, which he interpreted as requiring a multidimensional struc-
ture underlying recall and JOLs.
The Present Approach and Formulation of the Problem
By separately manipulating intrinsic and extrinsic cues, we
tested a single-dimensional account of JOLs and recall versus a
multidimensional account. Because it is often assumed that recog-
nition involves more than one process (e.g., memorability vs.
discrimination), we considered only recall in the present study,
focusing our question on whether a single factor underlies both
JOLs and memorability.
We conceptualize the relation between JOLs and recall in the
following way. If JOLs arise from only a single dimension, then
independent variables can affect only that one underlying dimen-
sion (perhaps with noise and/or bias). However, if JOLs arise from
multiple underlying dimensions, then the independent variables
can differentially affect those underlying dimensions.
For instance, Figure 1 illustrates the single-dimensional model
in which the two independent variables (e.g., item difficulty and
number of study presentations) are assumed to affect a single
dimension of the memory representation (D1), which determines
both JOLs and recall: JOLs = f(D1), and recall = g(D1), where f
and g are positive monotonic functions, with f not necessarily
identical to g. Thus, both JOL magnitude and recall are assumed to
be monotonic functions of the same single dimension of the
memory representation, D1.
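
The single-dimensional account can be made concrete with a small simulation. The sketch below is illustrative only: the additive form of D1, the logistic f, the exponential g, and every constant are assumptions chosen for the example, not quantities taken from this research. It shows that when both dependent variables depend on the conditions only through D1, all (recall, JOL) points fall on one monotonic curve.

```python
import math

# All functional forms and constants here are hypothetical illustrations.

def d1(ease, presentations):
    """Latent dimension D1: rises with item ease and with number of presentations."""
    return ease + 0.8 * presentations

def jol(strength):
    """f: a positive monotonic (logistic) map from D1 to JOL magnitude in percent."""
    return 100.0 / (1.0 + math.exp(-(strength - 2.0)))

def recall(strength):
    """g: a different positive monotonic map from D1 to percentage of correct recall."""
    return 100.0 * (1.0 - math.exp(-0.5 * strength))

def state_trace(ease_levels, presentation_levels):
    """Return (recall, JOL) points for every combination of the two variables."""
    return [
        (recall(d1(e, n)), jol(d1(e, n)))
        for n in presentation_levels
        for e in ease_levels
    ]

points = state_trace(ease_levels=[0.5, 1.3, 2.1], presentation_levels=[1, 2])

# Sort by recall; on a single monotonic curve JOL magnitude never decreases.
ordered = sorted(points)
single_curve = all(b[1] >= a[1] - 1e-9 for a, b in zip(ordered, ordered[1:]))
print(single_curve)
```

In this sketch an easier item presented once (ease 1.3) and a harder item presented twice (ease 0.5) reach the same value of D1, so they yield the same JOL and the same recall.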
In contrast to this single-dimensional account, one possible
multidimensional model is illustrated in Figure 2, which is based
on the hypothesis that intrinsic cues have the same effects on both
recall and JOLs whereas extrinsic cues affect recall more strongly
¹ Crossover interactions are important because they allow conclusions of
an underlying multidimensional structure, in contrast to converging/diverg-
ing interactions, which do not require an assumption beyond a single-
dimensional structure (e.g., Dunn & Kirsner, 1988). Converging/diverging
interactions (i.e., scale-dependent interactions) also have problems of
meaningfulness (Krantz & Tversky, 1971; Loftus, 1978; Townsend &
Ashby, 1984), as elaborated in the General Discussion section.
Figure 1. A single-dimensional model for the relation between two independent variables (i.e., item difficulty
as an intrinsic cue and number of study presentations as an extrinsic cue) and two dependent variables (i.e.,
judgments of learning [JOLs] and recall). The single underlying dimension is referred to as D1, and both f and
g are monotonic functions with f not necessarily identical to g.
STATE TRACES FOR JUDGMENTS OF LEARNING AND RECALL
than JOLs. This multidimensional model (which is only one of
many possible multidimensional models) contains two underlying
dimensions (D1 and D2), where D1 is a monotonic function of
both item difficulty and number of study presentations, and D2 is
a monotonic function of only number of study presentations (i.e.,
a two-dimensional model). JOL magnitude is primarily determined
by D1 as in the single-dimensional model. Although JOL magni-
tude may also be affected by D2, the effect is small or nonexistent,
as indicated by the dotted arrow between D2 and JOLs. By
contrast, recall is equally affected by both D1 and D2: JOLs =
f(D1), and recall = g(D1, D2), where f and g are positive monotonic
functions, as described above. Hence, item difficulty exerts
similar effects both on JOLs and on recall, whereas number of
study presentations affects JOLs less strongly than recall. This
particular two-dimensional model is in accord with the cue-
utilization framework.
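
A companion sketch illustrates this particular two-dimensional account. Again, the functional forms, the additive combination of D1 and D2 inside g, and all constants are assumptions made for illustration; the only structure taken from the text is that JOLs depend on D1 alone whereas recall depends on both D1 and D2, with D2 driven only by number of study presentations.

```python
import math

# All functional forms and constants here are hypothetical illustrations.

def d1(ease, presentations):
    """D1: a monotonic function of both item ease and number of presentations."""
    return ease + 0.8 * presentations

def d2(presentations):
    """D2: a monotonic function of number of presentations only."""
    return 0.6 * presentations

def jol(s1):
    """f: JOL magnitude (percent) depends on D1 alone."""
    return 100.0 / (1.0 + math.exp(-(s1 - 2.0)))

def recall(s1, s2):
    """g: percentage of correct recall depends on both D1 and D2."""
    return 100.0 * (1.0 - math.exp(-0.5 * (s1 + s2)))

# Two conditions chosen so that D1 (and therefore JOL magnitude) is matched.
easy_once = {"ease": 1.3, "presentations": 1}
hard_twice = {"ease": 0.5, "presentations": 2}

def outcome(condition):
    """Return (recall, JOL) for one condition."""
    s1 = d1(condition["ease"], condition["presentations"])
    s2 = d2(condition["presentations"])
    return recall(s1, s2), jol(s1)

recall_once, jol_once = outcome(easy_once)
recall_twice, jol_twice = outcome(hard_twice)
print(round(jol_once, 1), round(jol_twice, 1))        # matched JOLs
print(round(recall_once, 1), round(recall_twice, 1))  # recall differs
```

With JOL magnitude matched, recall is higher after two presentations, so in a plot of JOL magnitude against recall the two-presentation point lies to the right of the one-presentation point.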
State-Trace Analysis
We utilized state-trace analysis (originally proposed by Bam-
ber, 1979) to test the number of dimensions underlying JOLs and
recall. State-trace analysis is related to conjoint measurement
theory (e.g., this relationship is elaborated on by Loftus et al.,
2004) and the logic of additive and multiplicative effects (see
Loftus, 2002). State-trace analysis is achieved by means of a
scatter-plot graph showing the covariation of two dependent vari-
ables and the manner in which the independent variables affect the
dependent variables. A major goal of state-trace analysis is to
competitively evaluate conceptualizations of single-dimensional
and multidimensional theoretical structures by evaluating curves,
referred to as state traces, within such a scatter plot. According to
a review by Dunn and James (2003),
State-trace analysis comprises a conceptual framework within which
models of the relationships between different dependent variables can
be represented. It also incorporates a method for identifying and
testing these relationships. . . . Thus, if two dependent variables are
functions of the same latent variable, the resulting state-trace is a
one-dimensional curve in two-dimensional state space. (pp. 404–405)
The application of state-trace analysis to the present situation is
illustrated in Figure 3, which shows predicted outcomes from the
Figure 1 single-dimensional model (see Figure 3A, 3B, and 3C) as
compared with predicted outcomes from the Figure 2 two-
dimensional model (see Figure 3D, 3E, and 3F). The first two
panels of each row illustrate the separate effects of the independent
variables on the two dependent variables of recall (see Figure 3A
and 3D) and JOL magnitude (see Figure 3B and 3E). In isolation,
these first two panels do not qualitatively differentiate between the
single-dimensional account versus the two-dimensional account of
the dependent variables. Instead, this qualitative model compari-
son is achieved by combining the two plots into state-trace plots,
as shown in the third panel of each row.
As illustrated in Figure 3C, the critical prediction of the single-
dimensional model is that both the one-presentation and two-
presentation curves lie along a single curve. The location of each
curve arises from the causal paths shown in Figure 1: (a) from item
difficulty through D1 to JOLs, and from item difficulty through D1
to recall; and (b) from number of study presentations through D1
to JOLs, and from number of study presentations through D1 to
recall. The critical aspect of the state-trace analysis is highlighted
by the two arrows in Figure 3C, which identify the overlapping
portion of the one-presentation and two-presentation curves, with
the upper arrow identifying easy items that have been presented
once and the lower arrow identifying difficult items that have been
Figure 2. A hypothetical two-dimensional model for the relation between two independent variables (i.e., item
difficulty as an intrinsic cue and number of study presentations as an extrinsic cue) and two dependent variables
(i.e., judgments of learning [JOLs] and recall). The two underlying dimensions are referred to as D1 and D2, and
f and g are monotonic functions with f not necessarily identical to g. The dotted arrow between D2 and the JOLs
indicates that there is little or no effect of D2 on JOLs.
JANG AND NELSON
presented twice. In the single-dimensional model, different condi-
tions giving rise to a particular JOL rating must also give rise to a
particular recall performance level (and vice versa), considering
that any specific value on a dependent variable is obtained only
through dimension D1. Therefore, the critical test of the single-
dimensional model is the extent to which the two curves fall atop
each other.
This can be formally stated as follows. Assume that C(i, j)
represents the joint condition of the ith level of one independent
variable (where i can take on the values of, say, a or b) and the jth
level of another independent variable (where j can take on the
values of, say, q or r). Then, according to the single-dimensional
model in Figure 1, the following must be the case: (a) If C(a, r)
produces the same JOL as does C(b, q), then C(a, r) must produce
the same value of D1 as did C(b, q); and (b) because C(a, r) and
C(b, q) have the same value of D1, they must produce the same
recall. In short, the strong prediction of the single-dimensional
model is that whenever JOL magnitude is the same for C(a, r) as
for C(b, q), then recall must be the same for C(a, r) as for C(b, q).
Hence, in Figure 3C, the two curves must be atop each other
throughout the range in which the two curves have the same values
of recall (or the same magnitudes of JOLs), as indicated by the
portion of the two curves between the two arrows.
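
The prediction just stated can be expressed as a simple descriptive check on condition means. The following sketch is not the analysis used in this research and ignores sampling error; the function name, the tolerance parameter, and the overlapping example values are hypothetical. It tests whether, for every pair of conditions, recall and JOL magnitude are ordered the same way, which is what a strictly monotonic dependence on a single dimension implies.

```python
from itertools import combinations

def sign(value, tol):
    """Return -1, 0, or 1, treating differences within tol as ties."""
    if value > tol:
        return 1
    if value < -tol:
        return -1
    return 0

def consistent_with_single_dimension(points, tol=1e-9):
    """points: iterable of (recall, JOL) condition means.

    Returns True when every pair of conditions is ordered identically on
    recall and on JOL magnitude (equal on one implies equal on the other).
    """
    for (r1, j1), (r2, j2) in combinations(list(points), 2):
        if sign(r2 - r1, tol) != sign(j2 - j1, tol):
            return False
    return True

# Hypothetical overlapping curves (one vs. two presentations).
one_presentation = [(20, 30), (40, 45), (60, 55)]
two_presentations = [(40, 45), (60, 55), (80, 70)]
print(consistent_with_single_dimension(one_presentation + two_presentations))

# The illustrative values given for Figure 3F: equal JOLs (55%) but
# different recall (60% vs. 80%), which violates the single-dimensional model.
print(consistent_with_single_dimension([(60, 55), (80, 55)]))
```

With real data, the tolerance would need to reflect the standard errors of the means on both axes rather than a fixed constant.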
By contrast, the data pattern that definitively specifies a multi-
dimensional interpretation is one in which the two curves are
separated, such as in Figure 3F. The two-presentation curve could
fall either to the left or to the right of the one-presentation curve,
with each pattern indicating the particular connection strengths of
the multidimensional structure. For instance, the prediction from
the cue-utilization framework (from Figure 2) is illustrated in
Figure 3. Predictions of the two models from Figures 1 and 2 (the top three panels show illustrative outcomes
predicted by the single-dimensional model, and the bottom three panels show illustrative outcomes predicted by
the two-dimensional model from the cue-utilization framework). Panels A, B, D, and E show traditional data in
which the two dependent variables (i.e., recall and judgments of learning [JOLs]) are plotted as functions of the
two independent variables (i.e., item difficulty and one vs. two presentations). Panels C and F show state traces
in which JOL magnitude is plotted against the percentage of correct recall. The data shown between the two
arrows in Panel C are the overlapping portion of the one-presentation curve and the two-presentation curve.
Figure 3F, wherein the two-presentation curve falls to the right of
the one-presentation curve. The location of these two curves arises
from the causal paths shown in Figure 2. For this example, JOLs
are determined by only D1, and so both of the presentation
conditions provide the same value of JOLs (viz., 55%). However,
because recall is determined by both D1 and D2, the percentage of
correct recall is higher in the two-presentation condition (e.g.,
80%) than in the one-presentation condition (e.g., 60%). Hence,
the two-presentation curve in Figure 3F falls to the right of the
one-presentation curve (instead of the two curves falling atop each
other as in Figure 3C).
It is important to note that a multidimensional model is capable
of producing a single curve, such as would be the case for equally
weighted inputs and outputs to each of the dimensions. However,
if such a single-curve pattern is observed repeatedly, across dif-
ferent situations, this indicates at a minimum that the multiple
dimensions are entirely redundant and function as a single dimen-
sion. For this reason, a single experiment finding a single curve
cannot definitely rule out the possibility of multiple dimensions
that happen to have conspired to produce a single curve for that
particular situation. It is for this reason that claims of a single
dimension are most effectively made by examining multiple
experiments.
It is also important to note that traditional tests of interactions
between independent variables appear to test something similar to
that tested using state-trace analysis, but there are critical differ-
ences. Specifically, interactions in an analysis of variance
(ANOVA) assume a linear combination of variables, whereas
state-trace analysis allows for nonlinearities in the relationship
between the underlying dimensions and the dependent measures.
Indeed, it is possible to observe a significant interaction even
though the state traces lie along a single curve. This fact serves to
highlight the inherent limitations of the general linear model.
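
A numerical illustration may help here. In the sketch below, two factors combine additively on one latent dimension, and recall is a nonlinear (concave) monotonic function of that dimension. The forms and constants are assumptions for illustration, and no significance test is run; the code only computes the 2 × 2 interaction contrast on noise-free cell means, which is the quantity an ANOVA interaction term evaluates.

```python
import math

# Hypothetical forms: factors combine additively on a single latent dimension.
def latent(ease, presentations):
    return ease + presentations

def recall(d):
    """Concave monotonic map from the latent dimension to percent recall."""
    return 100.0 * (1.0 - math.exp(-0.7 * d))

def jol(d):
    """Linear monotonic map from the latent dimension to JOL magnitude."""
    return 20.0 * d

EASE = (0.5, 1.5)         # difficult, easy (hypothetical units)
PRESENTATIONS = (1, 2)

def interaction_contrast(measure):
    """(easy: two minus one) minus (difficult: two minus one) on cell means."""
    cell = {(e, n): measure(latent(e, n)) for e in EASE for n in PRESENTATIONS}
    return (cell[(1.5, 2)] - cell[(1.5, 1)]) - (cell[(0.5, 2)] - cell[(0.5, 1)])

recall_contrast = interaction_contrast(recall)
jol_contrast = interaction_contrast(jol)
print(round(recall_contrast, 2), round(jol_contrast, 2))

# Every cell still lies on one curve: both measures depend only on latent().
trace = sorted((recall(latent(e, n)), jol(latent(e, n)))
               for e in EASE for n in PRESENTATIONS)
on_single_curve = all(b[1] >= a[1] - 1e-9 for a, b in zip(trace, trace[1:]))
print(on_single_curve)
```

The recall means show a nonzero interaction contrast while the JOL means show none, yet the state trace remains a single monotonic curve, because the interaction arises entirely from the nonlinearity of the recall function.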
Specific Goals of the Present Research
The primary goal of the present research was to assess the
number of dimensions required for JOLs and recall. Specifically,
we wanted to empirically test the predictions of the single-
dimensional model to determine whether it is sufficient to account
for JOLs and recall or, alternatively, to determine whether a
multidimensional model is necessary (e.g., the cue-utilization
framework's version of a two-dimensional model).
A secondary aim of our experiments was to separately examine
this relationship for delayed JOLs as well as for immediate JOLs
(which were the only kind of JOLs investigated by Dunlosky &
Matvey, 2001, and by Koriat, 1997). Delayed JOLs typically
predict subsequent recall better than do immediate JOLs (Nelson
& Dunlosky, 1991). There is not yet a consensus explanation for
this delayed-JOL effect, which can be explained through a variety
of mechanisms (e.g., Kimball & Metcalfe, 2003; Nelson & Dun-
losky, 1992; Spellman & Bjork, 1992; Weaver & Kelemen, 1997).
Although cursory consideration of that issue suggests different
mechanisms for immediate and delayed JOLs, some recent studies
have proposed that the delayed-JOL effect may instead correspond
to different settings within a single underlying mechanism (e.g.,
Nelson, Narens, & Dunlosky, 2004). Nelson et al. (2004) reported
that JOL accuracy differences can be explained by the differential
breakdown in the number of dyads comprising immediate and
delayed JOL accuracy. Most of the dyads for immediate JOLs
consist primarily of items that can both be recalled at the time of
the JOL, whereas most of the dyads for delayed JOLs consist
primarily of one item that can be recalled and one item that cannot
be recalled at the time of the JOL. In other words, for immediate
JOLs, discrimination between the two items is relatively difficult
(i.e., discrimination between recalled items), whereas for delayed
JOLs, discrimination between the two items is relatively easy (i.e.,
discrimination between a recalled item vs. a nonrecalled item). It
is important to note that this explanation appeals only to a natural
increase in variability of memorability as a function of delay and
is consistent with a single-dimensional interpretation of JOLs and
recall. Thus, we wanted to evaluate the state-trace plots for both
immediate and delayed JOLs. This procedure allowed additional
opportunities (not only from immediate JOLs but also from delayed
JOLs) for the single-dimensional model to fail and for the
necessity of a multidimensional model to be confirmed through the
observation of separate state traces.
Overview of the Experiments
All experiments used paired study and cued recall testing, with
JOLs given in response to the cue word. Separate experiments
manipulated the intrinsic cue of item difficulty or item relatedness
and the extrinsic cue of number of study presentations or study
duration. One intrinsic cue and one extrinsic cue were manipulated
in each experiment, and all experiments included both delayed and
immediate JOLs.
We used a 2 × 2 × 2 repeated measures design for each
experiment in which the three independent variables were an
intrinsic cue (viz., easy vs. difficult items for Experiments 1A and
1C; related vs. unrelated items for Experiments 1B, 1D, and 2), an
extrinsic cue (viz., one vs. two presentations for Experiments 1A,
1B, and 2; short vs. long presentation for Experiments 1C and 1D),
and timing of JOLs (viz., immediate vs. delayed JOLs). The two
dependent variables were the percentage of correct recall and JOL
magnitude. For each experiment, we investigated the assumption
that the independent variables of intrinsic and extrinsic cues had
significant effects on both recall and JOL magnitude. Significance
of these effects is a critical prerequisite for state-trace analysis
because, otherwise, an observation of a single trace could result
from a null effect for one of the independent variables. In all, the
experiments yielded 10 state-trace plots that were used to compet-
itively evaluate the predictions from the single-dimensional model
(see Figure 3C) versus a multidimensional model (e.g., the two-
dimensional model from the cue-utilization framework; see Figure
3F).
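
The design described above can be tabulated in a few lines. This sketch simply enumerates the cells and plots implied by the text; the variable names and the generic level labels are assumptions made for the example.

```python
from itertools import product

# Levels of the three within-subject factors (labels are generic placeholders).
TIMING = ("immediate", "delayed")
INTRINSIC = ("level_1", "level_2")   # e.g., easy vs. difficult, related vs. unrelated
EXTRINSIC = ("level_1", "level_2")   # e.g., one vs. two presentations, short vs. long

cells = list(product(TIMING, EXTRINSIC, INTRINSIC))  # 2 x 2 x 2 design

# Intrinsic and extrinsic cue manipulated in each experiment, as stated in the text.
experiments = {
    "1A": ("item difficulty", "number of study presentations"),
    "1B": ("item relatedness", "number of study presentations"),
    "1C": ("item difficulty", "study duration"),
    "1D": ("item relatedness", "study duration"),
    "2": ("item relatedness", "number of study presentations"),
}

# One state-trace plot per experiment and JOL timing.
plots = [(name, timing) for name in experiments for timing in TIMING]
print(len(cells), len(plots))
```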
Experiments 1A–1D
For Experiments 1A and 1C, the manipulated intrinsic cue was
item difficulty, whereas for Experiments 1B and 1D, it was item
relatedness. For Experiments 1A and 1B, the manipulated extrinsic
cue was number of study presentations, whereas for Experiments
1C and 1D, it was study duration. During the study phase, partic-
ipants were instructed to learn each pair and to make a JOL about
the likelihood that they would subsequently be able to recall the
target word when the cue word was presented. During the test
phase, they were instructed to recall the target word in response to
each cue word.
The main hypotheses under investigation were (a) in accord
with the single-dimensional model, the state-trace curves should
fall atop each other, versus (b) if the two curves do not fall atop
each other, then a multidimensional model is needed (e.g., for the
two-dimensional model from the cue-utilization framework, the
two-presentation state-trace curve should fall to the right of the
one-presentation state-trace curve in Experiments 1A and 1B, and
the long-presentation state-trace curve should fall to the right of
the short-presentation state-trace curve in Experiments 1C and
1D).
Method
Participants. Forty-five volunteers from undergraduate psychology
courses at the University of Maryland received course credit in return for
their participation in each of Experiments 1A–1D. Participants in all
experiments of this study were treated in accord with the Ethical Princi-
ples of Psychologists and Code of Conduct (American Psychological
Association, 1992).
Materials. For Experiments 1A and 1C, 64 Swahili–English translation
equivalents were drawn from the norms of Nelson and Dunlosky
(1994) according to the normative likelihood of the English word being
recalled when the Swahili word was presented. Thirty-two pairs were
normatively easy pairs (e.g., yai–egg) with a mean normative probability
of recall of .25 (range = .18–.55), and 32 were normatively difficult pairs
(e.g., nafaka–corn) with a mean normative probability of recall of .05
(range = .02–.08).
For Experiments 1B and 1D, 64 noun–noun pairs were constructed on
the basis of the pairs used by Dunlosky and Matvey (2001). They were
divided into two lists according to the degree of associative relatedness
between the two nouns comprising each pair. Thirty-two pairs consisted of
nouns that were moderately related (e.g., stove–kitchen), and the remaining
32 pairs consisted of nouns that were not obviously related (e.g.,
bottle–calendar).
In all experiments, the first 8 pairs constituted practice, and the last 8
pairs were excluded from recall so as to prevent recency effects. The
remaining 48 pairs comprised two blocks of 24 pairs per block and were
the only pairs that were analyzed.
Procedure. Participants in Experiment 1A studied Swahili–English
word pairs and then indicated their JOL for each pair when the Swahili
word appeared alone as the cue for the English word. During study, each
Swahili–English pair was presented in the center of the screen for 9 s. All
pairs were randomly ordered anew for each participant, with the restriction
that at least four pairs separated the two presentations of a given two-
presentation pair.
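
One way to realize such a constrained random ordering is rejection sampling: shuffle the trial list and accept the first shuffle that satisfies the spacing restriction. The sketch below is an assumed implementation, not the software used in these experiments. The function name, the block composition in the example (12 one-presentation and 12 two-presentation pairs), and the reading of the restriction as "at least four intervening trials" are all assumptions.

```python
import random

def order_block(once, twice, min_gap=4, rng=None, max_tries=100_000):
    """Return a random trial order for one study block.

    once:  pair identifiers presented a single time
    twice: pair identifiers presented two times
    min_gap: minimum number of other trials between the two presentations
    """
    rng = rng or random.Random()
    trials = list(once) + list(twice) * 2
    for _ in range(max_tries):
        rng.shuffle(trials)
        last_seen = {}
        acceptable = True
        for index, pair in enumerate(trials):
            # Count the trials strictly between the two presentations.
            if pair in last_seen and index - last_seen[pair] - 1 < min_gap:
                acceptable = False
                break
            last_seen[pair] = index
        if acceptable:
            return list(trials)
    raise RuntimeError("no acceptable order found; relax min_gap or add trials")

# Hypothetical block: pairs 0-11 shown once, pairs 100-111 shown twice.
example = order_block(once=range(12), twice=range(100, 112),
                      rng=random.Random(0))
print(len(example))
```

Rejection sampling keeps every acceptable order equally likely, at the cost of discarding many shuffles when the restriction is tight.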
A self-paced JOL occurred for each pair and was prompted with only the
Swahili word and the question "How confident are you that in about 10
minutes from now you will be able to recall the second word of the pair
when prompted with the first?" The participants reported their estimate on
a scale ranging from 0 (definitely will not recall) to 100 (definitely will
recall) (e.g., 20 = 20% sure, 40 = 40% sure, 60 = 60% sure, and 80 =
80% sure).
Immediate JOLs versus delayed JOLs were randomly assigned to the
pairs, with the restriction that one half of the pairs in each condition
received immediate JOLs and the other half received delayed JOLs. Each
immediate JOL occurred immediately after the offset of the pair. After the
final immediate JOL or the final study trial of a given block of 24 pairs,
JOLs occurred for the first third of the pairs that were slated to receive
delayed JOLs within that block; then, JOLs occurred for the second third
of the pairs slated to receive delayed JOLs within that block, followed by
the JOLs for the last third of the pairs slated to receive delayed JOLs within
that block. The order of presentation of the pairs within each third of the
pairs was randomly determined anew from study to delayed JOLs.
During the self-paced test phase, the participants were instructed to
attempt to recall the English translation equivalent when cued by a Swahili
word. If they had no guess, then they typed NEXT to proceed to the next
test trial.
The procedure of Experiment 1B² was identical to that of Experiment
1A except that, during the study phase, each pair was presented for 5 s and
the first noun of the pair was the cue word during the JOL and recall
phases.
The procedure of Experiment 1C was identical to that of Experiment 1A
except that, during the study phase, one half of the pairs were presented for
5 s per pair and the other half were presented for 15 s per pair. For
Experiments 1C and 1D, no pair was presented more than once, and the
order of presentation of each pair was randomized anew for each
participant.
The procedure of Experiment 1D was identical to that of Experiment 1B
except that, during the study phase, one half of the pairs were presented for
2 s per pair and the other half were presented for 8 s per pair.
Results and Discussion
For each experiment, we first report the outcome for the pre-
requisite of significant effects of intrinsic and extrinsic cues on
both recall and JOL magnitude. The descriptive statistics are
shown in Figures 4A, 4B, 4D, and 4E–8A, 8B, 8D, and 8E, and the
main effects of intrinsic and extrinsic cues (and the two-way
interaction between them, along with the three-way interaction of
them and immediate vs. delayed JOLs) from the 2 × 2 × 2 (i.e.,
timing of JOLs, extrinsic cue, and intrinsic cue) ANOVAs of all
experiments are referred to in the text; the complete results from
the ANOVAs are reported in Appendix A. Next, the results of the
state-trace analysis are reported, with the means and standard error
of the means shown in Figures 4C and 4F–8C and 8F. Analyses of
item-by-item JOL accuracy are not relevant to the hypotheses
under investigation but, for completeness, are reported in Appendix
C. Throughout, all differences reported as statistically significant
have p < .05, and estimates of effect size (ES) are reported
as partial eta squared for statistically significant effects.
Prerequisite: Effects of intrinsic and extrinsic cues on recall and
JOL magnitude. The mean percentage of correct recall for each
condition for items receiving immediate JOLs is shown in Figures
² Before Experiment 1B was conducted, we ran an experiment that was
identical to Experiment 1B. In the earlier experiment, the pattern of the first
30 participants' data collected at the beginning of the semester was inconsistent
with that of the final 15 participants' data collected toward the end
of the semester. Because we were concerned that the two samples might
not be homogeneous, we reran the experiment, which we report here as
Experiment 1B. The results of Experiment 1B are most similar to those of
the first 30 participants' data from the earlier version of this experiment.
We mention that the data from the first 30 participants were more stable
than those from the final 15 participants (e.g., 23 out of the 24 results
from the 2 × 2 × 2 ANOVAs for each of the three dependent variables
[the percentage of correct recall, JOL magnitude, and gamma] yielded
smaller standard deviations for the first 30 participants than for the final 15
participants). Also, when three extreme outliers were removed from the
final 15 participants, the resulting outcomes were similar to those of the
first 30 participants' data. We have no explanation for the relatively
unusual performance of those three outliers, but we mention this for
completeness.
4A–7A, and the corresponding mean for items receiving delayed
JOLs is shown in Figures 4D–7D. The mean magnitude of JOLs
for each condition for items receiving immediate JOLs is shown in
Figures 4B–7B, and the corresponding mean for items receiving
delayed JOLs is shown in Figures 4E–7E.
As shown in Appendix A and in Figures 4A, 4B, 4D, and
4E–7A, 7B, 7D, and 7E, the pattern of results was quite consistent
across Experiments 1A–1D. Thus, for both recall and JOL magnitude,
the prerequisite was met that both the effect of intrinsic
cues and the effect of extrinsic cues were significant. For both
recall and JOL magnitude, neither the two-way interaction of
intrinsic and extrinsic cues nor the three-way interaction involving
intrinsic and extrinsic cues was statistically significant. The pre-
requisite allows for analyses of state traces, as described next.
State traces of JOLs and recall. The outcome of the state-trace
analysis of recall and JOL magnitude for items receiving immediate
JOLs is shown in Figures 4C–7C, and the corresponding
outcome for items receiving delayed JOLs is shown in Figures
4F–7F. Of primary importance, each of those panels of Figures 4
and 5 shows that the two-presentation curve falls atop the one-
presentation curve, and each of those of Figures 6 and 7 shows that
the long-presentation curve falls atop the short-presentation curve
(also note that the bidirectional standard errors are quite small).
The consistency of this outcome across Experiments 1A1D sug-
gests that the extra flexibility of a multidimensional model, such as
the two-dimensional model from the cue-utilization framework, is
not needed, with the results most parsimoniously explained
through a single underlying dimension for both recall and JOLs.
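The ordinal logic behind curves that "fall atop" one another can be sketched in code. The following Python sketch is only an illustration, not the analysis reported here: the condition means are hypothetical (they are not the values plotted in Figures 4–7), the standard errors are ignored, and the helper name state_trace_is_monotonic is invented for this example. It checks the requirement of a single-dimensional account that no pair of conditions be ordered one way on recall and the opposite way on JOL magnitude.

```python
from itertools import combinations


def state_trace_is_monotonic(points):
    """Return True if no pair of (recall, JOL) points is ordered one way
    on recall and the opposite way on JOL magnitude (ties are allowed)."""
    for (r1, j1), (r2, j2) in combinations(points, 2):
        if (r1 - r2) * (j1 - j2) < 0:
            return False
    return True


# Hypothetical condition means (percentage recall, JOL magnitude).
# These are illustrative numbers, NOT the data reported in the article.
one_presentation = [(20.0, 30.0), (55.0, 52.0)]   # difficult, easy
two_presentations = [(38.0, 41.0), (74.0, 66.0)]  # difficult, easy

# All four conditions can be placed on one increasing curve.
single_curve = state_trace_is_monotonic(one_presentation + two_presentations)

# A hypothetical pattern that one dimension could not produce:
# recall rises with a second presentation while JOL magnitude falls.
violation = state_trace_is_monotonic([(20.0, 30.0), (38.0, 25.0)])
```

In this sketch, single_curve is True and violation is False. A check of this kind is purely ordinal, which is why it does not depend on how recall or JOL magnitude is scaled.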
Figure 4. Results of Experiment 1A. Panels A and B show the mean percentage of correct recall and the mean
magnitude of judgments of learning (JOLs) for items having immediate JOLs, whereas Panels D and E show the
corresponding data for items having delayed JOLs as a joint function of item difficulty and number of study
presentations. Panel C is the state-trace plot of the mean magnitude of immediate JOLs against the mean
percentage of correct recall, whereas Panel F is the state-trace plot of the mean magnitude of delayed JOLs
against the mean percentage of correct recall. Each vertical and horizontal hash mark depicts the standard error
of the mean.
Experiment 2
Experiment 2 was similar to Experiment 1B, in which the
intrinsic cue was item relatedness and the extrinsic cue was num-
ber of study presentations, except that the instructions during the
study/JOL phase were changed to encourage participants to inten-
tionally use a comparison process when they made JOLs. Suppose
that a participant studied a pair composed of unrelated words (e.g.,
bottle–calendar) and responded with a JOL of 30%. If the participant
subsequently studied a pair of related words (e.g., stove–kitchen),
then the rating might increase, say, to 70% because the
person could compare and contrast the degree of the relatedness in
the second pair with that in the first pair (Koriat, 1997). Because
participants might not have used such a comparison process in
Experiment 1B (and in Experiments 1A, 1C, and 1D as well), the
instructions in Experiment 2 were constructed to encourage such a
comparison process, assuming that such a comparison process may
constitute a critical aspect of a multidimensional account of JOLs.
Method
Participants. Forty-five volunteers from undergraduate psychology
courses at the University of Maryland received course credit in return for
their participation in Experiment 2.
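Figure 5. Results of Experiment 1B. Panels A and B show the mean percentage of correct recall and the mean
magnitude of judgments of learning (JOLs) for items having immediate JOLs, whereas Panels D and E show the
corresponding data for items having delayed JOLs as a joint function of item relatedness and number of study
presentations. Panel C is the state-trace plot of the mean magnitude of immediate JOLs against the mean
percentage of correct recall, whereas Panel F is the state-trace plot of the mean magnitude of delayed JOLs
against the mean percentage of correct recall. Each vertical and horizontal hash mark depicts the standard error
of the mean.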
Materials and procedure. The materials and procedure of Experiment
2 were identical to those of Experiment 1B except for the instructions
during the study/JOL phase. The participants were informed that the
members of each pair were related for some of the pairs but not for other
pairs, and they were encouraged to make greater JOLs for related items
than for unrelated items.
Results and Discussion
Prerequisite: Effects of intrinsic and extrinsic cues on recall and
JOL magnitude. The mean percentage of correct recall for each
condition for items receiving immediate JOLs is shown in Figure
8A, and the corresponding mean for items receiving delayed JOLs
is shown in Figure 8D. The mean magnitude of JOLs for each
condition for items receiving immediate JOLs is shown in Figure
8B, and the corresponding mean for items receiving delayed JOLs
is shown in Figure 8E.
As shown in Appendix A and in Figures 8A, 8B, 8D, and 8E,
there were significant main effects of the intrinsic and extrinsic
cues on both recall and JOL magnitude. The two-way interac-
tion of intrinsic and extrinsic cues on JOL magnitude was not
significant (as in Experiments 1A–1D), whereas the two-way
interaction of intrinsic and extrinsic cues on recall was signif-
icant. This interaction was revealed as a greater effect of
number of study presentations for the case of unrelated items
(where recall was intermediate) compared with related items
(where recall was closer to ceiling). Presumably, this interac-
tion resulted from a ceiling effect, but it is important to note
that, as described next, the state-trace analysis placed these
Figure 6. Results of Experiment 1C. Panels A and B show the mean percentage of correct recall and the mean
magnitude of judgments of learning (JOLs) for items having immediate JOLs, whereas Panels D and E show the
corresponding data for items having delayed JOLs as a joint function of item difficulty and study duration. Panel
C is the state-trace plot of the mean magnitude of immediate JOLs against the mean percentage of correct recall,
whereas Panel F is the state-trace plot of the mean magnitude of delayed JOLs against the mean percentage of
correct recall. Each vertical and horizontal hash mark depicts the standard error of the mean.
conditions along a single curve. There were no three-way
interactions for either recall or JOL magnitude. With the pre-
requisite met, analyses of state traces are described next.
State traces of JOLs and recall. The outcome of the state-trace
analysis of recall and JOL magnitude for items receiving imme-
diate JOLs is shown in Figure 8C, and the corresponding outcome
for items receiving delayed JOLs is shown in Figure 8F. As in
Figure 5 of Experiment 1B, each of those panels shows that the
two-presentation curve falls atop the one-presentation curve. As in
Experiments 1A–1D, the consistency of the result suggests that a
multidimensional account is not needed and that a single-
dimensional account is sufficient.
General Discussion
The primary goal of this research was to investigate the structure
underlying recall and JOLs by applying state-trace methodology to
determine whether one underlying dimension is sufficient or
whether multiple underlying dimensions are needed (e.g., the two
dimensions proposed in the cue-utilization framework). Across all
experiments investigating immediate and delayed JOLs, all 10
state-trace plots of recall and JOL magnitude consistently yielded
state-trace curves that fell atop each other, as predicted by the
assumption that only one dimension underlies both JOL magnitude
and recall. The failure to disconfirm the single-dimensional model
Figure 7. Results of Experiment 1D. Panels A and B show the mean percentage of correct recall and the mean
magnitude of judgments of learning (JOLs) for items having immediate JOLs, whereas Panels D and E show the
corresponding data for items having delayed JOLs as a joint function of item relatedness and study duration.
Panel C is the state-trace plot of the mean magnitude of immediate JOLs against the mean percentage of correct
recall, whereas Panel F is the state-trace plot of the mean magnitude of delayed JOLs against the mean
percentage of correct recall. Each vertical and horizontal hash mark depicts the standard error of the mean.
occurred even when participants in Experiment 2 were instructed
to intentionally use a comparison process that should have in-
creased differential effects of intrinsic and extrinsic cues on JOLs
(per the cue-utilization framework).
At first glance, the finding that the pattern of state traces for
immediate JOLs did not differ from that of state traces for delayed
JOLs is surprising because previous researchers speculated that the
relation between recall and JOLs might change over time. How-
ever, the results of this study are in accord with a formulation that
ascribes most of the greater accuracy of delayed JOLs to different
ratios of easier versus more difficult discriminations between items
without invoking different psychological processes for immediate
versus delayed JOLs (Nelson et al., 2004).
A multidimensional model will imitate the single-dimensional
model, yielding state-trace curves that fall atop each other, if the
relations from the independent variables through each of the di-
mensions to the dependent variables are weighted to the same
degree. Manipulating various combinations of the intrinsic and
extrinsic cues, our experiments afforded multiple opportunities to
rule out the single-dimensional model in at least a particular case.
Across all experiments, however, the results that consistently
yielded evidence for the single-dimensional model suggest that the
multiple dimensions are unnecessary.
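The imitation argument can be illustrated with a small simulation. The Python sketch below rests on assumptions that are not part of this research: two latent dimensions with made-up values, logistic output functions with arbitrary slopes, and hypothetical weights. It shows that when the two dimensions are weighted identically for recall and for JOLs, the predicted points fall on one monotonic curve, whereas unequal weights can separate them.

```python
import math
from itertools import combinations


def logistic(x, slope, midpoint):
    """Monotonically increasing output function (an arbitrary choice)."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))


def predict(d_intrinsic, d_extrinsic, w_recall, w_jol):
    """Map two latent dimensions onto (recall, JOL) through weighted sums."""
    recall = logistic(w_recall[0] * d_intrinsic + w_recall[1] * d_extrinsic, 2.0, 1.0)
    jol = logistic(w_jol[0] * d_intrinsic + w_jol[1] * d_extrinsic, 1.0, 1.0)
    return recall, jol


def monotonic(points):
    """True if no pair of points is ordered oppositely on the two measures."""
    return all((r1 - r2) * (j1 - j2) >= 0
               for (r1, j1), (r2, j2) in combinations(points, 2))


# Hypothetical latent values: intrinsic (difficult, easy), extrinsic (weak, strong).
conditions = [(i, e) for i in (0.0, 1.0) for e in (0.0, 1.5)]

# Same weights for both measures: the two-dimensional model imitates one dimension.
equal = [predict(i, e, (1.0, 1.0), (1.0, 1.0)) for i, e in conditions]
# JOLs discount the extrinsic dimension: the state trace can break apart.
unequal = [predict(i, e, (1.0, 1.0), (1.0, 0.2)) for i, e in conditions]

equal_weights_single_curve = monotonic(equal)
unequal_weights_single_curve = monotonic(unequal)
```

With these particular values, equal_weights_single_curve is True and unequal_weights_single_curve is False. Unequal weights do not guarantee a violation for every choice of latent values, which is why several different cue combinations provide more opportunities to detect one.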
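Figure 8. Results of Experiment 2. Panels A and B show the mean percentage of correct recall and the mean
magnitude of judgments of learning (JOLs) for items having immediate JOLs, whereas Panels D and E show the
corresponding data for items having delayed JOLs as a joint function of item relatedness and number of study
presentations. Panel C is the state-trace plot of the mean magnitude of immediate JOLs against the mean
percentage of correct recall, whereas Panel F is the state-trace plot of the mean magnitude of delayed JOLs
against the mean percentage of correct recall. Each vertical and horizontal hash mark depicts the standard error
of the mean.
The question of what single-dimensional structure underlies
JOL magnitude and recall is a topic for future research.
Whether the single-dimensional structure is strength, ease of processing,
retrieval fluency, or something else is an open question.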
However, research designed to answer that question should go
beyond postulating metaphorical structures and instead operation-
alize the various possibilities to distinguish empirically between
them.
Although the present research attempted to address the issue of
dimensionality from a theoretically agnostic standpoint, we nonetheless
make a few remarks about the relation between this research
and Koriat's (1997) cue-utilization framework. We used the
same set of independent variables that Koriat dichotomized into
intrinsic versus extrinsic cues. Thus, our conclusion that JOLs are
based on a single-dimensional construct is limited to the set of
independent variables manipulated in our experiments according
to Koriat's dichotomy. Although we cannot generalize to other
nonexamined independent variables, we can generalize our con-
clusions to the independent variables that Koriat viewed as being
prototypical for his intrinsic versus extrinsic distinction. Presum-
ably, other variables might suggest a multidimensional structure.
In fact, a candidate set of dimensions of JOLs is under consider-
ation. Koriat, Bjork, Sheffer, and Bar (2004) recently reported
some evidence for two underlying dimensions of JOLs. They
showed that JOLs are insensitive to retention interval relative to
recall, suggesting a distinction between experience-based and
theory-based JOLs. They attributed the indifference of JOLs to
retention interval to the predominant dependence on subjective
experience (i.e., experience-based JOLs). Although further empir-
ical research is needed to fully understand how the theory-based
knowledge functions and can be combined with the experience-
based knowledge as Koriat et al. suggested, it should be empha-
sized that this dual-basis view serves as one potential multidimen-
sional model.
To explore the difference between our conclusions and those of
Koriat (1997), we conducted another ANOVA for each experiment,
treating the contrast of recall and JOLs (labeled "measure"
by Koriat) as a repeated variable. The complete 2 × 2 × 2 (i.e.,
measure, extrinsic cue, and intrinsic cue) ANOVAs of all experiments
are reported in Appendix B. According to the cue-utilization
framework, there should be an interaction of measure and extrinsic
cue, whereas there should be little or no interaction of measure and
intrinsic cue. Specifically, the interaction of measure and extrinsic
cue should yield the pattern of results indicating that recall is much
higher than JOL magnitude in the strong level of extrinsic cues
(i.e., two presentations in Experiments 1A, 1B, and 2; long presentation
in Experiments 1C and 1D). Note that because Koriat's
experiments examined only immediate JOLs, the comparison be-
tween our results and those of Koriat should be limited to only our
immediate JOL conditions (although the patterns of results in the
immediate and delayed JOL conditions of this study are similar).
The ANOVAs showed that, first, the interaction of measure and
extrinsic cue was significant in all experiments except for Exper-
iment 1C. The discounted effect of extrinsic cues on JOLs for
items of two presentations (i.e., underconfidence, as shown in
Table B2) was found in Experiments 1B and 2, which is consistent
with the hypothesis of the cue-utilization framework. This pattern
of interaction, however, was not found in any of the other exper-
iments. Indeed, Experiment 1A yielded the opposite pattern; JOLs
were overestimated for items of one presentation (i.e., overconfi-
dence, as shown in Table B2). Second, the interaction of measure
and intrinsic cue was significant in all experiments; Experiments
1A and 1C yielded overestimated JOLs for difficult items (and for
easy items of Experiment 1C), whereas Experiments 1B, 1D, and
2 yielded underestimated JOLs for related items. Neither of the
results was consistent with the hypothesis of the cue-utilization
framework. In the present research, on the whole, the conceptual
distinction of the intrinsic versus extrinsic cues failed functionally
to confirm the predictions from the cue-utilization framework that
whereas intrinsic cues have similar effects on both recall and JOLs,
extrinsic cues affect recall more strongly than JOLs.
Indeed, Koriat (1997) found inconsistent effects of intrinsic cues
in his experiments; for instance, whereas his Experiments 2 and 3
yielded equivalent effects of intrinsic cues on both recall and JOLs,
his Experiment 1 yielded greater effects on JOLs than on recall
(see Figure 2, top panel, p. 354), and his Experiment 4 yielded
weaker effects on JOLs than on recall. Likewise, other findings
inconsistent with conclusions derived from the cue-utilization
framework have been reported; for instance, Dunlosky and Matvey
(2001) concluded that "both outcomes are inconsistent with predictions
from the cue-utilization framework [which] provides more
of an empirical generalization (i.e., a taxonomy of effects) and not
a theoretical explanation for why various factors differentially
influence JOLs" (p. 1186). In addition, Busey et al. (2000) reported
that exposure duration had a similar effect on JOL magnitude
as on memory performance, whereas the amount of rehearsal
had a greater effect on JOL magnitude than on memory perfor-
mance (although both exposure duration and the amount of re-
hearsal are extrinsic cues).
The assumption of Koriat's intrinsic versus extrinsic distinction
was based on converging/diverging interactions that were scale
dependent in the sense that the conclusion of an interaction de-
pends critically on the particular scaling both of JOL magnitude
and of recall. That is, although Koriat's interactions were significant
in the statistical sense (of rejecting the null hypothesis of
parallel curves), they were not meaningful in the measurement
sense (i.e., conclusions drawn from them will carry over only to
the particular measures he reported or to some linear transforma-
tion of them, but there is no evidence that the relationship between
those values and the underlying structure is necessarily linear). By
contrast, a positive monotonic nonlinear transformation could
transform Koriat's interactions into parallel curves (cf. Krantz & Tversky,
1971, and the admonishment about drawing conclusions from
scale-dependent interactions in the tutorial by Loftus, 1978). Such
converging/diverging interactions, in contrast to crossover inter-
actions (or interactions in which two curves have opposite-
direction slopes), are known in the literature as being problematic
as a basis for inferring underlying multidimensional structures
(e.g., Dunn & Kirsner, 1988; Krantz, Luce, Suppes, & Tversky,
1971; Loftus, 1978).
An advantage of the state-trace methodology used in the present
research over standard parametric ANOVAs is that it is not beset
with the aforementioned problem of meaningfulness of conclu-
sions that occur when conclusions are based on scale-dependent
interactions. As readers can prove to themselves, any monotonic
(linear or nonlinear) transformation can be applied to the values we
reported for JOL magnitude and recall without eliminating the
overlap of the curves in the state-trace plots shown in Figures 4C
and 4F–8C and 8F. Our conclusions about an underlying single-
dimensional structure being sufficient to account for the perfor-
mance in our experiments are meaningful across all monotonic
transformations of JOL magnitude and recall that are shown in our
10 state-trace plots. By contrast, conclusions about converging/
diverging interactions can change if monotonic-but-nonlinear
transformations are applied to the figures shown in Koriat (1997).
The state-trace methodology used in this research not only circum-
vents the problems of interpretation of scale-dependent interac-
tions (the limitations of the general linear model) but also can
explore the same issues as dissociation techniques, but in a stron-
ger manner (see Busey et al., 2000; Loftus, 2002; Loftus & Irwin,
1998; Loftus et al., 2004; for a tutorial on state-trace analysis, see
the Appendix in Harley, Dillon, & Loftus, 2004).
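The contrast between scale-dependent interactions and state-trace overlap can be checked numerically. The Python sketch below uses hypothetical means chosen for illustration (they are not values from this research or from Koriat, 1997) and a logarithm and a square root as arbitrary examples of monotonic nonlinear transformations. It shows a diverging interaction that disappears after transformation, and a set of state-trace points whose ordering survives transformation.

```python
import math
from itertools import combinations


def interaction_contrast(m):
    """Difference of differences for a 2 x 2 table of means m[i][j]."""
    return (m[1][1] - m[1][0]) - (m[0][1] - m[0][0])


def monotonic(points):
    """True if no pair of points is ordered oppositely on the two measures."""
    return all((x1 - x2) * (y1 - y2) >= 0
               for (x1, y1), (x2, y2) in combinations(points, 2))


# Hypothetical 2 x 2 means showing a diverging (non-crossover) interaction.
raw = [[10.0, 20.0], [40.0, 80.0]]
logged = [[math.log(v) for v in row] for row in raw]

raw_contrast = interaction_contrast(raw)     # nonzero on the raw scale
log_contrast = interaction_contrast(logged)  # vanishes on the log scale

# Hypothetical (recall, JOL) condition means lying on one increasing curve.
points = [(20.0, 30.0), (38.0, 41.0), (55.0, 52.0), (74.0, 66.0)]
# Apply a different monotonic transformation to each measure.
transformed = [(math.sqrt(r), math.log(j)) for r, j in points]

overlap_before = monotonic(points)
overlap_after = monotonic(transformed)
```

Here raw_contrast is 30.0, log_contrast is zero up to floating-point error, and both overlap_before and overlap_after are True. The interaction depends on the scale, whereas the ordinal pattern that the state-trace plot relies on does not.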
References
American Psychological Association. (1992). Ethical principles of psychologists
and code of conduct. American Psychologist, 47, 1597–1611.
Bamber, D. (1979). State-trace analysis: A method of testing simple
theories of causation. Journal of Mathematical Psychology, 19, 137–181.
Begg, I., Duft, S., Lalonde, P., Melnick, R., & Sanvito, J. (1989). Memory
predictions are based on ease of processing. Journal of Memory &
Language, 28, 610–632.
Benjamin, A. S. (2003). Predicting and postdicting the effects of word
frequency on memory. Memory & Cognition, 31, 297–305.
Benjamin, A. S., Bjork, R. A., & Schwartz, B. L. (1998). The mismeasure
of memory: When retrieval fluency is misleading as a metamnemonic
index. Journal of Experimental Psychology: General, 127, 55–68.
Busey, T. A., Tunnicliff, J., Loftus, G. R., & Loftus, E. F. (2000). Accounts
of the confidence-accuracy relation in recognition memory.
Psychonomic Bulletin & Review, 7, 26–48.
Dunlosky, J., & Matvey, G. (2001). Empirical analysis of the intrinsic–extrinsic
distinction of judgments of learning (JOLs): Effects of relatedness
and serial position on JOLs. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 27, 1180–1191.
Dunn, J. C., & James, R. N. (2003). Signed difference analysis: Theory and
application. Journal of Mathematical Psychology, 47, 389–416.
Dunn, J. C., & Kirsner, K. (1988). Discovering functionally independent
mental processes: The principle of reversed association. Psychological
Review, 95, 91–101.
Harley, E. M., Dillon, A. M., & Loftus, G. R. (2004). Why is it difficult to
see in the fog? How stimulus contrast affects visual perception and
visual memory. Psychonomic Bulletin & Review, 11, 197–231.
Kimball, D. R., & Metcalfe, J. (2003). Delaying judgments of learning
affects memory, not metamemory. Memory & Cognition, 31, 918–929.
Koriat, A. (1997). Monitoring one's knowledge during study: A cue-utilization
framework to judgments of learning. Journal of Experimental
Psychology: General, 126, 349–370.
Koriat, A., Bjork, R. A., Sheffer, L., & Bar, S. K. (2004). Predicting one's
own forgetting: The role of experience-based and theory-based processes.
Journal of Experimental Psychology: General, 133, 643–656.
Krantz, D. H., Luce, R. D., Suppes, P., & Tversky, A. (1971). Foundations
of measurement: Vol. 1. Additive and polynomial representations. New
York: Academic Press.
Krantz, D. H., & Tversky, A. (1971). Conjoint-measurement analysis of
composition rules in psychology. Psychological Review, 78, 151–169.
Loftus, G. R. (1978). On interpretation of interactions. Memory & Cognition,
6, 312–319.
Loftus, G. R. (2002). Analysis, interpretation, and visual presentation of
data. In H. E. Pashler (Series Ed.) & J. T. Wixted (Vol. Ed.), Stevens'
handbook of experimental psychology: Vol. 4. Methodology in experimental
psychology (3rd ed., pp. 339–390). New York: Wiley.
Loftus, G. R., & Irwin, D. E. (1998). On the relations among different
measures of visible and informational persistence. Cognitive Psychology,
35, 135–199.
Loftus, G. R., Oberg, M. A., & Dillon, A. M. (2004). Linear theory,
dimensional theory, and the face-inversion effect. Psychological Review,
111, 835–863.
Nelson, T. O., & Dunlosky, J. (1991). When people's judgments of
learning (JOLs) are extremely accurate at predicting subsequent recall:
The delayed-JOL effect. Psychological Science, 2, 267–270.
Nelson, T. O., & Dunlosky, J. (1992). How shall we explain the delayed-
judgment-of-learning effect? Psychological Science, 3, 317–318.
Nelson, T. O., & Dunlosky, J. (1994). Norms of paired-associate recall
during multitrial learning of Swahili-English translation equivalents.
Memory, 2, 325–335.
Nelson, T. O., & Narens, L. (1994). Why investigate metacognition? In J.
Metcalfe & A. P. Shimamura (Eds.), Metacognition: Knowing about
knowing (pp. 1–25). Cambridge, MA: MIT Press.
Nelson, T. O., Narens, L., & Dunlosky, J. (2004). A revised methodology
for research on metamemory: Pre-judgment recall and monitoring
(PRAM). Psychological Methods, 9, 53–69.
Schwartz, B. L. (1994). Sources of information in metamemory: Judgments
of learning and feeling of knowing. Psychonomic Bulletin & Review, 1,
357–375.
Spellman, B. A., & Bjork, R. A. (1992). When predictions create reality:
Judgments of learning may alter what they are intended to assess.
Psychological Science, 3, 315–316.
Townsend, J. T., & Ashby, F. G. (1984). Measurement scales and statistics:
The misconception misconceived. Psychological Bulletin, 96, 394–401.
Weaver, C. A., III, & Kelemen, W. L. (1997). Judgments of learning at
delays: Shifts in response patterns or increased metamemory accuracy?
Psychological Science, 8, 318–321.
Wixted, J. T. (1992). Subjective memorability and the mirror effect.
Journal of Experimental Psychology: Learning, Memory, and Cognition,
18, 681–690.
Appendix A
Complete 2 × 2 × 2 (Timing of JOLs, Extrinsic Cue, and Intrinsic Cue) Analyses of Variance

                Mean percentage of correct recall     Mean magnitude of JOLs
Source          F(1, 44)  MSE       p         ES      F(1, 44)  MSE       p         ES
Experiment 1A
  T             < 1                                   2.13      423.21    .15
  E             99.03     506.63    < .001    .69     91.31     230.05    < .001    .68
  I             218.45    363.95    < .001    .83     170.98    207.82    < .001    .80
  T × E         < 1                                   21.03     114.24    < .001    .32
  T × I         < 1                                   15.97     191.79    < .001    .27
  E × I         < 1                                   < 1
  T × E × I     1.38      247.16    .25               1.27      122.88    .27
Experiment 1B
  T             < 1                                   < 1
  E             74.76     368.69    < .001    .63     88.77     140.63    < .001    .67
  I             159.52    609.57    < .001    .78     121.40    355.20    < .001    .73
  T × E         1.89      254.98    .18               2.94      499.72    .09
  T × I         7.59      264.52    < .05     .15     < 1
  E × I         < 1                                   < 1
  T × E × I     < 1                                   < 1
Experiment 1C
  T             2.01      498.11    .16               36.53     614.77    < .001    .45
  E             18.68     334.60    < .001    .30     24.37     114.76    < .001    .36
  I             190.40    308.71    < .001    .81     160.19    171.34    < .001    .78
  T × E         < 1                                   1.53      206.39    .22
  T × I         < 1                                   2.89      131.54    .10
  E × I         < 1                                   < 1
  T × E × I     4.04      274.55    .05 ns            4.04      80.97     .05 ns
Experiment 1D
  T             1.08      516.47    .30               2.30      603.08    .14
  E             55.65     324.82    < .001    .56     88.19     120.24    < .001    .67
  I             232.90    475.74    < .001    .84     179.88    293.47    < .001    .80
  T × E         11.80     183.33    < .01     .21     44.70     111.49    < .001    .50
  T × I         14.67     311.38    < .001    .25     4.22      119.59    < .05     .09
  E × I         < 1                                   < 1
  T × E × I     < 1                                   < 1
Experiment 2
  T             < 1                                   1.29      690.27    .26
  E             105.60    316.11    < .001    .71     95.41     165.84    < .001    .68
  I             106.39    528.72    < .001    .71     89.35     345.45    < .001    .67
  T × E         1.50      404.50    .23               6.73      148.55    < .05     .13
  T × I         5.43      355.40    < .05     .11     6.20      140.62    < .05     .12
  E × I         13.76     229.62    < .01     .24     < 1
  T × E × I     1.95      228.22    .17               < 1

Note. Effect size (ES) is reported only when the F value was significant. JOLs = judgments of learning; T = timing of JOLs (immediate vs. delayed JOLs
in all experiments); E = extrinsic cue (number of study presentations in Experiments 1A, 1B, and 2: one vs. two presentations; study duration in
Experiments 1C and 1D: short vs. long duration); I = intrinsic cue (item difficulty in Experiments 1A and 1C: easy vs. difficult items; item relatedness
in Experiments 1B, 1D, and 2: related vs. unrelated items); ns = not significant.
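The note does not say how ES was computed. As an assumption made only for this illustration, the tabled values appear consistent with partial eta squared recovered from each F ratio and its degrees of freedom, ES = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies that assumed formula to two F(1, 44) values from the recall columns of Experiment 1A; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio (assumed ES formula)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Experiment 1A, recall, extrinsic cue: F(1, 44) = 99.03, tabled ES = .69
es_extrinsic = round(partial_eta_squared(99.03, 1, 44), 2)

# Experiment 1A, recall, intrinsic cue: F(1, 44) = 218.45, tabled ES = .83
es_intrinsic = round(partial_eta_squared(218.45, 1, 44), 2)
```

Under this assumption the two computed values round to .69 and .83, matching the table. Agreement for these rows does not establish that this is the formula the authors used.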
Appendix B
The complete results from 2 × 2 × 2 (i.e., measure, extrinsic cue, and intrinsic cue) analyses of variance are reported in Table B1. The
results from follow-up simple effect tests of the interactions between measure and extrinsic–intrinsic cue in the condition of immediate
judgments of learning (JOLs) are reported in Table B2, and the corresponding results in the condition of delayed JOLs are reported in Table B3.
Table B1
Complete 2 × 2 × 2 (Measure, Extrinsic Cue, and Intrinsic Cue) Analyses of Variance

                Immediate JOLs                        Delayed JOLs
Source          F         MSE       p         ES      F         MSE       p         ES
Experiment 1A
  M             1.13      728.11    .29               < 1
  E             73.05     322.00    < .001    .62     76.58     606.78    < .001    .64
  I             142.74    285.45    < .001    .76     161.81    445.93    < .001    .79
  M × E         19.58     168.59    < .001    .31     9.59      48.70     < .01     .18
  M × I         27.87     169.31    < .001    .39     5.66      108.50    < .05     .11
  E × I         < 1                                   < 1
  M × E × I     1.79      86.77     .19               4.99      68.24     < .05     .10
Experiment 1B
  M             3.08      655.92    .09               8.50      422.25    < .01     .16
  E             70.98     276.92    < .001    .62     53.01     365.97    < .001    .55
  I             198.78    428.38    < .001    .82     81.52     656.72    < .001    .65
  M × E         13.45     169.82    < .001    .23     < 1
  M × I         20.26     231.69    < .001    .32     15.78     80.60     < .001    .26
  E × I         < 1                                   < 1
  M × E × I     < 1                                   < 1
Experiment 1C
  M             54.64     650.31    < .001    .55     < 1
  E             13.80     181.63    < .001    .24     11.58     578.99    < .01     .21
  I             151.47    247.01    < .001    .78     103.79    444.06    < .001    .70
  M × E         1.47      222.99    .23               2.09      60.51     .16
  M × I         14.69     151.99    < .001    .25     16.58     52.55     < .001    .27
  E × I         2.36      161.15    .13               2.82      361.99    .10
  M × E × I     1.32      145.84    .26               < 1
Experiment 1D
  M             1.07      616.28    .31               5.37      285.60    < .05     .11
  E             24.83     145.75    < .001    .36     87.34     359.75    < .001    .66
  I             245.53    433.75    < .001    .85     108.06    516.68    < .001    .71
  M × E         5.28      146.28    < .05     .11     < 1
  M × I         33.16     165.67    < .001    .43     9.99      840.28    < .01     .19
  E × I         < 1                                   < 1
  M × E × I     < 1                                   < 1
Experiment 2
  M             7.75      1443.01   < .05     .15     7.24      613.86    < .05     .14
  E             75.92     209.59    < .001    .63     56.50     588.60    < .001    .56
  I             148.95    396.92    < .001    .77     50.92     565.57    < .001    .54
  M × E         6.50      157.19    < .05     .13     7.82      79.61     < .01     .15
  M × I         4.44      324.60    < .05     .09     6.67      83.18     < .05     .13
  E × I         8.12      218.94    < .01     .16     1.55      240.69    .22
  M × E × I     10.55     117.02    < .01     .19     3.71      67.36     .06

Note. Effect size (ES) is reported only when the F value was significant. JOLs = judgments of learning; M = measure (recall vs. JOLs in all experiments);
E = extrinsic cue (number of study presentations in Experiments 1A, 1B, and 2: one vs. two presentations; study duration in Experiments 1C and 1D: short
vs. long duration); I = intrinsic cue (item difficulty in Experiments 1A and 1C: easy vs. difficult items; item relatedness in Experiments 1B, 1D, and 2:
related vs. unrelated items).
Table B2
Simple Effect Tests Following the Interactions Between Measure and Extrinsic–Intrinsic Cue:
Immediate JOLs

Experiment and interaction                  t(44)    p         Over/underconfidence
Measure × Extrinsic Cue
  1A
    Recall vs. JOLs of one presentation     3.19     < .01     Overconfidence
    Recall vs. JOLs of two presentations    .88      .38
  1B
    Recall vs. JOLs of one presentation     .10      .92
    Recall vs. JOLs of two presentations    3.02     < .01     Underconfidence
  1D
    Recall vs. JOLs of short presentation   .08      .94
    Recall vs. JOLs of long presentation    1.94     .06
  2
    Recall vs. JOLs of one presentation     1.86     .07
Measure × Intrinsic Cue
  1A
    Recall vs. JOLs of difficult items      3.36     < .01     Overconfidence
    Recall vs. JOLs of easy items           1.30     .20
  1B
    Recall vs. JOLs of unrelated items      .70      .48
    Recall vs. JOLs of related items        4.42     < .001    Underconfidence
  1C
    Recall vs. JOLs of easy items           4.56     < .001    Overconfidence
  1D
    Recall vs. JOLs of unrelated items      1.94     .06
  2

Note. JOLs = judgments of learning.
Table B3
Simple Effect Tests Following the Interactions Between Measure and Extrinsic–Intrinsic Cue:
Delayed JOLs

Experiment and interaction                  t(44)    p         Over/underconfidence
Measure × Extrinsic Cue
  1A
    Recall vs. JOLs of two presentations    1.13     .26
  2
Measure × Intrinsic Cue
  1A
    Recall vs. JOLs of difficult items      1.43     .16
  1B
  1C
  1D
    Recall vs. JOLs of unrelated items      .58      .56
  2
Appendix C
Mean Goodman-Kruskal gamma correlations between recall and judgments of learning are
reported in Tables C1–C5, and the complete results from analyses of variance of the gammas are
reported in Table C6.
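For readers unfamiliar with the statistic, the Goodman-Kruskal gamma for one participant is (C − D) / (C + D), where C and D are the numbers of concordant and discordant item pairs. The Python sketch below is a generic illustration of that definition, not the authors' analysis code; the function name and the six-item data set are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(jols, recalled):
    """Gamma between JOL ratings and recall outcomes (1 = recalled, 0 = not).

    Pairs tied on either variable are ignored. Returns None when no pair
    is concordant or discordant, in which case gamma is undefined.
    """
    concordant = 0
    discordant = 0
    for (j1, r1), (j2, r2) in combinations(zip(jols, recalled), 2):
        product = (j1 - j2) * (r1 - r2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical participant: six items with JOL ratings and recall outcomes.
jols = [20, 40, 60, 80, 100, 70]
recalled = [0, 0, 1, 1, 1, 0]
gamma = goodman_kruskal_gamma(jols, recalled)

# Gamma is undefined when every item has the same recall outcome.
undefined = goodman_kruskal_gamma([20, 40, 60], [1, 1, 1])
```

For this hypothetical participant there are eight concordant pairs and one discordant pair, so gamma equals 7/9, and undefined is None because no pair differs in recall.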
Table C1
Mean Goodman-Kruskal Gamma Correlations Between Recall and JOLs of Experiment 1A as a
Function of Timing of JOLs, Item Difficulty, and Number of Study Presentations

                  Timing of JOLs
                  Immediate (M = .41, SEM = .09)  Delayed (M = .76, SEM = .04)    Overall
                  One study       Two study       One study       Two study
                  presentation    presentations   presentation    presentations
Item difficulty   M      SEM      M      SEM      M      SEM      M      SEM      M      SEM
Easy              .28    .23      .64    .13      .95    .05      .66    .17      .63    .08
Difficult         .28    .20      .43    .17      .54    .18      .89    .06      .53    .07
Overall           .28    .18      .53    .14      .74    .09      .78    .09
Table C2
Mean Goodman-Kruskal Gamma Correlations Between Recall and JOLs of Experiment 1B as a
Function of Timing of JOLs, Item Relatedness, and Number of Study Presentations

                  Timing of JOLs
                  Immediate (M = .28, SEM = .07)  Delayed (M = .77, SEM = .05)    Overall
                  One study       Two study       One study       Two study
                  presentation    presentations   presentation    presentations
Item relatedness  M      SEM      M      SEM      M      SEM      M      SEM      M      SEM
Related           .13    .17      .04    .17      .80    .07      .73    .09      .43    .06
Unrelated         .55    .13      .41    .15      .73    .09      .80    .08      .62    .06
Overall           .34    .11      .22    .13      .76    .07      .77    .06
Table C3
Mean Goodman-Kruskal Gamma Correlations Between Recall and JOLs of Experiment 1C as a
Function of Timing of JOLs, Item Difficulty, and Study Duration

                  Timing of JOLs
                  Immediate (M = .29, SEM = .10)  Delayed (M = .77, SEM = .05)    Overall
                  Study           Study           Study           Study
                  duration: 5 s   duration: 15 s  duration: 5 s   duration: 15 s
Item difficulty   M      SEM      M      SEM      M      SEM      M      SEM      M      SEM
Easy              .58    .16      .08    .22      .83    .09      .80    .13      .57    .09
Difficult         .13    .20      .38    .26      .53    .20      .90    .10      .49    .09
Overall           .35    .14      .23    .18      .68    .11      .85    .08
Table C4
Mean Goodman-Kruskal Gamma Correlations Between Recall and JOLs of Experiment 1D as a
Function of Timing of JOLs, Item Relatedness, and Study Duration

                  Timing of JOLs
                  Immediate (M = .22, SEM = .08)  Delayed (M = .76, SEM = .04)    Overall
                  Study           Study           Study           Study
                  duration: 2 s   duration: 8 s   duration: 2 s   duration: 8 s
Item relatedness  M      SEM      M      SEM      M      SEM      M      SEM      M      SEM
Related           .01    .18      .29    .17      .66    .15      .70    .10      .41    .06
Unrelated         .29    .14      .28    .16      .83    .09      .86    .06      .56    .06
Overall           .15    .10      .28    .10      .74    .09      .78    .05
Received December 10, 2003
Revision received April 19, 2005
Accepted April 26, 2005
Table C5
Mean Goodman-Kruskal Gamma Correlations Between Recall and JOLs of Experiment 2 as a
Function of Timing of JOLs, Item Relatedness, and Number of Study Presentations

                  Timing of JOLs
                  Immediate (M = .36, SEM = .08)  Delayed (M = .83, SEM = .04)    Overall
                  One study       Two study       One study       Two study
                  presentation    presentations   presentation    presentations
Item relatedness  M      SEM      M      SEM      M      SEM      M      SEM      M      SEM
Related           .34    .21      .03    .20      .79    .14      .89    .07      .51    .07
Unrelated         .44    .15      .61    .11      .73    .12      .89    .08      .67    .07
Overall           .39    .13      .32    .13      .76    .10      .89    .05
Table C6
Complete 2 × 2 × 2 (i.e., Timing of JOLs, Extrinsic Cue, and Intrinsic Cue) ANOVAs of
Goodman-Kruskal Gamma Correlations Between Recall and JOLs

                      Experiment 1A                       Experiment 1B
Independent variable  F         MSE     p         ES      F         MSE     p         ES
  T                   14.70     .22     < .01     .55     44.06     .21     < .001    .70
  E                   < 1                                 < 1
  I                   < 1                                 9.40      .16     < .01     .33
  T × E               < 1                                 < 1
  T × I               < 1                                 4.53      .34     < .05     .19
  E × I               1.05      .28     .33               < 1
  T × E × I           5.30      .22     < .05     .31     < 1

                      Experiment 1C                       Experiment 1D
Independent variable  F         MSE     p         ES      F         MSE     p         ES
  T                   18.95     .24     < .01     .68     27.73     .41     < .001    .61
  E                   < 1                                 1.01      .28     .33
  I                   < 1                                 2.30      .37     .15
  T × E               1.54      .27     .25               < 1
  T × I               < 1                                 < 1
  E × I               6.42      .25     < .05     .42     < 1
  T × E × I           < 1                                 < 1

                      Experiment 2
Independent variable  F(1, 14)  MSE     p         ES
  T                   28.14     .24     < .001    .67
  E                   < 1
  I                   3.38      .21     .09
  T × E               < 1
  T × I               2.85      .36     .11
  E × I               2.52      .21     .14
  T × E × I           < 1

Note. Effect size (ES) is reported only when the F value was significant. JOLs = judgments of learning;
ANOVAs = analyses of variance; T = timing of JOLs (immediate vs. delayed JOLs in all experiments); E =
extrinsic cue (number of study presentations in Experiments 1A, 1B, and 2: one vs. two presentations; study
duration in Experiments 1C and 1D: short vs. long duration); I = intrinsic cue (item difficulty in Experiments
1A and 1C: easy vs. difficult items; item relatedness in Experiments 1B, 1D, and 2: related vs. unrelated items).