Hart, 1965

Journal of Educational Psychology, 1965, Vol. 56, No. 4, 208-216
MEMORY AND THE FEELING-OF-KNOWING EXPERIENCE1

J. T. HART
Stanford University

To evaluate the accuracy of feeling-of-knowing experiences, 2 investigations are reported. Both experiments (Ns of 22 and 16, respectively) show the phenomenon to be a relatively accurate indicator of memory storage, as measured by the ability of Ss to predict recognition failures and successes for items they cannot recall. The results are discussed in terms of the utility of a memory-monitoring process for the efficient functioning of a fallible storage and retrieval system.

Even when unable to answer difficult questions, people are not completely blank. Usually they have definite feelings about whether they know or do not know the absent answers. Feelings of knowing can sometimes be very strong; a person will feel that an elusive memory is close, very close, right on the tip of his tongue. Tip-of-the-tongue experiences and feelings of knowing of lesser intensities are very common, occurring every day with many types of memory materials: names, dates, telephone numbers, addresses, faces, places, etc. It is not surprising then that the tip-of-the-tongue experience has been recognized by psychologists for many years; nor is it surprising that a few investigations of the experience have been conducted. William James (1950), who seems to have had something wise to say about everything psychological, discussed the tip-of-the-tongue phenomenon at length (pp. 251-264). Woodworth and Schlosberg (1954) have summarized the early investigations of the phenomenon (pp. 719-721).

These early investigations were, however, limited in several respects: (a) Only the intense tip-of-the-tongue experiences were studied, not the more general and ubiquitous feeling-of-knowing experiences; (b) the investigations were unsystematic and nonquantitative, consisting mainly of collections of instances; and most importantly (c) the investigations did not answer, nor even ask, what is perhaps the most important question about tip-of-the-tongue or feeling-of-knowing experiences: are they accurate? Instead, the early investigators took the phenomenon as given and tried to study how subjects retrieved or searched for information they did not have but felt they knew. This retrieval problem is of considerable interest, but it departs from the study of the feeling-of-knowing (FOK) phenomenon itself. Indeed, asking how FOK memories are retrieved presupposes that the FOK experience is an accurate indicator of what is in memory. This presupposition is tested in the following investigations.

1 This investigation was completed while the author was on a United States Public Health Service predoctoral research fellowship. The investigation constitutes a portion of the author's PhD dissertation at Stanford University. I am indebted to Albert Hastorf, Leonard Horowitz, and Karl Pribram for valuable discussions about the planning and results of this research.

EXPERIMENT I
Method

To answer the question about the accuracy of FOK experiences it is necessary to find a research paradigm within which the experiences can be produced and their accuracy evaluated. Use was made of one of the best-established facts of verbal learning: recognition exceeds recall. People can almost always recognize more answers than they can produce. From this fact the following recall-judgment-recognition (RJR) paradigm can be applied as a way of studying the accuracy of FOK experiences: (a) give the subjects a test of recall and, for those items they cannot answer, instruct them to make a judgment about whether or not they feel they know the correct answer well enough to recognize it among several wrong alternatives; (b) then give the subjects a multiple-choice recognition test covering the same items that appeared in the test of recall. If the FOK experience is an accurate indicator of memory storage, the subjects should do better on those recognition-test items which they feel they know but cannot recall than on those items which they feel they do not know. Accuracy can be easily assessed by comparing the proportion correct on feeling-of-knowing (FK) items with the proportion correct on feeling-of-not-knowing (F̄K̄) items for each subject.

Both of the experiments reported in this paper employed the same RJR two-step procedure. Experiment I differs from Experiment II only in the number of questions included in the tests (50 and 75, respectively) and the kind of FOK judgments obtained from the subjects (dichotomous and graded, respectively). The reasons for these differences will be given after the method and results of the first experiment are presented.

Materials

The questions used in the recall and recognition tests were 50 general-information questions. An attempt was made to range widely over the humanities and sciences, choosing questions that would be meaningful but not easy for the average undergraduate. The basic criterion for inclusion was that a question have a single correct answer. Three sample questions are given below in multiple-choice form:

Which planet is the largest in our solar system? a. Pluto b. Venus c. Earth d. Jupiter
How many sides are there in a hexagon? a. 8 b. 9 c. 6 d. 7
Who wrote "The Tempest"? a. Moliere b. Strindberg c. Jonson d. Shakespeare

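The accuracy measure described under Method (proportion correct on the recognition test for unrecalled items judged FK versus unrecalled items judged not-FK) can be illustrated with a minimal sketch. The record layout, the names `Item`, `proportion`, and `score_subject`, and the example data are all hypothetical and are not taken from the original study; `not_fk` stands for the feeling-of-not-knowing category.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """One question for one subject (hypothetical record layout)."""
    recalled: bool        # True if an answer was written on the recall test
    fok: Optional[bool]   # True = FK (Yes), False = not-FK (No), None = no judgment
    recognized: bool      # True if the multiple-choice response was correct


def proportion(flags):
    """Share of True values, or None when the category is empty."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else None


def score_subject(items):
    """Recognition accuracy for the unrecalled items of one subject."""
    fk = [it for it in items if not it.recalled and it.fok is True]
    not_fk = [it for it in items if not it.recalled and it.fok is False]
    return {
        "fk_correct": proportion(it.recognized for it in fk),
        "not_fk_correct": proportion(it.recognized for it in not_fk),
    }


# Made-up data: one recalled item (ignored), four FK blanks, four not-FK blanks.
example = [
    Item(recalled=True, fok=None, recognized=True),
    Item(False, True, True), Item(False, True, True),
    Item(False, True, True), Item(False, True, False),
    Item(False, False, True), Item(False, False, False),
    Item(False, False, False), Item(False, False, False),
]
result = score_subject(example)
```

With these invented responses, `result["fk_correct"]` exceeds `result["not_fk_correct"]`, which is the pattern the paradigm predicts if FOK judgments are accurate.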
Procedure

The experiment was administered to all subjects in a single group. All 22 were Stanford undergraduates who had signed up for the experiment to fulfill a course requirement in general psychology. When the subjects arrived, they were seated around a large table; then instructions, including the following, were read to them prior to the test of recall:

"The main thing you will be doing in this experiment is answering questions. All the questions are questions of fact. Although the questions are not easy, they are about topics that may have been familiar to you at one time. These questions do not constitute an intelligence test, and you are not expected to answer all of the questions correctly. You will be given about 10 seconds to answer a question after I read it aloud. Do not make wild guesses but do write down any answer you believe might be correct. Also, do not at this time return to questions that you have already passed over. You will be given a chance later to go over the questions again.

"I mentioned before that you are not expected to answer every question correctly. Indeed the questions were chosen so that everyone would be unable to answer some questions the first time through. This was done because we are interested in your feelings about the questions for which you are unable to give a correct answer.

"Next to the answer box for every question there are two columns labeled 'Feeling of Knowing: Yes/No.' If you cannot supply an answer to a question, make a check in one of the columns adjacent to the blank answer space. If you check the No column, that will indicate that you feel completely blank about what the correct answer might be. If you check the Yes column, that will indicate you have a feeling that you know the correct answer even though you cannot remember it at the moment. The criterion question to ask yourself before checking the Yes column is, 'Even though I don't remember the answer now, do I know the answer to the extent that I could pick the correct answer from among several wrong answers?' If your answer to this criterion question is 'yes,' check Column 1. If your answer is 'no,' check Column 2.

"After you have finished all 50 questions, you will be given a second form of the test. On that form you will need to circle the correct answer among four possible answers."

After these instructions were read to the subjects, the experimenter proceeded to read the questions at a rate of approximately 10-15 seconds per question. When the 50 questions were completed, the answer sheets were collected, and then the subjects were given the multiple-choice form of the questions. On the recognition test each subject worked independently at his own speed, going over all 50 questions again. The subjects were instructed to answer every question on the recognition test, even if they had to guess.

Results

Table 1 shows, as expected, that memory scores are higher for recognition than recall. The questions wrong on the test of recall are categorized into errors and blanks. For the purposes of the experiment interest centers on the questions left blank, since these were the questions that received a FOK judgment by the subjects. The blanks are subdivided into those for which a FK or Yes judgment was made and those for which a F̄K̄ or No judgment was made.

TABLE 1
RECALL AND RECOGNITION SCORES: EXPERIMENT I

Tests                 Mean     SD
Recall
  Correct             26.6    7.2
  Incorrect           23.4    7.2
    Errors             5.2    2.4
    Blanks            18.2    6.0
      FK               8.2    3.3
      F̄K̄              10.0    4.4
Recognition
  Correct             39.8    4.9
  Incorrect           10.2    4.9

Note. Fifty-item tests, N = 22.

Before looking at the data to assess the accuracy of FOK judgments, a few definitions are necessary. The meaning of a "hit" or a "miss" differs for FK and F̄K̄ items. A FK hit can be straightforwardly defined as a correct multiple-choice response on a question previously left blank and judged FK (Yes); a FK miss would be an incorrect multiple-choice response. For F̄K̄ blanks a hit would be the converse of a FK hit, since the subject who marks a blank F̄K̄ (No) is saying that he does not feel he knows the correct answer and does not expect to recognize it on the multiple-choice test. Consequently, a F̄K̄ hit would be scored for a wrong multiple-choice response and a F̄K̄ miss for a correct multiple-choice response. Ambiguity can be avoided if it is remembered that the terms "hit" and "miss" apply to the accuracy of the FOK judgments, not to the correctness of the multiple-choice responses.

For the overall test of the accuracy of the FOK judgments, FK hits should be compared to F̄K̄ misses. This comparison clearly indicates that more recall items judged FK are subsequently gotten correct on the recognition test than F̄K̄ items. A sign test (Siegel, 1956) shows the results to be statistically significant (p < .001, one tailed). On the basis of these results, the FOK experience appears to be at least a minimally accurate indicator of memory storage.

It would be interesting to know if the indicator is equally accurate for both FK and F̄K̄ judgments. Table 2 presents the data for FK and F̄K̄ hits and misses. It can be seen immediately that the FK results are highly significant (p < .001, one tailed), suggesting that the FK indicator is an accurate predictor of what is in storage. However, the F̄K̄ portion of Table 2 shows that subjects are not so accurate in predicting what is not in storage (p > .05, one tailed); many subjects tend to correctly recognize answers they did not feel they knew while taking the test of recall. If the F̄K̄ indicator were perfectly accurate, most of the subjects should have F̄K̄ miss proportions of about .25, the guessing
probability for four equally likely multiple-choice answers.

TABLE 2
FK AND F̄K̄ PROPORTIONS DIVIDED INTO HITS AND MISSES: EXPERIMENT I

            FK               F̄K̄
Ss     Hits   Misses    Hits   Misses
 1      .86    .14       .25    .75
 2      .82    .18       .73    .27
 3      .88    .12       .33    .67
 4      .90    .10       .54    .46
 5      .50    .50       .88    .12
 6     1.00    .00       .53    .47
 7     1.00    .00       .64    .36
 8      .60    .40       .43    .57
 9      .77    .23       .60    .40
10      .88    .12       .80    .20
11      .67    .33       .82    .18
12      .94    .06       .33    .67
13      .80    .20       .80    .20
14      .89    .11       .50    .50
15      .70    .30       .33    .67
16      .75    .25       .50    .50
17      .67    .33       .71    .29
18      .62    .38       .67    .33
19      .50    .50       .25    .75
20      .78    .22       .88    .12
21      .62    .38       .36    .64
22      .50    .50       .62    .38

Note. Ms are: FK Hits, .76, Misses, .24; F̄K̄ Hits, .57, Misses, .43.

The departure of the F̄K̄ miss proportion from the expected .25 led the investigator to look for possible artifacts. A perusal of the multiple-choice questions suggested one possibility immediately. Many of the multiple-choice alternatives did not seem equally likely; that is, a person ignorant of the correct answer but possessing some related knowledge of the field covered by the question might be able to narrow the alternatives down to three or two, thereby raising his guessing probability to .33 or .50. For example, in the sample question about the author of The Tempest, many subjects might be able to narrow their guesses to the English writers, Jonson and Shakespeare.

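The sign tests reported for Experiment I can be retraced with a short sketch. It uses the per-subject hit proportions as transcribed from Table 2 and derives each miss proportion as one minus the hit proportion. Two points are assumptions rather than statements from the paper: ties are dropped before the test, and the overall comparison pairs each subject's FK hit proportion with that subject's not-FK (feeling-of-not-knowing) miss proportion. The function names are illustrative only.

```python
from math import comb

# Per-subject hit proportions as transcribed from Table 2 (Ss 1-22).
FK_HITS = [.86, .82, .88, .90, .50, 1.00, 1.00, .60, .77, .88, .67,
           .94, .80, .89, .70, .75, .67, .62, .50, .78, .62, .50]
NOT_FK_HITS = [.25, .73, .33, .54, .88, .53, .64, .43, .60, .80, .82,
               .33, .80, .50, .33, .50, .71, .67, .25, .88, .36, .62]


def sign_test(pairs):
    """One-tailed sign test that the first member of each pair is larger.

    Ties are dropped (an assumption). Returns (plus, minus, p).
    """
    plus = sum(1 for a, b in pairs if a > b)
    minus = sum(1 for a, b in pairs if a < b)
    n = plus + minus
    # Probability of at least `plus` successes in n fair coin flips.
    p = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return plus, minus, p


def complement(hit):
    """Miss proportion implied by a hit proportion, to two decimals."""
    return round(1 - hit, 2)


# Overall test: FK hit proportion versus not-FK miss proportion.
overall = sign_test([(h, complement(n)) for h, n in zip(FK_HITS, NOT_FK_HITS)])
# Within-category tests: hits versus misses.
fk_only = sign_test([(h, complement(h)) for h in FK_HITS])
not_fk_only = sign_test([(h, complement(h)) for h in NOT_FK_HITS])
```

Under these assumptions, 20 of the 22 subjects fall in the predicted direction on the overall comparison, and the three p values land on the same side of the reported thresholds (below .001 for the overall and FK tests, above .05 for the not-FK test).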
EXPERIMENT II
Method

To correct for this possible artifact, 25 additional questions were carefully constructed to be used in a replication experiment. An effort was made to make the alternatives appear equally likely to the uninformed subject. Additionally, it was decided to see if subjects could make graded rather than dichotomous judgments about their feelings of knowing. This was done to see if the FOK experience operates as a simple Yes-No indicator of memory storage or if it operates at various graded strengths from definitely Yes down to definitely No. Aside from these two changes, the materials and procedure for Experiment II were identical to those of Experiment I.

Materials

The subjects in Experiment II were given 75 general-information questions: the 50 received by the Experiment I subjects, plus the 25 harder questions. A few sample questions are listed below:

Who painted "Afternoon at La Grand Jatte"? a. Monet b. Seurat c. Cezanne d. Dufy
What sea does West Pakistan border? a. Arabian Sea b. Caspian Sea c. Red Sea d. Black Sea
Who developed the nonsense syllable in studies of learning? a. Ebbinghaus b. Hull c. Pavlov d. Wundt

Procedure

The 16 subjects in Experiment II were drawn from the same subject pool as those in Experiment I. They received the same general instructions and listed their answers to the recall and recognition tests on similar answer forms. Only this time they were asked not merely to check in either the Yes or No columns next to the unanswered questions but to enter a rating of their feeling of knowing or not knowing. The rating was made with the following scale (the scale was drawn on a blackboard in front of the group):

Feeling-of-Knowing Rating

        Yes                  No
  6      5      4      3      2      1
Very                               Very
strongly                           strongly

Special instructions were added to explain the use of this scale:

"You will notice that next to the answer box for every question there are two columns. On every question that you cannot answer you should make a rating in either the Yes column or the No column. This is the way the ratings should be assigned [illustrate on the blackboard].

"If you cannot answer the question but feel that you know it, place a 4, 5, or 6 in the Yes column adjacent to the unanswered question. If you cannot answer the question and feel that you do not know the correct answer, place a 3, 2, or 1 in the No column. A 6 in the Yes column means that you feel very strongly that you know the correct answer even though you can't remember it at the moment. If you place a 1 in the No column, that means you feel very strongly that you don't know the correct answer. The intermediate ratings should be used for less definite feelings of knowing or not knowing."

The criterion question that subjects were to ask themselves before making a rating was explained in the same words used with the subjects in the first experiment.

Results

Table 3 contains a breakdown of the recall and recognition responses for the 75-item tests. The data were first analyzed by treating the FOK ratings about unanswered items as dichotomous Yes or No judgments to correspond to those in Experiment I.

TABLE 3
RECALL AND RECOGNITION SCORES: EXPERIMENT II

Tests                 Mean     SD
Recall
  Correct             29.7    7.5
  Incorrect           45.3    7.5
    Errors             4.8    2.9
    Blanks            40.5    8.2
      FK              15.4    4.9
      F̄K̄              25.1    6.7
Recognition
  Correct             51.6    5.4
  Incorrect           23.4    5.4

Note. Seventy-five-item tests, N = 16.

Table 4 shows that the overall test of the accuracy of the FOK judgments (FK hits compared to F̄K̄ misses) confirms the previous findings. For every subject, more FK items are subsequently recognized than F̄K̄ items (p < .001, one tailed).

TABLE 4
FK AND F̄K̄ PROPORTIONS DIVIDED INTO HITS AND MISSES: EXPERIMENT II

            FK               F̄K̄
Ss     Hits   Misses    Hits   Misses
 1      .83    .17       .56    .44
 2      .61    .39       .60    .40
 3      .71    .29       .53    .47
 4      .69    .31       .63    .37
 5      .62    .38       .63    .37
 6      .67    .33       .53    .47
 7      .28    .72       .81    .19
 8      .67    .33       .70    .30
 9      .68    .32       .55    .45
10      .83    .17       .68    .32
11      .59    .41       .62    .38
12      .75    .25       .67    .33
13      .56    .44       .56    .44
14      .69    .31       .71    .29
15      .62    .38       .55    .45
16      .70    .30       .66    .34

Note. Ms are: FK Hits, .66, Misses, .34; F̄K̄ Hits, .62, Misses, .38.

The breakdown of the FK and F̄K̄ items into hits and misses shows in Table 4 that, as before, the FK indicator is an accurate indicator of memory storage. This time, however, the F̄K̄ indicator is shown to be an equally accurate indicator of what is not in storage. Indeed, in Table 4 there are no reversals from the predicted direction of hits over misses. This significant result in Experiment II and the absence of a significant F̄K̄ result in Experiment I can be attributed to the inclusion of harder questions in Experiment II, questions that would give the subjects more definite feelings of not knowing when they were unable to come up with an answer, and to the inclusion of multiple choices on the recognition test that would seem equally likely to an uninformed subject. Yet even though the F̄K̄ result is significant, it is clear that the obtained proportion of F̄K̄ misses departs from what would be expected if the subjects were simply guessing.

Perhaps this departure from the expected F̄K̄ proportion can be explicated by looking at the data in terms of the ratings rather than the Yes-No dichotomization. Table 5 shows a breakdown of the data across ratings for the proportion of items correct in each rating category from 1 to 6. In the terminology used earlier, 1, 2, and 3 would correspond to F̄K̄ misses and 4, 5, and 6 to FK hits. Two proportions are given under each rating category; the first was obtained by summing across the proportions for subjects and then calculating the mean of the summed proportions, the second by summing the items correct in each category across subjects and then dividing by the total
number of items in that category to obtain a proportion.

TABLE 5
PROPORTIONS FOR ITEMS CORRECT SUBDIVIDED ACCORDING TO FOK RATINGS

                                          Ratings
Method                            1     2     3     4     5     6
Means of summed proportions      .30   .53   .48   .54   .57   .78
Proportions of summed scores     .30   .53   .50   .60   .61   .75

Note. Two-tailed z tests show that all the proportions are significantly different from the chance level (.25), p < .005, except the values under Rating 1 (.30), for which p > .05.

The second set of proportions probably comes closer to the true values that would have been obtained if many more items had been included, say 200-300 instead of 75. The mean proportions are likely to be unstable because, for the middle categories, some of the cell entries for individual subjects were very small; the procedure of taking the mean of the proportions weights each cell equally, and consequently an entry for one subject based upon only one item is weighted just as heavily as an entry for another subject based upon 10 items. Descriptively, the mean proportions seem to be graded into three steps from 6 (definitely Yes) to 5, 4, 3, 2 (maybe) to 1 (definitely No), while the summed proportions appear to be graded into four steps from 6 (definitely Yes) to 5 and 4 (maybe Yes) to 3 and 2 (maybe No) to 1 (definitely No). Neither of these descriptive trends is statistically validated, however, since the data are too variable and the cell entries are too small. Only the proportions under Rating 1
and Rating 6 are significantly different. A t test for correlated scores yields t = 3.38, p < .01 for the mean difference .30-.78. All the other differences could occur with a p > .05. Consequently, these descriptive trends are mentioned only as possibilities to look for in later research.

What is clear from the rating-scale data is that when subjects feel definitely that they do not know an answer, they score on those items at about the chance recognition level; when they feel definitely that they do know the answer, their scores are three times the chance level.

DISCUSSION

The results of Experiments I and II certainly indicate that FOK judgments can be used as relatively accurate indicators of what is and is not in memory. Earlier, in the introduction, the claim was made that the most important and interesting question to ask about FOK experiences is: "Are they accurate?" Now that the results have given a "yes" answer to this question, it is pertinent to look at the question again and ask: "Why is it important that FOK experiences be accurate? And, if they are, so what?" Is the FOK phenomenon anything more than a curious and common subjective experience? Does the experience have any usefulness in daily life?

Beginning answers can be made to these questions by recognizing that human beings function cognitively as information-processing systems, systems with enormously flexible but fallible storage and retrieval capabilities. It is this fact of memory fallibility that makes FOK experiences important. If human beings were computers with infallible memories so that they could always immediately retrieve memory contents, then they would not need FOKs. A memory item, if in storage, would be retrieved; failure to retrieve an item would mean that it was not in storage.

Clearly, however, human beings do not operate this way: retention often exceeds recall. The repeatedly observed discrepancy between common measures of retention (recall, reconstruction, recognition, and savings) shows that there is not a simple one-to-one relationship between storage and recall in the human memory system. The phenomenon of reminiscence is another example showing that what is immediately retrieved may not be an adequate indicator of what is retained.2

For a fallible memory system, a system in which what is retrieved does not completely mirror what is stored, the FOK experience can serve a useful function. It can serve as an indicator of what is stored in memory when the retrieval of a memory item is temporarily unsuccessful or interrupted. If the indicator signals that an item is not in storage, then the system will not continue to expend useless effort and time at retrieval; instead, input can be sought that will put the item into storage. Or, if the indicator signals that an item is in storage, then the system will avoid redundantly inputting information that is already [in storage].

But to be useful a FOK indicator must be accurate. The determination of the accuracy of FOK-derived FK and F̄K̄ judgments is basic to an evaluation of the utility of FOK experiences because an inaccurate FOK indicator would add nothing to the efficiency of the fallible human memory system. Indeed, the operation of an inaccurate indicator would increase the system's fallibility and produce extremes of inefficiency. With an inaccurate indicator a person would persist in trying to remember what he had never learned or had forgotten, and he would not persist in efforts to remember information that, with perhaps only a few more tries or a rest or a new kind of retrieval search, would eventually be remembered.

The important finding of the investigations reported is that FOK experiences are relatively accurate indicators of memory storage. Subjects can, by referring to their FOK experiences, make accurate judgments about which items they will be able to remember and which they will not. Of course, their predictions are not perfect and there are individual differences among subjects in accuracy, but even so the FOK phenomenon seems powerful and reliable.

Since the FOK experience or phenomenon refers essentially to the FK or Yes state of the indicator, it might be better to introduce a more general term to describe the overall process whereby FK and F̄K̄ judgments are made about unretrieved memory items. Hereafter, the phrase "memory-monitoring process" will be used to refer to the intervening process producing FK and F̄K̄ judgments. Operationally, within the RJR paradigm described, memory-monitoring accuracy is measured by counting how well subjects predict which answers they will and will not recognize after they have been unable to answer the questions by recall.

2 Any of the standard sources on human learning, for example McGeoch and Irion (1952) and Hovland (1951), contain discussions, data, and references about comparisons between different measures of retention and about the phenomenon of reminiscence.

Although the memory-monitoring process has been introduced and defined with reference to the RJR paradigm, it is clear that other paradigms might be applied to operationally bracket this phenomenon. For example, after subjects have made FOK judgments about items they cannot retrieve, they could be asked to continue efforts to recall the unretrieved items: more FK items should eventually be retrieved than F̄K̄ items; or, if cues are introduced one at a time about sought-after items, fewer cues should be required for the identification of FK than F̄K̄ items.

The phrase memory monitoring was chosen to describe the process intervening between recall and recognition because it seems descriptively meaningful yet theoretically neutral. When subjects make FK and F̄K̄ judgments, they, in some way, monitor or check what they do remember to arrive at a decision about what they might remember. Memory monitoring refers only to the intervening process, the process which leads to FK and F̄K̄ responses whose accuracy can be objectively evaluated within the RJR paradigm. How the monitor works, the neural mechanisms necessary to accomplish memory monitoring, the variables that affect monitoring accuracy: these are unknowns. But even with so many unknowns, it is clear that questions about FOK experiences are not silly questions. The memory-monitoring process appears to be an important process, contributing significantly to the efficiency of the human information-processing system.

REFERENCES

HOVLAND, C. I. Human learning and retention. In S. S. Stevens (Ed.), Handbook of experimental psychology. New York: Wiley, 1951. Pp. 613-687.
JAMES, W. The principles of psychology. New York: Dover, 1950.
MCGEOCH, J. A., & IRION, A. L. The psychology of human learning. (2nd ed.) New York: Longmans, Green, 1952.
SIEGEL, S. Nonparametric statistics. New York: McGraw-Hill, 1956.
WOODWORTH, R. S., & SCHLOSBERG, H. Experimental psychology. (Rev. ed.) New York: Holt, 1954.

(Received March 23, 1965)
