air-conditioned room which contained only the                 Immediately after the training phase, one-third of the
equipment necessary to display the stimulus and               items of each training condition received five successive
response (on training trials) or the stimulus (on test        T trials, one-third a single T trial, and one-third no T
trials), and equipment on which S made his response           trials. After a 24-hour interval, all items were tested on
on test trials. All control and recording apparatus was       four successive T trials.
located in an adjoining room. The S sat in a chair 5.5 ft.
away from the display box, on the front of which were
                                                              Procedure
an upper panel and a lower panel, identical except for           At the beginning of the session on Day 1, E briefly
position. The size of each panel was 2 × 12 in. Eight         explained what paired-associate learning is, what types
letters or digits could be displayed in each panel. The       of trials would be used, and how to respond on T trials.
three letter positions of the left side of the upper panel    The S was told to repeat aloud the stimulus-response
were used for the stimulus presentations on both              pair appearing on the screen as often as possible on
training and test trials. The response digits were shown      R trials. It was emphasized that S should respond as
on the two letter positions of the right side of the upper    quickly as possible on T trials since latencies would be
panel on training trials. The lower panel was not used        recorded. The S was also told that there would be a
in this experiment.                                           short break of 5-10 sec. between the last R trial and
    On the table in front of S's chair was a horizontal       the first T trial and that he should guess if he did not
panel containing two columns of response keys, each           know the correct answer on a T trial since the next T
-~ × 1 in. in size. Each column contained 10 response         trial would not begin until a response had been made.
keys numbered 0-9 in ascending order, the 0 being                The session began with 10 cycles of 18 R trials per
closest to S. After the stimulus had appeared on the          cycle; nine items appeared in all 10 cycles, nine
 display panel on a test trial, S indicated his response by   appeared in only the first 5 cycles, and nine appeared
 depressing first one key in the left-hand column and         in only the last 5 cycles. The order within each cycle
 then one in the right-hand column, thus generating a         was randomized. Each stimulus-response pair appeared
 two-digit number. As soon as S depressed a key, it was       on the screen for approximately 2 sec., followed by an
 illuminated until the response had been recorded             inter-trial interval of approximately 1 sec.
 automatically in the adjoining room.                            After the short break, the T trials began. The
    The stimulus members of the paired associates were        stimulus members of six of the items in each of the
 27 three-letter English nouns, taken from the highest        three training conditions were presented in random
 frequency category in the Thorndike-Lorge tables of          order; S was given as much time as he desired to
 word frequency. The single set of 27 two-digit response      respond, and his response and its latency, to the nearest
 numbers was selected randomly, subject to the restric-        .01 sec., were recorded. No information concerning
 tions used by Izawa (1966).                                  correctness of responses was ever given once the T
    Subgroups of four Ss received the same pairings of        trials had begun. Three items from each training
 stimuli and responses. For each subgroup the stimuli         condition appeared on four additional cycles of
 were assigned randomly to the conditions, as were the        randomly ordered T trials.
 responses; therefore, the pairings of stimuli and                At the end of the session on Day 1, S was told to
 responses were also random. With nine conditions,             return 24 hr. later for "more of the same." He was not
 there were three stimulus-response pairs per condition       told whether the second day's session would use the
 for each S.                                                   same materials or not.
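
The random assignment described above (27 stimuli and 27 responses paired at random, three pairs per condition) can be sketched as follows. This is only an illustrative sketch: the function name `assign_pairs`, the placeholder stimulus labels, and the seeds are assumptions, and the placeholder response numbers do not apply the restrictions of Izawa (1966).

```python
import random

def assign_pairs(stimuli, responses, n_conditions=9, seed=None):
    """Randomly pair stimuli with responses and split the pairs evenly
    across conditions (three pairs per condition for 27 items)."""
    if len(stimuli) != len(responses):
        raise ValueError("need one response per stimulus")
    if len(stimuli) % n_conditions != 0:
        raise ValueError("items must divide evenly among conditions")
    rng = random.Random(seed)
    # Shuffling both lists independently makes the pairing itself random.
    shuffled_stimuli = rng.sample(stimuli, len(stimuli))
    shuffled_responses = rng.sample(responses, len(responses))
    pairs = list(zip(shuffled_stimuli, shuffled_responses))
    per_condition = len(pairs) // n_conditions
    return {
        cond: pairs[cond * per_condition:(cond + 1) * per_condition]
        for cond in range(n_conditions)
    }

# Placeholder materials (hypothetical, not the original word list).
stimuli = [f"noun{i:02d}" for i in range(27)]
responses = [f"{n:02d}" for n in random.Random(1).sample(range(100), 27)]
assignment = assign_pairs(stimuli, responses, seed=4)
```

Since subgroups of four Ss shared the same pairings, one call per subgroup (each with its own seed) would correspond to the procedure as described.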
                                                                  At the beginning of the session on Day 2, S was told
Design                                                        that he would be tested on what he had learned on the
   Training and testing involved two types of trials,         previous day. The method of responding was briefly
which will henceforth be denoted R and T trials. An            reviewed. The session comprised 4 cycles of T trials,
R trial on any item was a paired presentation of its           each cycle being a random sequence of the 27 stimuli
stimulus and response members. A T trial was a recall          used in the experiment. Responses and latencies were
test, on which the stimulus member was presented               recorded in the same manner as on Day 1.
alone, S attempted to give the correct response, and
no informative feedback was provided.
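
The trial schedule given in the Procedure can be laid out as a short sketch. The names `build_schedule`, `items`, and the group keys "all", "first", and "last" are assumptions introduced for illustration. The sketch also assumes that, within each training group, the first three items receive no Day 1 test, the next three a single test, and the last three five tests.

```python
import random

def build_schedule(items_by_training, seed=None):
    """Return the R-trial and T-trial cycles for Day 1 and Day 2.

    items_by_training maps 'all', 'first', 'last' to nine item ids each;
    positions 0-2 get no Day 1 test, 3-5 one test, 6-8 five tests.
    """
    rng = random.Random(seed)

    def shuffled(seq):
        seq = list(seq)
        rng.shuffle(seq)  # random order within every cycle
        return seq

    # Day 1 training: 10 cycles of 18 R trials each.
    r_cycles = []
    for cycle in range(1, 11):
        half = "first" if cycle <= 5 else "last"
        active = items_by_training["all"] + items_by_training[half]
        r_cycles.append(shuffled(active))

    groups = list(items_by_training.values())
    one_test = [item for group in groups for item in group[3:6]]
    five_test = [item for group in groups for item in group[6:9]]
    every_item = [item for group in groups for item in group]

    # Day 1 testing: one cycle of 18 items, then four cycles of 9 items.
    day1_t = [shuffled(one_test + five_test)]
    day1_t += [shuffled(five_test) for _ in range(4)]
    # Day 2 testing: four cycles over all 27 stimuli.
    day2_t = [shuffled(every_item) for _ in range(4)]
    return {"day1_R": r_cycles, "day1_T": day1_t, "day2_T": day2_t}

items = {
    "all": [f"A{i}" for i in range(9)],
    "first": [f"F{i}" for i in range(9)],
    "last": [f"L{i}" for i in range(9)],
}
schedule = build_schedule(items, seed=0)
```

Counting trials per item in `schedule["day1_R"]` gives 10 presentations for the "all" group and 5 for each of the other two groups, matching the three training levels.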
   Three levels of training and three of initial testing                                    RESULTS
were combined factorially to determine the nine
experimental conditions for Day 1. Three items were           Error Data
assigned to each condition for each S. During the                   Proportions of errors for the group of 40
training phase of Day 1, items in Condition 10 were
                                                               Ss, by trials within each day and condition,
presented on 10 R trials; items in Condition 5F and 5L
were presented on 5 R trials, all in the first half, or all    are summarized in Table 1. In view of the
in the second half of the training phase, respectively.        randomization procedures, we can assume
                                  EFFECTS OF RECALL ON RETENTION                                         465
that the items receiving no tests within each            trials of Day 2 cannot be as satisfactorily
training condition on Day 1 would, if tested,            evaluated since the analysis must be based on
have yielded error proportions approximately             error scores with a range of only 0-3 (that is,
equal to those occurring on the first T trial            the number of errors per S on any one trial
of Day 1 for 1T and 5T items. Consequently               over the three items assigned to the given
it is apparent that there was a very large               combination of training and previous testing
retention loss from Day 1 to Day 2 for items             conditions). Nonetheless this analysis was
which were not tested on Day 1 and that this             done and the variation over Day 2 T trials
loss was largely independent of the training             proved significant at the .01 level. It does not
condition. Further, this overnight retention             seem, however, that this trend can represent
loss was substantially reduced by the effect of          learning in the same sense as the effects of
a single T trial after training on Day 1 and was         original training trials or the effects of Day 1
almost completely eliminated by a sequence               tests upon long-term retention. It will be
of five T trials after training on Day 1.                noted that the decrease in error proportions
                                                  TABLE 1
                             ERROR PROPORTIONS BY TRIALS AND CONDITIONS
   Since each cell in Table 1 represents 120             over T trials on Day 1 for the 5T conditions
observations, there is little doubt that the             is followed by an overnight regression and
principal trends discernible in the table are            then a decrease which in no case goes appre-
reliable. To provide additional evidence on this         ciably below, and in most cases does not reach
point, we conducted an analysis of variance              the terminal level of Day 1.
of the Day 2 test data, taking as the score for             With regard to the long-term effects shown
each subject, on each combination of training            in Table 1, it is of special importance to note
and previous testing conditions, the total               that R trials and Day 1 T trials, though both
number of errors made over the four T trials             increase long-term retention, are by no means
of Day 2 on the three items assigned to that             interchangeable. For example, a combination
combination of conditions. Effects of training           of five R trials and five Day 1 T trials yields
conditions and number of Day 1 T trials                  a much lower error probability on Day 2 than
were significant well beyond the .01 level but           ten R trials with no Day 1 T trials. This last
the interaction of these two variables was not           statement holds, of course, only for com-
significant. The significance of a slight down-          parison of the ten with the 5L training condi-
ward trend in error proportions over the T               tion. On the basis of the present experiment
466                                 ALLEN, MAHLER, AND ESTES
 response members of items which were                  on Day 2, whereas the corresponding percent-
 learned during the R series on Day 1.                 age for 1T items was only 18.
    A different, though not independent, way             Thus, there was substantially more stereo-
 of examining retention loss over the 24-hour         typy of response over the retention interval
 interval is to compute the relative frequencies       after five Day 1 tests than after a single Day 1
 with which Ss switched from correct responses        test and the effect was of about the same mag-
 on Day 1 tests to errors on Day 2 tests for          nitude for errors as for correct responses. The
 particular items in the 1T and 5T conditions.        more often an item is tested immediately after
 These data are presented in the upper half of        training, the more likely it is that whatever
Table 3 in terms of the conditional proportions       response is made to the item on the T trials
 of errors on the first trial of Day 2 for items on   will be repeated after a long retention interval.
which the final test of Day 1 yielded a correct       It might be remarked in this respect that exam-
response. This index of retention loss, though        ination of the individual protocols reveals a
only weakly related to training conditions, is        large number of instances in which Ss settled
very substantially affected by number of              upon a particular error for a particular item in
Day 1 T trials, the level for items having had        the 5T condition and made this error repeat-
five Day 1 tests being less than half of that for     edly over the later trials of the test series and
items receiving only one Day 1 test. The              also a considerable frequency of the same
proportions of errors on the first trial of Day 2     phenomenon for all conditions over the
given an error on the same item on the last test      sequence of tests on Day 2. In view of the fact
of Day 1, shown in the lower half of Table 3,         that this type of stereotypy on Day 2 is
are not entirely comparable since they do not         substantially related to the number of previous
reveal the extent to which Ss may have changed        tests on Day 1, we evidently must conclude
from one erroneous response to another on a           that the results arise at least in part from some
given item over the interval. A further break-        form of learning which occurs on the T trials
down of the conditional error frequencies in          and not simply from pre-existing associations
this respect again shows a large effect of Day 1      between stimulus and response members of
tests; for 50% of the 5T items on which errors        the items.
occurred on the last test of Day 1, the same
error recurred for the same item at least once

                      TABLE 3
CONDITIONAL PROPORTION OF ERRORS ON TRIAL 1, DAY 2,
GIVEN CORRECT OR ERROR ON LAST TRIAL, DAY 1

                       Number of      Number of Day 1
                       training         test trials
                        trials          1        5

Error given correct       10           .16      .06
                          5L           .24      .10
                          5F           .20      .09

Error given error         10           .62a     .67a
                          5L           .91a     .82a
                          5F           .83      .91

  a N < 40.

Latency Data
   Response latencies, computed separately for correct responses
and errors, are presented in Table 4 for each trial in relation to
training and testing conditions. It does not seem feasible to do any
overall statistical analysis of the correct and error latencies
separately in view of the large variation in number of observations
from cell to cell. However, an analysis of variance for the pooled
mean response latencies on Day 2 as a function of training and Day 1
testing conditions shows all of the main effects and the interaction
of these variables to be significant far beyond the .01 level. Since
differences between means for the various training and testing
conditions are of about the same order of magnitude for the correct
and error latencies, it seems safe to
                                                 TABLE 4
                              MEAN LATENCIES BY TRIALS AND CONDITIONS
conclude that all of the principal trends                in trends between latencies and error fre-
discernible in Table 4 for the correct and               quencies are that the former exhibit a rather
error latencies separately are quite reliable.           greater decline within days and a greater
   An overall pattern which may prove to be of           increase from the end of Day 1 to the beginning
major theoretical significance is to be seen in          of Day 2 than might have been expected from
the ordering of mean latencies at the beginning          the frequency data, and a somewhat greater
of Day 2 as a function of training and Day 1             convergence over the T trials of Day 2 for the
testing conditions. This order, for both correct         mean latencies representing different Day 1
and error latencies, closely parallels the order-        testing conditions.
ing for error proportions seen in Table 1. As
was found for the error proportions, there is                                      DISCUSSION
relatively little variation in Day 2 latencies as a         There seems to be no doubt concerning the
function of training conditions, but major               answer to the principal question at issue in
variation in relation to number of Day 1 tests.          this study. Relatively long-term retention of
The similarity in pattern for the correct and            paired associates is substantially influenced
error latencies is quite striking; it may be             by recall tests given immediately after training.
 noted that, throughout the table, mean error            Within any sequence of trials, the immediately
 latency for any combination of conditions is            observable effect of a T trial is a reduction in
 approximately twice the corresponding cor-              response latency, and this reduction seems to
 rect response latency. The principal differences        be of about the same magnitude regardless of
whether the response is correct or incorrect.         associate learning which was suggested on the
The decline in latency which occurs over a            basis of a quite different type of evidence in an
series of closely spaced T trials is followed         earlier study by Estes & Da Polito (1967).
by regression to a point intermediate between         The principal basis for the distinction in that
the initial and terminal levels of the first series   study was the finding that the amount of infor-
after a 24-hour rest interval, then during a          mation stored in memory, as measured by a
second test series by another decline to              recognition test, was approximately equal after
approximately the same terminal level.                intentional versus incidental training proce-
Analysis of different types of errors suggests        dures, whereas recall performance was drasti-
that learning that occurs on T trials operates        cally impaired after incidental training. In
to prevent failures of retrieval, and thus to         the present study it appears that the avail-
lower the incidence of intrusion errors, on           ability, or retrievability, of the response
subsequent tests, but has little effect on the        member of a paired-associate item increases
incidence of confusions. It might be remarked         as a direct function of its frequency of occur-
that all of the principal trends in the present       rence on T trials. This conception would
data agree with those observed in an unpub-           account, not only for the effects of early
lished pilot study conducted by the writers,          T trials on long-term retention, but for the
which was of similar scope but utilized some-         parallel changes in correct and error latencies,
what more difficult material, and with the            and for the similar effects of T trials on
results of a recent study by Mahler (1968) that       stereotypy of correct responses and errors.
utilized a short period of interpolated learning      Whether learning in the sense of an increase
rather than an overnight interval between the         in the long-term retrievability of the response
initial and terminal test series.                     member of a paired-associate item occurs also
   Perhaps the most parsimonious interpreta-          on paired-presentation trials is not clear.
tion of the learning which occurs on unrein-          Pending more direct evidence, the simplest
forced recall tests would be that it is basically     interpretation would seem to be that it does
the same as that occurring on paired-presen-          not occur on paired presentations per se but
tation training trials, the only difference being     may occur during rehearsal immediately
that on tests the occurrence of the response          after paired-presentation trials, though in less
member of the paired-associate item is under          effective fashion than on recall tests when the
S's control. Evidently this simple interpreta-        stimulus member of the item is present.
tion is not adequate, however, for R and T               The trends in our latency data differ in one
trials prove not to be interchangeable in their       major respect from those reported by Eimas
effects on retention as measured either by error      and Zeaman (1963). Whereas in both studies
probabilities or latencies. For example, it is        correct response latency decreased substan-
clear in both Table 1 and Table 4 that five R         tially over successive T trials, in our data
trials plus five immediate T trials produce           error latency decreased similarly but in Eimas
long-term retention much superior to that             and Zeaman's data error latency was virtually
observed after ten R trials with no immediate         constant. The only plausible explanation that
tests. Even more strikingly, the addition of a        has occurred to us has to do with the categor-
single test after ten R trials reduces error          ization of errors as repetitive or nonrepetitive.
frequency after a 24-hour interval by 50% as          In Eimas and Zeaman's study, there were only
compared to ten R trials without the                  a few cases in which a subject made the same
immediate test.                                       incorrect response to a given item on succes-
   The pattern of results appears to fit in           sive T trials, and in this portion of the data
rather well with the distinction between              mean latency decreased slightly from the
storage and retrieval processes in paired-            first to the second T trial. For the remainder
of their error data, involving nonrepetitive errors, latency was
constant over tests.
   A similar breakdown of our Day 2 latency data (pooled over
conditions to obtain adequate N's) is presented in Table 5. For the
cases in which the same error occurred on all four tests, latency
declined fully as steeply as for correct responses. But for the items
represented in the third row of Table 5, on which Day 2 responses
were all incorrect but not all the same error, the function is
appreciably shallower. Further, if these means are converted to
reciprocals, the change from the first to the second test for the
"Different Error" category is very slight. Thus there may actually be
no appreciable disparity between the corresponding trends in these
two studies.

                            TABLE 5
MEAN DAY 2 LATENCIES FOR ITEMS WITH ALL CORRECT RESPONSES,
ALL SAME ERRORS, OR DIFFERENT ERRORS

                               Trial
Type        N        1       2       3       4

All C      620     2.20    1.74    1.64    1.55
Same E      54     4.07    3.05    2.70    2.26
Diff. E    406     4.36    3.79    3.37    2.93

   The substantially steeper decline in latencies for repetitive
errors seems to fit well with the assumption that, to a major extent,
paired-associate latency reflects the state of retrievability of the
stimulus-response association. Other things equal, retrievability of
a given response varies directly with frequency and recency of
evocation of that response (whether correct or incorrect) by the
given stimulus, and is to be distinguished from the notion of
response availability (Horowitz, Norman, & Day, 1966; Underwood,
Runquist, & Schulz, 1959) which is not stimulus-specific.

                          REFERENCES
BUTLER, D. C., & PETERSON, D. E. Learning during "extinction" with
    paired associates. Journal of Verbal Learning and Verbal
    Behavior, 1965, 4, 103-106.
EIMAS, P. D., & ZEAMAN, D. Response speed changes in an Estes'
    paired-associate "miniature" experiment. Journal of Verbal
    Learning and Verbal Behavior, 1963, 1, 384-388.
ESTES, W. K., & DA POLITO, F. Independent variation of memory storage
    and retrieval processes in paired-associate learning. Journal of
    Experimental Psychology, 1967, 75, 18-26.
GOSS, A. E., MORGAN, C. H., & GOLIN, S. J. Paired-associates learning
    as a function of percentage of occurrence of response members
    (reinforcement). Journal of Experimental Psychology, 1959, 57,
    96-104.
HOROWITZ, L. M., NORMAN, S. A., & DAY, R. S. Availability and
    associative symmetry. Psychological Review, 1966, 73, 1-15.
IZAWA, C. Reinforcement-test sequences in paired-associate learning.
    Psychological Reports, 1966, 18, 879-919.
MAHLER, W. A. Effects of study and test trials on retention of paired
    associates. Unpublished Masters Thesis, Stanford University, 1968.
RICHARDSON, J., & GROPPER, M. S. Learning during recall trials.
    Psychological Reports, 1964, 15, 551-560.
UNDERWOOD, B. J., RUNQUIST, W. N., & SCHULZ, R. W. Response learning
    in paired-associate lists as a function of intralist similarity.
    Journal of Experimental Psychology, 1959, 58, 70-78.

(Received December 11, 1968)
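
The reciprocal conversion mentioned in connection with Table 5 can be checked with a few lines of arithmetic. The sketch below takes the Table 5 means as given and assumes they are in seconds, so that each reciprocal is a response speed per second. The names `table5`, `to_speeds`, and `first_step` are illustrative only.

```python
# Mean Day 2 latencies from Table 5, Trials 1-4 (assumed to be in seconds).
table5 = {
    "All C": [2.20, 1.74, 1.64, 1.55],
    "Same E": [4.07, 3.05, 2.70, 2.26],
    "Diff. E": [4.36, 3.79, 3.37, 2.93],
}

def to_speeds(latencies):
    """Convert mean latencies to reciprocals (response speeds)."""
    return [1.0 / t for t in latencies]

speeds = {row: to_speeds(values) for row, values in table5.items()}

# Change in reciprocal latency from the first to the second test.
first_step = {row: s[1] - s[0] for row, s in speeds.items()}

for row in table5:
    cells = " ".join(f"{s:.3f}" for s in speeds[row])
    print(f"{row:8s} {cells}   change 1->2: {first_step[row]:+.3f}")
```

On these figures the first-to-second-test change in reciprocal latency is about .12 for all-correct items, about .08 for same-error items, and about .03 for different-error items, which is the sense in which the change for the "Different Error" category is very slight.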