Learning and perception both presuppose a capacity for memory.

  • Learning is a relatively permanent change in behavior resulting from experience, and memory records both the experience and the new behavior.
  • Perception combines information extracted from the stimulus environment (features and patterns that provide the stimulus basis for perception) with information retrieved from memory (knowledge and beliefs about the world that provide the cognitive basis for perception).

Perceptual activity, in turn, changes the contents of memory: In a very real sense, memory is a byproduct of perception. The records of perceptual activity, known as memory traces or engrams, influence behavior long after the perceptual stimulus has disappeared.

Thus, in order to understand learning and perception we have to understand how knowledge is encoded, stored, and retrieved in the information-processing system we call the human mind.

Memory consists of mental representations of knowledge, stored in the mind. This knowledge comes in a variety of forms:

  • facts (e.g., whether quartz is harder than granite; when Martin Luther nailed 95 theses to the church door; your dormitory telephone number);
  • language (e.g., the meaning of the word "obstreperous"; how to indicate that something was done without taking personal responsibility for it; whether the sentence, "The stole girl poor a coat warm" is grammatically correct);
  • rules (e.g., how to take the square root of a number; the best route from the Union to Tolman Hall; how to tie a half-hitch knot);
  • personal experiences (e.g., what you had for dinner last Friday night; the last time you behaved irresponsibly; the last number you looked up in a telephone directory).

In answering these questions, we must consult our memories. But not all questions are of the same kind, which raises the issue of whether there are different kinds of memory.

The Multistore Model of Memory

One distinction among memories is temporal -- how long the memory lasts. This distinction is based on the intuition that remembering an unfamiliar telephone number that you've just looked up is somehow different from remembering your own telephone number. One memory is permanent; the other is gone almost instantly.

Until recently, it was common to distinguish among three different types of memory -- three different storage structures that held information temporarily or permanently. This idea is known as the multistore model of memory. The multistore model was originally proposed by Waugh & Norman (1965) and Atkinson & Shiffrin (1968), based on a computer metaphor of the mind -- although its roots go back to William James (1890). All versions of the multistore model posit the existence of a number of different storage structures (hence the name), as well as a set of control processes which move information between them, transforming it along the way.

  • Incoming sensory information is initially held in the sensory registers.
  • By means of attention, some information is transferred to short-term memory.
  • Information is maintained in short-term memory by means of rehearsal.
  • If the information in short-term memory receives enough rehearsal, it is copied into long-term memory by means of certain encoding processes.
  • Information can be transferred from long-term memory back to short-term memory by means of certain retrieval processes.
  • Information in long-term memory is also used to support pattern-recognition processes in the sensory registers.
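The flow of information just listed can be sketched as a toy program. This is only an illustration under stated assumptions, not anyone's published simulation: the class name, the method names, the capacity of 3 used in the demonstration, and the "2 rehearsals are enough" rule are all hypothetical choices made to keep the example small. The last point in the list (long-term memory supporting pattern recognition) is not modeled.

```python
from collections import deque


class MultistoreMemory:
    """Toy sketch: sensory register -> short-term memory -> long-term memory."""

    def __init__(self, stm_capacity=7, rehearsals_needed=3):
        self.sensory_register = []   # holds all current input
        self.short_term = deque()    # limited-capacity store, oldest item first
        self.long_term = set()       # permanent store
        self.rehearsal_counts = {}   # rehearsals received by items in short-term memory
        self.stm_capacity = stm_capacity            # hypothetical parameter
        self.rehearsals_needed = rehearsals_needed  # hypothetical parameter

    def sense(self, items):
        # New input replaces whatever the register held before.
        self.sensory_register = list(items)

    def _enter_short_term(self, item):
        if item in self.short_term:
            return
        if len(self.short_term) >= self.stm_capacity:
            displaced = self.short_term.popleft()  # oldest item is displaced
            self.rehearsal_counts.pop(displaced, None)
        self.short_term.append(item)
        self.rehearsal_counts[item] = 0

    def attend(self, item):
        # Attention transfers an item from the register to short-term memory.
        if item in self.sensory_register:
            self._enter_short_term(item)

    def rehearse(self, item):
        # Enough rehearsal copies the item into long-term memory.
        if item not in self.short_term:
            return
        self.rehearsal_counts[item] += 1
        if self.rehearsal_counts[item] >= self.rehearsals_needed:
            self.long_term.add(item)

    def retrieve(self, item):
        # Retrieval copies an item from long-term back into short-term memory.
        if item in self.long_term:
            self._enter_short_term(item)


memory = MultistoreMemory(stm_capacity=3, rehearsals_needed=2)
memory.sense(["A", "B", "C", "D"])
for letter in ["A", "B", "C"]:
    memory.attend(letter)
memory.rehearse("A")
memory.rehearse("A")   # "A" has now been copied into long-term memory
memory.attend("D")     # short-term memory is full, so "A" is displaced
print(list(memory.short_term), memory.long_term)
```

In this run "A" is lost from short-term memory but survives in long-term memory, from which it could be brought back with `memory.retrieve("A")`.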

The multistore model of memory quickly became so popular that it became known as the "modal model" of memory, meaning that it was the view most frequently adopted by memory theorists. However, the precise vocabulary of the multistore model varied from theorist to theorist.

  • Some theorists talked about sensory registers while others talked about sensory stores or sensory memories.
  • Some theorists used terms like short-term store and long-term store instead of short- and long-term memory.
  • Other theorists, adopting terms introduced by William James, referred to short-term memory as primary memory and long-term memory as secondary memory. I prefer this usage myself.
  • Still others preferred to talk about working memory rather than short-term memory.

Don't get hung up on these terminological differences. Actually, there is a serious difference between the traditional concept of primary memory and working memory, which we'll discuss later.

The Sensory Registers

The sensory registers contain a complete, veridical representation of sensory input. They are of unlimited capacity, and are able to hold all the information that is presented to them. According to the model, there is one sensory register for each sensory modality. Two of the sensory registers have been given special names: for vision, the icon; for audition, the echo. The sensory registers store information in precategorical form -- that is, the input is not yet processed for meaning. Moreover, information is held in the sensory registers for only a brief period of time, perhaps less than a second. Information is lost from the sensory registers through decay or displacement; it cannot be maintained by any kind of cognitive activity.

The characteristics of the sensory registers were demonstrated in a classic experiment on iconic memory by George Sperling (1960). Sperling presented his subjects with a visual display consisting of a 3x4 array of letters; the array was displayed only very briefly. Then, at an interval of 0 to 1 second after the offset of the display, they were asked to report the contents of the array.

  • In the whole report condition, the subjects were instructed to report all the letters in the array.
  • In the partial report condition, they were presented with a tone of low, medium, or high pitch which cued them to report only the letters in the lower, middle, or upper row.

The results were very clear.

  • When subjects were instructed to report the whole array, they didn't do very well -- they were only able to remember 4-5 of the items, on average.
  • But when subjects were cued immediately to report only a portion of the array, they were able to do this almost perfectly. Based on the random sampling by rows, Sperling estimated that at least 10 of the 12 items in the array were available.
  • But the longer the partial-report cue was delayed, the worse the subjects did. If the partial-report cue was delayed by as little as a second, performance in the partial-report condition dropped to the levels observed in the whole-report condition.

From this, Sperling concluded that the icon contained a veridical representation of the stimulus array, but that this representation decayed very quickly, within about a second.
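The logic of the partial-report estimate is simple arithmetic: because the cued row is chosen at random and the subject doesn't know in advance which one it will be, the proportion reported from that row is taken to hold for the whole array. A minimal sketch of that calculation follows. The function name is mine, and the value of 3.5 letters reported from the cued row is a hypothetical figure chosen for illustration, not Sperling's data.

```python
def estimate_available(mean_reported_in_cued_row, row_size, array_size):
    """Extrapolate from the randomly cued row to the whole array."""
    proportion = mean_reported_in_cued_row / row_size
    return proportion * array_size


# Hypothetical: 3.5 of the 4 letters in the cued row, in a 3x4 (12-letter) array.
estimate = estimate_available(3.5, row_size=4, array_size=12)
print(estimate)  # 10.5
```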

Similar experiments were done on echoic memory, with similar results, leading to similar conclusions.

In a critique of the concept of iconic memory, Ralph Haber (1983) suggested that it might only be useful for reading a book in a lightning storm! Still, something like the sensory registers must exist, as an almost logically necessary link between sensation and memory. Iconic memory might be especially useful to nonhuman animals, as it would help both predators and prey to catch a glimpse of each other "out of the corner of their eye". And echoic memory might be even more useful as a support for speech perception, where individual sounds go by very fast, and have to be captured to link with other sounds.

Primary (Short-Term) Memory

Information is transferred from the sensory registers to primary (or short-term) memory after it has received some degree of processing -- that is, after it has been subject to feature detection, pattern recognition, and directed attention. Typically, primary memory is thought of as storing an acoustic representation of the information -- that is, what the name or description of an object or event sounds like. Primary memory has a limited capacity: it can contain only about 7 (plus or minus 2) items (though the effective capacity of this store can be increased by "chunking" items together -- the capacity of memory is approximately 7 "chunks", rather than 7 items). Information can be maintained in primary memory indefinitely by means of rehearsal. If it is not rehearsed, the information either decays over time, or (more likely) is displaced by new information.

The capacity of primary memory is often demonstrated by the digit-span test, in which subjects are read a list of digits (or letters), and must repeat them back to the experimenter. Most people do pretty well with lists up to 6 or 7 digits in length, but start to do poorly with lists that are longer than that.

But the capacity of primary memory can be increased by chunking -- by grouping items together in some meaningful way. The "Alphabet Song" from Hair ("The American Tribal Love-Rock Musical") capitalizes on chunking.

In a famous paper, George Miller (1956) characterized the capacity of short-term memory as "The magical number seven, plus or minus two". That is, we can hold about 7 items in short-term memory at a time. It was this kind of research, at Bell Laboratories, that fixed the length of telephone numbers at 7 digits -- a 3-digit exchange plus a 4-digit number. If anything new is to enter short-term memory, something has to go out to make room for it.

There's a story here. Miller made the capacity of short-term memory famous, but -- as he acknowledged -- it wasn't his discovery. Credit for that goes to John E. Karlin (1918-2013), a psychologist and human-factors researcher who spent his career at Bell Laboratories, the research arm of AT&T. Prior to the 1940s, telephone exchanges typically began with the first two letters of a familiar word, followed by 5 digits -- as in the Glenn Miller tune, "Pennsylvania 6-5000" (the actual telephone number of the Pennsylvania Hotel, often claimed as the oldest telephone number in New York City), or the film Butterfield 8 (after an exchange serving the upscale Upper East Side of Manhattan). But the telephone company (there was only one at the time!) worried about running out of phone numbers, and so switched to an all-digit system. This raised the question of how many digits people could hold in memory. Karlin did the study, and determined that the optimal number was 7. (Karlin also determined the optimal layout for the letters and numbers on the rotary, and then the touch-pad telephone -- reversing the conventional placement on a hand calculator or computer keyboard -- but that's another story).

However, as Miller also noted, the effective capacity of short-term memory can be increased by chunking. If you group the information into chunks, and the chunks into "chunks of chunks", you can actually hold quite a bit of information in short-term memory. For example, when area codes were added to telephone numbers, they were assigned according to geographical area, so that people wouldn't have to remember all 3 digits of the area code, thus straining the capacity of short-term memory.
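To make the point concrete, here is a minimal sketch of chunking a 10-digit telephone number into an area code, an exchange, and a 4-digit number. The function name and the telephone number are hypothetical; the only point is that ten separate items become three chunks.

```python
def chunk(digits, sizes):
    """Split a string of digits into consecutive chunks of the given sizes."""
    if sum(sizes) != len(digits):
        raise ValueError("chunk sizes must add up to the number of digits")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(digits[start:start + size])
        start += size
    return chunks


number = "5105551234"              # hypothetical 10-digit number
chunks = chunk(number, [3, 3, 4])  # area code, exchange, number
print(len(number), "digits ->", len(chunks), "chunks:", chunks)
```

Ten digits would exceed the span of most people; three chunks fit comfortably.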

Secondary (Long-Term) Memory

If it receives enough rehearsal, information may be transferred to secondary (or long-term) memory, the permanent repository of stored knowledge. The capacity of secondary memory is essentially unlimited. Information is not lost from this structure, either through decay or displacement.

The Serial-Position Curve. Support for a structural distinction between primary and secondary memory is provided by the serial-position curve. Consider a form of memory experiment known as single-trial free recall: the experimenter presents a list of items for a single study trial, and then the subject simply must recall the items that he or she studied. If we plot the probability of recalling each item against its position in the study list, we typically observe a bowed curve: items in the early and late portions of the list are more likely to be recalled than those in the middle.

Here are selected serial-position curves generated by students taking Psychology 1 who completed the "Serial Position" exercise on the ZAPS website. The values for the "Reference" group were generated from all the students, from all classes all over the world, who completed the exercise that year.

[Figures: serial-position curves]

The enhanced memorability of items appearing early in the list is called the primacy effect. The enhanced memorability of items appearing late in the list is called the recency effect. The primacy and recency effects are affected by different sorts of variables.

Increasing the interval between adjacent items, and thus increasing the amount of rehearsal each item can receive, increases primacy but has no effect on recency.

Engaging the subject in a distracting task immediately after presentation of the list, and then increasing the length of the retention interval, reduces recency but has no effect on primacy.

Therefore, it may be concluded that the primacy effect reflects retrieval from secondary memory, while the recency effect reflects retrieval from primary memory.
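For readers who want to see how a serial-position curve is computed from free-recall data, here is a minimal sketch. The word list and the four subjects' recall protocols are made up for illustration, and the function name is mine; the calculation is just the proportion of subjects who recalled the item at each list position.

```python
def serial_position_curve(study_list, recall_protocols):
    """Proportion of subjects who recalled the item at each list position."""
    n_subjects = len(recall_protocols)
    return [
        sum(item in recalled for recalled in recall_protocols) / n_subjects
        for item in study_list
    ]


study_list = ["cat", "pen", "sky", "map", "cup", "red", "box"]
recall_protocols = [  # made-up recall data for four subjects
    {"cat", "pen", "red", "box"},
    {"cat", "cup", "box"},
    {"cat", "pen", "sky", "red", "box"},
    {"pen", "map", "red", "box"},
]
curve = serial_position_curve(study_list, recall_protocols)
print(curve)  # [0.75, 0.75, 0.25, 0.25, 0.25, 0.75, 1.0]
```

With these made-up data the curve is bowed: recall is high for the first two positions (primacy) and the last two (recency), and low in the middle.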

Neuropsychological evidence also supports a structural distinction between primary (short-term) and secondary (long-term) memory. For example, Patient H.M., who suffered from the amnesic syndrome following the surgical removal of his hippocampus and related structures in the medial portion of his temporal lobes, had normal digit-span performance, but impaired free recall. Apparently, he had normal short-term memory, but impaired long-term memory. In contrast, another patient, known as K.F. (Shallice & Warrington, 1970), who had lesions in his left parieto-occipital area, had grossly impaired digit-span performance, but normal free recall of even 10-item lists. The fact that damage to two different parts of the brain results in different patterns of performance on short- and long-term memory tests supports a structural distinction between the two kinds of memory.

Primary Memory and Working Memory

Nevertheless, most memory theorists no longer think in terms of a structural distinction between primary and secondary memory (and, as indicated by the comment by Ralph Haber cited above, there is also a debate about the sensory registers). For example, consider Patient K.F., just cited as a contrast to Patient H.M. K.F. has poor short-term memory but normal long-term memory. But according to the multistore model of memory, short-term memory is the pathway to long-term memory. This implies that you can't have long-term memory if you don't have short-term memory. But K.F. has long-term memory, despite the fact that he apparently doesn't have short-term memory.

If you consider primary and secondary memory to be separate stores, you need a way for primary and secondary memory to interact. Accordingly, Alan Baddeley (1974, 1986, 2000) introduced the notion of working memory, which consists of several elements:

  • a phonological loop for auditory rehearsal (like repeating a telephone number over and over);
  • a visuo-spatial sketchpad that recycles visual images of a stimulus;
  • a central executive that controls conscious information-processing; and
  • an episodic buffer that connects the phonological loop and visuo-spatial sketchpad to secondary memory.

While Baddeley's view implies that working memory is a separate store from secondary memory (which is why the episodic buffer is needed to link them), John Anderson has proposed an alternative "unistore" conception, in which working memory is simply that subset of declarative memory which contains activated representations of:

  1. the organism
  2. in its current environment,
  3. its currently active processing goals, and
  4. pre-existing declarative knowledge activated by perceptual inputs or the operations of various thought processes.

When these four types of representations are linked together, an episodic memory is formed.

Many investigators have adopted Baddeley's multistore view of working memory, but there is also evidence favoring Anderson's unistore view. For the record, I tend to side with Anderson, but my principal focus in these lectures is on secondary memory, so nothing much rides on it.

Another option is to think of memory as consisting of a single, unitary store. The modal model of memory implies that primary (short-term) and secondary (long-term) memory are separate "mental storehouses". The alternative view is that primary (or working) memory consists of those elements of secondary memory that are currently in a state of activation, or currently being used by the cognitive system (Crowder, 1993; Cowan, 1995).
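The unitary-store idea can be sketched in a few lines: there is only one store, and "working memory" is just whatever part of it is currently activated. This is a loose illustration, not a model anyone has proposed in this form. The activation values, the threshold of 0.5, and the function name are all hypothetical.

```python
def working_memory(activations, threshold=0.5):
    """Items in the single store whose current activation exceeds the threshold."""
    return {item for item, level in activations.items() if level > threshold}


# One store; each item carries a (hypothetical) current level of activation.
secondary_memory = {
    "own phone number": 0.1,
    "number just looked up": 0.9,
    "route to class": 0.6,
    "dinner last Friday": 0.2,
}
active = working_memory(secondary_memory)
print(sorted(active))
```

Nothing is copied from one storehouse to another here; items enter and leave working memory only as their activation rises and falls.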

These lectures are focused mostly on long-term, secondary memory.

Attention: Linking Perception and Memory

James distinguished between primary (short-term) and secondary (long-term) memory in terms of attention. Primary memory is the memory we have of an object once it has disappeared, but while we are still paying attention to our image of it. Secondary memory is the memory we have of an object once we have stopped paying attention to our image of it. The distinction may be illustrated with memory for telephone numbers:

  • When someone you meet gives you their telephone number, and you repeat it to yourself until you can write it down, that's primary memory.
  • When you've lost the piece of paper on which you've written that number, but you remember it anyway, that's secondary memory.

When psychologists talk about "secondary" or "long-term" memory, they mean whatever trace remains of a stimulus that has disappeared, after 30 seconds of distraction. So it's clear that attention links perception and memory. But what is attention?

"Every one knows what attention is", William James famously wrote in the Principles of Psychology (1890/1980, pp. 403-404):

It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state which in French is called distraction, and Zerstreutheit in German.

James went on to distinguish between different varieties of attention: to the objects of sensation in the external world, or to thoughts and other products of the mind; immediate, when the object of attention is intrinsically interesting, or derived, where the object is of interest by virtue of its relation to something else; passively drawn, as if by an involuntary reflex, or actively paid, by a deliberate and voluntary act.

James's description has never been bettered, but before World War II there was little scientific literature on the topic. Partly this was due to the influence of the structuralists, like Titchener, who took attention as a given: you can't introspect without paying attention. R.S. Woodworth, inclined more toward functionalism, offered a common-sense theory of how one stimulus was selected for attention over another, and later worked on the problem of the span of apprehension, or the question of how many objects we can perceive in a single glance. But after World War I attention proved too mentalistic a concept for the behaviorists who followed John B. Watson, and the word pretty much dropped out of the vocabulary of psychologists. Accordingly, it took more than half a century before psychologists could begin to figure out how attention really works.

The Attentional Bottleneck

One of the landmark events of the cognitive revolution within psychology was the development of theories of the human mind expressly modeled on the high-speed computer. According to these theories, human thought is characterized as information processing. Schematically, these theories of human information processing can be represented as flowcharts, with boxes representing structures for storing information, and arrows representing the processes which move information from one storage structure to another. Attention is one of these processes, and it determines which information will become conscious and which will not.

One of the first models of this type was proposed by Donald Broadbent, a British psychologist. Information arriving at the sensory receptors is first held in a short-term store, from which it passes through a selective filter into a limited capacity processor that compares sensory information with information already present in a long-term store. Depending on the results of this comparison, the newly arrived sensory information may itself be deposited in the long-term store, and it may be used to generate some response executed through the body's systems of muscles and glands. The limited-capacity processor is tantamount to consciousness; thus, attention is the pathway to awareness. Preattentive processing is unconscious processing: So how much preattentive processing is there? According to Broadbent, not very much. According to his theory, preattentive processing is limited to a very narrow range of physical properties, such as the spatial location and physical features of the stimulus.

Broadbent based this conclusion on experiments conducted by Colin Cherry on what Cherry called "the cocktail-party phenomenon". When you're at a cocktail party, there are lots of conversations going on, but you can only pay attention to one of these at a time. Attentional selection is accomplished by virtue of spatial and physical processing: you look at the person you're talking to, and if you should happen to look away for a moment (for example, to ask a waiter to refresh your drink), you maintain the conversation by staying focused on the sound of the other person's voice. Of course, the moment you look away, your attention is distracted from the conversation, and you're likely to miss something that's been said. Attention is drawn and focused on physical grounds; analysis of meaning comes later.

Cherry simulated the cocktail-party phenomenon in an experimental paradigm known as dichotic listening or shadowing. In a dichotic listening experiment, the subject is asked to repeat (or shadow) a message presented over one of a pair of earphones (or speakers), while ignoring a competing message presented over the other device. Normal subjects can do this successfully, but their ability to repeat the target passage comes at some expense. While they are able to remember much of what was presented over the attended channel, they pretty much forget whatever was presented over the unattended channel. Moreover, while they generally noticed when the voice on the unattended channel switched from male to female, they failed to notice when it switched from one language to another, or from forwards to backwards speech.

The dichotic listening experiment simultaneously reveals the limited capacity of attention and the basis on which attentional selection occurs. Just as we can only pay attention to one conversation at a time, so we can only process information from one channel at a time. And just as we pay attention to our companions by looking at them, so we discriminate between attended and unattended channels on the basis of their spatial location or other physical features. Physical analysis comes first, preattentively; analysis of meaning comes later, post-attentively. The implication of the filter model is that while certain aspects of physical structure can be (and probably must be) analyzed unconsciously, meaning analysis must be performed consciously.
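The filter model can be caricatured in a few lines of code: selection looks only at a physical property of each message (here, which ear it arrived at) and never at its content. This is a minimal sketch under my own assumptions. The function name, the message format, and the sample messages are hypothetical.

```python
def broadbent_filter(inputs, attended_channel):
    """Pass only messages whose physical channel matches the attended one."""
    return [msg["text"] for msg in inputs if msg["channel"] == attended_channel]


inputs = [
    {"channel": "left", "text": "the quick brown fox"},
    {"channel": "right", "text": "your name"},
    {"channel": "left", "text": "jumps over the lazy dog"},
]
processed = broadbent_filter(inputs, attended_channel="left")
print(processed)  # ['the quick brown fox', 'jumps over the lazy dog']
```

Because the filter never inspects the text, nothing on the unattended channel gets through in this sketch, however meaningful it might be.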

A Cocktail-Party Symphony

A movement of Symphony #5 ("Solo", 1999) by the American composer Libby Larsen is subtitled "The cocktail party effect". It was inspired by Cherry's experiments on dichotic listening, and refers to our ability to isolate and pay attention to a single voice among the extraneous noise one encounters at a crowded cocktail party -- or, in this case, a single instrumental line against the background of full orchestra. "It's a kind of musical 'Where's Waldo?'" says Larsen. "In this case, Waldo is a melody, introduced at the beginning of the movement, from then on hidden amidst the other music."

Attention in the Multistore Model of Memory

Other information-processing theorists, approaching the problem from a different angle, came to much the same conclusions regarding the limitations on pre-attentive information processing. Thus, while Cherry and Broadbent were analyzing the properties of attention, other investigators, such as Nancy Waugh and Donald Norman, and Richard Atkinson (now President of the University of California) and Richard Shiffrin (they were both professors at Stanford at the time), were studying the properties of the storage structures involved in human memory. According to the multistore model of memory, sensory information first encounters a set of sensory registers, which hold, for a very brief period of time, a complete and veridical representation of the sensory field. In principle, there is one sensory register for each modality of sensation: Neisser called the visual register the icon and the auditory one the echo. The sensory registers hold vast amounts of information, but we are only aware of that subset of stored information which has been attended to and copied into another storage structure, known as primary or short-term memory. With further processing, information held in primary memory can be encoded in a permanent repository of knowledge known as secondary or long-term memory. In the absence of further processing, the contents of primary memory decay or are displaced by new material extracted from the sensory registers or retrieved from secondary memory.

Primary memory, so named by William James, is the seat of consciousness in the multistore model, combining information extracted through sensory processes with prior knowledge retrieved from secondary or long-term memory. Like the central processor in Broadbent's model, the capacity of primary memory is severely limited: it was famously characterized by George Miller in terms of

"The magical number seven, plus or minus two"

items. Of course, as Miller pointed out, the capacity of primary memory can be effectively increased by "chunking", or grouping related items together. Thus, consciousness is limited to about seven "chunks" of information, far less than is available in either the sensory registers or secondary memory. Unattended information held in the sensory registers, or residing in latent form in secondary memory, may be considered preconscious -- available to consciousness, under certain conditions, but not immediately present within its scope.

Atkinson and Shiffrin focused on memory structures, but the implications of their model for attention and consciousness are the same as those of Broadbent's. Attention selects information from the sensory registers for further processing in primary memory, but the information in the sensory registers is represented in "pre-categorical" form. It has been analyzed in terms of certain elementary physical features, such as horizontal or vertical lines, angles and curves, and the like -- a process known as feature detection. In addition, certain patterns or combinations of features have been recognized as familiar -- a process known as pattern recognition. But that is about all. For the most part, the multistore model holds that the analysis of meaning takes place in primary memory, which has a direct connection to the storehouse of knowledge comprising secondary or long-term memory.

Thus, in the multistore model of memory, attention selects sensory information based on its physical and spatial properties -- where it is and what it looks and sounds like. It does not select information based on its meaning -- indeed, it cannot do so, because meaning analyses are performed after attentional selection has taken place, and the information has been copied into primary memory. At least so far as perception is concerned, preattentive processes are limited to physical and spatial analyses, and preconscious representations contain information about the spatial and physical features of objects and events. Like the filter model, then, the classic multistore model of memory permits certain physical analyses to transpire unconsciously, but requires that meaning analyses be performed consciously.

Early vs. Late Selection

But wait: that can't be the case, because attention, and therefore consciousness, is sometimes drawn to objects and events on the basis of their meaning, not just their spatial location or their physical features. At a cocktail party, we do shift our attention if we should happen to hear someone mention our own name -- a shift that would seem to require attentional selection based on meaning, not just sound. As it happens, even in Cherry's experiment, subjects shifted attention in response to the presentation of their own names on about 35% of the trials. Neville Moray confirmed this observation. In his shadowing experiment, the phrase "you may stop now" was presented, without warning, over the unattended channel. Subjects noticed this about 6% of the time, but when the phrase was preceded by their names, the rate went up to 33%.

Anne Treisman (who taught at Berkeley for several years) performed a dichotic listening experiment in which the shadowed message shifted back and forth between the subject's ears. Treisman observed that, on about 30% of trials, her subjects followed the message from one ear to the other. When they realized that they had made a mistake, the subjects shifted their attention back to the original ear. However, the fact that the shift occurred at all indicates that there must be some preattentive processing of the meaning of an event, beyond its spatial and physical structure.

Treisman's observations led her to modify Broadbent's theory. In her proposal, attention is not an absolute filter, but rather merely attenuates processing of the unattended channel. This attenuator can be tuned to demands arising from either the person's goals or the immediate context. Thus, at a cocktail party, if you are being a good guest and behaving yourself properly, you pay most attention to the person who is presently talking to you, with attentional selection controlled by physical features such as the person's location and tone of voice. However, you are also attentive for people talking about you, so that attentional selection is open to particular semantic features as well. Put more broadly, attention is not determined solely by the physical attributes of a stimulus. Rather, it can be deployed depending on a perceiver's expectancies and goals.
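One way to picture attenuation, as opposed to all-or-none filtering, is with signal strengths and thresholds: the unattended channel is turned down rather than shut off, and a message gets through if its (attenuated) strength still reaches the threshold for that particular content. The sketch below is my own illustration of that idea, not Treisman's formal model. The attenuation factor of 0.3, the default threshold of 0.5, and the lowered threshold of 0.2 for one's own name are hypothetical numbers.

```python
def attenuated_selection(inputs, attended_channel, thresholds,
                         attenuation=0.3, default_threshold=0.5):
    """Return the messages whose signal strength reaches their threshold."""
    selected = []
    for msg in inputs:
        # Attended input arrives at full strength; unattended input is weakened.
        strength = 1.0 if msg["channel"] == attended_channel else attenuation
        # Personally significant content has a lowered threshold.
        threshold = thresholds.get(msg["text"], default_threshold)
        if strength >= threshold:
            selected.append(msg["text"])
    return selected


inputs = [
    {"channel": "left", "text": "nice weather"},
    {"channel": "right", "text": "stock prices"},
    {"channel": "right", "text": "your name"},
]
thresholds = {"your name": 0.2}  # hypothetical lowered threshold
noticed = attenuated_selection(inputs, "left", thresholds)
print(noticed)  # ['nice weather', 'your name']
```

Here the ordinary message on the unattended channel is lost, but your own name breaks through, because selection now depends on content as well as channel.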

Unlike Broadbent's original model, Treisman's approach was a truly cognitive theory of attention. It departed from a focus on "bottom-up" processing, driven by the physical features of stimulation, and afforded a clear role for "top-down" selection based on semantic attributes. Because attentional selection can be based on meaning, the model implies that preattentive processing is not limited to the analysis of the physical structure and other perceptual attributes of a stimulus. Semantic processing can also occur preattentively. But how much semantic processing can take place?

This question set the stage for a long-running debate between early- and late-selection theories of attention. According to early-selection theories, like Broadbent's original proposal, attentional selection occurs relatively early in the information-processing sequence, before meaning analysis can occur. According to late-selection theories, such as those foreshadowed by Treisman, attentional selection occurs relatively late in information processing, after some semantic analysis has taken place. The question is an important one, because the answer determines the limits of preattentive or preconscious processing.

Late-selection theories of attention abandon the notion of an attentional filter or attenuator entirely. According to the most radical of these, proposed by J. Anthony Deutsch and Diana Deutsch, and by Donald Norman, all inputs are analyzed fully preattentively, and attention is deployed depending on the pertinence of information to ongoing tasks. In this view, attention is required for the selection of a response to stimulation, but not for the selection of inputs to primary memory. For this reason, late-selection theories are also called response-selection theories. They imply that a great deal of semantic processing takes place preattentively, greatly expanding the possibilities for preconscious processing. It should be understood that late selection theories do not hold that literally every physical stimulus in the person's sensory field is analyzed preattentively for meaning. They simply hold that meaning analysis is preattentive, just as physical analysis is. Whatever gets processed, gets processed semantically as well as physically, at least to some extent.

[Figure: Late and Early Selection Compared]

The primary difference between early- and late-selection theories of attention is illustrated in this figure, adapted from Pashler (1998). The left side of the figure represents four stimulus events competing for attention. The early stages of information processing provide physical descriptions of these stimuli, followed by identification and, at late stages of information processing, semantic description, and finally some response. According to early-selection theories, depicted in Panel A, preattentive processing is limited to physical description; all available stimuli are processed at this stage, but identification, semantic description, and response are limited to the single stimulus selected on the basis of the physical descriptions composed at the early stages. According to late-selection theories, depicted in Panel B, all available stimuli are also identified and processed for meaning preattentively; attention and consciousness are required only for response. In some respects, identification is the crux of the difference between early- and late-selection theories: early-selection theories hold that conscious awareness precedes stimulus identification, while late-selection theories hold that identification precedes consciousness.

Filters vs. Resources

The debate between early- and late-selection theories of attention was very vigorous -- and to some, seemingly endless and unproductive. In response, some investigators completely reformulated the problem. In their view, attention was not any kind of filter or attenuator or selector; rather, attention was identified with mental capacity -- with the person's ability to deploy attentional resources in various directions.

The earliest capacity theory was proposed by Daniel Kahneman (who taught at Berkeley for several years), who equated attention with mental effort. The amount of effort we can make in processing is determined to some extent by our level of physiological arousal, but even at optimal arousal levels a person's cognitive resources are limited. Within these limits, however, attention can be freely directed, based on the individual's allocation policy. The allocation policy, in turn, is determined by the individual's enduring dispositions (innate or acquired) as well as by momentary intentions. Thus, at a cocktail party, we may intend to pay attention to the person we're speaking to, but we are also disposed to pick up on someone else's mention of our name. The person's ability to carry out some information-processing function is determined by the resources required for the task. If the tasks at hand are relatively undemanding, he or she can carry out several tasks simultaneously, in parallel, so long as there is no structural interference between them (for example, we cannot utter two sentences at once, because we have only a single vocal apparatus). On the other hand, if the tasks are highly demanding, the person focuses all of his or her attentional resources on one task at a time, in series.
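The logic of the capacity view can be made concrete with a small sketch. This is only an illustration, not Kahneman's own formalism: the `Task` class, the numeric demand values, the `structure` labels, and the total capacity of 1.0 are all assumptions invented for the example. It shows the two constraints described above: tasks can proceed in parallel only if their combined demands fit within the available resources, and only if no two of them need the same apparatus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    demand: float    # share of attentional resources needed (hypothetical units)
    structure: str   # apparatus the task occupies, e.g. "voice" (hypothetical label)


def can_run_in_parallel(tasks, capacity=1.0):
    """Return True if the tasks fit within capacity and share no structure."""
    structures = [t.structure for t in tasks]
    if len(structures) != len(set(structures)):
        return False  # structural interference: two tasks need the same apparatus
    return sum(t.demand for t in tasks) <= capacity


walking = Task("walking", 0.1, "legs")
chatting = Task("chatting", 0.4, "voice")
singing = Task("singing", 0.3, "voice")
proof = Task("checking a proof", 0.9, "vision")

print(can_run_in_parallel([walking, chatting]))  # undemanding, different structures
print(can_run_in_parallel([chatting, singing]))  # both need the single vocal apparatus
print(can_run_in_parallel([chatting, proof]))    # combined demand exceeds capacity
```

When the function returns False, the capacity view predicts that the person must fall back on doing one task at a time, in series.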

Spotlights and Zoom Lenses

A variant on the capacity model is the spotlight metaphor of attention initially proposed by Michael Posner and later adopted by Broadbent as well. In this view, metaphorically speaking, paying attention to something "throws light on the subject". The attentional spotlight "illuminates" whatever occupies the space it is focused on, permitting the information provided by that object to be processed. Once engaged on a particular portion of the stimulus environment, attention can also be disengaged and shifted elsewhere in response to various kinds of cues -- a process that takes some (small) amount of time. In Posner's theory, engagement, disengagement, and shifting (or movement) constitute elementary components in the attentional system, and someone who suffers from "attentional deficit disorder" presumably has a deficit in one or more of these components. Interestingly, neuropsychological and brain-imaging studies indicate that each of these component processes is controlled by a different attentional module, or system, in the brain. According to the spotlight metaphor, shifting attention moves the attentional spotlight from one portion of the sensory field to another. However, the attentional spotlight is not like a real spotlight. In the first place, there is only one of them: it is not possible to illuminate more than one portion of space at a time. Moreover, the attentional spotlight cannot expand or contract, so that, wherever it shines, it illuminates a constant portion of space. Along the same lines, other theorists have offered an analogy between attention and the zoom lens on a camera. There is only one zoom lens, just as there is only one spotlight, but unlike the attentional spotlight, the attentional zoom lens can widen or narrow its focus.

Of course, there is no reason why the scope of the attentional spotlight cannot be changed, just as is possible with a real spotlight, so the two models are not necessarily incompatible in that respect. However, comparing the two metaphors raises the question of what happens to objects that are outside the attentional field. A zoom-lens mechanism would seem to exclude them entirely, so we would not expect any substantial "preattentive" processing to occur.

But even a tightly focused spotlight illuminates portions of space other than that on which it is focused (to demonstrate this for yourself, turn on a flashlight in your garage at night). So, following the metaphor, it should be possible to process information that lies outside the scope of conscious attention. Still, even under the spotlight metaphor, there are limits to preattentive or preconscious processing. A diffuse spot casts less light on any particular portion of space than does a tightly focused one. Accordingly, we would expect there to be limits on the amount of processing that can occur on the fringes of the illuminated area. We appear to be back where we were with the capacity model -- which is appropriate, after all, given that the spotlight and zoom-lens metaphors are just variants on the capacity metaphor. While we might expect relatively complex semantic processing to be possible at the center of attention, processing on the periphery might be limited to only relatively simple, physical attributes.

For the record, the filter, attenuator, and capacity models of attention, including the spotlight and zoom-lens variants on the capacity model, do not exhaust the possibilities for attention theory. These are models of spatial attention -- that is, models that attempt to understand how attention is focused on particular locations in space, so that the objects and events occurring in those locations can be processed. However, there is another class of models, which propose that attention is focused on the objects themselves, rather than on their spatial locations. For example, in Balint's syndrome, patients with bilateral lesions in the posterior parietal and lateral occipital lobes are unable to perceive more than one object at a time, even if these objects are placed at the same location. In other words, they have no difficulty attending to a point in space, but have difficulty attending to objects within that space. It is possible that the attentional system is space-based at one level and object-based at another level, but this does not change our central problem: what is the cognitive fate of unattended objects? To what extent is preattentive processing possible? Can preattentive processing go beyond relatively simple analyses of the physical properties of the stimulus and extend to relatively complex semantic analyses as well?

The Concept of Automaticity

The key to answering this question, at least in the context of Kahneman's capacity model of attention, is the concept of automaticity. In terms of the person's allocation policy, enduring dispositions are applied automatically, while momentary intentions are applied deliberately. Momentary intentions call for mental effort, and they drain cognitive resources. By contrast, automatic processes do not require much mental effort, and so they don't consume many cognitive resources. Put another way, automatic processes are unconscious: they are executed without any conscious intention on our part. And, although the mental contents they generate may be available for conscious introspection, the processes themselves are not: they are performed outside the scope of conscious awareness. Some automatic processes are innate: they come with us into the world, as part of our phylogenetic heritage. Others, however, are acquired through experience and learning. Either way, they are preattentive, or preconscious, because they do not require the deployment of attention, and they do not enter into conscious awareness.

The implications for the scope of unconscious, preattentive processing are clear: if automatic processes are unconscious, then any mental activity can be performed unconsciously, so long as it has been automatized, one way or another. In this way, automaticity expands the scope of unconscious influence. Early-selection theories of attention restricted unconscious processing to relatively primitive analyses of physical features. Capacity theories permit even complex mental processes to be performed unconsciously, so long as they are automatized.

Attentional theorists were quick to pick up on the notion of automaticity, not least, I think, because it skirted the problem of distinguishing empirically between early- and late-selection theories of attention. For example, David LaBerge argued that most complex cognitive and motoric skills cannot be performed consciously, because their cognitive requirements exceed the capacity of human attentional resources. In reading, for instance, a reader must translate the visual stimulus of a printed phrase into an internal mental representation of its meaning. In formal terms, this translation process involves:

  1. employing feature detectors to analyze the graphemic information on the printed page into elementary lines, angles, curves, intersections, and the like;
  2. recognizing particular combinations of these features as letters of the alphabet;
  3. recognizing certain spelling patterns as familiar syllables and morphemes;
  4. recognizing certain combinations of syllables and morphemes as familiar words; and, above the level of individual words,
  5. recognizing certain combinations of words as familiar phrases, etc.

If we had to follow this sequence consciously, stroke by stroke, letter by letter, word by word, phrase by phrase, we'd never get any reading done. In fact, of course, skilled readers get a lot of reading done. LaBerge and Samuels argued that this happens because at least some components of the reading process have become automatized -- they just happen without any intention or awareness on our part, freeing cognitive resources for other activities, such as the task of making sense of what we have read. The first process, feature detection, is innately automatic: we are hard-wired to do it. Processes at other levels can be automatized through learning. At early stages of learning, they are effortful and conscious; at later stages, they become effortless and unconscious.

                "Stroop Effect"At  around the same time, Posner and Snyder used the Stroop effect to illustrate the distinction between automatic and strategic processing. In the basic "Stroop" experiment, subjects are presented with a list of letter strings names printed in different colors, and were asked to name the color in which each string is printed. When the string consists of random letters, the task is fairly easy, as indicated by relatively rapid response latencies. When the letter string is a meaningful word, however, response latency increases. Of special interest are conditions where the words are the names of colors. Although one would think that interference would be reduced if the color word matched the color of the ink in which it was printed (e.g., the word yellow printed in yellow ink), there is still some interference, compared to the non-word control condition. The task is especially great when the word and its color do not match (e.g., yellow printed in green ink), it is very hard. Despite the subjects' conscious intention to name ink colors, and to ignore the words themselves, they cannot help reading the color names, and this interferes with naming of the colors. It just happens automatically. Other early work on automaticity was performed by Schneider and Shiffrin.

The Stroop Effect in Art

Jasper Johns, a prominent contemporary American artist, has often used "psychological" material like Rubin's Vase and Jastrow's Duck-Rabbit in his art. He's also done a variant on the Stroop test. In "False Start" (1959), he splashes paint in various colors across the canvas, and then overlays the colors with color names -- sometimes the name matches the color, sometimes it does not.

These seminal studies laid down the core characteristics of automatic, as opposed to controlled, processes:

  • Inevitable evocation: Automatic processes are inevitably engaged by the appearance of specific environmental stimuli, regardless of the person's conscious intentions;
  • Incorrigible completion: Once invoked, they proceed inevitably to their conclusion;
  • Effortless execution: In either case, execution of an automatic process consumes no attentional resources.
  • Parallelism: Because they consume no attentional resources, automatic processes do not interfere with other ongoing cognitive processes.
  • Automatization: Some processes are innately automatic, or nearly so, while others are automatized only after extensive practice with a task.
  • Unconscious: For much the same reason, automatic processes leave no consciously accessible traces of their operation in memory.

Some investigators added other properties to the list, but for most theorists, two features are especially important in defining the concept of automaticity: automatic processes are invoked regardless of the subject's intentions, and their execution consumes no (or very little) cognitive capacity. Note that, in terms of capacity theory, these two fundamental properties are linked. According to Kahneman, voluntary control occurs by virtue of the allocation of cognitive resources to some process. If automatic processes are independent of cognitive resources, then they cannot be controlled by allocating, or denying, cognitive resources to them. Everything about automaticity comes down to a matter of capacity. In capacity theories, attention must be "paid", and "paying" incurs a cost. Automatic processes are attention-free, and incur no costs. Because they don't involve attention, they can't be started deliberately, they can't be stopped deliberately, and they can't be monitored consciously.

Some version of the capacity theory continues to dominate most research on attention, even though recent research has cast doubts on particular details. Similarly, recent research casts doubt on the strong version of automaticity: "automatic" processes aren't quite as automatic as we thought they were, and no matter how much they have been practiced, they still draw on cognitive resources to some extent.

Theories of attention and automaticity continue to evolve in response to new data -- that's the way things go in science. But if attention is what links perception and memory, the memory-based view of automatization links attention to memory as well: effortful processes depend on declarative memory, automatic ones depend on procedural memory. Which brings us full circle.

The best technical introduction to research and theory on attention is, naturally enough, The Psychology of Attention (1998) by Harold Pashler, a professor at UC San Diego.  For a more popular treatment, see Focus: The Hidden Driver of Excellence (2013) by Daniel Goleman, a science writer who holds a PhD in psychology. 

Memory Proper

We now turn our focus to long-term or secondary memory, or what James called memory proper -- what people ordinarily mean when they talk about memory.

Classification of Knowledge

Adopting the unistore theory of memory does not mean that all memory is the same. There may not be different memory storehouses, but there are certainly different types of knowledge stored in the single structure. The fundamental distinction between types of knowledge is between declarative and procedural knowledge.

Declarative knowledge is factual knowledge concerning the nature of the world, in both its physical and its social aspects. It includes knowledge about what words, numbers, and other symbols mean; what attributes objects possess; what categories they belong to. It includes knowledge of events, and the contexts in which they take place. Declarative knowledge may be represented in propositional format: that is, as sentences containing a subject, verb, and object. For example:

  • Blunt means "dull".
  • Three is less than four.
  • A red octagonal sign means stop.
  • Automobiles have wheels.
  • Chevrolets are automobiles.
  • A hippie touched a debutante.

It is sometimes convenient to distinguish between perception-based and meaning-based representations in declarative memory.

  • Perception-based representations often take the form of sensory images, and preserve information about the physical structure of an object or event -- information about its physical details and the spatial relations among its components. A perception-based representation of a car is like a mental "picture" of what a car looks like.
  • Meaning-based representations often take the form of verbal descriptions, and preserve information about the meaning and conceptual relations of an object or event. A meaning-based representation of a car is like a dictionary entry that indicates that a car is a type of motor vehicle, often powered by an internal-combustion engine, used for transporting people on roads.

Procedural knowledge consists of the rules and skills used to manipulate and transform declarative knowledge. It includes the rules of mathematical and logical operations; the rules of grammar, inference, and judgment; strategies for forming percepts, and for encoding and retrieving memories; and motor skills. Procedural knowledge may be represented as productions (or, more properly, production systems): statements having an if-then, goal-condition-action format. For example:

  • If you want to divide A by B, then count the number of times B can be subtracted from A and still leave a remainder greater than or equal to zero.
  • If you want to know how likeable a person is and you already know some of his or her personality traits, then count the number of his or her desirable traits.
  • If you want to remember something, then relate it to something you already know.
  • If you want to tie your necktie in a "Shelby" knot, and the tie is already draped around your neck, then begin with the tie inside out, wide end under.
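The goal-condition-action format of the first two productions above can be sketched in code. This is a toy illustration rather than an implementation of any particular production-system architecture: the names `PRODUCTIONS`, `fire`, `divide_by_subtraction`, and the `DESIRABLE` trait set are all hypothetical, and "working memory" is reduced to a plain dictionary.

```python
def divide_by_subtraction(a, b):
    """Count how many times b can be subtracted from a, leaving a remainder >= 0."""
    if a < 0 or b <= 0:
        raise ValueError("this sketch assumes a >= 0 and b > 0")
    count = 0
    while a - b >= 0:  # subtracting b would still leave a non-negative remainder
        a -= b
        count += 1
    return count


# Hypothetical set of desirable traits, invented for the example.
DESIRABLE = {"kind", "honest", "warm"}

# Each production pairs a goal and a condition (the "if") with an action (the "then").
PRODUCTIONS = [
    {
        "goal": "divide",
        "condition": lambda wm: "a" in wm and "b" in wm,
        "action": lambda wm: divide_by_subtraction(wm["a"], wm["b"]),
    },
    {
        "goal": "judge likeability",
        "condition": lambda wm: "traits" in wm,
        "action": lambda wm: sum(1 for t in wm["traits"] if t in DESIRABLE),
    },
]


def fire(goal, working_memory):
    """Fire the first production whose goal and condition both match."""
    for production in PRODUCTIONS:
        if production["goal"] == goal and production["condition"](working_memory):
            return production["action"](working_memory)
    return None  # no production applies


print(fire("divide", {"a": 17, "b": 5}))
print(fire("judge likeability", {"traits": ["kind", "rude", "honest"]}))
```

Note that the caller states only a goal and supplies the contents of working memory; which rule runs is determined by matching, not by an explicit call to a named procedure.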

Attention, Automaticity, and Procedural Knowledge

The concept of procedural knowledge offers a different perspective on attention and automaticity. For example, John Anderson has suggested that effortful processes are mediated by declarative memory while automatic processes are mediated by procedural memory.

According to Anderson, skill learning proceeds through three stages.

  • At the initial cognitive stage, people memorize a declarative representation of the steps involved. When they want to perform the skill, they retrieve this memory and consciously follow the several steps the way they would follow a recipe.
  • At an intermediate associative stage, the person practices the skill, and becomes more accurate and efficient in following the algorithm. Anderson argues that, during this stage, the mental representation of the skill shifts from declarative to procedural format -- from a list of steps to a production system -- a process known as proceduralization. (For those familiar with the operation of computer programs, proceduralization is analogous to the compilation of a program from the more or less familiar language of a BASIC or FORTRAN program into the machine language of 0s and 1s. In fact, Anderson has sometimes referred to this shift as knowledge compilation.) The availability of production rules eliminates the need to retrieve individual steps from declarative memory, and in fact the compiled procedure eliminates many of these steps altogether.
  • When knowledge compilation is complete, the person moves to the final autonomous stage, where performance is controlled by the procedural representation, not the declarative one. Autonomous performance is typically very fast, efficient, and accurate, but for present purposes the important thing is that the person has no awareness of the individual steps as he or she performs them. The person is certainly aware of intending to drive the car, and the person is certainly aware that the car is moving forward, but the steps that intervened between this goal and its achievement are simply not available to introspective access. The procedure is unconscious in the strict sense of the word. The loss of conscious access occurs, apparently, because the format in which the knowledge is represented has changed: declarative knowledge is available to conscious awareness, but procedural knowledge is not.

A couple of things need to be said about unconscious procedural knowledge. First, recall that at the associative stage the declarative and procedural representations exist side by side in memory, with control over performance gradually shifting from the former to the latter. This means that at the initial and intermediate stages of skill acquisition, the person may well be aware of what he or she is doing. At the autonomous stage, after proceduralization is complete and performance is being controlled by the new procedural knowledge structure, the corresponding declarative structure may persist, and the person may have introspective access to it. But it is not this declarative structure that is controlling performance -- in that sense, the person's awareness of what he or she is doing is indirect -- like reading a grammar text to find out how one is speaking and understanding English. If access to the declarative knowledge is lost, the person may regain it -- perhaps by slowing the execution of the procedure down, perhaps by memorizing a list of instructions corresponding to the procedure. But again, this awareness is indirect. The procedure itself, the knowledge that is actually controlling performance automatically, remains isolated from consciousness, unavailable to introspection.

Within the domain of declarative knowledge, a further distinction may be drawn between episodic and semantic memory.

Episodic memory is autobiographical memory, and concerns one's personal experiences. In addition to describing specific events, episodic memories also include a description of the episodic context (i.e., time and place) in which the events occurred, as well as a reference to the self as the agent or experiencer of these events; the reference to the self may also include additional information about the person's cognitive, emotional, and motivational state at the time of the event. Thus, a typical episodic memory might take the following form:

I was happy when I saw [self]
a hippie touch a debutante [fact]
in the park on Saturday. [context]

Semantic memory, by contrast, is the mental lexicon, and concerns one's context-free knowledge. Information stored in semantic memory makes no reference to the context in which it was acquired, and no reference to the self as agent or experiencer of events. Knowledge stored in semantic memory is categorical, including information about subset-superset relations, similarity relations, and category-attribute relations. Thus, some typical semantic memories might take the following form:

  • I am tall.
  • Happy people smile.
  • Hippies have long hair.
  • Touches can be good or bad.
  • Debutantes wear long dresses.
  • Parks have trees.
  • Saturday follows Friday.

Semantic memory is often portrayed as a network, with nodes representing individual concepts and links representing the semantic or conceptual relations between them. Thus, a node representing car would be linked to other nodes representing vehicle, road, and gasoline.

Episodic memory can be portrayed in the same way, with a node representing an event linked to other nodes representing the spatiotemporal context in which the event occurred, and the relation of the self to the event.
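The network portrayal of both kinds of memory can be sketched as a simple labeled graph. This is a minimal illustration under stated assumptions: the `Network` class and the relation labels ("is-a", "travels-on", "burns", "fact", "place", "time", "self") are invented for the example and are not drawn from any specific network model of memory.

```python
from collections import defaultdict


class Network:
    """Nodes are concepts or events; links are labeled relations between them."""

    def __init__(self):
        self.links = defaultdict(set)  # node -> set of (relation, target) pairs

    def link(self, source, relation, target):
        self.links[source].add((relation, target))

    def neighbors(self, node):
        """Return the nodes linked from `node`, in alphabetical order."""
        return sorted(target for _, target in self.links[node])


net = Network()

# Semantic memory: context-free relations among concepts.
net.link("car", "is-a", "vehicle")
net.link("car", "travels-on", "road")
net.link("car", "burns", "gasoline")

# Episodic memory: an event node linked to its fact, context, and the self.
net.link("event-1", "fact", "a hippie touched a debutante")
net.link("event-1", "place", "park")
net.link("event-1", "time", "Saturday")
net.link("event-1", "self", "I saw it and was happy")

print(net.neighbors("car"))
print(net.neighbors("event-1"))
```

The same data structure serves for both kinds of memory; what distinguishes the episodic node is that its links carry spatiotemporal context and a reference to the self.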

Relations Among the Forms of Memory

The three basic forms of knowledge -- procedural, declarative-semantic, and declarative-episodic -- are related to each other.

Because most knowledge comes from experience, most semantic knowledge starts out with an episodic character, representing one's first encounter with the object or event from which the knowledge is derived. However, as similar episodes accumulate, the contextual features of the individual episodic memories are lost (or, perhaps, blur together), resulting in a fragment of generalized, abstract, semantic memory.

Note, too, that the formation of a meaningful episodic memory depends on the existence of a pre-existing fund of semantic knowledge (about hippies and debutantes, for example).

Most procedural knowledge starts out in declarative form, as a list representing the steps in a skilled activity. Through repeated practice, and overlearning, the form in which this activity is represented changes from declarative to procedural. At this point, the skill is no longer conscious, and is executed automatically whenever the relevant goals and environmental conditions are instantiated. How much practice? Some idea can be found in what is known as the thousand-hour rule.

Finally, memory itself is a skilled activity. Thus, the encoding, storage, and retrieval of declarative memories depend on procedural knowledge -- knowledge of how to use our memories.

Memory and Metamemory

An interesting aspect of memory is metamemory, or our declarative knowledge about our own memory systems.

At one level, metamemory consists of our awareness of the (declarative) contents of our memory systems. This is the kind of knowledge that permits us to know, intuitively, that we do know our home telephone numbers, that we might still know the telephone number of a former lover (after all, you called him or her often enough in the past), that we might recognize the main telephone at the White House (if for no other reason than that it was mentioned on the "In This White House" episode of the NBC television series The West Wing, 2000, when Ainsley Hayes checks her Caller ID; in any event, it's 202-456-1414) and that we do not know Fidel Castro's telephone number. It is the kind of knowledge that permits us to say, when confronted with a familiar face at a cocktail party, that we don't recall the person's name but would recognize it if we heard it -- what is sometimes known as the feeling of knowing (FOK) experience. These judgments are usually accurate, but in making them we do not have to search the contents of memory.

At another level, metamemory consists of our understanding of the (procedural) principles by which memory operates. This is the kind of knowledge that permits us to know that we can remember 4 digits, but had better write down 11 digits; that rehearsing an item to ourselves helps us to remember it; that memory is better over short than long delays. Metamemory of this sort is quite accurate: when asked how they would go about remembering various things, even young children reveal a fairly sophisticated understanding of these principles.

Manifestations of Memory

In class, the lectures focus on declarative, episodic knowledge -- autobiographical memory, for personal experiences that occurred at particular times, and in particular places. Certain aspects of procedural knowledge, and of semantic memory, are covered in the lectures on thought and language.

In the laboratory, episodic memory is often studied by means of the verbal learning paradigm. In this procedure, the subject's task is to memorize one or more lists of familiar words. Consider, for example, an experiment that runs for three days. On the first two days, the subject is presented with two different lists of 25 words each: each list contains five items from each of five conceptual categories (e.g., four-legged animals, women's first names, etc.). The subject is given one study trial per list, and studies each list in a different room. On the third day, the subject's memory is tested: she is asked to remember the contents of each list. Note that the subject is not asked to add new words to her vocabulary -- this would be a semantic memory task. The subject may receive only a single study-test trial; alternatively, trials may be continued until the list is thoroughly memorized.

In any case, each session, each list, and each word constitutes an event: it occurs in a particular place, at a particular time -- the room in which the list is studied, the list in which the word is studied, and the position of the word on the list. Thus, the verbal learning paradigm provides a simple laboratory model for perceiving and remembering events in the real world.

There are many variants on this basic paradigm. For example, the subject may study words or pictures, sentences or paragraphs, slide sequences or film clips; he or she may be asked to remember odors or tastes, or sequences of tones. In any case, each item comprises a unique event that occurs in a particular spatiotemporal context -- an episodic memory.

Memory for such a list may be tested by a number of different methods.

  • In the method of free recall, the subject is simply asked to recall the items that were studied -- perhaps all the words together, or one list at a time. No further information is given, and no constraints are imposed on the manner of recall.
  • In the method of serial recall, the subject must recall the items in the order in which they were originally presented.
  • In the method of cued recall, the subject is offered specific prompts or hints concerning the list items -- for example, the first letter or first syllable of a word; the name of the category to which it belongs; or some closely associated word. These cues are intended to help remind the subject of list items.
  • In the method of recognition, the subject is asked to examine a test list consisting of items that actually appeared on the study list (targets or "old" items), and others that did not appear (lures, foils, or "new" items). The subject is asked to distinguish between targets and lures, and, often, to rate his or her confidence in this judgment.

Notice that all these tests are really variants on cued recall. There's always some cue, even though it may be pretty vague. In serial recall, each item on the list serves as a cue for remembering the one that follows. In recognition, the "cue" is the item itself.

Another commonly used verbal-learning procedure is called paired-associate learning -- a technique invented by Mary Whiton Calkins, who did graduate work with William James and became the first female president of the American Psychological Association. Here the items on the list consist of pairs of words instead of single words -- a stimulus term (A), analogous to the conditioned stimulus in classical conditioning, and a response term (B), analogous to the conditioned response (I don't believe these analogies, but the procedure is a holdover from a time when memory was studied as a variant on animal learning). Subjects study the A-B pairs as usual. At the time of test, they are prompted with the stimulus term (A), and asked to remember the response term (B). Again, the subject may receive only a single study-test trial; alternatively, trials may be continued until the list is thoroughly memorized. Paired-associate learning is also a variant on cued recall, where the stimulus term serves as a cue for the recall of the response term.

Now assume that a subject has memorized a list of paired-associates, A-B, and then is retested after some interval of time has passed. Usually, the subject will display some degree of memory failure: either a failure to produce the correct B response, or the production of an incorrect B response, when presented with stimulus A. Under these circumstances, a number of alternative memory tests may be conducted.

In relearning, the subject is asked to learn a second list of paired associates. Some of these are old, but forgotten, A-B pairs. Others are entirely new, and may be designated C-D pairs. It turns out that relearning of A-B pairs proceeds faster than learning of C-D pairs -- a phenomenon known as savings (recall a related discussion, in the lectures on learning, of savings in relearning after extinction -- again, note the parallels between human memory and animal learning).

The "lie detector" paradigm relies on psychophysiology -- the use of electronic devices to monitor physiological changes while the subject is engaged in some mental function. Examples include the electroencephalogram (EEG), for brain activity; the electrocardiogram (EKG), for cardiovascular activity; the electromyogram (EMG), for muscle activity; and the electrodermal response (EDR), for changes in the electrical conductance of the skin. If we present a subject with a list of items containing old, but forgotten, A-B pairs, and entirely new C-D pairs, we may well observe a differential pattern of psychophysiological response. Usually, as its name implies, the lie-detector is used to detect the willful withholding of information. But it can also be used to detect memory, since the only difference between A-B and C-D pairs is that the former are old while the latter are new.

In the bulk of these lectures we focus on episodic memory, memory for discrete events, as measured by recall or recognition, and ask: What makes the difference between an event that is remembered and one that is forgotten?

Principles of Conscious Recollection

Just as perception depends on memory, so memory depends on perception. Perception changes the contents of the memory system, leaving a trace or engram which persists long after the experience has passed.

Stage Analysis

We can understand the causes of remembering and forgetting in terms of three stages of memory processing:

  • In encoding, we create a record of perceptual activity that leaves a representation of an event in memory. This representation is known as the memory trace or engram.
  • In storage, a newly encoded trace is retained over time in a latent state.
  • In retrieval, a stored memory trace is activated so it can be used in some cognitive task.

Forgetting can involve a failure at any of these stages, alone or in combination. Thus, at least in principle, an event can be forgotten because a trace of that event was never encoded in the first place; or because it was lost from storage; or because it can't be retrieved.

Unfortunately, when we're dealing with autobiographical memory in the real world, we don't always have knowledge or control over these three stages. This is especially true for the encoding stage: because the experimenter was not usually present when the event occurred, he or she cannot check the accuracy of what the subject recalls. If I ask you what you had for dinner last Thursday night, how can I know that you're correct? The verbal-learning experiment, in which memory is tested for material "studied" in the laboratory, allows us to rigorously control the conditions of encoding, storage, and retrieval, so we can determine their effects on memory.



How do we lay down a trace of some event in memory? The general idea behind analyses of the encoding phase is that memory is a byproduct of perception. One cannot remember something that was not perceived in the first place, and what was perceived determines what will be subsequently remembered.


Traditionally, the encoding phase has been characterized in terms of rehearsal. In classic associationist learning theory, memories are associations "stamped in the mind" by repetition (recall Thorndike's Law of Exercise, which held that conditioned responses were strengthened by repetition).

In the first quantitative experiments on memory, Hermann Ebbinghaus (1885) studied lists of nonsense syllables -- pronounceable but meaningless letter strings consisting of a consonant, a vowel, and another consonant, such as TUL -- arranged in a strict serial order (Ebbinghaus, along with all other early psychologists, adhered to the traditional doctrine of association by contiguity). During the study phase, he would begin with the first syllable on the list (the stimulus or cue) and attempt to recall the second one (the response or target). This second syllable then became the stimulus for the third syllable, and so on, forming a chain of associations from the beginning to the end of the list. In this way, Ebbinghaus believed, memories were formed by means of rehearsal -- repeating pairs of items over and over, until an engram representing the association between them was "stamped in".

In one experiment, Ebbinghaus memorized lists of 16 nonsense syllables, repeated from 8 to 64 times. Then, after a retention interval of 24 hours, he measured the time it took him to relearn each list to a criterion of one perfect repetition. His measure of memory was savings in relearning the list (in addition to the nonsense syllable, this measurement technique was invented by Ebbinghaus, long before it became part of the study of conditioning in animals). These savings were compared to the time required to learn such a list from scratch (in other words, "savings" in "relearning" a list that had received 0 prior repetitions). Ebbinghaus found that the more repetitions a list received during the study (encoding) phase, the less time was needed to relearn it at the test (retrieval) phase. Thus, he inferred that memory strength increases with rehearsal.
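The savings measure can be made concrete with a little arithmetic. The sketch below assumes the conventional percent-savings formula (time saved in relearning, as a percentage of the time needed to learn a fresh list); the function name and the timings are hypothetical, chosen only for illustration, and are not Ebbinghaus's data.

```python
def savings_score(original_time, relearning_time):
    """Percent savings: time saved in relearning, relative to learning from scratch."""
    if original_time <= 0:
        raise ValueError("original_time must be positive")
    return 100.0 * (original_time - relearning_time) / original_time


# Hypothetical timings in seconds (illustration only, not real data).
baseline = 1200.0  # time to learn a list that received 0 prior repetitions
for relearning in (1200.0, 900.0, 600.0):
    print(relearning, savings_score(baseline, relearning))
```

A score of 0 means relearning took as long as learning from scratch; higher scores mean that more of the original learning was retained.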

Ebbinghaus's findings were consistent with the prevailing theory at the time, but it is now clear that encoding is not merely a function of rehearsal. In a more recent experiment, Craik and Watkins, at the University of Toronto, presented subjects with a list of words, one at a time, each item presented at a regular interval. The subject's task was to detect any word meeting a particular criterion -- e.g., words that began with the letter P -- and be able to report, on demand, the most recent such word. When a new "P" word appeared, they could forget the previous one, and they could also forget all the intervening, "non-P" words. Craik and Watkins also varied the number of noncritical "non-P" items between the critical "P" words -- thus varying the amount of rehearsal devoted to each target.
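To see how the spacing of critical words controls rehearsal, here is a minimal sketch of the bookkeeping. It assumes that the time a critical word must be held in mind can be indexed by the number of noncritical words that follow it before the next critical word; the function name and the word stream are hypothetical, not the materials Craik and Watkins used.

```python
def rehearsal_intervals(words, letter="p"):
    """Pair each critical word with the number of noncritical words that
    follow it before the next critical word (or the end of the list)."""
    intervals = []
    current = None  # the critical word currently being held in mind
    count = 0       # noncritical words seen since that critical word
    for word in words:
        if word.lower().startswith(letter):
            if current is not None:
                intervals.append((current, count))
            current, count = word, 0
        elif current is not None:
            count += 1
    if current is not None:
        intervals.append((current, count))
    return intervals


# Hypothetical word stream, for illustration only.
stream = ["table", "pencil", "river", "cloud", "piano", "horse", "plate"]
print(rehearsal_intervals(stream))
```

Here "pencil" has to be held across two intervening words, "piano" across one, and "plate" across none.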

At the end of the experiment, Craik and Watkins surprised their subjects with a test of memory, in which they were instructed to recall all the words they had seen. Not surprisingly, recall for the noncritical items was very poor -- no rehearsal, no memory. However, for the critical targets, recall was still very low, even though each critical word had received some rehearsal. Moreover, there was no correlation between recall and the amount of rehearsal items had received.


Based on results such as these, Craik and Watkins argued that there is a difference between maintenance rehearsal, or rote repetition, which maintains traces in an active state, and elaborative rehearsal, which links new items to pre-existing knowledge stored in memory, and which lays down a lasting trace in long-term memory. Rote rehearsal may be fine for remembering a telephone number on the way to the phone, but in their view, memories are only permanently stored if they are subject to some degree of elaboration.

The importance of elaboration is illustrated by experiments by Craik and Lockhart, and Craik and Tulving (all at the University of Toronto) on "depth (or levels) of processing" in memory. In one experiment, subjects were presented with a list of words, and asked to perform one of four kinds of orienting tasks:

  • orthographic, in which they were asked to make a judgment about the physical characteristics of the printed word -- e.g., its color, upper or lower case, typeface;
  • phonemic, in which they were asked to make a judgment about the acoustic characteristics of the spoken word -- e.g., whether it rhymed with some other word;
  • semantic, in which they were asked to make a judgment about the meaning of the word -- e.g., whether it was similar to another word, or belonged in some category, or was a semantic associate of another word; and
  • sentence, in which they were asked to judge whether the word made sense in a particular sentence -- a judgment that is based on syntax as well as semantics.

Again, at the end of the experiment the subjects were surprised with tests of recall or recognition of the words they had judged. Craik and his colleagues discovered that memory was relatively poor for words that had been subject to orthographic or phonemic orienting tasks, compared to the semantic or sentence tasks. In their interpretation, each orienting task required a different kind of processing at the time of encoding. The orthographic and phonemic tasks required only "shallow" processing, while the semantic and sentence tasks required "deeper" processing. The deeper the processing, the better the memory.

Taken together, these studies illustrate the elaboration principle in memory:

The Elaboration Principle

Memory for an event is a function of the extent to which that event is analyzed and related to pre-existing knowledge at the time of encoding.

We can define elaboration in a number of ways:

  • the amount of attention paid to the event;
  • the number of attributes or features of the event processed at the time of perception; or
  • the number of links formed between the event and other, pre-existing knowledge.

All of this follows from the basic idea that memory is a product of perceptual analysis.

So now we understand that there are at least two different modes of processing at the time of encoding.

  • Rote maintenance rehearsal, in which we mentally repeat an event over and over without adding anything to it, maintains an item in "immediate" memory (short-term, primary, or working memory), but does not create a particularly long-lasting trace. Creating a lasting trace requires that we add something to the trace at the time of encoding.
  • Elaborative rehearsal adds something to the trace by connecting the new memory up to things we already know. This "added value" is critical to creating a long-lasting memory trace.


The elaboration principle applies to individual list items, and says that retention is improved if we connect them to previous knowledge. But retention is also improved if links are established between individual list items, connecting them to each other. Evidence for this idea comes from studies of organizational activity in memory. When we compare the order in which a list of words is recalled to the order in which it was originally presented, we usually observe some reorganization of the list items. There are many different types of organizational activity.

For example, if one studied a list of words like those on the left, recall might look like the list on the right. This phenomenon is known as associative clustering, because words are clustered together according to their associative relationships, such as boy-girl.

To take another example, if one studied a list of words like those on the left, recall might look like the list on the right. This phenomenon is known as category clustering, because words are clustered together according to their shared membership in some category, such as foot-finger-mouth.

Associative and category clustering are both based on the meanings of the words involved, but they're different. Associative relationships are determined by a simple word-association test: the experimenter presents a word, and the subject responds with the first word that comes to mind. An example might be oyster-pearl. Oyster and pearl are associatively related, but they do not belong to the same conceptual category. Conceptual relationships are determined by category membership. For example, eagle and pigeon belong to the same conceptual category, but they're not associatively related: when an experimenter says eagle, nobody says pigeon.

An important early experiment by Bousfield and Cohen (1953) showed that both free recall and category clustering increased over repeated study-test cycles. In this experiment, Bousfield was careful to use category members that were not also associatively related, to rule out the possibility that category clustering was based simply on associative relationships among the items.
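One simple way to put a number on category clustering is to count how often two successively recalled words come from the same category. The sketch below does just that; it is an illustration under stated assumptions, not necessarily the index Bousfield and Cohen computed. The function name and the category table are hypothetical, built from the body-part and bird examples mentioned above.

```python
def category_repetitions(recall_order, category_of):
    """Count adjacent pairs in the recall order that share a category."""
    return sum(
        1
        for first, second in zip(recall_order, recall_order[1:])
        if category_of[first] == category_of[second]
    )


# Hypothetical category assignments, for illustration only.
category_of = {
    "foot": "body part", "finger": "body part", "mouth": "body part",
    "eagle": "bird", "pigeon": "bird",
}
presented = ["foot", "eagle", "finger", "pigeon", "mouth"]
recalled = ["foot", "finger", "mouth", "eagle", "pigeon"]

print(category_repetitions(presented, category_of))  # categories alternate at study
print(category_repetitions(recalled, category_of))   # categories grouped at recall
```

If the count is higher in the recall order than in the presentation order, the subject has reorganized the list by category.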

Both associative and category clustering reflect organizational activity on the part of the subject. Even when there are no associative or categorical relations built into the list, subjects can find idiosyncratic relations among items -- a phenomenon known as subjective organization, first noted by Endel Tulving (1962). Presented with a list like this one, in which none of the items are associatively or conceptually related to any of the others, subjects might link them together through an image or a narrative:

A boy is sitting at a table, with his dog at his feet, eating a pepper, looking out the window through some iron bars at the stars in the blue night sky.

Again, the subject is adding something to each item, linking related items together by their associative or conceptual relationships.

Findings such as these illustrate the organization principle in memory:

The Organization Principle

Memory for an event is a function of the extent to which that event is related to other events at the time of encoding.

The organization principle illustrates an important distinction among types of processing:

  • single-item processing, in which individual list items are related to pre-existing knowledge; and
  • inter-item processing, in which individual list items are related to each other.

Both elaboration and organization reflect cognitive activity on the part of the subject -- what the British psychologist F.C. Bartlett called "effort after meaning". The more effort after meaning, the better the memory.

So now we have three ways to process an event at the time of encoding:

  • rote maintenance rehearsal, which maintains individual items in a highly active state;
  • elaboration, which connects individual items to pre-existing knowledge; and
  • organization, which connects individual items to each other (and also requires elaboration, to pre-existing knowledge about categories).

The more elaboration and organization at the time of encoding, the better the memory at the time of retrieval.


Assume that a memory trace is adequately encoded in memory, well elaborated and organized. Once encoded, that trace is, at least in principle, available for use. It remains activated so long as attention is devoted to it (e.g., by rote maintenance rehearsal), and so has a high probability of being retrieved from primary (short-term or working) memory.


But when the trace is no longer the object of active attention, its activation fades. Now, there is a lower probability of retrieval from secondary (long-term) memory. Thus, the probability of retrieval progressively diminishes over time.

This commonsense point is obvious to anyone who has ever forgotten anything, but it was initially documented by Ebbinghaus, in another of his pioneering studies of memory. In this study, Ebbinghaus used essentially the same method as in his study of rehearsal, learning by rote a list of nonsense syllables. After reaching a criterion of learning (one perfect repetition of the list), he allowed an interval of time to pass, and then measured his own memory by savings in relearning. The result was progressive forgetting, with memory loss increasing with the length of the retention interval.

Ebbinghaus's result illustrates the time-dependency principle in memory:

The Time-Dependency Principle

Memory for an event is an inverse (negative) function of the length of the retention interval between encoding and retrieval.

The Time-Dependency Principle is beautifully illustrated by the five drawings in the "Memory" series (1963) by Robert Morris (the third drawing in the series is illustrated here).

"These drawings use words rather than representational images to analyze the artist's preoccupation with the workings of physical systems and psychological states.... The five drawings in the Memory series... trace the process of decay through the artist's increasingly faulty recapitulation of a text that has been previously committed to memory" (from Inability to Endure or Deny the World: Representation and Text in the Work of Robert Morris by Terrie Sultan (Washington, D.C.: Corcoran Gallery of Art, 1990)).

There are exceptions to the time-dependency principle. In the phenomena of reminiscence and hypermnesia, memory actually seems to improve over time. But for present purposes, let us focus on time-dependent forgetting.

Time-dependency is a commonsense point. The real scientific problem lies in discovering what happens over the retention interval to produce time-dependency. Theories of forgetting have generally focused on four possibilities:

  • Passive decay of the memory trace. The idea here is that without active rehearsal, memory simply fades over time, the way an echo fades in a building, or the way a photograph fades with exposure to the sun. Decay is a variant on Thorndike's Law of Exercise, or the principle of "use it or lose it": unless the memory is rehearsed, it is lost from storage.
  • Consolidation failure after encoding. This variant on decay theory assumes that the encoding of a new memory doesn't happen instantaneously, at the moment of perception, but rather takes time. Accordingly, certain events, like the encoding of new memory traces, can interrupt this time-dependent process.
  • Displacement of traces from storage. Assuming that consolidation has been completed, it is possible that newly encoded memories can erase old memories from storage, in much the same way that recording a new message on a telephone answering machine erases previous messages.
  • Interference among stored memory traces. Here, the idea is that nothing happens to disrupt the encoding, consolidation, and storage processes, but that new memories, acquired over the retention interval, somehow get in the way of old memories. Whereas the decay, consolidation, and displacement theories interpret time-dependent forgetting as a failure of encoding or storage, the interference theory interprets it as a failure of retrieval.

As it turns out, each of these principles applies to a different aspect of memory.

Information is lost from the sensory registers, such as the icon and the echo, by passive decay, as well as by displacement of old information by newly arriving information.

Information in primary (short-term) or working memory may also be subject to decay, once the subject stops active rehearsal. But because these are limited-capacity stores, old information is also subject to displacement by newly arriving information.

Some evidence for consolidation failure in secondary or long-term memory is provided by the phenomenon of traumatic retrograde amnesia. If subjects are asked to write down a list of every event that they remember in their lives (and don't think this hasn't been done!), we would ordinarily observe a temporal gradient in memory -- that is, more events would be remembered from the recent past than from the remote past. But following a concussive blow to the head -- in which the skull is struck or shaken, injuring the brain, and resulting in a loss of consciousness -- the victim may lose his or her memories for events in the very recent past. This, by itself, is evidence of some kind of consolidation failure -- the argument being that these recent memories were particularly vulnerable to destruction because the consolidation process had not yet been completed. It's a good argument, but the only problem is that many of those memories will eventually come back, indicating that the amnesia entails a temporary disruption of retrieval -- which is not what a consolidation failure is supposed to be all about. Still, memories for the moments to minutes immediately prior to the accident appear to be permanently lost, perhaps due to an interruption of consolidation. This is known as a final or residual retrograde amnesia.

[Figures: Temporal Gradient in Memory; Traumatic Retrograde Amnesia; Recovery from Traumatic Retrograde Amnesia; "Final" Residual Amnesia Following Recovery]

Traumatic retrograde amnesia is hard to study experimentally, because we can't administer concussive blows to subjects' heads. But a laboratory model for traumatic retrograde amnesia may be found in the amnesia that follows a dose of electroconvulsive therapy (ECT), a common (and surprisingly effective) treatment for some forms of depression. In ECT, electrodes are placed over the patient's temples (in bilateral ECT), or over the temple of the non-dominant hemisphere and the vertex of the skull (in unilateral ECT), and a current is applied that passes through the patient's brain and induces a convulsive seizure very much like what is observed in epilepsy. The treatment does not cause pain (because the patient is anesthetized), but it does render the patient amnesic for events that have occurred in the recent past.

For example, in a study by Squire & Chase (1975), patients were given a course of ECT and then asked to remember news events (such as the names of TV shows or winners of famous horse races). Compared to patients who did not receive ECT, the ECT patients had difficulty remembering events from the immediately preceding two years or so. But further back than that, the two groups were completely comparable in terms of their memories. The amnesia is retrograde in nature, because it works backward in time. By contrast, the amnesia in the case of Patient H.M. was anterograde in nature, because it affected memory for events that occurred after the brain injury. Still, the post-ECT retrograde amnesia was temporally graded, because it affected very recent memories more strongly than remote ones.

Patients can also recover from post-ECT retrograde amnesia -- though they also show a residual amnesia covering the moments immediately before the ECT was delivered.

The pattern of post-traumatic retrograde amnesia, and the existence of a final amnesia even after a substantial period of recovery, has led some theorists to propose that there are two kinds of consolidation.

  • Short-term consolidation is essentially a byproduct of encoding. It occurs within seconds of the event, and its disruption causes anterograde amnesia. In these terms, Patient H.M. suffers from a deficit in short-term consolidation.
  • Long-term consolidation, by contrast, is mediated by a process that transpires over a more substantial period of time -- at least hours, perhaps days, after encoding. Its disruption is responsible for retrograde amnesia. There is now considerable evidence that long-term consolidation is facilitated by a period of sleep after learning.


However, most theories of time-dependent forgetting from secondary memory emphasize interference, which is often studied using variants on the paired-associate learning paradigm described earlier.

  • Recall that in the standard paired-associate learning experiment, the subject is asked to learn a list of items, designated the A-B list. In studies of interference, the subject is then asked to learn a second list of paired associates. Some of these items are entirely new, and are designated C-D items. Others are composed of an old stimulus term paired with a new response term, and are designated A-D items. After the A-D items of the second list are memorized, the subject may have difficulty remembering the A-B items from the first list. Apparently, memory for the A-D items from the second list interferes with memory for the A-B items from the first list. This phenomenon is known as retroactive inhibition, or RI, because the inhibition of one memory by another acts backward in time. Note, however, that RI could be caused by factors other than interference. Traces of the A-D list could disrupt consolidation of the A-B list, or they could displace A-B items from storage.

  • However, it is also commonly observed that A-D items are learned more slowly than C-D items. Apparently, memory for the A-B items from the first list interferes with learning of the A-D items on the second list. This phenomenon is known as proactive inhibition, or PI, because the interference acts forward in time. PI cannot be produced by consolidation failure or displacement, because the inhibiting memories are already in storage. PI must be a phenomenon of interference. Because RI could result from interference, and PI must reflect interference, interference has become a general account of time-dependent forgetting in secondary memory.

Inhibition and Interference

Psychologists sometimes use the terms inhibition and interference interchangeably, causing considerable confusion among new students (I did this myself, in the first draft of this lecture). Technically, retroactive and proactive inhibition are phenomena of memory. Retroactive and proactive interference are the mechanisms which, respectively, explain the phenomena of inhibition.

The general idea behind the doctrine of interference is that once encoded (with perhaps a little extra time needed for consolidation), traces are permanently stored in memory, and forgetting occurs by virtue of the mutual interference among accumulated memories. This leads to the paradox of interference: the more you know, the harder it is to retrieve any particular item of information. Although this may seem counterintuitive, it follows from the principle of interference, and is demonstrated in studies that measure retrieval in terms of response latencies rather than accuracy.

In one experiment, subjects learned a list of simple facts about people and their locations, such as

  • The doctor is in the bank
  • The fireman is in the park
  • The lawyer is in the church
  • The lawyer is in the park

Note that there was only one fact about the doctor and one fact about the bank; but there were two facts about the lawyer, and two facts about the park. In this way, the experimenters varied the number of facts learned about each person and each location. Subjects memorized the list until they could recall it perfectly, and then were asked to distinguish between old and new sentences on a recognition test. The subjects rarely made a mistake on this recognition test, but their response latencies for both true and false facts varied according to the number of facts they had learned about each person and each location. The more they knew, the more time it took them to access any particular item of information.

This pattern of findings is sometimes referred to as the fan effect. In associative network models of memory, each concept is represented by a node, and nodes are linked to form propositions representing facts. Thus, in the example given, there would be one associative link between the node representing the doctor and the node representing the bank; one associative link between the node representing the fireman and the node representing the park; but two associative links from the node representing the lawyer -- one to the node representing the church and another to a node representing the park. In these associative network models, it is assumed that activation spreads from the node representing the subject (e.g., the lawyer) down associative links to nodes representing the objects (e.g., the church and the park). But this activation spreads serially, down one link at a time. So the more links that fan out from any node, the longer it will take to retrieve the information represented by any one of those links. Because there are two links fanning out from the lawyer node, but only one link fanning out from the doctor and fireman nodes, the models predict that it will take longer to retrieve information about the lawyer than about the doctor or the fireman. That's the fan effect. And you've just learned your first computational model of memory!
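The network described above is small enough to write down directly. The following sketch stores the four example facts as links between person nodes and location nodes and counts the links fanning out of each node. The function names are hypothetical, and the sketch only computes the fan; it makes no attempt to predict actual response latencies.

```python
from collections import defaultdict

# The four facts from the example, as (person, location) propositions.
facts = [
    ("doctor", "bank"),
    ("fireman", "park"),
    ("lawyer", "church"),
    ("lawyer", "park"),
]


def build_links(facts):
    """Return two mappings: person -> locations and location -> persons."""
    person_links = defaultdict(set)
    location_links = defaultdict(set)
    for person, location in facts:
        person_links[person].add(location)
        location_links[location].add(person)
    return person_links, location_links


def fan(node, links):
    """Number of associative links fanning out from a node."""
    return len(links.get(node, ()))


person_links, location_links = build_links(facts)
for person in ("doctor", "fireman", "lawyer"):
    print(person, fan(person, person_links))
for location in ("bank", "church", "park"):
    print(location, fan(location, location_links))
```

On the serial-search assumption described above, a larger fan means more links to traverse, and so a longer predicted retrieval time for facts about the lawyer or the park than for facts about the doctor or the bank.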

The concept of interference illustrates the distinction between the availability of a trace in storage, and the accessibility of that trace during retrieval (Tulving & Pearlstone, 1966). Availability has to do with the existence of a trace in memory storage. Traces are available in memory by virtue of having been encoded and retained in storage. But not all memories available in storage are accessible at any particular point in time. Decay, consolidation failure, and displacement are, essentially, effects on availability. Interference is an effect on accessibility. The role of interference indicates that secondary memory is essentially permanent: once encoding has been accomplished, forgetting is largely a function of retrieval factors.


Assuming that a trace has been adequately encoded during perception, and not lost from storage over the subsequent retention interval, information in memory storage must be retrieved in order for the person to answer queries about some event in his or her past.


These queries focus the process of information retrieval by serving as retrieval cues, or reminders of the events in question. Remembering succeeds when the subject is able to add information garnered from memory to information contained in the query itself.

The role of cues in retrieval is illustrated by comparing the results of the various types of memory tests described earlier. In an experiment by Tulving and Watkins (1975), subjects studied a list of five-letter words. On the retention test, they were given cues consisting of 0-5 letters and asked to recall the corresponding words from the list. The "zero-letter" condition, of course, corresponds to free recall, while the "five-letter" condition corresponds to recognition. Recognition was better than free recall, with cued recall falling somewhere in between.
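The continuum from free recall to recognition can be illustrated by generating the cues themselves. This is a minimal sketch that assumes the cue letters are the initial letters of the word, with blanks standing in for the rest; that detail, the function name, and the example word are assumptions made for illustration, not a description of the actual materials.

```python
def letter_cue(word, n_letters):
    """Show the first n_letters of a word and blank out the remainder."""
    if not 0 <= n_letters <= len(word):
        raise ValueError("n_letters must be between 0 and len(word)")
    return word[:n_letters] + "_" * (len(word) - n_letters)


# Hypothetical five-letter target word, for illustration only.
for n in range(6):
    print(n, letter_cue("table", n))
```

With zero letters the cue carries no information about the target (free recall); with all five letters the cue is a copy of the target (recognition); the intermediate cases are graded forms of cued recall.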

The results of the experiment may be understood by analyzing the kinds of cues provided by each kind of test.

  • In free recall, the query is quite impoverished. It specifies the context in which some event occurred, but provides no additional information about the target memory. The subject then has to retrieve the event without much help. A memory test such as "What were the items on the list you just studied?" is analogous to a question like "What did you do on your summer vacation?". The subject has to do all the work.
  • In cued recall, the query provides more cues about the target memory. The query describes the context, as in free recall, but also describes the general features of the event, so that the subject has a lot more information about what to retrieve. A memory test such as "Were there any color names on the list?" is analogous to a question like "Did you go to any national parks on your summer vacation? Which ones?". Paired-associate learning is a variant on cued recall, where the stimulus term serves as a cue for recall of the response term.
  • In recognition, the query contains a copy of the target event itself (for that reason, recognition cues are often called copy cues). Not a literal copy, necessarily, but still a fairly complete representation. A memory test such as "Was orange or purple on the list?" is analogous to a question like "Did you go to Yosemite or Yellowstone on your summer vacation?".

Don't Look Back! Answer from Memory

  1. What were the items in the list that I used to illustrate category clustering in memory? This is a test of free recall: the query specifies the spatiotemporal context in which the event occurred, and you have to fill in the content of the list.
  2. What were the categories in the list? This is also a test of free recall, directed not at the items but at their categories ("chunks" are also items, after all).
  3. Were some of the items animal names or bird names? This is a test of recognition, also directed at the categories.
  4. What were the animal names on the list? This is a test of cued recall.
  5. Was one of the words lion or tiger? This is a test of recognition, directed at the specific items on the list.

In a sense, all recall is cued recall, except that in free recall the cues are not especially informative (of course, the "freest" free recall of all would be the simple query, "Tell me everything you remember"). In that same sense, recognition is also a variant on cued recall, in which the "cues" are a copy of the event being remembered.

The comparison among free recall, cued recall, and recognition illustrates the principle of cue-dependent memory:

The Cue-Dependency Principle

Memory for an event is a function of the amount of information supplied by the retrieval cue.

Memory is not trace-dependent, a matter of encoding, storage, and availability. Rather, it is cue-dependent, a matter of accessibility and retrieval. Retrieval begins with a retrieval cue, either presented by the environment (e.g., "Where did you put the car keys?") or generated by the subject (e.g., "Where did I put the car keys?"). The probability of successfully retrieving a memory depends on the information value of the retrieval cue. The more information contained in the cue, the more likely it is that retrieval will succeed. An encoding may be good or poor, but whatever is encoded remains permanently in storage. The problem for remembering is to gain access to stored information, and the retrieval of stored information is a function of the cues contained in the query.

Encoding Specificity

Usually, we consider encoding, storage, and retrieval to be separate phases of memory processing. But in fact, these stages interact in interesting ways. Rich retrieval cues can compensate for a poor encoding, and a deep encoding permits access to memories even with very impoverished retrieval cues. The mnemonic devices used by ancient Greek and Roman orators, such as the method of loci (and the more recently invented pegword system), provide both a background for elaboration and organization at the time of encoding, and a framework to guide retrieval. In this way, encoding processes set the stage for effective retrieval processes.

Consider the following cued-recall test (and no fair looking back!):

Was there a girl's or a woman's first name on any of the lists presented in this Lecture Supplement?

Most people will probably say "no": that last list consisted of the names of clothing, body parts, animals, and colors. But now look back at the list used to illustrate category clustering in recall. Amber is a color, but it's also a girl's or woman's first name (think of Amber Tamblyn, the actress, or Amber Alerts concerning children who have been kidnapped). So there was a girl's name on the list; you just didn't think of it that way. If you had been asked whether there was a color name on the list, you probably would have remembered orange, purple, and amber.

And that's the point. Both girl's name and color are perfectly appropriate retrieval cues, but one is more effective than the other because the word amber was encoded as a color, not as a girl's name. How an item is encoded determines how it will be retrieved. If it's encoded as a color, it will be most easily retrieved as a color. If it's encoded as a girl's name, it will be most easily retrieved as a girl's name.

Considerations such as this lead us to the principle of encoding specificity in memory:

The Encoding Specificity Principle

Memory of an event is a function of the degree to which the cues present at the time of retrieval match or overlap with the cues present at the time of encoding.

The encoding specificity principle is sometimes known as the principle of transfer-appropriate processing, which holds that memory is best when the cognitive processes deployed at the time of encoding match the cognitive processes deployed at the time of retrieval.

Encoding specificity says that what is stored is determined by how an event is encoded: memory is a byproduct of perception; and how an event is encoded determines how it will be retrieved -- what retrieval cues will be effective in gaining access to the memory trace. Memories are accessible to the extent that the information supplied by the retrieval cue matches the information encoded in the memory trace.

As such, encoding specificity sets limits on cue-dependency. The cue-dependency principle assumes that encoded traces remain permanently available in storage, and that the problem in secondary memory is to gain access to available trace information. Cue-dependency then states that accessibility is dependent on the amount of information contained in the retrieval cue. The query must supply enough cue information to ensure contact with the trace stored in memory -- and to compensate, if necessary, for a poor encoding. But encoding specificity makes it clear that retrieval is not simply a matter of the sheer quantity of information in the cue. It must be the right quality, the right kind, as well. For retrieval to be effective, the information supplied by the retrieval cue must match the information encoded with the memory trace at the outset.
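The contrast between the quantity and the kind of cue information can be made concrete with a toy sketch. Assume, purely for illustration, that a memory trace and a retrieval cue are each a set of feature labels, and that the effectiveness of a cue is the number of features it shares with the trace. The feature labels and the `overlap` function below are hypothetical, introduced only for this example.

```python
def overlap(cue, trace):
    """Count the features that the retrieval cue shares with the encoded trace."""
    return len(cue & trace)


# Hypothetical encoding of the word "amber" as it was studied: as a color.
trace_amber = {"word:amber", "context:study-list", "category:color"}

# A short cue of the right kind.
cue_color = {"context:study-list", "category:color"}

# A longer cue of the wrong kind: more information, but little of it matches.
cue_girls_name = {
    "context:study-list",
    "category:first-name",
    "gender:female",
    "type:person",
}

print(len(cue_color), overlap(cue_color, trace_amber))
print(len(cue_girls_name), overlap(cue_girls_name, trace_amber))
```

In this sketch the girl's-name cue carries more features than the color cue, yet it matches the trace less well, which is the sense in which the right kind of information matters as well as the amount.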

Encoding specificity explains one of the most fascinating phenomena of memory: state-dependency, or the fact that memory is best when subjects' physiological states at the time of retrieval match their physiological states at the time of encoding.

State-dependency was first observed in animals (by Overton, 1964), but it can also be observed in humans in verbal-learning experiments employing a "Noah's Ark", or "two-by-two", design. In the first phase, a word list is studied while one group of subjects is under the influence of some psychoactive drug, such as a barbiturate sedative, and another group is drug-free. Then each of these groups is divided in half, and each of these subgroups is tested either under the influence of the same drug or not. All psychoactive drugs alter the physiology of the nervous system -- and, in fact, with certain exceptions, they impair cognitive functions such as encoding and retrieval. For that reason, memory is best if there are no drugs involved at all. Overall, however, memory is best when encoding and retrieval take place in the same physiological state. This is the phenomenon of "state-dependent" memory.

State-dependency was observed in a study of the effects of the drug Ritalin on learning and memory in children with attention deficit hyperactivity disorder (ADHD). Although Ritalin is an amphetamine derivative, and thus a central nervous system stimulant, it has the paradoxical effect of enabling some ADHD patients to better focus their attention, thus improving learning and memory (Ritalin, like all psychoactive drugs, impairs learning and memory in individuals who do not suffer from ADHD). The children were engaged in a "zoo location" task in which they studied a list of paired associates consisting of an animal name and a city name, such as TIGER-TORONTO (the investigators were both working in Canada at the time). For the memory test, the children were supplied with the animal name, and asked to supply the corresponding city name. Half the children were on Ritalin at the time of study, and half the children were on Ritalin at the time of test. Memory was measured in terms of the number of errors the children made, so the fewer the errors the better. Children who studied under the influence of the drug made fewer errors when they were also tested under the influence of the drug, and children who studied off the drug made fewer errors when they were also tested off the drug. In other words, memory depended on congruence between drug states at encoding and retrieval.

Similar effects have been obtained in studies employing a wide variety of psychoactive drugs. In general, state-dependency is strongest with central nervous system depressants such as the barbiturates, alcohol, and benzodiazepines (such as Librium), though nicotine also produces relatively strong effects (another good reason not to smoke!). Moderately strong effects have been observed with narcotics and marijuana. Fortunately for many college students, no state dependency has been observed with caffeine -- though this may have been because the subjects' systems were so heavily loaded with caffeine that their physiological states didn't really change between study and test! Aspirin and lithium (a drug sometimes used in the treatment of bipolar affective disorder, or so-called "manic-depressive" illness) do not seem to produce state-dependency.

Analogous findings have been obtained for emotional states. In the phenomenon of mood-dependent memory, subjects study items in happy or sad moods (induced through music or other techniques), and then are tested in happy or sad moods. Memory is best when the subject's mood at the time of retrieval matches his or her mood at the time of encoding.

Similar findings have even been obtained in different physical settings, as in the phenomenon of environment-dependent memory. In one experiment, U.S. Navy SCUBA divers studied a list of words on land or 25 feet under water, and then were tested underwater or on land. Memory was best when the environment of retrieval matched the environment of encoding.

State-dependency, mood-dependency, and environment-dependency are all variations on a theme of context-dependency in memory. Memory varies across internal contexts such as drug states and mood states, as well as across external contexts such as environmental settings. In each case, memory is best when the context of retrieval matches the context of encoding. Context-dependency in memory is predicted by the encoding specificity principle, and it shows how encoding and retrieval factors work together to affect memory.

Encoding and retrieval processes are not really independent of each other: encoding sets the stage for retrieval, and retrieval capitalizes on encoding.

Schematic Processing

The complementary effects of encoding and retrieval processes are also illustrated by schematic processes in memory. Recall (from the discussion of perceptual construction) that a schema (plural schemata or schemas) is an organized and generalized knowledge structure representing the person's beliefs and expectations about some aspect of the world. Schemata provide the cognitive basis for perception and memory, structuring perceptual experience, guiding encoding and retrieval, facilitating processing, and filling in information that is missing from the stimulus or the trace.

The schematic effects on memory may be illustrated in a variant on the verbal-learning paradigm known as the "person memory" experiment, in which subjects study and remember information about specific individuals. In one kind of person memory study, subjects are first presented with a trait ensemble listing the personality traits of a person, such as:

Judy is

  • intelligent,
  • intellectually sophisticated,
  • artistically sensitive,
  • refined,
  • imaginative, and
  • witty.

The point of this phase is to establish a general impression of the target person (in this case, Judy) in the mind of the subject - in other words, a schema for Judy. Then we present information about that person's specific behaviors.

  • Some of these behaviors are schema-congruent, in that they have a high probability of occurring, given the impression we have of the person -- such as
    • Judy won the chess tournament.
    • Judy attended the symphony concert.
  • Other behaviors are schema-incongruent, in that they have a low probability of occurring, given the impression we have of the person -- such as
    • Judy made the same mistake three times.
    • Judy had difficulty understanding the daytime television show.
  • Still other behaviors are schema-irrelevant, in that they are not predicted by the schema one way or another -- such as:
    • Judy ordered a sandwich for lunch.
    • Judy took the elevator to the third floor.

Expressed in terms of conditional probabilities,

  • for schema-congruent behaviors, p(behavior | schema) > .50, approaching 1.0.
  • for schema-incongruent behaviors, p(behavior | schema) < .50, approaching 0.0.
  • for schema-irrelevant behaviors, p(behavior | schema) = .50, just chance.
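The three cases above can be written as a small classification rule. The sketch below is only illustrative: the function name `schema_relation` and the `tolerance` band around .50 are assumptions (real probability estimates are rarely exactly .50), and the probabilities assigned to Judy's behaviors are made-up values, not data.

```python
def schema_relation(p_behavior_given_schema, tolerance=0.05):
    """Classify a behavior by p(behavior | schema).

    `tolerance` is a hypothetical band around .50 within which the schema
    is treated as making no prediction either way.
    """
    p = p_behavior_given_schema
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    if abs(p - 0.5) <= tolerance:
        return "schema-irrelevant"
    return "schema-congruent" if p > 0.5 else "schema-incongruent"


# Made-up probabilities for three of Judy's behaviors.
examples = {
    "won the chess tournament": 0.90,
    "made the same mistake three times": 0.10,
    "ordered a sandwich for lunch": 0.50,
}
for behavior, p in examples.items():
    print(behavior, "->", schema_relation(p))
```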

When we plot free recall for the person's behaviors against schema-congruence, we observe a U-shaped function: schema-relevant behaviors (whether congruent or incongruent) are remembered better than schema-irrelevant behaviors, and schema-incongruent behaviors are remembered best of all.

These results illustrate the schematic processing principle:

The Schematic Processing Principle

Memory for an event is a function of the relationship between that event and pre-existing schemata (representing prior knowledge, expectations, and beliefs).

We can understand the role of schemata in memory with reference to the stage analysis of memory.

  • Schema-congruent events fit right into the subject's prevailing schemata, and these schemata provide extra cues at the time of retrieval, making retrieval more likely to succeed. If you want to know what a person has done, it helps to remember what kind of person he or she is.
  • Schema-incongruent events, however, do not fit in. They are surprising, not predicted by what we know, and therefore must be explained. This explanatory activity, in turn, results in more elaborate processing at the time of encoding, and better recall at retrieval.
  • Schema-irrelevant events are not unexpected, and so do not receive much elaboration at the time of encoding; nor does the schema provide effective cues to their retrieval. Because of this double disadvantage, schema-irrelevant events are poorly recalled.

Memory as Reconstruction

The traditional study of memory, whether explicit or implicit, is often, at least tacitly, based on what might be called the library metaphor of memory. In this metaphor,

  • Memories are contained in traces, which are like books which contain information.
  • Memories are encoded, just as books are purchased, cataloged, and shelved.
  • Memories are stored, just as books remain on the shelf until the information in them is needed.
  • Memories are retrieved, just as books are looked up in a catalog, found on the shelf, checked out, and read.

From this point of view, remembering is a matter of information retrieval. Just as perceptual processes produce a mental representation of some event in the present, memory processes reproduce a mental representation of some event in the past. This reproductive view of memory has taken us a long way toward understanding how memory works, but it is not the whole story.

An alternative view of memory was first proposed by Frederic C. Bartlett, a British psychologist, in a book he wrote in 1932. Bartlett objected to the techniques of verbal and paired-associate learning used by those who followed in the tradition of Ebbinghaus, and who held to the assumptions underlying the library metaphor (even though they might not have used these precise terms). From this point of view, memories are things to be encoded, stored, and retrieved, the problem for memory is to reproduce the past as it occurred, and forgetting is a failure of reproduction. In contrast, Bartlett argued, ordinary remembering is better represented by memory for stories. Memories are narratives of the past, reconstructions of past events rather than reproductions of them.

Memory: Noun or Verb?

In Ebbinghaus's 1885 book, Über das Gedächtniss, memory (das Gedächtniss) is a noun, and memories are things.

In Bartlett's 1932 book, Remembering, memory (remembering) is a verb, and memories are products of human activity.

Memory for Stories

Accordingly, Bartlett studied memory for stories rather than memory for lists of nonsense syllables, paired associates, or words. He read his subjects unfamiliar stories -- narratives from foreign cultures where the details were somewhat unfamiliar, and the story line somewhat confusing.

One such story was The War of the Ghosts, a Native American tale collected by the pioneering American anthropologist Franz Boas.

War of the Ghosts
  • In some studies Bartlett employed the method of repeated reproduction, in which subjects would recall the same story on several occasions spread out over a period of hours or days.
  • In other studies he employed the method of serial reproduction, in which one subject would tell the story to a second subject, who in turn would tell the story to a third subject, and so on -- sort of like the game of "telephone", illustrated in "The Gossips" (1948), a painting by Norman Rockwell.

Of course, Bartlett observed that his subjects progressively forgot minor details, though they always retained the gist of the story. But he also observed other kinds of changes, which he thought were part and parcel of normal remembering and forgetting:

  • Errors of Omission
    • Forgetting relatively minor details, as already described;
    • Omission of features that didn't conform to his subjects' expectations.
  • But also Errors of Commission, such as
    • Rationalization, in which subjects would import details to "explain" puzzling passages.
    • Transformation of detail from the strange to the familiar.
    • Transformation of order to provide a more coherent structure to the narrative.

Over time or over storytellers, Bartlett observed that the story got shorter, but it also got more coherent. His conclusion was that remembering involves reconstructive, not merely reproductive, activity. In Bartlett's view, the rememberer (by clumsy analogy to the perceiver) retrieves the dominant details of the story, and something about his or her attitude toward the event; and then builds up the rest from that foundation, resulting in a narrative that may well be coherent, but may not be accurate.

Bartlett's arguments were largely ignored, in part because his experiments were not tightly controlled and his data analyses impressionistic. Ebbinghaus set the standard for scientific studies of memory. But beginning in the 1970s, as part of the "cognitive revolution" in psychology, Bartlett's ideas were revived, and investigators began to study schematic processes in memory. Some of this research led to the schematic processing principle discussed earlier. But research on the schematic processing principle was focused on the determinants of accurate recollections. In other research, investigators looked at the role of schematic processes in producing inaccurate memories. After all, an inaccurate memory is just as much a memory as an accurate one is -- it's still the person's mental representation of the past. If Bartlett is right, and remembering the past is like telling a story, then we need to be interested in inaccuracies as well as accuracies.

The Semantic Integration Effect

Perhaps the earliest attempt to study reconstructive processes quantitatively was work by Bransford and Franks (1971, 1972) on memory for sentences. At first glance, their research looks like any other verbal-learning experiment, with the exception that, instead of lists of words, the subjects studied lists of sentences such as the following:

  1. The girl broke the window on the porch.
  2. The girl who lives next door broke the window on the porch.
  3. The girl lives next door.
  4. The girl who lives next door broke the large window.
  5. The large window was on the porch.
  6. The window was large.

These sentences are related to each other, but differ in terms of the number of propositional ideas they contain. Sentence #6 contains one proposition, that the window was large. Sentence #5, with two propositions, adds the idea that the window was on the porch. These sentences were interspersed with other sentences about a barking dog, a man smoking his pipe, a car climbing a hill, and ants in a kitchen.

Later, on a recognition test, the subjects were presented with new sentences that conveyed the same ideas as the originals, but which hadn't been seen during the encoding phase, such as:

  • The girl who lives next door broke the large window on the porch.

The sentence above contains four propositional idea units:

  1. The girl lives next door.
  2. The girl broke the window.
  3. The window was large.
  4. The window was on the porch.

All of these ideas are consistent with sentences actually presented during the study phase, and in fact two of them -- numbers 1 and 3 -- actually were presented during the study phase. But the entire sentence was not.

At the time of test, Bransford and Franks presented subjects with old targets and new lures consisting of 1-4 propositional idea units, and asked subjects to rate each one on a scale of confidence ranging from -5 (very confident the item is new) to +5 (very confident the item is old). The results were very striking, because the subjects had great difficulty distinguishing between old and new items. Three- and four-unit items received relatively high positive confidence ratings, regardless of whether they were old or new. The subjects were somewhat better able to discriminate between old and new items at the one- and two-proposition levels, but note that they tended to rate old one-proposition items as new.

Bransford and Franks' subjects were obviously very confused about the details of what they studied. But they were not confused about the general thrust of the studied material -- that the girl who lived next door broke the large window on the porch. Abstracting that complex idea from simpler sentences is part and parcel of what Bartlett called effort after meaning. False recognition of sentences consistent with that abstract idea -- that schema for the story -- is a reflection of the reconstructive activity that Bartlett felt was central to the process of remembering.

Point of View Effects

Another early example of "neo-Bartlettian" research on memory concerned the role of point of view in memory. In one study by Gordon Bower and his associates, subjects read a short story about two boys who played hooky from school, describing their activities around the house. While reading the story, the subjects were asked to take the perspective of either (1) a prospective home-buyer (who might notice things like tile floors and chandeliers) or (2) a burglar (who might notice things like silverware and television sets). Subjects recalled the story, and then were asked to shift perspective -- those who originally read the story from the perspective of a home-buyer were now asked to take the perspective of a burglar. When they did so, the subjects recalled new details that were irrelevant to the original perspective, but important to the new one. This is consistent with Bartlett's idea that the subject's attitude is an important determinant of how a memory is reconstructed.

The Post-Event Misinformation Effect

Bartlett disparaged the verbal-learning paradigm, but aspects of reconstructive activity can be seen even within the constraints of word lists and similar materials. For example, Elizabeth Loftus of the University of Washington has developed a laboratory model of eyewitness memory in which subjects view a series of slides, or a short film, depicting an accident or a crime, and later are asked questions about what they saw. The subjects, then, take the role of bystanders or eyewitnesses to the event.

In one experiment, the subjects viewed a series of slides depicting an auto-pedestrian accident.

  • In one version of the slideshow, the subjects saw a red Datsun stopped at a "yield" sign.
  • In the other version, they saw the red Datsun stopped at a "stop" sign.

In both versions, the car then turned the corner and hit a pedestrian in a crosswalk. Later, during the "interrogation", half the subjects in each group responded to a question that assumed the existence of a yield sign, while the other half responded to a similar question that assumed the existence of a stop sign.

  • Did another car pass the red Datsun when it stopped at the yield sign?
  • Did another car pass the red Datsun when it stopped at the stop sign?

For those who saw the yield sign, the first question is appropriate, but the second question is misleading, because it inappropriately suggests that there was a stop sign instead. In legal jargon, such questions are called leading questions, because they assume facts not in evidence (a famous example is the question, "When did you stop beating your wife?", which assumes from the outset that such beatings actually occurred).

Still later, the subjects received a recognition test in which they were asked which of two slides they saw -- one with a yield sign and one with a stop sign. While most subjects who got the non-misleading question -- those who saw a yield sign and were queried about a yield sign, for example -- recognized the appropriate sign, fewer than half the subjects in the misleading conditions -- those who saw a yield sign but were questioned about a stop sign -- did so correctly.

This phenomenon is known as the post-event misinformation effect, because misinformation acquired after an event is incorporated into the person's memory for the event itself. The legal implications of the effect are important, because it shows that eyewitnesses can incorporate information from leading questions into their memories, leading them to remember things differently than they otherwise would, had the leading questions been omitted.

The Associative Memory Illusion

When a subject who has seen a yield sign falsely recognizes a stop sign, the memory is in some sense illusory. Another illusion of memory is induced by asking subjects to study a list of words which are all semantic associates of another word (known as the critical target), which is omitted from the list. For example, subjects might study a list of words such as sharp, knitting, and haystack, all of which are associatively related to the critical target word needle. On later tests of recall and recognition, the subjects will often falsely remember that the critical target also occurred on the list. Under some circumstances, false memory for the critical target can be as strong as true memory for actual list items.

This phenomenon appears to be mediated by semantic associations between the list items and the critical target. The list was composed of forward associations -- associations produced by asking a large group of subjects to respond to needle with the first word that came to mind. But semantic associations can be bidirectional, and words like sharp, knitting, and haystack also make people think of needle as a backward association. When subjects study a list of words, they naturally think of semantically related words as well, and these semantic associates -- backwards as well as forwards -- can be confused with list items on a subsequent retention test. In this way, subjects will often falsely remember studying the critical target word as well. Because the illusory memory (needle) is semantically associated with the items in the study list (e.g., sharp and knitting), this phenomenon is called the associative memory illusion.
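One way to picture how backward associations could converge on an unstudied word is a toy activation count. In the sketch below, every association strength is a made-up number and the competing associates (knife, yarn, farm) are hypothetical; the point is only that a word linked to every list item accumulates more activation than a word linked to just one.

```python
# Hypothetical backward-association strengths: how strongly each studied
# word makes people think of some other word.
BACKWARD_STRENGTH = {
    "sharp": {"needle": 0.4, "knife": 0.5},
    "knitting": {"needle": 0.6, "yarn": 0.3},
    "haystack": {"needle": 0.7, "farm": 0.2},
}


def activation_of_unstudied_words(study_list):
    """Sum the activation each unstudied word receives from the list items."""
    totals = {}
    for item in study_list:
        for associate, strength in BACKWARD_STRENGTH.get(item, {}).items():
            if associate not in study_list:
                totals[associate] = totals.get(associate, 0.0) + strength
    return totals


totals = activation_of_unstudied_words(["sharp", "knitting", "haystack"])
most_activated = max(totals, key=totals.get)
print(most_activated, totals)
```

With these assumed strengths, needle receives activation from all three studied words and ends up the most activated unstudied word, which makes it the likeliest candidate for false recall or recognition.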

The associative memory illusion is just one of many illusions of memory, including the semantic integration effect and the post-event misinformation effect, documented in laboratory research. They are illusions because in each case the person is remembering an event that never occurred. Just as perceptual illusions are produced by the misapplication of the perceiver's pre-existing knowledge and beliefs (including unconscious inferences), so memory illusions are a product of the pre-existing knowledge (including knowledge of semantic associations among words) that a person brings to encoding and retrieval.

The Reconstruction Principle

Semantic integration effects, point of view effects, the post-event misinformation effect, and illusory memories all illustrate the reconstruction principle of memory:

The Reconstruction Principle

Every memory is a blend of information recovered from memory traces and knowledge, expectations, and beliefs derived from other sources.

The reconstruction principle reminds us that remembering is not simply a matter of retrieving encoded information from storage. If memory were merely a byproduct of perception, then all we could do is remember what happened to us, or forget it. But reconstructive processes allow us to remember events differently from the way they happened, and even allow us to remember things that did not happen at all. Memories are not merely reproductions of the past: they are also beliefs about the past -- beliefs that are consistent with what we know and believe about ourselves and the world, but which may not be entirely accurate.

For the most part, then, we recreate events and experiences each time we remember them. We begin with fragmentary details, either suggested by the retrieval query or retrieved from the memory trace. Then, through reconstructive activity, we fill in the gaps and flesh out the details, making inferences from our knowledge of ourselves and the world. The resulting memory is, then, an "educated guess" about what might have happened.

Reconstruction reminds us that memory is not just a cognitive faculty, or a biological function, but a human activity: memories are the stories we tell about ourselves to remind ourselves who we are, and relate to other people. Our memories are part of our identity, and shared memories help bind us together in groups, families, societies, and cultures.

So now we have a full set of principles of memory:

  • Elaboration and organization guiding the encoding of new memories, making them available for later use.
  • Time-dependency, produced by the phenomenon of interference, during storage.
  • Cue-dependency, encoding specificity, and schematic processing, controlling the accessibility of memory, whether explicit or implicit, at the time of retrieval.
  • Reconstruction, based on accessible trace information and other world knowledge, yielding a mental representation of some past event in one's personal history.

The Problem of Eyewitness Memory

Because eyewitness testimony plays a central role in many legal cases, the accuracy of memory becomes a critical issue. It's bad enough that people are prone to forgetting. But the reconstructive nature of memory means that witnesses' memories can be biased and distorted in a number of ways. In fact, it was to explore the distortions introduced into eyewitness memory by leading questions that Loftus began her research on the post-event misinformation effect.

It is sometimes asserted that the negative emotional arousal associated with being a witness or victim of crime impairs memory, much in the manner of the Freudian notion of repression. In fact, however, emotional arousal typically enhances memory rather than impairing it -- at least for the gist of the event, the important central details. What emotional arousal may impair is memory for peripheral or incidental details. This is demonstrated clearly in experiments by Loftus and her associates on weapon focus: if a weapon is used in a crime, witnesses typically remember it well, because their attention is focused there; they may not remember much about the perpetrator's face or clothing, because their attention is focused elsewhere. The issue, then, is where attention is focused, not the individual's degree of emotional arousal.

In addition, eyewitness identification is notoriously unreliable. Cross-race identification is particularly poor: white witnesses have difficulty distinguishing among black suspects, and vice versa; and the same problem holds for Asians, too.

Lineups are also problematic, in that witnesses' identifications can be biased by events that occurred before the lineup, or during the lineup process itself.

An anecdote from my own experience. When I taught at the University of Wisconsin, I was asked to serve as an expert witness for the defense in the case of a gas-station robbery in which a number of people, alleged members of the Puerto Rican national liberation movement, were charged (the alleged motive was to finance violent revolutionary activities). Previously, the police had shown some witnesses a photograph of one of the suspects, embedded in a group of non-suspects -- a technique known as a "photospread". None of the witnesses could identify the suspect. Then these same witnesses were brought into a lineup, in which the same suspect was shown with a new set of foils, and they were asked if they had ever seen any of those individuals before. They immediately picked out the suspect. But of course, they had seen the suspect before -- in the photospread! Prior use of the photospread had, arguably, created a feeling of familiarity on the part of the witnesses, but this feeling of familiarity was illusory: it arose from the photospread, not from the crime itself. When the judge understood this, he immediately dismissed the charges against the defendants, on the ground that the identification in the lineup had been biased by the prior photospread.

In 2014, the National Research Council, the operating arm of the US National Academies, reviewed the literature on the accuracy of eyewitness testimony, and recommended a number of procedural improvements to forensic investigation:

  • Eyewitnesses should view only one suspect (or foil) at a time, instead of the whole group, suspect and foils, together.
  • The suspect should not "stand out" in any way in the lineup.
  • Witnesses should be cautioned that the offender might not be in the lineup.
  • The lineup should be conducted in a "double blind" manner, so that neither the officials nor the witnesses know who the suspect is.
  • Officials should collect confidence statements from witnesses at the time they make their identification.

This last point is very important, because there is a large literature showing that the relationship between confidence and accuracy in eyewitness memory is surprisingly weak. It was this fact that led to the controversy over eyewitness identification in the first place. However, both laboratory and field research show that the accuracy-confidence relationship is strengthened when safeguards like those listed by the NRC are put in place. The trade-off is that these same procedures result in fewer identifications by eyewitnesses -- which should tell you something -- but the identifications that are made are both more accurate and more confidently made.

For more about the science of eyewitness identification, see:

  • Gary Wells et al., "Eyewitness Identification Procedures: Recommendations for Lineups and Photospreads".  Law and Human Behavior, 1998.
  • National Research Council, Strengthening Forensic Science in the United States: A Path Forward (2009).
  • National Research Council, Identifying the Culprit: Assessing Eyewitness Identification (2014).
  • John Wixted and Gary Wells, "The Relationship between Eyewitness Confidence and Identification Accuracy: A New Synthesis" (2017).

Explicit and Implicit Memory

The traditional lie-detector paradigm assumes that the liar remembers the critical information perfectly well, and is merely withholding it from the investigators. However, at least in principle, a person could respond physiologically to a critical item, because the prior occurrence of that event has been encoded and stored in memory, even though he or she does not remember it consciously. The question of unconscious memory has been around in scientific and clinical psychology, and in popular culture, for more than a century, providing a wealth of plot material for books and movies, as well as for some forms of psychotherapy, like psychoanalysis. We usually think of memory in terms of the person's ability to consciously recollect his or her past, but can memories really be unconscious?

Certainly memories can be unconscious in the trivial sense that they are so poorly encoded that while available in principle, they are inaccessible under even optimal circumstances, and play no role in the person's behavior. Memories can also be unconscious in the sense that they are latent in storage, not yet activated by retrieval processes -- in which case we might more appropriately speak of preconscious memories. But can memories be unconscious in the substantive sense -- inaccessible to conscious recollection, but nevertheless influencing the person's ongoing experience, thought, and action?

Priming in Amnesia

Studies of neurological patients with the amnesic syndrome, like patient H.M., provide an affirmative answer to this question. Recall that these patients have suffered damage to the hippocampus and other structures in the medial portion of the temporal lobe, as a result of which they display anterograde amnesia -- an inability to remember experiences that occurred since the onset of the brain damage. Beginning in the late 1960s and early 1970s, a series of studies (initially performed by Elizabeth Warrington and Lawrence Weiskrantz at Oxford University) began to examine this memory deficit more closely, using variants on the traditional verbal-learning paradigm. When amnesic patients studied lists of words, and then received traditional tests of recall or recognition, they performed quite poorly compared to control subjects.

[Figure: Memory Test Performance]

In another test, however, they were placed in a "guessing game" in which they were presented with stems and fragments of words, and asked to complete the stimulus with the first English word that came to mind. Some of the stems and fragments targeted items from the previously studied word list, and the amnesic subjects were no less likely than controls to generate list items as guesses. For example, if the study list contained the word NATION, and the task was to complete the stem NAT___ with a familiar English word, patients were just as likely as controls to guess NATION. At first glance, this shows that amnesia impairs episodic memory (patients' recollection of the past study session) but spares semantic memory (their ability to recognize common English words). But the result was even more interesting than that, because the patients, like the controls, were more likely to complete the stems and fragments with previously studied items like NATION than with alternatives like NATURE, which were equally good completions.

This phenomenon is known as priming, where performance of one task (like studying a list of words) influences subsequent performance of an entirely different task (like guessing the words represented by stems and fragments). In fact, there are two forms of priming:

  • positive priming, where the first task facilitates the second (e.g., seeing NATION makes it more likely that subjects will complete the prompt NAT___ with NATION than with NATURE).
  • negative priming, where the first task inhibits the second (e.g., seeing NATION makes it less likely that subjects will complete the prompt NAT___ with NATION than with NATURE).

Positive priming is much more commonly studied than negative priming.

Priming, in either form, shows that the subjects did, in fact, process the primes and retain some trace of them in memory. These traces are not accessible to conscious recollection, but they nevertheless influence the subject's subsequent experience, thought, or action. Priming represents unconscious memories in the substantive sense.

Priming effects come in various forms, of which two are most important for our purposes:

  • Repetition priming, in which the prompt is a recapitulation, in whole or in part, of the prime.
    • The stem-completion test described above is a common example of repetition priming, because the stem NAT contains part of the prime NATION.
    • And so are fragment-completion tests, in which, for example, implicit memory for the prime ASSASSIN is tested by asking subjects to complete the target A__AS__N.
    • In a lexical decision test, subjects are presented with a prime such as DESCRIBE, and then must determine whether the letter-string DASCRIBE is a legal English word.
    • In a perceptual identification test, subjects are presented with a prime such as BANDANNA, and then must identify the word BANDANNA presented against a "noisy" background that makes it difficult to see (or hear) the word.
  • Semantic priming, in which the prime and the prompt are linked by meaning, rather than by physical similarity.
    • In associative priming, the link between prime and prompt is associative in nature, such as DOCTOR-NURSE.
    • In categorical priming, the link between prime and prompt is conceptual in nature, such as ANIMAL-LION.
    • In affective priming, the link between prime and prompt is shared connotative meaning, or evaluation, such as PEACE-LOVE, which are both "good" things, and WAR-HATE, which are "bad" things.

Thus, a subject primed with DOCTOR might show faster response latencies when making lexical decisions about the prompt NURSE, or a subject primed with LION might be more likely to generate LION when asked to list instances of the prompt category ANIMAL.

Note that while stem-completion and fragment-completion are generally considered to involve repetition priming, this is not necessarily the case. Thus, DOCTOR might prime NURSE as the completion of the stem NU___ (as opposed to NUGGET or NUMBER), and PEACE might prime completion of the fragment L_V_ with LOVE (as opposed to LIVE).
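The stem and fragment prompts used in these tests can be made concrete with a short sketch. The Python below is only an illustration, not a procedure taken from any of the studies discussed here: the function names `completes` and `classify_response` are hypothetical, and the rule that a prompt with only trailing blanks counts as a stem (satisfied by any longer word beginning with those letters) is an assumption chosen so that the examples in the text behave as described.

```python
def completes(prompt, word):
    """Return True if `word` is a legal completion of `prompt`.

    Assumed convention: blanks are written as underscores. A prompt whose
    only blanks are trailing (e.g. NAT___) is treated as a stem; any other
    prompt (e.g. L_V_) is treated as a fragment.
    """
    prompt, word = prompt.upper(), word.upper()
    stem = prompt.rstrip("_")
    if "_" not in stem:
        # Stem completion: any longer word that begins with the stem.
        return len(word) > len(stem) and word.startswith(stem)
    # Fragment completion: same length, and every given letter must match.
    return len(word) == len(prompt) and all(
        p in ("_", w) for p, w in zip(prompt, word)
    )


def classify_response(prompt, response, studied):
    """Label a response as 'studied', 'other', or 'invalid'.

    'studied' = a legal completion that was on the study list (the primed
    item); 'other' = a legal completion that was not studied.
    """
    if not completes(prompt, response):
        return "invalid"
    studied_words = {w.upper() for w in studied}
    return "studied" if response.upper() in studied_words else "other"


if __name__ == "__main__":
    study_list = ["NATION"]
    for guess in ("NATION", "NATURE", "NOTION"):
        print(guess, classify_response("NAT___", guess, study_list))
```

Both NATION and NATURE satisfy NAT___, which is the point of the priming comparison: the prompt itself favors neither word, so a preference for the studied item reflects the earlier exposure.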

Similarly, although free-association and category-generation procedures would usually seem to exemplify semantic priming, they can also be used in a repetition-priming experiment. So, for example, if a subject studies the paired-associate BREAD-BUTTER, and then is asked to complete the cue BREAD with the first word that comes to mind, the procedure counts as repetition priming because the cue recapitulates part of the prime. But if the subject studies BUTTER, and then responds to the cue BREAD with the first word that comes to mind, that is semantic priming because the target and cue are not the same, even in part.

The distinction between repetition and semantic priming is important because they seem to reflect different kinds of memory representations. Repetition priming can be mediated solely by a perception-based representation of the prime -- essentially a physical description of the prime -- and does not require any analysis of the meaning of the prime. But semantic priming requires a meaning-based representation, and cannot be mediated solely by a perception-based representation.

Implicit Memory as Unconscious Memory

Studies of preserved priming in the amnesic syndrome led Daniel Schacter, now at Harvard University, to articulate a distinction between explicit and implicit expressions of episodic memory.

  • Explicit memory is the person's conscious recollection of some past event, as reflected in performance on tests of recall and recognition.
  • Implicit memory is any change in the person's experience, thought, or action that is attributable to some past event, as in performance on tests of priming (savings in relearning, and psychophysiological indices such as the lie-detector could also count as evidence of implicit memory).

Explicit and implicit memory can be dissociated, in the sense that implicit memory can occur independent of, and even in the absence of, explicit memory. The amnesic syndrome (as in Patient H.M., discussed in the lecture supplement on The Biological Bases of Mind and Behavior) is one case where implicit memory is spared even when explicit memory is grossly impaired.

The same sort of dissociation between explicit and implicit memory can be observed in the retrograde amnesia caused by the use of electroconvulsive therapy (ECT) in the treatment of some forms of depression. In one study, conducted by Jennifer Dorfman and her colleagues, patients studied a list of words immediately before receiving ECT. Later, in the recovery room, they showed impaired memory on an explicit test of stem-cued recall, but intact priming on an implicit test of stem-completion.

[Figure: Memory Following Propofol Sedation]

The dissociation between explicit and implicit memory can also be observed in medical patients who are undergoing outpatient diagnostic and treatment procedures under conscious sedation. In this study, by Randall Cork and his colleagues, medical patients studied a list of words while sedated, but conscious, during outpatient surgery. Sedative drugs typically induce amnesia. Not surprisingly, then, the patients showed poor explicit memory on a test of cued recall. But they showed intact priming on a free-association test of implicit memory.

[Figure: Implicit Memory Following General Anesthesia]

And the dissociation between explicit and implicit memory can even be observed in surgical patients who have undergone general anesthesia. In general anesthesia the patient is completely unconscious during surgery, and so lacks any conscious recollection of surgical events. In this experiment, surgical patients were presented with a list of paired associates, played over earphones, while they were anesthetized. Later, they performed poorly on a cued-recall test of explicit memory, but remarkably showed significant priming on a free-association test of implicit memory.

Are New Principles Needed?

The principles discussed in the bulk of this lecture -- elaboration, organization, etc. -- apply to explicit memory, or conscious recollection, and it is an open question whether they also apply to implicit or unconscious memory. For example, it is sometimes claimed that implicit memory is independent of elaborative processing at the time of encoding, as in Craik and Lockhart's depth-of-processing paradigm. However, the claim may be overstated. For one thing, most experiments on depth of processing and implicit memory have employed repetition priming paradigms, and repetition priming can be mediated by a perception-based representation of the prime. So, it is not surprising that repetition priming is (relatively) unaffected by deep, semantic processing. The same might not be the case with semantic priming.

The Rediscovery of the Unconscious

The dissociation between explicit and implicit memory in various kinds of amnesia shows that it is, indeed, possible to have unconscious memories that influence behavior outside of awareness. Variations on the priming paradigm have also provided evidence for unconscious percepts, thoughts, and emotions.

Link to an article on "the rediscovery of the unconscious".

Biological Bases of Memory

We have already discussed some of the biological bases of memory.

  • As discussed in the lectures on the Biological Bases of Mind and Behavior, the hippocampus and related structures in the medial portion of the temporal lobe seem to be critical for retaining episodic memories in consciously accessible form. Patients, like H.M., who have sustained damage to these structures suffer an anterograde amnesia for "postmorbid" events that occurred after the injury. As discussed in these lectures, the amnesia primarily affects explicit memory, or conscious recollection, leaving implicit, or unconscious, memory relatively spared.
  • As discussed in the lectures on Learning, long-term potentiation appears to be the biological basis for learning, and thus for memory, at the molecular and cellular level of analysis. Just as, in simple associative learning, there must be a way to make a response more likely in the presence of a stimulus, in memory there must be a way to get from retrieval cue to target engram. Long-term potentiation is the best account we have of how one thing leads to another in the nervous system.

But how are memories themselves represented in the brain?

At the psychological level of analysis, a memory can be viewed as a bundle of features describing an event, the context in which it occurred, and the person as the agent or patient, stimulus or experiencer of that event. In a typical associative-network model of memory, each of these features would be represented by a node, and each of the nodes would be connected to the others by associative links.
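A small sketch may help make the associative-network idea concrete. It is a toy illustration under stated assumptions, not a model proposed in these lectures: the node labels, the `build_network` and `spread_activation` helpers, and the `decay` and `threshold` values are all hypothetical, and spreading activation is used here simply as one familiar way of getting from a retrieval cue to the rest of the bundle of features.

```python
from collections import deque


def build_network(associations):
    """Turn (node, node) pairs into an undirected map of associative links."""
    links = {}
    for a, b in associations:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def spread_activation(links, cue, decay=0.5, threshold=0.1):
    """Spread activation outward from a retrieval cue.

    Each link passes on `decay` times the sender's activation; spreading
    stops once the amount passed on would fall below `threshold`.
    Both numbers are arbitrary choices for illustration.
    """
    activation = {cue: 1.0}
    queue = deque([cue])
    while queue:
        node = queue.popleft()
        passed = activation[node] * decay
        if passed < threshold:
            continue
        for neighbor in links.get(node, ()):
            if passed > activation.get(neighbor, 0.0):
                activation[neighbor] = passed
                queue.append(neighbor)
    return activation


# Two hypothetical episodes, each a bundle of features tied to an event node.
network = build_network([
    ("event: dinner", "what: pasta"),
    ("event: dinner", "when: Friday night"),
    ("event: dinner", "where: dining hall"),
    ("event: dinner", "who: self"),
    ("event: lecture", "where: Tolman Hall"),
    ("event: lecture", "who: self"),
])

if __name__ == "__main__":
    result = spread_activation(network, "when: Friday night")
    for node, level in sorted(result.items(), key=lambda item: -item[1]):
        print(f"{level:.3f}  {node}")
```

In this sketch the cue "when: Friday night" activates the dinner event most strongly, then its other features; the unrelated lecture episode is reached only weakly, through the shared "who: self" node.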

At the biological level of analysis, there are two basic models for the neural representation of memories.

  • The most obvious is that each node in the associative memory network is represented by a cluster of neurons that constitute the neural representation of that feature. The memory as a whole, then, is represented by several clusters of neurons, each representing a different feature, firing simultaneously. Or something like that. This would be a "locationist" view of memory, in that each memory trace is located at a particular point in cortical space. It's this kind of theory that is contradicted by the findings leading up to Lashley's Law of Mass Action, discussed in the lectures on Learning.
  • A somewhat different, less obvious, but more popular view -- in part because it is more consistent with the Law of Mass Action -- is that each item in memory is represented by a pattern of neural activity that is spread widely across the cortex. Thus, individual memories aren't localized anywhere. Or, put another way, they're localized everywhere -- in the cerebral cortex as a whole. This view is sometimes called a "connectionist" view of memory, because the neural representation of memory isn't found in individual neurons, or even in groups of neurons, but rather in the connections among them.
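The second, "connectionist" idea can be illustrated with a toy Hopfield-style network, a standard textbook device that is brought in here purely as an illustration and is not claimed to be how the cortex stores memories. In this minimal sketch (the names `store` and `recall`, the eight-unit patterns, and the update rule are all assumptions), two memories are superimposed on the same set of connection weights, and either one can be recovered from a degraded cue.

```python
def store(patterns):
    """Hebbian storage: every memory adds to the same connection weights."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, cue, max_steps=10):
    """Update all units together until the pattern stops changing."""
    state = list(cue)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(weights):
            net = sum(w * s for w, s in zip(row, state))
            # A unit with zero net input keeps its current value.
            new_state.append(1 if net > 0 else -1 if net < 0 else state[i])
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical memories, each a pattern of activity over eight units.
MEMORY_A = [1, 1, 1, 1, -1, -1, -1, -1]
MEMORY_B = [1, -1, 1, -1, 1, -1, 1, -1]
weights = store([MEMORY_A, MEMORY_B])

# Degraded cues: each has one unit flipped relative to the stored memory.
cue_a = [-1, 1, 1, 1, -1, -1, -1, -1]
cue_b = [1, -1, 1, -1, 1, -1, 1, 1]

if __name__ == "__main__":
    print(recall(weights, cue_a) == MEMORY_A)
    print(recall(weights, cue_b) == MEMORY_B)
```

Neither memory lives in any single unit or weight: each connection carries a contribution from both patterns, which is the sense in which the representation is found in the connections rather than in particular neurons.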

For an excellent discussion of the neurobiology of memory, see Memory: From Mind to Molecules by Larry R. Squire and Eric Kandel. Squire is a cognitive neuropsychologist at the University of California, San Diego. Kandel, at Columbia University, shared the 2000 Nobel Prize in Physiology or Medicine with two other neuroscientists, for his studies of long-term potentiation as the molecular and cellular basis of learning and memory.

A Unified View of Perception and Memory

Encoding processes make knowledge about the past available in memory, while retrieval processes permit us to gain access to the knowledge stored there. This access can be conscious, in the form of explicit memory, or it can be unconscious, in the form of implicit memory. But conscious recollection, at least, is a product of reconstructive activity.

Thus, the situation in memory is analogous to that of perception. Both involve active cognitive work: "effort after meaning" (Bartlett), going "beyond the information given" (Bruner), in order to make sense of what we experience and what we remember.

  • Perception is not just a matter of extracting features from, and recognizing patterns in, stimulus input.
  • Memory is not just a matter of retrieving stored memory traces encoded during perception.

Rather, both represent problem-solving activity: the perceiver is trying to "figure out" what is happening now, while the rememberer is trying to "figure out" what happened in the past.

  • Perception of the present relies on memory for the past, including expectations based on past experience. The perceiver tries to make sense of the present in terms of the past.
  • Similarly, perception contributes to memory for the past, supplying the context in which remembering takes place. The rememberer tries to make sense of his or her memories in terms of the current situation.

Perception is constructive cognitive activity, wherein the perceiver actively constructs a mental representation of his or her present experience.

Memory is reconstructive cognitive activity, wherein the rememberer actively reconstructs a mental representation of his or her past experiences.

Just as perceiving is more like painting a picture than viewing a photograph, so remembering is more like making up a story than reading one in a book.

In this way, both conscious perception and conscious memory rely heavily on reasoning, judgment, inference, and decision-making -- processes to which we will turn in the next series of lectures.

Unconscious memory may invoke different principles than conscious recollection -- depth of processing and other aspects of elaboration are not necessary for repetition priming to occur, for example. But even so, the fact that the implicit-explicit distinction extends to memory as well as to perception indicates a basic point of unity. Memory depends on perception. If something has not been consciously perceived, it seems unlikely that it ever could be consciously remembered. But unconscious, implicit perception can leave traces in memory, as well -- implicit memories, that affect our ongoing experience, thought, and action in the absence of conscious awareness and conscious control -- but memories nonetheless.

This page last modified 01/03/2017.