Social Memory

See also the lecture supplements for Psychology 122, "Human Learning and Memory".

Perceptual activity ends with the identification and categorization of the distal stimulus. Every act of perception is an act of categorization (paraphrasing Bruner), and memory provides the conceptual knowledge that permits categorization to occur. But the relations between perception and memory are more extensive than that:

Memory serves as the background for perception, containing the expectations, beliefs, world knowledge, and conceptual knowledge against which percepts are constructed. In this way, memory is the cognitive basis for perception.
Memory is the byproduct of perceptual activity: it stores a description of the percept that has been constructed -- or, alternatively, it stores a record of the processes by which the percept was constructed. Either way, memory frees behavior from dominance by the immediate stimulus environment.

Accordingly, a theory of social cognition must go beyond perception to describe the processes by which social memories are encoded, the structure of the memory trace, and the operations by which social knowledge is retrieved (Hastie & Carlson, 1980; Kihlstrom & Hastie, 1997).

Object Recognition in the Nonsocial Domain

By way of background, let us consider some theories of object recognition in the nonsocial domain (Humphreys & Bruce, 1989; Roth & Bruce, 1995; Palmer, 1999).

Some object in the environment, known as the distal stimulus, radiates a pattern of physical energy which falls on the retina of the eye. This two-dimensional retinal image is the proximal stimulus for vision.
Early stages of visual processing, generally known as feature detection and pattern recognition, take the retinal image and create a three-dimensional structural description of the distal stimulus.

Early theories of visual object recognition focused on the recognition of letters and words, and didn't have to be concerned about the distinction between 2- and 3-dimensional descriptions.
But as soon as we get into the "real world" of objects, like cars and cows and people at varying distances from the observer, we have to deal with them in three dimensions.

A highly influential theory of visual perception is that of David Marr (1982), who distinguished among three distinct levels of analysis:

The computational level is concerned with the processes used to accomplish the various tasks of visual perception.

Marr was interested in artificial intelligence and machine vision, and he took the computer metaphor seriously.

There's some input, which is operated on by some program to generate an output.

Much of the psychological analysis of perception is performed at this level.

The algorithmic level specifies the actual operations by which these computations are performed.

To continue the computer analogy, this is the "program" written as if in machine language.
In the case of machine vision, these algorithms are collected into operating computer programs.

The hardware level describes the specific neural mechanisms (or, in the case of machine vision, the electronic components) that execute the algorithms.

Marr was a proponent of what John Searle calls "Strong AI": he didn't think it mattered whether a program was implemented in neurons or silicon chips.

Marr also offered a theory of the early stages of image processing -- that is, how the retinal image is processed by early stages of the visual system.

Gray-Level Description: the system measures the intensity of light at each point in the image.
Primal Sketch: This takes the gray-level description and creates a 2-dimensional raw primal sketch which identifies the edges of objects, their texture, and the boundaries between them where they overlap. Further processing results in a full primal sketch, which outlines the shapes of the objects in the scene.

These primal sketches are viewpoint-dependent, in that they represent how the objects look from the perceiver's particular point of view.

2-1/2D Sketch: describes the layout of the scene, and how the various surfaces in the primal sketch relate to each other and to the perceiver.
Object Recognition: Creates a structural description of the objects that can be compared to images of objects and scenes that are already stored in memory.

Because the objects in the scene may represent a different viewpoint from what is stored in memory, object recognition requires the creation of an object-centered or viewpoint-independent description, which permits us to recognize objects from any angle as the observer -- or, for that matter, the objects themselves -- move in space. Marr & Nishihara (1978) proposed that complex shapes are broken down into simpler components, called primitives, which consist of combinations of generalized cylinders of various sizes and orientations.

Thus, the structural description of a horse might be represented as follows:

A large horizontal cylinder in the center represents the body of the horse.
Four long cylinders coming downwards from the "body", two at each end, represent the legs.
A short cylinder angling upwards from one side of the central cylinder represents the neck.
Another short cylinder angling downwards from the "neck" represents the head.
A short cylinder angling downwards from the other end of the "body" represents the tail.

You get the idea.
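
For readers who think in data structures, the horse description above can be written down as a short list of parts. This is only a minimal sketch: the class name Cylinder, its fields, and the helper parts_attached_to are illustrative assumptions, not notation from Marr & Nishihara.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cylinder:
    """One generalized-cylinder primitive (field names are hypothetical)."""
    name: str
    size: str                     # qualitative size, as in the verbal description
    orientation: str              # direction of the main axis
    parent: Optional[str] = None  # part this cylinder is attached to


# The horse from the text: a body, four legs, a neck, a head, and a tail.
HORSE: List[Cylinder] = [
    Cylinder("body", "large", "horizontal"),
    Cylinder("front left leg", "long", "downward", parent="body"),
    Cylinder("front right leg", "long", "downward", parent="body"),
    Cylinder("rear left leg", "long", "downward", parent="body"),
    Cylinder("rear right leg", "long", "downward", parent="body"),
    Cylinder("neck", "short", "angled upward", parent="body"),
    Cylinder("head", "short", "angled downward", parent="neck"),
    Cylinder("tail", "short", "angled downward", parent="body"),
]


def parts_attached_to(description: List[Cylinder], parent: str) -> List[str]:
    """Return the names of all parts attached directly to the given part."""
    return [part.name for part in description if part.parent == parent]
```

Because every part is described relative to another part rather than relative to the viewer, the description stays the same no matter where the observer stands, which is the sense in which it is object-centered.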

An extension of Marr's theory is the Recognition-by-Components (RBC) theory proposed by Biederman (1987; Biederman & Gerhardstein, 1993). RBC employs a more extensive list of primitives, not just cylinders. These primitives, known as geons, form a highly generalized set of 36 3-dimensional shapes, out of which pretty much any conceivable object can be constructed. Think of Legos! (Though there are some exceptions, which is a problem for the theory.)

Sample geons include, in addition to cylinders, cubes, wedges, pyramids, barrels, arches, cones, and handles.
These geons are linked together by a set of structural relationships, such as relative size, verticality, centering, relative size of surfaces at joints, and the like.
The structural description, including the geons and their structural relationships, is then compared to similar representations stored in memory, and the best match yields the object recognition -- as a horse, or a suitcase, or a person.
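
The matching step can be illustrated with a toy example. In this sketch, a structural description is a set of feature labels (geons plus relations), and "best match" is taken to mean the largest proportion of shared features. Both the feature labels and the overlap measure are assumptions made for illustration; the theory itself does not commit to this particular similarity rule.

```python
def overlap(a, b):
    """Proportion of shared features (Jaccard index); an assumed similarity rule."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical stored descriptions: geons plus one structural relation each.
MEMORY = {
    "mug": frozenset({"geon:cylinder", "geon:handle", "rel:handle on side of cylinder"}),
    "bucket": frozenset({"geon:cylinder", "geon:handle", "rel:handle on top of cylinder"}),
    "suitcase": frozenset({"geon:cube", "geon:handle", "rel:handle on top of cube"}),
}


def recognize(percept, memory=MEMORY):
    """Return the label of the stored description that best matches the percept."""
    return max(memory, key=lambda label: overlap(percept, memory[label]))
```

Note that the mug and the bucket are built from the same geons and differ only in the structural relation between them, which is why the relations, and not just the parts, have to be part of the description.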

Two important points about Biederman's very popular theory.

Like most theories of object recognition in the nonsocial domain, Biederman's theory (and Marr's, for that matter) is a theory of the recognition of a category or class of objects. That is, it enables the perceiver (whether human or machine) to recognize a horse, or a suitcase, or a human. But object recognition in the social case typically goes beyond categorical recognition to recognition of a specific instance. I don't recognize a person, or even a woman, as such; rather, I recognize my spouse Lucy, or my sister Jean. But you can imagine how the Marr-Biederman approach could be expanded, by adding features (described by geons) and structural relationships to distinguish between two individuals, both of whom are recognized as "human" or "woman".
For both Marr and Biederman, object recognition relies on the development, in perception, of an object-centered or viewpoint-independent description of the distal stimulus, which is then compared to object-centered or viewpoint-independent descriptions of familiar objects stored in memory. But there are good reasons for thinking that the perceptual representations at issue are actually viewpoint-dependent, and that they're compared to multiple exemplars of familiar objects, each from a different viewpoint, stored in memory.

Just such a proposal has been made and defended, ably, by Michael Tarr (e.g., Tarr & Bulthoff, 1995). To be honest, I tend to favor Tarr's multiple-exemplar view. He was a colleague when I taught at Yale, and I'm persuaded by his evidence -- as I am by other work from his laboratory and students, as we'll see in the lectures on Social Neuroscience.

But I don't pretend to expertise in this matter. The most prominent work on face perception, to which we now turn, is based on the Marr-Biederman view of object recognition, as the final, memory-dependent stage in perceptual processing.

Face Recognition

To continue our consideration of the relations between social perception and memory, let's consider the case of face recognition -- that is, the process of putting a name to a familiar face, be it the face of our spouse, parent, or child, friend or neighbor or co-worker, or politician or celebrity.

But first, let's pause to consider the fact that we don't all look alike. While that may seem obvious, it's been argued that most other animals seem to look very much like each other, and seem to recognize each other as individuals via other sensory modalities, such as smell or vocalization. Michael J. Sheehan, a postdoctoral fellow at UCB, has proposed (Nature Communications, 09/16/2014) that the reason we're individually recognizable, on the basis of such visual features as the distance between the eyes, or the width of the nose, has its basis in evolution. It's not clear how, as Sheehan proposes, human social structure drove the evolution of facial distinctiveness; it might have been the other way around, that the fact that we're so distinctive forced us to develop certain uniquely human social structures. Still, evolutionary pressures tend to make individual species members more alike over time, whereas in this case they seem to have made us more different, at least with respect to faces. In any event, we really do all look different. Anthropometric measurements on a sample of US Army personnel confirm that facial traits are much more variable than other human traits, such as the length of the arm and the length of the leg; and they are less strongly intercorrelated with each other, too.

Studies of Face Perception and Recognition

To begin, let's consider some elementary phenomena of face recognition.

Faces are processed holistically. We don't just perceive individual features, like wide eyes and a long nose, independently of each other.

Bradshaw and Wallace (1971) presented subjects with pairs of faces that differed on 2-7 features, and asked subjects to judge whether they were different. Response latency on this task was negatively correlated with the number of differing features, suggesting that subjects compared each feature separately and sequentially. This, in turn, suggested that facial features are, in fact, processed independently of each other.
But Sergent (1984) argued that it's hard to vary one feature independently of the others. If a face has a long nose, then the configuration of facial features -- e.g., the distance from the eyes to the lips -- must also change. So faces might be processed holistically after all.
Tanaka & Farah (1993) asked subjects to memorize a set of normal or scrambled faces (in which, for example, the nose and mouth exchanged places), so they could remember which face belonged to "Jim". Then they were presented with just one facial feature, like the nose, and asked whether it is "Jim's" nose.

When the feature had been presented in the context of a normal face, recognition was best when the feature was tested in the context of that same face, compared to when it was presented alone, without any context at all. This indicated that faces are processed holistically, not as a collection of individual features.
When the feature had been presented in the context of a scrambled face, there was no advantage when features were presented in context. Disrupting the overall configuration of the face impaired memory for individual features. Again, this means that faces are processed holistically.
No such findings were obtained when subjects memorized pictures of houses, rather than of faces, suggesting that faces may be "special" in this respect.

Faces presented upside-down are more difficult to recognize than faces presented right-side up, and this is not true for other types of objects (e.g., Yin, 1969).
In a variant on Yin's procedure, Young et al. (1987) sliced facial photographs in half, and paired the upper half of one face (e.g., Ronald Reagan's) with the lower half of another face (e.g., George H. W. Bush's). Subjects found it harder to identify the upper half of the image when it was paired with the "wrong" lower half. Again, this suggests that individual features are not processed independently.
Thompson (1980) took photographs of famous faces (like British prime minister Margaret Thatcher), and inverted the eyes and mouth. When presented upright, the face appears odd, even grotesque; but when presented upside-down, the face appears normal. This Thatcher illusion (though it's not really an illusion) occurs because faces are processed holistically when presented upright, but their features are processed independently when faces are presented upside down (Bartlett & Searcy, 1993).
Carey (1981, 1992) found that young children tend to process facial features independently. But as they get older, they come to process faces configurally.

There are a number of neurological syndromes which involve an impairment in face recognition. Chief among these is prosopagnosia, where the patient does not recognize faces that ought to be familiar to him. Nonetheless, he is able to determine the gender of the face (i.e., male or female), identify emotional expressions (e.g., of fear or joy), and read lips.

Prosopagnosia is often associated with a lesion in the fusiform area of the brain, at the junction of the occipital and temporal lobes. Some authorities have identified the fusiform area as the "face area" of the brain, and even refer to it as the "fusiform face area". But, as we'll see in the lectures on Social Neuroscience, the situation is actually more complicated than this.

Young et al. (1985) asked subjects to keep a diary in which they recorded errors, or at least difficulties, in recognizing familiar faces. They found that particular errors were especially common.

Complete recognition failure.
Feeling of familiarity
Inability to remember name
Misidentification
There were no instances where a person was recognized as familiar, and identified by name, but no other identifying information was available.

Reaction-time studies show that people are able to identify a face as familiar before they can state the occupation of the person whose face it is. While that may come as no surprise, these same studies also show that people can state the occupation before they can name the person who belongs to the face.

In fact, some researchers have gone so far as to assert that subjects cannot state the person's name before they can state his occupation. I myself doubt this result. I suspect it's an artifact of the selection of faces used in these studies, which typically consist of famous people, like actresses and presidents. Under those circumstances, I suspect that subjects will often say something like "Oh, that's that actress -- what's her name?". But with more individualized acquaintances, I suspect that identification of the face occurs before subjects can state the person's occupation -- as when people say "Oh, that's my neighbor Brett -- he's some kind of software engineer". But this is an empirical question. Addressing it wouldn't be easy, but -- hint, hint -- it would make a great senior honors thesis.

Anyway, on the basis of findings such as these, Bruce and Young proposed a formal model of face recognition which has become widely accepted. The model assumes that there are separate systems (or modules) for various aspects of face perception: identifying a face (which is the primary concern of the model), analyzing emotional expressions (a la Ekman), analyzing facial speech (i.e., lip-reading), and directed visual processing (e.g., examining someone's teeth for stray bits of spinach).

Sticking with the process of face recognition:

First there is a structural encoding of the features of the face, both individually and their configuration with respect to each other.

The initial encoding is viewpoint-centered, or instance-based, meaning that it is like a "snapshot" of the face taken from a particular angle or viewpoint. It is this sort of image that prevents subjects from recognizing familiar people from behind.

This viewpoint-centered description allows for the analysis of facial expressions of emotion, and also for lip-reading.

Structural encoding ends with a viewpoint-independent, object-centered representation, which is more abstract, and permits object recognition from unfamiliar viewpoints, with different facial expressions, or while in motion.

This object-centered representation provides the perceptual input for the facial recognition process.

This perceptual representation is then connected with an object-centered facial recognition unit (FRU) stored in memory.

There is one FRU for each familiar face.
The FRUs are part of the larger fund of knowledge stored in memory.

If a match is made between the stimulus face and some FRU, the face is recognized as familiar. Information from the FRU then passes to a person identity node (PIN).

There is one PIN for each known person, and it contains the name of the person.

A familiar person can be named only by virtue of the PIN.

The PIN is also part of the broader memory system, permitting access to other knowledge about the person.

The PIN can also be accessed through other recognition processes, as when we recognize someone's voice over the telephone.
Activation of a PIN can prime the corresponding FRU, facilitating face recognition if, for example, we are expecting to see a particular person.

Whereas the FRU is the cognitive basis for face recognition (giving rise to a feeling of familiarity), the PIN is the cognitive basis for person recognition (allowing one to put a name to a face). So, the route to face identification is as follows:

Structural Encoding
Familiarity
Identification
Name Generation
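
The strictly sequential character of this route can be made concrete with a small sketch. Everything here is hypothetical scaffolding: the three lookup tables, the face code, and the name_retrieval_ok switch are invented for illustration, and name generation is treated as a separate final step, following the four-stage route listed above.

```python
from typing import NamedTuple, Optional


class Result(NamedTuple):
    familiar: bool                # did the face match some FRU?
    semantic_info: Optional[str]  # knowledge reached through the PIN
    name: Optional[str]           # output of name generation


# Hypothetical memory stores; keys and values are made up.
FRUS = {"object-centered code A": "person 1"}        # face description -> PIN
PINS = {"person 1": "neighbor, software engineer"}   # PIN -> semantic knowledge
NAMES = {"person 1": "Brett"}                        # PIN -> name


def recognize_face(structural_code: str, name_retrieval_ok: bool = True) -> Result:
    """Run a structurally encoded face through the later stages, in order."""
    pin = FRUS.get(structural_code)       # Familiarity: match against the FRUs
    if pin is None:
        return Result(False, None, None)  # complete recognition failure
    info = PINS.get(pin)                  # Identification: activate the PIN
    if info is None:
        return Result(True, None, None)   # feeling of familiarity only
    name = NAMES.get(pin) if name_retrieval_ok else None  # Name Generation
    return Result(True, info, name)
```

No path through this function returns a name without first passing through familiarity and identification, which mirrors the diary finding that a face is never named in the absence of any other identifying information.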

The model provides a pretty good account of the major phenomena of face recognition.

Subjects can recognize different facial expressions of emotion, and read lips, and examine faces closely, even if they cannot recognize the face they're looking at, because these tasks are performed by different, dissociable, cognitive modules.

And for the same reason, subjects can perform one of these three tasks without necessarily being able to perform the other two.

Subjects can recognize a face as familiar, by virtue of the FRUs, without being able to name the face (which requires the PINs).

And they can retrieve information about the familiar person, such as his occupation, without being able to identify him, because the FRU connects directly to the larger memory store.

When subjects experience a feeling of familiarity, but cannot name the person, presenting the person's face does not help. This is because the PIN stands between the FRU and generating a name.

However, presenting additional semantic information, such as the person's initials, does help, because that enhances the cues that are input to the PIN from the larger memory store.

That doesn't mean that the B&Y model is perfect. But it remains the best model we have, which is why it's been so widely embraced by other theorists.

The B&Y model has been implemented as a "connectionist" computer simulation by Burton et al. (1991) (I'll have more to say about "connectionist" models later). Inputs from a face (via the FRUs) or a name activate corresponding PINs, which in turn activate related semantic information stored in memory. Note that, like all "connectionist" models, the pattern of activation is interactive.

Both the name and face of Prince Charles activate his PIN, but either pathway will also activate the other one.
Activation of Charles's PIN is the only way to retrieve semantic information about him, such as the fact that he plays polo.
Excitation of the "Charles" PIN will inhibit other, related nodes, such as those representing Princess Diana or Prime Minister Thatcher.
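
To give a feel for what "interactive" means here, the following is a deliberately simplified sketch of an interactive-activation network. It is not the published simulation: the unit names, the weights of 1.0 and -1.0, the update rule, and the number of settling steps are all assumptions chosen to keep the example small.

```python
# Excitatory links run in both directions; so do the inhibitory ones.
EXCITE = [
    ("face:Charles", "pin:Charles"),
    ("name:Charles", "pin:Charles"),
    ("pin:Charles", "sem:plays polo"),
    ("face:Diana", "pin:Diana"),
]
INHIBIT = [("pin:Charles", "pin:Diana")]


def build_weights():
    """Build a symmetric weight table from the link lists above."""
    weights = {}
    for links, value in ((EXCITE, 1.0), (INHIBIT, -1.0)):
        for a, b in links:
            weights.setdefault(a, {})[b] = value
            weights.setdefault(b, {})[a] = value
    return weights


def settle(external, steps=30, rate=0.1):
    """Update all units together for a fixed number of steps."""
    weights = build_weights()
    act = {unit: 0.0 for unit in weights}
    for _ in range(steps):
        new_act = {}
        for unit in act:
            net = external.get(unit, 0.0)
            for other, w in weights[unit].items():
                net += w * max(act[other], 0.0)  # only active units send output
            # Move part of the way toward the net input, bounded to [-1, 1].
            new_act[unit] = max(-1.0, min(1.0, (1 - rate) * act[unit] + rate * net))
        act = new_act
    return act
```

Presenting only Charles's face, as in settle({"face:Charles": 1.0}), raises the activation of his PIN, his name, and the polo fact, while pushing Diana's PIN below its resting level of zero.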

In any event, only when a face has been identified as (1) familiar (or not) and (2) named (or not) is the process of face perception complete.

Bruce and Young are clear about the explicit parallels between face recognition and other forms of recognition -- object recognition and word recognition. In each case, processing follows a route from the early stages of visual processing to recognition units, then an identity unit (itself connected to other, related semantic knowledge), then a name code, and finally the naming response. But the modules performing these perceptual tasks are different.

We generally think of face recognition as a universal ability, and prosopagnosia as a neurological condition affecting only patients with certain forms of brain damage. However, face-recognition ability may vary along a wide continuum (Russell, Nakayama, & Duchaine, Psychonomic Bulletin & Review, 2009). Even in the absence of demonstrable brain damage, some people are so bad at face recognition that they appear to have a "developmental", perhaps congenital and inherited, form of prosopagnosia. Others, whom Russell et al. call "super-recognizers", have an extraordinary ability to recognize faces from different angles and in different contexts. These individual differences are often assessed employing the Cambridge Face Memory Test.

Taxonomy of Memory

Broadly speaking, memory comes in two forms (Winograd, 1975; Anderson, 1976):

declarative, representing factual knowledge;
procedural, representing directions for cognitive or motor actions.

Declarative Knowledge

Declarative knowledge consists of factual statements about the world -- past, present, and, for that matter, the future. All declarative knowledge can be represented by sentence-like propositions, consisting of a subject, an object, and the relation between them. The general format for a proposition is The subject verbed the object. Declarative knowledge can be represented in two ways:

meaning-based representations, which are essentially verbal descriptions;
perception-based representations, which are essentially mental images.

The most popular theoretical model for memory consists of an associative network with concepts represented by nodes, and associative links representing the relations between them. In an alternative representational format, propositions themselves are represented by nodes, associatively linked to other nodes representing the subject, object, and relation.

In any event, each node in memory is associatively linked to other nodes representing related knowledge, and each proposition is linked to related propositions, so that the associative network represents all of the individual's knowledge. For example, each element in the proposition The hippie touched the debutante (Anderson, 1976) is linked to other nodes representing our knowledge about hippies, debutantes, and touching. When nodes representing the elements of a proposition are activated by perception, activation spreads to related nodes in the network. This spreading activation serves as the basis for priming.
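
Spreading activation is easy to sketch in code. The tiny network below encodes the hippie proposition plus two invented associates ("long hair" and "ball gown"), which are hypothetical additions for illustration, as are the decay value and the depth limit.

```python
from collections import deque

# Each node lists the nodes it is associatively linked to.
LINKS = {
    "hippie": ["proposition 1", "long hair"],
    "touched": ["proposition 1"],
    "debutante": ["proposition 1", "ball gown"],
    "proposition 1": ["hippie", "touched", "debutante"],
    "long hair": ["hippie"],
    "ball gown": ["debutante"],
}


def spread(source, decay=0.5, max_depth=3):
    """Spread activation outward from one node, weakening with every link."""
    activation = {source: 1.0}
    queue = deque([(source, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbor in LINKS.get(node, []):
            if neighbor not in activation:  # keep the first (strongest) value
                activation[neighbor] = activation[node] * decay
                queue.append((neighbor, depth + 1))
    return activation
```

Activating "hippie" leaves some residual activation on "debutante", by way of the proposition that links them, and a weaker trace on "ball gown". A node that is already partly active needs less input to reach threshold, which is the sense in which it has been primed.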

The facts comprising declarative memory can be either episodic or semantic in nature.

In principle, every episodic memory consists of a number of elements, each represented by one or more propositions:

A description of the event itself.
A description of the episodic context in which the event occurred. These contextual features may include:

A description of the spatial and temporal context: the time and place where the event occurred.
A description of the causal relations entailed by the event.

Self-Reference, including:

The role of the self in the event being represented, as

Agent (as in, I gave the present to Lucy);
Patient (as in, Lucy gave the present to me);
Stimulus (as in, I made Lucy happy); or
Experiencer (as in, Lucy made me happy).

The self's internal state, including

Cognitive (knowledge and beliefs);
Affective (feelings and emotions); and
Conative (desires and goals).

In contrast with the autobiographical knowledge of specific events and experiences that comprises episodic memory, semantic memory holds abstract, context-free knowledge:

our "mental lexicon" of knowledge of the meanings of words;
other linguistic knowledge, e.g., of phonology, syntax, and pragmatics;
knowledge of objects; and last, but not least in this context,
categorical knowledge of concepts, subset-superset relations, similarities, and category-attribute relations.

Semantic memory forms the background against which episodic memories are encoded.

Given the basic empiricist position that (most) knowledge comes to us via experience, or reflections on experience, it is evident that semantic memory begins in episodic form. When you first learn that Columbus discovered America you also remember the circumstances under which you learned that fact. However, accumulated encounters with the fact will blur the episodic features, resulting in a generic memory that makes no reference to the circumstances under which it was acquired.

Procedural Knowledge

Similarly, all procedural knowledge can be represented by productions consisting of a goal, a condition, and an action which will achieve the goal under the condition. The general format for a production is IF (goal, condition) THEN (action). Productions, too, can be linked to each other in a larger network called a production system. Procedural knowledge comes in two broad forms:

motor activities, which take the form of overt behavior (like tying a knot);
mental activities, which take the form of mental transformations (like solving an equation).

The goal and condition elements of a production are represented as nodes in declarative memory. When these nodes are activated by perception, the action is automatically executed. Automatic processes have a number of interesting features:

Inevitable Evocation.
Incorrigible Execution.
They are Effortless.
They are Rapid.
They are Unconscious.

In principle, declarative knowledge is consciously accessible, in that it can be brought into phenomenal awareness. But procedural knowledge is, in principle, unconscious. The action elements are not part of the declarative memory system. As a result, procedural knowledge can be consciously known only by inference from performance.

Individual productions are embedded in larger production systems. In a production system, execution of one production is the precondition for executing another. Or, put another way, execution of each production creates the conditions under which the next production can be executed.
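
A minimal sketch may help show how one production sets up the next. The two shoe-tying rules, the field names, and the simple "fire the first rule that applies" control scheme are all illustrative assumptions; real production-system architectures are considerably more elaborate.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Production(NamedTuple):
    goal: str                  # IF this is the current goal...
    condition: FrozenSet[str]  # ...and these facts are currently true...
    action: str                # ...THEN carry out this action,
    result: FrozenSet[str]     # which makes these facts true.


# Hypothetical rules, deliberately listed out of order.
RULES = [
    Production("shoes tied", frozenset({"laces crossed"}),
               "pull the loops tight", frozenset({"shoes tied"})),
    Production("shoes tied", frozenset({"shoes on"}),
               "cross the laces", frozenset({"laces crossed"})),
]


def run(goal: str, facts: Set[str]) -> List[str]:
    """Fire applicable productions until the goal holds or nothing can fire."""
    facts = set(facts)
    trace = []
    fired = True
    while goal not in facts and fired:
        fired = False
        for rule in RULES:
            applies = rule.goal == goal and rule.condition <= facts
            if applies and not rule.result <= facts:
                trace.append(rule.action)
                facts |= rule.result  # the result enables the next rule
                fired = True
                break
    return trace
```

Starting from run("shoes tied", {"shoes on"}), the second rule fires first, because only its condition is met; its result then satisfies the condition of the first rule, which fires next and achieves the goal.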

The rules and skills that comprise procedural memory can be either cognitive or motoric in nature:

Cognitive procedures include the mental activities by which we act on mental states to transform them -- for example, how to carry out long division or to extract square roots (without using a calculator, that is).
Motor procedures include the behavioral activities by which we act on the external environment to transform it -- for example, how to tie our shoes (i.e., to transform our shoes from a state of being untied to a state of being tied) and how to order a meal in a restaurant.

The goals and conditions of a production are represented as nodes in declarative memory. When activated, the resulting production may create a new node in declarative memory.

According to a theory offered by J.R. Anderson, procedural memory starts out in declarative form -- that is, as a factual list of directions -- an algorithm, or "recipe" -- that can be consciously retrieved and deliberately executed by the user. Through repeated use, this knowledge structure undergoes proceduralization, a process analogous to knowledge compilation in a computer, that changes its representational format from declarative to procedural. At this point activation of nodes representing the goals and conditions of a production will automatically execute the action.

Social Cognition and the Forms of Memory

All these forms of memory are relevant to social cognition.

On the declarative side:

Episodic memory consists, first and foremost, of the individual's autobiographical memory -- a record of his or her own experiences, thoughts, and actions. Autobiographical memory is an important part of the self-concept.

I gave a present to Lucy yesterday, for her birthday.
I felt bad when Reid criticized me.

Episodic memory is also studied in the form of person memory -- our knowledge of specific episodes in other people's lives.

Judy won the chess tournament.
James Bartlett adopted the kitten.

Similarly, semantic memory consists in our knowledge of our own and other people's general characteristics, independent of their specific behaviors.

I am a neurotic introvert.
Lucy is a stable extravert.

Semantic memory also includes our implicit personality theory -- our general idea of what other people are like, and how personality is organized.

The basic dimensions of personality are neuroticism, extraversion, agreeableness, conscientiousness, and openness.
Most people are emotionally stable and extraverted.

Semantic memory also consists in our lexical knowledge of the meanings of words relevant to personality and social interaction.

Neurotics are anxious and excitable.
Extraverts are talkative and sociable.

On the procedural side:

Motor procedures include specific, habitual forms of overt behavior.

If you want to be liked, then make eye contact with the person you're talking to.
If you want to make a good impression, then shake hands with a grip that is neither too strong nor too weak.
If you're feeling emotion, then don't display it in public.
If you're engaged in a conversation, then stand within X feet of the person.

Cognitive procedures include specific, habitual forms of cognitive activity:

If you're forming an impression, then pay special attention to central traits.
If you want to know whether someone is likeable, then compute the weighted average of the likeability values of their traits.
If you want to analyze the cause of some behavior, then search for information about the consensus, consistency, and distinctiveness of that behavior (we'll discuss this rule later, in the lectures on Social Judgment).
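
The weighted-average rule in the second cognitive procedure above can be written out explicitly. In this sketch each trait is a pair consisting of a likeability value and a weight; the function name and the numbers in the usage note are invented for illustration.

```python
from typing import Iterable, Tuple


def weighted_average(traits: Iterable[Tuple[float, float]]) -> float:
    """Combine (likeability value, weight) pairs into a single impression."""
    traits = list(traits)
    total_weight = sum(weight for _, weight in traits)
    if total_weight == 0:
        raise ValueError("at least one trait must have a nonzero weight")
    return sum(value * weight for value, weight in traits) / total_weight
```

For example, a trait valued at 2.0 with weight 1.0 and a trait valued at 4.0 with weight 3.0 yield an overall impression of 3.5. The heavily weighted trait pulls the result toward its own value, which is one way of expressing the idea that some traits count for more than others.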

Eyewitness Identification

Although models such as Bruce and Young's are viable accounts of face recognition, the fact is that we're often pretty bad at recognizing the faces of strangers. This is demonstrated clearly by eyewitness identification in forensic settings, where a witness or victim has to identify the perpetrator from a lineup or photospread. If you think about it, eyewitness identification is a paradigmatic example of episodic person memory: the witness or victim encounters the perpetrator just once, during the commission of a crime; and the question is whether the former will recognize the latter at some subsequent time. In fact, these days, the standard police procedure for lineups or showups (using photospreads) is to ask the witness two questions: (1) "Have you ever seen any of these individuals before?" (2) "And if so, where?". This is exactly what episodic memory is all about: what happened, and in what context.

True Story: I was once asked to testify for the defense, as an expert witness on memory, in a case in which several Puerto Rican men had been charged with armed robbery of a gas station in Chicago (Illinois v. Ciaramitaro, Cook County, 1985). The police theory was that the men were members of the Puerto Rican National Liberation Front, and had held up the gas station to raise money to support terrorist activities. The hearing was held in a fortified courtroom, shielded from spectators by bulletproof glass. During the initial investigation, the police had shown the gas station attendant a photospread including the suspects, and he had failed to identify any of them. But later, the police showed the attendant a new photospread, with the same suspects but different foils, and asked him if he had seen any of the faces before. The attendant immediately picked out the suspects, leading to their arrest. As an expert, my job was to testify that this was an inappropriate procedure, and that the repetition of the suspects' photos might have biased the attendant toward recognition. But before I could conclude my testimony, the judge interrupted the proceedings and asked the district attorney prosecuting the case, "They did what?". When the prosecutor admitted that this had, in fact, been the procedure, the judge dismissed the case immediately.

To take just one example of the problem: according to the Innocence Project, a large proportion -- up to 75% -- of criminal convictions subsequently vacated on the basis of DNA evidence had been based on eyewitness identification.

True Story: Donald Thomson, an Australian psychologist (and the Thomson of Tulving & Thomson, 1973), was once arrested on a charge of rape based on the victim's identification. Thomson's alibi was that, at the time of the attack, he was being interviewed on television, along with an assistant police commissioner. The detective who interrogated Thomson initially didn't believe him. It turned out that the victim had been watching TV right before her attack, and apparently confused Thomson's face with that of her perpetrator (Baddeley, 1990).

Now, from one point of view the unreliability of eyewitness identification should surprise us. After all, picture recognition is remarkably good.

In a classic study, Roger Shepard (1967) had subjects study a long list of words, sentences, or photographs for just a single trial. In a two-alternative forced choice recognition test, the subjects correctly recognized 88-89% of the words and sentences, and almost 100% of the pictures.
Paivio (1969) showed that concrete words were remembered better than abstract words.
Bahrick et al. (1975) followed up high-school graduates, asking them to identify classmates from photographs in their yearbook.

When subjects had to match a picture with a name (essentially a recognition test), they were more than 90% accurate even after 14 years, and over 60% accurate after more than 47 years.
When they had to generate the classmate's name, given only their picture (essentially a cued recall test), performance was worse, but they were still 60% correct after 7 years.

Collectively, results like these illustrate the picture superiority effect (Paivio, 1971, 1986; Madigan, 1983), and are often taken as evidence for a dual-code theory of memory, which states that memories can be encoded in both verbal (like the word dog) and imagistic (like a picture of a dog) form. All of this suggests that memory for what a person looks like should be really good.

Doubts about eyewitness memory really began with a study by Loftus and Palmer (1974) demonstrating the post-event misinformation effect. That is, leading questions like "How fast were the cars going when they _____ each other?" -- where the blank space was filled by verbs like hit or smashed -- biased observers' memories for, or judgments about, the event they witnessed.

These doubts gathered steam with a series of analyses showing a surprisingly low correlation between accuracy and confidence in eyewitness identification.

Eyewitness identification is one area where empirical research, in both the laboratory and the field, has had a major impact on practical application. For example, Lindsay and Wells (American Psychologist, 1993) and others argued that sequential lineups, in which witnesses must make individual decisions about each target, were superior to the classic simultaneous lineup, in which witnesses viewed all the targets at the same time. The sequential procedure significantly reduced the rate of false identifications, without substantially affecting correct identifications (Wells et al., Psychology, Public Policy, & Law, 2011). More recently, however, the simultaneous method has come back into favor, based on a form of signal-detection analysis (Gronlund et al., Current Directions in Psychological Science, 2014; American Psychologist, 2015). It turns out that the sequential method makes witnesses less likely to identify anyone, and this spuriously inflates the accuracy of any identifications that are made. Employing a variant on signal detection theory, Gronlund and his colleagues found that simultaneous presentation yields higher accuracy levels than sequential presentation -- and that confidence and accuracy are correlated after all.

After reviewing the empirical research, the National Academy of Sciences made recommendations concerning the proper handling of eyewitness testimony (Identifying the Culprit: Assessing Eyewitness Identification, 2014). For example, it recommended that jurors be cautioned that confident identifications are not necessarily accurate ones. However, it didn't make specific recommendations on simultaneous vs. sequential presentations.

Still, in the present context, the bottom line of this research is that eyewitness identifications are a lot more accurate than we originally thought they were. This, in turn, is commensurate with what is known about picture recognition in nonsocial domains.

Person Memory

Person memory consists of a person's factual knowledge about some other person -- his or her general traits and attitudes, and his or her specific behaviors and experiences. Thus, person memory consists of a mix of episodic and semantic memories.

The Verbal-Learning Paradigm

In general, person memory is studied with variants of the classic verbal-learning paradigm employed in the study of memory since the time of Ebbinghaus (1885). The only difference is that the items committed to memory are descriptions of a person rather than a list of words.

Interestingly, when subjects study a list of facts about a person in order to form an impression of that person's personality, they remember the facts better than if they studied them in anticipation of a later memory test.

The Structure of Person Memory

Viewed from the perspective of a generic associative-network model of memory, person memory can be represented by a node representing each individual person, linked to nodes representing facts about that person. As we accumulate knowledge about the person, additional links are created.

This structure is illustrated by a classic experiment on the fan effect by Anderson (1974). Anderson asked his subjects to learn simple facts about people and locations, such as:

The doctor is in the bank. (1-1)
The fireman is in the park. (1-2)
The lawyer is in the church. (2-1)
The lawyer is in the park. (2-2)

Note that from just these four sentences, we have learned:

1 fact about the doctor (that he's in the bank).
1 fact about the bank (that the doctor's in it).
1 fact about the fireman.
2 facts about the lawyer.
2 facts about the park.

The subjects memorized such sentences to a criterion of perfect recall. Then Anderson conducted a recognition test in which subjects were asked to verify whether they had studied various sentences:

The doctor is in the bank (studied target)
The doctor is in the park (unstudied lure).

Subjects rarely made mistakes on this task, but Anderson was more interested in their response latencies, which varied according to the number of facts that they had learned about various people and locations. The more facts the subjects knew, about either people or locations, the longer it took them to verify any particular fact. Anderson called this outcome the paradox of knowledge: the more you know about a subject, the harder it is to retrieve any particular item of information about it. The fan effect is an excellent demonstration of inter-item interference in memory, but in the present context it is most important for what it tells us about how knowledge is represented in memory, and how it is retrieved.

Persons and locations are represented as nodes in an associative network.
Knowledge about a person is represented by associative links between nodes.
These associative links fan out from the "person" nodes.
When queried about a person, the node representing that person is activated.
Then retrieval traces down each associative link serially, one link at a time, until the relevant fact is reached (in the case of a true fact, like "the doctor is in the bank"), or not (in the case of a false fact, like "the doctor is in the park").
It's the fact that these associative links are searched serially, one at a time, that gives rise to the fan effect. The more you know, the more associative links fan out from the topic node, and the longer it will take, on average, to verify any one of them.
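The serial-search account just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Anderson's actual model: the function names are hypothetical, the network is indexed by person only (Anderson found fan effects for locations as well), links are searched in study order, and the number of links examined stands in for response latency.

```python
from statistics import mean

# The four study sentences from the text, as (person, location) pairs.
STUDIED = [
    ("doctor", "bank"),
    ("fireman", "park"),
    ("lawyer", "church"),
    ("lawyer", "park"),
]

def build_network(facts):
    """Map each person node to the list of location links fanning out from it."""
    network = {}
    for person, location in facts:
        network.setdefault(person, []).append(location)
    return network

def links_searched(network, person, location):
    """Trace the links from a person node one at a time.

    Returns (found, links_examined). A lure requires examining every link.
    """
    links = network.get(person, [])
    for count, linked in enumerate(links, start=1):
        if linked == location:
            return True, count
    return False, len(links)

def mean_search_length(network, person):
    """Average number of links examined to verify a studied fact about a person."""
    return mean(links_searched(network, person, loc)[1] for loc in network[person])

network = build_network(STUDIED)
print(mean_search_length(network, "doctor"))  # fan of 1
print(mean_search_length(network, "lawyer"))  # fan of 2
```

On these assumptions, verifying a fact about the lawyer (fan of 2) takes more search steps, on average, than verifying a fact about the doctor (fan of 1), which is the qualitative pattern of the fan effect.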

Individuation and Reference

An interesting problem arises with respect to individuation and reference. Suppose you learn a set of facts about one person, James Bartlett (e.g., that he rescued the kitten), and then another set of facts about another person, The Lawyer (e.g., that he caused the accident). Then you learn that James Bartlett and The Lawyer are one and the same person. Whatever you learned about James Bartlett you also now know about The Lawyer, and vice-versa. How is this situation represented in memory, so that you can know that James Bartlett caused the accident (which, of course, he did)?

One possibility is that the two representations remain separate -- one for James Bartlett, and one for The Lawyer. Under these conditions, we wouldn't know that James Bartlett caused the accident, because there is no connection between the two nodes. But we do know this fact, so this representation can't be right.

Another possibility is that all the knowledge about The Lawyer is linked directly to James Bartlett, including the fact that James Bartlett is a lawyer, and vice-versa. This permits us to know directly that James Bartlett, the lawyer, caused the accident.

Yet a third possibility is that when we learn that James Bartlett is the lawyer, we establish a new link between the James Bartlett node and the Lawyer node. This permits us to know by inference that James Bartlett caused the accident.

In a classic study of person memory, Anderson and Hastie (1974) used a sentence-verification paradigm to show that response latencies for inferential sentences were longer than those for non-inferential sentences, but only for subjects who learned later that James Bartlett was the lawyer. Apparently, there are three links between the node representing James Bartlett and the node representing the fact that he caused the accident; and because it takes time to trace down each of these links (that's the implication of the fan effect), it takes longer to verify an inferential fact.

Apparently, if we know the reference -- that James Bartlett is the lawyer -- at the outset, we build a knowledge structure around a single node representing everything we know about the person.

But when we only learn the reference later, we establish a link between two separate nodes representing, respectively, what we know about James Bartlett and what we know about The Lawyer. The extra link takes time to traverse, leading to the longer response latencies in the "reference after" condition.
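The contrast between the "reference before" and "reference after" structures can be made concrete with a small graph sketch. The encoding here is an assumption for illustration: each fact is a single node attached directly to a person node, and the helper names are hypothetical. The absolute link counts depend on how propositions are encoded, so only the difference between the two structures matters.

```python
from collections import deque

def add_link(graph, a, b):
    """Add an undirected associative link between two nodes."""
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def links_to_traverse(graph, start, goal):
    """Breadth-first search: fewest links from start to goal, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None

# Reference known at the outset: one person node holds every fact.
reference_before = {}
add_link(reference_before, "James Bartlett", "rescued the kitten")
add_link(reference_before, "James Bartlett", "caused the accident")

# Reference learned later: two person nodes joined by one identity link.
reference_after = {}
add_link(reference_after, "James Bartlett", "rescued the kitten")
add_link(reference_after, "The Lawyer", "caused the accident")
add_link(reference_after, "James Bartlett", "The Lawyer")

print(links_to_traverse(reference_before, "James Bartlett", "caused the accident"))
print(links_to_traverse(reference_after, "James Bartlett", "caused the accident"))
```

In this sketch the inferential fact is one extra link away in the "reference after" structure, which is the kind of difference that would show up as a longer verification time.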

So now, within the framework of a generic associative-network model of memory, we have some idea of what person memory looks like -- that is, how our knowledge about a person is represented in memory.

There is a central node representing each person.
There are nodes representing various facts about the person.

His or her name.
What he or she looks like (perhaps in the form of a mental image or some other perception-based representation).
Traits, attitudes, and other generic information constituting semantic person memory.
Behaviors, experiences, and other context-specific information constituting episodic person memory.

And there are associative links connecting the person node to each of these fact nodes.

Schematic Effects on Person Memory

A great deal of research on person memory has been devoted to studying the effects of general beliefs and expectations, collectively known as cognitive schemata, on memory for specific facts about a person. Bartlett (1932) famously proposed that memory favors schema-congruent information, but subsequent research yielded conflicting results.

Technically, the word schema has a Greek root, and so its proper plural is schemata. However, the word has been Anglicized, and so you will often see the plural schemas instead. If you are really lucky, you will stumble on the occasional use of the word schematas as the plural for schema. Apparently, the writer didn't want to take any chances -- sort of like the kind of person who wears both a belt and suspenders to keep up his pants.

Hastie & Kumar (1979) combined the verbal-learning paradigm with Asch's impression-formation paradigm. Subjects first studied an ensemble of traits describing some person, in order to induce a schema for that person (Judy is smart and sophisticated). Then they studied a list of that person's specific behaviors.

Some behaviors are schema-congruent, in that the probability of observing the behavior, given the schema, is greater than the probability of observing the behavior in the absence of the schema. (Smart, sophisticated Judy won the chess tournament and attended the concert.)
Other behaviors are schema-incongruent, in that the probability of observing the behavior, given the schema, is less than the probability of observing the behavior in the absence of the schema. (Smart, sophisticated Judy shouldn't repeat mistakes or get confused by a TV show.)
And still other behaviors are schema-irrelevant, in that the probability of observing the behavior is the same, whether the schema is present or not. (What difference does it make whether smart, sophisticated Judy eats a cheeseburger or goes to the 3rd floor?)
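The three definitions above can be written compactly in terms of conditional probabilities. The notation is introduced here for convenience and is not from the original studies: B stands for the behavior, S for the schema being in force, and \neg S for its absence.

```latex
\begin{align*}
\text{schema-congruent:}   \quad & P(B \mid S) > P(B \mid \neg S) \\
\text{schema-incongruent:} \quad & P(B \mid S) < P(B \mid \neg S) \\
\text{schema-irrelevant:}  \quad & P(B \mid S) = P(B \mid \neg S)
\end{align*}
```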

Hastie & Kumar varied the mix of behaviors across conditions:

0 schema-incongruent, 4 schema-irrelevant, and 12 schema-congruent.
1 schema-incongruent, 4 schema-irrelevant, and 11 schema-congruent.
3 schema-incongruent, 4 schema-irrelevant, and 9 schema-congruent.
6 schema-incongruent, 4 schema-irrelevant, and 6 schema-congruent.

Subsequent recall testing showed that schema-relevant items were remembered better than schema-irrelevant items. But among the schema-relevant items, schema-incongruent items were remembered even better than schema-congruent ones.

The findings of the Hastie & Kumar experiment illustrate the schematic processing principle of memory: the memorability of an event depends on its relationship to pre-existing schemata.

Hastie (1980, 1981) proposed a two-process explanation for schema-dependency:

Memory favors schema-congruent information because the schema itself provides additional cue information at the time of retrieval (following the cue-dependency principle in memory).
Memory favors schema-incongruent information because schema-incongruent information is surprising, and surprising events must be explained, and explanatory activity produces a more elaborate memory trace at the time of encoding (following the elaboration principle in memory).
Schema-irrelevant information gets neither advantage, and so is more likely to be forgotten.

In order to test this explanation, Hastie (1984) conducted an experiment in which the trait ensemble was followed by a list of schema-congruent and schema-incongruent items. Recall testing yielded the schematic processing effect, as expected. However, in a second experiment Hastie asked subjects to perform a sentence-continuation task: after each item, they were supposed to continue it with either an explanation of the event, an elaboration of the event, or the sequel to the event. On a later recall test, items (whether schema-congruent or schema-incongruent) in the explanation condition were recalled better than those in the elaboration or sequel condition. So, it's not schema-incongruency per se that yields better memory: it's the explanatory activity that schema-incongruency instigates.

Thomas Srull (1981) offered a somewhat different explanation for schema-dependency, within the framework of a generic associative-network model of memory. He proposed that nodes representing individual episodes are linked to a node representing the person, in the usual way. Then, connections among nodes are produced by virtue of processing at the time of encoding -- such as explaining schema-incongruent items in light of the schema. As a result, nodes representing schema-incongruent items are associatively linked both to each other and to nodes representing schema-congruent items as well.

In his experiment, Srull, like Hastie & Kumar, varied the mix of behaviors: 12 schema-congruent, 12 schema-neutral (or schema-irrelevant), and either 0, 6, or 12 schema-incongruent behaviors. Testing recall, Srull obtained the usual schema-dependency effect. Schema-relevant items were recalled better than schema-irrelevant items, and schema-incongruent items were recalled better than schema-congruent items.

Then, Srull employed a sentence-verification procedure, not unlike that which had been used by Anderson & Hastie (1974), to examine priming effects on recognition memory. Srull compared response latencies to verify schema-congruent, -incongruent, and -irrelevant items, depending on the immediately preceding item. Processing of schema-irrelevant items served as a baseline. Schema-congruent items primed responses to schema-incongruent items, while schema-incongruent items primed both schema-congruent and schema-incongruent items; schema-irrelevant items didn't prime anything. These results are consistent with Srull's hypothesis, that schema-incongruent items are linked to each other and to schema-congruent items, but that schema-congruent items are not directly linked to each other.
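The pattern of inter-item links that Srull proposed, and the priming it predicts, can be sketched as a simple lookup. This is an illustration under assumptions, not Srull's own formalism: the function names are hypothetical, and priming is assumed to occur whenever, and only when, the prime and the target are directly linked.

```python
ITEM_TYPES = ("congruent", "incongruent", "irrelevant")

def directly_linked(prime, target):
    """Assumed inter-item links: incongruent items link to each other and to
    congruent items; congruent items do not link to each other; irrelevant
    items link to nothing."""
    if "irrelevant" in (prime, target):
        return False
    return "incongruent" in (prime, target)

def predicted_priming():
    """Table of (prime type, target type) -> whether priming is predicted."""
    return {(p, t): directly_linked(p, t) for p in ITEM_TYPES for t in ITEM_TYPES}

for (prime, target), primes in predicted_priming().items():
    print(f"{prime:>11} -> {target:<11} {'priming' if primes else 'no priming'}")
```

The resulting table matches the qualitative pattern reported above: congruent items prime incongruent ones, incongruent items prime both kinds of schema-relevant item, and irrelevant items prime nothing.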

Organization of Person Memory

Srull's model shows how behavioral episodes are represented in person memory, and presumably more abstract, generic trait information is represented the same way -- as nodes representing traits linked to a central node representing the person. However, there is another possibility: given that traits are categories of behaviors, it is possible that traits organize behavioral episodes in memory -- that is, that nodes representing traits are linked to nodes representing the behaviors that exemplify or instantiate them.

Category Clustering

In fact, there is a long tradition in the study of nonsocial memory which supports the idea that abstract categories organize more concrete information in memory. For example, if subjects study a list of words, some of which are instances of various conceptual categories, their recall will tend to be organized by category -- a phenomenon called category clustering. In the early 1950s, as one of the earliest shots in the "cognitive revolution" in the study of human learning and memory, Bousfield and his associates showed that category clustering increased over trials, just as recall itself did. This is consistent with the organization principle in memory processing.
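To see how category clustering can be quantified, here is a minimal sketch of one simple index, the ratio of repetition: the proportion of adjacent pairs in the recall protocol that come from the same category. The studies discussed here may well have used other, chance-corrected indices; the word list and the function name are hypothetical.

```python
def ratio_of_repetition(recall_order, category_of):
    """Proportion of adjacent recalled items that share a category.

    recall_order: items in the order they were recalled.
    category_of:  mapping from each item to its category label.
    """
    categories = [category_of[item] for item in recall_order]
    if len(categories) < 2:
        return 0.0
    repetitions = sum(1 for a, b in zip(categories, categories[1:]) if a == b)
    return repetitions / (len(categories) - 1)

# Hypothetical study list with two conceptual categories.
CATEGORY_OF = {"dog": "animal", "cat": "animal", "apple": "fruit", "pear": "fruit"}

clustered = ["dog", "cat", "apple", "pear"]    # recall grouped by category
alternating = ["dog", "apple", "cat", "pear"]  # recall ignores category

print(ratio_of_repetition(clustered, CATEGORY_OF))
print(ratio_of_repetition(alternating, CATEGORY_OF))
```

Note that this raw index is not corrected for chance, so a value "significantly greater than zero" can still be small in absolute terms.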

Consistent with the organization principle, a structure of person memory in which "trait nodes" fan out from "person nodes", and specific "behavior nodes" fan out from the trait nodes, has been very popular. Certainly it is a highly rational way to organize person memory. Accordingly, some investigators have tried to demonstrate that person memory is, indeed, organized by traits.

Certainly, person memory is organized by person. In one experiment, Ostrom et al. (1981) had subjects study five sentences each about 5 familiar people (e.g., Clint Eastwood is an actor) and about 5 unfamiliar people (e.g., Clark Patterson is a comedian). The 50 sentences were randomized at presentation, but Ostrom et al. found that subjects' recall tended to be organized by person -- e.g., they recalled several Clint Eastwood facts before recalling several Clark Patterson facts, etc.

But is recall of facts about a single individual person organized by that person's traits? In one study, Hamilton, Leirer, & Katz (1979) had subjects study a list of 16 sentences, 4 each describing a person's socially desirable, intelligent, athletic, and religious behaviors. As expected, recall was better under impression-formation instructions compared to a memory set. But although the level of category clustering was significantly greater than zero, it was not very high in absolute terms.

In a similar study, Hamilton, Katz, & Leirer (1979) gave subjects 8 study-test trials, instead of only one or two, as in the previous study. Recall grew appreciably over the 8 trials, but clustering remained at rather low levels. Although Hamilton et al. claimed these two studies as evidence for trait-based organization of person memory, which they are, kindasorta, it has to be said that there wasn't much trait-based organization in evidence.

Further doubt was cast on the organizational hypothesis by a study by Smith & Kihlstrom (1987), which used Norman's (1968) set of Big Five traits as stimulus materials. After 5 study-test trials, there was very little evidence of category clustering -- especially compared to a standard experiment of the type performed by Bousfield and his associates, using nouns instead of trait adjectives as stimulus materials.

Even more doubt was cast by an experiment by Dabady, Bell, & Kihlstrom (1992) which used sentences describing specific behaviors related to the Big Five, instead of trait adjectives, as stimulus materials. The sentences were presented either randomly, or blocked by trait.

Blocked presentation yielded slightly worse recall than random presentation, suggesting that subjects didn't capitalize on the traits to organize their memories. And category clustering was at very low levels, approaching zero.

Priming of Behaviors by Traits

The coup de grâce to the organization hypothesis was administered by a series of studies performed by Klein, Loftus, and their colleagues. In one experiment, Klein and Loftus (1992) presented a list of 20 behaviors, 4 instances of each of 5 traits (athletic, intelligent, honest, religious, and sociable) under three different conditions (impression formation, memorization, and category clustering).

There was some evidence of category clustering in the impression and memorization conditions, as in Hamilton's studies, but the level of clustering was much less than that observed when subjects were actually asked to sort the behaviors into the appropriate trait categories.

A second study, by Klein, Loftus, et al. (1992), capitalized on priming, as in the earlier study by Srull. If trait nodes are interposed between person nodes and episode nodes, then activating the trait nodes should prime retrieval of trait-related behavioral episodes. Klein et al. first asked their subjects to rate their own mothers on a list of trait adjectives. Then, in the experiment proper, the subjects were presented with the trait terms and asked to perform one of three tasks:

define the trait;
judge whether it described their mothers;
recall a behavioral episode in which their mothers displayed the trait.

If traits organize memory, then rating a trait for its descriptiveness should prime retrieval of trait-related behaviors. As it turned out, there was no priming for traits that were highly descriptive of the subjects' mothers: response latencies in the recall task were no shorter when it was preceded by the describe task than when it was preceded by the define task.

For traits that were less descriptive of their mothers, however, there was such priming. Klein et al. concluded that highly descriptive traits are represented independently of trait-related behaviors. In order to account for the priming effect observed with less-descriptive traits, Klein et al. suggested that these trait judgments are based on the retrieval of exemplary behaviors. In effect, then, in the case of less-descriptive traits, it is the retrieval of exemplary behaviors, not retrieval of the trait itself, that is responsible for priming the retrieval of other exemplary behaviors.

Episodic and Semantic Self-Knowledge in Amnesia

Neuropsychological evidence converges on the conclusion that knowledge of traits is represented independently of knowledge of behaviors. Patients suffering the amnesic syndrome following damage to the hippocampus and the medial portions of the temporal lobe cannot remember "postmorbid" events that happened after their brain damage occurred. This means that they lose autobiographical memory. Yet they do retain semantic knowledge of themselves, in terms of their personality traits.

Actually, there are two temporal (time-related) forms of amnesia:

Anterograde amnesia (or AA) covers "postmorbid" events and experiences that occurred since the onset of the brain damage. The most famous case is that of H.M., who suffered an anterograde amnesia following excision of his hippocampus and other portions of the medial temporal lobe as a desperate treatment for intractable epilepsy. H.M.'s surgery occurred when he was 27 years old, and he lived to be 82 (he died in 2008), but for all that time he had not a single conscious recollection of anything he had experienced, or anything he had done.
Retrograde amnesia (or RA) covers "premorbid" memories that occurred prior to the brain damage. Retrograde amnesia most commonly occurs in the aftermath of a concussive blow to the head, and patients eventually recover most of their memories.

In the most dramatic example, Endel Tulving (1993) studied Patient K.C., a unique patient who suffered a complete amnesia, both anterograde and retrograde, following a motorcycle accident. He also underwent a profound personality change, from fairly extraverted before the accident to fairly introverted afterwards. K.C. was unable to remember any particular event that occurred throughout his whole life, showing a complete loss of episodic self-knowledge. Nevertheless, he was able to describe his current personality accurately, showing that semantic self-knowledge had been spared. K.C. died in 2014 at age 62.

In one part of his study, Tulving employed rating scales consisting of common trait adjectives.

After the accident, K.C.'s self-ratings (i.e., of his changed, postmorbid personality) largely agreed with his mother's ratings of his postmorbid personality, Q = .77.

The Q statistic employed by Tulving is similar to the familiar correlation coefficient, r.

And K.C.'s ratings of his mother's personality agreed with her self-ratings, Q = .80.

In another analysis, Tulving employed a two-alternative forced-choice procedure, in which pairs of adjectives were matched for social desirability, and K.C. and his mother had to pick which one of each pair applied.

The test-retest reliability of K.C.'s self-ratings of his own postmorbid personality was relatively high, showing 76% agreement across two tests.
His mother's ratings of his pre- and postmorbid personality showed only chance levels of agreement (50%), indicating just how much his personality had changed.
K.C.'s ratings of his postmorbid personality agreed with his mother's ratings about 73% of the time, showing accurate self-knowledge on his part.
And his ratings of his postmorbid personality agreed with his mother's ratings of his premorbid personality at only about chance levels, 53% -- thus, again, testifying to the magnitude of his personality change.
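The percentages above are simple agreement rates over the forced-choice pairs. A minimal sketch shows how such a figure is computed; the adjective choices are made up for illustration and are not Tulving's materials, and the function name is hypothetical. With two alternatives per pair, two raters choosing independently and at random would agree about half the time, which is why 50% counts as chance.

```python
def percent_agreement(choices_a, choices_b):
    """Percentage of forced-choice pairs on which two sets of choices match."""
    if len(choices_a) != len(choices_b) or not choices_a:
        raise ValueError("need two non-empty choice lists of equal length")
    matches = sum(a == b for a, b in zip(choices_a, choices_b))
    return 100.0 * matches / len(choices_a)

# Hypothetical data: which adjective of each pair was picked.
self_choices = ["quiet", "careful", "calm", "shy"]
mother_choices = ["quiet", "careless", "calm", "shy"]

print(percent_agreement(self_choices, mother_choices))  # 3 of 4 pairs match
```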

So, despite his total lack of episodic memory, K.C. nevertheless had fairly accurate semantic self-knowledge, of what he was like as a person. Yes, it would have been interesting if Tulving had also assessed K.C.'s knowledge of his premorbid personality, but Tulving didn't do that.

Videos of Tulving interviewing Patient K.C. are available on YouTube (Parts 1-6).

Stanley Klein and his colleagues have conducted a number of other neuropsychological case studies, yielding the same conclusions. We'll talk about them more later in the course, but for now the case of W.J. will suffice. She was a 2nd-quarter college freshman who suffered a concussive blow to her head in an accident in her dormitory. Medical examination revealed no neurological damage, but she did experience an anterograde amnesia covering the 45 minutes following her accident, and a retrograde amnesia covering the previous 6-7 months. The RA cleared after about 11 days, but in the meantime, Klein devised a number of tests of her episodic and semantic self-knowledge. W.J. didn't undergo any personality change following her injury; but, like many young adults, she had changed a lot, as a person, since graduating from high school and matriculating at college. So now the question was: does she know what kind of person she now is, even though she doesn't remember the experiences that, presumably, shaped her new personality? So like Tulving, Klein assembled a quick assessment of personality -- quick enough to administer before her RA cleared. Klein also tested a control group of female college students.

Klein first administered a number of tests to document her RA. In particular, he employed the Galton-Robinson technique, in which familiar words serve as cues for the retrieval of specific autobiographical memories. During the period of her RA, W.J. showed a profound deficit in memory for events from the past 12 months (by comparison, the control subjects showed a strong recency bias). After her RA cleared, the temporal distribution of W.J.'s autobiographical memories was indistinguishable from controls.

During the period of her RA, Klein assessed W.J.'s knowledge of her own personality. While Tulving checked K.C.'s self-ratings against his mother's ratings, Klein employed W.J.'s boyfriend for that purpose.

During her period of amnesia, W.J.'s ratings of her personality showed significant agreement with her boyfriend's ratings of her personality, r = .65.

The control group of women, and their boyfriends, showed exactly the same level of agreement, r = .65.

While she was amnesic, W.J. was also asked to rate her personality when she was in high school. These ratings correlated significantly with her ratings of her college personality, r = .53. So it's not that there was no continuity between high-school and college. But this correlation was significantly lower than the correlation of her current self-ratings with her boyfriend's ratings of her.
When tested again after she recovered her memories, the test-retest reliability of her self-ratings was very high, r = .74.

The control group, retested after the same interval, also showed high test-retest reliability, r = .78.
This correlation is also higher than her high-school/college correlation, further evidence that her college self is not completely accounted for by her high-school self.

The bottom line is that W.J. retained semantic self-knowledge even in the absence of episodic self-knowledge.

Independent Representation of Episodic and Semantic Person Memory

With these sorts of results in hand, we can draw some conclusions about the representation of persons in memory:

Persons can be represented as nodes in an associative memory system.
The person's traits and behaviors are also represented as nodes in the network, fanning out from the person node.
Within this system, items of (semantic) trait and (episodic) behavioral knowledge are represented independently of each other; behaviors do not fan out from the traits they exemplify.

Alternative Cognitive Architectures for Person Memory

The kinds of memory models discussed so far are known as symbolist and declarativist in nature.

Knowledge about a person is represented symbolically, in propositional form.
Propositions are represented symbolically as nodes in an associative network.
The associative network represents both trait (semantic) and behavioral (episodic) knowledge.

Empirically, it appears that trait and behavioral knowledge are represented independently in the network.

These models are, by far, the most popular in both cognitive psychology and social cognition. But they do not exhaust the possibilities, and some theorists have proposed alternative architectures for both nonsocial and social memory.

A Proceduralist View

Some of these models are proceduralist in nature, because at least some knowledge is represented procedurally, as production systems, rather than declaratively, as propositions. Rather than encoding all person memory in propositional form, at least some aspects of person memory are not encoded at all, but rather are computed as needed, on line, by executing various productions.

Among the most prominent such models in person memory is one proposed by Smith (1984), in which behaviors are represented declaratively, as in the Srull-Wyer model, but traits are not represented at all. Instead, trait information is computed as needed by productions that take the following form (very roughly):

If the goal is to form an impression, and the target rescued the kitten, then say he is kind.
If the goal is to form an impression, and the target cursed the salesgirl, then say he is thoughtless.

The model is not entirely proceduralist, because behavioral (episodic) information is still represented in memory as declarative knowledge. But it is partially proceduralist, in that trait (semantic) knowledge is not represented at all, but rather is computed as needed by procedures that are executed for that purpose.
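The division of labor in such a model can be sketched in a few lines of Python. This is only an illustration of the general idea, not Smith's actual implementation: the data layout, the function name compute_traits, and the second goal string are assumptions introduced here.

```python
# Declarative (episodic) store: behaviors are the only person knowledge kept in memory.
behaviors = {
    "target": ["rescued the kitten", "cursed the salesgirl"],
}

# Productions as (goal, behavior, trait) triples. No trait is stored with the person.
PRODUCTIONS = [
    ("form an impression", "rescued the kitten", "kind"),
    ("form an impression", "cursed the salesgirl", "thoughtless"),
]


def compute_traits(person, goal):
    """Fire every production whose goal and behavior conditions both match."""
    inferred = []
    for prod_goal, prod_behavior, trait in PRODUCTIONS:
        if goal == prod_goal and prod_behavior in behaviors.get(person, []):
            inferred.append(trait)
    return inferred


impression = compute_traits("target", "form an impression")
no_goal_match = compute_traits("target", "recall a behavior")
```

The traits appear only as the output of compute_traits; when the goal does not match, nothing is computed and nothing trait-like exists anywhere in the store.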

A model like this underlies Bem's self-perception theory of attitudes, which holds that people do not actually hold attitudes about various objects and issues. Instead, when asked their attitudes, they infer what their attitudes must be, based on their perception (and memory) of their own behavior.

A Connectionist View

The declarativist and proceduralist views of person memory are both "symbolic" models of knowledge representation, in which items of knowledge are represented symbolically, as propositions or productions, located at nodes in an associative network. An alternative view, known as connectionism, dispenses with the idea that knowledge is represented at discrete nodes in an associative network. Instead, it asserts that knowledge is represented as a pattern of activation across all the elements of a network.

So, two different pieces of knowledge, such as Judy attended the symphony concert and James Bartlett rescued the kitten, are not associated with different nodes in a network, but instead are represented by the same nodes -- that is, by the same processing elements. They simply differ in the pattern of activation in the associative pathways connecting the processing elements.
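One simple way to see how two memories can share the same processing elements is a Hebbian outer-product autoassociator, sketched below in Python. This is a generic textbook device chosen for illustration, not a model proposed in the sources cited here; the eight-unit patterns standing in for the two facts are arbitrary, and the function names are hypothetical.

```python
def store(patterns):
    """Superimpose all patterns in one weight matrix (sum of outer products)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, cue):
    """One update step: each unit takes the sign of its summed, weighted input."""
    out = []
    for row in weights:
        net = sum(w * x for w, x in zip(row, cue))
        out.append(1 if net >= 0 else -1)
    return out


# Arbitrary activation patterns over the SAME eight units (+1 = on, -1 = off).
JUDY_CONCERT = [1, 1, 1, 1, -1, -1, -1, -1]
JAMES_KITTEN = [1, -1, 1, -1, 1, -1, 1, -1]

weights = store([JUDY_CONCERT, JAMES_KITTEN])

# Degraded cues: one unit flipped in each pattern.
degraded_judy = [-1] + JUDY_CONCERT[1:]
degraded_james = JAMES_KITTEN[:-1] + [1]

recovered_judy = recall(weights, degraded_judy)
recovered_james = recall(weights, degraded_james)
```

No unit "is" either memory: every weight carries a contribution from both facts, yet each degraded cue settles back to its own full pattern.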

Connectionist models are sometimes called parallel distributed processing (PDP) models, because all the elements are activated simultaneously (instead of a more serial view in which activation takes time to spread from one node to another), and because knowledge is distributed across the entire network (instead of being associated with discrete nodes). They are also called parallel constraint satisfaction models, because the pattern of activation in the network is constrained by both the inputs and the outputs to the network. In the domain of social cognition, Kunda & Thagard (1996) offered a connectionist, "parallel constraint satisfaction" model of impression formation that is based on a generic connectionist model of memory.

Connectionist models are sometimes favored over symbolic processing models because they are more "neurally plausible" models of how memory must be represented in the brain. To understand this claim, we finish with a discussion of the neural representation of person memory.

Symbolist and Connectionist Architectures Compared

The differences between symbolist and connectionist architectures can best be appreciated by a direct comparison of models of semantic memory.

Consider first a typical associative-network model of semantic memory such as that proposed by Collins and Quillian (1969). The figure depicts a three-level conceptual hierarchy with living thing at the top, plant and animal in the middle, and specific plants and animals like pine and rose, and robin and salmon at the bottom. Each node is connected to some other nodes, as well as to nodes representing various features and properties, by associative links that specify the relations between nodes: isa, has, can, etc. Thus, a question about a robin activates the node representing that concept, and activation spreads along associative links. This pattern of spreading activation represents propositions like robin isa [sic] bird and robin is red, bird can fly and bird is a living thing.
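A hierarchy of this kind is easy to sketch as a data structure. The Python below is a minimal illustration under stated assumptions: the particular isa links and properties are a small hypothetical subset rather than a transcription of the figure, and a step-by-step walk up the isa links stands in for spreading activation.

```python
# Hypothetical subset of the hierarchy: each concept points to its superordinate.
ISA = {
    "robin": "bird",
    "bird": "animal",
    "salmon": "animal",
    "animal": "living thing",
    "pine": "plant",
    "rose": "plant",
    "plant": "living thing",
}

# Properties are stored once, at the most general node to which they apply.
PROPERTIES = {
    "robin": {("is", "red")},
    "bird": {("can", "fly")},
}


def isa_chain(concept):
    """Follow isa links upward and return every superordinate in order."""
    chain = []
    while concept in ISA:
        concept = ISA[concept]
        chain.append(concept)
    return chain


def verify(concept, relation, value):
    """Return how many isa links were crossed to find the property, or None if absent."""
    node, steps = concept, 0
    while node is not None:
        if (relation, value) in PROPERTIES.get(node, set()):
            return steps
        node = ISA.get(node)
        steps += 1
    return None


robin_parents = isa_chain("robin")
red_steps = verify("robin", "is", "red")
fly_steps = verify("robin", "can", "fly")
salmon_fly = verify("salmon", "can", "fly")
```

The step count returned by verify is the symbolic model's analogue of retrieval distance: robin is red is found at the robin node itself, while robin can fly requires one move up to bird.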

An alternative connectionist model representing exactly the same knowledge has been proposed by Rogers & McClelland (2009), based on earlier work by Rumelhart & Todd (1993). One advantage of a connectionist model is that it can be "taught" that, for example, robins are birds and roses have petals. Units in the Item Layer correspond to things like roses and robins. Units in the Relation Layer correspond to relations like isa (model-speak for "is a") or can. Units in the Attribute Layer correspond to features like living thing, fly, and petals. The network can be trained so that it will activate the correct attributes when an item is presented. Thus, when asked "What can a canary do?", it responds with grow, move, fly, and sing.
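The idea of "teaching" a network can be illustrated with a deliberately stripped-down Python sketch. It is not the published architecture: the published model includes hidden layers trained by backpropagation, whereas this sketch connects item and relation units directly to attribute units and trains them with the simple delta rule. The tiny training set, the learning rate, and the number of epochs are all assumptions made for illustration.

```python
import math

ITEMS = ["canary", "robin", "rose"]
RELATIONS = ["isa", "can", "has"]
ATTRIBUTES = ["living thing", "bird", "grow", "move", "fly", "sing", "petals"]

# Hypothetical training facts: (item, relation) -> attributes that should turn on.
FACTS = {
    ("canary", "isa"): {"living thing", "bird"},
    ("canary", "can"): {"grow", "move", "fly", "sing"},
    ("robin", "isa"): {"living thing", "bird"},
    ("robin", "can"): {"grow", "move", "fly"},
    ("rose", "isa"): {"living thing"},
    ("rose", "can"): {"grow"},
    ("rose", "has"): {"petals"},
}


def encode(item, relation):
    """One unit per item, one unit per relation, plus a constant bias unit."""
    return ([1.0 if item == i else 0.0 for i in ITEMS]
            + [1.0 if relation == r else 0.0 for r in RELATIONS]
            + [1.0])


def activations(weights, item, relation):
    """Sigmoid activation of every attribute unit for one item-relation query."""
    x = encode(item, relation)
    return [1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
            for row in weights]


def train(epochs=2000, rate=0.5):
    """Delta rule: nudge each weight in proportion to the output error."""
    n_inputs = len(ITEMS) + len(RELATIONS) + 1
    weights = [[0.0] * n_inputs for _ in ATTRIBUTES]
    for _ in range(epochs):
        for (item, relation), true_attrs in FACTS.items():
            x = encode(item, relation)
            out = activations(weights, item, relation)
            for a, attr in enumerate(ATTRIBUTES):
                error = (1.0 if attr in true_attrs else 0.0) - out[a]
                for k in range(n_inputs):
                    weights[a][k] += rate * error * x[k]
    return weights


def answer(weights, item, relation):
    """Report the attribute units whose activation exceeds 0.5."""
    out = activations(weights, item, relation)
    return [attr for attr, act in zip(ATTRIBUTES, out) if act > 0.5]


weights = train()
canary_can = answer(weights, "canary", "can")
rose_has = answer(weights, "rose", "has")
```

Nothing in the trained network is a stored proposition such as canary can sing; the answer is read off the attribute units after activation passes through the learned weights.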

The Neural Representation of Person Memory

At the abstract level of cognitive theory, person memories can be viewed as nodes representing a person, his or her characteristics, and his or her behaviors, all linked together by associative links and embedded in a wider associative network of memories. But what is person memory like at the neural level?

An Exchange About Nodes

At a seminar once, long ago and far away, a famous cognitive psychologist was giving a presentation on his associative-network model of memory. In the audience was a famous cognitive neuroscientist, who asked the following question:

Bill, I don't understand this business about nodes! What do nodes look like?

To which the cognitive psychologist replied:

Well, Karl, to begin with they're very small.

Competing Views of Memory Representation

The easiest answer is that every memory is represented by a single neuron, or perhaps a small cluster of neurons, located in a particular part of the brain, and that person memories are no exception to this rule. Thus, the nodes in associative-network models of person memory, like those discussed here, have their neural counterparts in distinct neurons (or clusters of neurons).

Early research by Wilder Penfield (1954), a Canadian neurologist, suggested that this is indeed the case. In the process of diagnosing and treating cases of epilepsy, Penfield would stimulate various areas of the brain with a small electrical current delivered through a micro-electrode implanted in the brain. This procedure does not hurt, because the cortex does not contain afferent neurons, and patients remained awake while it was performed. Accordingly, Penfield asked patients what they experienced when he stimulated them in various places. Sometimes they reported experiencing specific sensory memories, such as an image of a relative or the sound of someone speaking. This finding was controversial: Penfield had no way to check the accuracy of the memories, and it may be that what he stimulated were better described as "images" than as memories of specific events. In any event, the finding suggested that there were specific neural sites, perhaps a cluster of adjacent neurons, representing specific memories in the brain.

However, evidence contradicting Penfield's conclusions was provided by Karl Lashley (1950), a neuroscientist who conducted a "search for the engram", or biological memory trace, for his entire career. Lashley's method was to teach an animal a task, ablate some portion of cerebral cortex, and then observe the effects of the lesion on learned task performance. Thus, if performance was impaired when some portion of the brain was lesioned, Lashley could infer that the learning was represented at that brain site. After 30 years of research, Lashley reported that his efforts had been entirely unsuccessful. Brain lesions disrupted performance, of course. But the amount of disruption was proportional to the amount of the cortex destroyed, regardless of the particular location of the lesion.

Lashley's Law of Mass Action states that any specific memory is part of an extensive organization of other memories. Therefore, individual memories are represented by neurons that are distributed widely across the cortex. It is not possible to isolate particular memories in particular bundles of neurons, so it is not possible to destroy memories by specific lesions.

At about the same time, D.O. Hebb, a pioneering neuroscientist, argued that memories were represented by reverberating patterns of neural activity distributed widely over cerebral cortex. Hebb's suggestion was taken up by others, like Karl Pribram, another neuroscientific pioneer, who postulated that memory was represented by a hologram, in which information about the whole object was represented in each of its parts.

Connectionist models are inspired, in part, by both Lashley's Law of Mass Action and Hebb's reverberating-network model of memory.

Still, Penfield's vision held some attraction for some neuroscientists, who continued to insist that individual memories were represented by the activity of single neurons, or at most small clusters of neurons, at specific locations in cortex.

Sherrington (1941) postulated pontifical cells that represented sensory scenes.
Konorski (1967) postulated gnostic neurons that represented unitary percepts.
Barlow (1969, 1972) argued on the basis of a principle of "economy of impulses" that the brain should achieve a complete representation of a sensory scene with the fewest number of active neurons possible.

Problems with Penfield's clinical studies aside, early advances in understanding the neural basis of perception lent support to the localist views of representation.

Barlow (1953) identified specific cells in the frog retina that responded to particular elementary patterns of visual stimulation: contrast between light and dark, moving edges, dimming of light, and convexity (where a dark object appears against a bright field).
Hubel and Wiesel (1959) won the Nobel Prize for similar studies that identified orientation-specific fields in the visual cortex of the cat.

While these neural systems responded to the physical properties of the stimulus, their discovery fed speculation that the meaning of the stimulus, and other cognitive contents, might similarly be represented by a localized cluster of neurons.

Jerome Lettvin (1969) speculated that a mother cell, or rather mother cells, plural, might represent all that subjects knew about their mothers. It was Lettvin who called Barlow's convexity-detector cells "bug perceivers".
Barlow himself (1972) speculated about a grandmother cell.
Harris (1980) somewhat facetiously speculated that if we have cells that respond to yellow, and other cells that respond to Volkswagens, we might also have yellow Volkswagen cells.

Nobody, including Lettvin and Barlow themselves, took any of this all that seriously, and neuroscientific doctrine has emphasized distributed representations of the sort envisioned by Lashley and Hebb.

Until recently.

A "Halle Berry" Neuron?

A serendipitous finding, ingeniously pursued by a group of investigators at UCLA and Cal Tech, has suggested that there might be something to the idea of a "grandmother neuron" after all (Quian Quiroga, Reddy, Kreiman, Koch, & Fried, Nature, Vol. 435, pp. 1102-1107, June 23, 2005; see also "Brain Cells for Grandmother" by Quian Quiroga, Fried, and Koch, Scientific American, 02/2013).

These investigators worked with eight patients with intractable epilepsy. In order to localize the source of the patients' seizures, they implanted micro-electrodes in various portions of the patients' medial temporal lobes (the hippocampus, amygdala, entorhinal cortex, and parahippocampal cortex). Each micro-electrode consisted of 8 active leads and a reference lead. They then recorded responses from each lead to visual stimulation -- pictures of people, objects, animals, and landmarks selected on the basis of pre-experimental interviews with the patients.

In one patient, the investigators identified a single unit (i.e., a single lead of a single electrode, corresponding either to a single neuron or to a very small, dense cluster of neurons), located in the left posterior hippocampus, that responded to a picture of Jennifer Aniston, an actress who starred in a popular television series, Friends. (A response was defined very conservatively as an activity spike of magnitude greater than 5 standard deviations above baseline, consistently occurring within 1 second of stimulus presentation). That unit did not respond to any other stimuli tested. The investigators quickly located other pictures of Aniston, including pictures of her with Brad Pitt, to whom she was once (and famously) married. The same unit responded to all the pictures of the actress -- except those in which she was pictured with Pitt!
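The response criterion described in parentheses above can be sketched as a simple threshold test. The Python below is only an illustration of that kind of rule, not the investigators' analysis code: the function name, the use of per-trial firing rates, the reading of "consistently" as "on every presentation", and all of the numbers are assumptions.

```python
from statistics import mean, stdev


def responds_consistently(baseline_rates, trial_rates, n_sd=5.0):
    """True if activity on every presentation exceeds baseline mean + n_sd standard deviations.

    baseline_rates: firing rates measured before stimulus onset (hypothetical units)
    trial_rates: firing rates in the 1 second after each presentation of one picture
    """
    threshold = mean(baseline_rates) + n_sd * stdev(baseline_rates)
    return all(rate > threshold for rate in trial_rates)


# Hypothetical numbers, for illustration only.
baseline = [2.0, 3.0, 2.0, 3.0, 2.0, 3.0]
preferred_picture = [9.0, 11.0, 8.0]   # well above threshold on every trial
other_picture = [3.0, 9.0, 2.0]        # one large burst, but not consistent

is_selective = responds_consistently(baseline, preferred_picture)
is_spurious = responds_consistently(baseline, other_picture)
```

Requiring every presentation to clear a 5-standard-deviation threshold is what makes the criterion conservative: a single burst of activity, as in the second picture, does not count as a response.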

Similarly, a single unit in the right anterior hippocampus of another patient responded consistently and specifically to pictures of another actress, Halle Berry (who won an Academy Award for her starring role in Monster's Ball). Interestingly, this unit also responded to a line-drawing of Berry, to a picture of Berry dressed as Catwoman (for her starring role in the unfortunate film of the same name), and even to the spelling of her name, H-A-L-L-E--B-E-R-R-Y (unfortunately, the investigators didn't think of doing this when they were working with the "Jennifer Aniston" patient -- remember, they were flying by the seat of their pants, doing this research under the time constraints of a clinical assessment). The fact that the unit responded to Berry's name, as well as to her picture, and to pictures of Berry in her (in)famous role as Catwoman, suggests that the unit represents the abstract concept of "Halle Berry", not merely some configuration of physical stimuli.

As another example, yet a third patient revealed a multi-unit (i.e., two or more leads of a single electrode, evidently corresponding to a somewhat larger cluster of neurons) in the left anterior hippocampus that responded specifically, if not quite as distinctively, to pictures of the Sydney Opera House. This same unit also responded to the letter string SYDNEY OPERA HOUSE. It also responded to a picture of the Baha'i Temple -- but then again, in preliminary testing this patient had misidentified the Temple as the Opera House! So again, as with the Halle Berry neuron, the multi-unit is responding to the abstract concept of the Sydney Opera House, not to any particular configuration of physical features.

Across the 8 patients, Quian Quiroga et al. tested 993 units, 343 single units and 650 multi-units, and found 132 units (14%) that responded to 1 or more test pictures. When they found a responsive unit, they then tested it with 3 to 8 variants of the test pictures. A total of 51 of these 132 units yielded evidence of an invariant representation of people, landmarks, animals, or food items. In each case, the invariant representation was abstract, in that the unit responded to different views of the object, to line drawings as well as photographs, and to names as well as pictures.

So maybe there is a "grandmother neuron" after all! This research -- which, remember, was performed in a clinical context and thus may have lacked some desirable controls -- identified sparse neural representations of particular people (landmarks, etc.), in which only a very small number of units is active during stimulus presentation.

Moving away for a moment from strictly social memory, Quian Quiroga and his colleagues have proposed that the same principle applies to concepts in general, not just to faces and places. That is to say, concepts like tree and surfboard are also "sparsely" represented in the brain, as a discrete cluster of relatively few (meaning hundreds or thousands of) adjacent neurons (Quian Quiroga, Nature Reviews Neuroscience, 2012).

Of course, this evidence for localization of content contradicts the distributionist assumptions that have guided cognitive neuroscience for 50 years. Further research is obviously required to straighten this out, but maybe there's no contradiction between distributionist and locationist views after all. Indeed, according to Barlow's (1972) psychophysical linking principle,

Whenever two stimuli can be distinguished reliably... the physiological messages they cause in some single neuron would enable them to be distinguished with equal or greater reliability.

In other words, even in a distributed memory representation, there has to be some neuron that responds invariantly to various representations of the same concept. Neural representations of knowledge may be distributed widely over cortex, but these neural nets may come together in single units.

But wait a minute -- we're talking about the cerebral cortex, and the data from Quian Quiroga et al. came from the hippocampus and other subcortical structures. Note, however, that the hippocampus is crucial for memory: it was the destruction of his hippocampus that rendered H.M. amnesic. Nobody thinks that memories are stored in the hippocampus -- it's just too small for that purpose. But one prominent theory of the hippocampus is that it performs a kind of indexing function, relating memories to each other that are located in the cortex. Accordingly, maybe Quian Quiroga et al. didn't exactly tap into their patient's whole knowledge representation of Halle Berry -- but instead, hit on the neural index card that locates all that information.
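The "index card" metaphor can be made concrete with a small Python sketch. It is a loose analogy only: the dictionary layout, the region labels, and the location numbers are invented for illustration and make no claim about where anything is actually stored.

```python
# Hypothetical "cortex": content scattered across many locations.
cortex = {
    ("visual", 17): "face of Halle Berry",
    ("language", 4): "the letter string HALLE BERRY",
    ("visual", 52): "Catwoman costume",
    ("visual", 88): "Sydney Opera House",
}

# Hypothetical "hippocampal index": one small entry per concept, holding only pointers.
index = {
    "Halle Berry": [("visual", 17), ("language", 4), ("visual", 52)],
    "Sydney Opera House": [("visual", 88)],
}


def retrieve(concept):
    """Follow the index entry to gather the distributed content for one concept."""
    return [cortex[location] for location in index.get(concept, [])]


berry_knowledge = retrieve("Halle Berry")

# Remove the index entry: the cortical content is untouched, but no longer reachable by concept.
lesioned_entry = index.pop("Halle Berry")
after_lesion = retrieve("Halle Berry")
content_still_stored = ("visual", 17) in cortex
```

On this analogy, a unit that fires to any picture or spelling of Halle Berry would correspond to the index entry, while the knowledge itself remains distributed over the locations the entry points to.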

Note: This whole issue was battled out in the pages of Psychological Review (2009-2010) -- naturally, inconclusively.

Autobiographical Memory

A special case of person memory is autobiographical memory (ABM) -- that is, a real person's memories for his own actions and experiences, which occurred in the ordinary course of everyday living. Episodic memory, as studied with variations on the verbal-learning paradigm, is explicitly intended as a laboratory analogue of autobiographical memory: each list, and each word on a list, constitutes a discrete event, with a unique location in space and time. ABM is episodic memory, as opposed to semantic memory or procedural knowledge, but ABM isn't just episodic memory -- there's more to it than a list of items studied at particular places and particular times (Kihlstrom, 2009).

In the first place, ABM is autobiographical -- it's about the rememberer him- or herself. Self-reference is critical to ABM in a way that it's not so critical to the episodic memory studied in the typical verbal-learning experiment. ABMs are about oneself in a way that wordlists are not.
ABMs also have an "Aristotelian" plot structure -- that is, they are related to each other in particular ways, such as the characteristics that Aristotle (writing in the Poetics) considered necessary for a good drama.

First, there is a temporal organization: Each ABM, as an episodic memory, has a unique location in time (and space) -- all episodic memories have this. But ABMs are also linked together in some sort of temporal organization, so that some represent earlier, and others later, episodes in the person's life. Put another way, ABMs comprise a narrative of the person's own life, from beginning to end.
The organization of ABMs isn't just chronological -- it's also causal. One event may function as the cause of another ("My father beat me, so I left home"), or one is the effect of another ("My father got laid off, so we had to sell our house").
Finally, ABMs reveal something about the character of the person who has them.

ABMs are conscious memories -- or, at least, they are accessible to consciousness. The question of unconscious ABMs has been with us at least since Freud (remember "Hysterics suffer from reminiscences"), and I don't want to deal with it further here. The fact of the matter is that, as they are studied by cognitive and social psychologists, ABMs are conscious recollections.

Another familiar phenomenon of emotional memory is the flashbulb memory, in which subjects remember the circumstances under which they first learned about a surprising, consequential, affect-laden event. For members of the "baby-boom" generation, a classic example is the assassination of President John F. Kennedy. For younger individuals, as well as boomers, other familiar examples are the space shuttle Challenger disaster of 1986 and the terror attacks of September 11, 2001.

For more on autobiographical memory, including an extensive discussion of flashbulb memories, see the lecture supplement on "The Self".

Social Memory Beyond the Individual

So far, we've followed the traditional paradigms of psychology and cognitive science, and discussed social memory as it exists inside the heads of individuals. But social memories seem to exist at the level of the group as well as the level of the individual -- for example, in the monuments and memorials that we erect to commemorate various individuals and events. These, too, are social memories -- they are ways for entire societies, not just individuals, to remember things that are important to them.

For example:

In the United States we hear people say that they will never forget the events of September 11, 2001 -- events which, to one degree or another, directly or vicariously, they personally experienced.
But Israelis, Jews in general, and many non-Jews as well, say that they will never forget the Holocaust -- even though most Holocaust survivors have now died.
And on June 28, 1989, Slobodan Milosevic, then president of Serbia, visited Kosovo Field, site of an epic battle in 1389 in which the Ottoman Turks overran the Serbian (and thus Eastern Orthodox) population of that province, and gave a speech in which he vowed to "Let the memory of Kosovo heroism live forever".

In what sense do people who only witnessed 9/11 on CNN "remember" it? In what sense do the children of Holocaust survivors, who never themselves experienced the Holocaust, "remember" it? In what sense did Milosevic "remember" the events of 1389?

Maybe what they mean is that they know these events happened, as matters of historical fact. But maybe these memories really exist, as something like personal recollections, but at a level that extends beyond the individual, and to the group(s) of which the individual is a member.

Collective Memory

The formal study of collective memory begins with Maurice Halbwachs (1877-1945), who was a student of both Henri Bergson, a pioneering French psychologist, and Emile Durkheim, a pioneering French sociologist. Bergson was particularly interested in individual consciousness, particularly as it was manifested in memory. Durkheim, for his part, was interested in collective consciousness, and in the proposition (which is axiomatic for sociologists) that groups had special properties that were not reducible to the properties of the individuals in them.

Putting these two themes together, Halbwachs articulated a concept of collective memory:

"The individual calls recollections to mind by relying on the frameworks of social memory.... There are surely many facts, and many details..., that the individual would forget if others did not keep their memory alive for him. But, on the other hand, society can live only if there is sufficient unity of outlooks among the individuals and groups comprising it.... [T]he necessity by which people must enclose themselves in limited groups... is opposed to the social need for unity.... This is why society tends to erase from its memory all that might separate individuals, or that might distance groups from each other. It is also why society, in each period rearranges its recollections in such a way as to adjust them to the variable conditions of its equilibrium."

For Halbwachs, collective memories are more than merely the sum of individuals' memories. They represent the integration of personal pasts into a common past. However, as the sociologist Lewis Coser points out, collective memory is not the memory of a group mind. Minds exist in individuals, and not somewhere in the space between or above them.

"While the collective memory endures and draws strength from its base in a coherent body of people, it is individuals as group members who remember."

Collective memories are shared within groups and institutions, as individuals draw on the group context to remember and reconstruct the past. Halbwachs went so far as to argue that all individual memories are collective, in the sense that "We are never alone". The only exception, perhaps, is our memories of our dreams -- but even dreams are largely remembered to the extent that we share them with others.

In his work, Halbwachs analyzed a number of examples of what he called the social frameworks of memory:

Religious Collective Memory, including stories of gods, heroes, and saints.
Social Classes and their Traditions, such as when the names and titles of noble families evoke the past.
The Collective Memory of the Family, where there are certain memories known only to family members, and shared only with family members.

Halbwachs' basic point is simply that the past is a social reality that transcends individual subjectivity, and is shared by others around us.

The current social environment influences the way we remember the past.
Our memories of the past go beyond our individual experience. They extend to every member of the group, and are shared by every member of the group by virtue of his or her group membership.

Beyond Collective Memory: Cognitive Sociology

Taking Halbwachs' notion of collective memory as an inspiration, Eviatar Zerubavel (in Social Mindscapes: An Invitation to Cognitive Sociology, 1997) has thought hard about what it would mean to translate the concepts of cognitive psychology to the sociological level of analysis -- producing a thoroughgoing sociology of knowledge, or a cognitive sociology as a companion to cognitive psychology (including social cognition).

Zerubavel argues that there are a number of features that distinguish cognitive sociology from social cognition. Perhaps the most important is scope:

"Social cognition studies the individual's cognition of social objects;

Cognitive sociology studies the social basis of all cognition."

Social cognition is universalistic, in that it tries to identify principles of social cognition that are valid always and everywhere; or it is individualistic, in that it tries to identify the particular understanding of the social world that explains the individual's behavior. By contrast, cognitive sociology tries to identify thought communities -- groups of people who perceive, remember, and think in a way that is distinctively different from the ways other groups do.
Social cognition emphasizes objectivity, in that it tries to describe the laws of mind and behavior from a third-person perspective; or it emphasizes subjectivity, in that it is concerned with the way the unique individual understands his or her social environment. By contrast, cognitive sociology emphasizes intersubjectivity -- the shared percepts, memories, and understandings that people have in common by virtue of their membership in some group.
Social cognition emphasizes the commonalities of thought among all people, or the idiosyncrasies of thought in the individual. By contrast, cognitive sociology is more interested in group differences, as opposed to individual differences.

In line with this program, Zerubavel (1997) has tried to outline what a sociology of mind and knowledge would look like, roughly following the canonical chapters of a cognitive psychology text:

social optics
the social gates of consciousness
the social division of the world
social meanings
social memories
etc.

This page last modified 01/17/2017.