Here's the question: What do memories look like? We're talking about secondary, or "long-term" memory here, but still the answer turns out to depend on what kind of knowledge we're talking about. There have been quite different proposals depending on whether we're talking about declarative or procedural knowledge, or episodic or semantic memory. In addition, there are different proposals about how conceptual knowledge -- an aspect of semantic memory, to be sure -- is represented in the mind. In this supplement, we'll focus on episodic memory, with some side glances at semantic memory, and then turn to conceptual representations as a special case of semantic memory. But, in the end, there's just one memory, and a task left largely undone is to figure out how to represent conceptual knowledge in the same cognitive architecture as episodic and semantic knowledge.
A representation is just that: it's something that represents, or stands for, or models, something else. An event can be represented as a list of features, or as a sentence, or as a picture, or a string of digits, or a bunch of beer cans connected with string.
Anything can represent something else, so long as the representational system satisfies certain requirements outlined by UCB's Steven Palmer (1978):
Behaviorists like Watson had a simple answer to the question: memories look like associations between stimuli and responses -- because that's what everything is. This emphasis on associations as the basic structure of memory has proved remarkably durable -- though, as we will see, not the way the S-R theorists framed them.
But first a little history, mostly taken from Anderson & Bower's Human Associative Memory (1973).
Aristotle's Associationism. The idea that associations are central to memory has its origins in Aristotle's treatise De Memoria et Reminiscentia. Beginning with the proposition that ideas are derived from sensory experience (instead of being innate, as Plato had asserted), he further argued that ideas became associated with each other by virtue of a small number of principles such as similarity (and contrast), and especially contiguity. (Aristotle also offered subsidiary principles of association such as frequency, intensity, and good order). Memories were retrieved (Aristotle didn't use precisely this term) by virtue of the association of ideas, where one idea served as a probe to elicit an associated idea as a memory.
Aristotle further distinguished between two forms
For the British associationists, associations had only one property: strength, or the likelihood that one idea would elicit another.
British associationism was extremely influential on the early verbal-learning tradition. For example, Ebbinghaus (1885) employed the serial learning of nonsense syllables to study how associations were formed during the learning process, what kinds of associative links were stored in memory, and how associations led from one memory to another. Similarly, Mary Whiton Calkins (1898), working in William James' laboratory at Harvard, invented the paired-associate learning paradigm expressly to study the formation of associations. (Calkins completed a doctoral dissertation, but Harvard refused her a degree, and she in turn refused its offer of a doctoral degree from Radcliffe College. Nevertheless, she founded the psychological laboratory at Wellesley College and later became the first female president of the American Psychological Association.)
American Associationism. Following the lead of the British associationists, there arose an American tradition of associationism at the hands of J.B. Watson, E.L. Thorndike, and later behaviorists such as E.R. Guthrie, C.L. Hull, and especially B.F. Skinner. These were all learning theorists, and they considered the association to be a primitive concept for learning theory. The difference between American and British associationism, of course, was that the British were interested in the association of ideas, while the Americans, being behaviorists, abandoned ideas as mentalistic, in favor of observable stimuli and responses. Thus, for Watson and the others, the conditioned response was the basic unit of behavior, and complex behaviors were built from elementary conditioned responses -- sometimes linked by implicit mediating responses, implicit stimuli, and response-produced stimuli. Ebbinghaus' and Calkins' work fit fairly comfortably into this framework, leading to the S-R reinterpretation of verbal learning.
There were, of course, dissenters among the neo-behaviorists, particularly E.C. Tolman, who argued that stimulus-response associations were not sufficient to explain learning.
With the cognitive revolution in psychology came a return to mentalism, and revived interest in the association of ideas.
In fact, even before the cognitive revolution, a number of researchers in the verbal-learning tradition collected data on pre-existing patterns of word association (actually, this line of research was initiated by C.G. Jung, who in turn was influenced by Freud; but Jung's work -- let alone Freud's -- had no direct influence on the verbal-learning tradition). Here, for example, is a fragment of an associative network centered on the word lion. Thus, if you ask subjects to respond with the first word that comes to mind after hearing some other word, the stimulus lion often leads to the responses of tiger, Africa, and den; den leads to the response lair.
But it soon became clear that verbal associations had some funny properties that had not been anticipated by the British and American associationists.
First, it turned out that associations are not necessarily symmetrical. For example, the stimulus tiger may strongly elicit the response tail, but the stimulus tail does not tend to elicit tiger as a response; a much stronger response is end. If you're a British or American associationist, that should strike you as strange. If tiger is associated with tail by virtue of contiguity (or similarity, or whatever), then why isn't tail associated with tiger?
Even earlier, Thorndike (1931) had uncovered the phenomenon of belongingness. In one of his experiments, he had subjects learn a list of names, in which some names were repeated, such as
Mary Jones Bill Smith Sam Peck Richard Jones Bill Smith.
When subjects were tested with the stimulus Bill-_____, the likelihood of the correct response Smith increased with repetition, as predicted. But when tested with the stimulus Jones-_____, there was no effect of repetition on the correct response Bill. It seemed that, despite being equally contiguous, and equally repeated, Bill and Smith belonged together in a way that Jones and Bill did not. Thorndike had no way to account for this, but it did suggest that something was wrong with the general principle that associations were formed by virtue of contiguity, and strengthened by means of repetition.
For the British and American associationists, all associations were created equal -- all qualitatively the same, if quantitatively differing in strength. But in 1979, the Mandlers (George and Jean, one of cognitive psychology's first husband-and-wife teams, working at UCSD) distinguished among different qualitative types of associative structures in memory.
Jean Mandler (1979) distinguished between two
types of associations:
In one line of research, we looked at the organization of recall during partial posthypnotic amnesia. We asked subjects, while they were hypnotized, to memorize a list of words, following standard verbal-learning procedures. In one experiment, we used a serial learning paradigm that encouraged pro-ordinate, serial associations. In another experiment, we used a free-recall paradigm, with a categorized list, that encouraged subordinate, vertical associations. A third experiment encouraged subjective organization. Then they received a suggestion to forget the words. The most highly hypnotizable subjects showed a dense amnesia, temporarily forgetting most or all of the words, while the insusceptible subjects showed no amnesia at all. But some subjects, who are relatively highly hypnotizable, showed a partial response to the amnesia suggestion. These subjects recalled some words, but tended to do so in a disorganized fashion -- but the disorganization only appeared in the serial-learning condition. Posthypnotic amnesia disrupted pro-ordinate, serial, organization, but spared organization based on semantic relationships.
Another line of research made use of the associative memory illusion (sometimes known as the Deese-Roediger-McDermott or DRM effect), in which studying a list of associates to a stimulus word (such as sharp, prick, and haystack, which are all close associates of needle), led subjects to falsely recognize the critical lure (in this case, needle) as having been in the list, when in fact it was not. It turns out that the AMI occurs when the study list consists of co-ordinate associates, such as needle-haystack, but not when it consists of subordinate associates, such as animal-tiger.
The fact that posthypnotic amnesia dissociates serial associations from horizontal associations, and the AMI dissociates horizontal associations from vertical associations, suggests that these kinds of associations really are qualitatively different.
It also turns out that associations are labeled in terms of the semantic roles of cue and response. Thus, eating is related to glutton as act to actor, while eating is related to steak as act to object. A theory of association has to deal with the fact that associations do not differ only quantitatively, simply in terms of strength, but also differ qualitatively with respect to the type of association that has been created between one idea and another.
Despite these problems, the basic idea of
association has been critical to cognitive theories of
memory. These theories generally construe memory as a sort
of mental dictionary in which words stand for concepts, and
associations represent the relations between them. In a
generic network model of memory:
An early model proposed by Collins and Quillian (1969)
assumes that concepts are stored in a hierarchical structure,
with associated features stored according to a principle of
cognitive economy -- meaning that each feature gets stored only
once, at the particular level of the hierarchy to which it is
The model correctly predicts performance in a sentence-verification
task, in which subjects are asked to say whether some
statements are true or false. Although subjects rarely
make a mistake in this kind of task, their reaction times vary,
depending on the distance between the concept and the feature.
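The distance idea can be made concrete with a minimal sketch. The canary/bird/animal hierarchy and its features below are illustrative assumptions, not taken from the text above, and the function names are hypothetical. Each feature is stored only once, at one level, and the number of levels climbed to find it stands in for the predicted difference in reaction time.

```python
# Hypothetical hierarchy: each concept points to its superordinate category.
PARENT = {"canary": "bird", "bird": "animal", "animal": None}

# Cognitive economy: every feature is stored once, at a single level.
FEATURES = {
    "canary": {"is yellow", "can sing"},
    "bird": {"has wings", "can fly"},
    "animal": {"has skin", "can breathe"},
}


def levels_to_feature(concept, feature):
    """Return how many levels must be climbed to find the feature, or None if it is absent."""
    steps = 0
    node = concept
    while node is not None:
        if feature in FEATURES[node]:
            return steps
        node = PARENT[node]
        steps += 1
    return None


def verify(concept, feature):
    """Sentence verification: true if the feature is reachable from the concept."""
    return levels_to_feature(concept, feature) is not None
```

On this sketch, "a canary can sing" is verified at distance 0, "a canary can fly" at distance 1, and "a canary has skin" at distance 2, which is the ordering of reaction times that a model of this kind would predict.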
Yet a third model, proposed by Collins and Loftus (1975) -- this is the same Collins as in the Collins & Quillian model -- also employs distance to represent similarity. The model correctly predicts priming effects in a lexical decision task, such that reading the word street (which is a word) makes it easier to judge that car is also a word (which it is), compared to apples (which also is a word). Similarly, red primes apples and fire engine, but not street or sunrises.
Each of these models has problems, but their success in predicting even subtle aspects of human performance suggests that they are pretty good first approximations of how the mental dictionary is arranged -- that is, how semantic knowledge is represented in memory.
And that's all well and good, except we're not so much interested in the mental dictionary. We're working in the verbal-learning tradition at this point, and what we're really interested in is how people represent lists of words that they've been asked to memorize.
Estes (1976) offered several simple associative models of memory, attempting to capture some aspect of verbal learning.
This general idea has been implemented in a computer model of memory known as SAM (for Search of Associative Memory), proposed by Shiffrin and Raaijmakers (1992). A similar model, called REM (for Retrieving Effectively from Memory), has been proposed by Shiffrin & Steyvers (1997). In SAM:
Thus, during learning subjects link nodes representing list items to a node representing the list. When asked to recall, they activate the list node, and follow associative pathways to list items.
All these models view memory as a mental dictionary, nodes representing words linked to each other, and to nodes representing list membership. But it turns out that memory consists of more than words.
In particular, Paivio (1971, 1986) proposed that
concrete objects, like fish and canaries, can be represented as
images as well as words. He cited lots of different pieces
of evidence in support of this proposition.
J.R. Anderson (1978, 1979) argued that the issue was ultimately undecidable because, for every dual-code model that could be proposed, one could generate a single-code model that would produce the same effects. Here's where it has to be said that parsimony cuts both ways. In some sense it is more parsimonious to have one code than two. But in another sense it is more parsimonious to have two codes than one, if the single-code model has to go through all sorts of contortions to match the dual-code model.
In the next salvo of the debate, Finke (1980, 1985) identified a number of functional equivalences between imagery and perception. He relied on comparisons between recalling, imaging, and perceiving objects and their properties, and found a surprising number of instances where the effects of imagining were identical or similar to those of perceiving, and different from simply recalling. He concluded that "[visual] imagery involves the activation of many of the same information-processing mechanisms that are activated during visual perception" (1980, p. 130).
For some people, neuropsychological evidence clinched the case for the equivalence of imagery and perception. Farah (1988) investigated cases of visual agnosia, in which brain-injured patients are no longer able to identify familiar objects (prosopagnosia is a special form of visual agnosia). The syndrome is famously the subject of a case study by Oliver Sacks, The Man Who Mistook His Wife for a Hat (which was subsequently rendered into an opera, no less). Farah found that visual agnosics also lack a capacity for mental imagery, supporting the idea that mental images rely on the same mechanisms as actual perception.
Incidentally, Farah's arguments are often cited as
an example where neuroscientific evidence constrains
psychological theory, by offering decisive evidence for one
theory (the dual-code theory) and against another (the
single-code theory). But (with all due respect to Farah,
who is a brilliant cognitive neuroscientist) this isn't exactly
Still, by far, most work on mental representation has focused on the verbal side.
Tulving and Bower (1974) summarized the view in
the early 1970s as follows: "A rather general and atheoretical
conception of the memory trace of an event regards it as a
collection of features or a bundle of information" (p.
269). This bundle included a number of different
"[T]he purpose of long-term memory is to record facts about various things, events, and states of the world. We have chosen the subject-predicate construction as the principal structure for recording such facts in HAM" (p. 156).
In other words, events are represented in sentence-like structures. This is quite a different approach from that implied by Tulving and Bower, in which the sentence might just be represented by a cluster of linked nodes. But in a representation like this, you don't really know who did what to whom, where, or when -- much less why. For this purpose, sentence-like structures seem to be better.
In order to illustrate their approach, they focused much of their exposition on variants of a single sentence:
In the park the hippie touched the debutante.
Perhaps Anderson and Bower were inspired by Hair: The American Tribal Love-Rock Musical, which opened in 1967. But they were even more inspired by two developments in linguistics.
First was the work of Noam Chomsky (1957, 1965) on phrase-structure grammar, in which sentences are rewritten as noun phrases and verb phrases, and verb phrases are rewritten as verbs plus noun phrases -- generically, The noun phrase verbed the other noun phrase. Thus, in the sentence the man who hits the ball kisses the girls, The man is the subject noun phrase, and kisses the girls is the verb phrase (which includes an object noun phrase). This phrase-structure representation is the easiest way to represent knowledge in memory.
One problem with Chomsky's system is that there's more to grammar than syntax (as UCB's George Lakoff would put it, you need generative semantics as well as generative syntax). The UCB linguist Charles Fillmore (1968, 1971) pointed out that nouns, especially, played different semantic roles in sentences -- they weren't just subjects and objects. For example, in the sentence Mary pinched John on the nose, Mary is the agent of the action, John is the experiencer, and nose is the location where she pinched John. Fillmore invented case grammar to represent these semantic roles, and his innovation was picked up by Anderson and Bower.
Accordingly, the HAM representation of an event would look something like this, with a node linking a fact (that a hippie touched a debutante) with the context in which it is true (that the incident happened in a park sometime in the past).
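A rough sketch of such a structure follows. The class and field names are assumptions chosen for illustration, not HAM's own notation. The point is only that a nested subject-predicate structure keeps track of who did what to whom, whereas an unordered bundle of linked nodes does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    relation: str
    obj: str


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: Predicate


@dataclass(frozen=True)
class Context:
    location: str
    time: str


@dataclass(frozen=True)
class Proposition:
    context: Context  # where and when the fact is true
    fact: Fact        # what happened


def bag_of_nodes(p):
    """Flatten a proposition into an unordered bundle of nodes, discarding the roles."""
    return frozenset({p.fact.subject, p.fact.predicate.relation,
                      p.fact.predicate.obj, p.context.location, p.context.time})


# "In the park the hippie touched the debutante."
event = Proposition(Context("park", "past"),
                    Fact("hippie", Predicate("touch", "debutante")))

# The reverse event: the debutante touched the hippie.
reverse = Proposition(Context("park", "past"),
                      Fact("debutante", Predicate("touch", "hippie")))
```

The two propositions are different objects (`event != reverse`), but their bundles of nodes are identical (`bag_of_nodes(event) == bag_of_nodes(reverse)`), which is why a bare feature bundle cannot say who touched whom.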
This is a pretty good solution, and HAM does a pretty good job of
emulating the actual performance of subjects who are remembering
lists of words, or sentences about hippies and debutantes.
But it quickly became clear that there is more in memory than
sentences. As noted earlier, the knowledge stored in
memory comes in two forms:
The subject verbed the object
-- as in
The hippie touched the debutante.
If the goal is to drive a standard shift car and the car is in neutral then shift the car into first gear.
Classical and instrumental conditioning are special cases of procedural knowledge:
If Conditioned Stimulus then Unconditioned Stimulus.
If Conditioned Response in the presence of the Conditioned Stimulus then Reinforcement.
Individual propositions are, of course, embedded in a vast network of propositional knowledge -- more or less along the lines envisioned by Collins and Loftus (1975).
And individual productions, for their part, are embedded in a vast network of productions known as a production system, in which the output of one production provides input to another. In some sense, the action of one production creates the conditions for execution of the next one in the system.
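A toy sketch may help show how one production sets up the next. The first rule below follows the standard-shift example given earlier. The second rule, the state variables, and the function name `run_productions` are hypothetical additions for illustration, and the control scheme (fire the first matching rule on each cycle) is an assumption rather than a claim about any particular production-system architecture.

```python
# Each production is (name, condition, action); conditions test the state, actions change it.
PRODUCTIONS = [
    ("shift-into-first",
     lambda s: s["goal"] == "drive" and s["gear"] == "neutral",
     lambda s: s.update(gear="first")),
    # Hypothetical follow-on rule: it can only match after the first rule has fired.
    ("pull-away",
     lambda s: s["goal"] == "drive" and s["gear"] == "first" and not s["moving"],
     lambda s: s.update(moving=True)),
]


def run_productions(productions, state, max_cycles=10):
    """Fire the first matching production on each cycle until none matches."""
    state = dict(state)  # work on a copy so the caller's state is untouched
    fired = []
    for _ in range(max_cycles):
        for name, condition, action in productions:
            if condition(state):
                action(state)
                fired.append(name)
                break
        else:
            break  # no production matched, so the system halts
    return fired, state


fired, final_state = run_productions(
    PRODUCTIONS, {"goal": "drive", "gear": "neutral", "moving": False})
```

Here `fired` comes out as `["shift-into-first", "pull-away"]`: shifting into first gear is what creates the condition under which the second production can execute.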
The procedural-declarative distinction was
introduced into artificial intelligence by Terry Winograd (1972,
1975), and imported into psychology by John Anderson
(1976). But it also has deeper origins:
At roughly the same time, Endel Tulving (1972,
1983) introduced a further distinction between two forms of
declarative (meaning factual) knowledge:
Episodic and semantic memory can be dissociated in
the case of source amnesia, but it is evident that both kinds of
memories can be stored in the same declarative, propositional,
self, viewed as a knowledge structure, consists of whatever
one knows about oneself, including episodic and semantic
Can Animals Have Episodic Memory?
Animals can learn, for sure, and so they acquire knowledge stored in memory. But it's not clear that they can acquire episodic memories -- that they can remember particular events that happened to them at a particular time and a particular place. Their memories may be more generic, represented in procedural, or perhaps semantic form, but not necessarily as episodic memories of specific experiences. Although the Darwinian principle of evolutionary continuity should caution us not to make sharp distinctions between human and nonhuman mental capacities, some authorities have suggested that, in the absence of language, permitting self-report, the question of episodic memory in animals is essentially undecidable (e.g., Tulving, 1983).
Still, there are experiments that seem to reveal something very much like episodic memory.
animals do have episodic memory after all, even
though they can't share their conscious recollections
with us via language.
Actually, Anderson and Bower were aware of Winograd's work -- they were all together at Stanford after all -- but they were not ready to incorporate the procedural-declarative distinction into their model. That task fell to Anderson, in his ACT (Adaptive Control of Thought) model of cognition, which he introduced in 1976 and has continued to develop over the subsequent 30-plus years. ACT is a complete cognitive theory, written in the form of a computer simulation, that includes learning and memory, but also includes language, reasoning, and problem-solving (Anderson is especially interested in simulating students' learning and use of algebra, which he has called "the Drosophila of cognitive theory" [2007]).
ACT is rather complex, and its complexities need not detain us here. There have also been a number of versions of ACT developed over the years by Anderson and his colleagues, and these evolutionary steps need not detain us either. The following is adapted from the succinct description of the generic ACT model by Medin, Ross, and Markman (2001).
Declarative knowledge is represented in memory by conceptual nodes linked in a network to form propositions like The flower is pretty and Bill thought that the flower was pretty. Like HAM, ACT recognizes a number of semantic roles, but for purposes of simplicity we will only consider three: Agents, Objects, and the Relations between them.
The links between nodes differ in strength.
ACT also recognizes the type-token distinction first proposed by Simon and Feigenbaum (1964), which is a distinction between a general concept and a specific instance of it. For example, a particular chair may be blue, but it is not true that all chairs are blue; blue is the color of only a particular chair. ACT handles this by linking the marker X, which represents a particular chair, to a node representing chairs in general. Thus, Some particular chair is blue, or Some particular small chair is blue. This permits ACT to represent facts about other chairs, which may be large or beige or whatever.
ACT also includes a working memory, which should not be confused with the working memory of Baddeley and Hitch (1974). By working memory Anderson only means that subset of nodes that are activated at any given time. Activation makes a node accessible in memory, but the total amount of activation in a network is limited -- which effectively limits the number of nodes that can be in working memory at any particular time (think of Miller's "magical number seven, plus or minus two").
Processing a sentence (which is Anderson's proxy for perception) activates nodes corresponding to the elements of the sentence. This activation spreads along links to associated nodes. But the total activation accruing to a conceptual node is divided among the links emanating from that node, such that the strongest links receive the most activation.
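The division of activation can be sketched in a few lines. The text says only that stronger links receive more activation; dividing it in strict proportion to link strength is an assumption made here for simplicity, and the node names, the strength values, and the function name `spread` are all hypothetical.

```python
# Hypothetical link strengths from a "flower" node to two associated nodes.
LINKS = {"flower": {"pretty": 3.0, "garden": 1.0}}


def spread(source_activation, outgoing_links):
    """Divide a node's activation among its outgoing links in proportion to their strength."""
    total_strength = sum(outgoing_links.values())
    return {target: source_activation * strength / total_strength
            for target, strength in outgoing_links.items()}


received = spread(1.0, LINKS["flower"])
```

With these made-up strengths, `pretty` receives 0.75 and `garden` receives 0.25 of the activation. Adding more links to `flower` would leave less activation for each one, since a fixed amount is being shared among them.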
While this discussion focuses on the declarative
side of ACT, there is also a procedural side, and these are
ACT is generally considered a symbolic or localist model of cognition, in which concepts are represented as symbols that stand for some piece of knowledge, and these symbols are localized at discrete nodes in the associative network (Anderson himself disagrees with this characterization, but we're not going to let this fact get in the way of our exposition, are we?). When a person acquires a new piece of knowledge, a new node is added to the network (as well as new links from that node to other, pre-existing nodes).
An alternative model is a connectionist or parallel distributed processing (PDP) model, in which the same set of nodes represents each piece of knowledge -- because the knowledge is not represented by the nodes at all, but rather by the connections between them (hence the name). Put another way, knowledge is distributed across the entire network -- hence that name, too! PDP models were introduced to cognitive theory by James (Jay) McClelland and David Rumelhart (1986a; Rumelhart & McClelland, 1986b; McClelland, Rumelhart, et al., 1995), who at the time were colleagues at UCSD (McClelland subsequently moved to Carnegie-Mellon University, where he was a colleague of John Anderson, which may account for Anderson's qualms about the characterization of his model as "symbolic" or "localist"; Rumelhart subsequently moved to Stanford; then McClelland himself moved to Stanford; it's a small world).
As with the ACT model, this discussion of PDP models draws heavily on the treatment by Medin et al. (2001).
In large part, connectionist or PDP models are
motivated by considerations of neural plausibility.
are "neurally inspired" because they take the brain as a
But they have one big disadvantage: they are extremely prone to forgetting, especially forgetting via retroactive interference. In fact, this vulnerability to interference is so bad that it has been characterized as catastrophic interference by McCloskey and Cohen (1989; see also Ratcliff, 1990) and French (1999). To see why this is so, consider the A-B/A-C retroactive interference paradigm.
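The following is a deliberately tiny sketch of the problem, not a reconstruction of any published simulation. It assumes a single-layer linear associator trained with the delta rule, which is far simpler than the multilayer networks discussed in that literature, and the cue and response vectors are made up. Because the A-B and A-C pairs share the same connections, learning A-C overwrites the weights that carried A-B.

```python
def predict(weights, cue):
    """Output of a linear associator: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, cue)) for row in weights]


def train(weights, pairs, epochs=50, rate=0.25):
    """Delta rule: nudge each weight in proportion to the output error (updates in place)."""
    for _ in range(epochs):
        for cue, target in pairs:
            output = predict(weights, cue)
            for i, row in enumerate(weights):
                error_i = target[i] - output[i]
                for j in range(len(row)):
                    row[j] += rate * error_i * cue[j]


def total_error(weights, pairs):
    """Summed squared error over a list of cue-target pairs."""
    return sum((t - o) ** 2
               for cue, target in pairs
               for t, o in zip(target, predict(weights, cue)))


# Made-up patterns: the same two A cues are paired first with B, then with C responses.
A1, A2 = [1, 0, 1, 0], [0, 1, 0, 1]
AB_LIST = [(A1, [1, 0]), (A2, [0, 1])]
AC_LIST = [(A1, [0, 1]), (A2, [1, 0])]

weights = [[0.0] * 4 for _ in range(2)]
train(weights, AB_LIST)
ab_error_before = total_error(weights, AB_LIST)  # near zero: A-B has been learned
train(weights, AC_LIST)
ab_error_after = total_error(weights, AB_LIST)   # large: A-B has been overwritten
ac_error_after = total_error(weights, AC_LIST)   # near zero: A-C has been learned
```

Human subjects in the A-B/A-C paradigm show some retroactive interference but usually retain a fair amount of the A-B list; in this sketch the A-B associations are wiped out entirely once A-C is learned.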
At a more conceptual level, and with all due
respect to McClelland (with whom I went to graduate school) and
Rumelhart (who was without a doubt
one of the world's most distinguished cognitive scientists), the
whole connectionist enterprise smacks of the S-R theory of
learning (not for nothing was Thorndike's S-R theory of learning
In these terms, ACT-R might be identified with the computational level of analysis, and is symbolic in nature. The connectionist implementation might be identified with the implementational level of analysis.
Interestingly, recent findings from cognitive neuroscience may help us to choose between symbolic and connectionist architectures. After all, the chief argument in favor of distributed models of representation is that they are more biologically plausible than localist models. But are they? Let's look at the evidence from neuroscience.
The presentation so far has focused on representation as viewed by cognitive psychology, but the rivalry between localist and distributed models has also played itself out within cognitive neuroscience.
Consider the following true story from the annals of cognitive psychology. There once was a seminar at Stanford University attended by both William K. Estes, a pioneering cognitive psychologist, and Karl Pribram, a pioneering cognitive neuroscientist. A student had presented some puzzling new experimental results, and the exchange went something like this:
Bill: Suppose there are a series of little drawers in the brain.
Karl: I have never seen any drawers in there.
Bill: They're very small.
We have a pretty good idea what memories look like in the mind. They look like propositional networks, or maybe like networks of connections. But what do memories look like in the brain? The answer comes in two forms.
The easiest answer is that every memory is represented by a single neuron, or perhaps a small cluster of neurons, located in a particular part of the brain, and that person memories are no exception to this rule. Thus, the nodes in associative-network models of person memory, like those discussed here, have their neural counterparts in distinct neurons (or clusters of neurons).
Early research by Wilder Penfield (1954), a Canadian neurosurgeon, suggested that this is indeed the case. In the process of diagnosing and treating cases of epilepsy, Penfield would stimulate various areas of the brain with a small electrical current delivered through a microelectrode implanted in the brain. This procedure does not hurt, because the cortex does not contain afferent neurons, and patients remain awake while it is performed. Accordingly, Penfield asked patients what they experienced when he stimulated them in various places. Sometimes they reported experiencing specific sensory memories, such as an image of a relative or the sound of someone speaking. This finding was controversial: Penfield had no way to check the accuracy of the memories, and it may be that what he stimulated were better described as "images" than as memories of specific events. In any event, the finding suggested that there were specific neural sites, perhaps a cluster of adjacent neurons, representing specific memories in the brain.
Evidence contradicting Penfield's conclusions was provided by Karl
Lashley (1950), a neuroscientist who conducted a "search for
the engram", or biological memory trace, for his entire
career. Lashley's method was to teach an animal a
task, ablate some portion of cerebral cortex, and then observe
the effects of the lesion on learned task performance.
Thus, if performance was impaired when some portion of the
brain was lesioned, Lashley could infer that the learning was
represented at that brain site. After 30 years of
research, Lashley reported that his efforts had been entirely
unsuccessful. Brain lesions disrupted performance, of
course. But the amount of disruption was proportional to
the amount of the cortex destroyed, regardless of the
particular location of the lesion.
Lashley's Law of Mass Action states that any specific memory is part of an extensive organization of other memories. Therefore, individual memories are represented by neurons that are distributed widely across the cortex. It is not possible to isolate particular memories in particular bundles of neurons, so it is not possible to destroy memories by specific lesions.
At about the same time, D.O. Hebb, a pioneering neuroscientist, argued that memories were represented by reverberating patterns of neural activity distributed widely over cerebral cortex. Hebb's suggestion was taken up by others, like Karl Pribram, who postulated that memory was represented by a hologram, in which information about the whole object was represented in each of its parts.
Connectionist models are inspired, in part, by both Lashley's Law of Mass Action and Hebb's reverberating-network model of memory.
Penfield's vision held some attraction for some
neuroscientists, who continued to insist that individual
memories were represented by the activity of single neurons,
or at most small clusters of neurons, at specific locations in
Until recently, that is.
A serendipitous finding, ingeniously pursued by a group of investigators at UCLA and Caltech, has suggested that there might be something to the idea of a "grandmother neuron" after all (Quiroga et al., 2005).
These investigators worked with eight patients with intractable epilepsy. In order to localize the source of the patients' seizures, they implanted microelectrodes in various portions of the patients' medial temporal lobes (the hippocampus, amygdala, entorhinal cortex, and parahippocampal cortex). Each microelectrode consisted of 8 active leads and a reference lead. They then recorded responses from each lead to visual stimulation -- pictures of people, objects, animals, and landmarks selected on the basis of pre-experimental interviews with the patients.
In one patient, the investigators identified a single unit (i.e., a single lead of a single electrode, corresponding either to a single neuron or to a very small, dense cluster of neurons), located in the left posterior hippocampus, that responded to a picture of Jennifer Aniston, an actress who starred in a popular television series, Friends. (A response was defined very conservatively as an activity spike of magnitude greater than 5 standard deviations above baseline, consistently occurring within 1 second of stimulus presentation). That unit did not respond to any other stimuli tested. The investigators quickly located other pictures of Aniston, including pictures of her with Brad Pitt, to whom she was once (and famously) married. The same unit responded to all the pictures of the actress -- except those in which she was pictured with Pitt!
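The response criterion lends itself to a short sketch. The version below is only an approximation under stated assumptions: activity is summarized as a spike count per trial in the second after stimulus onset, "consistently" is read as "on every trial", and the numbers and the function name `is_response` are invented for illustration. The investigators' actual analysis may well have differed in these details.

```python
from statistics import mean, stdev


def is_response(baseline_counts, trial_counts, n_sd=5.0):
    """True if every post-stimulus trial exceeds baseline mean + n_sd standard deviations."""
    threshold = mean(baseline_counts) + n_sd * stdev(baseline_counts)
    return all(count > threshold for count in trial_counts)


# Invented spike counts: baseline periods, then three presentations of each picture.
baseline = [2, 3, 2, 4, 3, 2, 3, 3]
preferred_picture = [12, 15, 11]   # well above threshold on every trial
other_picture = [4, 5, 3]          # within the range of baseline activity
inconsistent = [12, 5, 11]         # strong on some trials but not all
```

With this baseline the threshold works out to roughly 6.3 spikes, so only `preferred_picture` counts as a response; the `inconsistent` case fails because one trial falls short.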
Similarly, a single unit in the right anterior hippocampus of another patient responded consistently and specifically to pictures of another actress, Halle Berry (who won an Academy Award for her starring role in Monster's Ball). Interestingly, this unit also responded to a line-drawing of Berry, to a picture of Berry dressed as Catwoman (for her starring role in the unfortunate film of the same name), and even to the spelling of her name, H-A-L-L-E--B-E-R-R-Y (unfortunately, the investigators didn't think of doing this when they were working with the "Jennifer Aniston" patient -- remember, they were flying by the seat of their pants, doing this research under the time constraints of a clinical assessment). The fact that the unit responded to Berry's name, as well as to her picture, and to pictures of Berry in her (in)famous role as Catwoman, suggests that the unit represents the abstract concept of "Halle Berry", not merely some configuration of physical stimuli.
As another example, yet a third patient revealed a multi-unit (i.e., two or more leads of a single electrode, evidently corresponding to a somewhat larger cluster of neurons) in the left anterior hippocampus that responded specifically, if not quite as distinctively, to pictures of the Sydney Opera House. This same unit also responded to the letter string SYDNEY OPERA HOUSE. It also responded to a picture of the Baha'i Temple -- but then again, in preliminary testing this patient had misidentified the Temple as the Opera House! So again, as with the Halle Berry neuron, the multi-unit is responding to the abstract concept of "the Sydney Opera House", not to any particular configuration of physical features.
Across the 8 patients, Quiroga et al. tested 993 units, 343 single units and 650 multi-units, and found 132 units (14%) that responded to 1 or more test pictures. When they found a responsive unit, they then tested it with 3 to 8 variants of the test pictures. A total of 51 of these 132 units yielded evidence of an invariant representation of people, landmarks, animals, or food items. In each case, the invariant representation was abstract, in that the unit responded to different views of the object, to line drawings as well as photographs, and to names as well as pictures.
So maybe there is a "grandmother neuron" after all! This research -- which, remember, was performed in a clinical context and thus may have lacked some desirable controls -- identified sparse neural representations of particular people (landmarks, etc.), in which only a very small number of units is active during stimulus presentation.
Of course, this evidence for localization of content contradicts the distributionist assumptions that have guided cognitive neuroscience for 50 years. Further research is obviously required to straighten this out, but maybe there's no contradiction between distributionist and locationist views after all. Indeed, according to Barlow's (1972) psychophysical linking principle,
Whenever two stimuli can be distinguished reliably... the physiological messages they cause in some single neuron would enable them to be distinguished with equal or greater reliability.
In other words, even in a distributed memory representation, there has to be some neuron that responds invariantly to various representations of the same concept. Neural representations of knowledge may be distributed widely over cortex, but these neural nets may come together in single units.
But wait a minute -- we're talking about the cerebral cortex, and the data from Quiroga et al. came from the hippocampus and other subcortical structures. Note, however, that the hippocampus is crucial for memory: it was the destruction of his hippocampus that rendered H.M. amnesic. Nobody thinks that memories are stored in the hippocampus -- it's just too small for that purpose. But one prominent theory of the hippocampus is that it performs a kind of indexing function, relating memories to each other that are located in the cortex. Accordingly, maybe Quiroga et al. didn't exactly tap into their patient's whole knowledge representation of Halle Berry -- but instead hit on the neural index card that locates all that information.
In any event, more recently Quian Quiroga and his colleagues (2008) have backed off their earlier, strong claims for having discovered something very much like a grandmother cell.
So maybe symbolic/localist cognitive models have some life in them after all!
Just such an argument has been
made by Bowers (2009), in a Psychological
Review paper whose
title gives the argument away: "On the Biological Plausibility
of Grandmother Cells". At the very least, Bowers
argues that localist models of cognition are
compatible with neurophysiological
Bowers begins with an instructive discussion of the differences between localist (symbolic, computational) and connectionist (PDP) models.
Bowers argues that the general preference for distributionist over localist coding schemes is based not just on the neural analogies discussed earlier, or a particular set of neurophysiological findings, but also on a misunderstanding of localist models -- not least because there is not just one possible localist model, but several.
As Bowers notes (2009, p. 225), "The critical question is not whether a given neuron responds to more than one object, person, or word but rather whether the neuron codes for more than one thing. Localist coding is implemented if a stimulus is encoded by a single node (neuron) that passes some threshold of activity, with the activation of other nodes (neurons) contributing nothing to the interpretation of the stimulus."
For their part, distributed models also come in various forms.
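To make the contrast concrete, here is a minimal sketch of the two kinds of readout. It is only an illustration, not Bowers's model or anybody's actual simulation: the unit activities, the threshold of 5.0, and the stored patterns are all invented for the example. In the localist readout, one unit stands for one concept, and the activity of the other units contributes nothing to the interpretation once that unit passes threshold; in the distributed readout, every unit contributes, and the concept is identified by the overall pattern.

```python
import math

CONCEPTS = ["Jennifer Aniston", "Halle Berry", "Sydney Opera House"]

def localist_readout(activity, threshold=5.0):
    """One unit per concept: the most active unit names the concept,
    but only if it passes threshold. Other units are ignored."""
    best = max(range(len(activity)), key=lambda i: activity[i])
    return CONCEPTS[best] if activity[best] > threshold else None

# Distributed code: each concept is a pattern over ALL four units
# (hypothetical patterns, chosen only so that they overlap).
PATTERNS = {
    "Jennifer Aniston":   [1.0, 0.0, 1.0, 0.0],
    "Halle Berry":        [0.0, 1.0, 1.0, 0.0],
    "Sydney Opera House": [1.0, 1.0, 0.0, 1.0],
}

def cosine(a, b):
    """Cosine similarity between two activity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def distributed_readout(activity):
    """Name the concept whose stored pattern best matches the whole vector."""
    return max(PATTERNS, key=lambda name: cosine(PATTERNS[name], activity))
```

Notice that in the distributed scheme any single unit responds to more than one concept, so no unit can be interpreted by itself; that is exactly the distinction between "responding to" and "coding for" that Bowers draws.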
Memory, like any other aspect of mind and behavior, can be analyzed at the psychological level, as in models like HAM and ACT, and it can be analyzed at the neuroscientific level, as in discussions of the hippocampus and grandmother neurons. But memory can also be analyzed at a level "above" the individual mind and brain. So, for example, sociologists discuss collective memories shared by groups, organizations, institutions, and whole societies and cultures.
So how are memories represented at the
sociocultural level of analysis?
Concepts, in turn, are a form of knowledge representation known as schemata. F.C. Bartlett (1932) introduced the concept of schema (pl. schemata, although schemas is acceptable too) as a central concept in his reconstructive theory of memory. According to Bartlett, remembering is not like taking a book off the shelf and reading it, as the traditional library metaphor would have it. Rather, remembering is more like writing the book anew, based on fragmentary notes. The process of remembering, of reconstructing a memory, is guided throughout by an organized framework of world-knowledge and attitudes, within which the memory is reconstructed. This organized framework is the schema.
Many people find schemata difficult to understand, but you begin to get the idea if you think of a more familiar derived term, schematic. A schematic diagram is a kind of logical diagram of a house or piece of equipment. It shows how the parts are associated with each other. But in the case of the house, it doesn't specify what the walls are made of, or what color they are painted. And in the case of a piece of electronic equipment, it doesn't show how the parts are actually configured inside the case. A schematic diagram represents the general idea of a thing -- and that is exactly what a schema is.
Bartlett actually got the schema concept from Sir Henry Head (1861-1940), a British neurophysiologist famous for his studies of bodily posture and of aphasia. In his Studies in Neurology (1920), Head asserted that, in order to maintain correct posture, an organism must have some conception of its own body in space and time -- a homunculus-like "plastic model" which registers information about successive movements of various body parts (arms, legs, etc.), and updates the conception accordingly (see also Head and Holmes, 1911). The body schema is an internal representation of the body, but it's not exactly a picture of what our bodies look like now. Rather, it is a more generic concept of our bodies: that we have arms and legs and hands, what kinds of motions these body parts can make, where these body parts are likely to be found, and so on.
"Schemas are abstractions from specific instances that can be used to make inferences of the concepts they represent" (Anderson, Cognitive Psychology and Its Implications, 2000).
"A schema is a general knowledge structure used for understanding" (Medin, Ross, & Markman, Cognitive Psychology 2001).
In his theory of memory, Bartlett defined a schema
as "an active organization of past reactions, or of past
experiences, which must always be supposed to be operating in
any well-adapted organic response" (p. 201) -- not just in
moving around the physical world, but in mental activities such
as remembering as well.
The great Swiss developmental psychologist Jean Piaget (1896-1980) also employed the schema concept in his "genetic epistemology" theory of cognitive development. For Piaget, as for Bartlett, a schema is an internal representation of some general class of situations. Incoming stimulus information is assimilated to prevailing schemata, which in turn accommodate to information that doesn't quite fit. Thus, the child is born with innate sensory-motor schemata, which develop through pre-operational, concrete-operations, and formal-operations stages as a result of the dynamic interplay of assimilation and accommodation. It's easy to see the similarities between Bartlett's and Piaget's ideas about schemata, but neither of them references the other. As far as I can tell, Piaget first employed the schema concept in The Language and Thought of the Child (1926), so one would not expect Piaget to cite Bartlett. But Bartlett didn't cite Piaget, either. My best guess is that they derived the idea independently -- Bartlett from Henry Head, and Piaget from Immanuel Kant. Oldfield and Zangwill (1942-1943) do not cite Piaget in their discussion of Head and Bartlett, and deny any connection between Bartlett's views and Kant.
It was Kant, in fact, who first introduced the notion of a schema, referring to the a priori categories that Kant invoked in his synthesis of Cartesian rationalism and British empiricism. Think, for example, of the associationist principle of association by contiguity (never mind that it's wrong). You can't perceive things as close together in space and time unless you already have some notion of space and time. Such notions are schemata, in Kant's terms.
Incidentally, the Bartlett-Piaget coincidence repeated itself several decades later. In his pioneering textbook on Cognitive Psychology, published in 1967, Ulric (Dick) Neisser made considerable use of Bartlett's notion of the schema as the generic knowledge against which percepts are constructed and memories reconstructed. At exactly the same time, Aaron T. (Tim) Beck published a pioneering cognitive theory of depression (as opposed to the prevailing psychoanalytic one), based on the idea that depressed individuals suffer from depressogenic schemata -- basically, negative construals of self, the future, and the world. Neisser was at the time on the faculty at Cornell, but he wrote his book while on sabbatical at the University of Pennsylvania -- which was where Beck, on the faculty of Penn's psychiatry department, was writing his book. I know both individuals (being a Penn PhD), and so far as I can tell neither knew what the other was up to.
Partly owing to the influence of Neisser's book, and partly owing to the increasing interest on the part of memory researchers in memory for stories (as opposed to word-lists), the schema concept was revived in the 1970s -- first within cognitive psychology, and then within social psychology. For example, a number of experiments showed that comprehension of prose passages was better if subjects were first given information about the general theme of the passage; expert chess players remember chess positions better than novices; and story details that fit subjects' expectations and world-knowledge are remembered better than those that do not.
Taylor and Crocker (1981) discussed a number of
functions of schemata:
Brewer and Nakamura (1984) outlined five ways that schemata could specifically influence memory:
Both Bartlett's and Piaget's notions of schemata are relatively informal, and so was the concept of schema held by the cognitive and social psychologists just described. For them, the term simply refers to an organized body of more-or-less generic knowledge that guides perception, memory, thought, and action. But this is a lecture supplement on representation, so we need to ask:
what do schemata look like?
We got an answer when the schema concept was revived in cognitive science, and particularly in work on artificial intelligence, by theorists who rejected the "atomistic" implications of information-processing theory -- as in HAM or ACT, with individual pieces of knowledge represented as local nodes in an associative network. They had to figure out what schemata looked like, because they wanted to incorporate the concept in their computer-simulation models of memory and other aspects of cognition.
For example, Minsky (1975) explicitly rejected atomism and postulated the existence of "larger" "data structures" for representing knowledge known as frames. A frame has nodes that provide its basic structure, and slots that accept only certain kinds of information. If a slot is not filled by information to the contrary, it is filled in by "default" information. For example, a room has a floor, walls, windows, doors, and a ceiling, each represented by nodes. The floor may be wood or tile or carpeted, but it is unlikely to be made of water or grass. The ceiling may be level or vaulted, but if it is vaulted the vault is unlikely to point downward. There are usually four walls, and at least one window on every outside wall.
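The room example can be sketched as a small data structure. This is a toy illustration, not Minsky's own notation: the class name Frame, the slot names, and the particular sets of acceptable fillers are assumptions made for the example, and only the default of four walls comes from the text (the defaults for floor and ceiling are invented). The two essential properties are that a slot rejects fillers of the wrong kind, and that an unfilled slot returns its default.

```python
class Frame:
    """A frame: named slots, each with a set of acceptable fillers and a default."""

    def __init__(self, name, slots):
        # slots maps slot name -> (set of acceptable fillers, default filler)
        self.name = name
        self.slots = slots
        self.fillers = {}

    def fill(self, slot, value):
        """Fill a slot, but only with the kind of information it accepts."""
        acceptable, _default = self.slots[slot]
        if value not in acceptable:
            raise ValueError(f"{value!r} is not an acceptable filler for {slot!r}")
        self.fillers[slot] = value

    def get(self, slot):
        """Return the filler, or the default if nothing to the contrary is known."""
        _acceptable, default = self.slots[slot]
        return self.fillers.get(slot, default)


# Hypothetical slot definitions for the room frame.
ROOM_SLOTS = {
    "floor":   ({"wood", "tile", "carpet"}, "wood"),
    "ceiling": ({"level", "vaulted"}, "level"),
    "walls":   ({3, 4, 5, 6}, 4),
}

room = Frame("room", ROOM_SLOTS)
room.fill("floor", "tile")      # information to the contrary overrides the default
print(room.get("floor"))        # tile
print(room.get("walls"))        # 4, supplied by default
```

Trying room.fill("floor", "water") raises an error, which is the frame's way of saying that floors are unlikely to be made of water.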
At roughly the same time, Rumelhart and Ortony (1977) also invoked the schema concept to handle the problem of representing "higher-level abstractions" in story memory (which they used as a proxy for episodic memory in general).
Rumelhart (1981, 1984) began by offering some
analogies between schemata and more familiar terms:
For Rumelhart (1984), schemata have several major
thorough discussions of Bartlett's schema theory, and
its more modern adaptations, see
A special form of schema is known as a script. The notion of scripts has its origins in sociological role theory, and sociologists of sex often discuss sexual interactions as scripted in nature. For a long time, however, the script concept was relatively informal, based on a dramaturgical metaphor for social behavior in general.
Just what goes into scripts,
and how they are structured, was discussed in detail by Schank
& Abelson (1977), who went so far as to write script
theory in the form of an operating computer program -- another
exercise in artificial intelligence, this time applied
to the domain of social cognition. Schank and Abelson
based their scripts on conceptual dependency theory
(Schank, 1975), which attempts to represent the meaning of
sentences in terms of a relatively small set of primitive
elements. Included in these primitive elements are
primitive acts such as:
Schank & Abelson illustrate their approach with what they call the Restaurant Script:
Entering the Restaurant
Customer PTRANS Customer into restaurant
Customer MOVE Customer to sitting position
Customer MTRANS Signal to Waiter
Cook ATRANS Food to Waiter
Waiter PTRANS Food to Customer
Customer INGEST Food
Waiter ATRANS Check to Customer
Customer PTRANS Customer out of restaurant.
Although script theory attempts to specify the major elements of a social interaction in terms of a relatively small list of conceptual primitives, Schank and Abelson also recognized that scripts are incomplete. For example, there are free behaviors that can take place within the confines of the script.
There are also anticipated
variations of the script, such as
In any event, scripts enable us to categorize social situations: we can determine what situation we are in by matching its features to the prototypical features of various scripts we know. And, having categorized the situation in terms of some script, that script will then serve to guide our social interactions within that situation. By specifying the temporal, causal, and enabling relations among various actions, the script enables us to know how to respond to what occurs in that situation.
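Here is a rough sketch of what categorizing a situation by script might look like. It is not Schank and Abelson's program: the restaurant events are transcribed from the listing above, but the library script is invented purely for contrast, and the matching rule (the proportion of observed events found in the script) is an assumption of this example. For simplicity the matcher treats each script as an unordered set of events, so it ignores the temporal and causal relations that a full script would encode.

```python
# Each event is (actor, primitive act, what happens).
SCRIPTS = {
    "restaurant": {
        ("Customer", "PTRANS", "Customer into restaurant"),
        ("Customer", "MOVE", "Customer to sitting position"),
        ("Customer", "MTRANS", "Signal to Waiter"),
        ("Cook", "ATRANS", "Food to Waiter"),
        ("Waiter", "PTRANS", "Food to Customer"),
        ("Customer", "INGEST", "Food"),
        ("Waiter", "ATRANS", "Check to Customer"),
        ("Customer", "PTRANS", "Customer out of restaurant"),
    },
    # Hypothetical second script, invented for this example.
    "library": {
        ("Patron", "PTRANS", "Patron into library"),
        ("Patron", "MTRANS", "Request to Librarian"),
        ("Librarian", "ATRANS", "Book to Patron"),
        ("Patron", "PTRANS", "Patron out of library"),
    },
}

def match_score(observed, script):
    """Proportion of the observed events that appear in the script."""
    if not observed:
        return 0.0
    return len(observed & script) / len(observed)

def categorize(observed):
    """Pick the script that best matches what has been observed so far."""
    return max(SCRIPTS, key=lambda name: match_score(observed, SCRIPTS[name]))

def expected_events(observed):
    """Once the situation is categorized, the script says what else to expect."""
    return SCRIPTS[categorize(observed)] - observed

seen = {
    ("Customer", "MOVE", "Customer to sitting position"),
    ("Customer", "INGEST", "Food"),
}
print(categorize(seen))   # restaurant
```

Having decided that we are in a restaurant, expected_events(seen) includes the waiter handing over the check, which is the sense in which the script guides behavior.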
Our discussion of memory storage has focused on
episodic memory -- that is, how specific episodes of experience,
thought, and action are represented in the mind. But it is
also clear that more than episodic memories are stored in the
mind. There is also semantic knowledge of various sorts,
as well as procedural knowledge. A special form of
semantic knowledge concerns conceptual knowledge about the
world. Technically, conceptual knowledge is part of
semantic memory, and we have already discussed how certain
classic models of semantic memory represent conceptual
So the question becomes -- what are concepts, and how are categories represented in the mind?
The terms concept and category
are often used interchangeably, even though there is an
important technical distinction between them:
think of our mental concepts as being derived from the actual
categorical structure of the real world, but there are also
points of divergence:
Some categories may be defined through enumeration: an exhaustive list of all instances of a category. A good example is the English alphabet, A through Z; these letters have nothing in common except their status as letters in the English alphabet.
A variant on enumeration is to define a category by a rule which will generate all instances of the category (these instances all have in common that they conform to the rule). An example is the concept of integer in mathematics, which is defined as the numbers 0, 1, and any number which can be obtained by adding 1 to, or subtracting 1 from, these numbers one or more times.
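The difference between enumeration and a generating rule can be sketched in a few lines. The function names here are invented for the illustration. The alphabet is simply listed; the integers cannot be listed, so they are produced by the rule, and membership is tested by running the rule backwards (stepping toward 0 by ones).

```python
import string
from itertools import count, islice

# Enumeration: the category IS its exhaustive list of members.
LETTERS = set(string.ascii_uppercase)

def is_letter(x):
    return x in LETTERS

# Rule: start from 0 and 1, then keep adding or subtracting 1.
def integers():
    """Generate 0, 1, -1, 2, -2, ... without end."""
    yield 0
    for n in count(1):
        yield n
        yield -n

def is_integer(x):
    """Undo the rule: step toward 0 by ones; members land exactly on 0."""
    x = abs(x)
    while x > 0:
        x -= 1
    return x == 0

print(list(islice(integers(), 5)))   # [0, 1, -1, 2, -2]
```

The enumerated category is finite and closed; the rule-defined category is open-ended, but every instance conforms to the same rule.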
The most common definitions
of categories are by attributes: properties or
features which are shared by all members of a category. Thus,
birds are warm-blooded vertebrates with feathers and wings,
while fish are cold-blooded vertebrates with scales and fins.
There are three broad types of attributes relevant to category
Still, most categories are defined by attributes, meaning that concepts are summary descriptions of an entire class of objects, events, and ideas. There are three principal ways in which such categories are organized: as proper sets, as fuzzy sets, and as sets of exemplars.
Now, having defined the differences between the two terms, we are going to use them interchangeably again. The reason is that it's boring to write concept all the time; moreover, the noun category has a cognate verb form, categorize (and its noun, categorization), while concept does not (unless you count conceptualization, which is a mouthful that doesn't mean quite the same thing as categorization).
Still, the semantic difference
between concepts and categories raises two particularly
interesting issues for social categorization:
Perhaps the earliest philosophical discussion of conceptual structure was provided by Aristotle in his Categories. Aristotle set out the classical view of categories as proper sets -- a view which dominated thinking about concepts and categories well into the 20th century. Beginning in the 1950s, however, and especially the 1970s, philosophers, psychologists, and other cognitive scientists began to express considerable doubts about the classical view. In the time since, a number of different views of concepts and categories have emerged -- each attempting to solve the problems of the classical view, but each raising new problems of its own. Here's a short overview of the evolution of theories of conceptual structure.
According to the classical view, concepts are summary descriptions of the objects in some category. This summary description is abstracted from instances of a category, and applies equally well to all instances of a category.
In the classical view, categories are structured as proper sets, meaning that the objects in a category share a set of defining features which are singly necessary and jointly sufficient to demarcate the category.
Such hierarchies of proper sets are characterized by perfect nesting, by which we mean that subsets possess all the defining features of supersets (and then some). Examples include:
superset: points, lines, planes, solids
subsets of planes: triangles, quadrilaterals, etc.
sub-subsets of quadrilaterals: parallelograms, rhomboids, etc.
sub-sub-subsets of parallelograms: rectangles, squares, etc.
superset: male, female
subsets of males: youth, bachelor, husband, widower
subsets of females: girl, maiden, wife, widow
superset: executive, legislative, judicial
subsets of legislative: senator, representative
subsets of executive: president, cabinet secretary, administrator
subsets of judicial: supreme court, court of appeals, district court, magistrate
example, the perfect nesting in the hierarchy of geometrical
Proper sets are also characterized by an all-or-none arrangement which characterizes the horizontal relations between adjacent categories, or the distinction between a category and its contrast. Because defining features are singly necessary and jointly sufficient, proper sets are homogeneous in the sense that all members of a category are equally good instances of that category (because they all possess the same set of defining features). An entity either possesses a defining feature or it doesn't; thus, there are sharp boundaries between contrasting categories: an object is either in the category or it isn't. You're either a fish, or you're not a fish. There are no ambiguous cases of category membership.
According to the classical view, object categorization proceeds by a process of feature-matching. Through perception, the perceiver extracts information about the features of the object; these features are then compared to the defining features of some category. If there is a complete match between the features of the object and the defining features of the category, then the object is labeled as another instance of the category.
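Feature-matching under the classical view amounts to a subset test, and perfect nesting amounts to a subset relation between lists of defining features. The sketch below uses the geometrical hierarchy; the particular feature lists are simplified assumptions made for the example, not a textbook definition of each figure.

```python
# Defining features: singly necessary, jointly sufficient (simplified).
DEFINING = {
    "quadrilateral": {"closed plane figure", "four sides"},
    "parallelogram": {"closed plane figure", "four sides",
                      "opposite sides parallel"},
    "rectangle":     {"closed plane figure", "four sides",
                      "opposite sides parallel", "four right angles"},
    "square":        {"closed plane figure", "four sides",
                      "opposite sides parallel", "four right angles",
                      "four equal sides"},
}

def is_member(features, category):
    """All-or-none: the object must have EVERY defining feature."""
    return DEFINING[category] <= set(features)

def is_nested(subset, superset):
    """Perfect nesting: the subset has all the superset's features, and then some."""
    return DEFINING[superset] < DEFINING[subset]

figure = DEFINING["square"] | {"drawn in red ink"}   # extra features don't matter
print(is_member(figure, "rectangle"))                # True
print(is_member({"closed plane figure", "four sides"}, "parallelogram"))  # False
print(is_nested("square", "rectangle"))              # True
```

Note that is_member can only return True or False. There is no room in this scheme for an object to be a better or worse instance of its category, which is the homogeneity assumption that the typicality findings call into question.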
The proper set view of categorization is sometimes called the classical view because it is the one handed down in logic and philosophy from the time of the ancient Greeks. But there are some problems with it which suggest that however logical it may seem, it's not how the human mind categorizes objects. Smith & Medin (1981) distinguished between general criticisms of the classical view, which arise from simple reflection, and empirical criticisms, which emerge from experimental data on concept-formation.
General Criticisms. On reflection, for example, it appears that some concepts are disjunctive: they are defined by two or more different sets of defining features.
Another problem is that many entities have unclear category membership. According to the classical, proper-set view of categories, every object should belong to one category or another. But is a rug an article of furniture? Is a potato a vegetable? Is a platypus a mammal? Is a panda a bear? We use categories like "furniture" without being able to clearly determine whether every object is a member of the category.
Furthermore, some categories are associated with unclear definitions. That is, it is difficult to specify the defining features of many of the concepts we use in ordinary life. A favorite example (from the philosopher Wittgenstein) is the concept of "game". Games don't necessarily involve competition (solitaire is a game); there isn't necessarily a winner (ring-around-the-rosy), and they're not always played for amusement (professional football). Of course, it may be that the defining features exist, but haven't been discovered yet. But that doesn't prevent us from assigning entities to categories; thus, categorization doesn't seem to depend on defining features.
Empirical Criticisms. Yet another problem is imperfect nesting: it follows from the hierarchical arrangement of categories that members of subordinate categories should be judged as more similar to members of immediately superordinate categories than to more distant ones, for the simple reason that the two categories share more features. Thus, a sparrow should be judged more similar to a bird than to an animal. This principle is often violated: for example, chickens, which are birds, are judged to be more similar to animals than to birds. This results in a tangled hierarchy of related concepts.
The chicken-sparrow example reveals the last, and perhaps the biggest, problem with the classical view of categories as proper sets: some entities are better instances of their categories than others. This is the problem of typicality. A sparrow is a better instance of the category bird -- it is a more "birdy" bird -- than is a chicken (or a goose, or an ostrich, or a penguin). Within a culture, there is a high degree of agreement about typicality. The problem is that all the instances in question share the features which define the category bird, and thus must be equivalent from the classical view. But they are clearly not equivalent; variations in typicality among members of a category can be very large.
Variations in typicality can be observed even in the classic example of a proper set -- namely, geometrical figures. For example, subjects usually identify an equilateral triangle, with equal sides and equal angles, as more typical of the category triangle, than isosceles, right, or acute triangles.
There are a
large number of ways to observe typicality effects:
Typicality is important because it is another violation of the homogeneity assumption of the classical view. It appears that categories have a special internal structure which renders instances nonequivalent, even though they all share the same singly necessary and jointly sufficient defining features. Typicality effects indicate that we use non-necessary features when assigning objects to categories. And, in fact, when people are asked to list the features of various categories, they usually list features that are not true for all category members.
The implication of these problems, taken together, is that the classical view of categories is incorrect. Categorization by proper sets may make sense from a logical point of view, but it doesn't capture how the mind actually works.
Recently, another view of categorization has gained status within psychology: this is known as the prototype or the probabilistic view.
The prototype view retains the idea, from the classical view, that concepts are summary descriptions of the instances of a category. Unlike the classical view, however, in the prototype view the summary description does not apply equally well to every member of the category, because there are no defining features of category membership.
In the prototype view, categories are fuzzy sets, in that there is only a probabilistic relationship between any particular feature and category membership. No feature is singly necessary to define a category, and no set of features is jointly sufficient.
Fuzzy Sets and Fuzzy Logic
The notion of categories as fuzzy rather than proper sets, represented by prototypes rather than lists of defining features, is related to the concept of fuzzy logic developed by Lotfi Zadeh, a computer scientist at UC Berkeley. Whereas the traditional view of truth is that a statement (such as an item of declarative knowledge) is either true or false, Zadeh argued that statements can be partly true, possessing a "truth value" somewhere between 0 (false) and 1 (true).
Fuzzy logic can help resolve certain logical conundrums -- for example the paradox of Epimenides the Cretan (6th century BC), who famously asserted that "All Cretans are liars". If all Cretans are liars, and Epimenides himself is a Cretan, then his statement cannot be true. Put another way: if Epimenides is telling the truth, then he is a liar. As another example, consider the related Liar paradox: the simple statement that "This sentence is false". Zadeh has proposed that such paradoxes can be resolved by concluding that the statements in question are only partially true.
Fuzzy logic also applies to categorization. Under the classical view of categories as proper sets, a similar "all or none" rule applies: an object either possesses a defining feature of a category or it does not; and therefore it either is or is not an instance of the category. But under fuzzy logic, the statement "object X has feature Y" can be partially true; and if Y is one of the defining features of category Z, it also can be partially true that "Object X is an instance of category Z".
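A small sketch may help here. It uses the standard fuzzy operators (NOT is 1 minus the truth value, AND is the minimum, OR is the maximum); the feature names and truth values for the tomato are invented for the illustration, and treating category membership as the fuzzy AND of the defining features is one simple reading of the paragraph above, not the only possible one.

```python
def fuzzy_not(a):
    return 1.0 - a

def fuzzy_and(*values):
    return min(values)

def fuzzy_or(*values):
    return max(values)

def membership(feature_truth, defining_features):
    """Degree to which 'X is an instance of Z' is true, given the degree to
    which 'X has feature Y' is true for each defining feature Y of Z."""
    return fuzzy_and(*(feature_truth.get(f, 0.0) for f in defining_features))

# The Liar: a sentence asserting its own falsity needs a truth value v
# with v == fuzzy_not(v). Neither 0 nor 1 works, but 0.5 does.
LIAR = 0.5
print(fuzzy_not(LIAR) == LIAR)   # True

# Hypothetical truth values for a tomato.
tomato = {"has seeds": 1.0, "tastes sweet": 0.3, "eaten as dessert": 0.1}
FRUIT = ["has seeds", "tastes sweet", "eaten as dessert"]
print(membership(tomato, FRUIT))   # 0.1
```

On this reading the tomato is a fruit to degree 0.1: not simply in the category, and not simply out of it.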
A result of the probabilistic relation between features and categories is that category instances can be quite heterogeneous. That is, members of the same category can vary widely in terms of the attributes they possess. All of these attributes are correlated with category membership, but none are singly necessary and no set is jointly sufficient.
Some instances of a category are more typical than others: these possess relatively more central features.
According to the prototype view, categories are not represented by a list of defining features, but rather by a category prototype, or focal instance, which has many features central to category membership (and thus a family resemblance to other category members) but few features central to membership in contrasting categories.
It also follows from the prototype view that there are no sharp boundaries between adjacent categories (hence the term fuzzy sets). In other words, the horizontal distinction between a category and its contrast may be very unclear. Thus, a tomato is a fruit but is usually considered a vegetable (it has only one perceptual attribute of fruits, having seeds; but many functional features of vegetables, such as the circumstances under which it is eaten). Dolphins and whales are mammals, but are usually (at least informally) considered to be fish: they have few features that are central to mammalhood (they give live birth and nurse their young), but lots of features that are central to fishiness.
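The dolphin example can be worked through with a toy similarity measure. Everything specific here is an assumption made for the illustration: the feature lists are invented, every feature is weighted equally, and similarity is just the proportion of a prototype's central features that the instance possesses. Real prototype models weight features by how strongly they predict category membership.

```python
# Hypothetical prototypes: features central to each category.
PROTOTYPES = {
    "mammal": {"gives live birth", "nurses young", "has fur",
               "has four limbs", "lives on land"},
    "fish":   {"lives in water", "has fins", "swims",
               "has scales", "lays eggs"},
    "bird":   {"has feathers", "has wings", "flies", "sings", "is small"},
}

def similarity(instance, category):
    """Proportion of the prototype's central features the instance has."""
    prototype = PROTOTYPES[category]
    return len(instance & prototype) / len(prototype)

def categorize(instance, candidates):
    """Assign the instance to the most similar of the contrasting categories."""
    return max(candidates, key=lambda c: similarity(instance, c))

DOLPHIN = {"gives live birth", "nurses young",
           "lives in water", "has fins", "swims"}
SPARROW = {"has feathers", "has wings", "flies", "sings", "is small"}
CHICKEN = {"has feathers", "has wings"}

print(categorize(DOLPHIN, ["mammal", "fish"]))   # fish
print(similarity(SPARROW, "bird") > similarity(CHICKEN, "bird"))   # True
```

The same measure delivers both results discussed above: graded typicality within a category (the sparrow is a "birdier" bird than the chicken) and fuzzy boundaries between categories (the dolphin lands, informally, among the fish).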
There are two different versions of the prototype view.
The prototype view solves most of the problems that confront the classical view, and (in my view, anyway) is probably the best theory of conceptual structure and categorization that we've got. But as research proceeded on various aspects of the prototype view, certain problems emerged, leading to the development of other views of concepts and categories.
In the prototype view, as in the classical view, related categories can be arranged in a hierarchy of subordinate and superordinate categories. Many accounts of the prototype view argue that there is a basic level of categorization, which is defined as the most inclusive level at which:
For example, some theorists now favor a third view of concepts and categories, which abandons the definition of concepts as summary descriptions of category members. According to the exemplar view, concepts consist simply of lists of their members, with no defining or characteristic features to hold the entire set together. In other words, what holds the instances together is their common membership in the category. It's a little like defining a category by enumeration, but not exactly. The members do have some things in common, according to the exemplar view; but those things are not particularly important for categorization.
When we want to know whether an object is a member of a category, the classical view says that we compare the object to a list of defining features; the prototype view says that we compare it to the category prototype; the exemplar view says that we compare it to individual category members. Thus, in forming categories, we don't learn prototypes, but rather we learn salient examples.
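Categorization by exemplars can be sketched as a comparison with stored, labeled instances rather than with a summary description. The stored exemplars, their features, and the use of Jaccard overlap as the similarity measure are all assumptions of this example, not a model from the literature.

```python
# Stored exemplars: (category label, features of one remembered instance).
EXEMPLARS = [
    ("bird", {"has feathers", "has wings", "flies", "sings", "is small"}),  # a sparrow
    ("bird", {"has feathers", "has wings", "swims", "lives on ice"}),       # a penguin
    ("fish", {"has fins", "has scales", "swims", "lays eggs"}),             # a trout
]

def jaccard(a, b):
    """Shared features divided by all features of either object."""
    return len(a & b) / len(a | b)

def classify_by_exemplar(obj):
    """Label the object like its single most similar stored exemplar."""
    label, _features = max(EXEMPLARS, key=lambda e: jaccard(obj, e[1]))
    return label

# A new, only partially observed creature.
newcomer = {"has wings", "swims", "lives on ice", "dives"}
print(classify_by_exemplar(newcomer))   # bird
```

The newcomer is called a bird because it resembles the remembered penguin, even though it shares only one feature with the sparrow, the sort of instance a bird prototype would be built around.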
Teasing apart the prototype and the exemplar view turns out to be fiendishly difficult. There are a couple of very clever experiments which appear to support the exemplar view. For example, it turns out that we will classify an object as a member of a category if it resembles another object that is already labeled as a category member, even if neither the object nor the instance particularly resembles the category prototype.
Nevertheless, some investigators are worried about the exemplar view because it seems to be uneconomical. The compromise position, which has many adherents, is that we categorize in terms of both prototypes and exemplars. For example, and this is still a hypothesis to be tested, novices in a particular domain categorize in terms of prototypes and experts categorize in terms of exemplars.
Despite these differences, the exemplar view agrees with the prototype view that categorization proceeds by way of similarity judgments. And they further agree that similarity varies in degrees. They just differ in what the object must be similar to:
As noted, the prototype and exemplar views of categorization are both based on a principle of similarity. What members of a category have in common is that they share some features or attributes with at least some other member(s) of the same category. The implication is that similarity is an attribute of objects that can either be measured (by counting overlapping features) or judged (by estimating the overlap). But ingenious researchers have uncovered some troubles with similarity as a basis for categorization -- and, for that matter, with similarity in general.
Context Effects. More recently, it has been recognized that some categories are defined by theories instead of by similarity. For example, in one experiment, when subjects were presented with pictures of a white cloud, a grey cloud, and a black cloud, they grouped the grey and black clouds together as similar; but when presented with pictures of white hair, grey hair, and black hair, in which the shades of hair were identical to the shades of cloud, subjects grouped the grey hair with the white hair. Because the shades were identical in the two cases, grouping could not have been based on similarity of features. Rather, the categories seemed to be defined by a theory of the domain: grey and black clouds signify stormy weather, while white and grey hair signify old age.
Ad-Hoc Categories. What do children, money, insurance papers, photo albums, and pets have in common? Nothing, when viewed in terms of feature similarity. But they are all things that you would take out of your house in case of a fire. The objects listed together are similar to each other in this respect only; in other respects, they are quite different.
This is also true of the context effects on similarity judgment: grey and black are similar with respect to clouds and weather, while grey and white are similar with respect to hair and aging.
These observations tell us that similarity is not necessarily the operative factor in category definition. In some cases, at least, similarity is determined by a theory of the domain in question: there is something about weather that makes grey and black clouds similar, and there is something about aging that makes white and grey hair similar.
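One way to picture the cloud and hair result is as a lookup through a theory rather than a comparison of features. The sketch below is a toy model under stated assumptions: the `THEORY` table and the `group_by_theory` function are hypothetical names invented for illustration, and the table simply restates the interpretation given above (grey and black clouds signify stormy weather; white and grey hair signify old age). The labels for the remaining shade in each domain are placeholders.

```python
from collections import defaultdict

# Hypothetical "theory" table: what each shade signifies within a domain.
# The "not stormy" and "not old" labels are assumed placeholders.
THEORY = {
    ("cloud", "white"): "not stormy",
    ("cloud", "grey"): "stormy weather",
    ("cloud", "black"): "stormy weather",
    ("hair", "white"): "old age",
    ("hair", "grey"): "old age",
    ("hair", "black"): "not old",
}

def group_by_theory(domain, shades):
    """Group shades by what they signify in the domain, not by how they look."""
    groups = defaultdict(list)
    for shade in shades:
        groups[THEORY[(domain, shade)]].append(shade)
    return dict(groups)

# The same three shades are presented in both domains.
shades = ["white", "grey", "black"]
cloud_groups = group_by_theory("cloud", shades)
hair_groups = group_by_theory("hair", shades)

print(cloud_groups)
print(hair_groups)
```

The perceptual input is identical in the two calls; only the domain argument differs, and that alone changes which shade grey is grouped with.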
In the theory-based view of categorization (Medin, 1989), concepts are essentially theories of the categorical domain in question. Conceptual theories perform a number of
One way or another, concepts and categories have coherence: there is something that links members together. In classification by similarity, that something is intrinsic to the entities themselves; in classification by theories, that something is imposed by the mind of the thinker.
But what to make of this proliferation of theories? From my point of view, the questions raised about similarity have a kind of forensic quality -- they sometimes seem to amount to a kind of scholarly nit-picking. To be sure, similarity varies with context; and there are certainly some categories that are held together only by a theory, where similarity fails utterly. Still, for most purposes, the prototype view, perhaps corrected (or expanded) a little by the exemplar view, seems to work pretty well as an account of how concepts are structured, and how objects are categorized.
As it happens, most work on social categorization has been based on the prototype view. But there are areas where the exemplar view has been applied very fruitfully, and even a few areas where it makes sense to abandon similarity, and to invoke something like the theory-based view.
To summarize this history, concepts were first construed as summary descriptions of

categories are just about the most interesting topic in all of psychology and cognitive science, and two very good books have been written on the subject. They are highly recommended:
This page last revised 02/14/2014.