


Cognition is the mental faculty by which we know the world, and cognitive psychology is concerned with the acquisition, representation, transformation, and utilization of knowledge by humans (and animals).    Learning is the first step in that process.  

In terms of human information processing, the mind performs a sequence of activities:

Traditionally, this activity was described in terms of the formation of associations of three types:


Reflexes, Taxes, and Instincts

Some of these associations are innate or inborn, part of the organism's native biological endowment.  



The reflex is the simplest possible connection between an environmental stimulus and an organismic response. Examples are:

Reflexes are automatic, in that they occur inevitably in response to an adequate stimulus, and occur the first time that stimulus is presented. They do not require the involvement of the higher centers in the nervous system: they persist even when the spinal cord is severed from the brain. Most reflexes are fairly simple, but even fairly complicated activities can be reflexive in nature.

The 19th-century French physiologist Marie-Jean-Pierre Flourens conducted a series of classic studies of reflexes in the decorticate pigeon. He removed both lobes of the cerebral cortex in the bird, and then attempted to determine which patterns of behavior remained in the repertoire. Certain behaviors were preserved:

However, other behaviors disappeared:
Thus, Flourens characterized the decorticate pigeon as a reflex machine, that merely reacted to external stimulation by means of reflexes, but displayed no spontaneous or self-initiated behavior.

Human beings also come "prewired" with a repertoire of reflexes: automatic responses to stimulation that appear soon after birth, before the infant has had any opportunity for learning.

Some of these are reflexes of approach, which are elicited by weak stimuli and have the effect of increasing contact with the stimulus. Among these is rooting: when the infant's cheek is touched, it will turn its head in the direction of the touch and open its mouth; if its mouth makes contact with any object, it will close and begin to suck (this reflex will occur even if the infant is asleep or comatose).
Another reflex of approach is grasping: if the palm of the hand is touched, the fingers will flex and close around the object; the grasping reflex can be very strong.

Similarly, if the sole of the foot is touched, the response will be "plantarflexion": the toes will stretch and turn downward.

Other stimulus-response patterns are reflexes of avoidance, which are elicited by intense or noxious stimuli, and have the effect of decreasing contact with the stimulus.
For example, the infant's eyes will close automatically in response to a bright light, and the mouth will close at the introduction of an unpleasant taste (e.g., quinine).

If the palms or soles are scratched, pinched, or pricked, there will be spreading of the fingers or toes, and withdrawal of the hands or feet (in the case of the feet, the toes will also show "dorsiflexion", or turning upward -- the "Babinski reflex").
A very interesting set of behaviors is the stepping reflex. Infants appear to "learn to walk", but this appearance is deceiving. If the infant's body is supported, and it is moved forward along a flat surface, it will show synchronized stepping. If its toes strike the riser of a set of stairs, it will lift its feet. Neonates don't have to learn the stepping pattern: they can't walk only because their skeletal musculature has not yet matured enough for them to support themselves.



Despite the large repertoire of reflexes, infants do not show much initiation of directed activity. The behaviors of the young infant are pretty much confined to reflexes, which are gradually replaced with voluntary action.

Reflexes are an important part of the organism's behavioral repertoire, but they have their limitations.

With subsequent development, reflexes tend to disappear. But they are not abolished entirely: the knee-jerk and rooting reflexes can be elicited in adults; and adult paraplegics display a full repertoire of reflexes. However, adult behavior is dominated by voluntary action, and reflexes slip into the background.

Reflexes involve relatively small portions of the nervous system. In principle, the reflex arc requires only three neurons -- though in practice, spinal reflexes involve entire afferent and efferent nerves, as well as the spinal cord. Other innate stimulus-response connections consist of more complicated action sequences, that involve larger portions of the nervous system, and skeletal musculature.


A taxis (plural, taxes) is a gross orientation response: after presentation of a stimulus, the whole organism turns and moves. Taxes come in two forms:

Phototaxes involve responses to light, geotaxes involve responses to gravity (these can be observed in worms and ants as they move up and down inclines).

There are actually lots of other taxes, which can be observed mostly at the cellular level:

Taxes are not simple reflexes, because they involve the entire skeletal musculature of the organism. But they are still innate, and involuntary.


Taxes and Reflexes in the Neonate Kangaroo

The behavior of the newborn kangaroo illustrates an effective combination of reflexes and taxes. The kangaroo, like all marsupials (e.g., the opossum), has no placenta. The female gives birth after one month of gestation, and carries the developing fetus in a pouch. But how does the fetus get into the pouch? 

Immediately after birth, the newborn climbs up the mother's abdomen -- perhaps by virtue of a negative geotaxis. If it reaches the opening of the pouch, it reverses its behavior and climbs in -- maybe a positive geotaxis.  If it does not encounter the opening of the pouch, it will continue climbing until it reaches the top, stop -- or maybe fall off -- and eventually die.  The mother kangaroo has no way of helping the infant -- the appropriate behaviors simply aren't in her instinctual repertoire, and -- a point I'll expand on later -- she has no opportunity to learn them through trial and error. 

Once in the pouch, if the neonate encounters a nipple, it will attach to it and begin to nurse -- probably a variant on the rooting reflex. If not, it will simply stop at the bottom of the pouch and eventually die.

Assuming that all goes well, the baby kangaroo emerges from the pouch after about six more months of gestation.

Note that the neonate gets in the pouch by its own automatic actions, with no assistance from its mother. The behavior is entirely under stimulus control, and if it fails to contact the appropriate stimulus it will simply die.



Other innate behaviors involve more complicated action sequences, and more specific, discriminating responses. These are known as instincts or fixed action patterns.  Instincts have several important properties.  As a rule, they are:

Instincts are studied by ethology, a branch of behavioral biology devoted to understanding animal behavior in natural environments, viewed from an evolutionary perspective. As a biological discipline, ethology asks four questions about behavior -- all of them variants on "Why does an animal behave the way it does?"

These constitute different levels at which behavior can be analyzed.

Note, however, the focus of ethology on behavior -- and, in particular, on natural behavior. Ethologists analyze animal behavior in its ecological and evolutionary context; they do experiments, but their experiments are performed under field conditions (or something very closely resembling them), not in the sterile confines of the laboratory. Ethologists are not really psychologists, because they are interested only in behavior, not in mind per se. Nevertheless, psychology is a big tent, and many ethologists have found their disciplinary home in a department of psychology, as well as in departments of biology (especially integrative biology, as opposed to molecular and cellular biology).

A Nobel Prize for Ethology

Three important ethologists, Konrad Lorenz, Nikolaas (Niko) Tinbergen, and Karl von Frisch, won the 1973 Nobel Prize in Physiology or Medicine for their pioneering research on instincts (four years earlier, in 1969, Tinbergen's brother Jan had shared the first Nobel Prize in Economics for his pioneering research on econometrics). For an intellectual biography of Tinbergen, see Niko's Nature: A Life of Niko Tinbergen and His Science of Animal Behaviour by H. Kruuk (2003).

The concept of instinct is well illustrated in Konrad Lorenz's research on imprinting in newly hatched ducks and geese. Once out of the egg, the hatchling follows the first moving object it sees. This is usually the mother, but the hatchling will also follow a wooden decoy, a block of wood on wheels, or even a human -- provided that it is the first moving object that the bird sees. The emphasis on the "first" moving object is somewhat overstated, because there is a critical period for imprinting: the imprinted object must be present soon after hatching; if exposure to a moving object is delayed for several hours or days, imprinting may not occur at all. If imprinting occurs, the imprinted object will be followed even under adverse circumstances, over or around barriers, etc. When the imprinted object is removed from the bird's field of vision, the bird will emit a distress call. If imprinting has occurred to an unusual object, that object will be preferred to the bird's actual parent, or any other conspecific animal.

Link to a film of Lorenz demonstrating imprinting: http://www.youtube.com/watch?v=eqZmW7uIPW4.

The power and perils of imprinting are vividly illustrated by an incident that occurred in Spokane, Washington, in 2009. George Armstrong, a banker, had been watching a female duck nesting on a ledge outside his office window. In the usual course of events, the ducklings would hatch, imprint on their mother, and then follow her as she led them to water. But -- they're on a ledge! And they can't fly yet! The mother duck knew nothing of this. She's built to wait until her eggs have hatched, and then go to water; and ducklings are built to follow her. The mother jumped off the ledge and -- she's built for that, too -- flew down to the street. The ducklings were stranded. Armstrong went out on the street, stood below the ledge, and caught each of the ducklings as they stepped off the ledge, instinctually following their mother (actually, he had to collect a couple from the ledge). Then he served as a crossing guard while the mother collected her young and led them to water. The power of imprinting is that the ducklings will follow their mother -- or Konrad Lorenz -- everywhere. The peril of imprinting is that the behavior has been selected for a particular environmental niche -- in the case of ducks, the grassy area near water where they usually nest; if that environment changes, for whatever reason, the instinctive behavior may be very maladaptive.

Link to a video of Armstrong catching the ducks: http://blogs.abcnews.com/theworldnewser/2009/05/the-duck-parade.html (sorry about the ad).

There are actually two kinds of imprinting. 

Imprinting is extremely indiscriminate: basically, the bird imprints on the first object that moves within the critical period. However, other instincts are much more discriminating.

Another good example of an instinct is the alarm reaction in some birds subject to predation by other birds (studied by Tinbergen). If an object passes overhead, the birds will emit a distress call and attempt to escape. However, these birds do not show alarm to just any stimulus: it must have a birdlike appearance; moreover, birdlike figures with short (hawk-like) necks elicit alarm, while those with long (goose-like) necks do not (the length and shape of the tail and wings are largely irrelevant).








Imprinting and the alarm reaction involve, basically, only one organism. Other instincts involve the coordinated activities of two (or more) species members.

A good example is food-begging in herring gulls (studied by Tinbergen). Hatchling birds don't forage for their own food, but must be fed a predigested diet by their parents. But the parents do not do this of their own accord. Rather, the chick must peck at the parent's bill: the parent then regurgitates food, and presents it to the chick; the chick then grasps the food and swallows it. But the chick will not peck at just any bird's bill. Rather, the bill must have a patch of contrasting color on the lower mandible. The precise colors involved do not matter much, so long as the contrast is salient. Food-begging exemplifies the coordination of instinctive behaviors: the patch is the releasing stimulus for the hatchling to peck; and the peck is the releasing stimulus for the parent to present food.


An excellent example of a complex, coordinated sequence of instinctual behaviors is provided by the "zig-zag" dance, part of the mating ritual of the stickleback fish (Tinbergen).




A male stickleback, when it is ready to mate, develops a red coloration on its belly.

It then establishes its territory by fighting off other sticklebacks. But it fights only sticklebacks, not other species of fish; and only males; and only males who display red bellies and enter its territory in the head-down "threat posture" (other colorations indicate that the other male is not ready to mate; other postures indicate that the other male is only passing through the territory; in either case, there is no territorial fighting).

Experiments by Tinbergen, employing "dummy" models of fish, show that it actually doesn't matter much whether the other fish looks like a stickleback, so long as it has a red-colored belly. Sticklebacks without red bellies may enter the male's territory unchallenged, because they don't constitute threats.



Other experiments, in which fish were enclosed in capsules to control their orientation, show that a male who elicits aggression when it enters a territory with its head down will not elicit aggression if it enters the territory with its head level -- perhaps indicating that it is just "passing through".


After the territory has been cleared of threatening males, the male builds a nest out of weeds.

Then he entices a female into the nest -- but only a female stickleback who enters his territory with a swollen abdomen, and in the head-up "receptive posture".



The female enters the nest only if the male displays a red belly and performs a "zig-zag" dance.




Once in the nest, the female spawns eggs -- but only if she is stimulated at her hindquarters.


Once the eggs are laid, the female leaves the nest and the territory.

The male fertilizes the eggs, fans them to maintain an adequate oxygen supply around them, and cares for the young after hatching (until they're ready to go off to school).

When the young are hatched, the red belly fades, and the male no longer incites other males or attracts females -- until the next mating cycle starts.

Notice the serial organization to this pattern of stickleback behaviors. It is as if each act is the releasing stimulus for the next one. There is no flexibility in this sequence: once initiated, it does not stop, provided that the appropriate releasing stimulus is present. If any element in the sequence is left out, the entire sequence will stop abruptly. All three parties go through this pattern of behaviors, even if one of them doesn't remotely resemble a stickleback. For example, a female, ready to mate, will enter the nest if she observes a tongue depressor, painted red on one half, imitate the zig-zag dance!
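The serial organization described above can be pictured as a chain in which each step runs only if its releasing stimulus is present. The following is only a rough sketch, not Tinbergen's own model: the stage labels are paraphrased from the description above, and the names COURTSHIP_CHAIN and run_chain are hypothetical, introduced purely for illustration.

```python
# Illustrative sketch only: each stage pairs a releasing stimulus with the
# response it releases. Labels are paraphrased from the lecture text; the
# data structure and function names are hypothetical.
COURTSHIP_CHAIN = [
    ("female with swollen abdomen, head up", "male performs zig-zag dance"),
    ("red belly and zig-zag dance", "female enters nest"),
    ("stimulation at hindquarters", "female spawns eggs"),
    ("eggs in nest", "male fertilizes eggs"),
]


def run_chain(chain, stimuli_present):
    """Return the responses performed, in order, before the chain breaks.

    The sequence advances only while each stage's releasing stimulus is
    present; the first missing releaser halts everything that follows.
    """
    performed = []
    for releaser, response in chain:
        if releaser not in stimuli_present:
            break  # no releasing stimulus, so the sequence stops abruptly
        performed.append(response)
    return performed


# All releasers present: the whole sequence runs to completion.
all_stimuli = {releaser for releaser, _ in COURTSHIP_CHAIN}
full_run = run_chain(COURTSHIP_CHAIN, all_stimuli)

# Leave out one releaser: the sequence halts at that point.
broken_run = run_chain(COURTSHIP_CHAIN, all_stimuli - {"stimulation at hindquarters"})
```

In this sketch nothing checks what kind of object supplies the stimulus, which is one way to see why a red-painted tongue depressor can stand in for a male: only the presence of the releaser matters.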


Instincts in Humans?

Taxes and instincts are important elements in behavior, especially of invertebrates, birds, and reptiles. Some psychologists and behavioral biologists argue that much human behavior is also instinctual in nature. One of the first to make this argument was McDougall, who argued that human behavior was rooted in instinctual behaviors related to biological motives. One of his examples, which is offered here without comment (except to note that similar descriptions could be made of the behavior of men), is reminiscent (at least in tone) of what Tinbergen discovered in sticklebacks:

The flirting girl first smiles at the person to whom the flirt is directed and lifts her eyebrows with a quick, jerky movement upward so that the eye slit is briefly enlarged. Flirting men show the same movement of the eyebrows. After this initial, obvious, turning toward the person, in the flirt there follows a turning away. The head is turned to the side, sometimes bent toward the ground, the gaze is lowered, and the eyelids are dropped. Frequently, but not always, the girl may cover her face with a hand and she may laugh or smile in embarrassment. She continues to look at the partner out of the corners of her eyes and sometimes vacillates between looking at, and looking away.

Among modern biological and social scientists, this point of view is expressed most strongly by the practitioners of sociobiology, especially E.O. Wilson, who argue that much human social behavior is instinctive, and part of our genetic endowment. More recently, similar ideas have been expressed by proponents of evolutionary psychology such as Leda Cosmides, John Tooby, and David Buss. At their most strident, evolutionary psychologists claim that our patterns of experience, thought, and action evolved in an environment of evolutionary adaptedness (EEA) -- roughly the African savanna of the Pleistocene epoch, where Homo sapiens first emerged about 300,000 years ago -- and have changed little since then. Although this assertion is debatable, to say the least, the literature on instincts makes it clear that evolution shapes behavior as well as body morphology. Many species possess innate behavior patterns that were shaped by evolution, permitting them to adapt to a particular environmental niche. Given the basic principle of the continuity of species, it is a mistake to think that humans are entirely immune from such influences -- although humans have other characteristics that largely free us from evolutionary constraints. For a discussion of evolutionary psychology, see the lectures on Psychological Development.


Meanings of "Instinct"

The concept of instinct has had a difficult history in psychology, in part because early usages of the term were somewhat circular: some theorists seemed to invoke instincts to explain some behavior, and then to use that same behavior to define the instinct.  But, in the restricted sense of a complex, discriminative, innate response to some environmental stimulus, the term has retained some usefulness.  For example, the psychologist Steven Pinker has referred to language as a human instinct.

Nevertheless, the term instinct has evolved a number of different meanings, as outlined by the behavioral biologist Patrick Bateson (Science, 2002):

  1. present at birth (or at a particular stage of development);
  2. not learned;
  3. developed before it can be used;
  4. unchanged once developed;
  5. shared by all members of the species (at least those of the same sex and age);
  6. organized into a distinct behavioral system (e.g., foraging);
  7. served by a distinct neural (brain) module;
  8. adapted during evolution;
  9. differentiated across individuals due to their possession of different genes.

Bateson correctly notes that one meaning of the term does not necessarily imply the others.  Taken together, however, the various meanings capture the essence of what is meant by the term "instinct".


From Instinct to Learning

Innate response tendencies such as food-begging can be very powerful behavioral mechanisms, especially for invertebrates and non-mammalian vertebrate species.  In their natural environment, some species seem to live completely by virtue of reflex, taxis, and instinct.  


Limitations on Innate Behaviors

But at the same time, these innate behavioral mechanisms are extremely limited.  They have been shaped by evolution to enable the species to fit a particular environmental niche, which is fine so long as the niche doesn't change.  When the environment does change, evolution requires an extremely long time to change behavior (or body morphology, for that matter) accordingly -- much longer than the lifetime of any individual species member.

Consider, for example, the behavior of newborn sea turtles. Female turtles lay their eggs on the beach above the tide line, and these eggs hatch at night in the absence of the parents. As soon as they have hatched, the hatchlings begin walking toward the water (what you might call a "positive aquataxis"): when they reach it, they begin to swim (another innate behavior), and live independently. However, the young turtles are not really walking toward the water: they are walking toward the reflection of the moon on the water (thus, a positive phototaxis). This hatching behavior evolved millions of years ago. Since then, however, the beaches where the turtles hatch have become crowded with hotels, marinas, oil refineries, and other light sources. Accordingly, these days, the hatchling turtles will also move toward these light sources, and die before they ever reach water. The animals' behavior evolved when the only light in the environment was from the sun and the moon, and they just don't know any better. In order to prevent a disaster, beach-side hotels and oil refineries now take steps to employ different kinds of light, or block their lights entirely.

Now perhaps, there is some subtle difference (like polarization) between moonlight and electrical light. If so, individual animals who can make this distinction, moving toward one and not the other, will survive, reproduce, and, over time, generate more individuals who can make this distinction. But again this takes time -- assuming that any individual can make the distinction in the first place. But even so, each individual gets only one chance. If it makes the right "choice", this behavioral tendency will pass on to successive generations, and the species may eventually come to distinguish between "good" and "bad" light -- provided that the species doesn't go extinct first. But that just illustrates the point that evolved behavior patterns take a very long time to change.

In June 2011, a group of diamondback terrapins caused the temporary shutdown of Runway 4 Left at New York's Kennedy International Airport. And it's happened before. The runway crosses a path that the turtles take from Jamaica Bay on one side to lay their eggs on the sandy beach on the other side. Usually, in egg-laying season, the runway is not in frequent use, due to prevailing winds. But that day was an exception, and the turtles brought takeoffs and landings to a halt for about an hour until they could be moved to their destination (we don't know what happened when they tried to get back in the water). It's another example of the difficulty that animals have in adjusting evolved patterns of behavior to rapidly changing environmental circumstances. (See "Delays at JFK? This Time, Blame the Turtles" by Andy Newman, New York Times 06/30/2011).

Here's another example: seabirds, like albatrosses, feed their young through the same sort of instinctual food-begging shown by herring gulls. Adult albatrosses forage over open water, dive to catch fish swimming near the surface, and then regurgitate the fish into the mouths of their young. But it's not only fish that are near the surface. There's a lot of garbage in the ocean, as well. The birds don't know the difference -- they're operating solely on instinct. That garbage is of relatively recent vintage, so there hasn't been enough time -- assuming it were even possible -- for the birds to evolve a distinction between fish and garbage. The result is that adult albatrosses pick up garbage and regurgitate it into the bills of their chicks, who promptly die of starvation -- such as an albatross chick photographed on Midway Atoll in the Pacific.

And here's yet another example, a little closer to home. Wind farms like the one in Altamont Pass produce a large amount of electrical energy for California, reducing carbon emissions from coal-fired plants, and our dependence on Middle East oil. But they also create a hazard for birds, especially raptors, who like to forage for small mammals over open areas. Never mind that wind farms are built where there is strong, steady wind, and therefore often on migratory flight paths. The result is that a large number of raptors and other birds are killed every year because they run into the blades of the windmills.

In general, we can identify several limitations on innate response patterns:

Thus the problem: everyday life requires many organisms to go beyond simple, innate patterns of behavior, and acquire new responses to new stimuli in their environment.  


Evolutionary Traps

Ecologists and evolutionary biologists are becoming increasingly aware of the problems caused by rapid environmental change. The United Nations Summit on Sustainable Development, held in Johannesburg, South Africa, in 2002, drew international attention to the fact that "nature", far from being "natural", has in fact been remade by human hands. According to Andrew C. Revkin, "People have significantly altered the atmosphere, and are the dominant influence on ecosystems and natural selection" (see his article, "Forget Nature. Even Eden is Engineered", and other articles in a special section on "Managing Planet Earth", New York Times, 08/20/02). Even in the early part of the 20th century, Revkin notes, the geochemist Vladimir I. Vernadsky had suggested that "people had become a geological force, shaping the planet's future just as rivers and earthquakes had shaped its past". Now in the 21st century, with the growth of megacities, the increase in population, and the disappearance of the forests, to name just a few trends, we are beginning to recognize, and deal with, the impact of human activity on the environment.

The human impact on the environment doesn't just affect the conditions of human existence.  Nature is a system, and what we do affects animal and plant life as well, and sometimes in non-obvious ways.  

In a recent paper in Trends in Ecology & Evolution (10/02), Paul W. Sherman and his colleagues, Martin A. Schlaepfer and Michael C. Runge, detail a number of "evolutionary traps", mostly caused by the impact of human activity which alters the natural environment -- activity which goes beyond the simple destruction of habitat, which would be bad enough.  More subtle changes alter the environment in such a way that a species' evolved patterns of behavior are no longer adaptive, reducing the chances of individual survival and reproduction, and eventually leading to the decline and extinction of the species as a whole.  As Sherman puts it, "Evolved behaviors are there for adaptive reasons.  If we [disrupt] the normal environment, we can drive a population right to extinction" ("Trapped by Evolution" by Lila Guterman, Chronicle of Higher Education, 10/18/02).  

The concept of evolutionary trap is a variant on the more established notion of an ecological trap, in which animals are misled, through human environmental change, to live in less-than-optimal habitats, even though more suitable habitats are available to them.  For example, Florida's manatees have progressively moved north, attracted by the warm water discharged by power plants; but when the plant goes down for maintenance, the water cools to an extent that they can no longer survive in it.

Some examples of evolutionary traps:

  • The male buprestid beetle (Julodimorpha bakewelli) of Australia recognizes the female of its species as a brown, shiny object with small bumps on its surface.  However, this is also what some Australian beer bottles look like.  Accordingly, males will frequently be found attempting to mate with beer bottles, instead of with more appropriate partners.  The solution is to get Australians not to litter.
  • American wood ducks, Aix sponsa, build nests in the cavities of dead trees.  When wildlife managers constructed nesting boxes for them, in an attempt to help them cope with habitat loss, the population actually declined.  The reason is that female wood ducks adapted to the loss of natural nesting places by following each other to the few sites that were still available.  When the artificial nesting boxes appeared, they all gravitated to the same ones, and laid too many eggs in individual boxes to incubate properly.  The solution was to hide the boxes in the woods, increasing the likelihood that individual ducks would find their own nesting sites.
  • Male Cuban tree frogs, Osteopilus septentrionalis, attempt to mate with females that are actually roadkill (at least they don't move!).  Not only does this increase the chance that they themselves will be run over by cars and trucks, but of course the exercise yields no offspring. 
  • Due to global warming, yellow-bellied marmots, Marmota flaviventris, come out of hibernation too early in the season for food to be available, and so many will starve.


Learning Defined

In vertebrates, and especially mammalian species, everyday action goes beyond such innate behavior patterns. These organisms can also acquire new patterns of behavior through learning.

Psychologists define learning as:

a relatively permanent change in behavior that occurs as a result of experience.

This definition excludes changes in behavior that occur as a result of insult, injury, or disease, the ingestion of drugs, or maturation.  Learning permits individual organisms, not just entire species, to acquire new responses to new circumstances, and thereby to add behaviors to the repertoire created by evolution.  In addition, social learning permits one individual species member to share learning with others of the same species (this is one definition of culture).  The pace of social learning far outstrips that of evolution, so that learning provides a mechanism for new behavioral responses to spread quickly and widely through a population.  Although all species are capable of learning, at least to some degree, learning is especially important in the natural lives of vertebrate species, and especially in mammalian vertebrates.  Like us.  And, it turns out, most human learning is social learning: we learn from each other's experiences, and we have even developed institutions, like libraries and schools, that enable us to share our knowledge with each other.


For a good treatment of instinctual behavior, see N. Tinbergen, The Study of Instinct (1969). 

For a positive treatment of sociobiology, see E.O. Wilson, Sociobiology: The New Synthesis (1975).  

For extensions of sociobiology to psychology, see The Adapted Mind: Evolutionary Psychology and the Generation of Culture edited by Jerome H. Barkow, Leda Cosmides, and John Tooby (1992), and Evolutionary Psychology: The New Science of the Mind (1999) by David M. Buss.


Classical Conditioning

One important form of learning, classical conditioning, was accidentally discovered by Ivan P. Pavlov, a Russian physiologist who was studying the physiology of the digestive system in dogs (work for which he won the Nobel Prize in Physiology or Medicine in 1904). Pavlov's method was to introduce dry meat powder to the mouth of the dog, and then measure the salivary reflex which occurs as the first step in the digestive process. Initially, Pavlov's dogs salivated only when the meat powder was actually in their mouths. But shortly, they began to salivate before the powder was presented to them -- just the sight of the powder, or the sight of the experimenter, or even the sound of the experimenter walking down the hallway, was enough to get the dogs to salivate. In some sense, this premature salivation was a nuisance. But Pavlov had the insight that the dogs were salivating to events that were somehow associated with the presentation of the food. Thus, Pavlov moved away from physiology and initiated the deliberate study of the psychic reflex -- not, as the term might suggest, something out of the world of parapsychology, but rather a situation where the idea of the stimulus evokes a reflexive response. Pavlov called these responses conditioned (or conditional) reflexes.

In honor of Pavlov's discovery, this form of learning is now called "classical" conditioning. A classical conditioning experiment involves the repeated pairing of two stimuli, such as a bell and food powder. One of these stimuli naturally elicits some reflex, while the other one doesn't. With repeated pairings, the previously neutral stimulus gradually acquires the power to evoke the reflex. Thus, classical conditioning is a means of forming new associations between events (such as the ringing of a bell and the presentation of meat powder) in the environment.

The apparatus for Pavlov's experiments included a special harness to restrict the dog's movement, a tube (or fistula) placed in its mouth to collect saliva, a mechanical device for introducing meat powder to its mouth, and some kind of signal such as a bell. (Some writers have questioned whether Pavlov actually used a bell, as the myth has it. Pavlov was actually unclear on this detail in his own writing. But a 1997 article by the American psychologist R.K. Thomas documented this historical tidbit conclusively.)


The Basic Vocabulary of Classical Conditioning

The procedure just described illustrates the basic vocabulary of classical conditioning:

The process by which a conditioned stimulus acquires the power to evoke a conditioned response is known as acquisition.  In traditional accounts of conditioning, acquisition of the CR occurs by virtue of the reinforcement of the CS by the subsequent US.  The strength of the CR is measured in various ways:
On the initial acquisition trial, when the CS and the US are paired for the very first time, there is only an unconditioned response to the US; there is no conditioned response to the CS.

On later trials, we begin to observe a response that resembles the UR, occurring after presentation of the CS but before presentation of the US. This is the first appearance of the CR.

Even later, we may observe the CR immediately after the presentation of the CS, well before the presentation of the US.


The characteristic curve portraying the acquisition of the CR is an ogive, in which there is a slow increase in response strength on the initial trials, followed by a rapid increase in middle trials, and a further slow increase in the last trials.
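One hedged way to picture the ogive is with a logistic function. The sketch below is only an illustration, not a model fitted to Pavlov's data: the function name cr_strength and the midpoint, steepness, and asymptote values are assumptions, chosen so that the S-shape is visible over 20 trials.

```python
import math

def cr_strength(trial, midpoint=10.0, steepness=0.5, asymptote=1.0):
    """Logistic (S-shaped) approximation of CR strength on a given trial.

    All three parameters are illustrative assumptions, not empirical values.
    """
    return asymptote / (1.0 + math.exp(-steepness * (trial - midpoint)))

# CR strength on trials 0..20, and the gain from each trial to the next.
curve = [cr_strength(t) for t in range(21)]
gains = [later - earlier for earlier, later in zip(curve, curve[1:])]

for trial, strength in enumerate(curve):
    print(f"trial {trial:2d}: CR strength {strength:.3f}")
```

In this sketch the trial-to-trial gains are small on the first and last trials and largest near the midpoint, which is the slow-fast-slow pattern described above.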

More on this later.

Actually, learning can occur even before a CS is paired with a US. When a novel stimulus (NS), such as Pavlov's bell, is presented for the very first time, the organism will show a reflexive orienting response (OR) -- perhaps a startle response -- to that stimulus. But if that stimulus is presented repeatedly, all by itself, the magnitude of the OR will progressively diminish. This is known as habituation. It counts as learning because there is a change in behavior -- in this case, a change in the OR -- that occurs as a result of experience. Habituation is the very simplest form of learning, and has been observed in animals as simple as protozoa (Penard, 1947) -- and since protozoa are one-celled creatures, you can't get any simpler than that!

If the NS is now paired with a US, so that the NS becomes a CS, conditioning will occur. However, the CR will be acquired at a slower rate than if there had been no prior habituation trials. This phenomenon is known as latent inhibition (Lubow & Moore, 1959).

Extinction is the process by which the CS loses the power to evoke the CR.  Extinction occurs by virtue of unreinforced presentations of the CS -- that is, presentation of the CS alone, without subsequent presentation of the US.  When the CS is no longer paired with the US, the CR loses strength relatively rapidly.

On the first extinction trial, there is a strong CR: after all, the organism does not yet "know" that the US has been omitted.

On later trials, the magnitude of the CR falls off, until the CR disappears entirely.

On extinction trials, the CR loses strength relatively rapidly. But it is not lost entirely, and it is possible to demonstrate that the CR is still present, in a sense, even after it seems to have disappeared.


Habituation can be thought of as a special case of extinction, in that the organism learns not to respond to the NS.

Spontaneous recovery is the unreinforced revival of the conditioned response.  If, after extinction has been completed, we allow the animal a period of inactivity, unreinforced presentation of the CS will evoke a CR.  This CR will be smaller in magnitude than that observed at the end of the acquisition phase, but CR strength will increase with the length of the "rest" interval.  

If we continue with unreinforced presentations of the CS, the spontaneously recovered CR will diminish in strength -- it is extinction all over again.

If we continue with new reinforced presentations of the CS, the CR will grow in strength. The reacquisition of a previously extinguished CR is typically faster than its original acquisition, a difference known as savings in relearning.


During extinction, formal extinction trials can continue after the CR has disappeared, a situation known as extinction below zero.   Of course, there is no further visible effect on the CR -- it is already at zero strength. However, extinction below zero has two palpable consequences: spontaneous recovery is reduced (though not eliminated), and reacquisition is slower (but still possible).

Spontaneous recovery, savings in relearning, and extinction below zero, have important implications for our understanding of the nature of extinction. Extinction is not the passive loss of the CR: the organism does not "forget" the original association between CS and US, and extinction does not return the organism to the state it was in before conditioning occurred. Spontaneous recovery and savings in relearning are expressions of memory, and they show clearly that the association between CS and US has been retained, even though it is not always expressed in a CR. Rather, it seems clear that the CR is retained but actively suppressed.  Extinction does not result in a loss of the CR, but rather imposes an inhibition on the CR. The strength of the inhibition grows with trials, producing the phenomenon of extinction below zero. The inhibition also dissipates over time, producing spontaneous recovery. Thus, reacquisition isn't really relearning. Rather, it is a sort of disinhibition. Both acquisition and extinction, learning and unlearning, are active processes by which the organism learns the circumstances under which the CS and the US are linked.
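The inhibition account can be made concrete with a toy calculation. What follows is a minimal sketch under loudly stated assumptions, not an established model: the expressed CR is taken to be association strength minus inhibition (never below zero), each extinction trial is assumed to add a fixed amount of inhibition, and inhibition is assumed to decay exponentially during rest. All function names and numbers are hypothetical.

```python
def net_cr(excitation, inhibition):
    """Expressed CR: association strength minus inhibition, floored at zero."""
    return max(0.0, excitation - inhibition)

def extinguish(inhibition, n_trials, per_trial=0.1):
    """Assumption: each unreinforced CS presentation adds a fixed amount of inhibition."""
    return inhibition + per_trial * n_trials

def rest(inhibition, duration, retained_per_unit=0.5):
    """Assumption: inhibition dissipates exponentially over a rest interval."""
    return inhibition * retained_per_unit ** duration

EXCITATION = 1.0  # the CS-US association itself is left intact by extinction

to_zero = extinguish(0.0, 10)     # just enough trials to bring the CR to zero
below_zero = extinguish(0.0, 15)  # five further trials: extinction below zero

for label, inhibition in [("to zero", to_zero), ("below zero", below_zero)]:
    for duration in (0, 1, 2):
        cr = net_cr(EXCITATION, rest(inhibition, duration))
        print(f"extinction {label}, rest {duration}: CR = {cr:.2f}")
```

Under these assumptions the CR is absent immediately after extinction, returns after a rest, grows with the length of the rest, and returns less strongly after extinction below zero, which is the qualitative pattern described above.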

Other major phenomena of classical conditioning can be observed once the conditioned response has been established. For example, the organism may show generalization of the CR to new test stimuli, other than the original CS, even though there have been no acquisition trials on which these new stimuli have been associated with the US. The extent to which generalization occurs is a function of the similarity between the test stimulus and the original CS.

The generalization gradient is an orderly arrangement of stimuli along some physical dimension (such as the frequency of an auditory stimulus). The more closely the test stimulus resembles the original CS, the greater the CR will be. The generalization gradient provides one check on generalization: having been conditioned to respond to one stimulus, the organism will not respond to any and all stimuli. Response is greatest to test stimuli that most closely resemble the original CS.
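A generalization gradient can be sketched as a bell-shaped function of the distance between the test stimulus and the original CS. The Gaussian shape, the 250 Hz training tone, and the width parameter below are illustrative assumptions only; real gradients need not be symmetric or Gaussian.

```python
import math

def generalized_cr(test_hz, cs_hz=250.0, peak=1.0, width_hz=60.0):
    """Hypothetical CR strength to a test tone, given a CS of cs_hz.

    Assumes response strength falls off as a Gaussian function of the
    frequency difference; width_hz is an arbitrary illustrative value.
    """
    distance = test_hz - cs_hz
    return peak * math.exp(-(distance ** 2) / (2 * width_hz ** 2))

for tone in (150, 200, 250, 300, 350):
    print(f"{tone} Hz: CR strength {generalized_cr(tone):.2f}")
```

In this sketch the response is strongest at the trained frequency and weakens as the test tone moves away from it in either direction.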


Generalization, Frequency, and Musical Pitch

In discussing generalization of response among stimuli, it is easiest to use the example of the frequency of tones, because differences in frequency -- whether a tone is high or low -- are easy to appreciate. And the example is accurate so far as it goes. If you condition an animal to a tone CS of 250 cycles per second (cps; also known as hertz, abbreviated Hz, after the physicist Heinrich Rudolf Hertz, 1857-1894), it will emit a stronger conditioned response to a tone of 300 Hz than to one of 350 Hz -- because a tone of 300 Hz more closely resembles a tone of 250 Hz than does a tone of 350 Hz.

With humans, though, things can get a little more complicated, because musical pitch is also related to the frequency of tones, but similarity among pitches is not just a matter of relative frequency.

  • Tones that are an octave apart, such as Middle C and third-space C on the treble clef, are perceived as more similar than any other pair of tones.
  • Tones that are a perfect fifth apart, such as Middle C and second-line G on the treble clef, are also perceived as highly similar.
  • And tones that are a major third apart, such as Middle C and first-line E on the treble clef, are also perceived as similar, though not as similar as those separated by an octave or a perfect fifth.

Thus, when tones are presented in the context of the diatonic scale familiar in Western music, the generalization gradient may be distorted by the vicissitudes of pitch similarities.

Consider an experiment in which a subject is initially conditioned to respond to a tone of 262 Hz, roughly corresponding to Middle C. Such a subject may well show larger conditioned responses to tones of 524 Hz (roughly third-space C), 392 Hz (second-line G), and 330 Hz (first-line E), than to either B-flat (233 Hz) or D (294 Hz), even though the former tones are more distant from the original CS, in terms of frequency, than the latter.
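The frequencies in this example can be checked with a little arithmetic. The sketch below assumes twelve-tone equal temperament with the A above Middle C tuned to 440 Hz, which is a common convention but an assumption here. The helper name note_hz is hypothetical.

```python
A4_HZ = 440.0  # assumed tuning reference (A above Middle C)

def note_hz(semitones_from_middle_c):
    """Equal-tempered frequency of a note, counted in semitones from Middle C.

    Middle C lies 9 semitones below the A used as the tuning reference.
    """
    return A4_HZ * 2 ** ((semitones_from_middle_c - 9) / 12)

tones = {
    "B-flat below": -2,
    "Middle C": 0,
    "D": 2,
    "E (major third)": 4,
    "G (fifth)": 7,
    "C (octave)": 12,
}

middle_c = note_hz(0)
for name, semitones in tones.items():
    hz = note_hz(semitones)
    print(f"{name:>16}: {hz:6.1f} Hz, {abs(hz - middle_c):6.1f} Hz from Middle C")
```

The printout makes the point of the example: B-flat and D are the nearest tones to Middle C in frequency, while the E, G, and upper C are farther away in frequency but closer in musical terms.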

However, this may only occur if we establish a musical context for the tones in the first place -- for example, by embedding the C in the other pitches of the diatonic scale.  Or by beginning the experiment by playing a tune in the key of C major.  There are some experiments here....


Discrimination provides a further check on generalization. Consider an experiment in which we present two previously neutral stimuli: one, the CS+, is always reinforced by the unconditioned stimulus; the other, the CS-, is never reinforced. As conditioning proceeds, the CS+ will come to elicit the CR, but the CS- will not acquire this power. If the CS+ and CS- are close to each other on the generalization gradient, both will initially elicit a conditioned response. But as conditioning proceeds, the CR to the CS+ will grow in strength, while the CR to the CS- will extinguish. The CR is only elicited by CSs that are actually associated with the US.

Habituation is a very primitive form of learning.


Conditioned responses can also appear to new stimuli even if those stimuli are very dissimilar to the original conditioned stimulus. Consider the phenomenon known as sensory preconditioning, which occurs before acquisition trials in which a CS is paired with a US.

Something similar happens in higher-order conditioning, except that the first two phases are reversed, so that higher-order conditioning occurs after acquisition trials in which the CS is paired with the US.

The Scope of Classical Conditioning

By means of acquisition, extinction, generalization, discrimination, sensory preconditioning, and higher-order conditioning, stimuli come to evoke and inhibit reflexive behavior even though they may not have been directly associated with an unconditioned stimulus.  By means of classical conditioning processes in general, reflexive responses come under the control of environmental events other than the ones with which they are innately associated.

The phenomena of classical conditioning are ubiquitous in nature, occurring in organisms as simple as the sea mollusk and as complicated as the adult human being.  Pavlov himself thought that all learning entailed classical conditioning, but this position is too extreme. Still, classical conditioning is important because, in a very real sense,

The laws of classical conditioning are the laws of emotional life.  

Classical conditioning underlies many of our emotional responses to events -- our fears and aversions, our joys and our preferences.  


The Physiological Basis of Learning

The ability to learn -- to change one's behavior as a result of experience -- obviously must reflect changes in the organism's nervous system, and indeed the ability to learn is an important example of the plasticity of the nervous system -- the ability of the nervous system to be modified. But what exactly is going on in the nervous system when an organism learns something?

The fact that at least some phenomena of classical conditioning can be observed in every organism that has a nervous system has allowed behavioral neuroscientists to gain important insight into precisely how the nervous system is modified when organisms learn something. In work that won the Nobel Prize in Physiology or Medicine in 2000 (shared with Arvid Carlsson and Paul Greengard), Eric Kandel of Columbia University examined synaptic changes in the marine mollusk, Aplysia, as it acquired a simple conditioned response.

The most important of these changes is long-term potentiation, an increase in the sensitivity of a postsynaptic neuron as a result of repeated stimulation by a presynaptic neuron. This is the neural representation of a simple association -- an association between neurons that is created as a result of repeated pairing of CS and US.


Instrumental Conditioning

At roughly the same time as Pavlov was beginning to study classical conditioning, E.L. Thorndike, an American psychologist at Columbia University, was beginning to study yet another form of learning -- what has come to be known as instrumental conditioning. Beginning in 1898, Thorndike reported on a series of studies of cats in "puzzle boxes". The animals were confined in cages whose doors were rigged to a latch which could be operated from inside the cage. The animal's initial response to this situation was agitation -- particularly if it was hungry and a bowl of food was placed outside the cage. Eventually, though, it would accidentally trip the latch, open the door, and escape -- at which point it would be captured and placed back in the cage to begin another trial.

Over successive trials, Thorndike observed that the latency of the escape response progressively diminished. Apparently, the animals were learning how to open the door -- a learning which seemed to be motivated by reward and punishment.


On the basis of his studies of cats in puzzle boxes, Thorndike formulated a set of 8 Laws of Learning, of which three are particularly important for our purposes:

For the record, the other laws were:
The general principle of instrumental conditioning is that adaptive behavior is learned through the experience of success and failure.  Instrumental learning is also sometimes called operant conditioning, because the organism "operates" on the environment, changing it in some way (for example, changing the cage from one whose door is closed to one whose door is open), and this behavior is "instrumental" in obtaining some desired state of affairs (like food or simply escape from confinement).

Beginning in the 1930s, the study of instrumental conditioning was taken up by B.F. Skinner, a radical behaviorist. Behaviorism was a school of psychology founded by John B. Watson, then at Johns Hopkins University, who believed that psychology could become a legitimate science only by eliminating references to hypothetical mental states (which cannot be publicly observed) and confining the analysis to the relations between publicly observable behavior and the publicly observable environmental conditions under which it is observed. (Watson was forced to resign from Hopkins over a sexual scandal, and went on to a career in advertising. He invented the notion of the "coffee break" as a promotion for Maxwell House Coffee.) Like Watson, Skinner thought that behavior could be, and should be, explained solely in terms of the associations between stimuli and responses, and without reference to hypothetical states (such as hunger) existing in a hypothetical mind of an organism (including humans). Thus the term S-R behaviorism. Skinner was something of a visionary, and he is famous for his utopian novel, Walden Two, which describes a community organized along behaviorist lines (he was an English major in college, contemplated a career as a writer, and indeed wrote some very beautiful stuff); and for his meditation on human nature, Beyond Freedom and Dignity. Both are very provocative books. A collection of Skinner's scientific papers, most of which are very readable, is entitled Cumulative Record.


A Note on Two "Functionalisms"

Tracing the relations between environmental stimuli (inputs) and organismic responses (outputs) is often called functional behaviorism, or simply functionalism, but this brand of functionalism (which is currently popular among some philosophers of mind and some theorists in artificial intelligence, a branch of cognitive science) should be clearly distinguished from the 19th-century "Chicago functionalism" of John Dewey and James Rowland Angell (Angell was, however, Watson's graduate mentor), which had its roots in the work of William James and which underlies this course.  

Skinner refined Thorndike's apparatus into what has become known as the Skinner box, though Skinner himself did not use the term and actually disliked it. He preferred the term operant chamber. A generic operant chamber, intended to house an animal during learning trials, includes lights for presenting signals, levers or keys for collecting responses, a hopper for presenting food pellets, and a floor grid for presenting electrical shock.


The "Superstition" Experiment

B.F. Skinner demonstrated the power of Thorndike's Law of Effect with the following classic "superstition" experiment. A food-deprived (remember, if you're a behaviorist you can't say hungry) pigeon was placed in an operant chamber. As pigeons are wont to do, it displayed a variety of random pigeon behaviors: it wandered around the chamber, it groomed itself, it flapped its wings and stretched its neck, it cooed, and it pecked at various locations. Every 30 seconds, a food pellet was dropped into the hopper of the operant chamber; this occurred regardless of the pigeon's behavior. Over trials, each bird developed a stereotyped pattern of behavior, but the precise nature of this pattern was different for each bird. The only regularity was this: whatever behavior had been emitted at the time that the first pellet dropped now began to occur more frequently.

This is a classic illustration of the Law of Effect. Initially, the association between behavior and reward was purely accidental. Nevertheless, following the principle that rewarded responses are strengthened, while unrewarded and punished responses are weakened, that particular behavior began to occur more frequently. Therefore, the bird was more likely to be displaying that behavior the next time a food pellet dropped into the hopper. So, that behavior was strengthened even more. Eventually, whatever behavior had originally coincided with reinforcement came to dominate the behavior of that individual bird -- all because of an initially accidental link between behavior and reward.
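The feedback loop just described can be imitated with a small simulation. This is a toy sketch, not Skinner's procedure or data: the behavior names, the starting strengths, the size of the boost, and the rule that a behavior's probability is proportional to its strength are all assumptions made for illustration.

```python
import random

def superstition(behaviors, n_seconds=3000, pellet_every=30, boost=5.0, seed=0):
    """Toy Law-of-Effect loop: food arrives on a fixed clock, regardless of behavior."""
    rng = random.Random(seed)
    strength = {b: 1.0 for b in behaviors}  # all behaviors start out equally likely
    names = list(strength)
    for second in range(1, n_seconds + 1):
        weights = [strength[b] for b in names]
        emitted = rng.choices(names, weights=weights)[0]
        if second % pellet_every == 0:
            # Whatever happened to be going on when the pellet dropped is strengthened.
            strength[emitted] += boost
    return strength

behaviors = ["wander", "groom", "flap wings", "coo", "peck"]
for bird in range(3):
    final = superstition(behaviors, seed=bird)
    print(bird, max(final, key=final.get), final)
```

Running the loop with different seeds stands in for different birds: each run tends to end up favoring some behavior, and which one it is depends on the accidents of the early pellets.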


And the "Air Crib"

There's a kind of urban legend circulating that Skinner raised his children in an infant-sized Skinner box: it's not true. Skinner, an inveterate tinkerer, did invent what he called the "Air Crib", a climate-controlled environment which he hoped would ease some of the burdens of child-rearing and foster child development. The Air Crib looked like a regular, if somewhat large, crib. It had a ceiling, three opaque walls, and a glass pane which could be opened to move the infant in and out. There were controls for temperature and humidity, a canvas floor, and sheeting which could be removed and washed when soiled. In this way, the infant had considerable freedom of movement. Skinner publicized the Air Crib in an article in the Ladies' Home Journal entitled "Baby in a Box: The Mechanical Baby-Tender" (1945). It has been estimated that at least 300 infants were raised in a version of the Air Crib (see Robert Epstein, "Babies in Boxes", Psychology Today, 1995). And contrary to rumors that his daughter Deborah eventually sued her father and committed suicide, she was alive and well in 2004, when she wrote a newspaper Op-Ed piece in the (Manchester) Guardian that was very positive about both Skinner and the device.

The Vocabulary of Instrumental Conditioning

The experiment described above illustrates the basic vocabulary of instrumental conditioning, whose terms largely parallel that of classical conditioning -- though be careful, because their meaning sometimes changes slightly.

Reinforcement (Rft) is an event which increases the strength (probability) of the behavior (the conditioned response) which preceded it.

Note that "positive" and "negative" do not necessarily mean "pleasant" (e.g., food) and "aversive" (e.g., shock). As it happens, positive reinforcers are typically pleasant (presentation of food is a good thing if you're a food-deprived pigeon); but then again, so are negative reinforcers (termination of shock is also a good thing). Reinforcers always increase the probability of the behavior being reinforced. This is the hardest thing about instrumental conditioning to get straight, because it is the most counterintuitive use of language. Blame Skinner, don't blame me.  When most people think of "negative reinforcement", they really mean "punishment".  Punishment has a technical meaning in the literature on instrumental conditioning, as it entails the presentation of a negative reinforcer.

A conditioned response (CR) is the behavior which is strengthened by reinforcement. The strength of the CR is usually indicated by response rate, or the frequency with which the organism displays the behavior.

A conditioned stimulus (CS) is an environmental event which leads to the performance of a conditioned response. Put another way, the CS is a signal or cue that the CR will be reinforced. Sometimes, as in Phase 2 of the typical experiment described above, the CS is the operant chamber itself. That is, the presence of the pigeon in the chamber is a cue that key-pecking will produce food. Other times, as in Phase 3 described above, the CS is some discrete feature of the environment -- such as a lighted key, or a buzzer or tone.

These technical definitions of CS and CR give us the term stimulus-response (or S-R) learning theory. The animal learns that emitting the CR (key-pecking) in the presence of the CS (the illuminated key) leads to reinforcement (food in the hopper). Or, to be a strict, radical, Skinnerian, functional behaviorist, reinforcement of the CR in the presence of the CS leads to an increase in the rate of the CR.  

Classical conditioning can also be described in S-R terms.  The key is to remember how instrumental conditioning defines reinforcement -- as any stimulus that increases the likelihood of the conditioned response.  Thus, in classical conditioning, the CR (e.g., salivating) is reinforced by the US (meat powder) in the presence of the CS (the bell).  By virtue of this reinforcement, the CR comes to be emitted in the presence of the CS.

Note that in instrumental conditioning there is no discussion of unconditioned stimuli or unconditioned responses. This is because the behaviors in question are not reflexive in nature, as they are in classical conditioning. Rather, these behaviors are emitted spontaneously by the organism. They are what we ordinarily call voluntary, as opposed to the involuntary behaviors involved in classical conditioning -- except that radical behaviorists like Skinner didn't like to talk about "voluntary" responses, or anything else that smacked of "free will", because they felt that all behaviors were under control of environmental stimuli and reinforcements.

The Phenomena of Instrumental Conditioning

Similarly, the major phenomena of instrumental conditioning parallel the classical case.

Schedules of Reinforcement

To a great degree, the major phenomena of instrumental conditioning parallel those observed in the classical case: acquisition, extinction, generalization, and discrimination. However, studies of instrumental conditioning also illustrate a new concept: schedules of reinforcement, each schedule resulting in a different pattern of behavior.

The term refers to the contingent relationship between the organism's emission of its response and the environment's delivery of reinforcement. In the continuous case, reinforcement is delivered after every CR. In the partial case, reinforcement is occasionally withheld. Partial reinforcement retards acquisition, but it also increases resistance to extinction. 

Continuous and partial reinforcement are also terms that occur in the vocabulary of classical conditioning, and they have the same effects. But there is another category of reinforcement schedules, intermittent reinforcement, that is unique to instrumental conditioning. There are four general types of intermittent schedules of reinforcement.

The Cumulative Record

In textbook figures that depict the effects of various schedules of reinforcement, the organism's cumulative responses are plotted as a function of time (plotted on the horizontal or X axis).  This is known as a cumulative record of responses.  Every time the organism makes a response, the line moves up a notch on the vertical (Y) axis. Thus, a horizontal tracing means that the organism has made no responses. The slope of the tracing indicates the response rate: shallow slopes indicate a slow rate of response (relatively few responses per unit time), while steep slopes indicate a relatively rapid response rate (relatively many responses per unit time). 
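A cumulative record is easy to build from a list of response times. The sketch below uses made-up response times and hypothetical helper names (cumulative_record, slope). It simply counts responses up to each sampled moment and reads the response rate off the slope.

```python
def cumulative_record(response_times, duration, step=1):
    """Sample the running total of responses every `step` time units."""
    times = sorted(response_times)
    record, count, i = [], 0, 0
    for k in range(int(duration // step) + 1):
        t = k * step
        # Each response moves the tracing up one notch on the Y axis.
        while i < len(times) and times[i] <= t:
            count += 1
            i += 1
        record.append((t, count))
    return record

def slope(record, start, end):
    """Response rate (responses per unit time) between two sampled times."""
    totals = dict(record)
    return (totals[end] - totals[start]) / (end - start)

# Hypothetical session: sparse responding at first, then rapid responding.
times = [2, 6, 10, 11, 12, 13, 14, 15]
record = cumulative_record(times, duration=15)
print(record)
print("early rate:", slope(record, 0, 10))
print("late rate:", slope(record, 10, 15))
```

The early stretch of this made-up record has a shallow slope and the late stretch a steep one, corresponding to slow and rapid responding respectively.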

B.F. Skinner invented the cumulative record technique, and the term served as the title for a collection of his scientific papers.

Each schedule of reinforcement produces its own characteristic pattern of behavior. For example, DRL schedules typically produce a string of "ritualistic" responses, that are ineffective in terms of controlling reinforcement but nevertheless effectively fill the long interval between reinforcements.

Both features are eliminated by switching from fixed to variable schedules, which produce constant, stable rates of response.
Both VR and VI schedules are highly aversive for the organism being conditioned.


The Matching Law and the Monty Hall Problem

Animals (and humans) can also be put on concurrent schedules of reinforcement. For example, pecking a green key might be reinforced on a VI5 schedule, while pecking a red key might be reinforced on a VI10 schedule. In such cases, the organism will distribute its responses between the two keys in proportion to their rate of reinforcement -- for example, pecking the green key (which pays off, on average, twice as often) about twice as frequently as the red key. The fact that animals will distribute their responses in proportion to the rate at which those responses are reinforced is called the matching law, which was first announced by Richard Herrnstein (1970), B.F. Skinner's protege at Harvard; see also the review by Peter deVilliers (1977) -- who was, in turn, Herrnstein's protege.
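The arithmetic of matching can be sketched in a few lines. The example assumes strict matching and assumes that a VI schedule's number is the mean interval between available reinforcements, so that the reinforcement rate is its reciprocal. The function name matching_proportions is hypothetical.

```python
def matching_proportions(mean_intervals):
    """Predicted share of responses on each key under strict matching.

    mean_intervals maps a key name to the mean interval between
    reinforcements on that key (5 for VI5, 10 for VI10).
    """
    rates = {key: 1.0 / interval for key, interval in mean_intervals.items()}
    total = sum(rates.values())
    return {key: rate / total for key, rate in rates.items()}

shares = matching_proportions({"green (VI5)": 5.0, "red (VI10)": 10.0})
for key, share in shares.items():
    print(f"{key}: {share:.2f} of responses")
```

On these assumptions the key with the shorter mean interval earns reinforcement twice as often, and so draws about two-thirds of the responses.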

The matching law, in turn, was one of the first contacts between experimental psychology and neoclassical economic theory, as it seemed to reveal a fundamental, perhaps universal, law governing rational choice.

An interesting illustration of the matching law is provided when pigeons are confronted with a version of the Monty Hall problem, popularized by Let's Make a Deal, a television game show. The show's host, Monty Hall, would offer a contestant a valuable prize, such as a car or a vacation, which is hidden behind one of three closed curtains; behind another curtain is nothing; but behind the third curtain is a booby-prize, like a goat. After the contestant makes his choice, Hall opens one of the curtains to reveal nothing, and then offers the contestant the opportunity to change his mind. Note that, at this point, the prize lies behind one of the remaining curtains, while the goat is behind the other one.

Most contestants choose to stick with their original choice (pose this to your friends, and see what they do). But this is the wrong choice. The prior probability that the prize lies behind the contestant's original choice is 1/3, which is the prior probability that it lies behind any one of the curtains; so the probability that it lies behind one of the other two curtains is 2/3. Hall's reveal does nothing to change the odds on the original choice, because he never opens the curtain that hides the prize. Accordingly, the probability that the prize lies behind the other remaining curtain -- the one that the contestant did not originally choose -- has now doubled to 2/3. Many people don't get this, even after multiple trials with the problem. But it turns out that pigeons catch on pretty quickly -- they're really good at matching responses to reinforcement rates, perhaps because they don't over-analyze the problem, using erroneous theories that lead them to misestimate probabilities. We'll return to the liabilities of estimation later, in the lectures on "Thought and Language".
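The 1/3 versus 2/3 result can be checked by brute-force enumeration. The sketch assumes the standard form of the problem: the prize is equally likely to be behind each of three curtains, the contestant's first choice is independent of where the prize is, and the host always opens an unchosen curtain that does not hide the prize (picking at random when he has two to choose from). The function name win_probabilities is hypothetical.

```python
from fractions import Fraction

CURTAINS = (0, 1, 2)

def win_probabilities():
    """Exact probabilities of winning by sticking and by switching."""
    stick = Fraction(0)
    switch = Fraction(0)
    for prize in CURTAINS:
        for choice in CURTAINS:
            case_weight = Fraction(1, 9)  # each (prize, choice) pair is equally likely
            # The host may open any curtain that is neither chosen nor hiding the prize.
            openable = [c for c in CURTAINS if c != prize and c != choice]
            for opened in openable:
                weight = case_weight / len(openable)
                # The single curtain left over is the one the contestant could switch to.
                (other,) = [c for c in CURTAINS if c != choice and c != opened]
                if choice == prize:
                    stick += weight
                if other == prize:
                    switch += weight
    return stick, switch

stick, switch = win_probabilities()
print("stick:", stick, "switch:", switch)
```

Exact fractions are used rather than random trials so that the answer does not depend on sampling luck.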


The Scope of Instrumental Conditioning

By means of instrumental conditioning in general, and schedules of reinforcement in particular, voluntary behaviors come under the control of environmental events.  The phenomena of instrumental conditioning are ubiquitous, or nearly so: every vertebrate organism, and some invertebrates as well, is capable of acquiring behaviors under conditions of reward and punishment.  

Thorndike and Skinner believed that most adaptive behavior is the product of instrumental conditioning.  Again, their position is probably too extreme.  But the laws of instrumental conditioning do appear to account for the acquisition, maintenance, and loss of both adaptive and maladaptive voluntary behavior -- habitual behaviors of all sorts, and actions performed under conditions of incentive.


Classical and Instrumental Conditioning Compared and Combined

In several respects, classical and instrumental conditioning appear to represent two quite different forms of learning.


Classical Conditioning versus Instrumental Conditioning

  • Classical: Reinforcement is not contingent on the organism's behavior.  The US is delivered following the CS, no matter what the organism does.
    Instrumental: Reinforcement is contingent on the organism's behavior.  The "reward" or punishment is not delivered unless the organism makes the response to be conditioned.
  • Classical: The response to be conditioned is elicited involuntarily by the US.
    Instrumental: The response to be conditioned is spontaneously emitted by the organism as a "voluntary" behavior.
  • Classical: The response being conditioned is "involuntary" (or reflexive) in nature.
    Instrumental: The response being conditioned is a "voluntary" (or spontaneous) response.
  • Classical: Because classical conditioning is limited to involuntary, reflexive responses, relatively few responses can be conditioned.
    Instrumental: Because instrumental conditioning is open to any behavior (or combination of behaviors) the organism is capable of emitting, a large, possibly infinite, number of responses can be conditioned.


One Form of Learning After All?

Procedurally, the two forms of conditioning are quite different ways of studying learning:

Donahoe and Vegas (2004) have argued that these differences are more apparent than real, and that classical conditioning also entails an association between the CS and the CR.

On the other hand, it seems equally likely that in instrumental conditioning the organism is forming an association between two stimuli -- between the CS and the reinforcement.

Ultimately, as Donahoe and Vegas argue, it may be that classical and instrumental conditioning are simply two forms of the same underlying learning process.  But for now, the procedural differences between them are great enough that we will continue to consider them to be different forms of learning.  As will be argued later, in classical conditioning the organism learns to predict events; in instrumental conditioning the organism learns to control them.


Avoidance Learning

Although classical and instrumental conditioning appear (to me, anyway) to represent two different forms of learning, most examples of adaptive behavior appear to involve combinations of classical and instrumental conditioning.  That is, through classical conditioning the organism learns to anticipate some future event; through instrumental conditioning it learns to cope with that event.  

This sort of combination has been studied in the laboratory in the form of avoidance learning.  The procedure in a typical avoidance learning experiment is as follows:

Early in training, the animal neither escapes nor avoids, but (naturally) shows agitation when the shock is presented. 
At this point, the experimenter may turn the shock off entirely.  Even so, the animal will continue to make avoidance responses, as if the shock were still connected. In this sense, avoidance learning shows a failure of extinction.

The two-factor theory of avoidance learning proposed by O. Hobart Mowrer (1947) illustrates how avoidance combines classical and instrumental conditioning.  According to Mowrer, by virtue of the pairing of the tone CS with the shock US two kinds of learning occur. 

As we will see later, Mowrer was somewhat wrong to attribute avoidance learning to the reduction of conditioned fear, but his essential point, that avoidance combines classical and instrumental conditioning, remains valid.

Theories and Theories of Learning

Research on learning began in 1898 with the work of Pavlov and Thorndike, and proliferated vastly over the next half-century.  That research was summarized by E.R. Hilgard in his 1948 Theories of Learning -- a book which went through five editions (the last in 1981, co-authored with Gordon H. Bower), and initiated a large number of popular "Theories Of" courses in developmental, social, and personality psychology. 

Already in the 2nd (1956) edition, written before the formal beginning of the cognitive revolution in psychology, Hilgard classified these theories into two major categories:

In Hilgard's view, these theories were distinguished by three theoretical preferences:

Hilgard points out that all of these theories were "behaviorist" in nature, in that they took behavior, rather than introspections, as their data.  There's a difference between methodological and radical behaviorism.

One might also suggest that S-R and cognitive theorists differ in their choice of experimental subjects -- S-R theorists preferring nonhuman animals, and cognitive theorists preferring humans, as subjects.  But this is a false distinction. 

The fact of the matter is that, under the spell of Watsonian behaviorism, almost all research on learning in the first half of the 20th century was on animals -- mostly rats and pigeons.  This was, in large part, because the use of nonhuman animals forced psychologists to rely on objective behavior, rather than subjective introspections, as their data.  Still, I think we can see Hilgard's two categories in researchers' views about the human-animal distinction.

In any event, Hilgard noted that all learning theorists must accept all of the same facts discovered through research; they differ in terms of interpretation.  And all learning theorists seek to answer the same small set of questions:

Note what is missing here: there is nothing about the brain (the term barely appears in Hilgard's index).  Partly, of course, this reflected the primitive state of neuroscience at the time.  But the reasons went deeper than that.

For the most part, the classical learning theories have been consigned to the dustbin of history.  But it's worth reviewing at least some of them, for their relevance to the modern cognitive psychology of learning and memory.  Herewith are some summary notes, based mostly on the 3rd edition of Hilgard's Theories of Learning, published in 1966, before the cognitive revolution really took hold in psychology.  This edition was the first to be co-authored with Gordon H. Bower, his Stanford colleague.  Bower, for his part, had begun his career as a mathematical psychologist focused on animal learning, and became a distinguished first-generation cognitive psychologist whose most famous research focused on verbal learning and memory. 


First things first: We have to start with Pavlov, whose studies of classical conditioning got the whole ball rolling.  Of course, Pavlov wasn't a psychologist at all.  He was a physiologist, who worked first on the cardiovascular and circulatory systems, and then on the gastrointestinal system. I usually give the beginning of Pavlov's work on conditioned reflexes as 1898, the same year as Thorndike, with the first publication being Wolfson's dissertation published in 1899.

First, a couple of notes on terminology:

By the time he published Conditioned Reflexes  (1927) and Lectures on Conditioned Reflexes (1928), Pavlov had developed pretty much the entire vocabulary of conditioning and learning.

You could take Pavlov out of physiology and into psychology, but you couldn't take physiology out of Pavlov.  Of all the classical learning theorists, Pavlov is the only one to have taken specific positions on the neural basis of conditioning.

For an appreciation of Pavlov's contributions to psychology, written by a leading psychologist of the Soviet era, see Razran, G. (1965).  Russian physiologists' psychology and American experimental psychology.  Psychological Bulletin, 63, 42-64.


While Pavlov dominated learning theory in the Soviet Union, Thorndike's theory dominated in the United States.  Thorndike called his theory connectionism, because learning was held to strengthen the associations between sensory stimuli and motor responses.  In order to avoid confusing Thorndike's "connectionism" with the "modern" connectionism initiated by Rumelhart and McClelland, it's probably best to think of Thorndike's theory as the "mother" of all stimulus-response (S-R) theories of learning.

I've already listed Thorndike's eight laws of learning, which I'll just list here again without much further comment.

There are three primary laws:

  1. The Law of Readiness states that motivational states such as hunger arouse behavior. 
  2. The Law of Effect states that responses that lead to reward are strengthened, occurring more quickly and reliably, while responses that are unrewarded, or even punished, are weakened.
  3. The Law of Exercise states that associations between stimuli (such as the puzzle box) and responses (such as tripping the latch) are strengthened by practice and weakened by disuse.

And the subordinate laws:

  1. The Law of Multiple Responses: organisms must be able to vary their responses to a stimulus, to give them the opportunity to stumble on the response which will be rewarded.
  2. The Law of Set (or Attitude): an organism's momentary set or attitude will determine which rewards are effective (the opportunity to play tennis may not be rewarding to a golfer).
  3. The Law of Prepotency of Elements: organisms must be able to distinguish between those elements of a situation that are really important, and those that are merely adventitious.
  4. The Law of Response by Analogy: organisms respond to novel situations by drawing analogies to familiar situations.  
  5. The Law of Associative Shifting: a response that has been conditioned to a number of different stimuli will be likely to be given in response to a new stimulus.

These laws were set out fairly early in Thorndike's career, and subsequent research led to the revision or abandonment of some of them.


Watson, the founder of behaviorism, never developed a full-fledged theory of learning. 

That job fell to B.F. Skinner, most prominently in his Behavior of Organisms (1938).  Skinner's is an S-R theory, but he rejected the idea of "no stimulus, no response", by which earlier behaviorists had assumed that every response was preceded by some stimulus, even if that stimulus couldn't be identified.  Instead, Skinner focuses on two types of response:

Thus, Skinner's theory can be viewed as an extended meditation on Thorndike's Law of Effect: the association between a stimulus and response is increased when the operant is reinforced in the presence of the stimulus.

This led Skinner to distinguish between two types of learning.

And in another counterintuitive move, Skinner distinguished between two types of primary reinforcers:

Thorndike's Law of Effect gives rise to the impression that positive reinforcers are pleasant, or satisfy some biological motive, and that negative reinforcers are unpleasant.  Skinner, by contrast, is a true behaviorist, rejecting all reference to mental states.  Reinforcements are known only by their effects: something is reinforcing if it increases the probability of the response with which it is paired.

In his critique of Skinner's Verbal Behavior (which absolutely must be read), Chomsky (1959) found time to show that Skinner's definition of reinforcement is circular, and thus empty:
Estes (1944), while a doctoral student working under Skinner's supervision, showed that punishment suppresses behavior, but does not weaken habits.

As noted earlier, Skinner and his students and colleagues placed great emphasis on the schedule of reinforcement -- that is, the precise relationship between response and reinforcement (e.g., Ferster & Skinner, 1957).  Each of these schedules produced a corresponding pattern of behavior.


Hull also classifies as an S-R theorist, as in his famous formulation:


But that little element, D, distinguishes Hull from the other S-R behaviorists, because it posits that learning is a function of an internal physiological (if not mental) state, drive.  And it's the presence of this drive state that makes reinforcements reinforcing.  So, by postulating an internal state, Hull makes it clear that learning isn't just a matter of associating stimuli and responses.  And it offers a non-circular definition of reinforcement: reinforcements reduce physiological drives.

Hull's mathematico-deductive theory of learning (1940) is, in some ways, a masterpiece of quantitative psychological theory, expressly inspired by, and explicitly modeled on, Newton's Principia and Whitehead and Russell's Principia Mathematica.  Each of its elements is stated verbally, then translated into symbolic logic, and followed by experimental tests conducted on a variant of the verbal-learning paradigm known as rote learning -- essentially, an extension of Ebbinghaus's method.  (Earlier, Hull had adapted Ebbinghaus's method for the study of concept acquisition -- ever the tinkerer, inventing the memory drum in the process and creating a whole industry of makers of equipment for university psychology laboratories.)

Hull's research gave rise to the standard, ogival, form of the learning curve showing the acquisition of a response over time.  Actually, there has been some confusion over the shape of the learning curve.  Often, the curve is described as negatively accelerated, with large gains on initial trials followed by smaller gains as learning approaches asymptote.  But Culler & Girden (1951), in an exhaustive analysis of published learning curves (following Culler, 1928), determined that it is ogival after all.
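
The difference between the two curve shapes is easiest to see in the trial-to-trial gains.  The sketch below is purely illustrative: it assumes an exponential approach to asymptote as a stand-in for a negatively accelerated curve and a logistic function as a stand-in for an ogive, with arbitrary parameter values.  None of the functions or numbers comes from Hull or from Culler and Girden.

```python
import math


def negatively_accelerated(trial, asymptote=1.0, rate=0.3):
    """Exponential approach to asymptote: the largest gains come on the first trials."""
    return asymptote * (1.0 - math.exp(-rate * trial))


def ogival(trial, asymptote=1.0, rate=0.6, midpoint=10.0):
    """Logistic (S-shaped) curve: slow start, rapid middle, slow approach to asymptote."""
    return asymptote / (1.0 + math.exp(-rate * (trial - midpoint)))


def gains(curve, n_trials=20):
    """Improvement from each trial to the next, for trials 1 through n_trials."""
    return [curve(t) - curve(t - 1) for t in range(1, n_trials + 1)]


if __name__ == "__main__":
    for name, curve in [("negatively accelerated", negatively_accelerated),
                        ("ogival", ogival)]:
        g = gains(curve)
        # Report the trial on which the largest single gain occurs.
        print(name, "-> largest gain on trial", g.index(max(g)) + 1)
```

In the negatively accelerated curve the largest gain is on the very first trial and every later gain is smaller; in the ogival curve the gains are small at first, peak in the middle of training, and shrink again as performance approaches asymptote.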

Hull's system attracted a great number of adherents, and he gained additional fame after leaving Wisconsin (where he got his PhD, with Joseph Jastrow as his advisor) for Yale, where his colleagues at the Institute of Human Relations applied his drive-reduction theory to a wide variety of issues in personality and social behavior -- most famously, Miller and Dollard's work on frustration and aggression and on conflict (approach-approach, approach-avoidance, and avoidance-avoidance).  Together, these two lines of research laid the foundation for a translation of Freudian psychoanalytic theory into the vocabulary of Hull's S-R theory of learning.

Unfortunately, the mathematical rigor of his theory proved its undoing.  In a famous paper, Gleitman, Nachmias, and Neisser (1954) showed that Hull's theory of extinction was simply wrong: it contained a number of internal, logical contradictions, and its empirical predictions were not borne out.  A theory that can't explain extinction isn't a very good theory of learning, after all.  And by this time, in any case, Skinnerian functional behaviorism was at its apex -- soon to be overthrown itself, by the cognitive revolution in psychology.  


The cognitive revolution was foreshadowed by the genuinely cognitive theory of learning proposed by E.C. Tolman (who had been Gleitman's teacher at Berkeley).  As a learning theorist, Tolman was the chief competitor to both Hull and Skinner. 

Tolman is best known for his studies of latent learning, discussed later, which cast doubt on the role of reinforcement.  Here, I'll talk in general terms about his theoretical approach.

Everybody's got their method.  Pavlov had dogs in harnesses; Thorndike had cats in puzzle-boxes; Skinner had his operant chamber.  Tolman had the maze -- a series of alleys and choice points where his rats could -- well, make choices.  In fact, Tolman used the same maze throughout his career.  It was a thing of real beauty, with lots of alleys and choice points, which could be walled off with curtains to create different pathways from start box to goal box (diagram courtesy of UCB Prof. Donald Riley, who was one of Tolman's students).  

Tolman's research program focused on three aspects of learning.

  1. Reward expectancy: Rats learn to run the maze for a particular preferred reward.  When the reward is shifted to a less-preferred reward, they will leave the goal box and search for the preferred reward.
  2. Place learning: rats learn where to go, not what movement to make, in order to get a reward.
  3. Latent learning: animals learn about their environment by exploring it, in the absence of reward.  If they receive a reward in a particular place, they know where to go the next time.

Tolman found that, on coming to a choice point for the first time, individual rats consistently favored one choice over another, behaving as if they were generating hypotheses about where to go.

Tolman considered himself a behaviorist, but it is clear that he was a behaviorist of quite a different stripe than others. 

When psychology was ready for the cognitive revolution, Tolman, and a few others (like Jerome Bruner) had pointed the way.

A final note: There's a reason that the Education/Psychology Building at UCB is named after him.  Along with Brunswik, Tolman was probably Berkeley's most famous psychologist: his experiments, from almost 100 years ago, are still described in introductory textbooks.  But Tolman's contributions to the University go far beyond the experiments on latent learning.  In the late 1940s and early 1950s, at the height of the McCarthy Period in American politics, the Regents of the University of California (there were only Berkeley and UCLA then) required all UC faculty to sign a loyalty oath.  Tolman viewed this as an infringement of academic freedom, and (along with some other faculty) refused to sign.  He was then dismissed from his post, and took a visiting position at Harvard (where he had gotten his PhD under Munsterberg).  He then sued the University for reinstatement.  In Tolman v. Underhill (1952), the California Supreme Court overturned the loyalty oath, and required the University to reinstate him and the other plaintiffs.

A Note on Functionalism

This is as good a place as any to make some remarks about a general trend in learning theory known as functionalism, and to clear up some misunderstandings about it.

As a "school" of psychology, functionalism was skeptical of the structuralist claim that we can understand mind in the abstract. Based on Charles Darwin's (1809-1882) theory of evolution, which argued that biological forms are adapted to their use, the functionalists focused instead on what the mind does, and how it works. While the structuralists emphasized the analysis of complex mental contents into their constituent elements, the functionalists were more interested in mental operations and their behavioral consequences. Prominent functionalists were:

  • William James, the most important American philosopher of the 19th century, who taught the first course on psychology at Harvard. James's seminal textbook, Principles of Psychology (1890), is still widely and profitably read by new generations of psychologists. True to his philosophical position of pragmatism, James placed great emphasis on mind in action, as exemplified by habits and adaptive behavior.
  • John Dewey (1859-1952), now best remembered for his theories of "progressive" education, who founded the famous Laboratory School at the University of Chicago.
  • James Rowland Angell (1869-1949), who was both Dewey's student (at Michigan) and James's student at Harvard, and who rejoined Dewey after the latter moved to the University of Chicago; later Angell was president of Yale University, where he established the Institute of Human Relations, a pioneering center for the interdisciplinary study of human behavior. In contrast to Titchener, who wanted to keep psychology a "pure" science, Angell argued that basic and applied research should go forward together.

Psychological functionalism is often called "Chicago functionalism", because its intellectual base was at the University of Chicago, where both Dewey and Angell were on the faculty (functionalism also prevailed at Columbia University). It is to be distinguished from the functionalist theories of mind associated with some modern approaches to artificial intelligence (e.g., the work of Daniel Dennett, a philosopher at Tufts University), which describe mental processes in terms of the logical and computational functions that relate sensory inputs to behavioral outputs.

The functionalist point of view can be summarized as follows:

  • Adaptive value of mind. Functionalists assume that the mind evolved to serve a biological purpose -- specifically, to aid the organism's adaptation to its environment. Thus, functionalists are interested in what James called (in the Principles) "the relationship of mind to other things" -- how the mind represents the objects and events in the environment. Functionalism also laid the basis for the application of psychological knowledge to the promotion of human welfare.
  • Mind in context. From a functionalist point of view, the mind essentially mediates between the environment and the organism. Therefore, the functionalists were concerned with the relations between internal mental states and processes and the states and processes in the internal physical environment (i.e., the organism) on the one hand, and the external social environment (i.e., the real world) on the other.
  • Operations over content. Whereas structuralism attempted to analyze the contents of the mind into their elementary constituents, functionalism attempted to understand mental operations -- that is, how the mind works. It's this sense of functions as operations that gives functionalism its name.
  • Individual differences. For Wundt and other structuralists, it didn't matter who the observer was: so long as observers were properly trained, they were interchangeable. But the functionalists, with their roots in Darwin's theory of natural selection, were interested in variation.
  • Mind and body. Because the mind is what the brain does, functionalists assumed that understanding the nervous system, and related bodily systems, would be helpful in understanding the workings of the mind.  At the very least, mind and body ought to be related somehow, and psychologists should be free to investigate the neural underpinnings of mental life, and other aspects of the mind-body relationship.

So where's the confusion?  The confusion comes from another form of functionalism, "philosophical" functionalism, which holds a prominent position in cognitive science -- in particular, among proponents of what John Searle calls "strong artificial intelligence".  Essentially, functionalists identify mental states with certain input-output functions, irrespective of the medium which performs those functions. It then follows that any physical system which performs those functions has mental states -- regardless of whether that physical system is a brain, a computer, or -- to take a vivid image -- a bunch of beer cans connected by string and powered by windmills.  The connection to Stimulus-Response theory is obvious.  Philosophical functionalism does have one advantage over behaviorism, in that it at least acknowledges the existence and causal power of mental states. 

So don't get confused.  When somebody identifies himself as a "functionalist", these days, he's likely to be a philosopher who identifies mind with certain functions, and who thinks that computers can have minds.  And he's also likely to be inclined toward something like stimulus-response behaviorism.

But, as Dewey and his friends understood, functionalism doesn't have to stand for any such thing.  In the American tradition of Dewey and James, functionalism can just be an umbrella term for a particular approach to learning, memory, and other aspects of mind and behavior:

  • Mind mediates between the environment and the person.
  • Mind enhances the adaptation of the organism to its environment.
  • Psychologists should feel free to investigate the biological substrates of mental life.

What is Learned in Conditioning?

So far, we have simply described the phenomena of conditioning -- acquisition, extinction, generalization, discrimination, reinforcement, and the like.  But what actually happens in learning?  Or, put another way, what is the organism learning from experience?


The Stimulus-Response Theory of Learning

Learning was once thought to be as automatic as reflexes, taxes, and instincts.  Just as these are innate stimulus-response associations, part of the organism's biological endowment, so classical and instrumental conditioning were thought to represent acquired stimulus-response connections, formed as a result of experience but no less automatic.  

As its name implies, S-R learning theory holds that what is learned in conditioning is an association between a stimulus and a response -- an association that is strengthened by reinforcement. 

Traditional stimulus-response theories of learning were based on four assumptions:
The stimulus-response theory of learning, and the assumptions on which it was predicated, dominated the study of learning for more than 50 years after Watson.  Beginning in the 1960s, however, experiments began to challenge this view of learning as a passive, associationistic process.  These experiments showed that there were two broad types of constraints on what can be learned -- biological and cognitive.  And in revealing these constraints, research overturned the four assumptions of S-R learning theory and completely changed our view of learning.  


Biological Constraints on Learning

One important line of research challenged the arbitrariness assumption -- that organisms could learn to attach any response in their repertoire to any stimulus in the environment -- by showing that some conditioned responses are easier to acquire than others.  

This research begins with work by the American psychologist John Garcia and his colleagues on a phenomenon known as taste-aversion learning (or bait shyness).  Before Garcia became a graduate student of Tolman's at UC Berkeley, he grew up on a sheep ranch in the American southwest, where ranchers routinely used poison to control coyotes and other predators.  Garcia knew from this experience that when animals eat poisoned food or drink poisoned liquids, and nonetheless survive, they will avoid that substance later (hence the term, "bait-shyness").  Garcia and his associates developed a laboratory analogue of bait-shyness in an attempt to study the anticipatory nausea which some cancer patients develop in the course of receiving chemotherapy.  Garcia's paradigm was a variant on classical fear conditioning:

Later, learning was tested through an avoidance procedure.  The animals were presented with two sources of water, and allowed to drink from either one. 
Garcia and his associates found that the animals' avoidance behavior depended on the US to which they had been exposed.

In other words, the animals formed associations between shock and sight and sound, and between nausea and taste; but they made no connection between nausea and sight and sound, or between shock and taste.  This outcome violates the arbitrariness assumption of traditional S-R theories of learning, because all elements of the compound CS occur at precisely the same time and place.  Thus, they all have precisely the same spatial and temporal contiguity with respect to the US.  Therefore, under the assumption of arbitrariness or equipotentiality, they should all have been equally powerful as CSs.  But they were not.

This experimental outcome is commonly interpreted as indicating that the potency of a stimulus is related to the evolutionary history of the species.  Rats are nocturnal animals, and under ordinary circumstances choose their food according to its taste.  Therefore, their evolution has disposed them to form associations between the taste of food and its gastrointestinal consequences, but not between sight or sound and nausea.  The explanation is supported by experiments on birds (like quail), who are sight-feeders.  They quickly form associations between nausea and visual stimuli, but not between nausea and taste.

From Coyotes to Sheep to Wolves

Garcia became interested in bait shyness because of its use by sheepherders and other ranchers in the natural control of coyotes and other predators, but you don't have to be a predator to be susceptible to bait shyness.  

In 2007, Morgan Doran, a farm advisor with the University of California Agricultural and Natural Resources Cooperative  Extension, based in Davis, began a program of research on bait-shyness in sheep.  Sheep and goats are often used for brush control and weed abatement -- you can see them, for example, in the Oakland and Berkeley Hills in an attempt to prevent wildfires from spreading through dry overgrowth.  And vintners have been interested in using this same technique for weed control in vineyards.  

That's all very good on paper, but the practical problem is how to get the sheep to eat the weeds, and not the very tasty tender shoots of young grapevines!

In Doran's study, a group of sheep are allowed to feed freely on vine leaves, and then they are fed a capsule filled with lithium chloride -- which, while not lethal, induces pretty severe nausea.  A control group is also allowed to feed on the grape leaves, but gets a placebo capsule.  Results from a pilot study indicate that the sheep will, in fact, avoid the grape leaves in the field, and focus their feeding on the weeds.

A similar project is underway in Marin County's dairyland, where cattle have been trained to prefer a particular kind of thistle.  

Turning the tables, bait-shyness (and preparedness) has been enrolled in the effort to protect the Mexican wolf, which was hunted to near extinction by ranchers seeking to protect their cows and sheep from predation.  An experiment with captive Mexican wolves shows promise in getting the animals to avoid sheep, and might be effective in wildlife management as well.  

Who says that animal research has no practical significance!?  Or that it's bad for the animals?

Further reading:

  • "Scientist Gives Sheep Useful, Discerning Taste" by M.S. Enkoji, West County Times, 07/02/07.
  • "A Second Chance for the Mexican Wolf" by Sadie F. Dingfelder, Monitor on Psychology, 11/2010.

In addition to violating the assumption of arbitrariness, the outcome of the Garcia experiment also violates the assumption of association by contiguity.  Recall that all the elements of the compound CS were presented simultaneously.  Therefore, all elements of the compound CS were equally contiguous with the US -- temporally contiguous, because they occurred close together in time; and spatially contiguous, because they occurred close together in space.  But despite being equally contiguous, not all potential CSs acquired the power to evoke a CR.  

Moreover, consider the special circumstances of the X-ray condition.  X-rays require a long time to take effect, about 30 minutes, by which time the rats may well be in another part of the compartment, some distance from the food source.  Therefore, the bright, noisy sweet water was separated from nausea by an appreciable distance in time (and, for that matter, space).  Even so, an association was formed between taste and nausea.  Apparently, then, conditioning is possible even in the absence of temporal (or spatial) contiguity, a point to which we will return later.

Finally, just for good measure, the outcome of the Garcia experiment violates Thorndike's Law of Exercise.  Thorndike concluded that stimulus-response associations were strengthened with repetition, but Garcia's rats formed strong taste-nausea associations after only a single trial.  If evolution has predisposed rats to form associations between the taste of food and its gastrointestinal consequences, it has also predisposed rats to form these associations quickly and over long delays.

The arbitrariness assumption was also challenged by research by Robert C. Bolles on species-specific defense reactions.  Bolles noted that pigeons could quickly learn to flap their wings or stretch their necks (both behaviors preparatory to flight) to avoid shock, but could not learn to peck a key to avoid shock (even though, as Skinner discovered, pigeons quickly learn to peck a key to obtain food).  Moreover, rats quickly learn to jump and run (both part of the "flight" reaction to stress), but are slower to learn to press a key, to avoid shock.  Again, outcomes like these violate the arbitrariness assumption: because all the behaviors in question are in the species' repertoire, and reinforcement is equally contingent on each type of response, all the stimulus-response connections should be equally associable.  Nevertheless, some S-R connections are easier to form than others.  Bolles concluded that the ease of conditioning depends on the natural defense reactions of each species.  Pigeons learn to hop, flap their wings, and stretch their necks, because these behaviors are preparatory to flight.  Rats quickly learn to jump or run, because these behaviors are part of their innate response to threat.

So, in both classical and instrumental conditioning, learnable associations are not arbitrary.  It is not the case that just any CS can be attached to just any CR.  Taken together, these results illustrate what Martin E.P. Seligman, Paul Rozin, and James Kalat have called the preparedness principle.

The repertoire of prepared associations differs for each species of animal.  It is part of the species' evolutionary heritage, and provides biological constraints on the learning process.  The implication of this conclusion is that the assumption of the empty organism is wrong.  We cannot treat the organism as if it were empty, a "black box" that merely connects stimulus inputs to response outputs.  Rather, in order to understand learning we have to understand something about the organism's internal biological workings -- how evolution has shaped the organism's nervous system to support certain kinds of learning, but not others.

The biological constraints on learning are important, but in order to understand what the organism is really learning we also have to understand the organism's internal cognitive workings -- what the organism's mental states are, the nature of its internal, mental representation of the world, and what it is trying to do over the course of learning.  These aspects of the learning process are revealed by studies of the cognitive constraints on learning.


Contiguity versus Contingency in Conditioning

For example, the principle of association by contiguity, already challenged by Garcia's experiments on taste-aversion learning, is further undermined by certain peculiarities of classical conditioning.




These kinds of results highlight the distinction between contiguity and contingency.

Given the results just summarized, we can conclude several things about the role of contiguity and contingency in conditioning.
According to conventional S-R learning theory, associations are formed by virtue of the spatiotemporal contiguity between events in the environment, stimuli and responses, or actions and their outcomes. That is to say, associations are formed between two elements that occur closely together in space and time. However, an increasing body of evidence, including the outcomes of various classical-conditioning paradigms, indicates that contiguity is not the important element in learning. Rather, the important element is contingency: the degree to which one event (etc.) predicts another (etc.).

Put another way, conditioning occurs when the CS acts as a signal that the US is forthcoming.  In backwards conditioning, however, the CS signals that the US is not forthcoming.  In backwards fear conditioning, the CS actually serves as a safety signal -- informing the animal that the shock will not be forthcoming for a while.  The CS has value as a signal only when there is a contingent relationship between the CS and the US, regardless of whether the CS and US are temporally and spatially contiguous.  The conclusion is that contingency is more important than contiguity: conditioning occurs only when the CS predicts the US. When the CS is uninformative about the US, no conditioning occurs. And when the CS predicts the absence of the US, as in extinction or backwards conditioning, the CR is actually inhibited.  


The Rescorla Experiment

A compelling demonstration of the role of contingency in classical conditioning was provided in a classic experiment by Robert Rescorla (1967), for his doctoral dissertation at the University of Pennsylvania (after many years at Yale, Rescorla returned to his alma mater in a faculty role). In this experiment, Rescorla varied the predictability of a shock US, given the presentation of a tone CS.

In one condition of the experiment, the CS was a perfect predictor of the US, in that the CS always immediately preceded the US (that is, within 1 second or so). No CS was ever presented that was not immediately followed by a US; and no US was ever presented that was not immediately preceded by a CS. Thus, expressed in terms of probabilities:


p(US | CS) = 1.0; and


[Read this as "the probability that the US will occur given the prior occurrence of the CS is 1".]


p(US | no CS) = 0.0.

[Read this as "the probability that the US will occur given no prior occurrence of the CS is 0"]

This condition resulted in very good conditioning.

In another condition of the experiment, the CS was a less-than-perfect predictor, because Rescorla interspersed a number of unreinforced CSs -- that is, CSs that were not immediately followed by USs. Thus, of all the CSs that were presented, half were not followed by USs. However, the US never occurred unless it was immediately preceded by a CS. Again, expressed in terms of probabilities:

p(US | CS) = 0.5 and p(US | no CS) = 0.0.

This condition still resulted in fairly good conditioning.

In a third condition of the experiment, the CS was rendered ineffective as a predictor of the US, because Rescorla interspersed a number of unsignalled USs -- in fact, half of the USs -- that were not immediately preceded by CSs. Now, the situation was that CSs and USs occurred randomly, independently of each other. Expressed in terms of probabilities:

p(US | CS) = 0.5 and p(US | no CS) = 0.5.

Under these conditions, no conditioning occurred, even though the CS and US were frequently presented together in the same place at the same time.

The upshot of Rescorla's experiment, which stands as a modern classic in psychology, is that conditioning is not simply the formation of an association between spatially and temporally contiguous stimuli. Rather, conditioning occurs only when the CS provides information about the US. The amount of information provided may be estimated as the difference between two probabilities:

p(US | CS) - p(US | no CS).

Conditioning occurs only if, and to the degree that, the CS is a reliable predictor of the US.  Put another way, conditioning occurs only if the US is more likely following a CS than in the absence of the CS.  What's amazing about this is that it appears that even organisms as simple as the white rat, or simpler, are in some sense computing the conditional probabilities involved.  The computation is not necessarily conscious, of course -- the rats haven't taken Statistics 2, after all.  But it is a computation nonetheless.
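As a rough illustration of the contingency measure, the following sketch computes p(US | CS) - p(US | no CS) for the three conditions of Rescorla's experiment.  The probabilities are the ones given in the text; the function name `contingency` and the dictionary layout are illustrative assumptions, not part of Rescorla's procedure.

```python
def contingency(p_us_given_cs, p_us_given_no_cs):
    """Difference between the two conditional probabilities described in the text."""
    return p_us_given_cs - p_us_given_no_cs

# (p(US | CS), p(US | no CS)) for the three conditions described in the text.
# The condition labels are informal names chosen for this sketch.
conditions = {
    "perfect predictor": (1.0, 0.0),
    "partial predictor": (0.5, 0.0),
    "random (uninformative)": (0.5, 0.5),
}

results = {name: contingency(*probs) for name, probs in conditions.items()}

for name, value in results.items():
    print(f"{name}: contingency = {value:.1f}")
```

On this measure the ordering of the three conditions matches the ordering of the reported outcomes: very good conditioning, fairly good conditioning, and none.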


The Kamin Experiments

The importance of the predictive relationship between the CS and the US is underscored by two other phenomena discovered by Leo Kamin.

Kamin's first experiment concerned the phenomenon of overshadowing. Consider two standard conditioning preparations:

Both preparations yield good conditioning.  And when we combine these two effective CSs into a single compound CS, bright light and soft tone presented simultaneously and followed by shock, just as Garcia did with his compound of "bright, noisy, sweet" water, what we find is good conditioning.

But what happens if, after we condition the organism to the compound, we test the two elements separately?   When we do, we get a good CR to the light, but not to the tone.  This is not a problem of differential preparedness, as in the Garcia experiment, because neither light nor tone is particularly prepared or contraprepared to serve as a signal for shock.  Instead, once more, the result violates the assumption of association by contiguity. Both the light and the tone were equally contiguous with the shock. But it appears that the more salient, noticeable CS, in this case the bright light, overshadows the less salient or noticeable one. Both are contiguous with the shock, and both are good predictors of the shock as well, but conditioning occurs to the CS that is more salient.

The second experiment concerned the phenomenon of blocking.  As background to this research, recall that in standard classical fear conditioning, a foot-shock US is preceded by a tone or light CS.  Under these conditions, we get good conditioning of fear, as represented by such conditioned emotional responses as heart-rate acceleration, in response to previously neutral CSs.  

We now give an animal acquisition trials with a compound CS, consisting of a tone and a light presented simultaneously, followed by a shock US in the usual manner.  After 16 pairings of tone and light followed by shock, we test the animal's response to a variety of stimuli:

But something different happens when the procedure is reversed, and conditioning trials with the compound CS are preceded by conditioning with only one element alone.

What happens when we now test the animal's response to presentation of the light alone? 
Here are the actual results of some of Kamin's experiments.

Apparently, the prior conditioning to the noise has "blocked" conditioning to the light.  This surprising outcome is explained in terms of the information provided by the various CSs.  In the case of the compound CS, the new element, light, is redundant with the noise.  Expressed in terms of Rescorla's conditional probabilities:

p(shock | noise) = p(shock | noise + light) = 1.0.

Now, the outcome would be different under different conditions.

This leads us to a clarification of the principle of association by contingency:

Conditioning occurs only when the CS signals a change in the US.

Kamin concluded, further, that conditioning only occurs when the US surprises the organism.  In the presence of a surprising event, the organism then searches the environment for possible predictors of that event.  Among these, it will pay attention to the most reliable predictor, which becomes the effective CS.  If there is more than one reliable predictor, it will attend to the most salient predictor, leading to the phenomena observed in the "overshadowing" experiment.  And it will ignore stimuli that lack predictive power, leading to the phenomena observed in the "blocking" experiment.
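One way to make the "surprise" idea concrete is an error-correcting update rule, in which associative strength changes only to the extent that the US is not already predicted by the stimuli that are present.  The sketch below follows the general form of the Rescorla-Wagner model, which is not discussed in the text above and is offered here only as one possible formalization.  The learning rate, the asymptote of 1.0, the trial counts, and the function name are all illustrative assumptions, not Kamin's actual parameters.

```python
def train(strengths, present, n_trials, rate=0.3, asymptote=1.0):
    """Update associative strengths for the stimuli present on each reinforced trial.

    The shared prediction error (asymptote minus the summed strength of all
    stimuli present) stands in for how "surprising" the US is on that trial.
    All parameter values are hypothetical.
    """
    for _ in range(n_trials):
        surprise = asymptote - sum(strengths[s] for s in present)
        for s in present:
            strengths[s] += rate * surprise
    return strengths

# Control group: conditioning with the compound (noise + light) only.
control = train({"noise": 0.0, "light": 0.0}, ["noise", "light"], n_trials=8)

# Blocking group: noise alone first, then the compound.
blocked = train({"noise": 0.0, "light": 0.0}, ["noise"], n_trials=16)
blocked = train(blocked, ["noise", "light"], n_trials=8)

print(f"control light strength: {control['light']:.3f}")
print(f"blocked light strength: {blocked['light']:.3f}")
```

Under these assumptions the light acquires substantial strength in the control group but almost none in the blocking group, because the pretrained noise leaves very little surprise to drive new learning.  With equal learning rates for both stimuli the sketch does not reproduce overshadowing; that would require giving the more salient stimulus a larger rate.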

Kamin's experiments are important because they simultaneously undermine three assumptions of classical S-R learning theory. 

Pretty good for one experiment.  No wonder it's a classic.


Learned Helplessness

Similar considerations apply to instrumental conditioning.  The behaving organism is searching for predictability, but it is also searching for control.  It wants to know what to do about forthcoming events, not just where and when to expect them.  In instrumental conditioning, the organism is acquiring these expectancies of control.  

The role of these expectations can be seen clearly in the phenomenon of learned helplessness, discovered by Martin E.P. Seligman, Steven Maier, and Bruce Overmier when they were graduate students at the University of Pennsylvania, working under Richard L. Solomon.  Mowrer's two-factor theory of avoidance learning, discussed above, predicts that avoidance learning will be facilitated if the organism has already undergone fear conditioning. The idea is that the organism already knows to fear the CS, and all it has to do is to learn to avoid the US.  To test Mowrer's theory, they performed the following experiment:

In standard avoidance-learning situations, without Phase 1, animals learn the avoidance response readily, and shuttle nonchalantly back and forth from one section to the other as tones come on.  However, Overmier & Seligman (1967) discovered that in their new situation, with Phase 1 inserted before Phase 2, avoidance learning was actually retarded.

In a subsequent experiment, Seligman & Maier (1967) used a yoked-control design to ensure that animals in the two conditions received exactly the same amount of shock.  In each pair, one dog could escape shock while the other received the same amount of shock as the first dog, no matter what it did.  In a subsequent avoidance experiment, the "escape" animals responded like controls who had received no pretreatment of any kind, while the "yoked" animals showed considerable evidence of learned helplessness.

Proper avoidance responding can be established in dogs who have been pretreated with inescapable shock, but only by forcibly dragging the dogs from one side of the shuttlebox to the other.

Why does this happen?  Seligman and his associates reasoned that learned helplessness reflects the acquisition of negative expectations of control.  In classical fear conditioning, the shock is both inescapable and unavoidable.  Tone is followed by shock, and there is nothing the animal can do about it, because in classical conditioning reinforcement is not contingent on the subject's behavior.  It is only contingent on the CS.  Accordingly, the animal in such a situation acquires a negative expectation that nothing can be done about the shock.  This negative expectation, in turn, generalizes to the avoidance learning situation.

Learned helplessness is significant because it may underlie certain forms of clinical depression.  But it also has great theoretical significance, because it shows that instrumental behavior is determined by the organism's expectancies, not by environmental events.


Helplessness at the World Trade Center

In the aftermath of the terrorist attacks of September 11, 2001, emergency-service workers at the World Trade Center employed "search and rescue" dogs to locate victims, living and dead, who might have been buried under the rubble.  These animals were trained through instrumental conditioning procedures to sniff out human bodies: basically, when they found a body they received a reward (a similar training procedure is used for the "drug-sniffing" dogs employed by the police).  At the WTC, however, there were very few such bodies to be found -- not because there weren't any victims, of course, but because the victims' bodies had been pulverized into dust by the collapse of the building.  As a result, the search-and-rescue dogs became obviously depressed -- because they were not able to do the job they were trained to do.  In the language of learned helplessness, the animals were not able to engage in behaviors that controlled reward.  In order to maintain the animals' motivation for the job, emergency-service workers would sometimes lie down in the rubble -- just to give the dogs somebody to find -- or, in the language of learned helplessness, to maintain their sense of control.

The bottom line is that conditioning is the wrong metaphor for learning. A better metaphor might be computing. The learning organism is trying to figure things out, and it does this by, in some sense, computing conditional probabilities.

Prediction and control. Conditional probabilities. Signals.  Information.  That's what "figuring it all out" is all about.

Experimental Neurosis

Predictability and controllability are central to conditioning, but they also have clinical implications.  I referred earlier to a body of research, initiated in Pavlov's laboratory, on experimental neurosis.  Inspired by Seligman's learned helplessness model of depression, which focused on uncontrollable aversive events, Mineka and Kihlstrom (1978) proposed that experimental neurosis was caused by exposure to unpredictable aversive events.

Whereas a history of uncontrollable aversive events can lead to depression, as Seligman argued, Mineka and Kihlstrom suggested that unpredictable aversive events are a source of anxiety.


The Role of Reinforcement

A similar point can be made with respect to the role of reinforcement in learning.  The conventional view, expressed in Thorndike's Law of Effect, is that nothing is learned in the absence of reinforcement.  In classical conditioning, the CS must be followed by a reinforcing US.  In instrumental conditioning, the CR must be followed by reward or punishment.  However, a number of experiments now make clear that reinforcement is not necessary for learning to occur.


More Vicissitudes of Classical Conditioning

Consider, for example, two phenomena of classical conditioning discussed earlier.

In both instances, the animal has learned to respond to a stimulus even though its response to that stimulus has never been reinforced.  However, we can explain these phenomena by an extension of the principle of association by contingency, which states that animals in conditioning experiments learn the predictive relationships among events in their environment.


Latent Learning

A similar point is made with respect to instrumental learning by classic studies on latent learning performed by Edward C. Tolman of the University of California, Berkeley (after whom the Education/Psychology Building at Berkeley is named).  Tolman's experiments involved a maze-learning procedure, in which hungry rats were placed in the start box of a maze, and food placed in the goal box.  Over trials, the rats would learn, through trial and error, the route through the maze.  In theory, these responses -- turn left here, turn right there, go straight, whatever -- were reinforced by the delivery of food in the goal box.  Intuitively, this makes sense, but Tolman asked whether the reinforcement was really necessary for learning to occur.

The experiment, by Tolman and Honzik, involved three groups of rats:

Tolman concluded that the animals in this group learned how to get from the start box to the goal box on the first 10 trials, but just needed a reason to do it.  This reason was provided on Trial 11 and subsequent trials.  In other words, Tolman's animals learned the maze without any reinforcement.  Over 10 trials of exploration, they developed a "mental map" of their environment, which was subsequently available for use for a variety of purposes.  However, they didn't perform a goal-directed response until the introduction of reinforcement established a goal.  

Put another way, 

Reinforcement controls performance rather than learning.


Curiosity and Intrinsic Motivation

A similar point was made in research on rhesus monkeys published in the early 1950s by Harry Harlow of the University of Wisconsin (later to become famous for his studies of "monkey love" and "motherless monkeys").  In one set of studies, Harlow presented his monkeys with a wooden "puzzle lock" consisting of a series of latches which, when moved in the right order, would open a door.  Some animals were rewarded with food (rhesus monkeys love Froot Loops) for making correct moves; others received no reward at all.  Harlow observed no difference in the monkeys' problem-solving behavior.  In fact, if they were hungry, hunger appeared to interfere with solving the puzzle.  If they were not hungry, but were "rewarded" with food anyway, they usually stored the food for later consumption.  Harlow concluded that the monkeys were simply curious about the puzzle.  In his view, curiosity is an aspect of intrinsic motivation, or the desire to perform an activity without the promise or prospect of reward.  This is not to say that animals are not also motivated by extrinsic considerations such as hunger and thirst, only that these are not the only rewards.  With respect to extrinsic motivators such as hunger, Harlow's monkeys learned whether they were rewarded or not.


Statistical Learning

The point of all of this is that organisms are built to learn from experience, and they do this naturally, in the ordinary course of everyday living, without requiring reinforcement, by computing the contingent probabilities among the objects and events that they observe in their environments.  This learning mechanism is sometimes known as statistical learning, because the organism samples the environment and then makes probabilistic inferences about what is going on in it -- what are technically known as the transitional probabilities from one thing to another (Aslin & Newport, 2012).

Here's an example of statistical learning in the domain of language.  As we'll see later in the lectures on Language and Communication, an early phase in language learning occurs when an infant learns to recognize the particular phonemes -- basic sound units -- and combinations of phonemes that occur in his or her native language.  Saffran, Aslin, and Newport (1996) presented eight-month-old human infants with a steady stream of speech-like sounds consisting of four randomly ordered three-syllable nonsense words, such as:

pa bi ku go la tu da ro pi ti bu do go la tu ti bu do da ro pi pa bi ku pa bi ku da ro pi go la tu ti bu do.

Note that, in such a string, the transitional probability of syllables within words (e.g., pabi within the word pabiku) is a perfect 1.0, while the transitional probability of syllables across words (e.g., tuda between the words golatu and daropi) is only 0.33.  They then tested the infants' recognition of individual words by presenting them with "legal" real words, like pa bi ku, and non-legal "part-words" like tu da ro.
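To see where such numbers come from, the following sketch estimates transitional probabilities directly from the short sample stream quoted above, by counting how often each syllable is followed by each other syllable.  The function and variable names are illustrative assumptions, and the sample is only the excerpt printed in the text, not the full two-minute stream the infants heard.

```python
from collections import Counter

# The sample syllable stream quoted in the text, split into syllables.
stream = (
    "pa bi ku go la tu da ro pi ti bu do go la tu ti bu do "
    "da ro pi pa bi ku pa bi ku da ro pi go la tu ti bu do"
).split()

# Count each syllable as the first member of a pair, and each adjacent pair.
first_counts = Counter(stream[:-1])
pair_counts = Counter(zip(stream, stream[1:]))

def transitional_probability(a, b):
    """Estimate p(next syllable is b | current syllable is a) from the stream."""
    return pair_counts[(a, b)] / first_counts[a]

within = transitional_probability("pa", "bi")  # inside the word "pabiku"
across = transitional_probability("tu", "da")  # across "golatu" -> "daropi"

print(f"pa -> bi: {within:.2f}")
print(f"tu -> da: {across:.2f}")
```

In this excerpt, pa is always followed by bi, whereas tu is followed by da on only one of its three occurrences.  A learner tracking these counts has a basis for treating pabiku as a unit and tudaro as a mere part-word.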

How do you test word-recognition in infants?  One way is to give them an artificial nipple to suck on, and measure the rate at which they do so: when they're surprised, they stop sucking for a moment.  In this experiment, the infants were placed in front of a blinking light, and changes in their looking behavior were used as an index of surprise.

Anyway, the upshot of the experiment was that, after only two minutes of exposure, the infants were able to discriminate between legal and non-legal words.  Learning occurred, and very sophisticated learning at that, just by listening to the audio stream, without any reinforcement at all.

Other experiments have shown similar learning effects with sequences of musical tones as well as syllables; and in the visual domain, as infants learned the spatial arrangements of shapes in scenes.  

And it's been shown that statistical learning extends to neonates as well as to infants.  

Moreover, infants can generalize from the stimulus materials to which they've been exposed to novel stimulus materials.  For example, infants who have been exposed to one set of pseudowords in a pattern such as dadapi or pabibi also recognized novel pseudowords arranged in the same AAB or ABB pattern, such as kikino or golala.  In other words, they acquired something like a concept or a rule that went beyond the specific instances to which they had been exposed to cover novel elements or combinations of elements.

In statistical learning, infants are doing exactly what Pavlov's dogs and Thorndike's cats and Rescorla and Kamin's rats were doing: learning the structure of the world, acquiring expectations about what goes with what and what is going to happen next, simply through observation.


The Bottom Line on Reinforcement

Learning occurs naturally in most behaving organisms.  Some species are so well adapted to their environmental niches, and their environmental niches are so stable, that they have little need (or opportunity) to learn much more than where they are likely to find food.  For other species, a capacity for altering behavior through learning is itself an important adaptation.  Through the experience of various contingencies, organisms acquire information about events in their environment, and about the outcomes of behavior.  Reinforcement merely motivates the organism to act on what it learns, in order to achieve certain outcomes, and avoid others.

Reinforcement plays a particularly limited role in language learning.  Babies do not learn their native language through trial and error, mediated by reinforcement.  Rather, they simply pick up language by being exposed to it.  Human babies seem to be innately programmed to learn natural language, merely through exposure to a linguistic community. 


Reinforcement may not be necessary for learning, but practice is.  Hardly anything is learned in a single trial, and that is especially true for complex motor and cognitive skills like playing a musical instrument or reading music.  In a famous paper, Anders Ericsson and his colleagues (1993) interviewed musicians and determined that, by age 20, the best violinists had engaged in deliberate practice for a cumulative total of more than 10,000 hours, compared to 7,800 hours for merely "good" violinists, and 4,600 hours for the least-accomplished group.  Assuming that they began playing the violin at 5 years of age, that comes to more than 666 hours per year, or nearly two hours per day, every day, week in and week out.  Findings such as these led Ericsson (2007) to conclude that "extended and intense practice" was the feature that most distinguished elite performers from "normal adults".  Ericsson's research, in turn, formed the basis of the "10,000 Hour Rule" popularized by Malcolm Gladwell in his book, Outliers (2008).  That is, it appears to take about 10,000 hours to become an expert at something.  And indeed, when you examine the histories of elite performers, 10,000 hours seems about right -- the equivalent of about 250 40-hour workweeks.

Of course, talent matters, too.  A twin study by Mosing et al. found that individual differences in musical ability -- defined as the ability to make subtle discriminations of pitch, rhythm, and melody -- had a substantial genetic component, accounting for about 50% of population variance (for more on how such calculations are made, see the lectures on Psychological Development).  Most of the remaining variance was accounted for by the nonshared environment.  Somewhat surprisingly, Mosing et al. reported that music practice had no effect on musical ability. That is to say, there was no difference in test performance between monozygotic twins who differed in the amount of musical practice (e.g., between two twins, one of whom became an orchestra musician, and the other of whom became a brain surgeon).  Interestingly, Mosing et al. also found a substantial genetic contribution to the amount of practice that their subjects engaged in, explaining about 69% of population variance.

Still further doubt on the 10,000 Hour Rule was cast by a meta-analysis of expertise studies by Macnamara et al. (2014).  These investigators surveyed a large number of studies of the effects of practice on skilled performance, covering games, music, sports, education, and professional activities.  Across 88 studies involving more than 11,000 subjects, they found that the average correlation between deliberate practice and performance was .35, explaining about 12% of total variance.  This outcome, they claim, is inconsistent with Ericsson's claim that individual differences in performance are mostly explained by individual differences in practice.
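The step from a correlation of .35 to "about 12% of total variance" is the usual squaring of the correlation coefficient.  A minimal check, with variable names chosen only for illustration:

```python
r = 0.35                     # average practice-performance correlation reported in the text
variance_explained = r ** 2  # proportion of variance shared by the two measures

print(f"r = {r}, r squared = {variance_explained:.4f}")
print(f"proportion left unexplained = {1 - variance_explained:.4f}")
```

On this arithmetic, most of the variance in performance is left to be explained by something other than the amount of deliberate practice.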

It has to be said that the claim that practice has no effect on expertise, and that all the action is in the genes -- which is what Mosing et al. expressly state in the title of their paper -- is implausible on the face of it. 


Observational Learning

Usually, we think of learning as entailing the direct experience of environmental events, organismal responses, and their outcomes.  In classical conditioning, Pavlov's dog gets the food after hearing the bell.  In instrumental conditioning, Thorndike's cats get freedom after pressing the latch.  But can animals learn from the experience of other animals?  This is the question of vicarious or observational learning.


Observational Fear Conditioning

The phenomenon of observational learning was first demonstrated convincingly in the laboratory by Susan Mineka, who was then at the University of Wisconsin (she is now at Northwestern University), in a study of snake fear in rhesus monkeys.  Rhesus monkeys born and raised in the wild are universally afraid of snakes.  This is quite adaptive: after all, the monkeys live in an environment where there are lots of deadly snakes, vipers as well as constrictors.  Therefore, traditional theory has held that the fear of snakes in rhesus monkeys is innate, programmed by evolution in much the same way that instincts are.  The only problem with the theory is that monkeys who are born and raised in laboratory conditions do not fear snakes.  When exposed to a snake, they show no signs of fear.  Therefore, it seems that snake-fear must be acquired through experience.  But, if you think about it, it's not entirely clear how you learn from experience to fear a deadly snake.  Because after the first encounter, you're dead (snakes are like that).  Therefore, Mineka proposed that monkeys acquire their fear of snakes vicariously, from observing the reactions of other monkeys when they encounter snakes.  Thus, snake fear is not innate, but a learned part of what might be thought of as "monkey-culture".

Mineka conducted an ingenious series of experiments to investigate the social learning of snake fear in rhesus monkeys.  For her test of fear, she employed a piece of equipment known as the Wisconsin General Test Apparatus (WGTA), in which the monkey is seated in a restraining chair, something like a baby's high chair, while being presented with various stimuli and making responses.  Mineka offered the monkeys a highly desirable food treat (Froot Loops are dandy for this purpose), but in order to obtain the treat a monkey had to reach past a snake or some other object.  Response latency, or the time it took the animal to reach past the object, was the measure of fear: the longer the latency, the more fear.

Mineka's initial study compared monkeys reared in the wild and in the lab in their response to various test stimuli such as real, toy, and model snakes (the real snake was a small boa constrictor), black and yellow cords, and a painted wood block.  As expected, the wild-reared monkeys were more afraid of the snakes than were the lab-reared monkeys.

For her first vicarious conditioning study, Mineka paired a (snake-phobic) wild-reared adult with a (non-snake-phobic) lab-reared adolescent (in her first study, the adult was actually the parent of the adolescent).  

In other words, the adolescents learned to fear snakes -- not from having unpleasant experiences with snakes themselves, but merely from watching an adult react negatively to them.  They learned, from observing other monkeys behave fearfully, that snakes are things to be feared.  One is reminded of "You've Got to be Carefully Taught", from the Rodgers and Hammerstein musical South Pacific (1949).  During World War II, Lt. Joe Cable has come to the base to conduct an espionage mission against the Japanese forces on a neighboring island.  He falls in love with Liat, the daughter of Bloody Mary, but despairs of gaining acceptance for their biracial love back in the United States:

You've got to be taught to hate and fear
You've got to be taught from year to year
It's got to be drummed in your dear little ear
You've got to be carefully taught

You've got to be taught to be afraid
Of people whose eyes are oddly made
And people whose skin is a different shade
You've got to be carefully taught

You've got to be taught before it's too late
Before you are six or seven or eight
To hate all the people your relatives hate
You've got to be carefully taught
You've got to be carefully taught

Link to a recording of William Tabbert singing this song, from the original Broadway cast.

Mineka performed a number of variants on this basic experiment, with increasingly sophisticated methods, to explore the parameters of observational conditioning. 

In her most fascinating experiment, Mineka discovered that, despite the central role of vicarious experience, observational learning was also constrained by preparedness.  In this study, she modified her apparatus, employing mirrors and video so that she could independently vary what the model and the target see.  For example, an adult model might see a snake, and react fearfully, while the adolescent sees a flower rather than the snake.  From the adolescent's point of view, then, the adult is reacting fearfully to the flower, not the snake.  Will an adolescent who sees such a thing subsequently show fear of flowers?

The answer is no: Vicarious fear conditioning occurs only to snakes and snakelike objects.  It does not occur to the flower.  

Vicarious or observational learning is fascinating, but it is also theoretically important, because it is another instance of learning in the absence of reinforcement.  That is, the observing animal learns, even though any reinforcement is delivered to the other animal and not to the observer.


Language Acquisition

In humans, perhaps the most powerful and dramatic example of observational learning occurs in the domain of language.  By the time they are 4 or 5 years of age, every normal human child has become a fluent speaker of his or her native language -- that is, whatever language the child's parents and others speak in his or her presence.

The acquisition of language occurs effortlessly, and it occurs without reinforcement, before children are ever formally taught the rules of grammar in elementary school (which used to be called "grammar school", after all), and get graded for learning them. It all happens by the child hearing spoken language, and connecting what is said to what is going on in the world around them.  In this sense, language acquisition is a lot like Tolman's latent learning.

By contrast, even our closest primate relatives, chimpanzees, have no ability to learn language.  They may learn some "words" in the form of symbols, spoken or visual, that represent things like bananas.  But even after years of effortful training, they have essentially no ability to use syntactical rules to form and understand meaningful sentences.  When it comes to language, the "smartest" chimpanzee can't hold a candle to the dullest human 5-year-old.

In fact, human language learning is so effortless and automatic that many linguists speculate that there is an innate capacity for language -- a "language acquisition device" that is a product of evolution, and which is a unique feature of human nature.  Knowledge of English or Chinese or Swahili or Farsi isn't innate, but the mechanism that allows children to learn these languages does appear to be.  

Put another way, language acquisition is highly prepared in humans.  Just as rhesus monkeys are highly prepared to learn to fear snakes, so human beings are prepared to learn language.  In chimpanzees, the best we can say is that language learning is unprepared, and it may even be contraprepared -- which is why chimpanzees can't learn syntax no matter how much training they receive.  

Social interaction is critical to language acquisition: without models, the child cannot learn language.  Not only does the child require exposure to spoken language (and thus to the people who speak it), but the child needs to be exposed to what others are doing, and looking at, when they speak.  You can't just play a CD of spoken English under the child's crib and expect it to learn semantics and syntax (though it will learn the basic sound patterns).  The child has to interact with other people.  And these people don't even have to speak.  Deaf children whose parents and teachers use sign language will effortlessly pick up the semantics and syntax of sign language, just as hearing children pick up whatever language their parents speak.  

And this interaction has to occur within a particular interval of time -- roughly, before the onset of puberty.  "Wild" children, who are raised in isolation from others until they reach adolescence, never really "get" language.  Within the more normal range of human experience, children who are raised in a bilingual environment -- say, with parents who speak both English and Spanish -- will effortlessly learn both languages, and speak both without an accent.  But if the learning of one language is delayed -- say, until high school or college -- it is very hard to gain facility in the second language, and the person is likely to speak it with a decided accent.  So, as with imprinting, there appears to be a critical period in language learning.

The capacity to learn language appears to be innate, a gift of human evolution.  And there is a critical period in language learning.  But despite this innate component, language acquisition requires exposure to a linguistic environment.  In this sense, it fits the true definition of learning as a change in knowledge that occurs as a result of experience.  And instead of being taught deliberately, through the direct experience of rewards and punishments, it occurs vicariously -- just by virtue of observation, without any particular reinforcement.


Social Learning Theory

As language acquisition illustrates, observational learning is particularly important in humans.  If you think about it, we do not learn all that much through the direct experience of trial and error, reward and punishment.  Rather, most of our learning comes through interactions with others.  To take a somewhat extreme example, physicians don't learn how to perform surgery by trial and error.  Rather, they learn surgery by watching experienced surgeons perform, and by being taught by them.  When a surgeon takes a scalpel to his or her first patient, he or she already knows what to do and how to do it.

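Albert Bandura, of Stanford University, argues that human social learning takes two forms: learning by example, in which we observe what others do and what happens to them as a result; and learning by precept, in which others deliberately teach us, often through language.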

Language plays a particularly important role in learning by precept, as it provides a very flexible, efficient way of communicating our thought and knowledge to others.  Humans have a far greater capacity for language than any other species, and so it is not surprising that so much of our social learning is accomplished through language.

Consciousness also plays an important role in learning by precept.  To deliberately teach someone something presupposes that you are aware of it yourself.  Without conscious awareness, there could be no conscious intent, and so no sponsored teaching of the sort that is critical to learning by precept.


Social Learning and Imitation

Although most studies of learning performed before 1950 employed lower animals such as rats, dogs, and pigeons for subjects, the ultimate object of inquiry was humans. The major theories of learning assumed, explicitly or implicitly, that the same principles of learning adduced to explain simple behavior in these species would also be found relevant to complex human behavior. This program of application to the human case was pursued most prodigiously by B.F. Skinner, in his analyses of personality and social behavior (1953) and language (1957). According to Skinner, human behavior is performed under the conditions of stimulus control. Rather than focusing on internal dispositions such as traits and motives, or cognitive constructs such as expectation, a proper analysis of personality will focus on the individual's reinforcement history, as well as on discriminative stimuli and reinforcement contingencies present in the current environment. Human behavior is complex only insofar as the stimulus conditions in which it occurs are complex.

Other investigators also took up the Skinnerian program. For example, Staats and Staats (1963) attempted to apply the principles of learning to problems in personality, motivation, and social interaction, among other topics. Their work is not exactly Skinnerian in nature, because it attempts to come to grips with certain aspects of language that are outside the scope of Skinner's analysis. Nevertheless, the list of psychologists whom they cite as the inspiration for their efforts begins with Skinner, and includes most of the major figures identified with the behaviorist analysis of learning. Staats' most recent statement of his theory, in fact, is entitled Social Behaviorism (1975). 

At the same time, it became clear that certain aspects of complex human behavior resisted conventional behavioral analysis. As one example, already discussed, language does not seem to be acquired through the principles of conditioning and reinforcement that are central to behaviorist analyses. The same is true of many human social behaviors. The problem of accounting for learning without direct experience of reinforcement ultimately led to the development of a different, cognitive theory of personality: cognitive social learning theory.

A step in this new direction was taken with the social learning theory of Miller and Dollard (1941). According to Miller and Dollard, personality consists of habits formed through learning. The learning process, in turn, is described in terms of a version of S-R learning theory proposed by Clark L. Hull. According to Hull, a habit represents a strong connection between some stimulus and some response. This association is acquired by virtue of drive-reduction: in the presence of the stimulus, the behavior has led to the satisfaction of some drive (you can see the connection to Thorndike's Law of Effect). 

Although Hull conceived of these drives as biological in nature, Miller (1951) later added the concept of acquired (or secondary) drive. That is, through conditioning some external stimuli come to possess some of the properties of an internal drive state. For example, while fear is an innate drive, elicited by noxious stimulation, it can also be conditioned to previously neutral stimuli. Habits can be learned because they lead to fear reduction (a primary drive), and also because they eliminate fear stimuli (secondary drives). Drive-reduction theory thus provides the basic elements of personality viewed as a system of habits, in the form of principles of learning. A drive is any need which activates behavior. It can be innate, or it can be acquired through experience. However, drive itself does not give any particular direction to behavior. This directionality is given by the operation of other principles. Hull's theory, like Freud's, assumes that people are motivated to maintain homeostasis, eliminating states of tension. Drive-reduction serves to reward behavior. Responses are behaviors that lead to rewards. Finally, cues are stimuli that determine the selection of responses. Thus, personality can be viewed as a system of habits acquired and maintained through drive-reduction. Individual differences in habitual responses to environmental stimulation comprise the whole of personality. 

Miller and Dollard argued that in order to understand human personality, it was necessary to understand the principles of learning. However, because the habits that comprise personality are social behaviors, it is also important to understand the social circumstances in which that learning takes place. Thus, Miller and Dollard called their approach social learning theory. In this regard, it is interesting to note that the theory represents the collaboration between Miller, a psychologist, and Dollard, a sociologist. Thus, personality becomes an interstitial field, combining different levels of analysis.

Like Skinner's stricter behavioral approach, social learning theory as stated would seem to imply that the person must have direct experience with reinforcement in order to establish habits. As noted, this is unlikely to be the case. In order to cope with this problem, Miller and Dollard postulated a drive of imitation. Imitation is a process by which similar actions are performed by two individuals in response to appropriate cues. At the start, imitation is a behavior which can be reinforced by the environment, just as other behaviors are. When rewarded regularly, however, it takes on the properties of an acquired drive. Thereafter, the individual is motivated to imitate the behavior of others -- to copy their behavior in order to obtain the same rewards that they receive from their actions. Imitation is widespread because the culture reinforces it strongly, as a means of maintaining social conformity and discipline. For this reason, although imitation is an acquired drive (and therefore optional in principle), it is almost a necessary consequence of socialization.

Miller and Dollard discussed two principal forms of imitation. In both forms, one person matches another's behavior. 

Social Learning and Expectations

Although some social-learning theorists continued to embrace the tradition of functional behaviorism into the 1960s and 1970s, the break from the behaviorist view of social learning was apparent in Rotter's Social Learning and Clinical Psychology, which appeared in 1954 (see also Rotter, 1955, 1960; Rotter, Chance, & Phares, 1972). Where Staats and Staats (1963), writing almost a decade later, were still acknowledging the primary influence of Skinner and other functional behaviorists, Rotter (1954) acknowledged the influence of no behaviorists at all. Rather, he aligned himself with the dynamic psychologist Adler and the gestalt psychologists Kantor and Lewin (see also Rotter, Chance, & Phares, 1972, p. 1). From the beginning, Rotter intended his theory as a fusion of the drive-reduction, reinforcement learning theories of Thorndike and Hull with the cognitive learning theories of Tolman and Lewin. Although Rotter's version of social learning theory often uses behaviorist vocabulary, it is with a clear cognitive twist.

In the first place, Rotter is less interested in behavior than in choice, an internal mental state which obviously manifests itself in behavior.  Rotter's cognitive-social learning theory employs three basic concepts: 

Rotter's intellectual debt to the behaviorists is clear. Rather than predicting behavior in general, the theory predicts behavior only under certain conditions. When these conditions change, the behavior is likely to change as well. Moreover, the behaviorist construct of reinforcement is central to his theory. However, Rotter's departure from the behaviorists is equally clear: whereas behaviorists such as Skinner hoped to dispense with mental constructs entirely, Rotter places them at the center of his theory. Although the behaviorists defined reinforcements objectively in terms of their effects on behavior (Thorndike's empirical law of effect), Rotter defines them subjectively: the value attached to any potentially reinforcing event is subjective, and one person's meat can be another person's poison. Moreover, whereas behaviorists defined reinforcement contingencies objectively, in terms of the contingent probability of the event given a particular response, Rotter clearly defines them subjectively, in terms of the individual's cognitive expectations. Finally, Rotter defined the situation in psychological terms, as it is experienced by the individual, and as the individual ascribes meaning to it. 
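The claim that choice depends jointly on subjective expectancy and subjective value can be made concrete with a small sketch. It is only an illustration: the multiplicative combination rule, the names Option, behavior_potential, and choose, and all of the numbers are assumptions made for this example, not Rotter's own formulation.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One available behavior, as a particular person sees it (hypothetical structure)."""
    name: str
    expectancy: float  # subjective probability (0 to 1) that the behavior yields the outcome
    value: float       # subjective value of that outcome for this person


def behavior_potential(option):
    # Assumed combination rule for this sketch: expectancy multiplied by value.
    return option.expectancy * option.value


def choose(options):
    # The person is assumed to pick the behavior with the highest potential.
    return max(options, key=behavior_potential)


# Two people face the same situation with the same expectancies,
# but attach different subjective values to the outcome of speaking up.
person_a = [Option("speak up in class", 0.5, 8.0), Option("stay quiet", 0.9, 2.0)]
person_b = [Option("speak up in class", 0.5, 2.0), Option("stay quiet", 0.9, 2.0)]

print(choose(person_a).name)
print(choose(person_b).name)
```

Because the values are subjective, the same objective situation leads the two hypothetical people to different choices, which is the sense in which one person's meat can be another person's poison.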


Cognitive Social Learning Theory

Rotter labeled his approach a social learning theory, and employed some of the concepts and principles of reinforcement theory in it. Nevertheless, his approach is less a theory of learning than it is a theory of choice. That is to say, Rotter is primarily concerned with how expectancies and values govern the choices we make among available behaviors. However, the theory has relatively little to say about how those expectancies, values, and behavioral options are acquired -- except to say that they are acquired through learning. It remained for another social learning theorist, Albert Bandura (Bandura, 1971, 1977, 1985; Bandura & Walters, 1963) to add to the concept of expectancies an explicit theory of the social learning process. Like Miller and Dollard, Bandura stressed the role of imitation in social learning. However, his concept of imitation departs radically from theirs in that it no longer functions as a secondary drive. By emphasizing cognitive processes over reinforcement, observation over direct experience, and self-regulation over environmental control, Bandura took a giant step away from the behaviorist tradition and offered the first fully cognitive theory of social learning.

Bandura's behaviorist roots are seen most clearly in his earliest statement of social learning theory, Social Learning and Personality Development (Bandura & Walters, 1963). On the surface, this book seems to draw heavily on Skinnerian analyses of instrumental conditioning. For example, there is a great deal of attention paid to the role of reinforcement schedules in the maintenance of behavior. Bandura and Walters argued that most social reinforcements are delivered on an intermittent schedule, and that most social systems operate on some combination of fixed- and variable-interval schedules of reinforcement. Family routines such as dining, parent-child interactions, shopping trips, and the like occur in a relatively unchanging cycle. Insofar as these activities can take on reinforcing properties, then, they are delivered on a fixed-interval schedule: the child cleans his plate at dinnertime during the week, and then gets to sit on his mother's lap during the family television hour on Saturday night. Other social reinforcements, however, seem to be delivered on a variable-interval schedule. When a child seeks her mother's attention, she may get it immediately, or at some time in the future when her mother doesn't have her hands full. Still other situations seem to involve the differential reinforcement of high or low rates of behavior. If a father pays attention to his child only when she kicks and screams, he is virtually guaranteeing that she will misbehave when she wants attention.
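The difference between the two interval schedules described above can be sketched in a few lines of code. This is a minimal simulation under stated assumptions: time is measured in arbitrary units, a response is reinforced only if enough time has passed since the last reinforcement, and the variable interval is drawn uniformly around its mean. The function names and the uniform distribution are choices made for this illustration, not part of Bandura and Walters' account.

```python
import random


def fixed_interval(response_times, interval):
    """Return the response times reinforced on a fixed-interval schedule.

    response_times must be in ascending order.
    """
    reinforced, last = [], 0.0
    for t in response_times:
        # The first response after the fixed interval has elapsed is reinforced.
        if t - last >= interval:
            reinforced.append(t)
            last = t
    return reinforced


def variable_interval(response_times, mean_interval, rng):
    """Same idea, but the required wait changes after every reinforcement."""
    reinforced, last = [], 0.0
    # Assumption: waits are uniform between 0.5 and 1.5 times the mean.
    wait = rng.uniform(0.5 * mean_interval, 1.5 * mean_interval)
    for t in response_times:
        if t - last >= wait:
            reinforced.append(t)
            last = t
            wait = rng.uniform(0.5 * mean_interval, 1.5 * mean_interval)
    return reinforced


responses = list(range(1, 31))  # one response per time unit
fi = fixed_interval(responses, 3)
vi = variable_interval(responses, 3, random.Random(0))
print("fixed interval:   ", fi)
print("variable interval:", vi)
```

On the fixed schedule the reinforced responses fall at perfectly regular times, like the weekly family television hour; on the variable schedule the spacing is unpredictable, like a mother's attention that arrives whenever her hands are free.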

For a number of reasons, Bandura and Walters argued, most social reinforcements are dispensed on complex schedules combining variable ratios and variable intervals. In some respects, this complexity reflects the unreliability of social reinforcement. Often, the reinforcing agent is simply not present when the target behaviors occur -- in such a case, reinforcement must be deferred to a later time. And because humans are not automated machines, they will sometimes simply fail to deliver reinforcements that are due. Perhaps more important, the complexity of social reinforcement schedules reflects the complexity of social demands. It is rarely enough simply to perform a certain social behavior: it must be done in a particular way. A child asked to set the dinner table will not be rewarded simply for piling dishes and utensils; the forks have to be on the left side of the plate, and the blade of the knife turned inward. As Bandura and Walters note, effective social learning entails both adequate generalization and fine discriminations.

Social learning is also complex because of the wide variety of factors that affect the effectiveness of social reinforcements. For example, Bandura and Walters noted that children with strong dependency habits (note the phrase) are more susceptible to social reinforcement. Moreover, the prestige of the reinforcing agent is important, as is the match between the person and the agent on such attributes as gender. The person's internal states of deprivation, satiation, and emotional arousal are also important. The point is that social reinforcement is complex but not chaotic or haphazard. Social behavior is maintained by virtue of schedules of reinforcement, even if the precise nature of that schedule is sometimes hard to discern.

Although Miller's theory of displacement gained impressive support from analyses of animal behavior, Bandura and Walters were critical of its application to the case of human social behavior. For example, they argued that deliberate social learning also played a role in displacement. Thus, parents often direct their children's aggressive behaviors towards some targets rather than others, and displacement itself is maintained by contingencies of reinforcement. Clear examples of this may be found in scapegoating and other examples of prejudice towards minorities and other outgroups. By and large, these sorts of aggressive behaviors are not simply selected by the vicissitudes of the generalization gradient. Rather, children get their prejudices from their parents: as Rodgers and Hammerstein wrote in South Pacific, "You've got to be carefully taught" whom to hate and fear.

While agreeing on the importance of reinforcement in the control of behavior, Bandura and Walters differed most from their behaviorist predecessors over the manner in which behavior was acquired in the first place. Taken at their word, Skinner and other functional behaviorists actually appear to deny that new behaviors are learned at all. Rather, responses already in the organism's repertoire come to be elicited by certain environmental cues by virtue of the law of effect. What are acquired are new patterns of behavior, by virtue of shaping and successive approximations. That is, a piece of behavior is synthesized from more elementary behaviors already in the organism's repertoire. Bandura and Walters, while agreeing that shaping procedures can be effective, doubted that they were responsible for the acquisition of most complex human social behaviors. Like Miller and Dollard, Bandura argues that social learning is largely mediated by imitation.

On the basis of anthropological studies as well as informal observation, Bandura and Walters argued that socialization -- the acquisition of socially sanctioned beliefs, values, and patterns of behavior -- was largely mediated by imitative learning. In some cultures, for example, young boys and girls are provided with miniature replicas of the tools used by their parents, and they spend a great deal of time tagging along with their parents practicing their use -- thus preparing for their adult roles. Similarly, children in the United States (and other developed societies) are given toys that they can use to imitate adult behavior. In this way, for example, children in all cultures acquire behaviors consistent with the occupational roles deemed appropriate by their culture for persons of their gender.

Gender-role socialization is far from the only example of learning by imitation. In some tribal cultures, children even obtain their sex education by watching adults engage in various aspects of mating behavior. Certain aspects of language acquisition, such as the meanings and pronunciation of words, are learned largely through observation and imitation of other people. In addition, certain complex motor and cognitive skills appear to be acquired in this manner. Medical residents do not learn to perform surgery through a trial-and-error process. Rather, they learn by watching skilled practitioners operate, and by reading about the procedures in textbooks. In a very real sense, a surgeon knows how to do surgery before he or she ever puts a scalpel to a patient -- that is, before there can be any direct experience of trial and error. On a more mundane level, driver education courses in high schools make sure that students have acquired basic skills in handling an automobile before they ever take to the road.

In tribal cultures, parents and older siblings are probably the models for most imitation. They are, after all, the primary agents of socialization. However, this purpose may also be served by exemplary models sanctioned by the parents: children are constantly being encouraged to emulate various national heroes and mythological figures, as well as the children next door. In technologically advanced societies, models for imitation are provided by books, television, movies, and other media as well as by real life. One of the sources of the constant controversy over children's television viewing concerns the kinds of models presented to children in cartoons and action series. A major function of written and oral language is this kind of cultural transmission. By virtue of linguistic communication, we can tell someone what to do in a particular situation -- describe the behavior, and indicate when it should be performed -- instead of letting the person discover the relations between cues, acts, and outcomes for him- or herself. For this reason, social learning by imitation is highly efficient. In a complex, highly developed society, it also seems necessary.

While agreeing with Miller and Dollard that imitation is an important source of social learning, Bandura and Walters took issue with the theory that imitation -- either as a general tendency or of a specific act -- is acquired through reinforcement. For example, developmental studies show that children imitate others before they ever are reinforced for doing so. Very young infants, up to about four months of age, engage in pseudoimitation, in which they repeat some simple act (like babbling) displayed by their caretaker. However, this imitation will not occur unless the infant him- or herself had just recently performed the same act. Somewhat older infants will engage in genuine imitation of others, in circumstances where they have not just performed the same act themselves. The extent to which such imitation occurs depends on the degree to which the child's sensorimotor operations have developed. For example, children cannot reliably stick out their tongues in imitation of adults until they have acquired some mental representation of their facial anatomy (Piaget, 1951; but see Meltzoff & Moore, 1977). Children are not reinforced for this: it simply happens, apparently as a reflection of an innate tendency to do so.

Even imitation of specific behaviors is not learned by virtue of reinforcement. The behaviorist model of imitation involves three elements: a discriminative stimulus (Sd) that serves as a cue, the response of imitating the model (R), and the reinforcing stimulus (Sr). By virtue of the law of effect, repeated reinforcement of the imitative behavior will make that behavior more likely to occur. However, a classic experiment on aggression by Bandura (1962) shows that this is not the case. Children watched a film in which a model displayed novel aggressive behaviors (that is, behaviors not previously in the children's repertoires) towards a "Bobo the Clown" doll. In one condition, the model was punished for this behavior; in another, he or she was rewarded; in a third condition, there were no consequences to the behavior of any sort. In a later test, children who viewed the punished model showed less imitative aggression than those who viewed the rewarded model; interestingly, those who viewed the unreinforced model displayed the same amount of aggression as those who saw the model rewarded. This first test was performed under conditions of no incentive. In a second test, the children were promised a reward for imitating the model: under these circumstances, the group differences disappeared. Thus, novel aggressive behaviors were acquired by the children even though they were not reinforced for imitating the behavior. However, the performance of these behaviors was under reinforcement control: those who saw the model punished were less likely to engage in the behaviors themselves, until instructed that the reinforcement contingencies had been changed.
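The logic of this experiment, in which acquisition and performance come apart, can be expressed as a toy model. The sketch below is a deliberate simplification and not a description of Bandura's procedure or data: the Observer class, its method names, and the all-or-none performance rule are assumptions made for illustration (in the actual study the punished-model group showed less imitation, not none).

```python
class Observer:
    """Hypothetical child who learns by watching a model."""

    def __init__(self):
        self.repertoire = set()   # behaviors acquired through observation
        self.expected = {}        # behavior -> consequence seen for the model

    def watch(self, behavior, consequence):
        # Acquisition: the behavior is stored whatever happened to the model.
        self.repertoire.add(behavior)
        # The observed consequence becomes an expectation about outcomes.
        self.expected[behavior] = consequence  # "reward", "punish", or "none"

    def performs(self, behavior, incentive_offered=False):
        # Performance: depends on expected consequences, not on acquisition.
        if behavior not in self.repertoire:
            return False
        if incentive_offered:
            return True  # a promised reward overrides the expected punishment
        return self.expected[behavior] != "punish"


children = {}
for condition in ("reward", "punish", "none"):
    child = Observer()
    child.watch("hit doll with mallet", condition)
    children[condition] = child

for condition, child in children.items():
    print(condition,
          child.performs("hit doll with mallet"),
          child.performs("hit doll with mallet", incentive_offered=True))
```

All three simulated children have the behavior in their repertoire, so the difference between them lies entirely in whether they perform it, and that difference vanishes once an incentive is offered.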

In a later statement, Bandura (1977) argued that there are two forms of learning. Learning by response consequences is the kind of trial-and-error acquisition of knowledge familiar from the operant behaviorism of Skinner. However, this learning is given a cognitive emphasis. Direct experience provides information concerning environmental outcomes and what must be done to gain or avoid them. As a result, the person forms mental representations of experience that permit anticipatory motivation and behavioral self-control. Modeling involves learning through vicarious experience -- by observing the effects of others' actions. While a term such as "modeling" encompasses learning through example, Bandura also uses it to cover learning through precept -- deliberate teaching and learning, often mediated by linguistic communication.

Although Bandura goes beyond Rotter in discussing the process of social learning, his analysis of performance is similar to Rotter's in many respects. That is, Bandura agrees that the person's behavior is governed primarily by his or her expectancies concerning the future. Our responses to various situations are governed by information we possess concerning forthcoming events, and the outcomes of our actions. These expectancies are formed, respectively, through processes resembling classical and instrumental conditioning -- except that conditioning is given an active, cognitive interpretation as opposed to the conventional passive interpretation in terms of the laws of practice and effect. Moreover, conditioning is not the only -- or even the most important -- way that these expectancies can develop. Rather, they can be acquired vicariously through precept and example.

Expectations before the fact are, of course, subject to revision by the information gained subsequently. The actual consequences of an environmental event, for example, or of a person's actions, serve to confirm or revise the person's expectations. These consequences can be directly experienced by the person in question, or they may be experienced vicariously through observation or symbolic mediation. Moreover, in discussing the consequent determinants of behavior, Bandura stresses the role of aggregate as opposed to momentary outcomes. In his view, people are more influenced by what happens in the long run than by minor setbacks, delays, and irregularities. In large part, this is due to the cognitive capacities of humans, whose powerful memories permit them to transcend even long intervals, and integrate information from different points in time.

A unique feature of Bandura's social-learning theory is the active role played by the self. Behaviorist doctrine, of course, eschewed any reference to the self as an active organizer of experience or agent of action. Such talk was banned as mentalistic and ultimately beyond the pale of science. Insofar as the self was discussed at all, it was as (in Skinner's terms) a system of responses. As a cognitive theorist, however, Bandura (1977) permits the self to take an active, executive role in the regulation of behavior. In this way, the self plays a role as both an antecedent and a consequent determinant of behavior.

In the cognitive view offered by Tolman and by Rotter, outcome expectancies are vitally important determinants of behavior. That is, we tend to engage in behaviors that we expect will lead to outcomes we desire, and prevent outcomes we dislike. Bandura agrees that outcome expectancies are important. However, he has also added a new concept: self-efficacy expectations (Bandura, 1977, 1978). While it is obviously important that the individual expect that a particular behavior will lead to a certain outcome, it is equally important that the person have the expectancy that he or she can reliably produce the behavior in question. Note that the actual state of affairs is irrelevant here. It does not matter whether the person can, in fact, perform some particular action. What matters is whether the person thinks he or she can. Self-efficacy expectations are conceptually similar to the sense of mastery, and have important motivational properties, in that they determine whether the person will even attempt the behavior in question.

An example of self-efficacy can be found in the literature on learned helplessness. As a rule, dogs placed in a shuttlebox will acquire escape and avoidance responses fairly readily, shuttling back and forth in response to stimuli signaling forthcoming shock. However, dogs who have first received classical fear conditioning are retarded in learning escape and avoidance. In some instances, they simply sit and take the shock passively. Learned helplessness can also be produced in humans. For example, subjects who have been exposed to unsolvable anagram problems are retarded in completing subsequent problems that are solvable. Although the learned helplessness effect is quite complex, it appears to involve the subject's belief that he or she cannot master the situation. In fact, that is objectively not the case: the shock in the shuttlebox is avoidable, and the dog has in his repertoire the necessary behavior; the second set of puzzles is solvable, and the student has the intelligence to solve them. Yet, experience has taught the subject to believe otherwise (if we can speak of beliefs in lower animals), and this belief controls behavior.

Self-efficacy can serve as an example of how antecedent expectations develop through social learning. Obviously, one source of self-efficacy is performance accomplishments: the personal experience of success and failure. Repeated failure experiences will lower the person's expectancy that he or she can effectively control outcomes. But the same sorts of expectancies can be generated through vicarious experience. Observing other people's success or failure will lead to appropriate expectations about oneself -- at least to the degree that one perceives oneself to be similar to those other people. But perceived self-efficacy can also be shaped in the absence of any experiential basis whatsoever, merely through verbal persuasion. A person who is repeatedly told that he or she is incapable of accomplishing some goal, especially if that information comes from an authoritative source, may actually come to believe it about him- or herself. Perceived self-efficacy can also change on a moment-to-moment basis, depending on the person's emotional state. Feelings of elation may increase feelings of mastery (sometimes beyond all reason, as in the megalomania of a manic patient), while anxiety or depression may reduce them. Finally, self-efficacy can vary from one situation to another. Even though a person has not encountered a particular problem before, he or she may have a high degree of self-efficacy if it closely resembles some other problem that the person has been able to master in the past.

Another way in which Bandura departs radically from the behaviorist analysis of social learning is by embracing the concept of self-reinforcement. Recall that Skinner objected to self-reinforcement on the ground that it was ineffective as a means of behavioral control. However, Bandura acknowledged that people can effectively regulate their own behavior in the absence of, or in opposition to, schedules of external reinforcement. For example, a run-of-the-mill jogger can reward herself for finishing in the top half of a local road race, even though she will never get a medal for her performance. Alternatively, a college professor may feel remorse about flunking a student, even though he receives praise from his dean for upholding academic standards. It is so common to find writers, painters, and composers pursuing their own vision even though they are denied any professional recognition, that the image of the starving artist has become part of our cultural mythology. By means of goal-setting and self-reinforcement, people can free themselves from environmental control. This independence of the person from environmental control distinguishes Bandura's social learning theory from its behaviorist forebears.

In principle, self-reinforcement frees people from external control. As a practical matter, however, the essential first step in self-regulation, setting the standard, tends to be based on imitation. That is, we set standards for ourselves that are similar to those set for themselves by those we admire. These models may be our parents, teachers, or spiritual leaders. However, models may also come from other sources, such as books, films, and media. One important consequence of literacy, coupled with free access to books and magazines, is that we encounter potential models whose standards may be quite different from those of the people whom we would otherwise meet. Modeling our standards on those individuals is another way in which we free ourselves from the constraints of our local social environment.

In addition to standard-setting, Bandura postulates three other component processes in self-regulation. The person must monitor his or her own performance, and evaluate it according to the standard set for him- or herself. The dimensions on which the performance is evaluated can vary widely, as can the precise standards. Very often, the individual will measure him- or herself against actual or assumed population norms; or, some single individual will serve as the standard of comparison; in other circumstances, the standard will be set by the person's own previous behavior. It is important, of course, not to set standards that cannot be met. Research in a variety of domains, from academic achievement to weight loss, indicates that people should set goals for themselves that are clearly specified, and of only moderate difficulty. Vague or ambiguous goals, of course, are not goals at all. Setting an unattainable goal obviously has motivational drawbacks, while setting a goal that is too easy to accomplish will yield little or no satisfaction in its accomplishment. (It should be noted that the same considerations apply to goals set by others, as when parents enforce standards for their children's behavior.)

Once the evaluation has been made, the person will reinforce his or her performance appropriately. These consequences come in two forms, tangible and symbolic. A student may reward herself with a movie for acing an exam, or punish herself by canceling a date for failing one; or she may just praise or censure herself. The effectiveness of self-praise or self-reproach, in the absence of tangible consequences, is currently subject to considerable debate. However, research clearly shows that people -- even young children -- who fail to meet their own performance standards will deny themselves reward. Apparently, such internal states as self-esteem and self-efficacy have their own motivating properties. While behavior that is controlled only by external contingencies will be unreliable in the absence of those contingencies, our selves are always with us. Thus, in principle self-reinforcement should lead to more effective behavioral regulation, because it is less subject to situational variation.

Moreover, human intelligence and consciousness permit us to project the consequences of our actions far into the future. Traditional behavioral theories, of course, assert that present behavior is under the control of past events, and that future prospects that have no parallel in the past are very weak determinants of behavior. However, this is clearly not the case. The emergence of political movements supporting environmental protection and nuclear disarmament is a clear example of the control of behavior by the future. We have had no experience of the greenhouse effect or nuclear winter, but the prospect of them in the future leads us to try to protect the ozone layer, and reduce the number of nuclear warheads, today. The behaviorist analysis of future determinants is largely correct when it is applied to lower animals, with their limited cognitive capacities. Bandura's openness to such determinants is another mark of the extent to which social learning theory has embraced cognitivism, and abandoned its behaviorist roots.


Social Learning as the Cognitive Basis of Culture

Social learning is the cognitive basis of culture, which anthropologists define as the customary beliefs, social forms, and material traits of a racial, ethnic, or social group, transmitted through informal learning and formal training from one generation to the next.  This intergenerational transmission cannot be accomplished through the genes: there is no inheritance of acquired characteristics.  Instead, it must be accomplished by learning -- which is to say, social learning, through example and precept.  It is through social learning, both through informal modeling and in formal institutions (such as schools and libraries) organized for the purpose, that a culture passes down its knowledge, beliefs, and attitudes from one generation to the next.  In this way, each generation builds on the advances made by those who went before, and doesn't have to start "from scratch".

This raises the question of whether nonhuman animals have "culture" as well.  Observations of animals behaving in their natural environments suggest that animals do indeed learn vicariously from observing the experiences of others, and in this respect possess sets of cultural traditions that are passed from one generation to the next.

It is an open question whether individuals can learn from watching animals of other species.  But such observations certainly leave open the possibility of learning vicariously, taking others as models for one's own behavior.  In that sense, at least some nonhuman species have at least the rudiments of culture.

Along with consciousness, language, and culture, the capacity for learning, and especially for social learning, is one of the greatest gifts of evolution to the human species.

For More on Social Learning, Go to the Appendix: The Evolution of Cognitive Social Learning Theory


The Nature of Learning

Behaving organisms are not just machines, operating by reflex, taxis, or instinct.  Rather, even organisms with very simple nervous systems are able to modify their behavior in accordance with what they have learned.  Much learning can be described in terms of classical and instrumental conditioning, and combinations thereof.  But not all learning is of this sort: language learning is a particularly salient example of learning merely through exposure to others, without any reinforcement.   

What is learned is not a simple connection between stimulus and response.  Rather, the learning organism forms a mental representation of the world and its relation to it: of objects, events, its own behavior, and the contingent relations between them. 

In light of modern experiments on predictability, controllability, and social learning, we should revise our definition of learning. 

We cannot understand learning solely by focusing on events outside the organism, tracing connections between stimuli and responses, and treating the organism as if it were empty.  Rather, we must go inside the "black box", to see how the mind is structured, and how its structures operate.  We need to understand the principles by which information about the world is acquired through sensation and perception, retained through memory, transformed through thought, and communicated by language.  These matters are the province of cognitive psychology.

For a comprehensive survey of the psychology of learning, see The Psychology of Learning and Behavior by B. Schwartz and S.J. Robbins (Norton, 1978), and subsequent editions.  The most up-to-date of these is Learning and Memory by B. Schwartz and D. Reisberg (1991).

For a  thorough discussion of behaviorism, see Behaviorism, Science, and Human Nature by B. Schwartz and H. Lacey (Norton, 1982).  

For a comprehensive survey of theories of learning, see the various editions of Theories of Learning by E.R. Hilgard and G.H. Bower (1st ed. by E.R. Hilgard, published by Appleton-Century-Crofts, 1948; 5th ed. by G.H. Bower and E.R. Hilgard, published by Prentice-Hall, 1981).


From Animal Learning to Animal Memory

The fact that animals can learn means that they have a capacity to encode, store, and retrieve memories.  But the sorts of memories implied by classical and instrumental conditioning represent semantic and procedural knowledge:
  • Bells are followed by food.
  • Tones are followed by shock.
  • If I press this lever, I'll get out of this puzzle box.
  • If I press this bar, I'll get a piece of rat-chow. 

Mostly, however, when we think about memory we mean episodic memory, which raises the question: can nonhuman animals have episodic memory, in the sense of an ability to remember specific experiences as such?  Some theorists (like Tulving, 1983) think not -- that the ability to remember specific episodes of experience is a uniquely human faculty.  But we've long since learned to accept the Darwinian principle of evolutionary continuity, so it would not be surprising if at least some nonhuman species, most likely primates or other mammals, had the ability to remember specific episodes in their lives.

Let's first define the terms.  An episodic memory is a memory for an episode -- an event with a unique location in space and time.  So, at the very least, an episodic memory has to have been encoded after a single experience.

By this standard, any example of one-trial learning -- such as the one-trial step-down passive avoidance learning often used in animal models of traumatic retrograde amnesia (e.g., Miller & Marlin, 1979) -- might count as episodic memory.  In this paradigm, a rat is perched on a platform above a floor grid which is wired to deliver an electric shock.  If the animal steps down (and it always does), it gets a foot-shock, at which point it jumps back up onto the platform and won't step down again.  It has learned the association between floor and shock in a single trial, and it passively avoids further shock by refusing to step down.  (If the rat receives electroconvulsive shock immediately after jumping back up, it will step back down onto the floor as if nothing had happened, apparently amnesic for the shock experience.)

Now, it might be that the rat remembers the specific experience of getting shocked when it stepped down onto the floor -- in which case the memory might count as episodic.  Alternatively, it might be the case that the animal has acquired more generic knowledge that the floor delivers foot-shock -- in which case we're talking about something more like semantic memory, or abstract knowledge about the world.  A human analogue would be source amnesia, in which a subject remembers factual knowledge acquired during a learning session, but not the learning session itself.  So, the occurrence of one-trial learning isn't enough to qualify as an animal model of episodic memory.

So, returning to our definition of episodic memory, it seems that, at a minimum, an episodic memory has to contain information about the target event, as well as information about the time and place at which it occurred.  Call it a what-where-when structure (Tulving, 1972).  It is this W-W-W structure that makes the verbal-learning paradigm a model of episodic memory: subjects must remember what words were on a particular list studied at a particular time and in a particular place.  So, a successful animal model of episodic memory would have to demonstrate, at a minimum, that an animal remembers not just what happened, but also where and when it happened. 

Such a model was introduced by Clayton & Dickinson (1998) based on cache-recovery behavior in scrub jays (note: not a primate or even mammalian species!). 

  • The birds were allowed to cache two different foods for later consumption: wax worms, a preferred food which decays relatively quickly, and peanuts, a less-preferred food which does not. 
  • The foods were cached in different locations.
  • After a short or long retention interval, the birds were allowed to retrieve the food.
  • After the short interval, the birds went for the worms; after the long interval, they went for the peanuts.

This sort of experiment, which has been repeated many times in various species (including rats), seems to indicate that the animals have the ability to remember what was cached, where it was cached, and when it was cached -- thus meeting the minimal requirements for an episodic memory.
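
The logic of the what-where-when design can be made concrete with a small sketch. The Python below is only an illustration, not a model taken from Clayton and Dickinson: the record type `CacheMemory`, the function `choose_cache`, the preference scores, and the 24-hour decay threshold are all hypothetical choices made for the example. The point is simply that the bird's choice cannot be computed from "what" and "where" alone; the "when" field is needed to decide whether the worms are still worth retrieving.

```python
from dataclasses import dataclass

# Hypothetical threshold: hours after which cached worms are assumed to have decayed.
WORM_DECAY_HOURS = 24.0


@dataclass(frozen=True)
class CacheMemory:
    what: str          # food item cached, e.g. "worm" or "peanut"
    where: str         # label for the cache site
    when_hours: float  # time of caching, in hours on an arbitrary clock


def choose_cache(memories, now_hours):
    """Return the remembered cache that is most worth searching right now."""
    def value(memory):
        elapsed = now_hours - memory.when_hours
        if memory.what == "worm":
            # Preferred food, but worthless once it has decayed.
            return 2 if elapsed < WORM_DECAY_HOURS else 0
        # Peanuts are less preferred, but they do not decay.
        return 1
    return max(memories, key=value)


# Both foods are cached at the same time, in different places.
memories = [
    CacheMemory("worm", "tray, left side", 0.0),
    CacheMemory("peanut", "tray, right side", 0.0),
]

# Illustrative short and long retention intervals.
short_choice = choose_cache(memories, now_hours=4.0)
long_choice = choose_cache(memories, now_hours=124.0)
print(short_choice.what, "at", short_choice.where)
print(long_choice.what, "at", long_choice.where)
```

If the `when_hours` field is removed from the record, the function has no basis for switching from worms to peanuts, which is why the switch in the birds' behavior is taken as evidence that the time of caching is part of what they remember.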

But maybe episodic memory requires more than this.  Remember James's definition of secondary memory:

Memory requires more than a mere dating of a fact in the past.  It must be dated in my past.  In other words, I must think that I directly experienced its occurrence.

This feature of "reminiscence, recollection, reproduction, or recall" is necessarily subjective, and would seem to be ruled out by the fact that we simply have no way of knowing what the subjective experience of remembering is like for subjects who can't talk to us about their introspections.  Which is one reason why Clayton and others refer to "episodic-like memory".

Related to this is Tulving's notion that episodic memory represents mental time travel (MTT), or traveling back in time to relive a prior episode.  Tulving (2005) now believes that this self-referential autonoetic experience is the real hallmark of episodic memory -- and that the ability to mentally travel backward in memory is also related to our ability to project ourselves, mentally, into the future.  And he also believes that this ability -- MTT in either direction -- is uniquely human. At the same time, we've known since Tolman, and certainly since the cognitive revolution in animal learning (Rescorla, Seligman, Kamin, and the others) that animals form expectations during both classical and instrumental conditioning.  And the very idea of expectations implies some ability to anticipate the future.

"Episodic Memory" in Animals

For a recent overview of this research, see:
  • Crystal, J.D.  (2010).  Episodic-like memory in animals.  Behavioural Brain Research, 210, 235-243.
  • Roberts, W.A.  (2002).  Are animals stuck in time?  Psychological Bulletin, 128, 473-489.
  • Suddendorf, T., & Corballis, M.C.  (2007).  The evolution of foresight: What is mental time travel, and is it unique to humans?  Behavioral & Brain Sciences, 30, 299-313.


This page last revised 09/16/2014.