
Social Perception

For background on sensation and perception, see the General Psychology Lecture Supplements on Sensation and Perception.

Social cognition is the study of the acquisition, representation, and use of social knowledge -- in general terms, it is the study of social intelligence.

A comprehensive theory of social cognition must contain several elements (Hastie & Carlston, 1980; Kihlstrom & Hastie, 1987):

The first two of these facets have to do with perception and attention -- the processes by which knowledge is acquired.


The Origins of (Social) Knowledge

Or are they?  Actually, a prior question is: Where does social knowledge come from?

When asked about knowledge in general, psychology and cognitive science offer two broad answers -- and a third, correct answer.

Historically, psychology has come down on the side of empiricism.  There are exceptions, of course: some aspects of our linguistic knowledge seem to be innate (Chomsky called his approach to language "Cartesian linguistics").  But in general, psychologists hold the view that we learn what we know, acquiring knowledge through experience.  For that reason, historically, scientific psychology began with an analysis of elementary processes of sensation and perception.

And so it is with social cognition: it, too, begins with social perception:

We can begin by distinguishing between sensation and perception.

Sensation has to do with the detection of stimuli in the environment -- in the world outside the mind, including the world beneath the skin.  To make a long story short: Sensory processes:

For example, the rods and cones in the retina of the eye convert light waves emitted by an object into neural impulses that flow over the optic nerve to the occipital lobe.

Perception gives us knowledge of the sources of our sensations -- of the objects in the environment, and of the states of our bodies: What objects are in the world outside the mind, where they are located, what they are doing, and what we can do with them.  Put another way, perception assigns meaning to sensory events.

In many respects, sensation is not an intelligent act.

By contrast, perception is the quintessential act of the intelligent mind.  Perception goes beyond the mere pickup of sensory information, and involves the creation of a mental representation of the object or event that gives rise to sensory experience.  In order to form these mental representations, the perceiver (in the lovely phrase of Jerome Bruner, a pioneering cognitive psychologist) "goes beyond the information given" by the stimulus, combining information extracted from the current stimulus with pre-existing knowledge stored in memory, employing processes of judgment and inference.

In cognitive psychology, there are basically two views of perception.

By any standard, the constructivist view dominates research and theory on perception.  But, as we will see, the ecological view also finds its proponents. 

Link to the "General Psychology" lecture supplements on Sensation and Perception.



The study of social perception begins with an analogy between social and nonsocial objects.  The study of social perception assumes that any person is an object who has an existence independent of the mind of the perceiver.  Accordingly, the perceiver's job is to extract information from the stimulus array to form an internal, mental representation of the external object of regard.

The term person perception was introduced by Bruner and Tagiuri (Handbook of Social Psychology, 1954) to reflect the status of persons as objects of knowledge.  As with any other aspect of perception, they argued that a number of factors influence perceptual organization:

Jerome Bruner and the "New Look" in Perception

Bruner was a pioneering cognitive psychologist and cognitive scientist.  Among his notable accomplishments was the introduction of what he called a "New Look" in perception, which sought to redirect perception research from an analysis of stimulus features to an analysis of the perceiver's internal mental states.  Although perception is obviously a field of cognitive psychology, to some extent Bruner's New Look was influenced by psychoanalysis, as when he argued that emotional and motivational processes interacted with cognitive processes -- so that, in some sense, our feelings and desires affect what we see.

Our impressions of other people are typically represented linguistically, often as trait adjectives.  Consider a survey by the Washington Post, which asked respondents to describe in three words the various candidates for the Democratic and Republican presidential nominations. The most frequent responses, arranged as "tag clouds" in which the font size represents the frequency with which the word was used, looked like these:

[Tag-cloud images for the candidates, including Edwards, Giuliani, Huckabee, Obama, Romney, and Thompson.]

Here's a similar survey, conducted over Facebook by the Daily Beast, a journalism website (with, perhaps, a somewhat liberal bent), following the February 2012 Republican presidential debates.  During the debate, CNN correspondent John King had asked each of the candidates to describe themselves in one word.  The Daily Beast polled subscribers to its Facebook page with the same question, resulting in the following word clouds.  For good measure, they also asked their Facebook subscribers to describe Barack Obama, who was unopposed for the Democratic nomination.

In 2015, in the run-up to the 2016 election, YouGov.com, a global online community that promotes citizen participation in government, conducted a similar survey in which visitors to the organization's website were asked to characterize some of the leading presidential candidates in one word (at the time, there was some indication that Mitt Romney would join the race).  Separate word clouds were constructed from the responses of people who liked and disliked each candidate.

Often, the stimulus information for person perception also comes in verbal form, as a list of traits and other descriptors.  This is certainly the case with the self-descriptions that appear in "personals" ads in newspapers and magazines.  But it is also true when we describe other people.  Consider this passage from the Autobiography of Mark Twain, in which the author describes the countess who owned the villa in Florence where he and his family stayed in 1904:

"excitable, malicious, malignant, vengeful, unforgiving, selfish, stingy, avaricious, coarse, vulgar, profane, obscene, a furious blusterer on the outside and at heart a coward."

Or, this description of Osama Bin Laden, which the former chief of the CIA's "Bin Laden Issues Station" has endorsed as a "reasonable biographical sketch" of the man.



When Fiske and Cox (1979) coded people's open-ended descriptions of other people, they identified six major categories:

Because verbal lists of traits are easy to compose and present to subjects, many studies of person perception begin with traits, and proceed from there.  This is reasonable, because so much of our social knowledge is encoded and transmitted via language.  


The Asch Impression-Formation Paradigm

Actually, the study of person perception began before 1954, with the work of Solomon Asch (1946).  Like Lewin (about whom you've already heard), and Fritz Heider (about whom you'll hear a lot in the future), Asch was a German refugee from Hitler's Europe. And like them, he was heavily influenced by European Gestalt psychology.  Much of Asch's early work was on aspects of nonsocial perception, but he brought the Gestalt perspective to bear on problems of social psychology in his classic textbook, Social Psychology (1952), which was the first social psychology text to be written with a unifying cognitive theme running throughout.

Asch (1946) set out the problem of social perception as follows:

[O]rdinarily our view of a person is highly unified. Experience confronts us with a host of actions in others, following each other in relatively unordered succession. In contrast to this unceasing movement and change in our observations we emerge with a product of considerable order and stability.  

Although he possesses many tendencies, capacities, and interests, we form a view of one person, a view that embraces his entire being or as much of it as is accessible to us. We bring his many-sided, complex aspects into some definite relations....

How do we organize the various data of observation into a single, relatively unified impression?

How do our impressions change with time and further experiences with the person?

What effects in impressions do other psychological processes, such as needs, expectations, and established interpersonal relations, have?

In addressing these questions, Asch set out two competing theories:

Obviously, as a Gestalt psychologist, Asch had a pre-theoretical preference for the latter theory.

In order to study the process of person perception, Asch (1946) invented the impression-formation paradigm.  He presented subjects with a trait ensemble, or a list of traits ostensibly describing a person (the target) -- varying the content of the ensemble, the order in which traits were listed, and other factors.  The subjects were asked to study the trait ensemble, and then to report their impression of the target in free descriptions, adjective checklists, or rating scales.

Asch's first experiment compared the impressions engendered by two slightly different trait ensembles.  Subjects were presented with one of two trait lists, which were identical except that target A was described as warm while target B was described as cold.



After studying the trait ensemble, the subjects reported their impressions in terms of a list of 18 traits, presented as bipolar pairs such as generous-ungenerous.



The two ensembles generated two quite different impressions, with A perceived in much more positive terms than B.  There were significant differences between the two impressions on 10 of the 18 traits in subjects' response sets.  A later experiment, varying only intelligent-unintelligent, yielded similar results.



But when the experiment was repeated (Experiment 3), with the words polite and blunt substituted for warm and cold, there were relatively few differences between the two impressions.



From these and related results, Asch concluded that traits like warm-cold and intelligent-unintelligent were central to impression formation, while traits like polite-blunt were not.  In Asch's view, central traits are qualities that, when changed, affect the entire impression of the person.  Other traits are more peripheral, in that they make little difference to the overall impression.

Although being described as warm rather than cold led the target to be described in highly positive terms, Asch distinguished the effect of central traits from the halo effect described by Thorndike, by which targets described with one positive trait tend to be ascribed other positive traits as well.  Being described as warm rather than cold does not lead to an undifferentiated positive impression; the warm-cold effect is more differentiated than that.

As proof, Asch pointed out that warm-cold is not always central to an impression.  In his Experiment 2, where warm and cold were embedded in a different trait ensemble, there were few differences between the resulting impressions. If anything, the person was perceived as somewhat dependent, rather than in the glowingly positive terms that emerged from Experiment 1.


Consistent with Gestalt views of perception, the effect of one piece of information (whether the person is warm or cold) depends on the entire field in which that information is embedded.  To explain why traits are sometimes central and other times peripheral, Asch offered the change of meaning hypothesis, which holds that the total environmental surround changes the meaning of the individual elements that comprise it.  Remember, for Gestalt psychologists, the distinction between figure and ground is blurry, because both figure and ground are integrated into a single unified perception.  Perception of the figure affects perception of the background, and perception of the background affects perception of the figure.

In addition to studying the semantic relations among stimulus elements, Asch also studied their temporal relations.  After all, he argued, impression-formation is extended over time: as we gradually accumulate knowledge about a person, our impression of that person may change.  In his Experiment 6, Asch presented subjects with two identical trait ensembles, except that intelligent was the first trait listed for target A, and the last trait listed for target B. The two impressions differed markedly, revealing an order effect in impression formation.


In order to explain order effects, Asch held that the initial terms in the trait ensemble set up a "direction" that influences the interpretation of the later ones.  The first term sets up a vague but directed impression, to which later characteristics are related, resulting in a stable view of the person -- just as our perception of a moving object remains stable, even though our perspective on it may change over time.

Taken together, Asch's studies illustrate principles of person perception that are familiar from the Gestalt view of perception in general.  The whole percept is greater than the sum of its stimulus parts, because the elements interact with each other; just as the perception of the individual stimulus elements influences perception of the entire stimulus array, so the perception of the entire stimulus array influences the perception of the individual stimulus elements.

Asch's 1946 experiments set the agenda for the next 20 to 30 years of research on person perception and impression formation, which basically sought clarification on questions originally posed by Asch himself:

Interestingly, however, a recent large-scale study failed to replicate one of Asch's findings: the "primacy of warmth" effect, by which warm-cold is not only a central trait, but more important to impression-formation than the other big central trait, intelligent-unintelligent.  Nauts et al. (2014) carefully repeated Asch's (1946) procedures (for his Studies I, II, and IV), in a sample of 1140 subjects run online via Mechanical Turk.

So, just to be clear, Nauts et al. confirmed that warm-cold and intelligent-unintelligent are central to impressions of personality; but they failed to find, as Asch claimed, that warm-cold was more important than intelligent-unintelligent.

What Makes a Trait Central?

Asch's distinction between a central and a peripheral trait was made on a purely empirical basis: He discovered that some traits, such as warm-cold and intelligent-unintelligent, exerted a disproportionate effect on impressions of personality, while others, such as polite-blunt, did not.  But although he could predict the effects of central traits on impressions, he had no theory that would enable him to predict which traits would be central, and which peripheral.

So what makes a trait central as opposed to peripheral?  Julius Wishner (1960) offered a plausible answer.  He administered a 53-item adjective checklist, derived from the checklists that Asch had used, asking his subjects to describe their acquaintances (often, their teacher in introductory psychology).  Using the power of high-speed computers that simply were not available to Asch in 1946 (and which, frankly, are dwarfed by the computational power of the simplest laptop or even palmtop computer today), Wishner calculated the correlations between each trait and every other trait in the list.  Examining the matrix of trait intercorrelations, Wishner observed that traits such as warm-cold and intelligent-unintelligent, which Asch had identified as central, had significant correlations with many other traits (e.g., mean rs = .62 and .56, respectively); by contrast, peripheral traits such as polite-blunt had relatively few correlations (e.g., mean r = .43).  

The upshot of Wishner's study is that central traits carry more information than peripheral traits, in that they have more implications for unobserved features of the person.  By virtue of their high intercorrelations with other traits, knowing that a person is warm or intelligent tells us a great deal about the person, while knowing that a person is polite does not.  In the same way, a change in one central trait, from warm to cold or from intelligent to unintelligent, implies changes in many other traits as well, while a change from polite to blunt does not.

Wishner's findings also explained why a trait like warm-cold was not always central: it depends on the precise list of traits on which subjects make their ratings.  Any trait from the stimulus ensemble will function as a central trait, so long as it is highly correlated with many of the traits on the response list.  
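To make Wishner's logic concrete, here is a minimal sketch in Python.  The ratings are made up for illustration, the trait names are just examples, and the helper names (pearson, mean_abs_r) are hypothetical.  Using the mean absolute correlation with the response traits as an index of centrality is a simplifying assumption, not necessarily Wishner's exact procedure.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_abs_r(trait, response_traits, ratings):
    """Mean absolute correlation of one trait with a list of response traits."""
    rs = [abs(pearson(ratings[trait], ratings[other])) for other in response_traits]
    return sum(rs) / len(rs)

# Made-up ratings of six acquaintances on 1-7 scales (illustration only).
ratings = {
    "warm":     [7, 6, 2, 1, 5, 3],
    "polite":   [5, 4, 5, 4, 4, 5],
    "generous": [6, 6, 2, 2, 5, 3],
    "happy":    [7, 5, 3, 1, 5, 2],
    "sociable": [6, 7, 1, 2, 4, 3],
}
response_traits = ["generous", "happy", "sociable"]

# A stimulus trait is "central" here to the extent that it predicts the response traits.
warm_index = mean_abs_r("warm", response_traits, ratings)
polite_index = mean_abs_r("polite", response_traits, ratings)
print(f"warm: {warm_index:.2f}, polite: {polite_index:.2f}")
```

With these invented numbers, warm tracks the response traits closely while polite does not.  Swap in a response list that happens to correlate with polite instead, and the ordering could reverse, which is the point of Wishner's argument about context.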

Wishner's solution to the problem of central traits made his paper a classic in the person-perception literature, but it is not completely satisfactory.  For example, it might be nice if "centrality" was a property of the trait itself, and did not depend on the context provided by the response set (though, frankly, Asch, as a Gestalt psychologist, might not think this was so desirable!).  Are there any traits that are inherently central? 

Seymour Rosenberg (1968), making use of even more computational power than had been available to Wishner, factor-analyzed the intercorrelations among a large number of trait terms, yielding a hierarchical structure consisting of subordinate traits, primary traits, secondary and even tertiary traits.  He discovered that Asch's central traits tended to load highly on two very broad superordinate factors of personality ratings representing two dimensions:

So, in fact, warm-cold and intelligent-unintelligent do seem to be inherently central to impression formation -- unless the experimenter uses a strange set of response scales, which don't bear on either of these major dimensions of personality impressions.

Interestingly, these two "superfactors" are not entirely independent of each other: people who tend to be described in positive social terms also tend to be described in positive intellectual terms.  There is a "super-duper" factor of evaluation which runs through the entire matrix of personality traits, and gives rise to Thorndike's halo effect.

Pulling all of this material together, we can conclude that central traits have two properties:

Talking to Strangers

Malcolm Gladwell, a journalist who interprets social-science research in the popular press (Blink was about automaticity; The Tipping Point was about minority influence and social contagion), tackled the problem of social cognition in his book, Talking to Strangers: What We Should Know About the People We Don't Know (2019).  The general thesis of the book is that bad things happen "when a society does not know how to talk to strangers".  His prime example is the "Peace in our time" meeting between Neville Chamberlain and Adolf Hitler, which led to the German annexation of Czechoslovakia and, thus, World War II.  In this telling, Chamberlain erroneously concluded from Hitler's double handshake and other aspects of his nonverbal behavior that he could be relied on to keep his word.  Gladwell concludes that "the people who were right about Hitler were those who knew the least about him personally", while "the people who were wrong about Hitler were the ones who had talked with him for hours".  Well, maybe.

Gladwell also discusses other, more prosaic examples of egregious misunderstanding, which he explains with two basic principles, both drawn from social-science research.
  1.  Gladwell cites Tim Levine, a communications researcher, who has suggested that a common factor in social misunderstandings is that some people are "mismatched" -- for example, a dishonest person presenting him- or herself as honest (the Hitler example), or a neurotic person presenting as stable.  Levine argues that we are usually able to determine whether someone is being truthful -- but this ability fails when that person is "mismatched" -- underscoring the possibility that people deliberately misrepresent themselves in everyday life (apologies to another sociologist, Erving Goffman).
  2. Gladwell also suggests that people suffer from an "inability to make sense of the stranger as an individual" (for example, by taking note of the context in which their behavior occurs), which sounds a little like what we will discuss later as the Fundamental Attribution Error.
Despite all this, especially the Hitler example, Gladwell cautions his readers not to distrust everyone.  Rather, we should "accept the limits of our ability to decipher strangers".  On the other hand, Andrew Gottlieb, reviewing the book in the New York Times Book Review, concludes that "the threads that connect Gladwell's somewhat rambling material have to do with misreading people -- mistaking their intentions, drawing erroneous conclusions from their demeanors and believing their false claims of innocence.  Yet despite its title, the book is not really about strangers....  Lies, misunderstandings, and escalating confrontations have, after all, been known to occur even within marriages" ("Malcolm Gladwell's Advice When 'Talking to Strangers': Be Careful", 10/06/2019).  Maybe we should accept the limits of our ability to decipher coworkers, neighbors, friends, and even intimates, as well.

Gladwell probably makes too much of two relatively small principles (he made similar mistakes in Blink and The Tipping Point), but his book only underscores the importance of understanding both how we perceive other people, and how accurate, or inaccurate, those perceptions can be.


Implicit Personality Theory

While a great deal of personality research has been devoted to determining the hierarchical structure of personality traits (e.g., the Big Five structure of neuroticism, extraversion, agreeableness, conscientiousness, and openness to experience), it seems likely that laypeople possess some intuitive knowledge of the structure of personality as well.  In fact, Asch's concept of the central trait assumes that laypeople possess some intuitive knowledge about the relations among personality traits.  If they did not, all traits would be created equal, none more central, or more peripheral, to impression formation than any other. 

The term implicit personality theory (IPT) was coined by Bruner and Tagiuri (1954 -- they were busy that year) to refer to "the naive, implicit theories of personality that people work with when they form impressions of others".  Bruner and Tagiuri understood that person perception entailed "going beyond the information given" by combining information extracted from the stimulus with information supplied by pre-existing knowledge.  In other words, in the course of person perception the person must make use of knowledge that he or she possesses about the relations among various aspects of personality.

People's implicit theories of personality may be quite different from the formal theories of personality researchers.  In fact, compared against formal theories resulting from methodologically rigorous research, they may even be wrong.  But right or wrong, they are used in the course of person perception. 


Defining Implicit Personality Theory

The domain of IPT was further explicated by L.J. Cronbach in his contribution to a 1954 book on person perception edited by Bruner and Tagiuri (it was a very good year for person perception).  Cronbach discussed IPT in the context of personality ratings made by judges in traditional trait-oriented personality research, and suggested that in addition to information derived from the judge's observations of the target, the ratings will be influenced by the "Judge's description of the generalized Other" -- that is, by the judge's beliefs about what people are like in general.  In Cronbach's view, IPT consists of several elements:

Of course, these are also the elements of formal, scientific theories of personality structure.

Cronbach believed that IPT was widely shared within a culture, but he acknowledged that there might also be individual differences in IPT.  For example, some people might assume that most people are friendly and well-meaning, and that becomes the "default option" when they make judgments about some specific person; but other people might assume that most people are hostile and aggressive.  In addition, he suggested that there may be cultural differences in IPT.  For example, within Western culture, IPT seems to be centered on clusters of traits, or stable individual differences in behavioral dispositions; but other cultures might have more "situationist" or "interactionist" views of personality.

Following Cronbach, an expanded concept of implicit personality theory might look like this:

For example, people might think that Rosenberg's traits of social and intellectual "good-bad" are normally distributed, with the population mean right about the midpoint.


  Osgood's "Pollyanna Principle" reflects the assumption that the distribution of positive traits in the population is skewed toward the positive end of the continuum. 


Or, people might believe that there is a bimodal distribution of social and intellectual goodness, with most people tending toward "good" but a substantial minority tending toward "bad".

Thorndike's "halo effect"  reflects the assumption that socially desirable traits, whether social or intellectual, are positively correlated with each other, as in the two-dimensional structure uncovered by Rosenberg.



Note for statistics mavens: the correlations between variables can be represented graphically by vectors, with the angles between vectors reflecting the correlation between them such that r is equal to the cosine of the angle at which the vectors meet (this is how factor analysis can be done geometrically).  Thus, two variables that are uncorrelated with each other (r = 0.0) are represented by two vectors that meet at right angles (cos 90° = 0); two variables that are perfectly correlated (r = 1.0) are represented by two vectors that overlap completely (cos 0° = 1.0); and two variables that are highly but not perfectly correlated, say .60 < r < .65, are represented by vectors that meet at an angle of about 50° (cos 50° ≈ 0.64).
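For readers who want to check the geometry themselves, the following short Python sketch converts a correlation into an angle, and computes a correlation as the cosine of the angle between two mean-centered vectors.  The function names (angle_for_r, centered_cosine) and the example ratings are hypothetical, chosen only for illustration.

```python
import math

def angle_for_r(r):
    """Angle in degrees between two vectors whose correlation is r."""
    return math.degrees(math.acos(r))

def centered_cosine(x, y):
    """Cosine of the angle between the mean-centered versions of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    dot = sum(a * b for a, b in zip(dx, dy))
    return dot / (math.hypot(*dx) * math.hypot(*dy))

# Tabulate the angle for a few correlations.
for r in (0.0, 0.60, 0.65, 1.0):
    print(f"r = {r:.2f} -> angle = {angle_for_r(r):.1f} degrees")

# Two made-up sets of ratings; the centered cosine is their Pearson r.
r_example = centered_cosine([1, 2, 3, 4], [2, 1, 4, 3])
print(f"r for the example ratings = {r_example:.2f}")
```

Subtracting the mean before taking the cosine is what turns an ordinary cosine into a Pearson correlation; without centering, the two quantities generally differ.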


Two Models of IPT

Implicit theories of personality have been studied through the application of multivariate statistical methods, such as factor analysis, multidimensional scaling, and cluster analysis to various types of data:

But because these techniques are very time-consuming, developments in implicit personality theory had to await the proper technology, particularly the availability of cheap, high-speed computational power.  In the 1960s, as appropriate computational facilities became widely available, two competing models of implicit personality theory began to emerge.

The first of these was a semantic differential model of IPT based on Charles Osgood's tridimensional theory of meaning (e.g., The Measurement of Meaning by Osgood, Suci, & Tannenbaum, 1957).  According to Osgood, the meaning of any word can be represented as a point in a multi-dimensional space defined by three vectors:

In this EPA scheme, closely related words are represented by points that lie very close to each other in this space.  Osgood's method was to have subjects rate objects and words on a set of bipolar adjective dimensions.  When these ratings were factor analyzed, the three dimensions of evaluation, potency, and activity came out regardless of the domain from which the objects and words were sampled -- people, animals, inanimate objects, even abstract concepts.  If these three dimensions are the fundamental dimensions of meaning, they are likely candidates for implicit personality theory -- the cognitive framework for giving meaning to people and their behaviors -- as well.
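As a rough illustration of what "close together in EPA space" means, here is a minimal Python sketch.  The coordinates are invented for the example (they are not Osgood's data), and the function name semantic_distance is hypothetical.  Treating ordinary Euclidean distance as the measure of closeness is also an assumption of this sketch.

```python
import math

# Hypothetical (evaluation, potency, activity) coordinates on -3..+3 scales.
# These numbers are invented for illustration, not taken from Osgood's data.
epa = {
    "kind":     ( 2.5, 0.5, 0.3),
    "friendly": ( 2.3, 0.4, 0.8),
    "cruel":    (-2.6, 1.0, 0.6),
}

def semantic_distance(word_a, word_b):
    """Euclidean distance between two words in the three-dimensional EPA space."""
    return math.dist(epa[word_a], epa[word_b])

near = semantic_distance("kind", "friendly")
far = semantic_distance("kind", "cruel")
print(f"kind-friendly: {near:.2f}, kind-cruel: {far:.2f}")
```

In this toy example almost all of the distance between kind and cruel comes from the evaluation coordinate, which anticipates the argument that evaluation does most of the work.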

The principal problem with the semantic differential model is that the best evidence for the three factors came from studies employing adjective checklists, where the adjectives on the list were deliberately chosen to represent evaluation, potency, and activity.  Accordingly, it seems possible that evaluation, potency, and activity came out of analyses of adjective ratings because, however unintentionally, they were built into these ratings to begin with.  Interestingly, Osgood's three dimensions were not clearly obtained from free-response data which was not constrained by the experimenter's choices.  Left to their own devices, without the experimenter's constraints, subjects' cognitive structures look somewhat different from Osgood's scheme.

For example, Rosenberg and Sedlak (1972) asked subjects to provide free descriptions of 10 people each.  These investigators then selected the 80 traits that occurred most frequently in these descriptions, and then submitted the 80x80 matrix of trait co-occurrences to a technique of multivariate analysis called multidimensional scaling (why only 80 traits?  An 80x80 matrix, generating more than 3,000 unique correlation coefficients, exhausted the computational power available at the time).  They found that Osgood's evaluation and potency factors were highly correlated (r = .97): people who were perceived as good were also perceived as strong.  Osgood's activity factor was quite weak in the data, but was also positively correlated with the evaluation and potency factors (r = .57).  Accordingly, Rosenberg & Sedlak concluded that the evaluation dimension dominated people's implicit theories of personality.
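The bookkeeping behind a co-occurrence analysis can be sketched in a few lines of Python.  The descriptions below are invented, the function names (unique_pairs, cooccurrence_counts) are hypothetical, and counting raw co-occurrences is a simplifying assumption; the measure Rosenberg and Sedlak actually submitted to multidimensional scaling may have been computed differently.

```python
from collections import Counter
from itertools import combinations

def unique_pairs(n_traits):
    """Number of distinct trait pairs in an n x n symmetric matrix."""
    return n_traits * (n_traits - 1) // 2

def cooccurrence_counts(descriptions):
    """Count how often each pair of traits appears in the same description."""
    counts = Counter()
    for traits in descriptions:
        # Sort so each pair is stored under one consistent key.
        for pair in combinations(sorted(set(traits)), 2):
            counts[pair] += 1
    return counts

# Invented free descriptions of four people (illustration only).
descriptions = [
    ["kind", "smart", "funny"],
    ["kind", "smart"],
    ["lazy", "funny"],
    ["kind", "funny"],
]

counts = cooccurrence_counts(descriptions)
print(unique_pairs(80))            # distinct pairs among 80 traits
print(counts[("kind", "smart")])   # times kind and smart were used together
```

The pair count grows roughly with the square of the number of traits, which is why adding even a few more traits was costly on the computers of the day.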

Based on free-description data such as this, Rosenberg proposed an alternative evaluation model of IPT. He argued that evaluation was the only perceptual dimension common to all individuals, and that any additional dimensionality came from correlated content areas such as social and intellectual evaluation.  The figure graphically represents the loadings on the two dimensions of a representative set of trait adjectives.  Note that Asch's central traits, warm-cold and intelligent-unintelligent, lie fairly close to the axes that define the two-dimensional space.



Kim and Rosenberg (1980) offered a direct test of the two models.  In the Rosenberg and Sedlak (1972) study, and other studies of implicit personality theory, individual subjects rated only a single person, and then the subjects' responses were aggregated, so that the resulting IPT structure reflected the average "Judge's description of the generalized Other".  But averaging may obscure the structures that exist in individual Judges' minds.  It is entirely possible that individual judges have something like the Osgood structure in their heads, but when their responses are aggregated, only evaluation remains.  Accordingly, Kim and Rosenberg decided to compare the adequacy of the two models at the individual level (again, this is the kind of analysis that can only be done when computing resources are cheap). Because multivariate analysis requires multiple responses, they had subjects describe themselves and 35 other people that they knew well; and they collected both free descriptions and ratings on an adjective checklist.  Multidimensional scaling of the individual subject data revealed that something resembling Osgood's three-dimensional EPA structure appeared in only 8 of the 20 subjects studied, and in 3 of these 8, the potency and activity dimensions were not independent of evaluation.  More important, the evaluation dimension emerged from every individual subject's data set. 

Kim and Rosenberg concluded that Osgood's EPA structure was an artifact of aggregation across subjects.  All subjects use evaluation as a dimension for person perception; some use potency, some use activity, and some use both, and these dimensions are strong enough to create the appearance of a major dimension when data are aggregated across subjects.  Potency and activity are positively correlated for some subjects, and negatively correlated for others; these tendencies balance out, and give the appearance that potency and activity are independent of each other and of evaluation.  But this is an illusion produced by data aggregation.  In their view, only the evaluation dimension is genuine; the potency and activity dimensions are largely artifacts of method.
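The "balancing out" argument can be illustrated with a toy calculation.  This is only a sketch under invented numbers: two hypothetical judges each rate three targets on potency and activity.  Simple pooling of ratings stands in for aggregation here; the original studies used multidimensional scaling, not this procedure.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical ratings of three targets by two judges (invented values).
judge_a = {"potency": [1, 2, 3], "activity": [1, 2, 3]}  # positively related
judge_b = {"potency": [1, 2, 3], "activity": [3, 2, 1]}  # negatively related

r_a = pearson_r(judge_a["potency"], judge_a["activity"])
r_b = pearson_r(judge_b["potency"], judge_b["activity"])

# Pool both judges' ratings, as if they came from one "average" judge.
r_pooled = pearson_r(
    judge_a["potency"] + judge_b["potency"],
    judge_a["activity"] + judge_b["activity"],
)

print(r_a, r_b, r_pooled)
```

Within each judge the two dimensions are perfectly related, one positively and one negatively, yet in the pooled data they look independent.  That apparent independence belongs to the aggregate, not to either judge.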

Fiske and her colleagues (Trends in Cognitive Sciences, 2007) have drawn new attention to Rosenberg's work by claiming that warmth and competence are universals in social cognition, which exert a powerful influence on how we interact with other people.  In particular, Fiske has argued that various combinations of warmth and competence characterize various out-group stereotypes.  This is true in both "individualist" and "collectivist" (or "independent" and "interdependent") societies.  Fiske has even argued that there is a specific module in the brain, located in the medial prefrontal cortex (MPFC), which constitutes a "social evaluation area".  In any event, she and her colleagues have argued that assessments of warmth and competence are made automatically and unconsciously -- even though they may not necessarily be accurate.

The "Big Five" as an Implicit Theory of Personality

Implicit theories of personality are like formal scientific theories, except that they are "naive" and "implicit".  Recently, scientific research on personality has focused on a five-factor model of personality structure originally proposed by Norman (1963).  In his research, Norman examined subjects' ratings of other people on a representative set of trait adjectives.  Factor analysis consistently yielded five dimensions of personality:

Norman (1963) recovered this five-factor structure from factor analyses of questionnaires and rating scales, of self-ratings and other-ratings, regardless of the method of collecting data or factor analysis he employed.  In his view, the five-factor structure was ubiquitous.

Norman's findings were soon replicated by others (actually, they had been obtained by earlier investigators as well) -- so reliably that they came to be known as The Big Five dimensions of personality.

Goldberg (1981) proposed that the Big Five comprised a universally applicable structure of personality.  By universally applicable Goldberg meant that it could be used to assess individual differences in personality under any circumstances:

In line with the doctrine of traits, Norman (and many other advocates of the Big Five) assumed that these five traits had actual existence, just like physical traits, as behavioral dispositions. 

Goldberg noted that the Big Five are so ubiquitous that they have been encoded in language, as familiar trait adjectives like extraverted and cultured.  Of course, if ordinary "laypeople" (not just trained scientists) notice these dimensions enough to evolve words for them, the Big Five structure may exist in people's minds as well as their behavior.  That is, the Big Five may well serve as the structural basis for people's implicit theories of personality, as well as a formal theory of personality structure.

Along these lines, I have often thought of the Big Five as The Big Five Blind Date Questions -- representing the kind of information that we want to know about someone that we're meeting for the first time, and will be spending some significant time with:

If these are indeed the kinds of questions we ask about people, then it seems like the Big Five -- and not just a single dimension of evaluation -- resides in our heads as an implicit theory of personality.  As it happens, we can fit The Big Five into a hierarchical structure of implicit personality theory.

And in fact, there is some empirical evidence that The Big Five -- whatever its status as a scientific theory of personality -- serves as an implicit theory of personality as well. 

The evidence comes from a provocative study by Passini and Norman (1966), who asked subjects to use Norman's adjective rating scales to rate total strangers -- people they had never met before, and with whom they were not permitted to interact during the ratings session.  The subjects were simply asked to rate others as they "imagined" them to be.  Nevertheless, factor analysis yielded The Big Five, just as had earlier factor analyses of ratings of people the subjects had known well.  Note that the Passini and Norman study violates the traditional assumption of personality assessment: that there is some degree of isomorphism between personality ratings and the targets' actual behavior.  In this case, the judges had no knowledge of the targets' behavior.  The Big Five structure that emerged from their ratings was not in their targets' behavior -- simply because they had no knowledge of their targets' behavior; but it certainly existed in the judges' heads, as a "description of the generalized Other".

Based on this evidence, it may be that the Big Five provides a somewhat more differentiated implicit theory of personality than the two-dimensional evaluation model promoted by Rosenberg and Sedlak.  If so, we would have another answer to the question of what makes a central trait central.  Just as R&S argued that central traits loaded highly on the two dimensions of evaluation, perhaps central traits load highly on one or the other of the Big Five dimensions of personality.  Certainly that's true for warm-cold, which loads highly on extraversion, and intelligent-unintelligent, which loads highly on openness.  


The Illusion of Coherence 

By now there have been many studies similar to that of Passini and Norman, all with similar results: every factor structure derived from empirical observations has been replicated by judgments of conceptual similarity.  Thus, we do seem to carry around in our heads an intuitive notion concerning the structure of personality -- the co-occurrences among certain behaviors, the covariances among certain traits, the notion that certain things go together, and other things contradict each other.  This conceptual structure -- this implicit personality theory -- is thus cognitively available to influence people's experience, thought, and action in the social world.

The existence of implicit personality theory is interesting, but in some sense it is also troublesome, because it raises a difficult question that has long bedeviled theorists of perception in general -- the question of realism vs. idealism:

Recall that a major assumption underlying traditional psychometric approaches to personality is coherence:
Coherence yields a hierarchical structure of personality:
The assumption of coherence is apparently confirmed by factor-analytic studies of personality, because the factors that emerge from the statistical analysis summarize the patterns of co-occurrences and correlations, and represent primary and superordinate personality traits.  The fact that certain factor structures, such as The Big Five, appear to be extremely stable, gives rise to the notion that factor analysis (or similar multivariate methods) yields the structure of personality.

But this kind of evidence is problematic.  In principle, factor analysis should be applied to objective observations.  But for pragmatic reasons, this is generally impossible in the domain of personality research, simply because it is very difficult to perform the systematic observations of behavior that are required for this purpose.  Because we have no direct measurements of personality traits, factor analysis is generally applied to rating data -- subjective impressions of behavior and traits; judgments that rely heavily on memory.

The problem is the reconstructive nature of memory retrieval.  Memory for the past is contaminated by expectations and inferences.  When factor analysis is applied to memory-based ratings, therefore, we cannot be sure what the factor matrix represents: the structure residing in the personalities of the targets, or the structure residing in the minds of the raters.

The fact is, we know from studies like Passini and Norman's that the structure of personality -- and, specifically, the Big Five structure that is so popular -- resides in the minds of raters.  Therefore, it is possible that the structure of personality is to some degree illusory -- in a manner somewhat resembling the Moon illusion familiar to perception researchers.  The moon looks larger on the horizon than at zenith, even though it isn't, because of "unconscious inferences" made by perceivers that take account of distance cues in estimating size.  Perhaps personality raters make similar sorts of unconscious inferences in rating other people's personalities (or their own).  Note that the existence of a moon illusion doesn't imply that there is no moon.  It simply means that the moon isn't as big as it looks.

Similarly, in the realm of person perception, our expectations and beliefs can distort our person perceptions, and thus our person memories; in particular, our expectations and beliefs about the coherence of personality can magnify our perception of that coherence.


Where Does Implicit Personality Theory Come From?

Research has already established that the structure of personality exists in the mind of the observer.  The important question is whether it also has an independent existence in the world outside the mind.  As in the moon illusion, we usually take a modified realistic view of perception -- that our perceptions are fairly isomorphic with the world.  Accordingly, we may assume that our beliefs about personality are to some extent isomorphic with the actual structure of personality.  But are they really?

The controversy about the nature of implicit personality theory is reflected in two competing hypotheses:

Obviously the fact that the structure of memory-based ratings resembles those of conceptual similarity ratings cannot resolve this conflict, because both hypotheses predict that these two structures will be highly similar.  But the two hypotheses do make different predictions about the structure of observer ratings of behavior.  Observer ratings, made "on-line" as it were, have no opportunity to be distorted by mental structures.  In observer ratings, behavior is recorded as it occurs, and traits are measured directly, with minimal recourse to inference.  

Thus, we can test the accurate reflection hypothesis against the systematic distortion hypothesis by comparing the structures derived from three types of data:

The two hypotheses make competing predictions:
Unfortunately, for reasons alluded to earlier, observer ratings of behavior are extremely difficult to obtain -- especially of behaviors that are relevant to the Big Five.  However, it is possible to conduct such an experiment on a smaller scale.

One such experiment, by Shweder and D'Andrade (1980), employed 11 categories of interpersonal behavior as target items: these were behaviors such as advising, informing, and suggesting.

Shweder and D'Andrade then constructed correlation matrices representing the structural relations among all 11 behaviors.  They then performed 7 different tests of the correspondence between the matrices.



For example, examining the correlations between parallel cells of the matrices, they observed the following pattern of correlations:

Aggregating the results across the 7 different tests, they observed the following pattern of correlations:

In other words, there was a high degree of correspondence between memory-based ratings and conceptual similarity ratings, but very little correspondence between either of these and ratings of observed behavior.  These results are consistent with the systematic distortion hypothesis, but inconsistent with the accurate reflection hypothesis.
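One of these tests, correlating the parallel cells of two matrices, is easy to sketch.  The 4x4 matrices below are invented for illustration (the study used 11 behavior categories), and the function names are mine, not Shweder and D'Andrade's.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def upper_cells(matrix):
    """Flatten the cells above the diagonal of a square symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def matrix_correspondence(m1, m2):
    """Correlate the parallel off-diagonal cells of two matrices."""
    return pearson_r(upper_cells(m1), upper_cells(m2))

# Hypothetical matrices over four behavior categories (invented values).
memory = [
    [1.0, 0.6, 0.1, -0.3],
    [0.6, 1.0, 0.2, -0.1],
    [0.1, 0.2, 1.0, 0.5],
    [-0.3, -0.1, 0.5, 1.0],
]
conceptual = [
    [1.0, 0.7, 0.2, -0.4],
    [0.7, 1.0, 0.1, -0.2],
    [0.2, 0.1, 1.0, 0.6],
    [-0.4, -0.2, 0.6, 1.0],
]
observed = [
    [1.0, 0.2, 0.3, 0.2],
    [0.2, 1.0, 0.0, 0.1],
    [0.3, 0.0, 1.0, 0.1],
    [0.2, 0.1, 0.1, 1.0],
]

r_memory_conceptual = matrix_correspondence(memory, conceptual)
r_memory_observed = matrix_correspondence(memory, observed)
r_conceptual_observed = matrix_correspondence(conceptual, observed)
print(r_memory_conceptual, r_memory_observed, r_conceptual_observed)
```

The invented values were chosen to mimic the qualitative pattern described above: the memory-based and conceptual matrices track each other closely, while neither tracks the observed-behavior matrix.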

Systematic Distortion or Accurate Reflection?

Although the Shweder & D'Andrade study seems quite compelling, it has come under criticism from advocates of the accurate reflection hypothesis.  In particular, UCB's own Prof. Jack Block has been an ardent defender of the notion that memory-based ratings, and implicit theories of personality, are accurate reflections of external reality.  See, in particular, an exchange between Shweder and D'Andrade and Block, Weiss, and Thorne that appeared in the Journal of Personality & Social Psychology for 1979. 

To be honest, the systematic distortion hypothesis is somewhat paradoxical, because it seems to refute the realist assumption that there is a high degree of isomorphism between the structure of external reality and our internal mental representations of it.  Where does implicit personality theory come from, if not from the world outside?  According to the ecological perspective on semantics, "the meanings of words are in the world" (a quote from Ulric Neisser): our cognitive apparatus picks up the structure of the world, and so our mental representations are faithful to that structure.  But they apparently aren't, at least in the case of person perception, where it is very clear that our cognitive structures depart radically from the real world that they attempt to represent.

So where else might implicit personality theory come from?  How do our behaviors and traits become schematized, organized, and clustered into coherent knowledge structures?  

D'Andrade and Shweder have suggested a number of possibilities:

In addition to these ideas, it may be that implicit personality theory reflects ideal types -- that is, it represents our wishes about what goes with what, as represented by cultural heroes and villains.

Information Integration in Impression Formation

Regardless of the ontological status of implicit personality theory, Asch's initial question remains on point: How do we integrate information acquired in the course of person perception into a unitary impression of the person along some dimension?  Asch (1946) considered two possibilities: either we simply sum up a list of a person's individual features to create a unitary impression, or the unitary impression is some kind of configural gestalt.  Asch clearly preferred the gestalt view to the additive view, a preference that integrated social with nonsocial perception, but his impression-formation paradigm has permitted later investigators to consider simpler alternatives.

Chief among these investigators has been Norman Anderson (1974), who has promoted cognitive algebra as a framework for impression formation and for cognitive processing in general.  According to Anderson, perceptual information is integrated according to simple algebraic rules, which take information (about, say, primary traits) and perform a linear (algebraic) combination that yields a summary of the trait information in terms of a superordinate dimension (say, a superordinate trait).

In particular, Anderson has considered two very simple algebraic models (where S = stimulus information and R = the impression response):

Anderson's basic procedure follows the Asch paradigm:

Anderson's experiments include critical comparisons that afford a test of the adding and averaging models of impression formation.  Assume that the trait ensemble includes a mix of traits:

Thus, the adding and averaging functions give the following values to various trait ensembles:

[Table: values given by the adding and averaging rules to various trait ensembles]

Thus, the adding rule predicts that both the HHHH and the MMHH ensemble will be preferred to the HH ensemble; but the averaging rule predicts no difference between HH and HHHH, and that both will be preferred to MMHH.   
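The contrast between the two rules can be worked through in a few lines of code.  This is a minimal sketch: the scale values assigned to highly favorable (H) and moderately favorable (M) traits are hypothetical, chosen only so that both are positive and H exceeds M.

```python
def adding(values):
    """Adding rule: the impression is the sum of the stimulus values."""
    return sum(values)

def averaging(values):
    """Averaging rule: the impression is the mean of the stimulus values."""
    return sum(values) / len(values)

# Hypothetical scale values (assumptions, not Anderson's actual numbers).
H, M = 3.0, 1.0

ensembles = {
    "HH": [H, H],
    "HHHH": [H, H, H, H],
    "MMHH": [M, M, H, H],
}

sums = {name: adding(vals) for name, vals in ensembles.items()}
means = {name: averaging(vals) for name, vals in ensembles.items()}

for name in ensembles:
    print(name, sums[name], means[name])
```

With these values, the sums are 6, 12, and 8, so adding favors both HHHH and MMHH over HH.  The means are 3, 3, and 2, so averaging treats HH and HHHH alike and ranks MMHH below both.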


When Anderson (1965) actually performed the comparison, the empirical results were a little surprising:

So, this experiment seemed to offer no decisive test between the adding and averaging models.

Anderson resolved the conflict by adding three new assumptions:

Because of the new emphasis on stimulus weights, this revised form of cognitive algebra is known as the weighted adding or weighted averaging rule.

In a revised test, Anderson set aside the matter of stimulus weightings.  Instead of asking subjects whether they were biased positively or negatively, he simply assumed that positive and negative biases would average themselves out, so that the average subject could be considered to be neutral at the outset (in fact, there is probably an average positive bias, but the essential point remains intact).  Accordingly, a value of 0 was entered into the adding and averaging equations, along with the values of the stimulus information.  Of course, adding 0 does nothing to sums; but it can have a marked effect on averages.  Compare, for example, the following table to the table just above:

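[Table: values given by the adding and averaging rules to various trait ensembles, with a neutral initial impression of 0 included]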

When Anderson (1965) actually performed the comparison, the empirical results were less confusing:

So, this experiment seemed to offer decisive evidence favoring the weighted averaging model of impression formation.  In the weighted averaging model, the perceiver's final impression builds up slowly, and is heavily constrained by his or her initial bias and first impressions.
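Here is a minimal sketch of how a neutral initial impression changes what the averaging rule predicts.  The scale values for H and M are hypothetical, every weight (including that of the initial impression) is assumed to be 1, and the function name is mine rather than Anderson's.

```python
def weighted_average(values, weights=None, initial=0.0, initial_weight=1.0):
    """Weighted averaging rule with an initial impression.

    The initial impression enters the average like any other piece of
    information, with its own value and weight.
    """
    if weights is None:
        weights = [1.0] * len(values)  # assumption: equal weights
    total = initial_weight * initial + sum(w * s for w, s in zip(weights, values))
    return total / (initial_weight + sum(weights))

# Hypothetical scale values (assumptions, not Anderson's actual numbers).
H, M = 3.0, 1.0

ensembles = {
    "HH": [H, H],
    "HHHH": [H, H, H, H],
    "MMHH": [M, M, H, H],
}

# A neutral initial impression (0) is averaged in with the stimulus values.
predictions = {name: weighted_average(vals) for name, vals in ensembles.items()}

for name, value in predictions.items():
    print(name, round(value, 2))
```

With these values the predictions are 2.0 for HH, 2.4 for HHHH, and 1.6 for MMHH.  Once the neutral starting point is averaged in, more H traits pull the impression further from 0, so HHHH now comes out ahead of HH, while MMHH still falls below both.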

Cognitive Algebra as Mathematical Modeling

Anderson's cognitive algebra is an attempt to represent a basic cognitive function as a mathematical formula.  As such, cognitive algebra is intended to be a formal mathematical model of the impression-formation process.  So, if you've always been wary of mathematical modeling (perhaps because you've thought it was too dry, or perhaps because of a little math phobia), but you've followed the arguments about cognitive algebra so far, then


You've just successfully worked your way through a mathematical model of a psychological process.

Anderson's cognitive algebra, and especially the weighted-averaging rule, is an extremely powerful framework for studying social judgment.  Cognitive algebra can be applied to any social judgment, so long as the stimulus attributes are quantifiable, and so long as the perceiver's judgment response can be expressed in numerical terms.  

But cognitive algebra also has some problems:

Despite these problems, lots of work in social cognition has been done within the framework of cognitive algebra -- so much so that we could devote an entire course to it.  But we won't.

The Social Relations Model

Another prominent model of person perception is the Social Relations Model (SRM) developed by David Kenny (1994; for earlier versions, see Kenny & LaVoie, 1984; Malloy & Kenny, 1986; Kenny, 1988), based on the early work of the existential psychiatrist (before he became an "anti-psychiatrist") R.D. Laing (Laing et al., 1966).

Link to an overview of the Social Relations Model, based on Kenny's 1994 book, Interpersonal Perception: A Social Relations Analysis.  For critical reviews of the SRM, see the review of Kenny's book by Ickes (1996), as well as the book review essays published in Psychological Inquiry (Vol. 7, #3, 1996).

The SRM is focused on dyadic relations -- that is, relations between two people, say Andy and Betty, and in particular how these two people perceive each other.  For purposes of illustration, let's suppose that Andy perceives Betty as high in interpersonal warmth.  In Kenny's analysis, this perception -- or impression -- has a number of components, such that A's perception of B's warmth is given by the sum of four quite different perceptions (actually, five, depending on how you count):

Because the relationship among the components is additive, you can think of the SRM as a version of Anderson's additive model for impression-formation.  But because the addition includes a constant, which reflects A's biased view of people in general, it's actually a weighted additive model.  But it's not necessarily a strictly additive model (which would go against Anderson's results, which favor averaging).  The relationship component may well be achieved by an averaging process.  We're not going to get into that detail: it will be enough just to explore the surface features of the SRM.

That is because the SRM is actually quite complicated: Kenny has built into the model two features that are not found in other, simpler models of person perception:

In applying the SRM, Kenny prefers to employ a round-robin research design, in which each person in a group rates everyone else in the group, as well as themselves.  The particular rating scales can be selected for the investigator's purposes, but we might imagine that the ratings are of likability (Anderson), warmth and competence (Rosenberg, Fiske), or the Big Five traits of extraversion, neuroticism, agreeableness, conscientiousness, and openness to experience.  Of course, selection of the traits to be rated matters a great deal: results may differ greatly if subjects are forming impressions of a person's masculinity, sexual orientation, likability, or extraversion.

Note that in the round-robin design, the perceiver is also a target, and the target also a perceiver -- just as in the General Social Interaction Cycle, the actor is also a target and the target also an actor.  The SRM treats the individual as both subject and object, stimulus and response, simultaneously.

The mass of data from the round robin design is then decomposed into three components:

With the round-robin design in hand, Kenny can proceed to address a number of questions about interpersonal perception.  Here are these questions, and short answers, based on some 45 studies reported in Kenny's 1994 monograph. 

Like Anderson's cognitive algebra, Kenny's Social Relations Model is more a method than a theory.  The round-robin design, coupled with sophisticated statistical tools, can be used to partition person perception into its various components, and so to answer a wide variety of questions about a wide variety of topics in social cognition. 
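To give a feel for how a round-robin matrix gets partitioned, here is a minimal sketch.  It assumes the usual SRM partition of each rating into a constant (the group mean), a perceiver effect, a target effect, and a relationship effect.  Self-ratings on the diagonal are set aside, and the correction factors compensate for that missing diagonal; they are written from the standard round-robin formulas as I understand them and should be checked against Kenny (1994) before serious use.  The ratings themselves are synthetic.

```python
def srm_decompose(ratings):
    """Partition a round-robin matrix into SRM components.

    ratings[i][j] is perceiver i's rating of target j; the diagonal
    (self-ratings) is ignored.  Returns the grand mean, perceiver effects,
    target effects, and a matrix of relationship effects (None on the
    diagonal).  Relationship effects here still include measurement error.
    """
    n = len(ratings)
    if n < 3:
        raise ValueError("the correction factors require at least 3 people")
    row_mean = [sum(ratings[i][j] for j in range(n) if j != i) / (n - 1)
                for i in range(n)]
    col_mean = [sum(ratings[i][j] for i in range(n) if i != j) / (n - 1)
                for j in range(n)]
    grand = sum(row_mean) / n

    # Corrections for the missing diagonal.
    k_own = (n - 1) ** 2 / (n * (n - 2))
    k_other = (n - 1) / (n * (n - 2))
    k_grand = (n - 1) / (n - 2)

    perceiver = [k_own * row_mean[i] + k_other * col_mean[i] - k_grand * grand
                 for i in range(n)]
    target = [k_own * col_mean[i] + k_other * row_mean[i] - k_grand * grand
              for i in range(n)]
    relationship = [
        [None if i == j else ratings[i][j] - grand - perceiver[i] - target[j]
         for j in range(n)]
        for i in range(n)
    ]
    return grand, perceiver, target, relationship

# Synthetic warmth ratings for four people, built to be purely additive:
# rating = constant + perceiver effect + target effect (all values invented).
constant = 5.0
true_perceiver = [1.0, 0.0, -1.0, 0.0]  # how warmly each person sees others
true_target = [0.5, -0.5, 1.0, -1.0]    # how warmly each person is seen
ratings = [[constant + true_perceiver[i] + true_target[j] for j in range(4)]
           for i in range(4)]

grand, perceiver, target, relationship = srm_decompose(ratings)
print(grand)
print([round(p, 3) for p in perceiver])
print([round(t, 3) for t in target])
```

Because the synthetic ratings contain no relationship effects, the function recovers the constant and the built-in perceiver and target effects, and every relationship effect comes out at zero.  A full SRM analysis goes on to estimate the variance due to each component, which this sketch does not attempt.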

Person Perception as Perception

Research on impression-formation, from Asch (1946) to Anderson (1974) and beyond, has largely made use of trait terms as stimulus materials.  This is certainly appropriate, because -- as Fiske & Cox (1979) demonstrated, as if we needed any proof -- we often describe ourselves and others in terms of traits.  Working with traits injects substantial economies into impression-formation research, because they're easy to manage.  Moreover, traits may fairly closely represent the way information about people is stored in social memory. But it's also clear that an exclusive focus on traits can give a distorted view of the process of impression formation, because traits are not really -- or, at least, not the only -- stimulus information for social perception.  

We don't walk around the world with our traits listed on our foreheads, to be read off by those who wish to form impressions of us.  Rather, the real stimulus information for person perception consists of our physical appearance, our overt behavior, and the situational context in which they appear.  Accordingly, in addition to describing how we make use of trait information to form impressions of personality, a satisfactory account of person perception needs to answer a different sort of question -- to wit:

How do we get from the physical stimulus of the person -- his or her appearance and behavior -- to his or her mental state?

Or, put another way,

What features of the physical stimulus give rise to our impressions of a person's mental state?

Again, the same question about perception occurs in the social domain as in the nonsocial domain.

  • In the nonsocial domain, the problem of perception is to unpack physical stimulus information to make perceptual inferences about the object's physical state -- the form, location, activity, and affordances of the distal stimulus.
  • In the social domain, the problem of perception is to unpack physical stimulus information to make perceptual inferences about a person's mental state -- his or her thoughts, feelings, and desires.

In the nonsocial domain, the stimulus information for perception consists of patterns of physical energy (the proximal stimulus) radiating from the distal stimulus, and impinging on the perceiver's sensory surfaces.  In the social domain, the stimulus information for perception consists of a person's surface appearance and overt behavior.  These include, among others, the person's:

  • facial expressions;
  • bodily orientation, posture, and movement;
  • vocal cues;
  • interpersonal distance;
  • eye contact and touching;
  • physical appearance, dress, and cleanliness; and
  • the person's local behavioral environment in which we encounter the person (particularly those aspects of the situation that are under the person's control).

Many investigators interested in the perceptual processing of physical information have been heavily influenced by Gibson's "ecological" view of perception. Recall that, according to Gibson, all the information needed for perception (whether nonsocial or, by extension, social) is provided by the stimulus field (including the nominal stimulus and its background).  And our perceptual apparatus has evolved in such a way as to enable us to perceive the world the way it really is, without any need for "higher" cognitive processes such as thinking, reasoning, or problem-solving; and, for that matter, without any need for "implicit" theories of personality.  Among the leaders in this research are Ruben Baron at the University of Connecticut and Leslie Zebrowitz (nee McArthur) at Brandeis University (Connecticut is a hotbed of Gibsonian perception research).  Others know or care little, if anything, about Gibson, but still focus their research on the information supplied by the physical stimulus, as opposed to language-based information provided by trait names.

Until fairly recently, there was little work on these more "physical" aspects of person perception -- not least because research with trait ensembles is so easy to do.  As a result, we have very little knowledge of the physical stimuli in the natural social world -- their basic features, and the relations among them.  The major exceptions have to do with faces and voices -- social stimuli that are amenable to fairly simple physical descriptions -- where some investigators have made promising starts.


The use of physical features to make inferences about character and personality has its roots in physiognomy, a pseudoscience in which a person's character was judged according to stable features of the face -- much as the 19th-century phrenologists judged character from the bumps and depressions on the skull.  

The word physiognomy comes from the Greek Physis (nature) and gnomon (judge), and began with the observation that some people looked like certain animals.  It was only a short step, then, to infer that those individuals shared the personality traits presumed to be characteristic of those animals.  More generally, physiognomy was based on the assumption that a person's external appearance revealed something about his internal personality characteristics.  

References to physiognomy go back at least as far as Aristotle, who wrote in his Prior Analytics (2:27) that

It is possible to infer character from features, if it is granted that the body and the soul are changed together by the natural affections... passions and desires....

Aristotle (or perhaps one of his students) actually produced a treatise on the subject, the Physiognomonica.

Physiognomy fell into disrepute in the medieval period, and was revived by Giambattista della Porta (De humana physiognomia, 1586), Thomas Browne (Religio Medici, 1643), and Johann Kaspar Lavater (Physiognomische Fragmente zur Beförderung der Menschenkenntnis und Menschenliebe, 1775-1778).

And it's been revived again, much more recently.  In a study of transactions on a peer-to-peer lending site (Prosper.com), Duarte (2009) showed that people could make valid judgments of trustworthiness based on a head-shot: the criterion was the applicant's actual credit rating and history.

Here are some physiognomic drawings by Charles LeBrun (1619-1690), a French artist who helped establish the "academic" style of painting popular in the 17th-19th centuries (from Charles LeBrun -- First Painter to King Louis XIV).

[Images: LeBrun's physiognomic drawings -- Ram, Camel, Cat, Donkey]



Facial Expressions of Emotion

A good example of social-perception research involving descriptions of physical stimuli is the work of Ekman (1975, 2003; Ekman & Friesen, 1975) and others on facial expressions of emotion.  In his work, Ekman has been particularly concerned with determining the "sign vehicles" by which people communicate information about their emotional states to other people.  The fact that such communication occurs necessarily entails that there is a receiver who is able to pick up on the communications of a sender -- and this information pickup is exactly what we mean by perception. 

Facial Expressions in Art

Among the many formalisms taught in European painting academies in the 17th-19th centuries were standards for the depiction of emotion on the face.  One of the most popular of these texts was the Méthode pour apprendre à dessiner les passions, proposée dans une conférence sur l'expression générale et particulière (1698) by Charles LeBrun, a leader of French academic painting.  Here are samples from LeBrun's book, showing how various emotions should be depicted (from Charles LeBrun -- First Painter to King Louis XIV).


[Images: LeBrun's depictions of Anger, Desire, Fear, Hardiness, Sadness, Scorn, Simple Love, Sorrow, and Surprise]

One of Ekman's most famous findings is that people can reliably "read" certain emotions from the expressions on people's faces.  This is true even when the sender and receiver come from widely disparate cultures.  Close analysis of these expressions shows that each of them consists of a particular configuration of muscle activity, which can be described in terms of:

  • The type (or topography) of action (brow raise, nose wrinkle, lip corners down, etc.);
  • the intensity of action (the magnitude of the change in physical appearance resulting from an action); and
  • the timing of action (abrupt or gradual speed of onset, short or long duration, speed of offset, etc.).

[Table of images: each facial expression with its corresponding emotion and FACS analysis, for Happiness, Sadness, Fear, Anger, Surprise, and Disgust]


Ekman's system for coding the facial musculature is known as the Facial Action Coding System (FACS).  The system has more than 60 coding categories for various muscle action units (like the Inner Brow Raiser or the Lip Corner Puller) and other action descriptors (such as Tongue Out or Lip Wipe).  Each basic emotion, and every variant on each basic emotion, can be described as a unique combination of these coding categories.  And each coding category is associated with a specific pattern of muscle activity.

The Universality Thesis... and Its Discontents

Cross-cultural studies show that Ekman's basic emotions are highly recognizable across cultures.  Nelson and Russell (2013) summarized several decades' worth of such studies, involving subjects from literate Western cultures (mostly, frankly, American college students), literate non-Western cultures (e.g., Japan, China, and South Asia), and non-literate non-Western cultures (e.g., indigenous tribal societies in Oceania, Africa, and South America).  Subjects from all three types of culture recognized prototypical displays of the six basic emotions at levels significantly and substantially better than chance.  This is consistent with the hypothesis that the basic emotions, and the apparatus for producing and reading their displays on the face, are not cultural artifacts but something that is, indeed, biologically basic.

Evidence like this is generally taken as support for the universality thesis that facial expressions of the basic emotions are universally recognized.  They are a product of our evolutionary heritage, innate (not acquired through learning), and shared with at least some nonhuman species (especially primates).  Recognition of these emotions is a product of "bottom-up" processing of stimulus information -- essentially a direct, automatic readout from the target's facial musculature.  And, as the evidence shows, they are invariant across culture.  The ability to read the basic emotions from the face does not depend on contact with Western culture, literacy, or stage of economic development.  The universality thesis has its origins in the work of Darwin, and also in the writings of Silvan Tomkins, who was Ekman's mentor; but it is most closely associated these days with Ekman himself.

The universality thesis is widely accepted, but there are those who have raised objections to it, arguing that, at the very least, it has been overstated.  They note, in the first place, that recognition of the basic emotions is not, in fact, constant across cultures.  If you look at the results of the Nelson & Russell (2013) review, depicted above, you'll see clearly that, while recognition is significantly and substantially above chance levels, there are also substantial and significant cultural differences.  Only happiness, apparently, is truly universally recognized.  Recognition of surprise, and especially the more negative emotions, drops off substantially as we move to literate non-Western and then non-literate non-Western cultures. 

  • It turns out that emotion-recognition isn't purely a bottom-up process, and that context can make a big difference.
    • Viewing a face against a contrary background -- say, a sad face against a background picture of an amusement park -- makes it more difficult to recognize the facial expression.
    • Photoshopping a face on a contradictory bodily posture -- say, a smiling face on the body of someone who is asking for forgiveness -- also makes it more difficult to recognize the facial expression.
  • There are also a host of methodological issues that appear to inflate levels of emotion-recognition in the classic experiments.
    • In the Ekman pictures presented earlier, which are used in many studies, the facial expressions are posed by actors who have been trained to display the critical cues.  Pictures of "spontaneous" emotion are recognized at much lower levels than posed ones.
    • The subjects in the typical experiment are shown several expressions at a time; accuracy is reduced when subjects must deal with pictures shown only one at a time, precluding comparison.
    • The typical emotion-recognition experiment is structured as a within-subjects design, meaning that subjects are tested on all of the basic emotions.  Again, accuracy is reduced in a between-subjects design, in which individual subjects deal with only a single emotion.
    • Most critically, most of these experiments employ a forced-choice format, in which subjects are asked to match a picture with the appropriate emotion.  Recognition levels are substantially reduced in experiments employing a free-response format, in which subjects must generate their own labels for the pictures.
The bottom line is that while there is some evidence for some universality in emotion recognition, accuracy even when dealing with these six ostensibly "basic" emotions has been exaggerated to some extent.  In fact, a comprehensive survey by Lisa Feldman Barrett et al. ("Emotional expressions reconsidered: Challenges to inferring emotion in human facial movements", Psychological Science in the Public Interest, July 2019) shows that, while people do tend to display these classic facial expressions, they do not do so with enough consistency across contexts, individuals, and cultures to make them reliable indicators of an individual's emotional state.  Nor, for that matter, do perceivers reliably infer emotional states from facial expressions.  In a commentary on their article, Dacher Keltner and Alan Cowen et al., who have worked closely with Ekman, agree that there is nothing like an isomorphism between emotional state and facial expression, and that information from the face should be supplemented with other nonverbal channels, such as posture, body movements, and speech prosody ("Mapping the Passions: Toward a High-Dimensional Taxonomy of Emotional Experience and Expression", Psychological Science in the Public Interest, July 2019).

The Smile

Just as the face may be the pre-eminent social stimulus, so the smile may be the pre-eminent social behavior.

  • Smiles express happiness and interpersonal warmth, and so encourage social interaction.
    • Although, as Landis (1924) discovered, people also smile during a wide variety of activities, not all of which are pleasant.  In fact, smiling may be the most commonly used strategy for hiding our true emotions, positive or negative.
  • The perception of a smile tends to bring about an imitative smile on the part of the perceiver.
  • Feedback from the facial musculature that creates the smile may sustain, or even enhance, the mood of the person who is doing the smiling -- in this case, two people.

So, for example, Ekman distinguishes between two kinds of smile:

  • the "Duchenne smile", expressing genuine, involuntary happiness, involves the zygomaticus major muscle around the mouth and the orbicularis oculi muscle around the eyes.
  • the "Pan American smile", a voluntary, polite smile so named after the cabin attendants on a famous, but now-defunct, airline, involves only zygomaticus major.
  • In a similar way, Ekman believes that the sorts of "non-happy" smiles observed by Landis (1924) were not genuine "Duchenne" smiles, and can be distinguished from the real thing by his FACS system.  In fact, Ekman has identified as many as 17 distinct types of smile!
    • Infants as young as 10 months will smile differently to a stranger than they do to their mothers.
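
The distinction between the two smiles can be sketched as a simple rule over coded muscle activity.  This is only an illustration, not Ekman's actual coding procedure: the function name and labels are hypothetical, and the action-unit numbers (6 for the cheek raiser, produced by orbicularis oculi; 12 for the lip corner puller, produced by zygomaticus major) are FACS codes that the text above does not itself give.

```python
# Hypothetical sketch: a coded face is represented as a set of FACS action units.
AU_CHEEK_RAISER = 6        # orbicularis oculi, around the eyes (assumed AU number)
AU_LIP_CORNER_PULLER = 12  # zygomaticus major, around the mouth (assumed AU number)


def classify_smile(active_aus):
    """Label a coded expression using only the two muscles named in the text."""
    aus = set(active_aus)
    if AU_LIP_CORNER_PULLER not in aus:
        return "no smile"
    if AU_CHEEK_RAISER in aus:
        return "Duchenne"      # mouth and eyes both involved
    return "Pan American"      # mouth only


print(classify_smile({6, 12}))
print(classify_smile({12}))
print(classify_smile({4, 5}))
```

A real coding scheme would also take intensity and timing into account; the rule here looks only at which units are active.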

According to the Associated Press, customer-service employees at the Keihin Electric Express Railway Company in Japan can check their smiles against the Okao Vision face-recognition software system, to make sure that they are smiling properly at customers ("Japan Train Workers check Grins with Smile" by Jay Alabaster, Contra Costa Times 07/26/2009).  It's not clear whether they're being checked against a Duchenne smile or a Pan American smile.

And as another example, anger involves a large number of muscles.  So, as your grandmother told you, it really does take more muscles to frown than to smile.  So smile and save your energy.


Ekman's analysis of facial emotion has been used to offer a solution to a famous question in art history: what is it about the smile of Mona Lisa, in Leonardo da Vinci's famous painting (c. 1503-1505)?  It's not just Nat "King" Cole who has found this smile mysterious.  Part of the mystery of the smile is the ambiguous way it's painted -- which, according to a conventional theory, reflects the "archaic smiles" in the ancient Greek and Roman paintings and sculpture that so inspired Leonardo and other artists of the Renaissance.


In 2005 Nicu Sebe, a computer-vision researcher at the University of Amsterdam, scanned the Mona Lisa with an emotion-recognition program he developed with colleagues at the Beckman Institute of the University of Illinois, based on Ekman's analysis of facial expressions of basic emotions.  Using this program, he determined that the Mona Lisa's smile consisted of 83% happiness, 9% disgust, 6% fear, and 2% anger (New Scientist, 12/17/05).  So that's part of the mystery.  Or maybe it's just a smirk, as in this New Yorker cartoon by Emily Flake (08/30/2021).

On the other hand, Peter Schjeldahl, commenting on the sale (for almost half a billion dollars) of another Leonardo painting, Salvator Mundi, remarked on the "ambiguous mien" of Jesus, and went on to write: "Giving an ambiguous character an ambiguous mien doesn't seem a stop-the-presses innovation.  The trick of it, by the way, is the same as that of the 'Mona Lisa': painting different expressions in the eyes and in the mouth.  When you look at one, your peripheral sense of the other shifts, and vice versa.  You try to reconcile the impressions, with frustration that seeks and finds relief in awe."

Since then, work on computer recognition of emotion, based largely on facial cues, has progressed apace, evolving into a new sub-discipline known as affective computing.  Among the most highly developed of these systems is Affdex, a product of Affectiva, an offshoot of the MIT Media Lab.  Based largely on Ekman's FACS system, Affdex scans the environment for a face, isolates it from its background, and identifies major regions such as mouth, nose, eyes, and eyebrows -- distinguishing between non-deformable points, such as the tip of the nose, which remain stationary, and deformable points, such as the corners of the lips, which change with different facial expressions.  It computes various geometric relations between these points, and compares the current face to a very large number of other faces, previously analyzed, stored in memory.  It then outputs a probabilistic judgment of whether the face is displaying such states as happiness, disgust, surprise, concentration, and confusion.  It can distinguish between social smiles and genuine "Duchenne" smiles, and between real and feigned pain.  And it does this in real time.
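
To make the idea of "geometric relations between points" concrete, here is a minimal sketch.  It is not Affdex's actual algorithm, which the text does not describe in detail: the landmark names, the coordinates, and the particular measure (average distance of the lip corners from the nose tip, scaled by the distance between the eyes) are all assumptions chosen for illustration.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def lip_corner_spread(landmarks):
    """Average distance of the lip corners (deformable points) from the nose
    tip (a non-deformable point), divided by the distance between the eyes so
    that the measure does not depend on the size of the face in the image."""
    nose = landmarks["nose_tip"]
    scale = distance(landmarks["eye_left"], landmarks["eye_right"])
    left = distance(landmarks["lip_corner_left"], nose)
    right = distance(landmarks["lip_corner_right"], nose)
    return (left + right) / (2 * scale)


# Hypothetical pixel coordinates for the same face in two expressions.
neutral = {
    "eye_left": (30, 40), "eye_right": (70, 40), "nose_tip": (50, 60),
    "lip_corner_left": (38, 80), "lip_corner_right": (62, 80),
}
smiling = {
    "eye_left": (30, 40), "eye_right": (70, 40), "nose_tip": (50, 60),
    "lip_corner_left": (32, 76), "lip_corner_right": (68, 76),
}

print(lip_corner_spread(neutral))
print(lip_corner_spread(smiling))  # larger: the lip corners have moved outward
```

A working system would compute many such relations and feed them to a classifier trained on previously analyzed faces; this sketch shows only a single feature.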

For an article on affective computing, including the story of how market forces turned Affdex from an emotional prosthetic for autistic people into a marketing tool, see "We Know How You Feel" by Raffi Khatchadourian, New Yorker, 01/19/2015.

FACS is intended for use by professional researchers and clinicians, including computer analysis of facial expressions.  But the fact that people can reliably read emotions from other people's faces suggests that our perceptual systems are sensitive to changes in facial musculature. Ekman's FACS system is a formal description of the physical stimulus that gives rise to the perception of another's emotional states.

The stimulus faces used in much of Ekman's research come from actors posing various expressions according to Ekman's instructions, but we can read emotional states from people's faces in other circumstances as well.

Consider, for example, photographs taken in April 2000, when Elian Gonzalez, a Cuban boy who had lost his mother during an attempt to escape from Cuba, was being sheltered by some of his mother's relatives in Miami.  Elian's father, who was estranged from Elian's mother, and had remained in Cuba, demanded that he be returned to Cuba.  The US Department of Justice, for its part, determined that, legally, custody of the boy should be given to his closest living relative -- his father, who was in Cuba.  The Miami relatives refused to turn Elian over for repatriation, and in the final analysis an armed SWAT team from the Border Patrol forced its way into the relatives' house to retrieve the boy.  Little did the officers know that a newspaper reporter and photographer were already inside the house.  The resulting remarkable sequence of photographs shows the surprise of one officer when he discovered the photographer in the room.


The Embodied Smile

Like Ekman, Paula Niedenthal and her colleagues have cataloged a number of different types of smiles, but the differences they observe go far beyond patterns of facial activity.  It turns out (to quote the headline in the New York Times over an article by Carl Zimmer, 01/25/2011), that there's "More To a Smile Than Lips and Teeth".  Some smiles are expressions of pleasure, while others are displayed strategically, in order to initiate, maintain, or strengthen a social bond; some smiles comprise a greeting, others display embarrassment -- or serve as expressions of power.  Niedenthal has been especially active in examining the process of smile recognition -- recognizing the differences among smiles of pleasure, embarrassment, bonding, or power.  She proposes that smiles are "embodied" in perceivers through a process of mimicry, in which different types of smiles initiate different patterns of brain activity in the perceiver -- patterns that are similar to those in the brain of the person doing the smiling.

In one experiment, Niedenthal found, not surprisingly, that subjects could accurately distinguish between these different types of smiles.  But when they held a pencil between their lips, essentially interfering with the facial musculature that would mimic the target's smile, accuracy fell off sharply.  Similarly, the judgments of subjects in the "pencil" condition were more influenced by the contextual background than by the smiles themselves.  Apparently, mimicry plays an important role in the recognition of smiles (and, probably, other facial expressions as well).

Niedenthal's work exemplifies a larger movement in cognitive psychology known as embodied cognition or grounded cognition.  For most of its history, psychology has assumed that the brain is the sole physical basis of mental life. Embodied cognition assumes that other bodily processes -- in this case, the facial musculature -- are also important determinants of mental states.  And so is the environment -- like the context in which a smiling face appears.  Proponents of embodied cognition do not deny the critical role of the brain for the mind.  They just argue that other factors, in the body outside the brain, and in the world outside the body, are also important.

Full disclosure: Prof. Niedenthal worked in my laboratory as an undergraduate.  But even so, her work on the smile is the most thorough analysis yet.  For an overview of her work on smiles, see P.M. Niedenthal et al., "The Simulation of Smiles (SIMS) Model: Embodied Simulation and the Meaning of Facial Expression", Behavioral & Brain Sciences, 33(6), 2010.

Ekman and Darwin

Ekman's work on facial expressions of emotion is strongly informed by evolutionary theory.  Charles Darwin, in his book on The Expression of the Emotions in Man and Animals (1872), noted that the facial expressions by which humans expressed such emotions as fear and anger strongly resembled those by which other animals, such as apes and dogs, expressed the same states.  Ekman assumes, as Darwin suggested, that facial expressions of emotion are part of our phylogenetic endowment, or evolutionary heritage, a product of natural selection.  Ekman edited the 3rd edition of Darwin's Expression (1998).

According to Ekman, Darwin made five major contributions to the study of emotional expressions (Transactions of the Royal Society, 2009):

  1. He treated the emotions as discrete entities, not as points on a continuum.
  2. He focused primarily on the face (and secondarily on vocalization, posture, and other features).
  3. The facial expressions of emotion are universal (though emotional gestures might be culture-specific).
  4. Emotions are not unique to humans, but are found in many other species, especially vertebrates.
  5. The facial expressions of emotion stem from "serviceable habits" -- the raised upper lip characteristic of anger, for example, also exposes the teeth, which our evolutionary forebears used as weapons of attack and defense.

Based on comparative studies of emotional expression in different cultures, Ekman has suggested that there are at least six basic emotions, each associated with an evolved mode of facial expression:

  • joy
  • sadness
  • fear
  • anger
  • surprise
  • disgust.

There may also be other basic emotions, also "hard-wired" through natural selection:

  • contempt
  • anguish.

Ekman's evolutionary theory of facial emotion is interesting, but we do not have to accept it to construe his work on emotional expression as an aspect of person perception.  After all, the fact that people can "read" emotions in others' faces is precisely what we're interested in: how we get from physical stimulus information -- the facial expression -- to the perception of the states (cognitive, affective, conative) of the person.


Emotion Perception Beyond the Face

The face is a major channel for communicating emotional states, and probably the most important, but it is not the only one.  Tone of voice, gesture, posture, and gait are also available channels -- although they have not been given as much systematic attention as the face.

The importance of nonfacial expressions of emotion is underscored by cases of Moebius syndrome, a congenital condition first described by Paul Julius Moebius in 1888.  The condition entails a paralysis of the facial musculature (the illustration shows Kathleen Bogart, who has Moebius syndrome, and who studies the disorder, with her husband, Beau, from the New York Times, 04/06/2010).  People with Moebius syndrome cannot express emotions on their faces, so they must find other means of emotional expression, including both verbal and nonverbal channels.  They have no difficulty recognizing other people's facial expressions, however.  And they still feel various emotions.  This is important, because a major theory of emotion communication implies that we mimic other people's facial expressions, and feedback from our own facial expressions shapes both our perception of their emotional states, and our own emotional experience.  That can't happen in cases of Moebius syndrome, of course, because the facial paralysis prevents the feedback.  So either mimicry isn't important, or there are other mechanisms for emotion perception.

A similar problem is encountered in Bell's palsy, a neurological condition involving the (usually) temporary paralysis of the facial musculature caused by inflammation of the VII cranial nerve.  Jonathan Kalb, a theater professor at Fordham University, has written about his own experience with Bell's palsy in "Give Me a Smile" (New Yorker, 01/12/2015).  He has never completely recovered from the illness, with the result that his smile is "an incoherent tug-of-war between a grin on one side and a frown on the other: an expression of joy spliced to an expression of horror".  Kalb reports that he has difficulty communicating positive affect to other people, and they have difficulty reading positive affect from his facial expressions.  He also suggests that, because of disrupted feedback from the facial musculature, he has diminished experience of pleasant affect, and must engage other, compensatory strategies -- some drawn from tricks employed by Method actors.


The Determinants of Physical Attractiveness

Another prominent topic for person perception research has to do with the perception of facial beauty.  We know from research on interpersonal attraction that physical attractiveness is the most powerful determinant of likeability (e.g., Berscheid & Walster, 1974).  And we also know that likeability -- evaluation, in Anderson's terms -- influences a host of social judgments through the halo effect.  But exactly what determines physical attractiveness remains a mystery.  As Berscheid and Walster (1974), two social psychologists who are probably the world's foremost experts on interpersonal attraction, concluded, "There is no answer to the question of what constitutes beauty". 

Why is this question important?  One reason is Thorndike's halo effect.  People tend to believe (regardless of whether it's actually true) that socially desirable features go together.  Therefore, if someone is physically attractive, observers will also tend to think that the person is socially attractive -- on the "warm" end of the social good-bad scale, and on the "intelligent" end of the intellectual good-bad scale.  As the English Romantic poet John Keats wrote (in Ode on a Grecian Urn, 1819), "Beauty is truth, truth beauty".

Interestingly, there may also be a reverse halo effect.  Vincent Yzerbyt, Nicolas Kervyn, and their colleagues have found that, when comparing people with each other, a person (or group) who receives high ratings on warmth may receive low ratings on competence, and vice-versa.  Apparently, the traditional halo effect occurs when evaluating individuals separately, while the reverse halo effect occurs when comparing one individual with another.

There's no question about the bias toward facial attractiveness -- and not just in bars and bedrooms.  A number of writers have commented on the pervasiveness of "lookism", a concept modeled on racism, having to do with discrimination against those who are less than a perfect "10" (to use the title of a 1979 movie on this theme starring Dudley Moore, Julie Andrews, and Bo Derek as the eponymous beauty).  For more on lookism, see the following books, discussed by Rachel Shteir in "Taking Beauty's Measure" (Chronicle of Higher Education, 12/16/2011):

  • Hope in a Jar: The Making of America's Beauty Culture by Kathy Peiss (1998).
  • The Beauty Bias: The Injustice of Appearance in Life and Law by Deborah L. Rhode (2010).
  • Beauty Pays: Why Attractive People Are More Successful by Daniel S. Hamermesh (2011). 
  • Erotic Capital: The Power of Attraction in the Boardroom and the Bedroom by Catherine Hakim (2011).
  • Pricing Beauty: The Making of a Fashion Model by Ashley Mears (2011).
  • And last, but by no means least, Lip Service: Smiles in Life, Death, Trust, Lies, Work, Memory, Sex, and Politics by Marianne LaFrance (2011), a social psychologist and director of the Women's Studies Program at Yale.

The Role of Averageness

Actually, maybe there is an answer to the question of what constitutes beauty.  A large body of literature now suggests that attractiveness is strongly related to averageness -- in other words, that we find most attractive those faces (and, for that matter, bodies) that are close to the average for the population.  As counterintuitive as that may seem, there are actually good reasons to think that average faces really are highly attractive.

  • According to evolutionary theory, natural selection has a normalizing function.  Because (again according to evolutionary theory) extreme values on any feature mark genetic mutations, features that are at or near the average mark reproductive fitness.  Therefore, mate selection (which, according to evolutionary theory, is all about reproductive fitness) will prefer those with average features.
  • According to prototype theory, category prototypes may be thought of as the average of the instances of a category.  We respond to category prototypes as if they were familiar -- which they are, because they look like so many category instances; and we tend to prefer the familiar to the unusual.

Theory aside, a study by Langlois and Roggman (1990) does indicate that, as an empirical fact, average faces are more attractive.  In this study, full-front face-and-neck photographs of people bearing a pleasant, neutral expression (with background and lighting controlled) were digitized.  Each face was matched on the location of the eye pupils and the lip midline, and then composites were created through a computer averaging program.  The results of the study were clear: composite faces were preferred to individual faces, and the more faces that went into the composite (from 2 to 32), the more the composite face was preferred.  The more a face reflects the average of all faces, the more attractive it is.
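
The averaging step can be sketched as a pixel-by-pixel mean over digitized images.  This is a minimal illustration, not the program Langlois and Roggman used: it assumes the faces have already been aligned (for example, on the eye pupils and lip midline) and reduced to equally sized grids of gray values, and the function name and the tiny 2 x 2 "images" are hypothetical.

```python
def average_faces(faces):
    """Return the pixel-wise mean of equally sized grayscale images.

    Each image is a list of rows, and each row is a list of gray values.
    The images are assumed to be aligned already.
    """
    n = len(faces)
    rows, cols = len(faces[0]), len(faces[0][0])
    return [
        [sum(face[r][c] for face in faces) / n for c in range(cols)]
        for r in range(rows)
    ]


# Two hypothetical 2 x 2 "faces" standing in for digitized photographs.
face_a = [[0, 2], [4, 6]]
face_b = [[2, 4], [6, 8]]

composite = average_faces([face_a, face_b])
print(composite)  # [[1.0, 3.0], [5.0, 7.0]]
```

Adding more faces to the list pulls each pixel of the composite toward the average for the whole set, which is the sense in which a 32-face composite is "more average" than a 2-face composite.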

Referring to the Berscheid and Walster (1974) quote above, Langlois and Roggman (1990) concluded that the question of facial beauty had been solved: "[A]ttractive faces... represent the central tendency or the averaged members of the category of faces".

Averageness or Symmetry? 

But the question isn't entirely resolved, because evolutionary psychology has a somewhat different answer to the question of why we prefer average faces.  According to evolutionary psychology, patterns of experience, thought, and action that were adaptive in our ancestral environment (the Environment of Evolutionary Adaptedness -- roughly the East African savanna during the Pleistocene epoch) have been preserved in current members of the human species through natural selection.  In this view, mate selection prefers healthy, fecund mates: facial symmetry is a marker of health and fecundity, while fluctuating asymmetries on the face (and elsewhere on the body) are signs that the organism is unhealthy, and less desirable from the point of view of reproductive fitness.  Averaging eliminates these fluctuating asymmetries, and produces symmetrical faces.  So, according to evolutionary psychology, average faces are not attractive because prototypes seem familiar, but because average faces are more symmetrical.

There's certainly anecdotal evidence in favor of a connection between symmetry and attractiveness.  Queen Nefertiti, wife of the Pharaoh Akhenaten and co-ruler of ancient Egypt (14th century BCE), was widely acclaimed as the most beautiful woman in the ancient world (this was before Helen of Troy): her name Nefertiti even means (in rough translation) "The perfectly beautiful woman has come".  And Nefertiti is portrayed in images that survive from her time as having a perfectly symmetrical face.  Of course, these are only images -- we don't know what she really looked like.  She might just have had a good public-relations firm.


But we do know what the actress Elizabeth Taylor (who died in 2011 at age 79) looked like, and she really did have a perfectly symmetrical face -- and was universally acknowledged as fabulously beautiful.




On the other hand, 20th-century culture gives us lots of examples of very attractive women who have prominent fluctuating asymmetries on the face: consider, for example, the prominent "beauty marks" on the faces of Marilyn Monroe and Cindy Crawford.  Beauty marks are called "beauty marks" precisely because they enhance the person's facial beauty, but as fluctuating asymmetries they're supposed to mark a lack of reproductive fitness, and thus make the person less attractive, not more.

So something's wrong with the evolutionary argument.  In fact, the evolutionary story, like many of the "just-so" stories that abound in evolutionary psychology, sounds good, but doesn't stand up to close scrutiny.

In the first place, the connection between facial attractiveness and reproductive fitness appears to be pretty weak, perhaps nonexistent.  Kalick, Zebrowitz, Langlois, and Johnson (1998) analyzed data from the Intergenerational Studies conducted by the Institute of Human Development at the University of California, Berkeley, which included data from a large group of individuals who were born in the Berkeley-Oakland area between 1920 and 1929.  These subjects had been photographed as adolescents, and health assessments had been made on them during adolescence (ages 11-18), middle age (30-36), and old age (56-66).  There was essentially zero correlation between facial attractiveness, rated from the adolescent photographs, and health at any stage of life.  So, attractiveness does not seem to serve as a marker of health -- and thus of reproductive fitness.  Men aren't attracted to women because they think they'll produce lots of healthy babies.  Men are attracted to women because -- well, they're attractive.

In the second place, the relation between averageness and attractiveness does not appear to be mediated by symmetry.  In another experiment, Rhodes, Sumich, and Byatt (1999) employed computer-averaged composites of facial photographs that varied in their averageness, as defined in the Langlois and Roggman (1990) study.  Subjects then rated these photographs on symmetry, pleasantness, and attractiveness.

  • The raw (zero-order) correlation between attractiveness and averageness (r = .77) was much higher than the corresponding correlation between attractiveness and symmetry (r = .43).
  • Using a statistical technique called the partial correlation, Rhodes et al. recalculated the correlation between attractiveness and averageness, controlling for symmetry; and also the correlation between attractiveness and symmetry, controlling for averageness.  Both correlations dropped a little, as partial correlations do.  But the important point is that averageness continued to correlate with attractiveness, even with symmetry partialled out.  Therefore, the correlation between averageness and attractiveness was not an artifact of the symmetry of average faces.
  • Similar findings were obtained for pleasantness of emotional expression.     
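
The logic of the partial correlation can be sketched with the standard first-order formula, which removes the influence of a third variable z from the correlation between x and y.  The two zero-order correlations below are the ones reported above; the correlation between averageness and symmetry is not given in the text, so the value used here (.30) is purely hypothetical, and the outputs are illustrative rather than Rhodes et al.'s actual results.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_attr_avg = 0.77  # attractiveness with averageness (from the text)
r_attr_sym = 0.43  # attractiveness with symmetry (from the text)
r_avg_sym = 0.30   # averageness with symmetry: HYPOTHETICAL, not from the study

# Attractiveness and averageness, with symmetry partialled out.
avg_given_sym = partial_correlation(r_attr_avg, r_attr_sym, r_avg_sym)

# Attractiveness and symmetry, with averageness partialled out.
sym_given_avg = partial_correlation(r_attr_sym, r_attr_avg, r_avg_sym)

print(round(avg_given_sym, 2))
print(round(sym_given_avg, 2))
```

Under this assumed value, both partial correlations come out lower than their zero-order counterparts, but the one for averageness remains well above zero, which is the pattern the argument depends on.
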

Rhodes et al. concluded that averageness (the opposite of distinctiveness), symmetry, and pleasantness each make an independent contribution to physical attractiveness.  We find pleasant, symmetrical, average faces attractive, but the attractiveness of average faces is not an artifact of their symmetry.  (Nor, Rhodes et al. argued, is it an artifact of blending, which tends to remove distinctive characteristics even if they are symmetrical.)

Rhodes et al. (1999) assert that their experimental results "settle the dispute" between averageness and symmetry.  But the question remains why average faces are attractive.  Their best guess (and mine) is that average faces look like lots of other faces, and so they seem familiar; and we know from the mere exposure effect that we find the familiar more attractive than the unfamiliar.

Beyond Averageness and Symmetry

As the examples of Marilyn Monroe and Cindy Crawford suggest, there's probably more to facial beauty than averageness and symmetry.  In addition to those "beauty marks", there's skin tone, body-mass index and waist-to-hip ratio -- and a genuine smile.


Another facial feature that has been studied by researchers of person perception is babyfacedness.  Ethologists such as Konrad Lorenz have long noted that immature organisms, whether mammals or even birds and reptiles, share certain features in common:

  • enlarged eyes and lips;
  • soft, chubby cheeks;
  • fine eyebrows;
  • pug nose;
  • large cranium, relative to the face; and
  • non-sloping forehead.
Lorenz suggested that babyfacedness constitutes a "universal stimulus" that elicits care-taking behaviors, and inhibits aggression.  Leslie Zebrowitz (nee McArthur) picked up on Lorenz's ideas, and has studied the consequences of babyfacedness for social perception and social interaction.

Using computer "morphing" programs, it is possible to take line drawings or photographs of faces and adjust their features to make them appear more or less baby-like.  Subjects then rate the targets for various personality traits.  Research by Zebrowitz and her colleagues generally finds that people perceive baby-faced individuals as warmer, weaker, more naive and trusting; they are also more likely to help baby-faced people, even when help isn't needed.


In one study, Friedman and Zebrowitz (1992) took schematic drawings of male and female human faces and manipulated their facial features to create or erase aspects of babyfacedness.  As it happens, there is a sex difference here, with the typical female face possessing more "baby-faced" features than the typical male face.  Therefore, by adding baby-faced features they made the typical male face more "babyish" in appearance, and by removing them they made the typical female face appear more "mature".  They then had male and female subjects view the sketches, and make ratings of their impressions of the targets' personalities.


Baby-faced males and females alike were rated lower on power, compared to their mature-faced counterparts.  But because the typical male face has more mature features than the typical female face, the typical male was rated as more powerful than the typical female.



Baby-faced females (but not baby-faced males) were rated higher on warmth than their mature-faced counterparts.  Again, because the typical male face has more mature features than the typical female face, the typical male was rated as less warm than the typical female.



Perhaps not surprisingly, because of the sex difference in babyfacedness, baby-faced males and females alike were rated lower on masculinity (and thus higher on femininity), compared to their mature-faced counterparts.  Again, because the typical male face has more mature features than the typical female face, the typical male was rated as more masculine than the typical female.


Baby-faced females (but not baby-faced males) were rated more likely to be the "child caretaker" in the family than their mature-faced counterparts.  Again, because the typical male face has more mature features than the typical female face, the typical male was rated as less likely to be a child caretaker than the typical female.


Baby-faced females (but not baby-faced males) were rated less likely to be the "financial provider" in the family than their mature-faced counterparts.  Again, because the typical male face has more mature features than the typical female face, the typical male was rated as more likely to be a financial provider than the typical female.



These are social stereotypes, of course, but that's the point: social perceivers use the physical properties of the face to make inferences about the emotions and dispositions of the person.

The baby-faced stereotype is so commonly held that it has been employed in cartoon characters (such as Elmer Fudd and Tweety Bird).  And in political humor as well.  After Vice President Cheney was involved in a quail-hunting accident, in which he peppered one of his companions with bird shot, he was depicted as Elmer Fudd -- a clear contrast between the befuddled lovableness of the cartoon character and Cheney's own reputation as a humorless right-wing ideologue.


Of course, "baby-facedness" is a stereotype, and stereotypes can be misleading, and sometimes downright wrong.  For example, Umar Farouk Abdulmutallab, the "Underpants Bomber" of Christmas Day, 2009, was commonly described in press accounts as "baby-faced".


Beyond the Face in Person Perception

Most research on social perception has focused on the face, which is after all the most salient, perhaps the quintessential, social stimulus.  However, other nonverbal cues play a part in person perception, including vocal (prosodic) cues, gestures, and other aspects of body language.  Here, in a famous photograph from the New York Times (1957), the future president Lyndon B. Johnson (then Majority Leader of the United States Senate) discusses a point of legislation with a colleague.

Body Language

Edward T. Hall first drew attention to several aspects of body language in his popular book, The Hidden Dimension (1966).

  • Proxemics, or the study of social distance.
  • Posture
  • Calypsis, or the strategic covering and uncovering of body parts.
  • Gesture

Robert Rosenthal and his colleagues have developed a psychological test to assess individual differences in people's sensitivity to nonverbal cues -- including, but going beyond, facial cues (Rosenthal, Hall, DiMatteo, Rogers, & Archer, 1979).  The Profile of Nonverbal Sensitivity (PONS) consists of 220 2-second audio/video clips portraying a 24-year-old woman (Judith Hall, now a Professor of Psychology at Northeastern University) acting out a set of 20 vignettes.  The subjects' task is to guess which of two vignettes is being acted out.  

The vignettes are classified into a 2x2 scheme crossing positive-negative with dominant-submissive, with 5 scenes in each category.

  • Positive/Dominant: e.g., admiring the weather, or talking about a wedding;
  • Positive/Submissive: e.g., asking for a favor, or helping a customer;
  • Negative/Dominant: e.g., threatening someone, or nagging a child; and
  • Negative/Submissive: e.g., asking forgiveness, or returning a faulty purchase.
Each vignette, in turn, is presented over each of several nonverbal channels of communication:
  • 3 Visual Channels Alone: full figure, face alone, or body alone, all with no vocal cues.
  • 2 Vocal Channels Alone: there are no visual cues at all; moreover, the content of the actor's speech has been rendered unintelligible by one of two methods: electronic filtering of certain frequencies or (literally) cutting up the audiotape and splicing it back together randomly.  In either case, only the "tone of voice" remained.
  • 6 Combinations of Each Visual and Each Vocal Channel.
The 11 communication channels x 20 affective scenes yield the 220 items of the PONS test.
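
As a quick check on this bookkeeping, the item structure can be enumerated in a few lines of Python.  This is only an illustrative sketch: the channel labels are shorthand for the descriptions above, not the names used in the PONS manual, and the scenes are represented by index numbers rather than by their content.

```python
from itertools import product

visual = ["full figure", "face", "body"]
# Shorthand labels for the two methods of masking speech content described above
vocal = ["filtered speech", "randomly spliced speech"]

channels = (
    [(v, None) for v in visual]      # 3 visual-only channels
    + [(None, a) for a in vocal]     # 2 vocal-only channels
    + list(product(visual, vocal))   # 6 visual + vocal combinations
)

n_scenes = 4 * 5  # 2x2 affective categories, 5 scenes in each
items = list(product(range(n_scenes), channels))

print(len(channels), len(items))  # prints: 11 220
```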

Rosenthal and his colleagues were interested in using the test to measure individual differences in sensitivity -- i.e., in perceptual ability -- across the various channels of nonverbal communication.  In the context of this course, the PONS is a good illustration of the point that there are physical sources of social information beyond the face, including vocal and gestural cues.


Person Perception Beyond the Body

Person perception is shaped by the person's physical features, but it is also influenced by aspects of his or her dress: what one hides and reveals, and what one draws attention to, are also stimulus cues as to a person's internal psychological state.  Even a person's office or bedroom can provide clues to his or her personality.

A Different Sort of "White Coat Syndrome"

An interesting illustration of this point occurred in 2000, in a dispute over the dress code at Duke University Medical Center (a similar dispute also arose at the Massachusetts General Hospital in Boston, one of Harvard's teaching hospitals).  At Duke, physicians wore two different types of white coats: a knee-length duster was restricted to senior physicians, while a hip-length jacket was imposed on interns and residents.  Thus, in the Duke (and MGH) environment, the type of white coat worn by a physician conveyed information about the person wearing it -- his or her level of training, and presumed levels of knowledge and expertise.


Interestingly, this long coat-short coat tradition has a long history.  In England and America, through the 19th century, there was a professional distinction between physicians and surgeons.  Physicians diagnosed and cured illness, while surgeons removed diseased body parts.  Physicians, who enjoyed a higher social status than surgeons, wore long coats, while the lower-status surgeons wore short coats.  When medicine and surgery were united in the 20th century, this long coat-short coat distinction was transferred to the person's degree of training.  

Actually, the professional distinction between physicians and surgeons is still honored in some ways: Columbia University has a College of Physicians and Surgeons, while British medical students work toward two professional degrees, the MB, or Bachelor of Medicine, and the ChB, or Bachelor of Chirurgerie (surgery).  

Anyway, Duke's medical residents complained about this policy.  They said that they usually wore white trousers with the short white jacket, and the combination made them feel like ice-cream vendors.  More important, perhaps, patients picked up on the fact that senior physicians wore long coats, and often questioned the profession, status, and competence of those who wore short jackets.  Moreover, female residents were often confused with nurses, who also wore white coats; for the same reason, male residents were often confused with the janitorial staff.  The short coat-long coat distinction even affected relations among the medical staff.  Senior staff were more likely to speak to residents who wore long coats, and especially more likely to address them as equals.

In the event, Duke altered the dress code as it pertained to residents, but short coats were still imposed on interns.  So, class distinctions prevailed after all!

The white coat, which has been worn by physicians since the scientific revolution in medicine of the late 19th century, is part of the identity of most physicians, and part of the concept of physician held in the mind of the public.  If nothing else, it creates a link between medicine and science.  

"The coat is part of what defines me, and I couldn't function without it", said Dr. Richard Cohen, a clinical professor of medicine at Weill Medical College of Cornell University and an attending physician at New York-Presbyterian Hospital.  "When a patient shares intimacies with you and you examine them in a manner that no one else does, you'd better look like a physician -- not a guy who works at Starbuck's"....   A Postgraduate Medical Journal study in 2004 found that 56 percent of patients surveyed felt that physicians should wear them.  About 94 percent of schools of medicine and osteopathy in the United States have "white coat ceremonies" whereby new students don the garment to signify their entry into the profession [note by JFK: Geez, they used to get a little black bag and a stethoscope].  ("The Lab Coat Is On the Hook In the Fight Against Germs" by Thomas Vinciguerra, New York Times, 07/26/2009).

But not for long.  As Vinciguerra notes, it has become increasingly clear that the white coat, whether long or short, is a major carrier of bacteria and thus a major source of hospital-based infections, and a major contributor to morbidity and mortality -- not to mention increased healthcare costs.  In 2007, the British national health system adopted a policy of "bare below the elbow" banning lab coats as well as long-sleeved shirts and blouses, neckties, long fingernails, and jewelry on the hands and wrists.  Maybe physicians will have to settle for wearing their stethoscopes.


Lie Detection as a Problem of Person Perception

Much of Ekman's work on facial emotion, and much of the interest in nonverbal communication generally, has to do with the detection of deception -- or, put bluntly, with lie detection.  How can we know when someone is deceiving us?  Note that, in terms of person perception, the question of deception is this: how can we know that a person is deceiving us about his or her internal mental state -- about what he or she is thinking, feeling, or desiring?  Ekman's work on behavioral (as opposed to physiological) lie-detection has been extremely influential. He has consulted with law-enforcement agencies at all levels of government, and has even "gone Hollywood" as a consultant to the TV show Lie to Me (Fox), about Dr. Cal Lightman (played by Tim Roth), a "human polygraph" who can read body-language "micro-expressions". 

The problem, of course, is that people rarely tell us that they are lying -- what would be the point of that?  (Even Epimenides, the Cretan philosopher of the 6th-century BC, who asserted that "All Cretans are liars", could not have been lying, because if all Cretans really were liars, he -- a Cretan himself -- would have been telling the truth, thus disproving his own statement).  Instead, we usually have to infer, from their nonverbal behavior, that their verbal communications are not accurate.

And let's be clear -- lying is a serious problem of social perception.  DePaulo et al. (1996) conducted a study of everyday lying by means of a diary study in which subjects were asked to keep track of all of their social interactions for a week, including instances in which they lied.  They found that lying is a common feature of social interaction.  College students recorded lying about twice a day, on average, in 1/3 of all their social interactions.  A community sample lied somewhat less often: about once a day, in about 1/5 of their interactions.  DePaulo et al. hasten to point out that most of these lies were trivial, but they were untruths nonetheless.

So lying is an important aspect of social interaction, and so our ability to detect lying is an important aspect of social perception.


Can "Only a Few" "Tell a Liar"?

It turns out that we are surprisingly bad at this.  Our poor lie-detection abilities were dramatically illustrated in a study by Ekman and O'Sullivan (1991).  For this study, they created 10 1-minute video clips, each showing a full head-on view of a target's face and body.  In each clip, the target described his or her positive emotions while viewing a film.  Half of the targets were viewing a pleasant nature scene, in which case they were telling the truth about their emotional state.  The other half of the targets were actually viewing a very gruesome scene -- in which case they were not telling the truth.  The subjects' task was to identify which of the targets were telling the truth, and which were lying.  Ekman and O'Sullivan tested several different groups of subjects, ranging from college students and psychiatrists to law-enforcement officials.

Averaged across all the groups, the subjects were only about 57% correct -- barely above chance levels.  Only agents of the United States Secret Service, then a branch of the Treasury Department, which has responsibility for protecting the President and other high officials, were particularly good at picking out liars: 53% had 70% or greater accuracy, compared to the 50% "chance" level.


A later study by Ekman, O'Sullivan, and Frank (1999), which focused on subjects who had special professional interests in lie-detection, yielded similar, if somewhat better, results.  This time, the subjects achieved about 63% accuracy -- which is better than chance, but not all that great.  The top scorers (those with 70% or greater accuracy) were federal "law enforcement" officers, most of whom were actually agents of the Central Intelligence Agency.

A cautionary note: The data in the 1991 study were collected while Ekman delivered talks on behavioral lie detection to various professional audiences.  After presenting the film clips, Ekman revealed to his audience which targets had been lying, and which telling the truth.  He then asked the members of the audience to raise their hands if they got 10, 9, 8, etc. correct, and tallied the results.  Given natural tendencies for self-enhancement, it seems likely that the audience self-reports of accuracy were inflated somewhat.  By how much, however, we cannot know for sure.  Presumably, the same polling procedure, also resulting in possibly inflated scores, was employed in the 1999 follow-up.

So while most people are pretty bad at behavioral lie-detection, some people are better than others.  Ekman argued that lie-detection is possible when perceivers pick up on the leakage of nonverbal cues.  For example, people tend to display "Duchenne" smiles when telling the truth, but "Pan American" smiles when telling lies.  When they lie, their vocalizations also tend to show an increase in fundamental pitch.  Ekman and his colleagues were able to detect this leakage through special means, such as viewing the videos in slow motion and noticing micro-expressions of affect that are incongruent with the content of the target's message.  However, these micro-expressions can also be picked up in real time, especially by people -- like Secret Service and CIA agents, perhaps -- who have had a lot of experience with distinguishing truths from lies.  In the 1991 and 1999 experiments, the successful subjects were able to pick up on these instances of leakage.

However, the situation is a little more complicated than this, because even some of the "top scorers" didn't perform better than chance.  This is a little counterintuitive, because you'd think that anything better than 50% would count as "greater than chance".  But as Nickerson and Hammond (1993) pointed out, when the probability of a hit p equals the probability of a miss q, even 8 hits out of 10 is not significantly greater than chance at the p < .05 level (actually, it just misses).

  • Using a more stringent criterion of 8/10 hits, the proportion of high-scoring Secret Service agents in the Ekman & O'Sullivan (1991) study fell from 53% to 29%.  That might be a better rate than the ordinary  person on the street, but it's nothing to write home (or a paper!) about.
  • And in the Ekman et al. (1999) study, the percentage of high-scoring Federal officers would fall from 74% as well (Ekman et al. don't provide data that would permit the actual calculation). 
Still, as Ekman & O'Sullivan (1993) pointed out in reply, the Secret Service agents did better than anyone else.  The point of this is not to criticize Ekman's work, but rather to point out that the determination of "better than chance" levels of responding isn't quite as simple as it would seem, intuitively, to be.
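
To see why 8 out of 10 "just misses", one can compute the probability of getting that many hits or more by guessing alone.  The sketch below assumes a one-tailed exact binomial test with p = q = .5; this is offered as one reasonable reading of the argument, not necessarily the exact procedure Nickerson and Hammond used.

```python
from math import comb


def p_at_least(hits: int, n: int = 10, p: float = 0.5) -> float:
    """One-tailed probability of `hits` or more correct out of n by guessing."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(hits, n + 1))


for hits in (7, 8, 9):
    print(hits, round(p_at_least(hits), 4))
```

Under these assumptions, 7 or more hits occurs by chance about 17% of the time, 8 or more about 5.5% of the time (just over the conventional .05 cutoff), and 9 or more about 1% of the time.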

The detection of deception can be construed as a problem for signal detection theory.  In contrast to traditional analyses of accuracy, which focus on hits (and their obverse, misses), signal-detection theory focuses on hits and false alarms -- in this case, instances where a target is called a liar but is actually telling the truth.  If you call everyone a liar, you'll correctly identify every actual liar, but you'll also misidentify all the truth-tellers.  Good lie-detection will maximize hits while minimizing false alarms.

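A bigger problem has to do with the measure of "accuracy" employed in these Ekman studies, which takes only correct responses into account.  For example, in the studies described, a subject would "catch" 100% of the liars simply by calling everyone a liar.  So it's important to take error into account.  From this perspective, we can classify subjects' responses into four categories:

  • Hits: a liar is correctly called a liar;
  • Misses: a liar is mistakenly called truthful;
  • False alarms: a truth-teller is mistakenly called a liar; and
  • Correct rejections: a truth-teller is correctly called truthful.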

There are some ways of taking false positives and false negatives into account.

It was just to address this problem that signal detection theory (SDT) was invented (Green & Swets, 1966; see also Tanner & Swets, 1954).  In sensory psychophysics, the observer's problem is to discriminate between trials in which a signal is presented against a background of noise, and other trials in which only noise is presented, no signal.  On any trial, an observer might actually detect the signal.  Alternatively, he might miss the signal, because it's too faint.  Or, the signal might be strong enough, but he might miss it because he's not expecting it.  Or, he might miss it because the costs of making a mistake are relatively low.  There are other possibilities.  In any event, the point is that the observer's performance must take account of both the observer's sensory acuity and his biases, expectations, and motivations.  SDT does this by separating performance into two parameters:

  • Sensitivity (d'), which reflects the observer's ability to discriminate signal from noise; and
  • Bias (beta, or the related measure C), which reflects the observer's tendency to say "Yes" or "No" regardless of what was actually presented.

For the purpose of this course, you don't need to know how to calculate either d' or beta (or any of the other SDT parameters).  You just need to know the concepts.

Signal-detection experiments are set up so that on some trials (e.g., half), a signal is presented against a background of noise; on other trials, the signal is omitted, and only the noisy background is presented.  On each trial, the observer responds with a "Yes", indicating that the signal was present, or a "No", indicating that it was absent.  The 2x2 arrangement yields the proportion of trials representing "Hits", "Misses", "False Alarms", and "Correct Rejections".  The translation of this framework into the lie-detection situation is obvious. 
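
The translation can be made concrete with a short sketch in which "lying" plays the role of the signal.  The function names and the set of 10 targets are hypothetical, chosen only to illustrate the 2x2 table and the "call everyone a liar" strategy.

```python
from collections import Counter


def classify(is_liar: bool, called_liar: bool) -> str:
    """Map one judgment onto the 2x2 signal-detection table (signal = lying)."""
    if is_liar:
        return "hit" if called_liar else "miss"
    return "false alarm" if called_liar else "correct rejection"


def rates(targets, judgments):
    """Return (hit rate, false-alarm rate) for parallel lists of booleans."""
    counts = Counter(classify(t, j) for t, j in zip(targets, judgments))
    hit_rate = counts["hit"] / (counts["hit"] + counts["miss"])
    fa_rate = counts["false alarm"] / (
        counts["false alarm"] + counts["correct rejection"]
    )
    return hit_rate, fa_rate


# Hypothetical set of 10 targets: 5 liars (True) and 5 truth-tellers (False)
targets = [True] * 5 + [False] * 5

# A judge who calls everyone a liar catches every liar,
# but also misidentifies every truth-teller.
print(rates(targets, [True] * 10))  # prints: (1.0, 1.0)
```

A hit rate of 100% looks impressive until it is set beside a false-alarm rate of 100%, which is why both numbers are needed.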

Unfortunately, all too many studies of lie-detection aren't amenable to analysis in terms of SDT, because all too many investigators fail to report false alarms as well as hits.  This was the case with the Ekman & O'Sullivan (1991) study, but Ekman et al. (1999) did report separate values for accuracy in lie-detection and accuracy in truth-detection, which enables us to calculate the false-alarm rate (as 100% - accuracy in truth detection).

For the Federal Officers, accuracy in lie-detection was 80%, while accuracy in truth-detection was about 66% -- which means that truth-tellers were falsely called liars about 34% of the time.  Applying the formulas of SDT yields a d' measure of sensitivity of about 1.26, and a C measure of bias of about -.21.
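
You don't need to do this calculation for the course, but for the curious, here is a minimal sketch of it.  It assumes the standard equal-variance Gaussian model, in which d' = z(hits) - z(false alarms) and C = -[z(hits) + z(false alarms)] / 2, where z is the inverse of the standard normal distribution function; the function names are illustrative.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # converts a proportion into a standard-normal z score


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity: distance between the signal and noise distributions."""
    return z(hit_rate) - z(fa_rate)


def criterion_c(hit_rate: float, fa_rate: float) -> float:
    """Bias: negative values indicate a 'liberal' tendency to say Yes."""
    return -(z(hit_rate) + z(fa_rate)) / 2


hit_rate = 0.80      # accuracy in lie-detection (liars called liars)
fa_rate = 1 - 0.66   # truth-tellers falsely called liars

print(round(d_prime(hit_rate, fa_rate), 2), round(criterion_c(hit_rate, fa_rate), 2))

# Near-perfect performance, for comparison: 99% hits and 1% false alarms
print(round(d_prime(0.99, 0.01), 2))
```

Run with these rounded inputs, the sketch gives values close to the d' and C quoted above; any small discrepancy reflects the rounding of the "about 66%" figure.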

All by themselves, these numbers aren't too meaningful.  But if you construct tables representing the values of various combinations of hits and false alarms, you can get a sense of the subjects' performance. 
  • A d' = 1.26 puts the Federal Officers about in the middle, between randomness (i.e., no sensitivity, or d' = 0) and almost-perfect performance (99% hits and 1% false alarms, yielding d' = 4.65). 
  • Similarly, a C = -.21 indicates a slight "liberal" bias toward "Yes" -- that is, a bias toward calling targets liars.

For all the subjects in the 1999 study, the results were much the same, except that all of the subjects taken together, including the Federal Officers, showed less sensitivity (d' = .66), and only a very slight liberal bias (C = -.07).

Moreover, there is another, more subtle problem with the Ekman/O'Sullivan studies, which is that the targets in these studies were individuals who were determined, by the FACS system, to be "leaking" cues, especially through their faces, that they were lying instead of telling the truth.  The good lie-detectors were apparently able to pick up on these cues, so that they were able to perform better than chance.  But how representative are these "leaky" liars?  In developing their stimulus materials, Ekman and his associates drew on a sample of 31 individuals who were instructed to deliberately lie.  Only 10 of these liars (32%) actually leaked cues that were picked up by the FACS system, and were used as targets in the 1991 and 1999 studies.  This means that the remaining 21 liars (68% of the total), who were excluded from the experiment, didn't leak any (facial) cues that revealed them to be lying.  So, the good lie-detectors in the Ekman studies were only "good" with respect to their ability to pick up certain facial cues to deception in those liars who were so poorly skilled in lying that they leaked them in the first place.

Apparently, if most people are bad at detecting lies, most of us are pretty good at lying.  In fact, maybe that's why we're so bad at lie-detection: it's not so much that we're bad at detecting lies, but that we're so good at lying undetectably! 

Or, put another way, lie-detection is a problem of signal-detection, and people are typically bad lie-detectors because so often there is no signal to detect!

Lie-Detection in Lab and Life

Ekman and O'Sullivan based their conclusions about people's lie-detection abilities on their own studies -- where, frankly, the experimental procedures are somewhat informal (the subjects are typically members of the audience to whom Ekman is giving a talk).  More systematic laboratory research comes to much the same conclusion: People just aren't particularly good at it.

In their reviews of the experimental literature, Kraut (1980), Vrij (2000), and Bond and DePaulo (2006; hereafter B&DeP) all found that the average receiver was barely better than chance at detecting lying under natural conditions -- that is, when the senders included both leaky and non-leaky liars.  Things looked a little better, though, when B&DeP looked at continuous ratings of honesty, rather than dichotomous judgments of lying.  Under these circumstances, honesty ratings distinguished between liars and truth-tellers to a modest degree.

Of course, these results can also be cast in terms of signal-detection theory.  Applying the formulas, as we did in the study by Ekman et al. (1999):
  • A d' = 0.20 puts the average subject very close to randomness (i.e., no sensitivity, or d' = 0) and far from almost-perfect performance (99% hits and 1% false alarms, yielding d' = 4.65). 
  • Similarly, a C = 0.18 indicates a slight "conservative" bias toward "No" -- that is, a bias toward calling targets truthful.  (You can see hints of this same sort of bias in the Ekman et al. (1999) study, among those subjects who were not Federal Officers.)
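
To get a feel for what these two numbers mean, one can run the SDT formulas in reverse and ask what hit and false-alarm rates they imply.  The sketch below assumes the equal-variance Gaussian model, under which z(hits) = d'/2 - C and z(false alarms) = -d'/2 - C; the function name is illustrative, and the output is what the model implies rather than a figure taken from the review itself.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal distribution function


def implied_rates(d_prime: float, c: float):
    """Return the (hit rate, false-alarm rate) implied by d' and C."""
    hit_rate = phi(d_prime / 2 - c)
    fa_rate = phi(-d_prime / 2 - c)
    return hit_rate, fa_rate


hit_rate, fa_rate = implied_rates(0.20, 0.18)
print(round(hit_rate, 2), round(fa_rate, 2), round(1 - fa_rate, 2))
```

On this model, d' = 0.20 and C = 0.18 correspond to catching a bit fewer than half of the liars while correctly accepting roughly 6 in 10 truth-tellers: low sensitivity combined with a lean toward "truthful".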

Part of the problem with lie-detection may be that most of us tend to assume that people are telling the truth.  B&DeP discovered a small "truth bias" -- a tendency to judge that people are telling the truth, even when they're not.  This effectively reduces our ability to make correct judgments of the matter.

B&DeP also examined a number of other variables that might affect lie-detection performance:

There is a modality effect, such that lie-detection is more accurate when receivers have access to an audio channel, as well as -- or instead of -- a video channel.  This may also help explain Ekman & O'Sullivan's results: because of their emphasis on facial cues, they typically present subjects with only a visual channel.

Detection of deception does not depend on whether the sender is highly motivated to be believed.  In fact, there is a paradox of motivation such that the receiver's "truth bias" is reduced when the sender is highly motivated to deceive.  Still, the receivers are not very accurate at correctly detecting deception.  

Nor does it depend on whether the sender has been given an opportunity to prepare to deceive.  Nor does the receiver's expertise matter much (remember, even Ekman and O'Sullivan's experts weren't all that good, and there aren't that many of them to begin with).

However, detection of deception is better if the receiver has had some prior exposure to the sender -- that is, prior to the experimental test.

A very provocative finding is that third parties, observing the interaction between the sender and the receiver, are better at detecting deception than the receivers are.

Another problem is that people do not have very accurate knowledge about valid cues to deception -- and many of their beliefs about valid cues turn out to be wrong.  When Miron Zuckerman (1981, 1985) reviewed research on nonverbal cues to deception, he discovered that there were a number of valid cues on which we could base a judgment that a person was being deceptive.  However, there were two important aspects of his findings:

  • First, none of these cues is pathognomonic of lying -- that is, always diagnostic of a lie.  Some nonverbal cues are correlated with lying, but the correlations are far from perfect.
  • Second, people frequently believe that certain cues are valid when they are not, or believe that certain cues are more valid than they are; and they ignore some cues that are, at least to some degree, actually valid.

Some cues are components of the deceiver's verbal behavior.

Other cues are paralinguistic -- that is, they have less to do with what the deceiver says than with how he or she says it.

Still other cues are visual rather than vocal or auditory.

And then there are some miscellaneous cues to deception.

Despite the availability of such cues, people are surprisingly poor at reading them -- partly because they're attending to cues that are, in fact, invalid! 

Similar findings were obtained in a review by DePaulo et al. (2003), who examined more than 100 studies and more than 150 possible cues.

Presumably, though, people could be taught to read these cues properly, in the same way that Ekman's FACS system presumably teaches people to read people's emotional expressions more accurately.  If so, the detection of deception from verbal and nonverbal cues, like the reading of facial expressions of emotion, is a perceptual skill that can be acquired through perceptual learning, much as people can learn to adjust to viewing the world through distorting prisms.

This is, in fact, the premise of a program, initiated by the Transportation Security Administration, called SPOT -- for Screening of Passengers by Observational Techniques.  Beginning in 2007, the TSA spent approximately $200 million per year training personnel to spot behavioral cues to deception in airline passengers' facial expressions and other aspects of body language.  However, a November 2013 evaluation by the Government Accountability Office recommended that the SPOT program be terminated, on the grounds that it was not adequately supported by scientific evidence.  In large part, the GAO based its conclusions on Bond and DePaulo's 2006 review.  Ekman's response is that B&DeP relied too much on laboratory studies of "low-stakes" lies, which may not generalize to the real-world problems faced by TSA screeners.  (For a journalistic account of the debate, see "The Liar's 'Tell'" by Christopher Shea, Chronicle of Higher Education, 10/17/2014.)

Lie Detection in Forensic Settings

Ekman's studies suggest that law-enforcement personnel -- or, at least, some of them -- tend to have acquired a particular perceptual skill of lie-detection.  But Bond and DePaulo's studies, among others, suggest that most people don't have the knack.  And other critiques suggest that even the skills of trained law-enforcement personnel may be exaggerated.

First, let's get one thing -- actually, two closely related things -- straight.  Lie detectors don't work very well either.

  • The traditional polygraph, which records various indices of autonomic nervous system functioning, such as heart rate and blood pressure, is very poor except under very tightly controlled conditions -- conditions that don't usually obtain in actual field settings.

  • More recent innovations, such as the use of EEG and brain-imaging methods (including so-called "brain fingerprinting"), don't work any better.

Lie-Detection and Berkeley

The traditional polygraph was developed at Berkeley, based on an earlier prototype developed by William Moulton Marston, a Harvard psychologist.  August Vollmer, at that time the police chief for the city of Berkeley, collaborated with John Larson, a UCB physiology PhD who had joined the police force as a patrolman (!). Beginning in 1920, Larson tested a wide variety of suspects on his apparatus.  When Vollmer left Berkeley to become the police chief in Los Angeles, in 1923, he took another polygraph enthusiast, Leonarde Keeler, with him (Larson, for his part, took the device to a new job in Chicago at the Institute for Juvenile Research).  It was Keeler, in fact, who coined the term polygraph, and popularized the technique within law-enforcement circles.

Subsequent legal debates over the validity of polygraphic lie detection resulted in the Frye Rule concerning the admission of scientific evidence in court -- that "expert testimony deduced from a well-recognized scientific principle or discovery" requires that "the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the particular field in which it belongs". 

For the whole story, see The Lie Detectors: The History of an American Obsession (2010).

The only physiological technique that is really good at detecting lies is the Guilty Knowledge Test, devised by David T. Lykken, which assesses a suspect's "secret knowledge" of the details of a crime.  So, for example, in the case of a stolen watch, the investigator might ask the suspect whether the stolen watch was a Bulova or a Rolex. All things being equal, an innocent person will not respond differentially to the two probes; but a guilty person will know the truth, and this knowledge will show up in his physiological response.  Studies have shown that the GKT produces a hit rate of 80-90%, and a false alarm rate of less than 10%.  The trick, of course, is that there has to be some aspect of the crime that only the perpetrator would know.  For this reason, it is not always possible to use the GKT; but when it's possible, it's dynamite. 

Typically, the GKT is administered with a traditional polygraph.  More recently, EEG and fMRI have been touted for this purpose, but in this case, the neural signature is just another physiological response.  People may be more inclined to believe a "neural signature", but it should be clear that the EEG or fMRI is nothing more than a hopped-up polygraph.

So how do law-enforcement personnel determine who is lying to them? To some extent, they rely on nonverbal cues, such as facial expressions and posture, a la Ekman.  But new work focuses on what people actually say, rather than how they say it (see "Judging Honesty by Words, not Fidgets" by Benedict Carey, New York Times, 05/12/2009).  Problems with false confessions have led police to focus their interrogations on gathering information about the crime and a suspect, instead of forcing a confession.  

But even in the determination of honesty and dishonesty, there are linguistic as well as paralinguistic cues that can make a difference.

  • Liars tend to prepare a script ahead of time that is tight, but lacking in detail -- and then they stick to it.
  • Truth-tellers tend to recall lots of incidental, extraneous details, they may change their stories over time, and they may make mistakes.  When recounting the event repeatedly, truth-tellers may add up to 20-30% more details over trials -- again, these are often extraneous details.

But even these kinds of clues are far from infallible.  In one experiment, the rate of lie detection was only about 70%.

In the absence of disciplined perceptual learning, however, most of us are pretty bad at detecting deception -- we perceive people as lying who are telling the truth, and we perceive people as telling the truth who are in fact lying. 

"Gaydar" as Person Perception

Person perception is the perception of a person's internal mental states of knowledge and belief, feeling and desire.  In addition to making judgments of competence, neuroticism, extraversion, and the like, we also make judgments of other people's sexuality -- both sexual orientation in general, and -- if we're interested -- sexual interest in us.  The problem of judging sexual orientation is known colloquially as gaydar -- the idea that people, especially gay people, can intuitively tell whether another person is gay or not. 

Beginning with a set of studies by Rule and Ambady (2008; Rule et al., 2009), a number of studies have demonstrated that people can identify, at better than chance levels, a target's sexual orientation based on visual, auditory, and even olfactory (don't ask) cues, even when the stimulus is severely degraded (e.g., exposures of only 50 milliseconds in duration). 

A representative study is one by Lyons et al. (2014a), in which women, self-identified as straight or lesbian, viewed head-shots of men and women who were self-identified as gay or straight on social media.  The study was conducted over the internet, and the subjects were simply asked to classify each target as homosexual or heterosexual.  Women were pretty good at this, averaging about 61% hits (i.e., classifying as gay people who really were gay, and straights as straight), and about 27% false alarms, for both male and female targets.  Both values differ significantly from the chance level of 50%.  Applying signal-detection theory yields substantial values for the d' measure of accuracy; it also revealed a bias toward classifying women as gay, especially by perceivers who themselves were lesbians.
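To make the signal-detection step concrete, here is a minimal sketch of how d' is computed from a hit rate and a false-alarm rate.  It assumes the standard equal-variance formula, d' = z(hits) - z(false alarms), and it plugs in the pooled averages quoted above (61% and 27%); the function name is hypothetical, and the published d' values were presumably computed separately for each perceiver, so this illustrates the logic rather than reproducing the analysis.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance signal-detection sensitivity: z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Pooled averages quoted in the text (an approximation, not per-subject data).
sensitivity = d_prime(hit_rate=0.61, false_alarm_rate=0.27)
print(f"d' = {sensitivity:.2f}")
```

With these inputs d' comes out at roughly 0.9.  A perceiver who was simply guessing would have equal hit and false-alarm rates, and therefore a d' of zero.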

Earlier research by Joshua Tabak and Vivian Zayas (2012) employed more degraded stimulus materials.  They presented (mostly female) judges with very brief (50 msec) flashes of faces of male and female targets who were self-identified as heterosexual or homosexual, and found that subjects were accurate in judging the targets' sexual orientation about 60% of the time -- as compared to the 50% accuracy that would be expected just by chance.  Women were more accurate than men, and judgments of women's faces were more accurate (64%) than those of men's faces (57%).  Although researchers have not (yet) uncovered the specific cues that perceivers use in this task, Tabak and Zayas found that judgments were more accurate when the faces were presented right-side up, as opposed to upside-down.  Presenting faces upside-down disrupts facial recognition -- what is known as the face inversion effect (Valentine, 1988; Farah et al., 1995).  The face inversion effect, in turn, is commonly attributed to configural processing -- that is, people recognize faces not just by recognizing someone's nose, or eyes, or mouth as individual features, but rather by recognizing the length of the nose relative to the distance between the eyes -- you get the drift (Maurer et al., 2002).  Anyway, the superiority of right-side-up presentation indicates that it was a configuration of cues, rather than individual features, that was the relevant cue.  Research by Nicholas Rule suggests that the mouth may be an important cue.  Perhaps, Tabak and Zayas speculate, perceivers judged "effeminate" male faces and "masculine" female faces (as indicated, for example, by the ratio of width to height) as more likely to belong to homosexuals.  But they didn't actually test this.

Studies of "gaydar" were taken to a new level by a study reported by Yilun Wang and Michal Kosinski (JPSP, in press 2017) that garnered considerable media attention, drawing articles in The Economist, the New Yorker, and the New York Times.  Wang and Kosinski employed 35,000 images of the faces of white men and women who had reported their sexual orientation on online dating sites (there weren't enough minority gays to permit analysis).  When they presented these images to a group of human judges (recruited through Mechanical Turk), the humans' judgments of sexual orientation were correct approximately 61% of the time for male faces and about 54% of the time for female faces -- barely better than chance, and in line with the findings of Tabak & Zayas (2012).  However, when W&K submitted the same faces to an off-the-shelf pattern-recognition program, the machine's judgments were much better: 81% correct for male faces and 71% correct for females.  If the program was given five different faces for each target, overall accuracy increased to 91%.  Apparently, two factors contribute to the increased accuracy of the machine: (1) by virtue of being a computer processing a huge database, it was able to process much more cue information than would be possible for a human perceiver; (2) it employed available cue information more reliably in making its judgments.  At the same time, W&K make clear that the machine was, essentially, doing what the human judges were doing: assigning stereotypically "feminine" male faces and stereotypically "masculine" female faces to the "gay" category.  Emphasis on stereotyping.  The machine is not even, necessarily, a good model of human "gaydar", because it's likely that people rely on other aspects of appearance and behavior to make these judgments -- a man who has an inordinate interest in musical theater, perhaps, or a woman who's really into carpentry.  Of course, these too are stereotypes.  It's stereotypes all the way down.  
And not necessarily accurate stereotypes, either.

Yes, stereotypes can be accurate, in the sense that they can accurately capture what a group is like on average, even if they're not accurate with respect to all the individual group members.  In fact, Lee Jussim (Behavioral & Brain Sciences, 2017) has argued that even racial and gender stereotypes are more accurate than usually believed. I think his evidence is actually pretty weak, but he's right in principle that stereotypes are not necessarily inaccurate representations of groups.

Moreover, even accuracy of 74-91% shouldn't be overestimated, because of the low base rate of homosexuals in the population.  Consider this example taken from an article about the W&K study which discusses other controversies surrounding this study ("Why Stanford Researchers Tried to Create a 'Gaydar' Machine" by Heather Murphy, New York Times, 10/10/2017).  Assume, for purposes of argument, that 5% of the population is gay.  A facial-recognition algorithm that is 91% accurate would mistakenly classify 9% of straight people as gay, and 9% of gay people as straight.  In a sample of 1000 individuals, that would mean that 4 or 5 of the 50 homosexuals ((1000 x .05) x .09) would be mistakenly classified as straight, while as many as 85 of the 950 heterosexuals ((1000 x .95) x .09) would be mistakenly classified as gay. The problem is not so much with the algorithm as with the base rates: with a low-baserate event, like homosexuality, there are going to be a lot of mistaken classifications.
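The arithmetic in this example can be checked with a short sketch.  It assumes, as the example does, that the same 91% accuracy applies to gay and straight targets alike; the function and dictionary keys are hypothetical names chosen for illustration.

```python
def classification_counts(population: int, base_rate: float, accuracy: float) -> dict:
    """Expected counts when gay and straight targets are both classified with `accuracy`."""
    gay = population * base_rate
    straight = population - gay
    hits = gay * accuracy                      # gay targets labeled gay
    misses = gay - hits                        # gay targets labeled straight
    false_alarms = straight * (1 - accuracy)   # straight targets labeled gay
    return {
        "misses": misses,
        "false_alarms": false_alarms,
        # Of everyone the algorithm labels "gay", what share really is?
        "share_of_gay_labels_correct": hits / (hits + false_alarms),
    }


counts = classification_counts(population=1000, base_rate=0.05, accuracy=0.91)
print(f"gay targets labeled straight: {counts['misses']:.1f}")
print(f"straight targets labeled gay: {counts['false_alarms']:.1f}")
print(f"'gay' labels that are correct: {counts['share_of_gay_labels_correct']:.0%}")
```

Under these assumptions the sketch gives about 4.5 misses and 85.5 false alarms, so only about 35% of the people labeled "gay" actually are, even though the algorithm is "91% accurate".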

The W&K study was parodied in the New Yorker in "Modern Science", by Paul Rudnick, the American playwright and humorist (12/04/2017).  Excerpts follow:
On several occasions, when a photo of an especially attractive subject was scanned, the hardware would disappear from the lab for many hours and then return with a sheen of perspiration and the categorization "YES....

The presence of a single arched eyebrow and a slight contraction of the lips cannot be used as evidence of male homosexuality, except when the subject is examining furniture from West Elm....

The algorithm was able to ascertain sexual preference with 98% accuracy when using only photos of the subjects' shoes....

Photos of male, female, and nonbinary subjects currently attending progressive liberal-arts colleges refused to be categorized as "gay" or "straight," and made disgusted noises.

Facial Recognition and Artificial Intelligence

Ekman's work, and research like W&K's study of "gaydar", signaled a trend toward the use of artificial intelligence and machine learning to create algorithms for facial recognition.  In an important Op-Ed article in the New York Times, Sahil Chinoy (a UCB graduate in physics and economics who worked at the Times before going on to graduate school in economics at Harvard) discusses some of the problems with the practice ("The Racist History Behind Facial Recognition", 07/14/2019). See also "Spying on Your Emotions" by John McQuaid, Scientific American 12/2021.

One of these problems, at least from the point of view of social policy, is the "perpetual lineup" problem: if a photograph (from, say, closed-circuit TV) can be matched against millions of photographs from a database of driver's licenses, then, in a sense, we're always under surveillance.  We are very quickly headed toward a surveillance society in which we're always being watched, identified, and tracked whenever we're outside the privacy of our own homes.  And even in our homes, we're already part of a surveillance economy  in which our every Google search and Facebook like will result in an advertisement appearing on our computer screens.

Another problem, from the point of view of psychological research and theory, is that the idea of identifying people's internal mental (especially emotional) states from their facial expressions, bodily postures, gestures, and the like may be simply wrongheaded.  Ekman's work, and other work like his, has been severely criticized by Lisa Feldman Barrett and other researchers who point out that the correlations between facial expressions and emotion are far from perfect.  Chinoy cites a report from the AI Now Institute arguing that, based on the current state of both scientific knowledge and computer technology, widely available AI systems for identifying race, sexuality, emotions, and personality traits are "being applied in unethical and irresponsible ways".

In his article, Chinoy traces the current enthusiasm for facial recognition technology back to its roots in the 19th-century pseudosciences of phrenology and physiognomy.  In phrenology, people's traits and states are identified by virtue of bumps and depressions in their skulls which ostensibly correspond to high or low levels of benevolence or conscientiousness.  In physiognomy, traits are thought to correspond to people's physical appearance -- a person who looks like a fox, for example, was thought to be sly.  We laugh at such notions, perhaps, and recognize that they're based on the crudest form of stereotyping.  But these ideas have staying power.  Sir Francis Galton, who almost single-handedly invented psychometrics in the late 19th century, superimposed pictures of convicts one on the other, hoping that the average would reveal "the essence of the criminal face".  And Cesare Lombroso, a 19th-century proponent of physiognomy, argued that intellectual inferiority could be determined from face and body measurements.  More recently (2016, to be exact), a group of Chinese researchers employed essentially the same method in an attempt to reveal the "average face" corresponding to criminality. 


Prior Probabilities and Baserate Neglect

The ability of people to detect other people's sexual orientation, even with degraded exposure, is impressive.  Still, as with Ekman's famous studies of lie-detection, it should not be exaggerated, because there is a subtle procedural feature that magnifies the subjects' accuracy levels.  Not to pick on it, because it's a perfectly good study as far as it goes, but let's take the Lyons study as an example.  As in most other signal-detection studies, the "signal" (i.e., a gay target) was "on" for half the trials -- that's just how these studies are done.  And when half the targets were gay, the subjects were pretty good -- though far from perfect -- at "detecting" their sexuality.  But the problem is that, in the real world outside the laboratory, half the targets aren't gay.  A reasonable estimate of the proportion of gays in the population is closer to 5%, and that changes everything.

Bayes' Theorem

The reason it changes everything has to do with Bayes' Theorem, first proposed by Thomas Bayes, an English clergyman who also dabbled in statistics, in the 18th century (the origin myth is that he was trying to formulate a statistical proof of the existence of God). The problem in Bayes' Theorem is to determine the likelihood that some proposition (A) is true, given some observation or evidence (B).  Bayes argued that in calculating this probability, you have to take account of the base-rates: (1) first, the probability that A is true, regardless of B; (2) and second, the probability that B is true, regardless of A. 

Here's a restatement and expansion of Bayes' Theorem, so you can see how the calculations work out in what follows.  For a nice introduction to Bayes' Theorem, see the fabulous book by Reid Hastie and Robyn Dawes, Rational Choice in an Uncertain World (2001; 2nd Ed., 2010).
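For reference, the theorem in its standard textbook form can be written as follows.  This is the usual statement, offered as a sketch rather than as a copy of the original illustration; the symbol for "not A" is a notational assumption.

```latex
\[
p(A \mid B) \;=\; \frac{p(B \mid A)\, p(A)}{p(B)}
\;=\; \frac{p(B \mid A)\, p(A)}
           {p(B \mid A)\, p(A) \;+\; p(B \mid \neg A)\, p(\neg A)}
\]
```

The first form shows the two base rates, p(A) and p(B), that enter the calculation.  The second form expands p(B) into the cases where A is true and where A is false, which is the version needed when all we know are a hit rate, a false-alarm rate, and the base rate of A.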


Applying Bayes' Theorem to Gaydar

In the current context, the accuracy of gaydar can be reformulated as follows:
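One way to write this out is sketched below.  The symbols are assumptions introduced here for illustration: G means the target really is gay, S means the target really is straight, and "g" means the perceiver judges the target to be gay.

```latex
\[
p(G \mid \text{``g''}) \;=\;
\frac{p(\text{``g''} \mid G)\, p(G)}
     {p(\text{``g''} \mid G)\, p(G) \;+\; p(\text{``g''} \mid S)\, p(S)}
\]
```

Here p("g" | G) is the hit rate, p("g" | S) is the false-alarm rate, and p(G) is the base rate of gay people in the population, with p(S) = 1 - p(G).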

In a friendly critique of the Lyons study, Ploderl (2014) applied Bayes' theorem, which takes account of base rates, to the calculation of detection accuracy.  Given a base rate of 5%, a hit rate of 70% and a false-alarm rate of 20% (both figures are reasonably close to what Lyons found) would yield "gaydar" accuracy of only 15%.  Even a more liberal base-rate estimate of 10% increases gaydar accuracy only to about 22%.  That's still not bad: as Dr. Johnson once said about a dog who could walk on its hind legs, "It is not done well; but you are surprised to find it done at all".  Still, accuracy of 15-22% is a lot lower than 70%.
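A minimal sketch of this calculation follows, using the expanded form of Bayes' Theorem with a hit rate, a false-alarm rate, and a base rate.  The function name is hypothetical, and the inputs are the rounded figures quoted above rather than Ploderl's exact values.

```python
def posterior(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
    """p(gay | judged gay) by Bayes' Theorem."""
    true_positives = hit_rate * base_rate
    false_positives = false_alarm_rate * (1 - base_rate)
    return true_positives / (true_positives + false_positives)


# Rounded rates quoted in the text: 70% hits, 20% false alarms.
for base_rate in (0.05, 0.10, 0.50):
    accuracy = posterior(base_rate, hit_rate=0.70, false_alarm_rate=0.20)
    print(f"base rate {base_rate:.0%}: p(gay | judged gay) = {accuracy:.1%}")
```

With a 5% base rate this gives about 15.6%, in line with the figure above, and with the laboratory's 50% base rate it gives about 78%.  With these rounded rates a 10% base rate gives about 28% rather than 22%, so the 22% figure presumably rests on somewhat different inputs; that is an inference from the arithmetic, not something stated in the source.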

As Lyons et al. (2014b) pointed out in reply, Ploderl's analysis undercuts the ecological validity of their findings to some extent, but the point of the study was simply to demonstrate that gaydar can be accurate at all.  This finding then will motivate future laboratory research intended to identify the valid (and invalid) cues to sexual orientation.  For that purpose, the problem of baserate neglect (so named by Kahneman & Tversky, 1974) isn't really a problem.  These studies did not identify the precise visual cues that the subjects employed to make their judgments, though other research has shown that judgments of sexual orientation are based largely on stereotypes: men who have "feminine" features, and women who have "masculine" features, are more likely to be classified as gay.

Applying Bayes' Theorem to Lie-Detection

A similar problem crops up in the Ekman studies. 

  • As already noted, his targets were already an unrepresentative sample of liars who were known to "leak" facial cues to deception. 
    • Only about 1/3 of his liars were "leakers", meaning that most liars don't leak. 
    • And meaning that the facial cues leaked by these leaky liars lose validity when applied to the population of liars as a whole. 
  • But again, in the Ekman study, half of the targets were liars and half were not, which -- as with the homosexuals in the Lyons study -- probably overestimates the proportion of liars in the population.  Perhaps not, of course, depending on the population! 
    • But consider the problem of an airport TSA screener, who is asking each passenger, essentially, "Are you going to blow up this plane?".  The baserate is going to be low, certainly lower than 50%.  If the baserate is 10% -- on September 11, 2001, 4 of the 37 passengers on Flight 93 were hijackers -- then the true level of accuracy is going to go down from the best accuracy rate of 70% or so obtained by Ekman et al. (1999) from their sample of sheriffs and federal officers.
    • Even more so when you consider that, in 2001, Newark Airport handled more than 30 million passengers, or about 2,500,000 passengers per month, or about 83,000 per day -- which means that the ratio of lying hijackers on September 11 was about 1:21,000!
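To see how much the base rate matters here, the following sketch uses the odds form of Bayes' Theorem: posterior odds equal prior odds times the likelihood ratio.  It assumes that "70% accuracy" means a 70% hit rate and a 30% false-alarm rate, which the source does not state; the function name and constants are hypothetical.

```python
def posterior_from_odds(prior_odds: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Posterior probability of lying, given a 'liar' judgment, via the odds form of Bayes."""
    likelihood_ratio = hit_rate / false_alarm_rate
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


HIT_RATE = 0.70          # assumption: 70% of liars are judged to be lying
FALSE_ALARM_RATE = 0.30  # assumption: 30% of truth-tellers are judged to be lying

# 4 hijackers among 37 passengers; 4 hijackers among about 83,000 daily passengers.
on_the_plane = posterior_from_odds(4 / (37 - 4), HIT_RATE, FALSE_ALARM_RATE)
at_the_airport = posterior_from_odds(4 / (83_000 - 4), HIT_RATE, FALSE_ALARM_RATE)

print(f"on the plane:   {on_the_plane:.1%}")
print(f"at the airport: {at_the_airport:.4%}")
```

Under these assumptions, a passenger flagged as a liar on the plane itself would actually be a hijacker only about 22% of the time, and a passenger flagged at the airport would be one only about 0.011% of the time, or roughly 1 in 8,900.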


So now let's return to the question of lie detection, and apply Bayes' Theorem to that situation.  To begin with, here's an example from the Conceptual Tools website developed by Neil Cotter, a professor of electrical engineering at the University of Utah.  In his discussion of Bayes' Theorem, he considers a polygraph lie detector that detects lies with about 90% accuracy.  That is, the probability that the lie detector says "You lied" when you really did lie is .89, and the probability that the machine says "You told the truth" when you really did tell the truth is .90.



DL = Detector says "You Lied"
DT = Detector says "You told the Truth"
L = You actually Lied
T = You actually told the Truth

p(DL | L) = .89
p(DT | L) = .11
p(DL | T) = .10
p(DT | T) = .90