Language and Communication

Once we have engaged in some cognitive activity -- perceived an object or remembered an event, induced a concept, deduced category membership, rendered a judgment, made an inference, solved a problem -- we may use our capacity for language to communicate what we have done to someone else. But language is not just a tool for communication. It is also a tool for thought.

Communication Between Nonhuman Animals

Many animal species have the ability to communicate with other members of the same species.

The pheromones discussed in the lectures on Sensation also count as vehicles for communication between nonhuman animals -- and, perhaps, between humans as well. Pheromones appear to communicate aggression, sexual interest, and the like. Of course the odor "landscape" for humans, especially, is complicated by the existence of "exogenous" odors, such as cigarette smoke and other air pollutants, soap, deodorants, and perfumes, which can obscure the messages from relatively weak "endogenous" odors.

But nonverbal communication goes beyond pheromones, and includes a wide variety of "instinctual" behaviors discussed in the lectures on Learning.

Animals may exchange signals during the mating process -- as in the male stickleback's zig-zag dance, or the female stickleback's head-up receptive posture.
Or they may communicate threats to intruders on their territory -- as in the head-down threat posture of the male stickleback.
There are displays of submission (wolves, when fighting, will expose their throats), and alarm at the sight of a predator (distress calls in birds subject to predation).
Animals display signals of their intention: birds will raise and lower their tails before takeoff, but this activity serves no aerodynamic purpose.
There are signals of conflict: the phoebe makes a particular call just before a change in movement.
And signals of the location of food: bees perform a "waggle dance" that indicates the position of food relative to the sun and their hive.

The Waggle-Dance of the Honeybee

The waggle dance has been described as the most abstract and complex form of nonhuman communication. It is performed by worker honeybees (sterile females) who have been foraging for food. After returning to the hive, the foragers waggle their bodies as they move along the surface of the honeycomb; then they circle to the right, returning to their starting point, and start the waggle all over again; then they circle to the left, return to their starting point, and the process continues, so that the pattern forms a "Figure 8".

The angle of the waggle, from vertical, is equal to the angle between the lines connecting the hive to the sun and the hive to the food source. If the food source is at a 45° angle from the sun, then the waggle line will be 45° away from vertical.
The distance of the waggle -- essentially, the number of waggles performed by the bee -- indicates the distance of the food source from the hive. The more waggles, the farther away the food.
Different subspecies have different "dialects": for German honeybees, each waggle corresponds to about 50 meters of distance; for Italian honeybees, each waggle corresponds to only about 20 meters. But unlike the dialects of human language, these dialects are innate, carried on the genes. German honeybee larvae, raised in Italian honeybee nests, still dance "German" -- causing all sorts of confusion when they communicate to their Italian hive-mates!
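
The decoding rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `decode_waggle` and the dialect table are hypothetical, and the meters-per-waggle values are the rough figures quoted above, not precise measurements.

```python
# Approximate meters signaled by one waggle, per the rough figures in the text.
METERS_PER_WAGGLE = {"german": 50, "italian": 20}


def decode_waggle(angle_from_vertical_deg, n_waggles, dialect):
    """Return (angle between sun and food in degrees, distance in meters).

    Hypothetical helper: the dance angle from vertical is read directly as
    the angle between the sun and the food source, as seen from the hive.
    """
    bearing_from_sun = angle_from_vertical_deg % 360
    distance_m = n_waggles * METERS_PER_WAGGLE[dialect]
    return bearing_from_sun, distance_m


# The same dance, read by listeners of two different "dialects".
german_reading = decode_waggle(45, 10, "german")
italian_reading = decode_waggle(45, 10, "italian")
print(german_reading, italian_reading)
```

Under these assumptions, a ten-waggle dance means very different distances to the two kinds of listeners, which is one way to picture the confusion that arises when a German-reared bee dances in an Italian hive.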

For his discovery of the waggle-dance of the honeybee, Karl von Frisch shared the Nobel Prize in Physiology or Medicine with Niko Tinbergen and Konrad Lorenz.

The waggle dance is extensively discussed by James Gould & Carol Grant Gould in The Honeybee (1988) and The Animal Mind (1999).

These are all social displays: instinct-like innate expressive movements, that do not have to be learned, and that are performed in narrowly defined situations. Other species have more complicated communication systems, that seem to be acquired through learning.

The classic case here is bird song. Male birds sing, but females do not. Each species has a characteristic song, which is used to identify potential mates. Within each species, there are also territorial "dialects", variations on the basic song that seem to help preserve the gene pool in a particular area.

Studies of sparrows indicate that bird songs seem to be learned through exposure to other males, but this is not the usual form of learning. In the natural setting, all young males acquire the dialect sung by their parents. In the laboratory, where the young from one territory are exposed to the dialect from another territory, they will learn whatever dialect they are exposed to. However, sparrows cannot learn the songs of other species. There appears to be a critical period for song-learning, similar to that seen in imprinting in ducks and geese (remember?), between about 1 and 7 weeks of age. If a young sparrow is exposed to its song before it is seven days old, or after it is 50 days old, it will sing only a crude approximation of its song. Interestingly, practice is not important. After exposure, the young male will remain silent for several months. Then, it will sing its song perfectly, the very first time.

There are analogous effects in the female sparrow. As noted, females don't sing; but they do respond to the song of their parents. Exposure of young females under the same conditions as described above yields effects on response to songs analogous to those seen in males. Moreover, females show the effects of administration of male hormones. If given a dose of testosterone, the female will sing the song to which she was exposed when young.

Apparently, the bird inherits, as part of its innate biological endowment, a crude template of its species' song. This template is then refined through experience -- through exposure to whatever dialect of the song is sung by its parents. By virtue of this exposure the bird then produces a detailed version of its own song. This is an instance of learning: a relatively permanent change in behavior that occurs as a result of experience. But notice that this learning does not occur through reinforcement: there is no practice, and no regime of rewards and punishments.

Bird-song learning is innate, but that doesn't mean that the songs themselves are invariant. A study led by Elizabeth Derryberry at the University of Tennessee, compared the songs of white-crowned sparrows before and after the "lockdown" imposed in California in 2020 as a response to the Covid-19 pandemic. One of the results of the lockdown was a significant diminution in ambient noise in the environment -- not least because there were fewer cars on the road, fewer planes flying overhead, fewer people talking and fewer dogs barking. For a little while, urban environments sounded more like rural environments (some of the data for this study was collected along the Bay Trail in Richmond, a city just north of UCB). As a result, Derryberry and her colleagues were able to show that the birds sang at lower volume levels -- apparently because they no longer had to compete with the noise around them. Derryberry et al. conclude that "These findings illustrate that behavioral traits can change rapidly in response to newly favorable conditions, indicating an inherent resilience to long-standing anthropogenic pressures such as noise pollution". True, but we shouldn't push it: remember those baby sea turtles threatened by evolutionary traps caused by man-made lighting, as discussed in the lectures on Learning.

Whale song has some of the same qualities as bird song, but we know very little about the conditions under which it is learned and performed.

Linguistic Communication Between Humans

Bird song shows some parallels to human speech. Every normal child learns to speak his or her native language. This learning occurs naturally, without reinforcement, even if the child is raised in an extremely impoverished linguistic environment. There is also a critical period for development.

If children are linguistically isolated until puberty, they show very poor language skills thereafter.
This is also true for deaf children. If they are exposed to sign language from an early age, they will learn to use it fluently, just as normal-hearing children learn a spoken language.

But if exposure is delayed, they will have difficulty becoming fluent signers.
Deaf children of deaf parents (who can sign) show no delays in language acquisition.
But deaf children of hearing parents, who ordinarily do not use sign language, may be delayed or impaired if they are not exposed to sign language elsewhere, such as preschool.
Children who receive cochlear implants also show no problems in oral language acquisition.

There is some controversy within the Deaf community (who use an upper-case "D" to signify that they consider themselves part of a particular culture) about cochlear implants. Some object to implants because they suggest that deafness is a handicap, instead of a "difference", and because they signify accommodation to the "hearing" culture. Others believe that implants give deaf people the maximum flexibility in their interactions with other people, deaf and hearing.
One approach that is definitely sub-optimal is a reliance on lip-reading and speech alone. Both these skills are extremely difficult for deaf people to acquire, and the time spent in the learning process effectively hampers the learning of other, more optimal skills, resulting in relatively poor linguistic ability.

That doesn't mean that deaf people shouldn't be taught to lip-read and speak. Deaf people should be able to take advantage of all available communication resources. But it does mean that exclusive reliance on lip-reading and speech is not in the best interests of deaf people.
For more on this controversy, see:

When the Mind Hears: A History of the Deaf by Harlan Lane (1984).
Deaf in America: Voices from a Culture by Carol Padden and Tom Humphries (1988).
I Can Hear You Whisper: An Intimate Journey Through the Science of Sound and Language by Lydia Denworth (2014).

If children learn a second language after puberty, that language is spoken with an accent.
Recovery from aphasia is more likely in children than in adults. But in the final analysis, human language is something else.

How long does the critical period last? We don't know for sure, not least because almost every normal child is exposed to language from birth. But some evidence comes from a new internet-based study of more than 600,000 demographically diverse native and non-native speakers of English (Hartshorne et al., Cognition, 2018). Some of the non-native speakers had been "immersion" learners, in that they learned English at roughly the same time as they learned their native language, or learned English later in a largely English-speaking environment; others were "non-immersion" learners, who learned English after acquiring their native language, and in a non-English-speaking environment. The subjects were asked to make a graded series of grammaticality judgments. Not surprisingly, immersion learners performed better than non-immersion learners, and those who began learning English early performed better than those who began later. But while previous theorists assumed that the critical period for acquiring fluency ended at the beginning of puberty, Hartshorne et al. found that the drop-off occurred after age 17. Individuals who began learning English before age 10 became more competent than those who began after age 14, so the amount of exposure during the critical period is also important. But beginning at age 5 didn't improve fluency over beginning at age 10. Bottom line: if you want to become fluent in a second language, start learning it before age 10. And prepare to spend about 30 years before you're really speaking like a native.

People can learn a second language, and, especially if they start early enough, they can learn it pretty well. And the same goes for a third, as many Europeans will attest. But how many languages can one individual master? If the critical period lasts as long as 17 years, there's more opportunity than if it lasts only until age 12, but still, you've got to practice to really master a language, and there are only so many hours in a day. Still, people who speak multiple languages are known as polyglots, and people who can speak more than 11 languages are known as hyperpolyglots. They don't just have a couple of words or phrases at their command; they are able to converse at length with native speakers. Their linguistic abilities are so extreme that there is speculation that there might be a biological difference -- something in the brain -- between them and mere mortals, but no such difference has been found so far. On the other hand, there's practice. Alexander Arguelles, a well-known hyperpolyglot, reckons that he spends up to 40% of his waking hours on language study and practice. For more on hyperpolyglotism, including Arguelles, see "Maltese for Beginners" by Judith Thurman (New Yorker, 09/03/2018).

The Reading Wars

Learning oral language, permitting speaking and listening, occurs effortlessly in all normal humans. Learning a written language, permitting reading and writing, is more problematic. Apparently, the brain evolved for speaking but not for writing -- not surprisingly, given that written language was invented only about 5000 years ago -- not long enough for the brain to have evolved a module for written language. Spoken language is acquired simply by being exposed to it; written language has to be deliberately and effortfully taught and learned.

But how to teach reading -- ah, that's a matter of some controversy -- so controversial, in fact, that the debate has been called "The Reading Wars".

Children in my generation -- I was born in the 1950s and educated in the public schools of New York State -- were taught phonics, basically drilling students endlessly in phonology (the sounds associated with letters, syllables, and words), enabling us to "sound out" new words when we encountered them. It worked out well for most of us, but not so well for a relatively small group of children who suffered from dyslexia (and the adults who still do).
Beginning in the 1920s, but gaining popularity beginning in the 1960s, was an alternative instructional method, known as the whole word or whole language method, in which children are taught the pronunciation of whole words, because some children have a great deal of difficulty with phonics. Whole-language instruction was generally favored in "progressive", mostly private, schools, but its stock rose because of the success of this method with children suffering from dyslexia (dyslexics also find it fairly easy to read Chinese, where words are represented by ideographs rather than strings of letters). It was when these methods began to be imported into the public-school curriculum that The Reading Wars took off (something similar happened with The New Math, but that would be a distraction from the current topic).

Whole-language instruction became very popular, not least because it is fun for both students and teachers. It's epitomized by the "Dick and Jane" readers that were so popular in schools. The only problem is that, for most students, the whole-word method is not the best way to teach reading -- a point made vigorously by Rudolf Flesch in his 1955 book, Why Johnny Can't Read. Students taught by the whole-word method can read simple, familiar texts, but they have difficulty decoding unfamiliar, more complex material.

Let me be clear: the whole-word method isn't wholly, or even largely, responsible for the poor reading skills of the typical American elementary school pupil. Poverty and poor funding are much more important culprits. But all the scientific evidence points to some variant of phonics as generally the best method of teaching reading -- not least because it makes whole-word instruction even easier (for the record, I went through the whole "Dick and Jane" series -- but not before I was drilled in phonics!).

For coverage of this research, see Language at the Speed of Sight: How We Read, Why So Many Can't, and What Can Be Done About It (2017) by Mark Seidenberg, a distinguished psycholinguist and cognitive neuroscientist at the University of Wisconsin.

Also "Ending the Reading Wars: Reading Acquisition from Novice to Expert" by Anne Castles, Kathleen Rastle, and Kate Nation, a trio of Australian and English researchers, which appeared as a special issue of Psychological Science in the Public Interest (Vol. 19, No. 1, June 2018). The authors make a strong empirical case for phonics over whole-language instruction, although they note that other kinds of teaching, including whole-language, enhance readers' ability to comprehend what they read. Their review is accompanied by "What Research Tells Us About Reading Instruction", a commentary by Rebecca Treiman, a distinguished American psycholinguist and one of the world's foremost experts on reading. Treiman reminds us that reading is actually a very difficult cognitive skill: the brain is built for oral language, but not for written language. While joining Castles et al. in endorsing phonics, Treiman also discusses ways in which phonics instruction can be improved.

Language is qualitatively different from even the most complex communication system of nonhuman species. Human language is differentiated from other communication systems by five properties.

Meaning: Language expresses ideas. This is sometimes called semanticity, referring to abstract, symbolic representations -- words -- that mean the same thing to all speakers of a language.

Reference: Language refers to the real or imagined world of objects and events. Related to this is the property of displacement, meaning that language can refer to events in the past or the future as well as in the immediate present.

Interpersonal: Language is used not just as a tool for thought, but also to communicate one's ideas to someone else -- to change their opinions, or to accomplish one's goals.

Structure: Linguistic utterances are organized according to the rules of grammar. Prescriptive grammar is a set of authoritative recipes: how language ought to be. Descriptive grammar is a set of descriptions of natural speech: how language is used. The rules of grammar are finite in number. But they permit the generation of an infinite number of different, meaningful sentences.

Creativity: People can generate and understand novel utterances, that have never been heard or spoken by them (or anyone else) ever before. This is also known as productivity. For example, it's been estimated that there are 1 nonillion (or 1 trillion quintillion) meaningful 20-word sentences in English.

That's 1,000,000,000,000,000,000,000,000,000,000, or 10³⁰ sentences.

Each of us can understand each of these effortlessly (provided we know what the words mean). But there are only 3 billion seconds in a century. Therefore, there is simply not enough time for us to have learned the meaning of each of these sentences by exposure, even at the rate of 1/sec for an entire lifetime.
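
The arithmetic behind this argument can be checked directly. The following is a minimal sketch that takes the 10³⁰ estimate and the one-sentence-per-second rate from the text at face value; the variable names are chosen here purely for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
seconds_per_century = 100 * SECONDS_PER_YEAR  # roughly 3 billion

n_sentences = 10 ** 30   # estimated meaningful 20-word English sentences
rate_per_second = 1      # one sentence learned every second, without pause

# How many centuries of nonstop exposure would it take to hear them all?
centuries_needed = n_sentences / (rate_per_second * seconds_per_century)

print(f"seconds per century: {seconds_per_century:.2e}")
print(f"centuries needed:    {centuries_needed:.2e}")
```

Even at this impossible rate, the number of centuries required runs to hundreds of billions of billions, which is why exposure alone cannot explain our ability to understand novel sentences.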

These properties combine to make human language unique in nature. Social displays are stereotyped and obligatory; speech is creative and optional. Birds are known to improvise on their songs, but the basic song remains unchanged.

Some primates can be trained to perform some linguistic tasks. The chimpanzees Washoe (who was trained first by Allen and Beatrix Gardner at the University of Nevada, Reno -- located in Washoe County, Nevada -- and later by Deborah and Roger Fouts at Central Washington University, and who died in 2007 at the age of 42) and Nim Chimpsky (a pun by his trainer, Herbert Terrace of Columbia University, on the name of Noam Chomsky, the godfather of modern linguistics), and the gorillas Koko and Michael (trained by Penny Patterson, a former graduate student at Stanford), have learned some aspects of American Sign Language (chimpanzees lack the proper vocal apparatus for anything resembling human speech); and the chimpanzee Sarah (trained by David Premack, at the University of Pennsylvania) has learned to string tokens together in order to form elementary "sentences". Kanzi, a bonobo studied by Sue Savage-Rumbaugh (originally at Georgia State University and later at the Iowa Primate Learning Sanctuary), learned to communicate with humans using a keyboard of 300 "lexigrams" representing various concepts in "Yerkish", an artificial symbol system named after the Yerkes Primate Research Center, where he was born (he also learned to comprehend some spoken English).

And not just primates. Alex, an African grey parrot trained by Irene Pepperberg, acquired over 100 words in the course of his 30-year lifespan. Like a preschooler, he learned his colors and basic shapes, to count a little, and to categorize objects. Alex, whose name was derived from Pepperberg's "Avian Learning Experiment", actually got an obituary in the New York Times ("Alex, a Parrot Who Had a Way with Words, Dies" by Benedict Carey, 09/10/2007, from which the picture is drawn; see also "The Communicators" by Charles Siebert, an appreciation of Alex and Washoe, who died the same year as Alex, New York Times Magazine, 12/30/2007).

However, the training of these animals is very arduous, requiring lots of time, effort and reinforcement. And there are big individual differences among chimps and gorillas in their ability to learn language. No primate, even under ideal conditions, has shown the language ability that a normal two-year-old human child acquires effortlessly under the most impoverished conditions. The general consensus is that chimpanzees can acquire some aspects of language semantics, learning the meanings of symbolic "words", but have little or no capacity for linguistic syntax, or the ability to string words together into meaningful utterances. This suggests that human language is indeed a unique ability in nature.

Nim Chimpsky was raised like a human infant in an apartment near Columbia University, and when Terrace's program of research ended, Nim retired to a primate field station at the University of Oklahoma. He is the subject of a biography by Elizabeth Hess, Nim Chimpsky, the Chimp Who Would Be Human (2008), and a documentary film, Project Nim, directed by James Marsh (2011), based on Hess's book. For Terrace's own take on Nim, see his book, Why Chimpanzees Can't Learn Language and Only Humans Can (2019; see review by David P. Barash in "Aping Our Behavior", Wall Street Journal, 12/14/2019). Terrace argues that while chimps and other animals can use "words" (and other symbols) imperatively (so, for example, the word food means something like Give me food), they don't use them declaratively (i.e., to name things in conversation, as when we say to someone, I think this food is too spicy). Animals might be able to learn the correspondence between words (and other symbols) and what they stand for -- semantics, in other words -- but they don't have syntax, so they don't have the ability to combine symbols to express new ideas. He also argues that while they might be able to acquire concrete concepts, such as food, they don't have the ability to acquire abstract concepts such as justice. Barash, an evolutionary psychologist, isn't sure that chimps and other creatures are utterly lacking in linguistic ability -- he thinks that there might well be some chimp(s), out there somewhere, who might have it. We just may not have encountered them yet.

Koko was born in the San Francisco Zoo on July 4, 1971, and died at the Gorilla Foundation, the home established for her in the Bay Area by her trainer, Penny Patterson, on June 19, 2018 (see her obituary in the New York Times, 06/21/2018). There's no doubt that Koko was able to use signs to communicate with Patterson and others, though a fierce debate raged over the precise extent of her linguistic abilities. Koko had semantics, to a degree, using arbitrary signs to convey meaning, but it's not clear that she had syntax, which is what allows language to express an infinite variety of meanings, and which distinguishes language from other forms of communication.

Sarah was also retired from research, and spent her last 13 years in the company of other chimpanzees at Chimp Haven. She died in 2019, and got a great obituary in the New York Times ("The World's Smartest Chimpanzee Has Died", by Lori Gruen, 08/10/2019). I have more to say about Sarah in my lectures on the origins of consciousness and the development of social cognition.

Kanzi is, in some respects, a sadder story. Savage-Rumbaugh raised Kanzi in an intact family unit consisting of his mother, Matata, and his sister, Panbanisha, and eventually other relatives. Exposed to the keyboard as an infant, he quickly learned to use it to communicate with the human researchers by matching Yerkish lexigrams to spoken English words directed at various objects and actions. Savage-Rumbaugh claimed that, in one test, he demonstrated understanding of 72% of 660 novel sentences such as "Go get the ball that's outside". Panbanisha also learned to communicate via the lexigrams. Interestingly, Matata, who started the project as an adult, was much less successful in learning Yerkish. Savage-Rumbaugh also suggested that the bonobos acquired a form of speech, with distinct vocalizations specific to certain Yerkish words. Eventually, conflict developed among the staff: Savage-Rumbaugh and some other staff began to consider the lab environment as a kind of "Pan/Homo" hybrid culture ("Pan" being the label for the genus containing chimpanzees and bonobos), and wanted to continue to explore issues pertaining to language and culture; others wanted to "put the bonobo back in the bonobos", as it were -- instead of forcing them to live in something like human culture. There were management and funding problems as well, and cross-complaints alleging mistreatment of the animals. But the overriding issue has to do with the ethics of treating bonobos as if they were human -- as opposed to letting them live something like their normal lives. The research center still operates, and Kanzi and his family still live there, but Savage-Rumbaugh effectively lost her position (as of July 2020 she was still trying to regain it). Instead of her work on Pan/Homo "hybrid culture", most research involves more traditional procedures for the study of primate cognition -- although there's still considerable use of the Yerkish keyboards.
For more detail, see "The Divide" by Lindsay Stern, Smithsonian magazine, 07-08/2020, from which these illustrations are taken.

For an interesting perspective on animal language and human-animal communication, see "Buzz Buzz Buzz" by Michelle Nijhuis (New York Review of Books, 08/20/2020), reviewing books on interspecies communication by the Dutch philosopher Eva Meijer. In books such as Animal Languages and When Animals Speak: Toward an Interspecies Democracy, Meijer argues that instead of trying to teach animals English (or ASL, or whatever), we should try to learn their languages -- how they "speak" to us through the sounds and gestures they make. She further argues that, because animals are able to communicate their desires to us, they are "political actors" who deserve to be recognized by our political system -- as individuals with both "negative rights" (e.g., not to be confined, tortured, or killed) and "positive rights" (e.g., to preserved and protected habitats).

Hierarchical Organization of Language

Language is organized hierarchically, into a number of different levels of analysis.

At the lowest level, the phoneme is the smallest unit of speech. Similarly, written language has an elementary graphemic level consisting of the letters of the alphabet.

At the next level, the morpheme is the smallest unit of speech that carries meaning. In English, there are about 50,000 of these: roots, stems, prefixes, and suffixes. There are two classes of morphemes: open-class morphemes consist of nouns, verbs, adjectives, and adverbs (so called because new members can be added to the class just by inventing new words, like "radar" and "snarf"); closed-class morphemes consist of articles, connectives, prepositions, prefixes, and suffixes. Phonemes are combined into morphemes by phonological rules that specify which combinations are "legal" in a particular language.

At the next level, the word consists of one or more morphemes -- a root or stem, plus (perhaps) a prefix or suffix. In English, there are about 200,000 of these. Morphemes are combined into words according to the same phonological rules. Knowledge of the meanings of individual words is stored in the mental lexicon.

At the next level, phrases and sentences consist of strings of one or more words that express a meaningful proposition. In English, or any other language, the number of possible phrases and sentences is essentially infinite. Again, there are two classes of these.

Language basics are simple sentences consisting of open-class morphemes, such as "Mommy go store".
Language elaborations are complex sentences including closed-class morphemes, such as "My mommy goes to the store".

At each level of the hierarchy, combinations of lower-level elements are governed by rules. Grammatical rules govern how words can be strung together. Just a finite number of phonetic elements (phonemes) can be combined into a larger but finite number of morphemes and words, which in turn can be combined by a finite set of grammatical rules into an infinite number of propositions. These grammatical rules, known as syntax, make possible the creativity of human language. But the rules of syntax aren't the only rules of language:

Phonological rules specify which speech sounds can go together, and graphemic rules specify which letters can appear together.
Orthographic rules specify how various speech sounds are represented visually.
Morphological rules specify how morphemes are combined to convey meaningful information.
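
The claim that a finite set of rules can generate an unbounded set of sentences can be illustrated with a toy grammar. The sketch below is not a model of English: the four-word vocabulary, the two rules, and the function names are all assumptions made for the example. The point is only that one recursive rule makes the number of sentences grow without limit as deeper embedding is allowed.

```python
from itertools import product

# A toy grammar: a handful of words and one recursive rule.
NOUNS = ["the dog", "the cat"]
VERBS = ["saw", "chased"]


def noun_phrases(depth):
    """All noun phrases with at most `depth` embedded relative clauses."""
    if depth == 0:
        return list(NOUNS)
    smaller = noun_phrases(depth - 1)
    # Recursive rule: NP -> NOUN "that" VERB NP
    embedded = [f"{n} that {v} {np}" for n, v, np in product(NOUNS, VERBS, smaller)]
    return list(NOUNS) + embedded


def sentences(depth):
    """Sentence rule: S -> NP VERB NP, with noun phrases up to `depth`."""
    nps = noun_phrases(depth)
    return [f"{subj} {v} {obj}" for subj, v, obj in product(nps, VERBS, nps)]


# The vocabulary and rules never change; only the allowed depth does.
for depth in range(4):
    print(depth, len(sentences(depth)))
```

With no embedding the grammar yields only a handful of sentences, but each additional level of embedding multiplies the count, and nothing in the rules sets an upper limit on depth.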

Phonology

Each language has its own set of phonemes -- in English, there are about 40.

Other languages have other sets of phonemes.

For example, Hawaiian has only 12 phonemes, a, e, i, o, u, and h, k, l, m, n, p, and w. So there are fewer speech sounds to make up Hawaiian words than, for example, English words. Which is why so many Hawaiian words are long -- like humuhumunukunukuapua'a, which refers to a kind of fish. With so few letters, and correspondingly few phonemes, it takes longer strings to make unique words.
By contrast, Cherokee, a Native American language, has some 85 separate speech sounds. When Sequoyah invented a script for Cherokee, in the early 19th century, he adopted letters from the Roman, Cyrillic (Russian), and Greek alphabets, and Arabic numerals, but these letters, in Cherokee, do not correspond to their sounds in these languages.
Hungarian (or Magyar), a Finno-Ugric language, has a number of phonemes that do not appear in English or other Indo-European languages, such as ' (pronounced as in bone, though longer).
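
The link between the size of a sound inventory and the length of words can be made concrete with a rough calculation. The sketch below is a crude upper bound, not a linguistic model: it treats every string of sounds as a legal word (ignoring phonological rules), considers only strings of one exact length, and uses the approximate inventory sizes and the 200,000-word figure quoted in these notes. The function names are hypothetical.

```python
def possible_strings(n_sounds, length):
    """Upper bound on distinct sound strings of exactly `length` sounds."""
    return n_sounds ** length


def min_length_for(n_sounds, n_words):
    """Shortest string length whose upper bound reaches `n_words`."""
    length = 1
    while possible_strings(n_sounds, length) < n_words:
        length += 1
    return length


TARGET = 200_000  # rough English word count quoted elsewhere in these notes
for name, n_sounds in [("Hawaiian", 12), ("English", 40), ("Cherokee", 85)]:
    print(name, min_length_for(n_sounds, TARGET))
```

Under these simplifying assumptions, a smaller inventory needs longer strings to reach the same number of distinct words, which is the direction of the effect described for Hawaiian.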

Ranjit Bhatnagar, a sound artist, has invented the Speak-and-Play, an electronic keyboard that produces digitized samples of his voice producing the 40 English phonemes (in principle, it could be programmed to use other voices, and other sets of phonemes). It was introduced by Margaret Leng Tan, an avant-garde virtuoso of the toy piano, at the UnCaged Toy Piano Festival (New York Times, 12/14/2013).

The Alphabet

The language that humans are wired by evolution to produce and understand is spoken language. By contrast, written language is a relatively recent cultural invention, and some languages (those of preliterate cultures) don't have a written form (except perhaps one jury-rigged by anthropologists and missionaries). All normal children learn to understand their native tongue effortlessly, but some of these same individuals have difficulty learning to read. Oral language is hard-wired into us; we learn to read with difficulty. The source of this difficulty is illustrated by the "Reading Wars" which dominate American elementary education. How best to get children to convert letter strings into words, and word strings into sentences? We start with the alphabet.

Essentially, the alphabet is a way of expressing the phonemes of a language in written symbols (some languages, like Chinese and Japanese, have written forms, but not alphabets). The earliest known alphabet emerged in Egypt about 2000 BCE, as some Semitic-speaking group parlayed a small set of hieroglyphics into a set of symbols representing the speech sounds of their language. The Phoenician alphabet, source of the Roman letters familiar in a large number of written languages, emerged about 1000 BCE.

Like the rest of language, alphabets continually evolve. Written language apparently began with logograms, like the characters in Chinese or Japanese, where symbols stand for whole words. Then it shifted to logosyllabaries, in which symbols stand for syllables, not whole words. And finally to the alphabet, in which individual letters correspond to single units of sound. Writing, in whatever form, apparently emerged independently in four or five different geographical areas:

Mesopotamian cuneiform, around 3000 BCE.
Egyptian hieroglyphics, also around 3000 BCE -- because the two systems don't have anything in common, the conclusion is that they arose independently -- although the idea of writing may have been transmitted from one culture to the other.
Chinese writing, dating to about 1500 BCE. Again, Chinese characters are so different from cuneiform or hieroglyphs that the system is thought to have arisen independently.
Mayan glyphs, first appearing in Mesoamerica roughly 1500 years before European contact.
A written version of Rongorongo was used by the inhabitants of Easter Island (Rapa Nui) at the time of first contact with Europeans in 1722. This counts as a possible independent invention, because there was no written language in use in the Polynesian islands from which Rapa Nui's settlers migrated. Unfortunately, European settlers destroyed most of the written Rongorongo artifacts, and the only inhabitants who could read or write the language were taken into slavery in the 19th century. So there's just not much material to compare to other writing systems.

Some interesting facts about the alphabet:

In English, the letters U (derived from the Latin V) and J (derived from the Latin I) are of relatively recent origin.
Cherokee was the first Native American language to be given a written form, and Sequoyah's work is the only instance whereby an entire system of writing was invented by a single individual.
However, the pinyin ("spelling sounds") system for Mandarin Chinese comes close. Traditional Mandarin is written in the form of ideographs, picture-like characters which represent words, but which give no indication of how the words are pronounced. As a result, learning Mandarin entails memorizing thousands of individual characters -- both their meaning and their pronunciation. Beginning in the 1950s, following earlier less-successful attempts at romanization, Zhou Youguang, a Chinese linguist, and others developed pinyin, which represents pronunciation with the Roman alphabet, plus some diacritical marks to represent tones (which also carry meaning in Mandarin). The pinyin system was officially adopted by China in 1958, and by Taiwan in 2009. Zhou himself led an interesting life: like many other Chinese intellectuals, he was sent to a labor camp during the Cultural Revolution; he lived to 111 (see "Zhou Youguang, Who Made Writing Chinese as Simple as ABC" his obituary by Margalit Fox, New York Times, 01/14/2017).

A similar process of romanization is the romaji ("roman letters") system, originally developed in the 16th century by Yajiro, a Japanese convert to Christianity, and which gained popularity beginning during the Meiji era of the 19-20th centuries, and especially after World War II.
And also the hangul system in Korean, introduced in the 15th century by scholars in the court of Sejong the Great.

The Cyrillic alphabet, used in Russian and other Slavonic languages, also comes close; it is named for Sts. Cyril and Methodius, 9th-century Christian missionaries in Bulgaria.

For histories of the English alphabet, see:

Alphabet Abecedarium by Richard A. Firmage (1993);

Language Visible: Unraveling the Mystery of the Alphabet from A to Z by David Sacks (2004);

The Story of Writing: Alphabets, Hieroglyphs and Pictograms (2007);

The Greatest Invention: A History of the World in Nine Mysterious Scripts (2022) by Silvia Ferrara, reviewed in "Alphabet Politics" by Josephine Quinn (New York Review of Books 01/19/2023);

Inventing the Alphabet: The Origins of Letters from Antiquity to the Present (2022), also reviewed by Quinn;

and the suitably titled Writing and Script: A Very Short Introduction (2009) by Andrew Robinson.

For an overview of writing in general, see:

Writing from Invention to Decipherment (2024), ed. by Silvia Ferrara, Barbara Montecchi, and Miguel Valério.

Also, two striking hour-long documentaries (directed by David Singleton) which aired as part of the Nova science series on PBS in 2020. These are absolutely fascinating, and well worth your time to watch (links below).

A to Z: The First Alphabet discusses how writing emerged c. 3100 BCE, roughly simultaneously in the cuneiform system of Mesopotamia and the hieroglyphics of Egypt (the earliest Chinese characters have been dated to c. 1200 BCE), and how these systems were transformed into alphabets using the rebus principle. The rebus principle is, simply, that the sounds associated with pictograms can be combined to represent the sound of an unrelated, nonpictographic word. In the example on the right, an image of a pen touching a knee is the rebus for penny (get it?). The story is that various characters were transformed into symbols representing the sounds associated with them. Watch the film to see how this got us to the alphabet.

A to Z: How Writing Changed the World traces how the letters of the Latin alphabet were ideally shaped to serve as the basis for the moveable type invented by Johannes Gutenberg in the mid-15th century, thus promoting both the expansion of literacy and the scientific revolution. Today, languages like English, Russian, Greek, and Arabic all have their unique alphabets, as discussed in the lectures on Perception (and Chinese has the "pinyin" system). But in the 17th century, the German philosopher (and co-inventor of calculus) Gottfried Leibniz envisioned a universal alphabet common to all the world's languages: we're not there yet.

The infant's speech apparatus can produce all the phonemes in all known natural languages -- this is what comes out when babies babble. However, during the babbling period the infant gradually narrows its repertoire of phonemes to those of its native language -- that is, the language(s) to which the infant is exposed. This results in an accent when the person learns to speak a new language as an older child or an adult.

Long before they can produce any words, even very young infants can detect speech sounds in the auditory stream. Studies employing EEG showed that monolingual infants -- that is, infants raised in a household in which only one language is spoken -- respond differentially to speech sounds, compared to other kinds of sounds, as early as 6 months of age -- regardless of whether these phonemes are from their "home" language or another language, to which they haven't been exposed. By the age of 12 months, however, these monolingual infants clearly distinguish between their "native" phonemes and the "foreign" ones -- a phenomenon known as perceptual narrowing. The story was different, however, for infants raised in "bilingual" homes, in which they were exposed to two different languages. Like their monolingual counterparts, these infants did not discriminate between the two languages at 6 months of age; but by 12 months, they were clearly discriminating between the two sets of phonemes.

Somewhat in the manner of imprinting in geese (described in the lectures on Learning) there is a critical period for the learning of phonology, beginning about 6 months of age. However, this does not occur in a vacuum. It must seem obvious that the child has to hear people speaking. But, in fact, exposure to speech, lots of speech, is critical for language development. The more the merrier. There may be an innate Language Acquisition Device, as Chomsky has proposed; but language acquisition also depends on exposure to language, and the social interaction that comes with it. Nature and nurture, as always, work together.

This perceptual learning of speech sounds occurs even in the womb. Infants prefer speech sounds and rhythms drawn from the language to which they were exposed during fetal development -- and this is true of bilingual as well as monolingual infants.

The perceptual learning of speech sounds is a form of statistical learning, as discussed in the lectures on Learning. That is, as they are exposed to language, infants naturally pick up, without benefit of reinforcement, the statistical regularities in what they hear. From this experience they acquire knowledge about the regularities of speech sounds, how phonemes are assembled into morphemes, how morphemes are assembled into words, and how words are assembled into phrases and sentences.
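
One way researchers have formalized "statistical regularities" is as transitional probabilities between adjacent syllables: the probability that one syllable follows another. The sketch below only illustrates that idea; it is not a model of what infants actually compute. The made-up words (bidaku, padoti, golabu), the syllable stream, and the function name are all assumptions for the example.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next | current) for each adjacent pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }


# Hypothetical three-syllable "words", concatenated with no pauses.
words = {"bidaku": ["bi", "da", "ku"],
         "padoti": ["pa", "do", "ti"],
         "golabu": ["go", "la", "bu"]}
order = ["bidaku", "padoti", "golabu", "padoti", "bidaku", "golabu",
         "bidaku", "golabu", "padoti", "bidaku"]
stream = [syl for w in order for syl in words[w]]

probs = transitional_probabilities(stream)
print(probs[("bi", "da")])   # transition within a word
print(probs[("ku", "pa")])   # transition across a word boundary
```

In this toy stream, transitions within a word are perfectly predictable, while transitions across word boundaries are not. That difference is the kind of cue that could, in principle, mark where words begin and end.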

To take a classic example, consider the findings of a study by Patricia Kuhl:

At the age of 6-8 months, American, Taiwanese, and Japanese infants do not reliably differentiate between the syllables ra and la and qi and xi.

The syllables ra and la are common in English, but not in Japanese or Cantonese.
The Japanese syllable ra sounds to English ears like a blend of ra and la.

However, at the age of 10-12 months, American and Taiwanese infants clearly distinguish between ra and la, while the Japanese do not.

And the Taiwanese infants clearly distinguish between qi and xi, while the American infants do not.

Does this mean that Skinner was right and Chomsky wrong -- that language is not special after all, and that language acquisition is simply a variant on classical conditioning -- albeit without reinforcement? Not exactly. In the first place, Skinner was completely wrong about the necessity of reinforcement. As we have known since the time of Tolman and Harlow, learning -- acquiring knowledge of the world through experience -- just happens, whether we're trying to learn or not, and whether we're rewarded for learning or not. Statistical learning is a powerful learning mechanism, and it is the process by which children acquire most of their knowledge of language -- just as it is the process by which they acquire most of their knowledge about the world, through simple observation. But there's also a role for something like an innate Language Acquisition Device. It turns out that all of the natural languages of the world share a relatively small number of universal patterns. Artificial languages that do not share these patterns are difficult to learn, whereas those that share the "universals" of natural languages are learned readily by both children and adults. So, it seems as if we come prepared -- remember preparedness, from the Learning lectures? -- to learn some kinds of patterns and rules easily. That element of preparedness, wired into the human genome and brain by evolution, is the Language Acquisition Device.

The classic example of observational rule-learning in language is the past tense of verbs in English, discussed at length by the psychologist Steven Pinker (Words and Rules: The Ingredients of Language, 2000 -- a truly wonderful book that should be read by everyone interested in language; but first read Pinker's earlier and more accessible book, The Language Instinct: How the Mind Creates Language, 1994).

The rule, of course, is to add the suffix /-d/ (or /-ed/) to the stem, yielding joke-joked or play-played.
Of course there are exceptions, such as sing-sang or buy-bought, and these have to be memorized.
When young children learn English, they typically will overgeneralize the rule, yielding conversations like the following (reported by Pinker):

Child: My teacher holded the baby rabbits and we patted them.
Adult: Did you say your teacher held the baby rabbits?
Child: Yes.
Adult: Did you say she held them tightly?
Child: No, she holded them loosely.

A similar overgeneralization occurs when children learn the plurals of nouns, which are usually -- but not always -- formed by adding /-s/.

What's interesting about all this is that the child has never heard an adult say "holded". Never. Nor has the child ever been reinforced for saying "holded". Never. If anything, she's been corrected. The child gets "holded" not by imitating adults, much less by virtue of being reinforced. The child gets "holded" because she has abstracted a rule from hearing adults, and then generalized it to new words.
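
The logic of overgeneralization can be sketched as a lookup-then-rule procedure: use a memorized irregular form if one is available, and otherwise apply the regular rule. This is a toy illustration, not Pinker's model. The function name, the tiny dictionaries of memorized irregulars, and the crude spelling rule are assumptions made for the example.

```python
def past_tense(stem, memorized_irregulars):
    """Use a memorized irregular form if there is one; otherwise apply the rule."""
    if stem in memorized_irregulars:
        return memorized_irregulars[stem]
    # Regular rule (spelling version): add -d after a final "e", otherwise -ed.
    return stem + ("d" if stem.endswith("e") else "ed")


adult = {"sing": "sang", "buy": "bought", "hold": "held"}
child = {"sing": "sang"}  # hypothetical: "held" is not yet reliably memorized

for verb in ("joke", "play", "sing", "hold"):
    print(verb, past_tense(verb, adult), past_tense(verb, child))
```

With "held" missing from the child's memorized exceptions, the rule applies by default and produces "holded", a form nobody in the child's environment ever said.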

Sound and Meaning

Words are made of phonemes, which indicate how the word is to be pronounced. But what's the relationship between the sounds of words and their meanings? The debate goes back at least as far as Plato's Cratylus. In that dialog, Hermogenes argues that there is an arbitrary relationship between a word and its meaning; Cratylus disagrees, and Socrates comes down somewhere in the middle. And 2500 years later, modern linguistics stands with Socrates. Words are symbolic representations of objects, events, and ideas, which don't typically resemble -- in appearance or pronunciation -- the things they represent. But there are exceptions:

Wolfgang Kohler, one of the founders of the Gestalt movement in psychology, showed subjects a pointed, star-like shape and a rounded, cloud-like shape, and asked them which would be called takete and which baluba, both nonsense words in the subjects' native Spanish. They overwhelmingly associated the pointed object with takete, and the rounded one with baluba.
Ramachandran and Hubbard (2001) repeated the experiment with native English speakers, using the nonsense words bouba and kiki, and got the same results.

R&H argued that this bouba/kiki effect, as they dubbed it, represents a kind of linguistic synesthesia, in which the sound of a word, or perhaps its physical appearance, is associated with its meaning. A word like bouba, with its soft vowels and consonants, is associated with soft things, while a word like kiki, with its hard vowels and consonants, is associated with sharp things.

Whistle While You Speak

We ordinarily think of spoken language in terms of words strung into phrases and sentences, but that's not necessarily the case. In the sign languages used by deaf people, for example, meaning is conveyed by gestures instead of spoken words.

Another example is whistling language, such as Silbo Gomero used on La Gomera, an island of about 22,000 people in the Canary Islands archipelago (see "'Special and Beautiful': Whistled Language Echoes Around This Island" by Raphael Minder, New York Times, 02/19/2021). Accounts of the language go back as far as 15th-century Spanish explorers, and after conquest by Spain the language was adapted to Spanish. According to Minder, "Silbo Gomero... substitutes whistled sounds that vary by pitch and length for written letters [by which he really means phonemes or words]. Unfortunately there are fewer whistles than there are letters in the Spanish alphabet, so a sound can have multiple meanings, causing misunderstandings" which have to be resolved by reference to context [not unlike more familiar spoken languages, as discussed below]. In 2009, Silbo Gomero was added to UNESCO's list of "Intangible Cultural Heritage of Humanity", which noted that it is "the only whistled language in the world that is fully developed and practiced by a large community" (other islands in the archipelago, such as El Hierro, have their own whistled languages). It is now taught as an obligatory part of the school curriculum on La Gomera. Minder suggests that whistling languages evolved because of the unique geography of the Canary Islands, whose mountains, plateaus, and ravines make traveling by foot difficult. Whistles carry much farther than speech, facilitating communication across long distances.

Syntax

Phonology has to do with the sound of speech. Syntax has to do with the grammatical rules by which words and phrases are strung together to create meaningful utterances. These grammatical rules should not be confused with the ones you learned in elementary school. Normal children have all the grammar they need long before they enter into formal schooling -- they pick it up from hearing other people speak, and especially from being spoken to. Moreover, much of this grammatical knowledge is itself unconscious, in the sense that children (and adults) are not consciously aware of the rules they're following when they speak and listen, write and read. They gain explicit knowledge of some of these rules later, during formal instruction. It's in school that you learn the rules for sentence agreement: that the nouns, verbs, and adjectives in a sentence must agree in number: we say The horse is strong, but The horses are strong. When you make a mistake, your teacher corrects you until you get it right.

But there are other grammatical rules which you're not expressly taught, and which you just know intuitively. A classic example, adapted from Joseph Danks and Sam Glucksberg (1971), is the rule for ordering adjectives: every native speaker of English agrees that it is correct to say big red Swiss tables but not to say Swiss red big tables -- it just sounds wrong. But no schoolchild is ever taught this rule, and if you asked people to tell you what the rule is, they couldn't say. And, in fact, professional linguists have debated the precise nature of the rule for years. One proposal, from the British linguist Mark Forsyth (in The Elements of Eloquence, 2013), is that adjectives should be ordered as follows: opinion, size, age, shape, color, origin, material, and purpose, as in a lovely little old rectangular green French silver whittling knife. For present purposes, the actual rule doesn't matter: what matters is that there is such a rule, everyone follows it intuitively, but nobody (except maybe professional linguists) can tell you what the rule is.
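
Forsyth's proposal amounts to sorting adjectives by the rank of their category, which can be sketched as follows. The category order comes from the proposal quoted above; the mini-lexicon that assigns each adjective to a category, and the function name, are assumptions made for the example.

```python
# Category order as proposed by Forsyth (listed in the text above).
CATEGORY_ORDER = ["opinion", "size", "age", "shape",
                  "color", "origin", "material", "purpose"]

# Hypothetical mini-lexicon assigning each adjective to one category.
LEXICON = {
    "lovely": "opinion", "little": "size", "big": "size", "old": "age",
    "rectangular": "shape", "green": "color", "red": "color",
    "French": "origin", "Swiss": "origin", "silver": "material",
    "whittling": "purpose",
}


def order_adjectives(adjectives):
    """Sort adjectives by the rank of their category."""
    return sorted(adjectives, key=lambda adj: CATEGORY_ORDER.index(LEXICON[adj]))


print(" ".join(order_adjectives(["Swiss", "red", "big"])) + " tables")
```

The sketch makes the rule explicit in a way that speakers themselves cannot: the ordering lives in a table that no native speaker could recite, yet everyone produces output consistent with it.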

There are other rules as well, governing phonology and morphology. For example, the suffix -s that marks plural English words is pronounced differently, like zzz or ess, depending on the previous consonant. But nobody ever taught anybody that rule, either; native speakers of English employ it reliably by age 3 -- that is, before any elementary school teacher has taught them anything about English. And if I asked you why you say fades with the -zzz sound but gaffes with the -s sound, you couldn't tell me, because it depends on whether the preceding consonant is "voiced" or "unvoiced", and you probably don't know what that means (unless you've taken a linguistics class!).
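
The voicing rule can be sketched as a function of the stem's final sound. This is a simplified illustration: the sound labels are informal ASCII stand-ins rather than standard phonetic notation, the inventories are incomplete, and the first case (stems ending in hissing sounds such as s or sh, which take an extra vowel, as in buses) goes beyond the two cases mentioned above.

```python
# Informal labels for stem-final sounds (not standard phonetic notation).
SIBILANTS = {"s", "z", "sh", "ch", "j", "zh"}
# "th" here stands for the unvoiced sound in "myth", not the voiced one in "lathe".
VOICELESS = {"p", "t", "k", "f", "th"}


def plural_suffix(final_sound):
    """Return the pronunciation of plural -s after a stem ending in final_sound."""
    if final_sound in SIBILANTS:
        return "iz"   # extra vowel, as in "buses"
    if final_sound in VOICELESS:
        return "s"    # as in "gaffes" (the stem ends in the sound f)
    return "z"        # voiced consonants and vowels, as in "fades"


for word, final in [("fade", "d"), ("gaffe", "f"), ("bus", "s")]:
    print(word, "->", plural_suffix(final))
```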

Phrase (Surface) Structure

Phrase structure rules govern the surface structure of language -- the utterance as it is expressed by the speaker, and heard by the listener. Phrase structure grammar is represented by rewrite rules of the following form:

Noun ==> man, woman, horse, dog, etc.
Verb ==> saw, heard, hit, etc.
Article ==> a, an, the
Adjective ==> happy, sad, fat, timid, etc.
Noun Phrase ==> Art + Adj + N
Verb Phrase ==> V + NP
Sentence ==> NP + VP

These last three rules can generate 6,912 different legal sentences (48 noun phrases x 3 verbs x 48 noun phrases) just from the 14 words specified in the first four rules (try it out!). All these sentences have the following form:

The 1st noun phrase verbed the 2nd noun phrase.
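
The rewrite rules can be tried out directly by enumerating every sentence they license. The sketch below applies the rules mechanically to the words listed above, counting a and an as separate articles and ignoring the a/an distinction, so it also generates strings such as "an happy man". Applied that way, the rules yield 48 noun phrases, 144 verb phrases, and 6,912 sentences; the exact total depends on how the word lists are counted. The variable names are assumptions made for the example.

```python
from itertools import product

NOUNS = ["man", "woman", "horse", "dog"]
VERBS = ["saw", "heard", "hit"]
ARTICLES = ["a", "an", "the"]
ADJECTIVES = ["happy", "sad", "fat", "timid"]

# Noun Phrase ==> Art + Adj + N
noun_phrases = [" ".join(p) for p in product(ARTICLES, ADJECTIVES, NOUNS)]
# Verb Phrase ==> V + NP
verb_phrases = [" ".join(p) for p in product(VERBS, noun_phrases)]
# Sentence ==> NP + VP
sentences = [" ".join(p) for p in product(noun_phrases, verb_phrases)]

print(len(noun_phrases), len(verb_phrases), len(sentences))
print(sentences[0])
```

Adding a single noun to the list raises the count of noun phrases from 48 to 60 and the count of sentences from 6,912 to 10,800, which gives a feel for how quickly a small set of rules multiplies a vocabulary into sentences.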

Now if you recall that there are roughly 200,000 words in English, and more being invented every day, you will understand what we mean when we say that an infinite number of sentences can be generated from a finite number of words.

Phrase structure rules look like those you learned in 7th-grade English. But they also have psychological reality. Consider the famous poem by Lewis Carroll, Jabberwocky, from Through the Looking-Glass and What Alice Found There (1871), the sequel to Alice in Wonderland (1865):

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe
All mimsy were the borogoves
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

He took his vorpal sword in hand:
Long time the manxome foe he sought --
So rested he by the Tumtum tree
And stood awhile in thought.

And, as in uffish thought he stood
The Jabberwock, with eyes aflame
Came whiffling through the tulgey wood,
And burbled as it came!

One, two! One, two! And through, and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh, Callay!"
He chortled in his joy.

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe
All mimsy were the borogoves
And the mome raths outgrabe.

We don't know what the poem means, exactly, because so many of the "words" are unfamiliar. But we do know something of the meaning, simply because the structure of the poem follows the rules of English grammar. For example, we know that the toves, whatever they are, were slithy, whatever that is; and that they were gyring and gimbling, whatever those actions are, in the wabe, whatever that is; that the borogoves were mimsy; and the mome raths were outgrabe.

Actually, it's perfectly clear what Jabberwocky means, at least to Humpty Dumpty:

"You seem very clever at explaining words, Sir", said Alice. "Would you kindly tell me the meaning of the poem 'Jabberwocky'?"

"Let's hear it", said Humpty Dumpty. "I can explain all the poems that ever were invented--and a good many that haven't been invented just yet."

This sounded very hopeful, so Alice repeated the first verse:

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.

"That's enough to begin with", Humpty Dumpty interrupted: "there are plenty of hard words there. 'Brillig' means four o'clock in the afternoon--the time when you begin broiling things for dinner."

"That'll do very well", said Alice: "and 'slithy'?"

"Well, 'slithy' means 'lithe and slimy'. 'Lithe' is the same as 'active'. You see it's like a portmanteau--there are two meanings packed up into one word."

"I see it now", Alice remarked thoughtfully: "and what are 'toves'?"

"Well, 'toves' are something like badgers--they're something like lizards--and they're something like corkscrews."

"They must be very curious creatures."

"They are that", said Humpty Dumpty: "also they make their nests under sun-dials--also they live on cheese."

"And what's to 'gyre' and to 'gimble'?"

"To 'gyre' is to go round and round like a gyroscope. To 'gimble' is to make holes like a gimlet."

"And 'the wabe' is the grass plot round a sun-dial, I suppose?" said Alice, surprised at her own ingenuity.

"Of course it is. It's called 'wabe', you know, because it goes a long way before it, and a long way behind it--"

"And a long way beyond it on each side", Alice added.

"Exactly so. Well then, 'mimsy' is 'flimsy and miserable' (there's another portmanteau for you). And a 'borogove' is a thin shabby-looking bird with its feathers sticking out all round--something like a live mop."

"And then 'mome raths'?" said Alice. "If I'm not giving you too much trouble."

"Well a 'rath' is a sort of green pig, but 'mome' I'm not certain about. I think it's sort for 'from home'--meaning that they'd lost their way, you know."

"And what does 'outgrabe' mean?"

"Well, 'outgribing' is something between bellowing an whistling, with a kind of sneeze in the middle: however, you'll hear it done, maybe--down in the wood yonder--and when you've once heard it, you'll be quite content. Who's been repeating all that hard stuff to you?"

"I read it in a book", said Alice.

"Jabberwocky" had been subjected to endless exegeses by linguists. See, also The Annotated Alice by Martin Gardner, which contains references to much of this scholarly material. You can read a summary in "The Frabjous Delights of Seriously Silly Poetry" by Willard Spiegelman (Wall Street Journal, 04/04/2020). Spiegelman reminds us that some of the "nonsense" words coined by Carroll, such as galumphing and chortled, have entered the standard English dictionary.

As for the Jabberwock itself, it's a creature of Carroll's imagination, and probably best left to the reader's imagination, as well. Conjure up your own image, and then check out the classic illustration by John Tenniel. Or, better yet, buy a copy of the book, with all of Tenniel's delightful illustrations, to share with a child (or adult) you know. And while you're at it, get a copy of Gardner's Annotated Alice for your coffee table.

Christopher Myers (2007) has produced an illustrated version of Jabberwocky intended to introduce the poem to children. But in his book, the Jabberwock is turned into a kind of monstrous basketball player. That may be fine for getting kids to read the book, but the real beauty of the poem is in its parody of classic English poetry, and in its demonstration of just how much meaning is contained in the syntax alone.

The psychological reality of phrase-structure grammar is also indicated by experiments on memory for strings of pseudowords. For example, in an experiment by Epstein, subjects were asked to memorize strings of pseudowords such as the following:

THE YIG WUR VUM RIX HUM IN JAG MIV.

Another group was asked to memorize versions of the strings that had been altered with prefixes and suffixes, so that they mapped onto phrase structure grammar:

THE YIGS WUR VUMLY RIXING HUM IN JAGEST MIV.

Strings of the second type were easier to memorize compared to those of the first type, even though they contained more letters. The reason is that our knowledge of phrase structure grammar provides a scheme for organizing the strings into chunks resembling the parts of speech. This organization supports better memory (remember the organization principle?).

As a further demonstration, consider an experiment by Fodor and Bever (1965) on signal detection. In this study, subjects heard a click superimposed on a spoken sentence, and they were asked to indicate where, in the sentence, the click occurred. If the click was located at the boundary between a noun phrase and a verb phrase, it was accurately located. But if the click was presented within a noun phrase or a verb phrase, it was usually displaced to the boundary between them. This indicates that noun phrases and verb phrases are perceived as units.

Deep Structure

Phrase structure grammar is important, but it is not all there is to syntax. For example, at first glance, sentences with similar surface structures seem similar in meaning:

JOHN SAW SALLY.
JOHN HEARD SALLY.

However, similar surface structures don't guarantee similarity of meaning. Consider the following sentences:

JOHN IS EASY TO PLEASE.
JOHN IS EAGER TO PLEASE.

The difference in meaning can be demonstrated by rephrasing them with JOHN as the object of PLEASE; the first paraphrase preserves the meaning, but the second does not:

IT IS EASY TO PLEASE JOHN.
IT IS EAGER TO PLEASE JOHN.

Moreover, sentences with different surface structures may have similar meaning:

JOHN SAW SALLY.
SALLY WAS SEEN BY JOHN.
IT WAS JOHN WHO SAW SALLY.
IT WAS SALLY WHO WAS SEEN BY JOHN, WASN'T IT?

So clearly something else is needed besides surface structure grammar.


The deep structure of a sentence can be represented by its basic propositional meaning representing the basic thought underlying the utterance:

Prop ==> NP + VP

The surface structure of a sentence consists of a proposition and an attitude:

Att ==> assertion, denial, question, focus on object, etc.
S ==> Att + Prop

Thus:

Proposition: THE BOY HIT THE BALL.
Assertion: THE BOY HIT THE BALL.
Denial: THE BOY DID NOT HIT THE BALL.
Question: DID THE BOY HIT THE BALL?
Focus on Object: THE BALL WAS HIT BY THE BOY.
Combination: THE BALL WAS NOT HIT BY THE BOY, WAS IT?
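
The relation between one proposition and its several surface forms can be sketched with simple string templates. This is only an illustration of the idea that a single kernel maps onto many surface structures: the templates stand in for transformational rules and are not an implementation of transformational grammar. The class name, field names, and function names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Proposition:
    agent: str        # e.g. "THE BOY"
    obj: str          # e.g. "THE BALL"
    past: str         # past-tense form, e.g. "HIT"
    base: str         # bare form used after DID, e.g. "HIT"
    participle: str   # form used in the passive, e.g. "HIT"


def assertion(p):
    return f"{p.agent} {p.past} {p.obj}."


def denial(p):
    return f"{p.agent} DID NOT {p.base} {p.obj}."


def question(p):
    return f"DID {p.agent} {p.base} {p.obj}?"


def focus_on_object(p):
    return f"{p.obj} WAS {p.participle} BY {p.agent}."


def combination(p):
    # Passive + negative + tag question; the tag is hard-coded for a
    # singular, inanimate object.
    return f"{p.obj} WAS NOT {p.participle} BY {p.agent}, WAS IT?"


prop = Proposition("THE BOY", "THE BALL", "HIT", "HIT", "HIT")
for transform in (assertion, denial, question, focus_on_object, combination):
    print(transform(prop))
```

Every output is built from the same Proposition object; only the attitude differs. Swapping in another proposition, such as one built from JOHN, SALLY, and SAW/SEE/SEEN, produces the corresponding family of surface forms.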

Something like Deep Structure, and transformational grammar, is logically necessary in order to understand language. But, as empirical scientists, we also need evidence of its psychological reality. In this regard, some evidence is provided by the utterances of novices in a language, such as infants and immigrants. These utterances tend to mimic the hypothesized deep structure:

I NO GO SLEEP.
WHY MOMMY HIT BILLY?

Further evidence is provided by studies of memory for paraphrase. In such experiments, subjects study a sentence, and then are asked to recognize studied sentences from a set of targets and lures.

HE SENT A LETTER TO GALILEO.

Interest in the experiment comes from recognition errors (false alarms) made in response to alternative phrasings. Phrasings that represent a different proposition, but the same attitude, are correctly rejected:

GALILEO SENT A LETTER ABOUT IT TO HIM.

But phrasings that represent the same proposition, albeit with a different attitude, tend to be falsely recognized:

A LETTER ABOUT IT WAS SENT TO GALILEO BY HIM.

This kind of evidence indicates that memory encodes the "gist" of a sentence, its kernel of meaning, rather than the details of its surface structure.

Further evidence is provided by a study of meaning verification. Subjects study a standard sentence:

THE BOY HIT THE BALL.

Then they are presented with test sentences, and asked to indicate which have the same meaning as the standard. Sentences that involve only a single transformation are correctly verified very quickly:

HAS THE BOY HIT THE BALL?

However, sentences that involve two transformations take more time:

WAS THE BALL HIT BY THE BOY?

This indicates that, in understanding the sentence, the language processor strips away the surface structure to unpack the deeper kernel of meaning below -- and this unpacking takes time.

In an early, now-classic experiment, Savin & Perchonock (1965) presented subjects with sentences followed by a string of eight unrelated words (that is, words that were unrelated to the sentence). They were then asked to repeat the sentence verbatim, and to recall as many of the words as possible. The sentences represented either a kernel of meaning, such as

The boy hit the ball

or one or more transformations, such as

Passive: The ball was hit by the boy
Negative: The boy did not hit the ball
Negative + Passive: The ball was not hit by the boy

The finding was that subjects remembered the most words when presented with the simple sentence representing the kernel of meaning. When they had to process a single transformation, they remembered fewer words. And when they had to process two or more transformations, they remembered even fewer words. Apparently, processing the transformations took up space in what we would now call working memory, leaving less capacity to hold the words.

Universal Grammar

Chomsky has proposed that the human ability to acquire and understand any natural language is based on the fact that some grammatical knowledge is part of our innate biological equipment. Not that we are born knowing English or Chinese, obviously. A child born of English-speaking parents, but raised from birth in a Chinese-speaking household, will speak fluent Chinese, without even an English accent. And a child born of Chinese-speaking parents, but raised from birth in an English-speaking household, will speak fluent English without any trace of Chinese accent. For Chomsky, language acquisition occurs as effortlessly as it does because humans are born with a special brain module or system -- he sometimes calls it a "Language Acquisition Device" or LAD -- that permits us to learn any language to which we are exposed. At the core of the LAD is a body of linguistic knowledge that Chomsky calls "Universal Grammar" or UG -- a set of linguistic rules or principles that apply to all languages, spoken by anyone, anywhere. UG and the LAD come with us into the world as our basic mental equipment, just like we are born with lungs and hearts and eyes and ears. They are part of the biological stuff that makes us human, and which separates us from other species, such as chimpanzees, which for all intents and purposes lack the human capacity for language.

According to Chomsky, all natural languages -- whether English or Chinese or Hindi-Urdu or Swahili or whatever -- have the principles of UG in common. What makes them different is that each language has a different package of parameters. It's a little like buying a car: every car has four wheels and an internal-combustion engine, but some cars have leather seats and other cars are convertibles. You get the basic equipment whatever car you buy; it's only the options that are different. But the options are linked together: if you buy a standard-shift car, it comes with a clutch and a different kind of gearshift, as well as a gauge to indicate the engine's RPMs. So it seems to be with language: if, for example, a language has the rule that verbs must precede objects (John shoveled the snow), it will also place relative clauses after the nouns that they modify (John shoveled the snow, which was three feet deep). Not all languages have either feature (there are languages that say, in essence, John the three-foot deep snow did shovel), but a language like English, which has one feature, apparently also must have the other.

Of course, the words are different, too. In English we say "man" where the Spanish say "hombre". But Chomsky is not talking about words. He's talking about syntax: the fundamental structure of the grammatical knowledge that allows us to put words together into meaningful sentences.

Chomsky's theory is widely accepted, though mostly on grounds of plausibility: there has to be something like UG, and an innate LAD, to enable people to learn their native language as easily and as well as they do. But gathering empirical evidence for the theory has proved difficult, because it requires investigators to have an intimate knowledge of several languages that are very different from each other. Still, there has been some interesting progress toward empirical tests of Chomsky's theory.

For example, Mark Baker, a linguist who studied under Chomsky at MIT, has shown that the world's languages differ from each other in one or more of a relatively small set of 14 parameters. Baker claims to have identified more than a dozen of these already, and he thinks there may be as many as 30.

For example, some languages are polysynthetic, which means that they can express in one (usually long) word a meaning that another language would take many words to convey. Mohawk, a Native American language, is polysynthetic, as is Mayali, an aboriginal language spoken in northern Australia. English, by contrast, is not; nor are most other known languages.
As another example, languages differ in terms of head direction, which dictates whether modifiers are added before or after the phrases they modify. For example, in English, we would say

"I will put the book on the table".

Welsh, spoken in the British Isles, and Khmer, spoken in Cambodia, also put modifiers in the front of phrases. By contrast, Lakota, another Native American language, Japanese, Turkish, and Greenlandic (an "Eskimo" language) put the modifiers at the end. In Lakota, you would say something like

"I table the on book the put will".

Different languages make different "choices" among these parameters. Presumably, when a child learns his or her native language, he or she starts out with UG, and then fits what he or she hears into the scheme. It's as if an infant raised in an English-speaking household says, "OK! I got it! This is not a polysynthetic language, and it's got forward head-direction." None of this goes on consciously, of course, and you don't actually learn about these parameters unless you take a college course in comparative linguistics. But there does seem to be a universal grammar underlying all languages, and we seem to know it innately. And whatever differences there are between languages, somehow we have to grasp them very quickly in order for us to become fluent speakers as early as we do. So, it makes sense that, in addition to UG, the various parametric "options" are also laid out in our brains, ready to categorize the language we're exposed to from birth.
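
To make the head-direction parameter concrete, here is a toy sketch in Python. It is only an illustration, not a model of any real grammar: the nested-tuple representation of a phrase (a head word followed by its complements) and the function names linearize and sentence are assumptions made up for this example. A single switch decides whether heads come first or last, and flipping it turns the English word order into something like the Lakota-style order quoted above.

```python
# Toy illustration of the head-direction parameter.
# A phrase is a tuple: (head_word, complement, complement, ...).
# A bare string is a single word. This representation is hypothetical.

def linearize(phrase, head_initial=True):
    """Return the words of a phrase in head-initial or head-final order."""
    if isinstance(phrase, str):
        return [phrase]
    head, *complements = phrase
    parts = [[head]] + [linearize(c, head_initial) for c in complements]
    if not head_initial:
        parts.reverse()  # mirror image: complements first, head last
    return [word for part in parts for word in part]

def sentence(subject, predicate, head_initial=True):
    # The subject stays first in both orders, as in the examples in the text.
    return " ".join([subject] + linearize(predicate, head_initial))

# "will put the book on the table" as nested head + complement phrases
predicate = ("will", ("put", ("the", "book"), ("on", ("the", "table"))))

english_like = sentence("I", predicate, head_initial=True)
lakota_like = sentence("I", predicate, head_initial=False)
print(english_like)
print(lakota_like)
```

The same tree yields both word orders; only the setting of one "option" differs, which is the sense in which a parameter works like a switch.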

Critique of Universal Grammar

On the other hand, work by Michael Dunn suggests that linkage between various aspects of language grammar isn't as tight as Chomsky assumes it is. His survey indicates that the world's languages contain a lot more random linkages among these features, suggesting that flipping the switch for verb-object order, for example, doesn't also control noun-relative clause order.

Moreover, Daniel Everett, a linguist who did Christian missionary work in the Amazon (his memoir of that time is entitled Don't Sleep, There Are Snakes), has claimed that one particular tribe, the Piraha, speak a language whose structure is inconsistent with Chomsky's theory of UG. Chomsky (and others) have claimed that recursion is the essential property of human language, distinguishing it from all other modes of communication (like birdsong), and the key to its creativity. That is, we can embed clauses within sentences to make infinitely long sentences -- and, more important, sentences which have never been uttered before. What nonhuman animals lack, even chimpanzees, is the ability to combine meaningful sounds (what would be our phonemes and words) into new sounds that have a different meaning.

So, beginning with

John gave a book to Lucy

we can simply keep adding clauses:

John, who is a good husband, gave a book to Lucy.
John, who is a good husband, gave a book to Lucy, who loves reading.
John, who is a good husband, gave a book, which he wanted to read himself, to Lucy, who loves reading.

And so on: you get the idea.
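
For readers who like to see the idea in code, here is a minimal Python sketch of recursion. It is a toy under stated assumptions: the function name describe, the verb "knows", and the extra names Tom and Ann are invented for illustration and do not come from any linguistic theory. The point is only that one rule which calls itself can keep embedding clauses, so the set of possible sentences has no upper limit.

```python
# Toy sketch of recursion: a description of a person may contain a
# relative clause that itself contains a description of another person.
# The function, the verb, and the extra names are hypothetical.

def describe(names, verb="knows"):
    """Recursively build 'A, who knows B, who knows C ...'."""
    first, *rest = names
    if not rest:
        return first  # base case: no more clauses to embed
    # recursive case: embed a description of the remaining people
    return f"{first}, who {verb} {describe(rest, verb)}"

def gift_sentence(names):
    return f"John gave a book to {describe(names)}."

print(gift_sentence(["Lucy"]))
print(gift_sentence(["Lucy", "Tom"]))
print(gift_sentence(["Lucy", "Tom", "Ann"]))
```

Each additional name adds one more embedded clause, and nothing in the rule itself ever forces the process to stop.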

Anyway, Everett claims that the Piraha language doesn't have the property of recursion -- nor, for that matter, does it have number terms or color words. Everett concludes that language, far from depending on a uniquely human faculty like Chomsky's Language Acquisition Device, is a product of general intelligence. It is an invention, not unlike the bow and arrow, which can arise independently in lots of different cultures, but whose particulars are shaped by the culture that produced it. So far from being universal, particular languages are shaped by particular cultural needs. The Piraha, whose culture is focused completely on the concrete here-and-now, just don't need recursion, so their language doesn't have it. But if language is shaped by culture, then it isn't universal in the way Chomsky says it is. You can't imagine how controversial this claim is -- so much so that no fewer than three independent research teams have visited the Piraha to figure out how their language works. It doesn't help that Everett has been reluctant to release the data he collected in the field. However, an independent corpus of 1,000 Piraha sentences, collected by another missionary linguist, appears to show no evidence of recursion. But even if that's true, it doesn't show that the Piraha can't generate or understand recursive sentences. It would just show that they don't routinely do it.

Everett has expanded his argument in Language: The Cultural Tool (2012). Far from depending on the existence of a separate Language Acquisition Device, Everett and other Chomsky critics argue that language learning is just a special instance of learning in general. To be sure, spoken language (as opposed to sign language) depends on a distinctly human vocal apparatus, but language itself is "just" a complex function mediated by a very complex brain. In Everett's formula:

Language = Cognition + Culture + Communication.

Language isn't an innate human faculty, in this point of view. Rather, it's a tool that can be used by a culture or not. And when it's used by a culture, it's shaped by the culture in which it's used. We'll see this idea crop up again later, when we consider the Sapir-Whorf hypothesis that language determines thought. What people talk about (semantics), and how they talk about it (syntax), are both determined by culture, including cultural values.

Everett expanded his argument with Chomsky in How Language Began: The Story of Humanity's Greatest Invention (2018). The subtitle tells it all: Language isn't something that evolved through the accidents of natural selection; language is something that humans invented for purposes of exchanging information. For that reason, there will be cultural differences among languages -- like the language of the Piraha, which doesn't involve recursion, and doesn't have color or number terms, because the Piraha don't see the need for them. But beyond this debate, there lies a basic difference about the function of language: Chomsky thinks that language is primarily a tool for thinking, while Everett thinks that language is primarily a tool for communication. As Everett puts it:

Language did not fully begin when the first hominid uttered the first word or sentence. It began in earnest only with the first conversation, which is both the source and the goal of language.

A variant on Everett's critique has been promoted by Michael Tomasello (2003) in his usage-based theory of language acquisition (for a brief summary, see "Language in a New Key" by Paul Ibbotson and Michael Tomasello, Scientific American, 11/2016). Essentially, Tomasello denies any need to postulate anything like LAD or UG, and instead argues that language is a product of a general-purpose learning system, coupled with other general-purpose cognitive abilities, such as the ability to classify objects and make judgments of similarity.

Tomasello begins by outlining some empirical problems with Chomsky's theory (which, as I've noted, has changed considerably since he first announced it in the 1950s).

Some aboriginal Australian languages don't package words neatly into separate noun phrases and verb phrases, "outliers" which suggested that UG was not universal after all.
Some "ergative" languages, such as Basque, have grammatical features that are not found in "accusative" languages, such as Spanish, French, or Portuguese. For example, where an "accusative" language would say The boy kicked the ball, an "ergative" language would say something like The ball the boy kicked.

To be fair, though, it was findings such as these that led Chomsky and his colleagues to develop the "Principles and Parameters" version of the theory described above.

And of course, there are languages such as Piraha which don't seem to make use of recursion, which Chomsky believes lies at the heart of language.

Now, in response to such findings, one might simply make adjustments to the theory, and in fact that's what Chomsky has done. But Tomasello believes that enough challenging evidence has accumulated that Chomsky's theory should be thrown out entirely, and something new substituted -- or, in Tomasello's case, something old: statistical learning, of the sort discussed earlier in the context of phonology. The general idea is that through this general learning mechanism children pick up on the regularities of whatever language they're exposed to. If this sounds a bit like B.F. Skinner's approach to language -- well, it is, except that it depends on observational learning rather than reinforcement. In that sense, the usage-based theory is more "cognitive" and less "behavioristic". This general statistical-learning process is augmented by other general processes, such as attention, memory, categorization, and the ability to understand other people's intentions (which is discussed in the lectures on "Psychological Development").

The point of this is not that Chomsky is right and Everett wrong, or vice-versa (though, frankly, I'm rooting for Chomsky). Chomsky's theories about human language have dominated linguistics, psychology, and cognitive science for more than 50 years (since his earliest publications, in the 1950s), and he attracts allies and foes in equal number. For my money, Chomsky's basic view of language as a unique human component of our cognitive apparatus is probably more right than wrong, but I'm not enough of a linguist -- I'm not any linguist at all -- to engage in a detailed critique of his theoretical views. Nor is there any need for such a critique at this level of the course. All I ask is that you appreciate Chomsky's approach to language. If you get really interested in language, you'll have plenty of opportunity to delve deeper into its mysteries in more advanced courses.

For an entertaining introduction to Chomsky's thought, both psychological and political, see "Is the Man Who Is Tall Happy?", a documentary film by Michel Gondry (2013). The title is taken from a sentence that Chomsky often uses to illustrate his ideas.

See also the Noam Chomsky website at www.chomsky.info.

For more details on Baker's work, see his book, The Atoms of Language: The Mind's Hidden Rules of Grammar (2001).

Everett's work, and the controversy surrounding it, is presented in his book, Language: The Cultural Tool (2012), and in "The Grammar of Happiness", a documentary film scheduled to be shown on the Smithsonian Channel in 2012.

The Everett-Chomsky debate has made it into popular culture. Tom Wolfe, the pioneer of the "New Journalism", has done for language (and the theory of evolution) what he did earlier for modern art (in The Painted Word) and architecture (in From Bauhaus to Our House). In The Kingdom of Speech (2016, excerpted in Harper's magazine), Wolfe attacks Chomsky both for his abstruse theories about language and for his celebrity as a political activist. Actually, Wolfe begins with Darwin, and characterizes his theory of evolution "as a messy guess -- baggy, boggy, soggy and leaking all over the place". That's not remotely true, of course: evolution is as much of a certainty as we have in science (Wolfe also has fun at the expense of the "Big Bang" theory of the origin of the universe, characterizing it as little more than a bedtime story). But evolutionary psychologists can be silly, as we'll see in the lectures on "Psychological Development", and the theory of evolution has a particularly bad time explaining the emergence of language. Because nonhuman animals don't have anything like human linguistic capacity, Wolfe claims that language could not have evolved through natural selection. (Wolfe is not unique in this respect: for this same reason, someone else -- I forget who, but it might have been David Premack, the psychologist who worked with the chimpanzee known as Sarah -- once remarked that "Language is an embarrassment to the theory of evolution".) Then he turns his attention to Chomsky, accusing him of ignoring evidence (like Everett's) from the field in favor of armchair speculation. He's got Darwin pretty much dead to rights about the evolution of language, but his criticism of Chomsky focuses mostly on his celebrity rather than his theory.
Still, like everything Wolfe has written (New Journalism like The Electric Kool-Aid Acid Test and The Bonfire of the Vanities), it's provocative and great fun to read -- even for those of us who admire both Chomsky's work on language and his political activism. Wrong as he may be about Chomsky, Darwin, and even the Big Bang, Wolfe is a wonderful (and wonderfully funny) writer -- a real tribute to what you can do with language.

But things might be changing. In 2025, a group of primate-communication researchers led by Melissa Berthet reported that a group of bonobos studied at the Kokolopori Bonobo Reserve in the Democratic Republic of the Congo showed evidence of compositionality, or the ability to combine two calls into a phrase that has a new meaning ("Extensive compositionality in the vocal systems of bonobos" by Simon Townsend et al., Science 04/04/2025). It was already known that some animals (not just bonobos) can combine calls in a manner that adds their meanings together. To take an example from human language, the phrase tall cook simply refers to a person who is both tall and a cook -- the two individual meanings are simply added together. But a phrase like good cook doesn't refer to a person who is both good and a cook. He or she could be a very bad person, but still a good cook. It turns out that these bonobos did something similar, combining their vocabulary of hoots, whistles, peeps, etc. into phrases whose meanings are, to borrow a phrase from the Gestalt psychologists (see the lectures on Perception), something different from the sum of their parts. From a Chomskian perspective, this ability to combine language units, like words, into new meanings is perhaps not unique to humans after all. And that would mean that language isn't "an embarrassment to the theory of evolution" after all.
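
The difference between the two kinds of combination can be sketched in a few lines of Python. This is a toy, and everything in it is invented for illustration: the people (Ann, Bob, Cy), their properties, the 0-10 skill scale, and the cutoff of 7. For tall cook, the meaning of the phrase is just the overlap of the two sets. For good cook, that shortcut gives the wrong answer, because "good" has to be evaluated relative to cooking.

```python
# Toy contrast between "tall cook" (meanings simply added together) and
# "good cook" (meaning of the whole differs from the sum of the parts).
# All people, properties, and numbers are hypothetical.

tall = {"Ann", "Bob"}
cooks = {"Bob", "Cy"}
good_people = {"Ann", "Bob"}
cooking_skill = {"Bob": 2, "Cy": 9}  # made-up 0-10 scale

# Additive combination: a tall cook is anyone who is tall and a cook.
tall_cooks = tall & cooks

# Treating "good cook" the same way picks out good people who happen to cook.
naive_good_cooks = good_people & cooks

# Non-additive combination: "good" is judged relative to cooking itself.
good_cooks = {person for person in cooks if cooking_skill[person] >= 7}

print(tall_cooks)
print(naive_good_cooks)
print(good_cooks)
```

In this made-up world Cy is not on the list of good people, but Cy is the only good cook, which is exactly the gap between adding meanings and composing them.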

Semantics

For Chomsky, syntax is the key to human language: Human communication wouldn't be creative otherwise. But syntax isn't the only thing we need to understand utterances. Syntax provides structure, but it doesn't provide specific meaning. It only says that the first noun phrase verbed the second noun phrase. That's just the framework for a meaningful utterance. The frame has to be filled out with specific words and phrases. Semantics has to do with the specific meanings conveyed by words in a proposition, what the various words refer to. And unless we know what the specific words refer to, we can't understand the meaning of what's being said to us. The problem of reference has two aspects: denotative, concerning the object, or event, or attribute that the word labels; and connotative, or the emotional meaning of the word. This question of meaning, or reference, returns us to the problem of semantic (as opposed to episodic) memory; and also to categorization. Both semantic memory and categorization link language to thought.

Consider the opening of Lewis Carroll's poem "Jabberwocky":

"'Twas brillig, and the slithy toves did gyre and gimble in the wabe. All mimsy were the borogoves, and the mome raths outgrabe."

We can get some sense of that just from the syntax, but we have no idea what brillig is, and what toves are, and in what respect they're slithy, and what a borogove is, and what it means to be mimsy.

The problem of reference has two general aspects, denotative and connotative. Denotative reference or denotative meaning concerns the object or event or attribute that a particular word labels. What does it mean to gyre? What are borogoves, and what does it mean to be mimsy? Connotative meaning has to do with the emotional meaning of the word, whether it's a good thing or a bad thing to be mimsy or brillig.

This question of meaning or reference returns us to the problem of semantic as opposed to episodic memory, and how we can represent semantic memory in a network where nodes represent concepts, and links represent the associative or propositional relationships among various concepts. And we've actually talked about meaning a lot when we talked about categorization, and how concepts are held together by prototypes, or exemplars, or whatever. Both semantic memory and categorization link language to thinking.

A Taxonomy of Words

In grade school we learned to classify words according to their grammatical or syntactic properties -- the parts of speech. (Different languages have different parts of speech, and linguists have fist fights over the exact number at their conventions.) But in English, this is a more-or-less consensual classification:

noun (the name of a person, place, or thing; can be concrete, like dog or abstract, like justice);
verb (referring to an activity or process, such as hit or think);
adjective (which modifies a noun or noun phrase, such as pretty or handsome);
adverb (which modifies a verb or verb phrase, such as quickly or slowly);
pronoun (substituting for a noun or noun phrase, such as he, she, and it);
preposition (which marks spatial, temporal, or other relations, such as in, before, and for);
conjunction (which connects words, phrases, or clauses, such as and and or);
interjection (which expresses emotion, such as Damn! or Ouch!);
article or determiner (which marks the referents of nouns or noun phrases, such as a and the).

And that's fine. But there is another classification of words, which is what Peter Mark Roget (1779-1869) produced in his Thesaurus of English Words and Phrases, Classified and Arranged so as to Facilitate the Expression of Ideas and Assist in Literary Composition, originally published in 1853, and revised in many editions (some by Roget's son, John Lewis Roget), right up to the present day. We usually think of "Roget's Thesaurus" as a kind of dictionary in which we can look up synonyms, to prevent our writing from being repetitive and boring. And it is that. But it's so much more. Inspired by Linnaeus's system of biological classification, discussed in the lectures on Thinking, Roget, who originally trained as a physician, invented the "log-log" slide rule, and comprehensively surveyed animal and vegetable physiology (more inspiration from Linnaeus), aspired to nothing less than a comprehensive classification of all the words in English, not alphabetically, as in a standard dictionary (which doesn't classify words anyway, except as parts of speech), but, in Roget's words, "according to the ideas which they express".

Here are the top two levels of Roget's hierarchy, with the category of "Abstract Relations" unpacked completely:

Abstract Relations

Existence

Abstract

Existence vs. Inexistence

Concrete

Substantiality vs. Unsubstantiality

Formal

Intrinsicality vs. Extrinsicality

Modal

State
Circumstance

Relation

Absolute

Relation vs. Irrelation
Consanguinity
Correlation
Identity vs. Contrariety vs. Difference

Continuous

Uniformity vs. Non-uniformity
Similarity vs. Dissimilarity
Imitation vs. Non-imitation vs. Variation

Partial

Copy vs. Prototype

General

Agreement vs. Disagreement

Quantity

Simple

Quantity vs. Degree

Comparative

Equality vs. Inequality
Mean
Compensation
Greatness vs. Smallness
Superiority vs. Inferiority
Increase vs. Decrease

Conjunctive

Addition vs. Non-addition, Subduction
Adjunct vs. Remainder, Decrement
Mixture vs. Simpleness
Junction vs. Disjunction
Vinculum
Coherence vs. Incoherence
Combination vs. Decomposition

Concrete

Whole vs. Part
Completeness vs. Incompleteness
Composition vs. Exclusion
Component vs. Extraneous

Order

General

Order vs. Disorder
Arrangement vs. Derangement

Consecutive

Precedence vs. Sequence
Precursor vs. Sequel
Beginning vs. End vs. Middle
Continuity vs. Discontinuity
Term

Collective

Assemblage vs. Non-assemblage, dispersion
Focus

Distributive

Class
Inclusion vs. Exclusion
Generality vs. Speciality

Categorical

Rule vs. Multiformity
Conformity vs. Unconformity

Number

Abstract

Number
Numeration
List

Determinate

Unity vs. Accompaniment
Duality
Duplication vs. Bisection
Triality
Triplication vs. Trisection
Quaternity
Quadruplication vs. Quadrisection
Five, &c vs. Quinquesection, &c [Here Roget seems to have gotten tired!]

Indeterminate

Multitude vs. Fewness
Repetition
Infinity

Time

Absolute

Time vs. Neverness
Period, contingent Duration vs. Course
Diuturnity vs. Transientness
Perpetuity vs. Instantaneity
Chronometry vs. Anachronism

Relative

Priority vs. Posteriority
Present Time vs. Different Time
Synchronism
Futurity vs. Preterition
Newness vs. Oldness
Morning vs. Evening
Youth vs. Age
Infant vs. Veteran vs. Adolescence
Earliness vs. Lateness
Occasion vs. Intempestivity
Frequency vs. Infrequency
Periodicity vs. Irregularity

Change

Simple

Change vs. Permanence
Cessation vs. Continuance
Conversion vs. Reversion
Revolution
Substitution vs. Interchange

Complex

Changeableness vs. Stability
Eventuality vs. Destiny

Causation

Constancy of Sequence

Cause vs. Effect
Attribution vs. Chance

Connection Between Cause and Effect

Power vs. Impotence
Strength vs. Weakness

Power in Operation

Production vs. Destruction
Reproduction
Producer vs. Destroyer
Paternity vs. Posterity
Productiveness vs. Unproductiveness
Agency
Energy vs. Inertness
Violence vs. Moderation

Indirect Power

Influence vs. Absence of Influence
Tendency
Liability

Combination of Causes

Concurrence vs. Counteraction

Space

Generally
Dimensions
Form
Motion

Matter

Generally
Inorganic
Organic

Intellect

Formation of Ideas
Communication of Ideas

Volition

Individual
Intersocial

Affections

Generally
Personal
Sympathetic
Moral
Religious

Note that fully half of Roget's categories refer to mental or behavioral terms: intellect, volition, and affection. This triad will be discussed further in the lecture on The Trilogy of Mind.

For a short article on Roget and his Thesaurus see "Roget Gets the Last Word" by Claudia Kalb, Smithsonian Magazine, May 2021.

For a stunning interactive display of the conceptual relationships among words in English, link to the WordNet page created by George Miller (he of "The Magical Number Seven, Plus or Minus Two" fame) of Princeton University. WordNet is a super-duper online, interactive thesaurus, linking words (like cat) not only to their synonyms (kitty) but to their antonyms (dog), hypernyms (feline, carnivore, animal), and meronyms (paw, pad, tail), and more, as well.
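
To give a feel for what a network of word relations looks like as a data structure, here is a toy sketch in Python. It is hand-built from the cat example above and is only an assumption-laden illustration: it is not WordNet data, it does not use the WordNet software, and the names network, related, and hypernym_chain are made up for this example.

```python
# Toy semantic network in the spirit of WordNet.
# Hand-built for illustration; this is not actual WordNet data.
network = {
    "cat": {
        "synonyms": ["kitty"],
        "antonyms": ["dog"],
        "hypernyms": ["feline"],
        "meronyms": ["paw", "pad", "tail"],
    },
    "feline": {"hypernyms": ["carnivore"]},
    "carnivore": {"hypernyms": ["animal"]},
}

def related(word, relation):
    """Return the words linked to `word` by `relation` (empty list if none)."""
    return network.get(word, {}).get(relation, [])

def hypernym_chain(word):
    """Follow hypernym links upward until no more are listed."""
    chain = []
    while related(word, "hypernyms"):
        word = related(word, "hypernyms")[0]
        chain.append(word)
    return chain

print(related("cat", "meronyms"))
print(hypernym_chain("cat"))
```

Each word is a node and each relation is a labeled link, which is the same kind of structure used to describe semantic memory in the lectures on Memory.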

Pragmatics

But even syntax and semantics together are not enough to decode the meaning of an utterance, because many sentences are inherently ambiguous. Consider the following examples:

Someone stepped on his trunk.
Harvey saw a man eating fish.
They are visiting firemen.
Visiting relatives can be boring.
Smoking volcanoes can be dangerous.
Make me a milkshake.
Marina restroom plans stall (this one from an actual newspaper headline in the West County Times, 09/21/2013).

Utterances such as these can't be disambiguated by analysis of phonology, syntax, and semantics alone. We understand ambiguous sentences -- and lots of the sentences that we have to deal with are ambiguous -- by making use of a fourth aspect of language, beyond phonology, syntax, and semantics, known as pragmatics. That is to say, we must consider the pragmatics of language as well: the context in which the utterance takes place. This context can be linguistic: the other sentences that surround the ambiguous one, and clarify its meaning. It can also be non-linguistic, composed of the speaker's body movements, and other aspects of the physical context in which the utterance is spoken and heard.

The environmental context is important. If we're at a zoo, and I say, "someone stepped on his trunk", you're likely to look around for an elephant. If we're at an aquarium and I say, "Harvey saw a man eating fish", you're going to look around for a shark.
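
Here is a toy Python sketch of what is going on with the fish example. It rests on assumptions made up for illustration: the two bracketings are simplified, and the context table (including the "restaurant" setting) is hypothetical. The sketch shows that one and the same string of words corresponds to two different structures, and that something outside the sentence has to pick between them.

```python
# Toy sketch: one string of words, two structures, and context picks one.
# The bracketings and the context table are illustrative assumptions.

def words(tree):
    """Flatten a nested-tuple structure into its sequence of words."""
    if isinstance(tree, str):
        return [tree]
    return [w for branch in tree for w in words(branch)]

# Reading A: a man who is eating fish
reading_a = ("Harvey", ("saw", ("a", ("man", ("eating", "fish")))))
# Reading B: a fish that eats men
reading_b = ("Harvey", ("saw", ("a", (("man", "eating"), "fish"))))

# Hypothetical mapping from where we are to the reading we prefer
preferred = {"restaurant": reading_a, "aquarium": reading_b}

def interpret(context):
    return preferred[context]

print(" ".join(words(reading_a)))
print(" ".join(words(reading_b)))
print(interpret("aquarium") == reading_b)
```

Both readings print the identical sentence, so nothing in the words themselves tells the listener which structure was intended.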

Also important is prosody, or the pattern of emphasis given to the various words in a sentence: How you say something may be as important as, or even more important than, what you say. As an exercise, repeat the simple phrase, "What am I doing here?" five times, each time placing the emphasis on a different word. In each instance the words are the same: same phonology, same syntax, same semantics. But nevertheless, each utterance conveys a different meaning. The meanings are given by the way the words are spoken, not by the way they are strung together.

A 1951 satirical record by the comedian Stan Freberg -- actually, a spoof on radio and television soap operas -- vividly illustrates the importance of prosody. In the sketch, known as "John and Marsha", the two characters, both voiced by Freberg, simply repeat each other's names over and over, each instance with a different intonation clearly changing the mood and meaning. Link to the classic recording of the John-Marsha sketch (YouTube).

A Matter of Emphasis on January 6, 2021

This is not just the stuff of jokes. Sometimes, a great deal of importance hinges on emphasis. Consider this episode from the hearings held by the House Select Committee to Investigate the January 6th Attack on the United States Capitol. As the crowd assembled on the Capitol Mall to hear his speech, the soon-to-be-twice-impeached and soon-to-be-former President Donald J. Trump instructed security officials to take down the magnetometers used to detect weapons in the crowd: "They're not here to hurt me". Here is an excerpt from "This Is What Happened When the Authorities Put Trump Under a Microscope" by Carlos Lozada (New York Times, 02/12/2023):

The challenges of interpreting and describing what another person was thinking, doing or intending at a particular moment — even a person as overanalyzed as Donald J. Trump — come alive in one passage, or rather, one word, of the Jan. 6 report. The issue is not even the word itself, but the form in which it is rendered.

The report [of the Select Committee] cites the testimony of a White House aide, Cassidy Hutchinson, who explained how, on the morning of Jan. 6, the president was incensed that the presence of magnetometers (used to detect weapons) was inhibiting some armed supporters from entering the Ellipse, where the president was to deliver his speech.

As always, Trump wanted a bigger crowd. Hutchinson said she heard him say something like, “I don’t F’ing care that they have weapons. They’re not here to hurt me. Take the F’ing mags away. Let my people in.”

They’re not here to hurt me. Which word should one emphasize when uttering that sentence aloud? If it is the verb “hurt,” the sentiment would be somewhat benign. They are not here to hurt me, the president might have meant, but to praise or cheer or support me. If the emphasis falls on “me,” however, the meaning is more sinister. They’re not here to hurt me, the implication would be, but to hurt someone else. That someone else could be Mike Pence, Nancy Pelosi, an officer of the Capitol Police or any of the lawmakers gathering to fulfill their duty and certify Joe Biden as president.

So, which was it? The Jan. 6 report confuses matters by italicizing “me” in the document’s final chapter but leaving it unitalicized in the executive summary. The video of Hutchinson’s testimony shows her reciting the line quickly and neutrally, with perhaps a slight emphasis on “hurt” rather than “me.” (You can watch and listen for yourself.)

Of course, the less ambiguous interpretation of Trump’s words is that either inflection — whether “hurt” or “me” — still means the president was unconcerned about anyone’s safety but his own. Perhaps “I don’t F’ing care” is the most relevant phrase.

With a document surpassing 800 pages, it may seem too much to linger on the typeface of a single two-letter pronoun. But for accounts that can serve as both historical records and briefs for the prosecution, every word and every quote — every framing and every implication — is a choice that deserves scrutiny.

The studious restraint of the Mueller report came in for much criticism once the special counsel failed to deliver a dagger to the heart of the Trump presidency and once the document was so easily miscast by interested parties. Even its copious redactions, justified by the opaque phrase “Harm to Ongoing Matter” appearing over a sea of blotted-out text, seemed designed to frustrate. Yet, for all its diffidence, there is power in the document’s understated prose, in its methodical collection of evidence, in its unwillingness to overstep its bounds while investigating a president who knew few bounds himself.

Gesture, the way we move our hands when we speak, can also be important in helping a listener to clarify our meaning. A simple example: if you're talking about a girl, and there are two girls in the room, you can clarify your meaning by gesturing towards one of them. And, in fact, a very popular theory of the evolution of language says that spoken language began as gesture. The first language, way back when, may very well have been gestural rather than vocal, involving the hands and the face, perhaps with some grunts and cries to convey emotion. This proposal gains plausibility from studies revealing the communicative power of sign languages used in various deaf communities. But even among those with normal hearing, speech is accompanied by a tremendous amount of manual activity in the form of hand-waving and other gestures. This is true even when people speak over the telephone, and can't even see each other's gestures. Language is connected to the hands, not just to the mouth. If you don't believe this, the next time you're engaged in a conversation, try not to use your hands (stick them in your pockets, perhaps), and see how much more difficult it is.

Setting aside the hands, our facial expressions and other aspects of body language convey a great deal about our intended meaning. And they can convey it even unintentionally. You can tell someone you love them, but if you're not smiling, if you're shuffling your feet, if you're not looking them in the eye when you say it, they'll never believe you.

The intersection between verbal and nonverbal pragmatics is illustrated by scare quotes and air quotes. In March 2017, President Donald Trump accused his predecessor, President Barack Obama, of having had his "wires tapped" in Trump Tower during the 2016 presidential election. Congressional and press inquiries to the FBI and the National Security Agency failed to produce any evidence of wiretapping. In discussing the charge, Sean Spicer, the White House press secretary, used two fingers of each hand to put "air quotes" around the word wiretapping to indicate that Trump didn't mean the phrase literally, but was referring to any form of surveillance (photo at right from "Trump Ruins Irony, Too" by Moises Velasquez-Manoff, New York Times, 03/20/2017; see also "The Scare Quote: 2016 in a Punctuation Mark" by Megan Garber, the Atlantic, 12/23/2016). Technically, quotation marks should be used to indicate a direct quotation, but often, in written communications, they take on an ironic connotation of epistemic uncertainty -- they indicate that the writer doesn't mean what s/he says, or doesn't mean to be taken literally, or that the word enclosed in the quotation marks is a sort of metaphor. In spoken discourse, the same purpose is accomplished by the use of air quotes. The use of air quotes and scare quotes reminds us that language isn't just a matter of syntax and semantics. The words we speak and write can take on multiple meanings, meanings that often have to be deciphered by the reader or listener. And the use of air quotes and scare quotes is an indication that some deciphering is necessary.

Two "No"s Don't Make a "Yes"

One example of how pragmatics can trump semantics is a story about the late Columbia University philosopher Sidney Morgenbesser (1921-2004). Sometime in the 1950s, the Oxford philosopher J.L. Austin gave a colloquium on the philosophy of language, in which he remarked that while a double negative has a positive meaning -- to say you're not uninterested means that you're actually interested -- there were no examples where a double positive has a negative meaning. At which Morgenbesser called out,

"Yeah, yeah".

Common Ground

Could you pass the salt?

Apparently this sentence first appeared in papers by Gordon and Lakoff (1971/1975), Searle (1975/1979), and Clark (1979; Clark & Lucy, 1975). According to Herbert Clark (personal communication, June 26, 1995), there also exists a satirical paper entitled "Can you pass the salt?" or some-such. See also Groefsema (1992).

Children who reply to this question with a yes get dirty looks from their parents, and are immediately branded smart-alecks by their teachers, because this is not a question about the listener's physical abilities. Rather, it is an indirect request to pass the salt. It harkens back to Bartlett's effort after meaning, as the listener tries to resolve the inherent ambiguity in the sentence. Chomskian syntax and semantics are not enough for that purpose, it turns out. We also need a set of pragmatic principles which go beyond the information given by syntax and semantics, and which govern how people communicate with each other. In the final analysis, a sentence like this reminds us that language is not just a tool for individual thought; it is also a tool for interpersonal communication -- or, as the psycholinguist Herbert Clark has put it, language doesn't have so much to do with words and what they mean as it does with people and what they mean. So, in addition to investigating the cognitive bases of language, we have to understand its social foundations as well; once again, social psychology addresses the use to which cognitive structures and processes are put (for reviews of the social psychology of language use, see Brown, 1965, 1986).

So, for example, from analyzing how sentences like Can you pass the salt? are understood, we learn that in order for the speaker and listener to communicate they have to establish common ground -- which Clark (1979) defines as the knowledge, beliefs, and suppositions that speaker and listener share in common. Each must have some sense of what the other person knows, believes, and supposes to be true, and each must use this knowledge in structuring his or her communication. If speaker and listener are not on common ground, they will not understand each other and their interactions cannot go very far.

Conversational Rules

In order to achieve this mutual understanding, people have to manage their conversations according to what the linguist Paul Grice has called the cooperative principle (Grice, 1975, 1979):

Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.

This principle, in turn, is unpacked in terms of four conversational maxims, and their sub-maxims:

The maxim of quantity: Make your contribution as informative as is required (for current purposes), and do not make your contribution more informative than is required.

The maxim of quality: Try to make your contribution one that is true; do not say what you believe to be false, and do not say that for which you lack adequate evidence.

The maxim of relevance: Be relevant.

The maxim of manner: Be brief, and orderly, avoiding obscurity and ambiguity of expression.

Grice and others interested in sociolinguistics, including some psychologists (e.g., Clark, 1996; Higgins, 1981; Schwarz, 1994), have shown how listeners assume that speakers are following these maxims, and how lots of mischief can result when this assumption proves false.

Speech Acts

The function of language is not just to represent objects and events symbolically, nor just to communicate these thoughts to others. As the American philosopher John Searle (1970) has noted, speaking is also acting. When a clergyman or judge presides over a wedding and says, "I now pronounce you man and wife", he's not just expressing the thought that the couple is now married. His speech act makes them married, in a way that they weren't before. Just as with any other form of action, the interpretation of speech acts depends on context. So, for example, saying "Have a nice day" to a salesclerk who has just treated you rudely is quite a different act from saying exactly the same words to your spouse as she goes out the door to work in the morning.

Uh... Um... Huh?

Spontaneous speech contains a number of apparent dysfluencies which, upon further analysis, actually turn out to be words that convey meaning, not just grunts or random noises (also known as "conversational detritus").

Clark and Fox Tree (2002) found that English speakers use the syllables uh and um to signal listeners that there will be a pause in their speech while they search for the right word to use, or decide what they want to say next.

Uh signals that there will be a short delay.
Um signals that there will be a longer delay.

Dingemanse et al. (2013) found that huh? is a universal signal that the listener has not heard clearly what the speaker has just said. When uttered by a listener, it initiates "repair" efforts on the part of the speaker, so that he more clearly conveys his intended meaning.

Despite wide differences in phonology, every language they studied had a close variant of huh?, used in just this manner.

The word always started with an h, or with a glottal stop (remember the discussion of speech sounds in the lectures on Sensation and Perception?).

In each language, huh? follows the rules of that language.

In English, it is spoken with the rising tone that indicates a question.
In Icelandic, it is spoken with the falling tone that indicates a question.

The idea here is that huh? is a universal word, one of the few that appears in the same form, or very similar forms, in a wide variety of languages. By contrast, even mama and papa appear in more different forms, across languages, than huh? does.

Huh!

So the next time someone criticizes you for sprinkling too many uhs and ums in your speech, say huh?. Then tell them that uh and um are really meaningful words. But they're also annoying little words, when there are too many of them, so try to plan your speech in advance!

See also "Huh? Is That a Universal Word?" by N.J. Enfield (American Scientist, 05-06/2019).

Code-Switching

Another aspect of pragmatics, especially among bilingual and bicultural individuals, is code-switching -- in which individuals switch back and forth between languages, or between dialects of a language, depending on the context. The most familiar example is probably observed in African-Americans, including politicians and other public figures, who may switch from Standard English when addressing or conversing with whites to African American Vernacular English (AAVE, also known as Black English or Ebonics, related to dialects spoken in rural parts of the American south, and largely derived from dialects spoken in the British Isles -- not, as often assumed, from the West Indies) when addressing or conversing with fellow blacks -- e.g., He my brother instead of He's my brother, aks rather than ask. Code-switching frequently occurs in discourse about particular topics of interest to the community in which the conversation takes place, or in expressions of solidarity with the community, or as an expression of group identity -- in which case it can be viewed as a means of establishing common ground.

A provocative example of code-switching occurs in James, by Percival Everett (2024), a retelling of Mark Twain's Huckleberry Finn from the point of view of the slave Jim. Twain's novel is full of "Negro dialect" (as it was portrayed, inaccurately, in much late 19th- and early 20th-century literature -- or, in Twain's case, as a literary technique; not to mention the n-word). Everett reverses the joke: James speaks in Standard English with his African-American confreres, and speaks in dialect only to white people. While praising the novel, John McWhorter, a sociolinguist at Columbia (formerly at UCB), criticizes Everett's implication that Black English was a kind of code that African-Americans used to prevent whites from knowing what they were saying ("What a Brilliant Novel Gets Right -- and Deliberately Wrong -- about Black English", New York Times, 02/13/2025). To the contrary, McWhorter argues that Black English is just another dialect of English, derived from other nonstandard English dialects spoken by plantation owners and white indentured servants from Britain and Ireland. Well, not "just another" dialect: McWhorter asserts that Black English is "America's most interesting English dialect". Still, McWhorter agrees, the notion that Black English is (was) a kind of code, rather than the dialect that it is, makes for a great novel (and, apparently, forthcoming movie).

AAVE is sometimes derided as mere street slang or at best "standard English with mistakes" (a phrase taken from the title of a 1999 essay by the linguist Geoffrey Pullum -- who doesn't agree with this position), and became controversial in 1996, when the public schools in Oakland, California (and other municipalities) decided to permit classroom instruction in AAVE as an alternative to Standard English (SE). Language purists howled, but when John McWhorter was asked to weigh in on the dispute, he produced a definitive defense of AAVE as a genuine dialect of English, with its own rules governing phonology, morphology, semantics, and syntax (The Word on the Street, 1998). The general view now is that children need to be taught SE, so that they can get along in the classroom and the workplace, but there is no need to discourage AAVE (and other ethnic dialects) at home or in the neighborhood. Children learn to code-switch, and (thanks to what is called contrastive analysis) can learn to translate from AAVE to SE for those (including many African-Americans!) who don't know the former.

The difference between a language and a dialect is a little like the difference between a species and a subspecies. Individuals of different species can't mate with each other (not successfully, anyway!). Within a species, individuals of different subspecies can mate successfully. Individuals who speak different languages cannot understand each other. Individuals who speak different dialects of the same language can. This applies to oral language. Written language can muddy the waters a bit. Serbo-Croatian is the common language of Serbs and Croats, but Serbs write in Cyrillic, while Croats use a Latin alphabet. Hindi-Urdu is one language, but Urdu is written in something like Arabic, while Hindi is written in something like Sanskrit.

You don't have to be bilingual to engage in code-switching, either. Millennials, and other heavy users of text-messaging and social media, do it as well. The quality of writing on social media is, apparently, terrible (I don't know much about this personally: I don't blog and I don't tweet, I'm not on Facebook and I'm not Linked In). But the same writers who misuse grammar (who and whom) and punctuation (the apostrophe), fail to capitalize, misuse words (literally and virtually), and rely on acronyms (lol) and emojis on the internet don't do the same thing when they write papers in their college courses, or thank-you notes to their grandparents. They write one way with their peers, and another way in other contexts. The internet is even developing its own classic style with its own logical integrity, as evidenced by the Internet's own style manual, A World Without "Whom": The Essential Guide to Language in the BuzzFeed Age (2017) by Emmy J. Favilla, who is the chief copy editor at BuzzFeed, the online news service (reviewed by John Simpson, former chief editor of the Oxford English Dictionary, in "Language Rules for the Digital Age", New York Times Book Review, 12/10/2017). But again, when these same individuals write for non-peers, they shift easily to something closer to the Chicago Manual of Style.

Something similar happens when Americans emigrate to England. For example, most Americans say to-MAY-to, while most Brits say to-MAH-to. For an American to say "to-MAH-to" in America sounds pretentious, while for an American to say "to-MAY-to" in England sounds like something an ugly American would do. This problem is discussed at length by Lynne Murphy, an American linguist who teaches in a British university (Sussex, to be precise), in The Prodigal Tongue: The Love-Hate Relationship between American and British English (2018; see also her popular blog, "Separated by a Common Language", echoing a famous quote from Oscar Wilde -- yes, George Bernard Shaw said it too, but about 50 years later). Summing up the essential pragmatics of code-switching, Murphy writes:

"A complex calculation has to be made weighing up the relative advantages of being understood, fitting in, and avoiding mockery versus the definite costs of losing one's linguistic identity and saying things that sound plainly ridiculous to you."

Code-switching is a popular vehicle for comedy, especially comedy that pointedly critiques race relations. In "Sorry to Bother You", a 2018 film directed by Boots Riley, a black employee of a telemarketing company has little success until a black fellow employee exhorts him to "Use your white voice". Reviewing the film and other examples of "white voice" in popular culture, Aisha Harris writes:

As long ago as the New World, enslaved and free blacks participated in dramatized communal appropriation of "white-identified gestures, vocabulary, dialects, dress, or social entitlements", as Marvin McAllister writes in his book Whiting Up: Whiteface Minstrels and Stage Europeans in African-American Performance.... In an episode of the '90s sitcom "Martin", Martin's attempt to sound white [while dealing with a 911 operator] and the operator's reaction shrewdly emphasize how the perception of whiteness grants a measure of access often closed to people of color. The episode also slyly suggests that while cultural differences do exist, black people, by virtue of being a minority group in America, should understand white people as much as possible; their comfort and livelihood depend on it ("When Black Performers Use Their 'White Voice'", New York Times, 07/12/2018).

Code-switching makes an important point about the social functions of language. Language is obviously a tool for communication. As Chomsky points out, language is also a tool for thinking. But language, the language we use and how we speak it, is also an expression of identity.

Back to Nonverbal Communication

Nonverbal communication isn't just for birds and bees. It's also for humans, including humans who communicate with each other through language. The nonverbal aspects of language include cues in a variety of modalities: visual (kinesics, or body language; proxemics, or distance; eye contact and other aspects of oculesics); auditory (prosody, vocal quality, volume, and pitch); even written language has its nonverbal aspects, such as the style of handwriting or typography (e.g., email messages consisting of lots of upper-case letters). It's widely believed that spoken language had its origins in hand-gestures and other nonverbal modes. But the importance of pragmatics indicates that there's more to language and communication than words and rules.

The Evolution of Language

While other animals have remarkable communicative abilities, and individuals of some species have displayed some capacity to learn language, human language -- language in all its glory -- is a uniquely human capacity. It has no analog anywhere else in the animal kingdom. Given the fact of the uniqueness of language, at least one puzzling fact remains: Why are there so many different languages? Why doesn't everyone speak English, or Spanish, or Swahili, or Tewa? The parallels between the diversity of species, all with their origins in a single common ancestor, and the diversity of languages, have led some linguists to seek evidence for a single, ancient "protolanguage", and to trace the evolution of modern languages from this ancestral source. In this work, they examine commonalities in speech sounds, grammar, and vocabulary among known languages, living and dead, written and oral -- much the way biologists trace the evolution of species by comparing features of morphology or the molecular structure of DNA to determine the relations among animals.

It's been said (by someone, I think David Premack), that "language is an embarrassment to the theory of evolution". The reasoning is that the uniqueness of human language appears to defy the basic Darwinian principle of continuity across species. Human language must have evolved, and if it evolved it should have left traces in our evolutionary forebears -- just as our two arms and two legs reflect that we evolved from fish who had four fins. But it doesn't. No nonhuman species has anything like human language. They might have phonology (birds), they might have semantics (lots of chimpanzees), but none of them have syntax (some theorists dispute this, but I think the evidence is pretty clear).

So how did language evolve, and why didn't it leave any evolutionary traces in other species? There are, basically, two views.

Steven Pinker & Paul Bloom (1990) argue that language evolved in the usual Darwinian manner, as a result of a random genetic mutation affecting brain structure and/or function. The ability to communicate linguistically gave humans a huge adaptive advantage, because it permitted social learning by precept rather than example. Instead of each individual having to learn through trial and error, one individual could convey its knowledge to a large number of others by teaching. For Pinker and Bloom, then, the evolutionary advantage affected communication between individuals.
Chomsky himself has a slightly different view. He has always thought that language was primarily a tool for thought rather than for communication, because it gives us a powerful means of representing knowledge, and for creative thinking -- for thinking thoughts that have never been thought before. He calls this new trait "Merge", by which he means the ability for recursion -- to combine simple thoughts (e.g., The fox jumped over the dog; The fox was quick; The fox was brown; The dog was lazy) into ever-more-complicated ones (e.g., The quick brown fox jumped over the lazy dog). Chomsky proposes that this single mutation occurred in a single human. He or she had nobody to talk to, of course, because none of his or her contemporaries had the gene. But the gene was transmitted through successive generations, so that eventually enough humans had enough other humans to share their thoughts with.

Either way, as a powerful tool for representing knowledge, or as a powerful medium for transmitting it -- and, of course, it is both -- language gave the humans who had it such a powerful adaptive advantage that they quickly outpaced all their evolutionary cousins -- not just chimpanzees, but other hominids as well. That's why language has left no evolutionary trace of itself.

The gradualist perspective on the evolution of language is well represented by The Language Puzzle: Piecing Together the Six-Million-Year Story of How Words Evolved by Steven Mithen, an archeologist. True to his title, Mithen devotes a chapter to each of the pieces that, when put together like a jigsaw puzzle, explain how it was that the unique language capacity of homo sapiens gradually evolved: the emergence of bipedality that separates hominins from other primates; the large brain that separates homo from australopithecus, plus an abundance of internal connectivity; the inclusion of fat- and protein-rich meat into the diet; the development of a vocal apparatus capable of producing lots and lots of different phonemes, and of an auditory apparatus perfectly tuned to the frequency range of human speech; the invention of stone tools, possibly by h. neanderthalensis, certainly by h. sapiens; the replacement of h. neanderthalensis by h. sapiens; the invention of visual symbols, represented by cave art by h. sapiens (remember: spoken words are auditory symbols; written words, once writing is invented, will be visual symbols); and more -- you can see where this is going. Ian Tattersall, a prominent anthropologist reviewing Mithen's book in the New York Review of Books (xxxxx, 12/19/2024), praises Mithen's individual chapters, but concludes that he failed to put the gradualist picture together.

[N]ot only has the reader been treated to an accessible account of an intrinsically fascinating subject, but Mithen feels that he has all of his pieces “on the table” and is ready to explain “why, when and how” language evolved. He accordingly presents us with a detailed scenario of human evolution in which the “word-like and syntax-like” vocalizations of the earliest bipeds eventually gave way to the use of iconic words and gestures by the australopiths and later on by tool-wielding members of Homo....

***

This is all so confidently delivered, with so much circumstantial detail, that the reader is tempted to forget that there is precious little evidence for any of it and that the whole construct is held together by nothing more than a blind faith that language must have emerged gradually over a prolonged period of time. In effect, Mithen has been able to complete his jigsaw puzzle only by placing his pieces on top of one another in a neat pile, rather than by carefully fitting them together side by side. Which suggests that he has been using the wrong metaphor all along. A better one might be the architectural arch, which cannot function until it is fully complete. Half an arch, or even nine tenths of one, is useless; similarly, to deprive language of any of its interlocking elements would rob it of the “discrete infinity” that makes it qualitatively unique among the many systems of vocal and/or gestural communication. Yet, just as Chomsky’s minimalist algorithm predicts, for all its many complexities language appears to be easily invented if you happen to already have a “language-ready” brain—as was the case, for example, for the deaf and language-naive Nicaraguan schoolkids [Note: See below] who spontaneously invented and elaborated a rule-bound sign language when housed together for the first time in the 1970s.

So, what actually happened? The fossil record shows that, after a long history of hominin brain enlargement and doubtless also of complexifying vocal/gestural communication, by around 230,000 years ago our anatomically distinctive species Homo sapiens had emerged in Africa. We know from direct evidence that these ancient humans boasted a modern vocal tract, and we can reasonably infer that they also possessed brains with the internal connectivity required for language—which, after all, they could never otherwise have acquired. However, we have no evidence either direct or indirect that those earliest Homo sapiens used language as we know it, and we have to wait until about 100,000 years ago for the first convincing proxy evidence of language use to show up. This comes in the form of overtly symbolic objects such as abstract engravings, items of bodily adornment, and, before long, representational paintings.

Such objects allow us to infer pretty confidently that the human brain had by this time shifted from the ancestral intuitive algorithm to the symbolic one we use today. What is more, that change must have been precipitated by a purely behavioral stimulus, because the requisite biology had to have been there already. And since it is difficult to conceive of either language or symbolic ability in the absence of the other, the cultural stimulus concerned was most plausibly the spontaneous invention of language. People (quite likely children, to start with) began to associate particular sounds with specific meanings, and thereby to give things the names that particularize them in the modern human mind and make symbolic reasoning possible. Once that mental feedback between sound and symbol had been established, the keystone of the symbolic arch was in place.

It may seem remarkable that everything needed for a sudden cognitive transition just happened to be in the right place at the right time. But significantly, in evolutionary terms the “exaptation” of the human brain that was involved—its potential to master language and symbolic thought where no such cognitive features had existed before—would have been nothing special. Ancestral birds, for example, had possessed feathers for many millions of years before they ever used them to fly. And once the new way of processing information had kicked in, the rest was history, as the newly symbolic and linguistic (though otherwise manifestly unperfected) Homo sapiens spread beyond Africa, eventually invented farming, and for better or for worse began vigorously devoting its new cognitive capacities to the domination of the planet.

Which brings us to...

The Emergence of Language

Language as a human trait is a product of evolution, but individual languages also evolve. The English of Geoffrey Chaucer, who wrote the Canterbury Tales in the 14th century, is barely readable today without extensive footnotes. Even Shakespeare can be a struggle. But although language changes, linguists hardly ever get a chance to observe when a language emerges for the very first time.

Of course, there are various constructed languages, such as Esperanto and even Klingon, of Star Trek fame, essentially invented by a single person:

Esperanto by L.L. Zamenhof in the late 19th century.
Quenya, Gnomish, and Sindarin, three forms of Elvish devised by J.R.R. Tolkien (himself a distinguished linguist) for the Lord of the Rings trilogy.

Actually, Tolkien, a philologist at Oxford, constructed a whole family of Elvish languages by simulating the changes in form that occur naturally as languages evolve. For an account of Tolkien's work, see the essay by Carl F. Hostetter in Tolkien: Maker of Middle-Earth, ed. by Catherine McIlwaine.

Klingon by Mark Okrand (a UCB-trained linguist!) in the late 20th century.

The various "languages" used in the Star Wars films, however, aren't really constructed languages. Instead, the sound designer merely co-opted sounds from various unfamiliar languages, like Quechua.

Na'vi, for the film Avatar, by Paul Frommer, a linguist at USC, in the early 21st century.
Another UCB graduate, David Peterson, devised Dothraki and Valyrian, languages featured in Game of Thrones, the TV adaptation of George R.R. Martin's fantasy series. Peterson tells the whole story in his book, The Art of Language Invention (2015).
Peterson and his spouse, Jessie Peterson, also constructed Chakobsa, the language spoken by the Fremen in Dune, Part Two (Denis Villeneuve, 2024; the Petersons also did the language construction in Part One, beginning with the rudimentary vocabulary devised by Frank Herbert in the novel on which the films were based). There's a very nice article on their process, and language construction in general, in the New York Times ("The Invention of a Desert Tongue for 'Dune'", by Marc Tracy, 03/23/2024).

Imaginary Languages by Marina Yaguello (2022; trans. Erik Butler) traces the history of constructed languages (like Esperanto), imaginary languages, and the search for the original language (before the Tower of Babel). The Language Creation Society is a group of amateur and professional scholars who create, and study, constructed languages (one of its founders, who goes by the name of "Sai", is a UC Berkeley Cognitive Science graduate).

For more details on constructed languages, see In the Land of Invented Languages by Arika Okrent, which traces Okrand's invention of Klingon, and introduces the reader to the subculture of Trekkies who actually learn to speak Klingon to each other. See also the Encyclopedia of Fictional and Fantastic Languages by Ursula K. LeGuin, herself a famous science-fiction author.

In some respects Modern Hebrew is a reconstructed language. Even in Roman times, Jews in Palestine spoke mostly Aramaic (the language of Jesus) or Greek (the language of St. Paul). Josephus, the Jewish historian of the 1st century CE, wrote in Latin. From the time of the Roman destruction of the Second Temple in Jerusalem, in 70 CE, and well into the 19th century, Jews spoke the language of whatever country they happened to live in. Most Ashkenazi (European) Jews spoke Yiddish, not Biblical Hebrew, while Sephardic (North African) Jews spoke Ladino. Hebrew was the language of the synagogue, and of Jewish scholarship, much like Latin was used mostly in Roman Catholic church services. It was read, silently and out loud, but it wasn't used for social interactions. Modern Hebrew was deliberately formulated in the late 19th century as part of the Zionist movement -- largely at the instigation of a scholar named Eliezer Ben-Yehuda, who emigrated from Lithuania to what was then Palestine in 1881. In order to become a living language, however, Hebrew had to have names for concepts that weren't mentioned in the Hebrew Bible or Talmud (a recent count of words used in an Israeli newspaper, for example, found that only 5% were in the Hebrew Bible). Along with Modern Arabic, Modern Hebrew is the official language of the State of Israel. (For a fuller story, see "Flowers Have No Names" by Benjamin Harshav, Natural History, 02/2009.)

Irish, Welsh, and Breton (a Celtic language spoken in Brittany and some other parts of France) are other examples of "dead" languages that were revived as part of a nationalist project.

In the Genesis story, God punished humankind for trying to build the Tower of Babel, which would have reached to the heavens, by creating different languages, so that people would no longer understand each other. Esperanto was conceived explicitly as a vehicle for promoting world peace, by making it possible for everyone, everywhere, to understand each other once again. For an engaging history of Esperanto, see Bridge of Words: Esperanto and the Dream of a Universal Language (2016) by Esther Schor, reviewed by Joan Acocella in the New Yorker ("Return to Babel: The Rise and Fall of Esperanto", 10/31/2016).

Cultural contact can also lead to the development of new languages -- or, at least, something like a language.

Pidgin languages develop as a rough-and-ready means of communication between groups that have no language in common. They're not really languages, because they lack a formal set of words and an established grammar. Pidgins have no native speakers -- all speakers of a pidgin language learned their native tongue first. A classic example is Chinese Pidgin English, which developed in the 17th century as a result of contacts between English traders and Chinese natives.
Creole languages, by contrast, are real languages, which develop by combining elements of two or more other languages. They have an established phonology, syntax, and semantics, and can be learned as a native language. Like pidgins, creole languages began to emerge with contact between European explorers and colonists (and, unfortunately, slave traders) and the native peoples of Africa, Asia, and the Americas. Pidgins can become creoles if they are learned by children as a native language. Familiar examples are Hawaiian creole, which developed from Hawaiian and English, and Haitian creole, a blend of French and West African languages.
A lingua franca is another form of working language, intended to enable speakers of different languages to communicate with each other. In ancient times, Latin or Greek might have functioned as a lingua franca. These days, English often functions as a lingua franca, used by, say, French and Italians to communicate. But in the Renaissance, Mediterranean Lingua Franca was a mix of Italian, Arabic, Greek, Spanish, and other languages, used by traders throughout the Mediterranean.

Pidgins and Creoles emerged relatively recently, but there were still no linguists around to observe the process. Recently, however, linguists have had two opportunities to watch full-blown native languages emerge, almost literally, from nothing.

Nicaraguan Sign Language, known as ISN (for the Spanish, Idioma de Señas de Nicaragua), developed in the 1970s among deaf people living in western Nicaragua. Before then, they were not exposed to any language at all -- as is often the case with deaf people in underdeveloped countries. They were not spoken to, of course, but neither were they taught any sign language. However, in 1977, a number of deaf children were brought together in a special school, at which point they spontaneously developed a sign language for communicating with each other. ISN has a completely different syntax from American Sign Language -- which the students weren't taught, anyway. Steven Pinker, in The Language Instinct (probably the best book on language ever written), cites ISN as evidence that language reflects an innate cognitive faculty.
Light Warlpiri is spoken exclusively by relatively young people living in Lajamanu, an aboriginal village in the remote outback of the Northern Territory of Australia. In 1948, the villagers were relocated from another village, Yuendumu, whose population had grown too large to be sustainable. The residents of Yuendumu spoke an aboriginal language known as Warlpiri. The new village, which is very isolated even by the standards of the Northern Territory, became firmly established only in 1970, at which point the younger residents began to speak a new language among themselves. Light Warlpiri has some similarities to classic Warlpiri, but it is different enough, with respect to syntax, that it constitutes a different language. As of 2013, it was spoken exclusively by residents younger than 35, who used it to communicate among themselves. They speak Warlpiri to their elders and to aboriginals from other nearby communities, and English with outsiders. Thus, Light Warlpiri offers a once-in-forever opportunity to study the emergence of a new language while its first speakers are still alive. (See "A Village Invents a Language All Its Own" by Nicholas Bakalar, New York Times, 07/16/2013).

Linguists hardly ever get a chance to observe the formation of an alphabet, either. But in the early 19th century, Sequoyah, a leader of the Cherokee tribe of Native Americans, in Tennessee, became convinced that writing was the source of white settlers' power and success in the New World. In order to grab a share of those benefits for his own people, he devised an alphabet (technically, a syllabary) in which each of 85 symbols represented a syllable of the Cherokee language. The symbols themselves are often derived from other alphabets, such as English, Greek, and Cyrillic, but that doesn't make the Cherokee alphabet any less of an invention. And although there weren't many linguists around at the time to observe the invention, we know that, within five years of its introduction, there were more Cherokees than whites who were literate in Tennessee, and the Cherokee were publishing their own newspaper (see "Carvings from Cherokee Script's Dawn" by John Noble Wilford, New York Times, 06/23/2009).

Spoken languages arise naturally, but written languages are invented to represent them visually. Language reform in Kazakhstan provides an interesting case in point ("Kazakhstan Cheers New Alphabet, Except for All Those Apostrophes" by Andrew Higgins, New York Times, 01/16/2018). Kazakhstan is a Muslim-majority former "republic" of the Soviet Union. Its native language is Kazakh (of course), a Turkic language, and it was originally written employing Arabic script. In the 19th century, the written form switched to Latin, but after the incorporation of Kazakhstan into the Soviet Union, the written form shifted to Cyrillic as part of Stalin's effort to forge a common identity across the entire USSR (actually, there were different versions of Cyrillic for each of about 20 Turkic languages, to prevent the Turkic people of Kazakhstan, Uzbekistan, Azerbaijan, and other Muslim-majority "republics" from developing a common, non-Soviet identity). The situation changed with the collapse of the Soviet Union. Kazakhstan gained independence in 1991, and the country's authoritarian president, Nursultan Nazarbayev, instigated a shift from Cyrillic to Latin scripts to distance the new country from Russia, and bring it closer to the West. The problem is that spoken Kazakh has phonemes that are not easily represented in the letters of the Latin script. One solution is to employ diacritical marks, such as umlauts and tildes, like Turkish (obviously, another Turkic language) does; but that risks substituting Turkish influence for Russian influence; and besides, diacritical marks are hard to use on computer keyboards. Another solution is to employ digraphs and other letter combinations, like the English ch or th. This was apparently the solution favored by a committee of linguists and philologists convened by Nazarbayev to address the problem.
But Nazarbayev objected that the use of digraphs to represent Kazakh phonemes would cause confusion when Kazakhs, learning English (which is something he favors, to foster the integration of Kazakhstan with the West), had to map the same digraphs onto different phonemes. Not an autocrat for nothing, Nazarbayev decided, all on his own, that the specifically Kazakh phonemes would be represented by apostrophes. So for example, the Republic of Kazakhstan would be written as Qazaqstan Respy'bli'kasy. The linguists and philologists objected that the result was both imprecise linguistically and aesthetically ugly. At the same time, the minority Russian community, and the Russian Orthodox Church, continue to object to the abandonment of Cyrillic. All of which goes to show that language is critical to fostering a sense of national identity.

Was the First Language a Sign Language?

Where did human language come from, in phylogenetic terms? Charles Darwin speculated that human speech had its evolutionary origins in the cries of nonhuman animals -- what has come to be known as the "bow-wow" theory. More recently, Robin Dunbar (1996) suggested that language evolved as a substitute for grooming behaviors and forms of physical contact that promote the development of hierarchies and alliances in nonhuman primates. And Peter MacNeilage and Barbara Davis have proposed an "ingestive theory" that speech has its origins in the opening and closing of the mouth during chewing.

However, in discussing the origins of language, it is important to remember not only that written language is a relatively recent cultural invention, but also that speech might well not have been the first form taken by language. We already know from studies of deaf individuals that sign language has all the properties of spoken language, including syntax and semantics, as well as gestural equivalents of phonology and morphology. Because humans share common ancestors with chimpanzees and other great apes, and because chimpanzees lack the vocal apparatus necessary for spoken language (actually, this is disputed) but seem to have the (limited) ability to learn and use some aspects of sign language, it has been frequently suggested that the first language was gestural rather than vocal, involving the hands and the face, perhaps with some grunts and cries to express emotion.

The proposal gains plausibility from studies revealing the communicative power of the sign languages used in various deaf communities. Even among those with normal hearing, speech is accompanied by a tremendous amount of manual activity in the form of hand-waving and other gestures (just watch your instructor's lectures!). This is true even when people speak over the telephone, and therefore cannot see each other's gestures. Language is connected to the hands, not just to the mouth.

For more information, see M.C. Corballis, From Hand to Mouth: The Origins of Language (Princeton University Press, 2002). A precis of the book, along with peer commentary, appeared in Behavioral & Brain Sciences (2003). Corballis argues that right-handedness, which is a uniquely human characteristic, arose because oral language evolved out of earlier manual gestures.

For more on sign languages, see "Ten Things You Should Know About Sign Languages" by Karen Emmorey, Current Directions in Psychological Science, 2023. From the Abstract:

The 10 things you should know about sign languages are the following: (1) Sign languages have phonology and poetry. (2) Sign languages vary in their linguistic structure and family history but share some typological features due to their shared biology (manual production). (3) Although there are many similarities between perceiving and producing speech and sign, the biology of language can impact aspects of processing. (4) Iconicity is pervasive in sign language lexicons and can play a role in language acquisition and processing. (5) Deaf and hard-of-hearing children are at risk for language deprivation. (6) Signers gesture when signing. (7) Sign language experience enhances some visual-spatial skills. (8) The same left-hemisphere brain regions support both spoken and sign languages, but some neural regions are specific to sign language. (9) Bimodal bilinguals can code-blend, rather than code-switch, which alters the nature of language control. (10) The emergence of new sign languages reveals patterns of language creation and evolution. These discoveries reveal how language modality does and does not affect language structure, acquisition, processing, use, and representation in the brain. Sign languages provide unique insights into human language that cannot be obtained by studying spoken languages alone.

Indo-European Languages

One such evolutionary scheme has been proposed for the "Indo-European" languages -- a linguistic super-family that includes German, English, Spanish and the other Romance languages, Russian, Persian, Hindi, and Greek. According to a popular theory, the evolution of the Indo-European languages began about 6000 years ago (i.e., 4000 BCE), as nomads from the Great Eurasian Steppes domesticated the horse, invented the wheel, and rode their chariots all over Europe, the Fertile Crescent, and South Asia, spreading variants of their language. Linguistically speaking, the "family tree" of Indo-European languages has more than a dozen major branches, comprising more than 100 "daughter languages":

Anatolian (e.g., Hittite and Lydian);
Armenian and Tocharian (which is now extinct);
Celtic (e.g., Irish, Scottish, and Welsh);
Romance (Latin, Romanian, Spanish, Italian, French);
Germanic (e.g., Dutch, German, English, and Swedish);
Balto-Slavic (including Old Russian, Lithuanian, Latvian);
Slavonic (Serbo-Croatian, Bulgarian, Great Russian);
Iranian (e.g., Persian);
Indic (e.g., Sanskrit, Hindi);
Italic (including Latin, French, and Spanish); and
Greek and Albanian.

The Anatolian and Greek branches are closest to this proto-language; the Romance, Germanic, and Slavonic languages are relatively recent.

A recent study by Mark Pagel found that numbers, pronouns, and nouns change most slowly over time; adjectives and verbs, relatively rapidly.

All 87 Indo-European languages studied had similar forms for the words "I", "who", "two", "three" and "five".
Fully 80 languages had similar forms for "four", "we", "what", "how", and "where".

Note that there is no argument that Greek is more "primitive" than Danish -- just as dogs are not more "primitive" than humans. Each language is perfectly well formed and suited for human use. In formal terms, all languages are equivalent in terms of complexity, etc.

Other language groups show a similar pattern. Apparently, with the invention of agriculture (including the domestication of animals), about 10,000 years ago, isolated populations began to spread. They carried their languages with their food, and the hunter-gatherers they encountered adopted the farmers' languages as well as their sedentary ways. The graphic shows in yellow the "homelands" in which agriculture emerged, independently, after the Pleistocene, between c. 8500-2500 BCE (from "Farmers and Their Languages: The First Expansions" by Jared Diamond and Peter Bellwood, Science, 2003). As noted by Diamond, a geographer, and Bellwood, an anthropologist, agriculture had three major consequences for the societies that originally developed it:

An increase in population density supported by the higher food yields.
A sedentary as opposed to mobile lifestyle, permitting the development of technology, social stratification, and the emergence of centralized states.
The transmission of diseases such as measles and smallpox from domestic animals to humans, to which the farmers acquired a degree of immunity that their hunter-gatherer forebears lacked.

There was yet another profound consequence: The farmers' original languages began to evolve, and differentiate, as they dispersed. Diamond and Bellwood have found that many of the world's "language families" (like Indo-European) are "centered", in a sense, on these agricultural homelands. D&B make a convincing case, not least because of the convergence of five additional types of evidence: archeology, plant and animal domestication, human skeletal remains, modern and ancient DNA, and the histories of extinct languages. However, the story worldwide is a little more complicated, because the various types of evidence do not always converge. With respect to the Indo-European languages, D&B's findings support an origin among farming peoples in Anatolia, a region in modern eastern Turkey, whence it spread to the Eurasian Steppes and beyond. But D&B have differing interpretations of their own findings!

Another study employed a computer simulation to "walk back" the evolution of Indo-European (Bouckaert et al., Science 2012). These investigators also trace the origin of this language group to Anatolia, in modern eastern Turkey, but date it even earlier, to about 9,000 years ago. Bouckaert et al. agree that Indo-European spread from Anatolia with the expansion of farming, not by chariot-driving nomads, but their data also indicate that the Indo-European languages began to diversify after the agricultural expansion had finished. Therefore, agricultural expansion can't be the only reason for the expansion of languages. It's complicated.

Fortunately, this is a course in psychology, not linguistics, and we don't have to resolve this issue. It's enough, for present purposes, to have some idea of where these different languages came from, and how they dispersed and changed.

For more about Proto-Indo-European, read Proto: How One Ancient Language Went Global (2025) by Laura Spinney.

The Proto-Proto-Language?

Nor is there any implication that "Proto-Indo-European" is the ancestor of all languages. It certainly isn't. Our best guess is that the Indo-European languages go back only about 9,000 years -- and language is certainly older than that, perhaps 50,000-100,000 years old. Other language groups in Asia, Africa, and the Americas have different evolutionary histories, and different ancestors. All these proto-languages, of course, have their single common ancestor in whatever "proto-proto-language" was spoken by our common hominid ancestors in eastern and southern Africa. An analysis of the phonemes in 504 languages worldwide by Quentin Atkinson, a computational biologist, found that the further a language group is from southwestern Africa, the fewer phonemes its languages have (see his article in Science, April 2011). On this basis, Atkinson has suggested that traces of the proto-proto-language can be found in the "click" languages of the Khoisan family of languages spoken by the Bushmen of the Kalahari Desert, in southwestern Africa, which have as many as 100 phonemes (including lots of different kinds of clicks). For more on this aspect of evolution, see the material on "Psychological Development", below.

TONGUE TWISTERS: In Search of the World's Hardest Language

Reprinted from The Economist, 12/17/2009

A CERTAIN genre of books about English extols the language's supposed difficulty and idiosyncrasy. "Crazy English", by an American folk-linguist, Richard Lederer, asks "how is it that your nose can run and your feet can smell?". Bill Bryson's "Mother Tongue: English and How It Got That Way" says that "English is full of booby traps for the unwary foreigner...Imagine being a foreigner and having to learn that in English one tells A lie but THE truth."

Such books are usually harmless, if slightly fact-challenged. You tell "a" lie but "the" truth in many languages, partly because many lies exist but truth is rather more definite. It may be natural to think that your own tongue is complex and mysterious. But English is pretty simple: verbs hardly conjugate; nouns pluralise easily (just add "s", mostly) and there are no genders to remember.

English-speakers appreciate this when they try to learn other languages. A Spanish verb has six present-tense forms, and six each in the preterite, imperfect, future, conditional, subjunctive and two different past subjunctives, for a total of 48 forms. German has three genders, seemingly so random that Mark Twain wondered why "a young lady has no sex, but a turnip has". (Das Mädchen is neuter, whereas Die Steckrübe is feminine.)

English spelling may be the most idiosyncratic, although French gives it a run for the money with 13 ways to spell the sound "o": o, ot, ots, os, ocs, au, aux, aud, auds, eau, eaux, ho and ö. "Ghoti," as wordsmiths have noted, could be pronounced "fish": gh as in "cough", o as in "women" and ti as in "motion". But spelling is ancillary to a language's real complexity; English is a relatively simple language, absurdly spelled.

Perhaps the "hardest" language studied by many Anglophones is Latin. In it, all nouns are marked for case, an ending that tells what function the word has in a sentence (subject, direct object, possessive and so on). There are six cases, and five different patterns for declining verbs into them. This system, and its many exceptions, made for years of classroom torture for many children. But it also gives Latin a flexibility of word order. If the subject is marked as a subject with an ending, it need not come at the beginning of a sentence. This ability made many scholars of bygone days admire Latin's majesty--and admire themselves for mastering it. Knowing Latin (and Greek, which presents similar problems) was long the sign of an educated person.

Yet are Latin and Greek truly hard? These two genetic cousins of English, in the Indo-European language family, are child's play compared with some. Languages tend to get "harder" the farther one moves from English and its relatives. Assessing how languages are tricky for English-speakers gives a guide to how the world's languages differ overall.

Even before learning a word, the foreigner is struck by how differently languages can sound. The uvular r's of French and the fricative, glottal ch's of German (and Scots) are essential to one's imagination of these languages and their speakers. But sound systems get a lot more difficult than that. Vowels, for example, go far beyond a, e, i, o and u, and sometimes y. Those represent more than five or six sounds in English (consider the a's in father, fate and fat.) And vowels of European languages vary more widely; think of the umlauted ones of German, or the nasal ones of French, Portuguese and Polish.

Yet much more exotic vowels exist, for example that carry tones: pitch that rises, falls, dips, stays low or high, and so on. Mandarin, the biggest language in the Chinese family, has four tones, so that what sounds just like "ma" in English has four distinct sounds, and meanings. That is relatively simple compared with other Chinese varieties. Cantonese has six tones, and Min Chinese dialects seven or eight. One tone can also affect neighbouring tones' pronunciation through a series of complex rules.

Consonants are more complex. Some (p, t, k, m and n are common) appear in most languages, but consonants can come in a blizzard of varieties known as egressive (air coming from the nose or mouth), ingressive (air coming back in the nose and mouth), ejective (air expelled from the mouth while the breath is blocked by the glottis), pharyngealised (the pharynx constricted), palatised (the tongue raised toward the palate) and more. And languages with hard-to-pronounce consonants cluster in families. Languages in East Asia tend to have tonal vowels, those of the north-eastern Caucasus are known for consonantal complexity: Ubykh has 78 consonant sounds. Austronesian languages, by contrast, may have the simplest sounds of any language family.

Perhaps the most exotic sounds are clicks--technically "non-pulmonic" consonants that do not use the airstream from the lungs for their articulation. The best-known click languages are in southern Africa. Xhosa, widely spoken in South Africa, is known for its clicks. The first sound of the language's name is similar to the click that English-speakers use to urge on a horse.

For sound complexity, one language stands out. !Xoo, spoken by just a few thousand, mostly in Botswana, has a blistering array of unusual sounds. Its vowels include plain, pharyngealised, strident and breathy, and they carry four tones. It has five basic clicks and 17 accompanying ones. The leading expert on the !Xoo, Tony Traill, developed a lump on his larynx from learning to make their sounds. Further research showed that adult !Xoo-speakers had the same lump (children had not developed it yet).

Beyond sound comes the problem of grammar. On this score, some European languages are far harder than are, say, Latin or Greek. Latin's six cases cower in comparison with Estonian's 14, which include inessive, elative, adessive, abessive, and the system is riddled with irregularities and exceptions. Estonian's cousins in the Finno-Ugric language group do much the same. Slavic languages force speakers, when talking about the past, to say whether an action was completed or not. Linguists call this "aspect", and English has it too, for example in the distinction between "I go" and "I am going." And to say "go" requires different Slavic verbs for going by foot, car, plane, boat or other conveyance. For Russians or Poles, the journey does matter more than the destination.

Beyond Europe things grow more complicated. Take gender. Twain's joke about German gender shows that in most languages it often has little to do with physical sex. "Gender" is related to "genre", and means merely a group of nouns lumped together for grammatical purposes. Linguists talk instead of "noun classes", which may have to do with shape or size, or whether the noun is animate, but often rules are hard to see. George Lakoff, a linguist, memorably described a noun class of Dyirbal (spoken in north-eastern Australia) as including "women, fire and dangerous things". To the extent that genders are idiosyncratic, they are hard to learn. Bora, spoken in Peru, has more than 350 of them.

Agglutinating languages--that pack many bits of meaning into single words--are a source of fascination for those who do not speak them. Linguists call a single unit of meaning, whether "tree" or "un-", a morpheme, and some languages bind them together obligatorily. The English curiosity "antidisestablishmentarianism" has seven morphemes ("anti", "dis", "establish", "-ment", "-ari", "-an" and "-ism"). This is unusual in English, whereas it is common in languages such as Turkish. Turks coin fanciful phrases such as "CEKOSLOVAKYALILASTIRAMADIKLARIMIZDANMISSINIZ?", meaning "Were you one of those people whom we could not make into a Czechoslovakian?" But Ilker Ayturk, a linguist, offers a real-life example: "EVLERINDEMISCESINE RAHATTILAR". Assuming you have just had guests who made a mess, these two words mean "They were as carefree as if they were in their own house."

YES WE (BUT NOT YOU) CAN

This proliferation of cases, genders and agglutination, however, represents a multiplication of phenomena that are known in European languages. A truly boggling language is one that requires English speakers to think about things they otherwise ignore entirely. Take "we". In Kwaio, spoken in the Solomon Islands, "we" has two forms: "me and you" and "me and someone else (but not you)". And Kwaio has not just singular and plural, but dual and paucal too. While English gets by with just "we", Kwaio has "we two", "we few" and "we many". Each of these has two forms, one inclusive ("we including you") and one exclusive. It is not hard to imagine social situations that would be more awkward if you were forced to make this distinction explicit.

Berik, a language of New Guinea, also requires words to encode information that no English speaker considers. Verbs have endings, often obligatory, that tell what time of day something happened; TELBENER means "[he] drinks in the evening". Where verbs take objects, an ending will tell their size: KITOBANA means "gives three large objects to a man in the sunlight." Some verb-endings even say where the action of the verb takes place relative to the speaker: GWERANTENA means "to place a large object in a low place nearby". Chindali, a Bantu language, has a similar feature. One cannot say simply that something happened; the verb ending shows whether it happened just now, earlier today, yesterday or before yesterday. The future tense works in the same way.

A fierce debate exists in linguistics between those, such as Noam Chomsky, who think that all languages function roughly the same way in the brain and those who do not. The latter view was propounded by Benjamin Lee Whorf, an American linguist of the early 20th century, who argued that different languages condition or constrain the mind's habits of thought.

Whorfianism has been criticised for years, but it has been making a comeback. Lera Boroditsky of Stanford University, for example, points to the Kuuk Thaayorre, aboriginals of northern Australia who have no words for "left" or "right", using instead absolute directions such as "north" and "south-east" (as in "You have an ant on your south-west leg"). Ms Boroditsky says that any Kuuk Thaayorre child knows which way is south-east at any given time, whereas a roomful of Stanford professors, if asked to point south-east quickly, do little better than chance. The standard Kuuk Thaayorre greeting is "where are you going?", with an answer being something like "north-north-east, in the middle distance." Not knowing which direction is which, Ms Boroditsky notes, a Westerner could not get past "hello". Universalists retort that such neo-Whorfians are finding trivial surface features of language: the claim that language truly constricts thinking is still not proven.

With all that in mind, which is the hardest language? On balance THE ECONOMIST would go for Tuyuca, of the eastern Amazon. It has a sound system with simple consonants and a few nasal vowels, so is not as hard to speak as Ubykh or !Xoo. Like Turkish, it is heavily agglutinating, so that one word, HOABASIRIGA, means "I do not know how to write." Like Kwaio, it has two words for "we", inclusive and exclusive. The noun classes (genders) in Tuyuca's language family (including close relatives) have been estimated at between 50 and 140. Some are rare, such as "bark that does not cling closely to a tree", which can be extended to things such as baggy trousers, or wet plywood that has begun to peel apart.

Most fascinating is a feature that would make any journalist tremble. Tuyuca requires verb-endings on statements to show how the speaker knows something. DIGA APE-WI means that "the boy played soccer (I know because I saw him)", while DIGA APE-HIYI means "the boy played soccer (I assume)". English can provide such information, but for Tuyuca that is an obligatory ending on the verb. Evidential languages force speakers to think hard about how they learned what they say they know.

Linguists ask precisely how language works in the brain, and examples such as Tuyuca's evidentiality are their raw material. More may be found, as only a few hundred of the world's 6,000 languages have been extensively mapped, and new ways will appear for them to be difficult. Yet many are spoken by mere hundreds of people. Fewer than 1,000 people speak Tuyuca. Ubykh died in 1992. Half of today's languages may be gone in a century. Linguists are racing to learn what they can before the forces of modernisation and globalisation quieten the strangest tongues.


A Language Gene?

Because human beings have language, but chimpanzees, our closest evolutionary relatives, do not, it has been suggested (by Chomsky and others) that the capacity for language -- in Chomsky's terms, the capacity for Universal Grammar -- is somehow encoded in our genes as part of our phylogenetic heritage. Although many geneticists argue that there aren't really genes "for" particular traits, most work in behavioral genetics is predicated on the view that there are, in fact, particular genes (or sets of genes) that control various behavioral and mental characteristics, just as they control various physical characteristics -- hence, there has been considerable interest in the search for the "language gene".

Such a gene may well exist after all. In 2001, Dr. Anthony P. Monaco, a molecular biologist at Oxford University, and his colleagues reported an analysis of a large English family almost half of whom cannot produce articulate speech. Of course, speech isn't the same as language, but the affected family members also have difficulty with written language, suggesting that the problem is more with language than with speech itself. The Monaco group reported that defects in a single gene, known as FOXP2, cause this speech/language defect. This gene turns on other genes during the development of the brain, and it (or the genes it controls) plays a particularly important role in engineering the brain for rapid motor articulation, such as is necessary for speech.

FOXP2 is present in mice, chimpanzees, and other primates, and is universal in humans, but humans have a very different version of the gene from that found in chimpanzees (who also, it should be noted, and perhaps not coincidentally, lack the motor apparatus needed for articulate speech). In 2002, an analysis by Svante Pääbo, at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, showed that the human version of FOXP2 emerged no more than about 120,000 years ago. According to Pääbo's theory, the new gene tied the cognitive capacity for language to the motor capacity for articulate speech, essentially "perfecting" the language capacity of modern humans. In turn, Dr. Richard Klein, an archeologist at Stanford University, has proposed that the resulting ability for linguistic communication led to the emergence of modern human social behaviors, such as art, ornamentation, and long-distance trade, about 50,000 years ago. It's been said that we share about 98.5% of our genes with chimpanzees. That 1.5% makes a big difference, and perhaps our particular version of FOXP2 is a big part of that difference.

Language and Thought

Language is a powerful tool for communication -- the most efficient and effective means we have for communicating our experiences and thoughts to other people. But language is also a powerful tool for thinking itself. Our ability to represent our thoughts in words, and to manipulate those thoughts through grammatical syntax, gives us an unparalleled ability to combine and recombine our thoughts -- to reflect on them and explore their implications.

Language is such a powerful tool for thought that it has been suggested that, in addition to being a "container" for our thoughts, language shapes our thoughts, constraining what we can and cannot think about. Actually, there are several different views that we can take on the relation between language and thought.

Perspectives from Development

According to the Swiss developmental psychologist Jean Piaget, language depends on thought. As we'll discuss later in the lectures on Psychological Development, Piaget argued that the child's mind develops through a series of stages, and that the child's language will reflect the stage of cognitive development that the child is in. At first, children are focused on the concrete present, the here and now, and lack any capacity for symbolic representation. And because they lack any capacity for symbolic representation, they can't use language. Children can use a word like dog only when they understand that the word dog refers to, and symbolically represents, that four-footed, furry, barking, tail-wagging creature over there. And not just to that particular creature, but to all sorts of other creatures that look and sound and behave like that. And children can talk about that dog, or any dog, or dogs in general, only when they have the cognitive capacity to hold in mind both the concept of dog and the other words that relate to the dog -- like barking, or happy, or hungry.

Piaget believed that children can think before they can speak, and that their ability to think constrains their ability to use language. One problem with this idea is that children obviously understand a lot of what is said to them, before they can ever speak. This is part of what is known as the competence-performance distinction, and it means that children have a lot more linguistic knowledge than they display in their speech.

According to the Russian (and Soviet) developmental psychologist Lev Vygotsky, thought and language emerge independently, but then merge into a symbiotic relationship. Like Piaget, Vygotsky asserted that thought does not depend on language, and that children can think before they can speak.

Young children engage in pre-linguistic thought based on percepts, images, and actions. And they also have pre-intellectual language which arises from their interactions with parents and other people, rather than their own independent thought.
Later, after they have developed some capacity for language, children "think out loud", saying whatever they are thinking -- what Vygotsky called egocentric speech.
Still later, they acquire the ability to engage in silent inner speech, and use spoken language only for social purposes.
There is also private speech, in which people talk to themselves out loud.

Piaget argued that private speech was a reflection of the young child's egocentrism (a topic discussed later in the lectures on "Psychological Development"). Egocentric children cannot take another person's perspective, and therefore are unable to communicate effectively with others -- so they talk to themselves instead. Vygotsky, for his part, thought that private speech is actually practice for talking with other people. Both private and inner speech help us to plan our behavior ("I've got to remember to get butter, eggs, and milk at the grocery store"), monitor our progress ("First I multiply by 2, then divide the product by 3"), and regulate our emotions ("I've got to stop thinking about her"). In addition, inner speech may play an important role in creative problem-solving.

Inner speech may take the form of a monologue or a dialogue. Brain-imaging studies indicate that solving arithmetic problems (obviously a form of thinking) activates different brain centers than those involved in language processing. But this may only apply to abstract arithmetic. Once you're trying to solve a word problem ("Points A and B are 40 miles apart, and joined by a single rail line; Train X travels from A to B at a rate of 20 miles per hour; Train Y travels from B to A at 25 miles per hour; assuming that they leave at the same time, at what point will they crash into each other?"), of course, the two centers have to work together. In fact, inner speech activates the same brain centers, such as Broca's and Wernicke's areas, as does speaking out loud. Dialogic inner speech, in addition, activates brain centers associated with the "theory of mind" (also discussed later in the lectures on "Psychological Development"), as if one were trying to understand, and make oneself understood by, another person.

For more on inner speech, see "Talking to Ourselves" by Charles Fernyhough, Scientific American, 08/2017.

The Sapir-Whorf Hypothesis

Language is a powerful medium for communicating our thoughts to each other – for conveying thoughts from one mind, the speaker's, to another, the listener's. And language is also a powerful tool for thinking. Many of our concepts are labeled by single words, and linguistic syntax allows us to combine familiar concepts in novel ways. Language is creative, in that it allows us to think, and express, thoughts that have never been thought before.

But some theorists have gone beyond this idea, to argue that the language we use constrains the thoughts we can think about objects and events. The origin of this idea lies with Edward Sapir, a linguist who was a student of Franz Boas, himself generally regarded as the "father" of American anthropology; and especially Benjamin Lee Whorf, a chemical engineer who took up an interest in language and studied with Sapir. In one of his papers, Boas had noted that the "Eskimo" language contained four quite different words for snow. Whorf and Sapir cited this example to suggest that these words encoded quite different concepts, and that if the language were lost, these concepts would be lost as well. In other words, the language we use constrains the kinds of thoughts we can have.

The Sapir-Whorf hypothesis has a distinguished intellectual pedigree, but it's been controversial right from the start. The example of "Eskimo words for snow" has been particularly problematic. Over the years, the number of Eskimo snow-words has been inflated to 50 or even 100. More important, Boas actually cited those four words to make another linguistic point entirely. And neither Sapir nor Whorf ever showed that Eskimos actually thought about snow any differently from, say, English-speaking cross-country and downhill skiers. (Cartoon by Joe Dator, New Yorker, 09/29/2014.)

Nevertheless, the Sapir-Whorf hypothesis has captured the attention of many social scientists. Its strong form is known as linguistic determinism: that basic cognitive processes differ depending on the language one uses. A weaker form says merely that there are parallels between the structure of a language and the way that speakers of that language think.

Here's a famous example of the Sapir-Whorf hypothesis, studied by Lera Boroditsky and Alice Gaby. There's an aboriginal group called the Pormpuraaw, who live in the Cape York region in North-Eastern Australia and speak a language called Kuuk Thaayorre. These people have an unusual way of referring to direction. In English, we would use such terms as "left" and "right", "front" and "back". But in Kuuk Thaayorre, the Pormpuraaw refer to the cardinal directions of "north", "south", "east", and "west". Where an English speaker would say "Put the fork to the left of the plate", the Pormpuraaw would say "Put the fork to the west of the plate" – but only if they were actually facing north. This feature of language obviously affects how the Pormpuraaw locate themselves and others in space, but it also affects their conception of time. English speakers, given a series of cartoon panels depicting a story, will typically arrange them in temporal sequence going from left to right – which is how we write. Hebrew is written from right to left, though, and Hebrew speakers will arrange the panels from right to left as well. But the Pormpuraaw arrange the panels from east to west: that is, from left to right, if they happen to be facing south, but from right to left if they happen to be facing north. Apparently, how they speak about space affects how they think about time.
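
The relation between body-relative terms ("left", "right") and compass terms ("east", "west") in this example can be made concrete with a small sketch. It is only an illustration, not anything taken from Boroditsky and Gaby's procedure: the function names (to_cardinal, to_relative, east_to_west) are hypothetical, and the restriction to the four main compass points is an assumption made to keep the example short.

```python
# Illustrative sketch only; all names here are hypothetical.
COMPASS = ["north", "east", "south", "west"]  # clockwise order

# Number of clockwise quarter-turns from the direction one is facing.
TURNS = {"front": 0, "right": 1, "back": 2, "left": 3}


def to_cardinal(facing, relative):
    """Translate a body-relative term into a compass direction."""
    start = COMPASS.index(facing)
    return COMPASS[(start + TURNS[relative]) % 4]


def to_relative(facing, cardinal):
    """Translate a compass direction into a body-relative term."""
    quarter_turns = (COMPASS.index(cardinal) - COMPASS.index(facing)) % 4
    by_turns = {turns: term for term, turns in TURNS.items()}
    return by_turns[quarter_turns]


def east_to_west(facing):
    """Describe an east-to-west layout from the arranger's point of view."""
    return f"{to_relative(facing, 'east')} to {to_relative(facing, 'west')}"
```

For a speaker facing north, to_cardinal("north", "left") gives "west", which matches the fork example. Likewise, east_to_west("south") gives "left to right" and east_to_west("north") gives "right to left", the two arrangements described above.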

Boroditsky and other proponents of the Sapir-Whorf hypothesis have collected many examples like this, and they conclude that Sapir and Whorf had it right. The diversity of environments in which humans live has created the diversity of language, and this diversity of language has in turn created diversity of thought. "Each [language] provides its own cognitive toolkit and encapsulates the knowledge and worldview developed over thousands of years within a culture. Each contains a way of perceiving, categorizing, and making meaning in the world…".

Still, there are lots of counterexamples. One famous set of studies focused on color terms. Cross-cultural research by Brent Berlin and Paul Kay had revealed a consistent pattern of color terms across the diversity of languages. If a language had only two color terms, they corresponded to black and white – or, perhaps, light and dark, or warm and cool. If it had three color terms, the third term was red. If it had a fourth term, that was either green or yellow; if it had a fifth term, it was the other one, yellow or green. Then blue was added, then brown, and so on. Eleanor Rosch and David Olivier worked with the Dani, a tribe in New Guinea whose language has only two color terms – mili for "dark" and "cold" colors and mola for "light" and "warm" colors. When asked to name color patches, the Dani obviously performed differently from English-speaking college students. But when asked to match color patches from memory, the Dani performed the same way the English speakers did, yielding highly similar "color spaces". In other words, the Dani perceived and remembered colors the same way that English speakers did.

Boroditsky and others continue to find examples where language seems to influence thought, but the first thing to be said is that thinking doesn't require language at all. Language makes thinking easier, perhaps, and more powerful, but there are lots of examples of thinking in animals, who don't have anything like human language.

Rats and pigeons form expectations concerning prediction and control during classical and instrumental conditioning.
Pigeons have been found to form natural concepts.
Rhesus monkeys are curious about their world.
Wolfgang Kohler, a German psychologist, found that apes could show "insight" in problem-solving. For example, in order to obtain a banana that had been suspended out of reach, they would figure out how to stack boxes on top of each other, and then use a pole to knock down the fruit.
And, for that matter, human infants engage in a great deal of learning and problem-solving before they've acquired any language at all.

The strong form of the Sapir-Whorf hypothesis, linguistic determinism, is certainly wrong. Thought isn't a mirror of language, but language is a mirror of thought. All human languages have certain basic principles in common, and these similarities outweigh any differences. And the weaker form, linguistic relativity, probably isn't quite right either. There are parallels between thought and language, but it could easily be that cultural patterns of thinking lead people to use language the way they do. What there is evidence for is what Daniel Slobin and others have called "thinking for speaking". You can think any thought, and express that thought in any language, but the structure of the language you use forces you to express your thoughts in a particular way. So you have to think about certain things before you speak; and what you say will nudge listeners toward a particular interpretation of what you mean.

One example is grammatical gender. Many languages, though not English, classify nouns by grammatical gender – masculine or feminine, and sometimes neuter. If I want to talk to someone about my friend Pat, I can just refer to "my friend Pat", leaving it ambiguous whether Pat is male or female. But if I want to express the same thought in Spanish, I have to say what Pat's gender is: Mi amigo Pat clearly communicates that Pat is a man; in German, Meine Freundin Pat clearly communicates that she is a woman. I can't leave it ambiguous. The syntax of Spanish and German forces me to think about gender in a way that English does not.

English lacks an "epicene" or gender-neutral pronoun that applies generically to everyone, regardless of gender. In the past, the practice was to employ a generic he, him, his to refer to people in general, not just males. But, of course, referring to people in general as if they were masculine may lead people to think of the masculine case as the default option, effectively relegating females to a sort of second-class linguistic citizenship. And it doesn't really work anyway. Consider Steve, Sally, Mary, and Jane each had his hair cut today (this example from "Everybody Has Their Opinion" by Johnson, the nom de plume of the language columnist in The Economist, 04/01/2017).

The common fix, referring to "he or she", "him or her", "his or her", is more balanced, but also makes for clumsy writing. Consider: Whenever someone goes out in the rain, he or she should carry his or her umbrella in case he or she wants to cover his or her head.

Various solutions have been proposed, perhaps the best of which is simply to reword the offending sentence in the plural: Whenever people go out in the rain, they should carry their umbrellas in case they want to cover their heads. However, everyday usage often leads to a grammatically incorrect mixing of singular and plural: Whenever someone goes out in the rain, they should carry an umbrella in case they want to cover their head. This is, as I say, strictly illegal. But in a demonstration of how language evolves, what has been descriptively true -- that, in ordinary usage, people who want to speak in gender-neutral terms often substitute they for he or she -- has now begun to become prescriptively proper. In 2017, as "Johnson" reported, the editors of both the Chicago Manual of Style and the Associated Press Stylebook changed their rules to allow the plural pronoun they to be used as a gender-neutral singular pronoun when to stick with the rules would produce awkward or clumsy writing.

French has a similar problem (see "Gender Bender", also by Johnson, in The Economist, 04/15/2017). Although "gender" in French has more to do with "genre" (kind) than sex, as it happens, many titles of powerful people -- General, Senator, Minister, Head of State -- all "happen" (ahem) to be masculine. That includes le president, which would become a problem if a leading candidate in 2017, Marine Le Pen, won the presidential election. You can't just change the word a little, because historically la presidente has meant something like "First Lady"; never mind that the French Academy, the official guardian of the language for about the last 400 years, frowns on creating grammatically feminine versions of words that are grammatically masculine. Nevertheless, the French National Assembly has come out in favor of la presidente.

Here's another example: different cultures have different kinship categories. I have two siblings, and in English I simply refer to them as "my brother" or "my sister" – or, if I want to keep things ambiguous, I'll just refer to "my siblings". But if I'm speaking Hopi, the language of a Native American tribe indigenous to the southwestern United States, I have to identify Jean as my older sister and Don as my older brother; but Don and Jean could use the same word to refer to me, their younger brother. But if I had a younger sister, they would each use different words to refer to her.

If I want to talk about my friend Pat, I have to do so differently in Spanish than I would in English. But that doesn't necessarily mean that I have to think about Pat any differently. I only have to think about how I'm going to talk about Pat in Spanish. Any thought can be expressed in any language. Language doesn't constrain thought. As a powerful tool for thinking, language makes thinking easier. And it's an equally powerful tool for communication, providing us with a more powerful and flexible vehicle for communicating our ideas and experiences than any other animal has. Across cultures, languages are more alike than they are different, and so are human patterns of thinking.

Most famously, the American linguist Edward Sapir and his student, Benjamin Lee Whorf, argued on the basis of cross-cultural studies that language affects the way we think. In a famous -- actually notorious -- example, the Eskimos have 4 different words for snow. As Franz Boas, Sapir's teacher, put it in 1911:

...just as English uses derived terms for a variety of forms of water (liquid, lake, river, brook, rain, dew, wave, foam) that might be formed by derivational morphology from a single root meaning 'water' in some other language, so Eskimo uses the apparently distinct roots aput 'snow on the ground', qana 'falling snow', piqsirpoq 'drifting snow', and qimuqsuq 'a snow drift'.

For Sapir and Whorf, the implication was that Eskimos have 4 different ways of thinking about snow -- more ways of thinking about snow than you'd have with another language, with fewer such words.

As Whorf put it:

We cut nature up, organize it into concepts, and ascribe significances as we do, largely because we are parties to an agreement to organize it in this way -- an agreement that holds throughout our speech community and is codified in the patterns of our language. The agreement is, of course, an implicit and unstated one, BUT ITS TERMS ARE ABSOLUTELY OBLIGATORY; we cannot talk at all except by subscribing to the organization and classification of data which the agreement decrees (Whorf (1940/1956), pp. 213-214).
The important distinction between HABITUAL and POTENTIAL behavior enters here. The potential range of perception and thought is probably pretty much the same for all men. However, we would be immobilized if we tried to notice, report, and think of all possible discriminations in experience at each moment of our lives. Most of the time we rely on the discriminations to which our language is geared, on what Sapir termed "grooves of habitual expression" (Whorf, 1956, p. 117).

Why is this example notorious?

In the first place, because Eskimo is a rather imprecise, somewhat derogatory term that obscures the differences between three very different linguistic-cultural groups -- the Yupiks, the Inuits, and the Aleuts. There isn't any "Eskimo" language. It's a little like referring to all Muslims as "Arabs".
In the second place, since the time of Boas, Sapir, and Whorf the number of "Eskimo" words for snow has been multiplied. I have seen someone claim that they had 50 or even 100. But neither Boas, nor Sapir, nor Whorf ever said that Eskimos had all that many words for snow. Boas, who was making a different linguistic point entirely, listed a mere four. The highest that Whorf ever got in his count, made famous in a 1940 paper, was just 7 (Sapir apparently stayed out of it). Anthony Woodbury (1991), going through a Yupik dictionary, got as high as 15. A dictionary of Greenlandic "Eskimo" lists just two, representing "snow in the air" and "snow on the ground".
In the third place, English also has a multitude of words for snow -- consider the most frequent snow-related words: snow, hail, sleet, ice, icicle, slush, and snowflake. That's seven right there. Ask any skier, downhill or cross-country, and you'll get a lot more (powder, crust, slush, etc.). Woodbury found as many as 22 "snow lexemes" (not exactly words, but close) in English, depending on how you count.
In a recent contribution to the debate, Kemp and Carstensen (PLoS One, 2016) examined words for snow and ice in languages spoken in cold- and warm-weather climates. Languages spoken in warm climates tend to use the same word for the two concepts (for example, the Hawaiian word hau means both "snow" and "ice"). By contrast, languages spoken in cold climates tend to distinguish between them. But far from showing that language determines thought, the authors argue the reverse: thinking -- more precisely, the need to communicate clearly -- determines language. (See "New Twist in Old Trope about Eskimo Words for Snow" by Yasmin Anwar, UC Press Release, 04/13/2016.)

If Eskimos had more words for snow, this might reflect nothing more than expertise -- in which case, if you think about it, thought is determining language.

For more on Eskimo words for snow, see:

Martin, L. (1986). "Eskimo words for snow": A case study in the genesis and decay of an anthropological example. American Anthropologist, 88, 418-423.

Murray, S.O. (1987). Snowing canonical texts. American Anthropologist, 89, 443-444.

Pullum, G. K. (1989). The great Eskimo vocabulary hoax. Natural Language & Linguistic Theory, 7, 275-281.

The idea of linguistic determinism led to a heated debate among linguists. For example, Noam Chomsky, who has argued that language is a tool for thought, has nonetheless emphasized the universality of language. All languages possess the same basic properties of meaning, reference, structure, creativity, and the like; all languages are equally complex; and all languages are equally easy to learn (as a first language, in spoken form). So it would be strange if there were some thoughts that could be expressed in one language but not another. There is actually a geographical divide among language scholars: by and large, East-coast linguists, heavily influenced by Chomsky at MIT, tend to be opposed to the Sapir-Whorf hypothesis; Bay Area linguists, heavily influenced by Grice and George Lakoff at UC Berkeley, tend to favor it.

Still, Whorf's 1940 paper popularized the notion that our thought processes are determined, and constrained, by language. The Sapir-Whorf hypothesis takes two forms: that language determines thought or that language influences thought. The former is a much stronger view because it states that one is incapable of understanding a concept for which the language has no name (it also implies that there is no thought without language). There is no empirical evidence supporting the strong version and considerable evidence that thought can proceed without benefit of language. However, the weak version plausibly suggests that different languages can 'carve up' the world in different ways -- or, put another way, that conceptual thinking can be shaped and constrained by available linguistic categories. As Whorf put it:

'We cut nature up, organize it into concepts, and ascribe significances as we do, largely because we are parties to an agreement to organize it in this way -- an agreement that holds throughout our speech community and is codified in the patterns of our language'.

There are actually two aspects to the Sapir-Whorf hypothesis: linguistic relativity and linguistic determinism.

Determinism claims that our cognitive processes are determined by the differences that are found in languages. Put bluntly, if a concept is not represented by a word in our language, we are unable to think about that concept.
Relativity refers to the claim that speakers are required to pay attention to different aspects of the world that are grammatically marked (e.g. shape classifiers in Japanese or verb tenses to indicate time). The absence of a corresponding word in our language does not prevent us from thinking about some concept, but it does make it difficult for us to talk about it.

The Case of Color

Color perception permits a classic test of the Sapir-Whorf hypothesis, because languages differ in terms of the variety of words they have for colors. English, for example, has 11 basic color terms (not to be confused with the primary colors), while other languages have only two color terms, roughly corresponding to dark and light. Berlin and Kay (1969), two linguists at UCB, have argued that basic color terms evolve according to a definite sequence:

If a language has only two color terms, they correspond to dark-light or cool-warm.
If a language has a third color term, it is red.
If a language has a fourth color term, it is either green or yellow.
If a language has a fifth color term, it is the other one, yellow or green.
If a language has a sixth color term, it is blue.
If a language has a seventh color term, it is brown.
If a language has additional color terms, they are purple, pink, orange, or grey.
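
The sequence above is an ordering rule, so it can be written out as a short sketch. This is only one way to encode the list as stated here, not a model taken from Berlin and Kay: the function name possible_inventories, the labels "dark/cool" and "light/warm", and the assumption that the four late terms may be added in any order are all choices made for illustration.

```python
from itertools import combinations

# Hypothetical labels for the two-term stage described in the list above.
BASE = ["dark/cool", "light/warm"]
# Terms whose order of appearance is fixed once both green and yellow exist.
FIXED = BASE + ["red", "green", "yellow", "blue", "brown"]
# Late terms; assumed here to be addable in any order.
LATE = ["purple", "pink", "orange", "grey"]


def possible_inventories(n):
    """Return the color-term inventories the sequence allows for n terms."""
    if not 2 <= n <= len(FIXED) + len(LATE):
        raise ValueError("n must be between 2 and 11")
    if n == 4:
        # The fourth term may be either green or yellow.
        return [frozenset(BASE + ["red", c]) for c in ("green", "yellow")]
    if n <= len(FIXED):
        # For n = 2, 3, 5, 6, 7 the list leaves only one possibility.
        return [frozenset(FIXED[:n])]
    # Beyond seven terms, add any combination of the late terms.
    extra = n - len(FIXED)
    return [frozenset(FIXED + list(c)) for c in combinations(LATE, extra)]
```

On this encoding a two-term language such as Dani has exactly one possible inventory, a four-term language has two (one with green, one with yellow), and a language with all eleven terms, like English, again has only one.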

In a classic test of the Sapir-Whorf hypothesis, UCB's Eleanor Rosch (1972; publishing under the name Eleanor Rosch Heider) investigated color naming by the Dani, a tribe in New Guinea whose language has only two color terms: mili (dark) and mola (light). Rosch then asked the Dani to discriminate between various colors, such as green and blue, or red and orange, that were matched for saturation and brightness. She also asked them to remember which colors they had seen. The Dani could perform both tasks well, showing clearly that the unavailability in their language of words representing blue, green, red, or orange did not impair their ability to discriminate and remember colors.

On the other hand, Kay and Kempton (1984) compared speakers of English with speakers of Tarahumara, a Uto-Aztecan language of Mexico that does not have separate color terms for blue and green. In the first experiment, the subjects were presented with a blue color chip, a green color chip, and another color chip that was intermediate between blue and green. English speakers sharply categorized the intermediate color chip as either blue or green by using a naming strategy, whereas the Tarahumara speakers chose randomly. In the second experiment, English speakers were first presented with two color chips and shown that one (intermediate) was greener than the other color chip (blue) and then shown that the same intermediate chip was bluer than the other color chip (green). By making the subjects call the intermediate color chip both green and blue, the bias that was demonstrated in the first experiment went away and the English speakers performed similarly to the Tarahumara speakers. This might count as evidence in favor of the weak hypothesis of linguistic relativity.

Odor-Naming on the Malay Peninsula

The Sapir-Whorf hypothesis is also undermined by more recent cross-cultural work on odor-identification by Asifa Majid and her colleagues at the University of Nijmegen in The Netherlands, who have studied the odor lexicons of various Aslian languages spoken by indigenous tribes on the Malay Peninsula (e.g., Majid & Kruspe, Current Biology, 2018).

English-speaking subjects are pretty good at naming colors, a property which linguists call codability. But we're very bad at naming odors. The traditional explanation for this difference is that human evolution has favored vision over olfaction. For example, primates have much smaller olfactory bulbs than other mammals. But it's also true that English has dedicated abstract terms for many colors, like red, yellow, green, and blue; by contrast, most odor terms in English are source-based, like lemony and medicinal. About the closest that English gets to a monolexemic, abstract odor-word that doesn't refer to something else is musty -- and even that word means "smells like must". Think of Henning's smell prism, discussed earlier, which contains "basic" terms like resinous (smells like resin) and burned (smells like something that's burned). But that's not true for every language.
The Jahai and the Maniq, indigenous tribes of hunter-gatherers living on the Malay Peninsula, have no difficulty naming odors. They perform as well as English speakers when asked to name colors, and much better than English speakers when asked to name odors. Interestingly, their languages have about a dozen monolexemic, abstract words for odors.

This pretty much disproves the hypothesis that the difficulty that English speakers have in naming odors is due to the impoverished nature of the human olfactory apparatus. In fact, at first glance, this looks like a Whorfian result -- that is, the Jahai have words for odors, and this feature of language affects their ability to perceive odors. Of course, this is contrary to the findings of Heider, who found that differences in color vocabulary had no effect on color perception. It might also have to do with the fact that the Jahai and Maniq are hunter-gatherers, who might have evolved a distinctive olfactory apparatus which enables them to survive in a jungle that presents them with lots of different smells. This would be a "contra-Whorfian" hypothesis, that differences in sensory acuity led to differences in language. But either way, you'd be wrong.

Consider two other indigenous Malay people, the Semaq Beri and the Semelai. These tribes speak related Aslian languages, both of which have a vocabulary of basic odor terms similar to those of the Jahai and the Maniq. But they have different lifestyles. The Semaq Beri are hunter-gatherers like the Jahai and Maniq, while the Semelai are sedentary farmers. It turns out that the Semaq Beri, like the Jahai, are as good at naming odors as they are at naming colors; but the Semelai are much worse -- despite having a language with lots of basic odor terms.

So this is definitely a contra-Whorfian result. The Semelai have a language (which presumably arose at a time when they, too, were hunter-gatherers) with lots of odor-terms, but that doesn't help them name odors. Instead, the differences in odor-naming seem to be cultural in origin, having to do with the different local environments ("human-space" vs. "forest-space") in which these tribes-people live.

The point of all this is to suggest that English-speakers may not be the best subjects to use for studies of sensory qualia outside of vision and audition. For hunter-gatherers like the Jahai and the Semaq Beri, odors are codable in abstract, monolexemic terms, as Cutting suggests. But for members of developed Western societies, we've somehow lost that code, and have to make do with the source-based labels that are left to us -- and these may not match up to the actual qualities of sensory experience.

Counterfactuals in Chinese

Another famous case of Whorfianism comes from the work of Alfred Bloom (1981), who was conducting research on political attitudes among Chinese residents of Hong Kong. His interview contained questions of the following form, posed in Chinese:

"If the government had passed a law requiring that all citizens born outside of Hong Kong make weekly reports of their activities to the police, how would you have reacted?"

Bloom reported that his subjects generally declined to answer such questions, replying "It has not done this" (again, in Chinese) instead. Bloom quickly identified what he considered to be the linguistic root of the problem. Such questions entail counterfactual conditionals, of the form If...Then... -- reasoning about states of affairs that are not actually true. Chinese can express counterfactuals of this sort, but such expressions are rarely used. Bloom hypothesized that this linguistic convention shaped the ways that Chinese think -- put bluntly, he hypothesized that native speakers of Chinese just can't think counterfactually. He then conducted an extensive formal study that seemed to demonstrate this. Given story problems that required counterfactual thinking, most English speakers (who read them in English) came to the correct conclusion, while most Chinese speakers (who read them in Chinese) did not. Bloom's 1981 book, The Linguistic Shaping of Thought: A Study in the Impact of Language on Thinking in China and the West, was hailed as a singular contribution to penetrating "the 'Chinese mind'".

Terry Kit-Fong Au, then an undergraduate at Harvard, heard about this claim in one of Brown's courses, and was having none of it (Au went on to take a PhD in psychology, and is now a professor at the University of Hong Kong). She tested almost 1,000 residents of Hong Kong or Taiwan (mainland China hadn't yet opened up much to Western researchers) who were bilingual in English and Chinese, and who were randomly assigned to English- or Chinese-language conditions. Au's results were completely contrary to Bloom's: the vast majority of her subjects had no trouble understanding counterfactuals, regardless of the language in which they were posed.

It turns out that, while Bloom spoke Chinese fluently, his translations from English into Chinese were rather unidiomatic, so that the Chinese subjects' responses did not accurately reflect their logical competence. In fact, when Au had her Chinese materials rendered into unidiomatic English, it was the English speakers who performed poorly. Speakers of both Chinese and English are able to reason counterfactually, so long as the counterfactuals are posed in idiomatic language. Reasoning is the same, no matter what language we speak.

Au published her research in 1983, just two years after Bloom's book came out. Bloom replied with criticisms of her method in 1984, but Au held her own in a 1984 rejoinder. The whole wonderful story of counterfactuals in Chinese is summarized by Roger Brown in Chapter 13 of Social Psychology: The Second Edition (1986), from which this account is drawn.

Verbal Overshadowing

The influence of language on how we think about the events that happen in our world can also be demonstrated in experiments other than those designed to confirm or disconfirm the Whorf hypothesis.

Classic work by Leonard Carmichael and his colleagues demonstrated that subjects showed different systematic distortions in their recall of ambiguous line drawings (e.g., O--O), depending upon which verbal label they were given (e.g., dumbbells or eyeglasses).

Experiments on eyewitness testimony by Elizabeth Loftus and others, such as those on the post-event misinformation effect discussed in the lectures on Memory, showed that by varying the verb used in a question about an accident (e.g., crashed or hit), one can manipulate the subjects' estimates of the speed at which the car was traveling.

Whorf himself became interested in language when he noticed that behavior around gasoline drums changed when the drums were called "empty", though they still contained dangerous vapors. Because the word empty connotes "lack of hazard," workers behaved carelessly around the drums, smoking or tossing cigarette stubs, and fires resulted.

A Garden of Sapir-Whorf Delights

Beyond the influence of language on categories, the linguist George Lakoff's work on metaphors offers another approach to the Sapir-Whorf hypothesis, one that does not depend upon the idea that language carves the world into different pieces and that, as he has put it, "cultures differ only in the way they have their meat cut up". Though some metaphors are universal (e.g. love is warmth), not all cultures share the same metaphors.

Here are some other examples, mostly culled from the research of Lera Boroditsky:

Russian has different words for light and dark blue (goluboy and siniy, respectively), and native Russian speakers make finer discriminations among different shades of blue that escape English speakers.
Spanish and Japanese often don't specify the agent of an accidental action. Their speakers say The ball was dropped where we would say John dropped the ball. And Spanish and Japanese speakers have more difficulty remembering the agents of accidental events.
The Piraha, a tribe of hunter-gatherers who live in the Amazon rain forest -- we first met them earlier in these lectures, in our discussion of recursion as the universal feature of language -- have no words in their language that precisely specify numbers. No "one" or "two", much less "five" or "65". Their language also lacks grammatical number -- words like both, either, or neither, which indicate that there are two of something. And, interestingly, the Piraha have difficulty identifying quantities greater than 3. Instead, they have an approximate number sense: they can say that one group of objects is larger or smaller than another group, but they cannot say precisely how much.

While all languages distinguish between singular and plural nouns, they differ in terms of grammatical number. English has the word both that indicates there are two things, but no word that indicates that there are three. But some Austronesian languages have a trial word, analogous to both, that indicates that there are three things. No known language has a quadral word indicating that there are four things.
For more about the representation of number in language, see Numbers and the Making of Us: Counting and the Course of Human Cultures by Caleb Everett, son of Daniel Everett, whose research on recursion in the Piraha language was discussed earlier.

While number words might look like another Whorfian result -- the Piraha don't have number words, so they have difficulty thinking about numbers -- this could also be a "contra-Whorfian" result. As Caleb Everett points out, numbers are a human invention, like the wheel, and some cultures just don't seem to have gotten around to inventing them. Without the concept of three, you don't need a word to represent it. In this case, language doesn't determine thought; thought determines language.

For more examples of the apparent influence of language on cognition, emphasizing space, color, and gender, see:

Through the Language Glass: Why the World Looks Different in Other Languages (2010) by Guy Deutscher, which draws heavily on Boroditsky's work.

For a contrary point of view, see The Language Hoax: Why the World Looks the Same in Any Language by John McWhorter (2014).

For an excellent update on Whorfian issues, see "TalkSense" by Manvir Singh (New Yorker, 12/30/24-01/06/25), an essay review of recent books on language by Julie Sedivy (Linguaphile: A Life of Language Love) and Caleb Everett (A Myriad of Tongues). Surveying recent arguments for and against Whorfianism, Singh concludes:

Bourdieu [Pierre Bourdieu, a widely influential French sociologist and public intellectual] was right that linguistic patterns affect us. Yet, going by the best ethnographic and social-science research, his fear of [Whorfian] brainwashing was overblown. If ways of speaking can alter ways of thinking, ways of thinking can alter ways of speaking as well. The dynamic interaction between the two is part of the ongoing story of how we try to make the world intelligible to us -- and to make ourselves intelligible to one another. Talk about the human conversation.

How to Think about Language and Thought

There's no question that thinking can occur in the absence of language. Nonhuman animals think when they seek to predict and control their environments in experiments on classical and instrumental conditioning. They can learn concepts. And they can learn to solve novel problems. They can even pass some of their thoughts to other animals through observational learning. But they don't have anything like human language.

And the strong form of the Sapir-Whorf hypothesis, of linguistic determinism, is just wrong. Rosch's experiments on color naming showed that pretty clearly. The absence of a color name in language didn't prevent the Dani from perceiving that color.

Roger Brown (1986, p. 493), reviewing the evidence bearing on the Whorfian hypothesis, concludes that "While the many languages of the world today are superficially very different from one another, in certain deep semantic ways, they are all alike".

All organize biological categories (of plants and animals) into vertical hierarchies containing 3 to 6 levels of abstraction.
The categories in all languages are fuzzy rather than proper sets, held together by a principle of family resemblance.
With respect to the vertical organization of concepts, there is always one level that is "basic", maximizing both the similarity of instances within that level, and the distinctiveness between categories.
No matter what language is being learned, children first learn words that represent this basic object level.
Categories below the basic object level may have distinctive utilities within a culture, as in the many kinds of snow, but categories at and above the basic object level do not.
Complex societies have a richer "specialist" vocabulary below the basic object level, compared to simple societies.

So there might be something to the weaker form of the Sapir-Whorf hypothesis, of linguistic relativity. Perhaps the most that can be said comes from UCB's Prof. Dan Slobin (1979): languages differ in terms of the features of the environment that they emphasize. To use one of his examples, French distinguishes between the familiar and polite forms of the second-person pronoun, tu and vous; German does the same thing, du and Sie; there's nothing like that in English. Therefore, French and German speakers must think about social hierarchies, and the relationship between a speaker and the person he or she is speaking to, while English speakers don't have to do this. But that doesn't mean that English speakers don't think about social hierarchies.

Of course, English used to make such a distinction: compare you and yours with thee, thou, and thine. Because of the use of the latter terms in places such as the King James Bible, we generally think of thee/thou/thy/thine as somehow elevated, more polite and respectful: think of the phrase, "holier than thou". But in fact, up to the 17th century, you was used in polite address, and thee in familiar address. In the KJB, God is addressed as thou out of a sense of intimacy, not of disrespect. But thereafter, thee and thou pretty much dropped out of ordinary speech, and were replaced by you -- except among the Quakers, members of the Religious Society of Friends, and that was mostly in the confines of their religious meetings. Interestingly, in Quaker discourse the connotations of you and thee are reversed: the use of thee/thou was an expression of egalitarianism. In much the same spirit, after the French Revolution of 1789, the use of vous by one person to defer to another was formally banned; all were equal Citizens, and so everyone addressed each other as tu, so as to express their liberty, equality, and fraternity.

But the basic point remains: language develops to represent the things that people think about. Language may constrain communication, but it doesn't constrain thought. Rather, thought constrains language: we use language to talk about the things we think about.

Does this mean that the specific language one speaks doesn't matter? That things would be just the same if we all spoke English, or Mandarin -- or, for that matter, Pormpuraaw? No. Language matters, but -- as John McWhorter has argued (in "Why Save a Language?", New York Times, 12/07/2014) -- language matters for other reasons. In particular, language is an important component of cultural identity. When a group of people have a language in common, which they speak to each other but not to members of other groups, that fosters a sense of communal identity, strengthening the group itself. There are other bonds between community members, to be sure, but a shared language is one of them. And besides, diversity of language may be something to celebrate in and of itself, just as we celebrate diversity in plants and animals. The world would be a lot more boring if it was just us, and cats, and tulips; and it would be a lot more boring if it was just English, Mandarin, and Pormpuraaw.

Lost in Translation

The relation of language to thought is underscored by the problem of translation -- whether we can render into English, for example, something originally said or written in French. This is a problem for comparative literature -- without translation, most of the world's writers would be inaccessible to us; and it's also a problem for international relations -- as when a word means different things to the two parties signing a treaty. It's been a problem at least since St. Jerome translated the New Testament from Greek into Latin.

In Is That a Fish in Your Ear? Translation and the Meaning of Everything (2011), David Bellos, a professor of comparative literature, offers a history of translation. Some literary theorists have argued that translation is essentially impossible, because each language represents -- creates, really -- a different mental world, so that meanings expressed in one language are ineffable -- can't really be communicated -- in another. Bellos himself disputes this -- he makes his living as a translator, after all. He argues that the purpose of translation is not to be faithful to the words on the page, but rather to resemble them in some way that is faithful to the original.

To illustrate, he offers one of his own translations. In Life: A User's Manual, the French writer Georges Perec has a character inspect satirical calling-cards in the window of a Parisian joke-shop. One of them reads, in French, "Adolf Hitler/Fourreur". Hitler was called Der Führer (The Leader), but fourreur is French for "furrier", creating a kind of pun (which is why it was displayed in a joke shop). If Bellos had translated this as "Adolf Hitler/Furrier", the reader would be puzzled at the very least, because it doesn't make any sense. And, of course, the whole joke would be lost. So, instead, Bellos rendered the passage as "Adolf Hitler/German Lieder" (Lieder is German for "songs", and sounds like the English word leader). Bellos hasn't translated the words; he's translated the sense of the passage, preserving the pun.

Language, Culture, and Freedom

All in all, human linguistic capacity is a real tribute to human intelligence -- i.e., the human intelligence we all share. Human intelligence in this sense is not the same as individual differences in IQ. Language is an enormous achievement -- one not matched by any other species, or by the most powerful computer. But every typically developing child, and most children with intellectual disabilities as well, becomes expert in his or her native tongue.

Language is an important tool for human thought. It permits the symbolic representation of the world around us, and facilitates creative problem-solving. But it also has another, absolutely crucial, adaptive function: it permits learning through precept rather than simple trial and error. This much more efficient form of learning provides the basis for cultural evolution. It permits passing knowledge on to the next generation, so that we can build on past advances, and don't have to constantly reinvent the wheel.

There's also a third function of language, which is to create the social world in which we live. This process is described by the UC Berkeley philosopher John Searle in The Construction of Social Reality (1995), and it's mediated entirely by language. Mountains and molecules are features of observer-independent reality: they are what Searle calls "brute facts" about the world, and they exist no matter what anyone thinks or does. But reality also consists of observer-dependent "institutional facts" which are products of what Searle calls collective intentionality. Put briefly, these features of reality exist because people say they do. Searle's favorite examples are money and marriage: that dollar bill in your wallet has a certain value only because some institution says it does; you're married (if you're married, that is), because some minister or judge says you are. Thus, language doesn't just represent reality; in many aspects, it also creates reality. That, too, is part of cultural evolution. In my lifetime, homosexual relations were illegal, homosexuality was considered a mental illness, and homosexuals certainly couldn't marry each other. In 1973, the American Psychiatric Association declared that homosexuality, per se, was no longer to be considered a form of mental illness, and removed it from its Diagnostic and Statistical Manual of Mental Disorders (DSM). In a groundbreaking case, Lawrence v. Texas (2003), the US Supreme Court legalized sexual relations between persons of the same sex. And in Obergefell v. Hodges (2015), the Supreme Court legalized same-sex marriage. All aspects of reality created by language, through argument and decree.

Cultural evolution parallels biological evolution, but proceeds at a much faster pace. Thus, it permits more rapid adaptation to environmental demands. But even better than adaptation, cultural evolution permits mastery of environment. Biological evolution allows us to fit better into the environment; cultural evolution, through technological advance, permits us to make the world more favorable to us.

Without language, we'd be tall, hairless, chimpanzees with thumbs -- and probably extinct as well. Language is the key to successful human adaptation, and to all that makes us human.

Chomsky's approach to language developed in the 1950s, in the heyday of behaviorism, and came to the attention of psychologists through his 1959 review of Verbal Behavior, a book by B.F. Skinner. In that book, Skinner had offered a behavioristic view of language as verbal behavior, acquired through reinforcement and other principles of operant conditioning. Chomsky showed that Skinner's view of reinforcement was circular, and thus vacuous. He also argued that the human capacity for language was innate, and that language was acquired through something like an innate "Language Acquisition Device" (LAD) in the brain, which permitted infants to learn their native language effortlessly even without deliberate training or reinforcement.

All of this led George Miller (of "the Magical Number 7" fame), himself a strong supporter of Chomsky, to remark that the choice between Skinner and Chomsky was "a choice between an impossible and a miraculous theory of language".

And, indeed, language does appear to be something of a miracle, without precedent in evolutionary history. Language confers an incredible advantage on humans. As Chomsky once remarked concerning chimpanzees, our closest evolutionary relatives: "There's a reason why there are 4 billion of us and only 10 thousand of them" (there are, of course, even more of us now, and even fewer of them).

Language allows us to represent knowledge very economically, and to share our knowledge with others through an extremely efficient form of social learning -- social learning by precept.

Language serves as the basis for culture, facilitating the intergenerational transmission of knowledge, beliefs, and behaviors.

Language, through its creative property, is the cognitive basis of human freedom, permitting us to think, speak, and hear thoughts that have never been thought before.

Much of this supplement on language is closely derived from the corresponding chapters, by Lila R. Gleitman, appearing in various editions of Henry Gleitman's introductory textbook, Psychology. This is as it should be, because Henry and Lila taught me almost everything I know about language, when I was a graduate student at the University of Pennsylvania (Herb Clark taught me most of the rest). I've added some things, to be sure, but the view here is still pretty much Lila and Henry's (and Herb's). To the extent that I've gotten anything wrong, or there is anything seriously out of date, I beg the reader -- and Henry, Lila, and Herb -- to accept my apologies.

For more detailed information on the psychology of language, see the following books:

The Language Instinct: How the Mind Creates Language by Steven Pinker (Morrow, 1994).

And then read Pinker's Words and Rules: The Ingredients of Language (2000), cited above.

Using Language by Herbert Clark (Cambridge University Press, 1996).
How Language Works: How Babies Babble, Words Change Meaning, and Languages Live or Die by David Crystal (Overlook, 2006).
How We Talk by Nick Enfield (2017)
Don't Believe a Word: The Surprising Truth About Language (2020) by David Shariatmadari, a journalist who has no tolerance for "language snobs" who won't let language evolve, even though that's what language has always done.
How You Say It: Why You Talk the Way You Do -- and What It Says About You by Katherine D. Kinzler. More about sociolinguistics than psycholinguistics, and more about speech than language, but a really interesting take on language pragmatics that moves beyond just describing group differences in speech patterns to address their role in racial, ethnic, cultural, and class discrimination.
For that matter, check out Talking Back, Talking Black by John McWhorter, a fabulously interesting sociolinguist (formerly at Berkeley, now unfortunately at Columbia), who surveys the role of "blaccent" in discrimination against African-Americans.
The Language Puzzle: Piecing Together the Six-Million-Year Story of How Words Evolved by Steven Mithen, an archeologist.
Language City: The Fight to Preserve Endangered Mother Tongues in New York by Ross Perlin (2024; reviewed by Ian Frazier in "Can We Talk?", New York Review of Books, 09/19/2024). According to the Ethnologue database, there are some 7,164 languages spoken across the world, and many indigenous languages are disappearing fast. Perlin, a linguist at Columbia and co-director of the Endangered Language Alliance, found some 700 of the world's languages spoken within the boundaries of New York City, and provides a kind of map of their distribution throughout the city. He also selects a half-dozen immigrants to the US, living and working in NYC, and traces their native languages back to their regions of origin.

If you really get interested in language, here are some older books on the psychology of language that are still well worth reading:

Language and Communication by George Miller (1951)
Psycholinguistics by UC Berkeley's Dan Slobin (1974)
Psychology and Language: An Introduction to Psycholinguistics by Herbert H. Clark & Eve V. Clark (1977).

This page last revised 05/27/2025.