Indo-European languages originated in Anatolia, research suggests

New research links the origins of Indo-European with the spread of farming from Anatolia 8000-9500 years ago. (Credit: Image courtesy of Radboud University Nijmegen)

The Indo-European languages form one of the most widespread language families in the world. For the last two millennia, many of these languages have been written, and their history is relatively clear. But controversy remains about the time and place of the family's origins. A large international team, including MPI researcher Michael Dunn, reports the results of an innovative Bayesian phylogeographic analysis of Indo-European linguistic and spatial data.

Their paper appears this week in Science.

The majority view in historical linguistics is that the homeland of Indo-European is located in the Pontic steppes (present-day Ukraine) around 6,000 years ago. The evidence for this comes from linguistic paleontology: in particular, certain words to do with the technology of wheeled vehicles are arguably present across all the branches of the Indo-European family; and archaeology tells us that wheeled vehicles arose no earlier than this date. The minority view links the origins of Indo-European with the spread of farming from Anatolia 8,000 to 9,500 years ago.

Lexicons combined with dispersal of speakers

The minority view is decisively supported by the present analysis in this week's Science. This analysis combines a model of the evolution of the lexicons of individual languages with an explicit spatial model of the dispersal of the speakers of those languages. Known events in the past (the dates of attestation of dead languages, as well as events which can be fixed from archaeology or the historical record) are used to calibrate the inferred family tree against time.

Importance of phylogenetic trees

The lexical data used in this analysis come from the Indo-European Lexical Cognacy Database (IELex). This database has been developed in MPI's Evolutionary Processes in Language and Culture group, and provides a large, high-quality collection of language data suitable for phylogenetic analysis. Beyond the intrinsic interest of uncovering the history of language families and their speakers, phylogenetic trees are crucially important for understanding evolution and diversity in many human sciences, from syntax and semantics to social structure.

 

Journal Reference:

  1. R. Bouckaert, P. Lemey, M. Dunn, S. J. Greenhill, A. V. Alekseyenko, A. J. Drummond, R. D. Gray, M. A. Suchard, Q. D. Atkinson. Mapping the Origins and Expansion of the Indo-European Language Family. Science, 2012; 337 (6097): 957 DOI: 10.1126/science.1219669

Language, emotion and well-being explored

 We use language every day to express our emotions, but can this language actually affect what and how we feel? Two new studies from Psychological Science, a journal of the Association for Psychological Science, explore the ways in which the interaction between language and emotion influences our well-being.

Putting Feelings into Words Can Help Us Cope with Scary Situations

Katharina Kircanski and colleagues at the University of California, Los Angeles investigated whether verbalizing a current emotional experience, even when that experience is negative, might be an effective treatment for people with spider phobias. In an exposure therapy study, participants were split into experimental groups and instructed to approach a spider over several consecutive days.

One group was told to put their feelings into words by describing their negative emotions about approaching the spider. Another group was asked to 'reappraise' the situation by describing the spider using emotionally neutral words. A third group was told to talk about an unrelated topic (things in their home) and a fourth group received no intervention. Participants who put their negative feelings into words were most effective at lowering their levels of physiological arousal. They were also slightly more willing to approach the spider. The findings suggest that talking about your feelings — even if they're negative — may help you to cope with a scary situation.

Unlocking Past Emotion: The Verbs We Use Can Affect Mood and Happiness

Our memory for events is influenced by the language we use. When we talk about a past occurrence, we can describe it as ongoing (I was running) or already completed (I ran). To investigate whether using these different wordings might affect our mood and overall happiness, Will Hart of the University of Alabama conducted four experiments in which participants either recalled or experienced a positive, negative, or neutral event. They found that people who described a positive event with words that suggested it was ongoing felt more positive. And when they described a negative event in the same way, they felt more negative.

The authors conclude that one potential way to improve mood could be to talk about negative past events as something that already happened as opposed to something that was happening.

The second article, with lead author Will Hart, is forthcoming in Psychological Science.


Journal Reference:

  1. K. Kircanski, M. D. Lieberman, M. G. Craske. Feelings Into Words: Contributions of Language to Exposure Therapy. Psychological Science, 2012; DOI: 10.1177/0956797612443830

Categories for kinship vary between languages

 Different languages refer to family relationships in different ways. For example, English speakers use two terms — grandmother and grandfather — to refer to grandparents, while Mandarin Chinese uses four terms. Many possible kinship categories, however, are never observed, which raises the question of why some kinship categories appear in the languages of the world but others do not.

A new study published in Science by Carnegie Mellon University's Charles Kemp and the University of California at Berkeley's Terry Regier shows that kinship categories across languages reflect general principles of communication. The same principles can potentially be applied to other kinds of categories, such as colors and spatial relationships. Ultimately, then, the work may lead to a general theory of how different languages carve the world up into categories.

For the study, Kemp and Regier used data previously collected by anthropologists and linguists that specify kinship categories for 566 of the world's languages. Kemp and Regier used a computational analysis to explore why some patterns are found in the data set but others are not. In particular, they tested the idea that the world's kinship systems achieve a trade-off between the two competing principles of simplicity and informativeness.

"A kinship system with one word referring to all relatives in a family tree would be very simple but not terribly useful for picking out specific individuals," said Kemp, assistant professor of psychology within CMU's Dietrich College of Humanities and Social Sciences and lead author of the study. "On the other hand, a system with a different word for each family member is much more complicated but very useful for referring to specific relatives. If you look at the kinship systems in the languages of the world, you can't make them simpler without making them less useful, and you can't make them more useful without making them more complicated. There is a tradeoff between these two explanatory principles."

Kemp and Regier found that this trade-off explains why languages use only a handful of the vast number of logically possible kinship categories.
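As a rough illustration of the trade-off Kemp and Regier describe, the toy sketch below (invented kin systems and scoring, not the authors' actual computational model) measures a system's complexity by its number of distinct terms and its ambiguity by how many relatives each used term could pick out:

```python
# Toy illustration (not the study's model) of the simplicity/informativeness
# trade-off for kinship systems. "Complexity" is the number of distinct terms;
# "ambiguity" is the average number of relatives a single term could refer to.

RELATIVES = ["mother's mother", "mother's father",
             "father's mother", "father's father"]

def complexity(system):
    """Number of distinct kin terms: fewer terms = simpler system."""
    return len(set(system.values()))

def ambiguity(system):
    """Average number of relatives each used term picks out (higher = vaguer)."""
    terms = list(system.values())
    return sum(terms.count(t) for t in terms) / len(terms)

# One term for everyone: maximally simple, maximally ambiguous.
one_term = {r: "grandparent" for r in RELATIVES}

# English-style: two terms, split by gender.
english = {"mother's mother": "grandmother", "father's mother": "grandmother",
           "mother's father": "grandfather", "father's father": "grandfather"}

# Mandarin-style: a distinct term per relative -- complex but unambiguous.
mandarin = {r: r for r in RELATIVES}

for name, sys_ in [("one-term", one_term), ("English", english),
                   ("Mandarin-style", mandarin)]:
    print(f"{name}: complexity={complexity(sys_)}, ambiguity={ambiguity(sys_):.1f}")
```

No system in the sketch beats another on both measures at once, which is the shape of the "optimal frontier" the study reports.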

"The kinship systems that are used by languages lie along an optimal frontier, where systems achieve a near perfect trade-off between the competing factors of simplicity and usefulness," Kemp said. "English — with two terms to refer to grandparents — is more simple than Mandarin Chinese, but arguably a little less useful."

"Interestingly, very similar principles explain cross-language variation in color categories and spatial categories, as well as kinship categories," said Regier, associate professor of linguistics and cognitive science at Berkeley, and an author on the earlier work on color and space. "It's rewarding to see similar principles operating across such different domains."


Journal Reference:

  1. C. Kemp, T. Regier. Kinship Categories Across Languages Reflect General Communicative Principles. Science, 2012; 336 (6084): 1049 DOI: 10.1126/science.1218811
 

OMG! Texting ups truthfulness, new iPhone study suggests

Text messaging is a surprisingly good way to get candid responses to sensitive questions, according to a new study to be presented this week at the annual meeting of the American Association for Public Opinion Research.

"The preliminary results of our study suggest that people are more likely to disclose sensitive information via text messages than in voice interviews," says Fred Conrad, a cognitive psychologist and Director of the Program in Survey Methodology at the University of Michigan Institute for Social Research (ISR).

"This is sort of surprising," says Conrad, "since many people thought that texting would decrease the likelihood of disclosing sensitive information because it creates a persistent, visual record of questions and answers that others might see on your phone and in the cloud."

With text, the researchers also found that people were less likely to engage in 'satisficing' — a survey industry term for the common practice of giving 'good enough,' easy answers, such as rounding numerical responses to multiples of 10. "We believe people give more precise answers via texting because there's just not the time pressure in a largely asynchronous mode like text that there is in phone interviews," says Conrad. "As a result, respondents are able to take longer to arrive at more accurate answers."
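The rounding behavior Conrad describes can be sketched as a simple "heaping" check on numeric answers; the function and the sample responses below are hypothetical, invented purely for illustration:

```python
# Hypothetical sketch: flagging "satisficed" numeric survey answers by the
# share that are round multiples of 10 (heaping). A response mode whose
# answers heap heavily at multiples of 10 suggests estimating, not counting.

def heaping_rate(answers):
    """Fraction of nonzero numeric answers that are multiples of 10."""
    nonzero = [a for a in answers if a != 0]
    if not nonzero:
        return 0.0
    return sum(a % 10 == 0 for a in nonzero) / len(nonzero)

# Illustrative (invented) responses to "How many songs on your iPhone?"
voice_answers = [100, 200, 50, 300, 150, 1000, 40, 500]   # heavily rounded
text_answers  = [112, 243, 57, 308, 164, 989, 41, 502]    # more precise

print(f"voice heaping: {heaping_rate(voice_answers):.2f}")  # prints 1.00
print(f"text heaping:  {heaping_rate(text_answers):.2f}")   # prints 0.00
```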

Conrad conducted the study with Michael Schober, a professor of psychology and dean of the graduate faculty at the New School for Social Research. Their research team included cognitive psychologists, psycholinguists, survey methodologists and computer scientists from both universities, as well as collaborators from AT&T Research. Funding for the study came from the National Science Foundation.

"We're in the early stages of analyzing our findings," says Schober. "But so far it seems that texting may reduce some respondents' tendency to shade the truth or to present themselves in the best possible light in an interview — even when they know it's a human interviewer they are communicating with via text. What we cannot yet be sure of is who is most likely to be disclosive in text. Is it different for frequent texters, or generational, for example?"

For the study, the researchers recruited approximately 600 iPhone users on Craigslist, through Google Ads, and from Amazon's Mechanical Turk, offering them iTunes Store incentives to participate. Their goal was to see whether responses to the same questions differed depending on several variables: whether the questions were asked via text or voice, whether a human or a computer asked the questions, and whether the environment, including the presence of other people and the likelihood of multitasking, affected the answers.

Among the questions that respondents answered more honestly via text than speech: In a typical week, about how often do you exercise? During the past 30 days, on how many days did you have 5 or more drinks on the same occasion?

And among the questions that respondents answered more precisely via text, providing fewer rounded numerical responses: During the last month, how many movies did you watch in any medium? How many songs do you currently have on your iPhone?

According to Schober and Conrad, changes in communication patterns and their impact on the survey industry prompted the study. About one in five U.S. households uses only cell phones and no longer has a landline. These households are typically not surveyed even though cell-only households tend to differ in important ways from households with landline phones. More people are using text messages on mobile phones, with texting now the preferred form of communication among many people in their teens and 20s in the U.S. Texting is extremely common among all age groups in many Asian and European nations.

Conrad and Schober are also finding that people are more likely to provide thoughtful and honest responses via text messages even when they're in busy, distracting environments.

"This is the case even though people are more likely to be multitasking — shopping or walking, for example — when they're answering questions by text than when they're being interviewed by voice."

 

Study finds twist to the story of the number line: Number line is learned, not innate human intuition

Tape measures. Rulers. Graphs. The gas gauge in your car, and the icon on your favorite digital device showing battery power. The number line and its cousins — notations that map numbers onto space and often represent magnitude — are everywhere. Most adults in industrialized societies are so fluent at using the concept, we hardly think about it. We don't stop to wonder: Is it "natural"? Is it cultural?

Now, challenging a mainstream scholarly position that the number-line concept is innate, a study suggests it is learned.

The study, published in PLoS ONE April 25, is based on experiments with an indigenous group in Papua New Guinea. It was led by Rafael Nunez, director of the Embodied Cognition Lab and associate professor of cognitive science in the UC San Diego Division of Social Sciences.

"Influential scholars have advanced the thesis that many of the building blocks of mathematics are 'hard-wired' in the human mind through millions of years of evolution. And a number of different sources of evidence do suggest that humans naturally associate numbers with space," said Nunez, coauthor of "Where Mathematics Comes From" and co-director of the newly established Fields Cognitive Science Network at the Fields Institute for Research in Mathematical Sciences.

"Our study shows, for the first time, that the number-line concept is not a 'universal intuition' but a particular cultural tool that requires training and education to master," Nunez said. "Also, we document that precise number concepts can exist independently of linear or other metric-driven spatial representations."

Nunez and the research team, which includes UC San Diego cognitive science doctoral alumnus Kensy Cooperrider, now at Case Western Reserve University, and Jurg Wassmann, an anthropologist at the University of Heidelberg who has studied the indigenous group for 25 years, traveled to a remote area of the Finisterre Range of Papua New Guinea to conduct the study.

The upper Yupno valley, like much of Papua New Guinea, has no roads. The research team flew in on a four-seat plane and hiked in the rest of the way, armed with solar-powered equipment, since the valley has no electricity.

The indigenous Yupno in this area number some 5,000, spread over many small villages. They are subsistence farmers. Most have little formal schooling, if any at all. While there is no native writing system, there is a native counting system, with precise number concepts and specific words for numbers greater than 20. But there doesn't seem to be any evidence of measurement of any sort, Nunez said, "not with numbers, or feet or elbows."

Neither Hard-Wired nor "Out There"

Nunez and colleagues asked Yupno adults of the village of Gua to complete a task that has been used widely by researchers interested in basic mathematical intuitions and where they come from. In the original task, people are shown a line and are asked to place numbers onto the line according to their size, with "1" going on the left endpoint and "10" (or sometimes "100" or "1000") going on the right endpoint. Since many in the study group were illiterate, Nunez and colleagues followed previous studies and adapted the task using groups of one to 10 dots, tones, and spoken words instead of written numbers.

After confirming the Yupno participants' understanding of numbers with piles of oranges, the researchers gave the number-line task to 14 adults with no schooling and six adults with some degree of formal schooling. There was also a control group of participants in California.

The researchers found that unschooled Yupno adults did place numbers on the line (that is, they mapped numbers onto space), but they did so in a categorical manner, systematically using only the endpoints: small numbers went on the left endpoint, and mid-size and large numbers on the right, ignoring the extension of the line — an essential component of the number-line concept. Schooled Yupno adults used the line's extension, though not quite as evenly as the adults in California.
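One way to picture the contrast between categorical and linear responding is to score how far a set of placements deviates from an ideal linear mapping; the responders and scoring below are invented for illustration, not the study's data or analysis:

```python
# Illustrative sketch (invented data, not the study's): measuring how
# "linear" a participant's number-line placements are. Positions run from
# 0.0 (left endpoint) to 1.0 (right); the linear target for number n in
# 1..10 is (n - 1) / 9.

def linearity_error(placements):
    """Mean absolute deviation from the ideal linear mapping."""
    errs = [abs(pos - (n - 1) / 9) for n, pos in placements.items()]
    return sum(errs) / len(errs)

# A categorical responder: small numbers at the left endpoint, everything
# else at the right endpoint, ignoring the line's extension.
categorical = {n: (0.0 if n <= 2 else 1.0) for n in range(1, 11)}

# A linear responder: positions spread evenly along the line.
linear = {n: (n - 1) / 9 for n in range(1, 11)}

print(f"categorical error: {linearity_error(categorical):.3f}")
print(f"linear error:      {linearity_error(linear):.3f}")
```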

"Mathematics all over the world — from Europe to Asia to the Americas — is largely taught dogmatically, as objective fact, black and white, right/wrong," Nunez said. "But our work shows that there are meaningful human ideas in math, ingenious solutions and designs that have been mediated by writing and notational devices, like the number line. Perhaps we should think about bringing the human saga to the subject — instead of continuing to treat it romantically, as the 'universal language' it's not. Mathematics is neither hardwired, nor 'out there.'"

Out-of-Body Time

The researchers ran several experiments while in Gua, Papua New Guinea, including those that examine another fundamental concept: time.

When talking about past, present and future, people all over the world show a tendency to conceive of these notions spatially, Nunez said. The most common spatial pattern is the one found in the English-speaking world, in which people talk about the future as being in front of them and the past behind, encapsulated, for example, in expressions such as the "week ahead" and "way back when." (In earlier research, Nunez found that the Aymara of the Andes seem to do the reverse, placing the past in front and the future behind.)

In their time study with the Yupno, now in press at the journal Cognition, Nunez and colleagues find that the Yupno don't use their bodies as reference points for time — but rather their valley's slope and terrain. Analysis of their gestures suggests they co-locate the present with themselves, as do all previously studied groups. (Picture for a moment how you probably point down at the ground when you talk about "now.") But, regardless of which way they are facing at the moment, the Yupno point uphill when talking about the future and downhill when talking about the past.

Interestingly and also very unusually, Nunez said, the Yupno seem to think of past and future not as being arranged on a line, such as the familiar "time line" we have in many Western cultures, but as having a three-dimensional bent shape that reflects the valley's terrain.

"These findings suggest that how we think about abstract concepts is even more flexible than previously thought and is profoundly affected by language, culture and environment," said Nunez.

"Our familiar notions on 'fundamental' concepts such as time and number are so deeply ingrained that they feel natural to us, as though they couldn't be any other way," added former graduate student Cooperrider. "When confronted with radically different ways of construing experience, we can no longer take for granted our own. Ultimately, no way is more or less 'natural' than the Yupno way."

The research was supported by a UC San Diego Academic Senate grant, an Institute for Advanced Studies in Berlin fellowship, a UCSD Friends of the International Center fellowship, and the Marsilius Kolleg Heidelberg.

The advantage of ambiguity in language

Cognitive scientists develop a new take on an old problem: why human language has so many words with multiple meanings.

Why did language evolve? While the answer might seem obvious — as a way for individuals to exchange information — linguists and other students of communication have debated this question for years. Many prominent linguists, including MIT's Noam Chomsky, have argued that language is, in fact, poorly designed for communication. Such a use, they say, is merely a byproduct of a system that probably evolved for other reasons — perhaps for structuring our own private thoughts.

As evidence, these linguists point to the existence of ambiguity: In a system optimized for conveying information between a speaker and a listener, they argue, each word would have just one meaning, eliminating any chance of confusion or misunderstanding. Now, a group of MIT cognitive scientists has turned this idea on its head. In a new theory, they claim that ambiguity actually makes language more efficient, by allowing for the reuse of short, efficient sounds that listeners can easily disambiguate with the help of context.

"Various people have said that ambiguity is a problem for communication," says Ted Gibson, an MIT professor of cognitive science and senior author of a paper describing the research to appear in the journal Cognition. "But once we understand that context disambiguates, then ambiguity is not a problem — it's something you can take advantage of, because you can reuse easy [words] in different contexts over and over again."

Lead author of the paper is Steven Piantadosi PhD '11; Harry Tily, a postdoc in the Department of Brain and Cognitive Sciences, is another co-author.

What do you 'mean'?

For a somewhat ironic example of ambiguity, consider the word "mean." It can mean, of course, to indicate or signify, but it can also refer to an intention or purpose ("I meant to go to the store"); something offensive or nasty; or the mathematical average of a set of numbers. Adding an 's' introduces even more potential definitions: an instrument or method ("a means to an end"), or financial resources ("to live within one's means").

But virtually no speaker of English gets confused when he or she hears the word "mean." That's because the different senses of the word occur in such different contexts as to allow listeners to infer its meaning nearly automatically.

Given the disambiguating power of context, the researchers hypothesized that languages might harness ambiguity to reuse words — most likely, the easiest words for language processing systems. Building on observation and previous studies, they posited that words with fewer syllables, high frequency and the simplest pronunciations should have the most meanings.

To test this prediction, Piantadosi, Tily and Gibson carried out corpus studies of English, Dutch and German. (In linguistics, a corpus is a large body of samples of language as it is used naturally, which can be used to search for word frequencies or patterns.) By comparing certain properties of words to their numbers of meanings, the researchers confirmed their suspicion that shorter, more frequent words, as well as those that conform to the language's typical sound patterns, are most likely to be ambiguous — trends that were statistically significant in all three languages.
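The logic of the corpus comparison can be sketched with a toy lexicon; the words and sense counts below are invented for illustration, not drawn from the study's English, Dutch, or German corpora:

```python
# Toy sketch of the corpus logic (invented sense counts, not the study's
# data): check whether shorter words tend to carry more dictionary senses.

# (word, number of senses) -- illustrative values only.
LEXICON = [("run", 30), ("set", 25), ("mean", 8), ("bank", 6),
           ("table", 5), ("ambiguous", 1), ("photosynthesis", 1),
           ("serendipity", 1)]

def avg_senses(max_len=None, min_len=None):
    """Average sense count for words in the given length band."""
    band = [(w, s) for w, s in LEXICON
            if (max_len is None or len(w) <= max_len)
            and (min_len is None or len(w) >= min_len)]
    return sum(s for _, s in band) / len(band)

short = avg_senses(max_len=5)   # words of 5 letters or fewer
long_ = avg_senses(min_len=6)   # words of 6 letters or more

print(f"short words average {short:.1f} senses; long words average {long_:.1f}")
```

The actual study goes further, controlling for frequency and phonotactic typicality, but the basic comparison of word properties against sense counts has this shape.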

To understand why ambiguity makes a language more efficient rather than less so, think about the competing desires of the speaker and the listener. The speaker is interested in conveying as much as possible with the fewest possible words, while the listener is aiming to get a complete and specific understanding of what the speaker is trying to say. But as the researchers write, it is "cognitively cheaper" to have the listener infer certain things from the context than to have the speaker spend time on longer and more complicated utterances. The result is a system that skews toward ambiguity, reusing the "easiest" words. Once context is considered, it's clear that "ambiguity is actually something you would want in the communication system," Piantadosi says.

Implications for computer science

The researchers say the statistical nature of their paper reflects a trend in the field of linguistics, which is coming to rely more heavily on information theory and quantitative methods.

"The influence of computer science in linguistics right now is very high," Gibson says, adding that natural language processing (NLP) is a major goal of those operating at the intersection of the two fields.

Piantadosi points out that ambiguity in natural language poses immense challenges for NLP developers. "Ambiguity is only good for us [as humans] because we have these really sophisticated cognitive mechanisms for disambiguating," he says. "It's really difficult to work out the details of what those are, or even some sort of approximation that you could get a computer to use."

But, as Gibson says, computer scientists have long been aware of this problem. The new study provides a better theoretical and evolutionary explanation of why ambiguity exists, but the same message holds: "Basically, if you have any human language in your input or output, you are stuck with needing context to disambiguate," he says.


Journal Reference:

  1. Steven T. Piantadosi, Harry Tily, Edward Gibson. The communicative function of ambiguity in language. Cognition, 2012; 122 (3): 280 DOI: 10.1016/j.cognition.2011.10.004
 

Prenatal testosterone linked to increased risk of language delay for male infants, study shows

NewsPsychology (Jan. 25, 2012) — New research by Australian scientists reveals that males who are exposed to high levels of testosterone before birth are twice as likely to experience delays in language development compared to females. The research, published in Journal of Child Psychology and Psychiatry, focused on umbilical cord blood to explore the presence of testosterone when the language-related regions of a fetus’ brain are undergoing a critical period of growth.

“An estimated 12% of toddlers experience significant delays in their language development,” said lead author Professor Andrew Whitehouse from the University of Western Australia. “While language development varies between individuals, males tend to develop later and at a slower rate than females.”

The team believed this may be due to prenatal exposure to sex-steroids such as testosterone. Male fetuses are known to have 10 times the circulating levels of testosterone compared to females. The team proposed that higher levels of exposure to prenatal testosterone may increase the likelihood of language development delays.

Professor Whitehouse’s team measured levels of testosterone in the umbilical cord blood of 767 newborns before examining their language ability at 1, 2 and 3 years of age.

The results showed that male infants with high levels of testosterone in cord blood were two to three times more likely to experience language delay. However, the opposite effect was found in female infants, where high levels of testosterone in cord blood were associated with a decreased risk of language delay.

Previous smaller studies have explored the link between testosterone levels in amniotic fluid and language development. However, this is the first large population-based study to explore the relationship between umbilical cord blood and language delay in the first three years of life.

“Language delay is one of the most common reasons children are taken to a paediatrician,” concluded Professor Whitehouse. “Now these findings can help us to understand the biological mechanisms that may underpin language delay, as well as language development more generally.”



Story Source:

The above story is reprinted from materials provided by Wiley-Blackwell.

Note: Materials may be edited for content and length. For further information, please contact the source cited above.


Journal Reference:

  1. Andrew J.O. Whitehouse, Eugen Mattes, Murray T. Maybery, Michael G. Sawyer, Peter Jacoby, Jeffrey A. Keelan, Martha Hickey. Sex-specific associations between umbilical cord blood testosterone levels and language delay in early childhood. Journal of Child Psychology and Psychiatry, 2012; DOI: 10.1111/j.1469-7610.2011.02523.x

Lovelorn liars leave linguistic leads

NewsPsychology (Feb. 13, 2012) — Online daters intent on fudging their personal information have a big advantage: most people are terrible at identifying a liar. But new research is turning the tables on deceivers using their own words.

“Generally, people don’t want to admit they’ve lied,” says Catalina Toma, communication science professor at the University of Wisconsin-Madison. “But we don’t have to rely on the liars to tell us about their lies. We can read their handiwork.”

Using personal descriptions written for Internet dating profiles, Toma and Jeffrey Hancock, communication professor at Cornell University, have identified clues as to whether the author was being deceptive.

The researchers compared the actual height, weight and age of 78 online daters to their profile information and photos on four matchmaking websites. A linguistic analysis of the group’s written self-descriptions published in the February issue of the Journal of Communication revealed patterns in the liars’ writing.

The more deceptive a dater’s profile, the less likely they were to use the first-person pronoun “I.”

“Liars do this because they want to distance themselves from their deceptive statements,” Toma says.

The liars often employed negation, a flip of the language that would restate “happy” as “not sad” or “exciting” as “not boring.” And the fabricators tended to write shorter self-descriptions in their profiles — a hedge, Toma expects, against weaving a more tangled web of deception.

“They don’t want to say too much,” Toma says. “Liars experience a lot of cognitive load. They have a lot to think about. The less they write, the fewer untrue things they may have to remember and support later.”

Liars were also careful to skirt their own deception. Daters who had lied about their age, height or weight, or who had included a photo the researchers found to be less than representative of reality, were likely to avoid discussing their appearance in their written descriptions, choosing instead to talk about work or life achievements.

The toolkit of language clues gave the researchers a distinct advantage when they re-examined their pool of 78 online daters.

“The more deceptive the self-description, the fewer times you see ‘I,’ the more negation, the fewer words total — using those indicators, we were able to correctly identify the liars about 65 percent of the time,” Toma says.
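Toma's three indicators can be expressed as simple text statistics; the scoring function and example profiles below are hypothetical sketches, not the researchers' actual model or data:

```python
# Hedged sketch (invented, not Toma & Hancock's model) of the three cues
# the study reports: fewer uses of "I", more negation ("not sad" for
# "happy"), and fewer words overall.

import re

NEGATORS = {"not", "no", "never"}

def deception_cues(profile_text):
    """Return the three linguistic cues as counts and per-word rates."""
    words = re.findall(r"[a-z']+", profile_text.lower())
    n = len(words)
    return {
        "word_count": n,
        "i_rate": words.count("i") / n if n else 0.0,
        "negation_rate": sum(w in NEGATORS for w in words) / n if n else 0.0,
    }

honest = "I love hiking and I spend my weekends outdoors. I am happy."
deceptive = "Not boring, never dull. Enjoys life, not sad about anything."

print(deception_cues(honest))
print(deception_cues(deceptive))
```

A real classifier would combine such cues statistically over many profiles; this sketch only shows the direction of each signal.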

A success rate of nearly two-thirds is a commanding lead over the untrained eye. In a second leg of their study, Toma and Hancock asked volunteers to judge the daters’ trustworthiness based solely on the written self-descriptions posted on their online profiles.

“We asked them to tell us how trustworthy the person who wrote each profile was. And, as we expected, people are just bad at this,” Toma says. “They might as well have flipped a coin … They’re looking at the wrong things.”

About 80 percent of the 78 profiles in the study, which was supported by the National Science Foundation, strayed from the truth on some level.

“Almost everybody lied about something, but the magnitude was often small,” Toma says.

Weight was the most frequent transgression, with women off by an average of 8.5 pounds and men missing by 1.5 pounds on average. Half lied about their height, and nearly 20 percent changed their age.

Studying lying through online communication such as dating profiles opens a door on a medium in which the liar has more room to maneuver.

“Online dating is different. It’s not a traditional interaction,” Toma says.

For one, it’s asynchronous. The back-and-forth of an in-person conversation is missing, giving a liar the opportunity to respond at their leisure or not at all. And it’s editable, so the first telling of the story can come out exactly like the profile-writer would like.

“You have all the time in the world to say whatever you want,” Toma says. “You’re not expected to be spontaneous. You can write and rewrite as many times as you want before you post, and then in many cases return and edit yourself.”

Toma says the findings are not out of line with what we know about liars in face-to-face situations.

“Online daters’ motivations to lie are pretty much the same as traditional daters’,” she says. “It’s not like a deceptive online profile is a new beast, and that helps us apply what we can learn to all manner of communication.”

But don’t go looking just yet for the dating site that employs Toma’s linguistic analysis as a built-in lie detector.

“Someday there may be software to tell you how likely it is that the cute person whose profile you’re looking at is lying to you, or even that someone is being deceptive in an e-mail,” Toma says. “But that may take a while.”



Story Source:

The above story is reprinted from materials provided by University of Wisconsin-Madison, via Newswise. The original article was written by Chris Barncard.



Journal Reference:

  1. Catalina L. Toma, Jeffrey T. Hancock. What Lies Beneath: The Linguistic Traces of Deception in Online Dating Profiles. Journal of Communication, 2012; 62 (1): 78 DOI: 10.1111/j.1460-2466.2011.01619.x

Email language tips off work hierarchy

Members of the modern workforce might be surprised to learn that if they use the word "weekend" in a workplace email, chances are they're sending the message up the org chart. The same is true for the words "voicemail," "driving," "okay" — and even a choice four-letter word that rhymes with "hit." However, a new study by Georgia Tech's Eric Gilbert shows that certain words and phrases indeed are reliable indicators of whether workplace emails are sent to someone higher or lower in the corporate hierarchy.

Gilbert, assistant professor in the School of Interactive Computing, focused his attention on the "Enron corpus," a body of 500,000 emails among about 150 former Enron employees, making it the largest email dataset available for public study. Even after taking a "conservative, careful" approach — applying numerous filters and eliminating thousands of emails that would have muddied his conclusions — he still was able to identify lists of words that reliably predicted whether emails traveled up or down the ladder.

"Across a wide variety of messages and relationships, these phrases consistently stand out as signaling a power relationship between two people," Gilbert said. "The probability of it occurring due to chance alone is less than 1 in 1,000."
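At its simplest, such marker phrases can be applied as a lookup. The sketch below is a toy illustration of the idea, not Gilbert's method: the "upward" words come from the examples quoted in this article, while the "downward" list is entirely invented for illustration (his published lists contain about 100 entries each).

```python
# Upward markers are the examples named in the article; the downward
# list is a made-up placeholder, since the article gives none.
UPWARD = {"weekend", "voicemail", "driving", "okay"}
DOWNWARD = {"deadline", "deliverable", "status", "asap"}

def guess_direction(email_text):
    """Naive guess at hierarchy direction: whichever marker set
    matches more tokens wins; ties are left undecided."""
    tokens = [t.strip(".,!?") for t in email_text.lower().split()]
    up = sum(t in UPWARD for t in tokens)
    down = sum(t in DOWNWARD for t in tokens)
    if up > down:
        return "up"
    if down > up:
        return "down"
    return "unknown"
```

Gilbert's actual analysis is statistical rather than a bare word count, but a filter like this is the kind of signal a "smarter" email client could act on.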

Gilbert's work could be applied primarily in designing "smarter" email software. Future email clients, he said, might be able to differentiate between emails sent from superiors or subordinates, and then use that information to better address someone's email preferences. Post-5 p.m. messages from people under you, for example, might get held for delivery until the next day, while emails from the boss — or the boss' assistant — could go right through.

"We have organizational charts, but they don't tell the whole story," said Gilbert, adding that the research could help map "informal power and reporting structures" in an organization. "A classic example is the CEO's administrative assistant: That person may not occupy a high box on the org chart, but he or she still has a large amount of influence."

The project falls in the research field called "applied natural language processing," which has been an important subfield of artificial intelligence for the past 25 years. Tools first developed in the area, however, can be applied to other data and yield enlightening results about human interaction through electronic media or simply how people relate to each other.

One complicating factor in Gilbert's research was the dataset itself. Aside from its utility as grist for study, the Enron corpus can also be considered profoundly flawed as an example of "natural" language; after all, Enron and its officers were responsible for one of the largest, most systemic examples of corporate fraud in U.S. history, and Gilbert acknowledges that the behavior behind this ignominy likely bled into communications among Enron officers. To minimize the chance that their statistical findings captured phenomena unique to Enron, Gilbert and his team manually went through the emails and removed phrases that appeared specific to the company. They also threw out all correspondence after Enron's bad behavior had come to light.

"The trick," he said, "was to find a point just before things really started to fall apart. We chose a date (May 1, 2001) several months before a formal investigation of Enron began and before its executives started selling off their own Enron stock. For all the emails that preceded that date, it's reasonable to assume that Enron and its people behaved just like any other large private entity."

Gilbert's research will be presented February 14 at the 2012 ACM Conference on Computer Supported Cooperative Work (CSCW), being held Feb. 11-15 in Seattle. Gilbert said CSCW, now in its 15th year, is widely regarded as the first conference devoted to social computing.

To view the full 100-word lists that reliably predict hierarchical direction, read Gilbert’s paper located at http://comp.social.gatech.edu/papers/cscw12.hierarchy.gilbert.pdf. And about that certain four-letter word that rhymes with hit?


Out of Africa? Data fail to support language origin in Africa

In the beginning was the word — yes, but where exactly? Last year, Quentin Atkinson, a cultural anthropologist at the University of Auckland in New Zealand, proposed that the cradle of language could be localized in the southwest of Africa. The report, which appeared in Science, was seized upon by the media and caused something of a sensation. Now however, LMU linguist Michael Cysouw has published a commentary in Science which argues that this neat "Out-of-Africa" hypothesis for the origin of language is not adequately supported by the data presented. The search for the site of origin of language remains very much alive.

Atkinson based his claim on a comparative analysis of the numbers of phonemes found in about 500 present-day languages. Phonemes are the most basic sound units — consonants, vowels and tones — that form the basis of semantic differentiation in all languages. The number of phonemes used in natural languages varies widely. Atkinson, who is a biologist and psychologist by training, found that the highest levels of phoneme diversity occurred in languages spoken in southwestern Africa. Furthermore, according to his statistical analysis, the size of the phoneme inventory in a language tends to decrease with distance from this hotspot.
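The core of that statistical claim is a negative correlation between distance from the putative origin and phoneme inventory size. The sketch below illustrates the shape of the argument on invented toy data (the distances and phoneme counts are not Atkinson's actual figures), computing a Pearson correlation by hand.

```python
import math

# Toy data: (distance from the putative SW-African hotspot in km,
# phoneme inventory size). Values invented for illustration only.
samples = [(0, 141), (3000, 95), (7000, 60), (12000, 40), (18000, 25)]

def pearson(pairs):
    """Pearson correlation between the two coordinates of each pair."""
    n = len(pairs)
    xs, ys = [p[0] for p in pairs], [p[1] for p in pairs]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(samples)  # strongly negative under a founder-effect pattern
```

Cysouw's objection, discussed below, is not with arithmetic like this but with whether the pattern holds up once other linguistic features and alternative origin points are tested.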

To interpret this finding Atkinson invoked a parallel from population genetics. Biologists have observed an analogous effect, insofar as human genetic diversity is found to decrease with distance from Africa, where our species originated. This is attributed to the so-called founder effect. As people migrated from the continent and small groups continued to disperse, each inevitably came to represent an ever-shrinking fraction of the total genetic diversity present in the African population as a whole.

So does such a founder effect play a similarly significant role in the dispersal and differentiation of languages? Michael Cysouw regards Atkinson's finding as "artefactual." Cysouw, whose work is funded by one of the Starting Grants awarded by the European Research Council (ERC), heads a research group that studies quantitative comparative linguistics in LMU's Faculty of Languages and Literatures. He says he has no objection in principle to the use of methods borrowed from other disciplines to tackle questions in linguistics, but that problems arise from their inappropriate application.

For example, he finds that if Atkinson's method is employed to examine other aspects of language, such as the construction of subordinate clauses or the use of the passive voice, the results "do not point in the same direction." Indeed, in their article in Science, Cysouw and his coauthors Steven Moran (LMU) and Dan Dediu of the Max Planck Institute for Psycholinguistics in Nijmegen show that, depending on the features considered, Atkinson's method places the site of origin of language in eastern Africa or the Caucasus or somewhere else entirely. As Cysouw points out, linguists have long sought to throw light on the origin of language by analyzing patterns of language distribution. The problem is that such relationships can be reliably traced only as far back as about 10,000 years before the present.


Journal Reference:

  1. M. Cysouw, D. Dediu, S. Moran. Comment on "Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa". Science, 2012; 335 (6069): 657 DOI: 10.1126/science.1208841