Department of spurious correlations?

Here is the abstract of a forthcoming AER piece, written by M. Keith Chen:

Languages differ widely in the ways they encode time. I test the hypothesis that languages that grammatically associate the future and the present, foster future-oriented behavior. This prediction arises naturally when well-documented effects of language structure are merged with models of intertemporal choice. Empirically, I find that speakers of such languages: save more, retire with more wealth, smoke less, practice safer sex, and are less obese. This holds both across countries and within countries when comparing demographically similar native households. The evidence does not support the most obvious forms of common causation. I discuss implications for theories of intertemporal choice.

Here is an excerpt from a recent article in The Chronicle of Higher Education, by Geoffrey Pullum:

Chen’s data on languages comes from the World Atlas of Language Structures (WALS), and his evidence on prudence from the World Values Survey (WVS). Both are fully Web-accessible. Sean Roberts, who studies language evolution at the Max Planck Institute for Psycholinguistics in Nijmegen, decided to investigate the other linguistic factors treated in WALS to see how they related to prudence. He compared the goodness of fit for linear regressions on each of a long list of properties of languages (the independent variables), using as the dependent variable the answers that speakers gave to the WVS question “Did you save money last year?”

The results (see this blog post for an informal account) were jaw-dropping. He found that dozens of linguistic variables were better predictors of prudence than future marking: whether the language has uvular consonants; verbal agreement of particular types; relative clauses following nouns; double-accusative constructions; preposed interrogative phrases; and so on—a motley collection of factors that no one could plausibly connect to 401(k) contributions or junk-food consumption.

There is a bit more here.

For the pointer I thank Mike T. And I would gladly run a response from Chen, if he is interested in drafting one.

Addendum: Here is an important update from the critic, after improving the specification of his alternative fits:

The results showed that there was only one other linguistic variable that improved the fit of the model more than future tense. That is, future tense was a better predictor than 99% of the linguistic variables. For comparison, Dediu & Ladd’s test of the link between linguistic tone and Microcephalin/ASPM found that the hypothesised link was stronger than 98.5% of many thousands of links between genetic and linguistic factors.
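
For readers who want to see the mechanics of this kind of check, here is a minimal sketch in Python. It is not Roberts's code or his specification. It assumes one binary feature per respondent, a yes/no savings answer, and a plain one-predictor linear fit scored by R^2. The names `r_squared`, `share_outperformed`, `feature_i`, and `saved` are hypothetical, and the data are random noise generated on the spot, not WALS or WVS data.

```python
import numpy as np


def r_squared(x, y):
    """R^2 of a one-predictor least-squares fit of y on x (with intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A constant column cannot explain any variance; avoid dividing by zero.
    if x.std() == 0 or y.std() == 0:
        return 0.0
    r = np.corrcoef(x, y)[0, 1]
    return float(r * r)


def share_outperformed(features, outcome, target):
    """Return (fraction of the other features with lower R^2 than target, all scores)."""
    scores = {name: r_squared(col, outcome) for name, col in features.items()}
    others = [score for name, score in scores.items() if name != target]
    beaten = sum(score < scores[target] for score in others)
    return beaten / len(others), scores


# Synthetic illustration only: 200 made-up binary "language features" and a
# made-up savings answer, all independent coin flips (assumption, not real data).
rng = np.random.default_rng(0)
n_respondents = 500
features = {f"feature_{i}": rng.integers(0, 2, n_respondents) for i in range(200)}
saved = rng.integers(0, 2, n_respondents)

share, scores = share_outperformed(features, saved, "feature_0")
print(f"feature_0 fits better than {share:.1%} of the other features")
```

Because every column here is noise, where `feature_0` lands in the ranking is a matter of luck. That is why the percentile matters: a variable that beats 99% of the alternatives is harder to dismiss than one that sits in the middle of the pack, though the ranking alone does not settle causation.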