Mindless Researching

I enjoyed Brian Wansink's book Mindless Eating; it was well written and filled with creative experiments like the ever-filling soup bowl. In the ten years since, Wansink became not just a media star but an academic star, with an h-index of 75 and over 24,000 citations. In recent years, however, he has had to retract papers in light of inconsistencies and questions about his data and statistics.

A Buzzfeed article, based in part on emails, now reveals that Wansink was running a brazen p-hacking factory:

The correspondence shows, for example, how Wansink coached Siğirci to knead the pizza data.

First, he wrote, she should break up the diners into all kinds of groups: “males, females, lunch goers, dinner goers, people sitting alone, people eating with groups of 2, people eating in groups of 2+, people who order alcohol, people who order soft drinks, people who sit close to buffet, people who sit far away, and so on…”

Then she should dig for statistical relationships between those groups and the rest of the data: “# pieces of pizza, # trips, fill level of plate, did they get dessert, did they order a drink, and so on…”

…“Work hard, squeeze some blood out of this rock, and we’ll see you soon.”…All four of the pizza papers were eventually retracted or corrected.

In essence, Wansink all but published a study finding green jelly beans cause acne. All hail XKCD.
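To see why this kind of dredging manufactures findings, here is a minimal simulation sketch in Python (the counts of groups and outcomes are made-up stand-ins for the sort of splits described above, not Wansink's actual data). It draws pure noise for a few hundred diners, runs a t-test for every subgroup-by-outcome combination, and counts how many clear p < 0.05 by chance alone.

    # Pure noise: no real relationships exist, so every "significant" result is a false positive.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 400                        # diners
    n_groups, n_outcomes = 12, 8   # e.g. male/female, lunch/dinner, alone, ... x slices, trips, ...

    groups = rng.integers(0, 2, size=(n_groups, n)).astype(bool)   # random binary group labels
    outcomes = rng.normal(size=(n_outcomes, n))                    # random continuous outcomes

    hits = 0
    for g in groups:
        for y in outcomes:
            _, p = stats.ttest_ind(y[g], y[~g])
            hits += p < 0.05
    print(f"{hits} of {n_groups * n_outcomes} comparisons 'significant' at p < 0.05, from pure noise")
    # Expect roughly 5% of the 96 tests (about 5) to clear the bar by luck alone.

Squeeze the rock hard enough and it always bleeds; that is the whole trick.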

Comments

What's amazing is that the whole thing started to unravel when Wansink himself published a blog post in which he candidly described how he made his underlings dredge datasets for "interesting findings." He seemed to be genuinely statistically illiterate: he didn't seem to know that sampling error exists or that there was anything really wrong with what he was doing.

Never heard of this guy but another possibility is that he, like Lance Armstrong, got sick of the whole charade and decided to confess.

No, he's been fighting back all the time.

So did Lance Armstrong - up to winning a lawsuit, before losing again.

'Disgraced cyclist Lance Armstrong has reached an agreed settlement with The Sunday Times, which last year accused him of deceit and sued him for more than £1 million in relation to a libel action he had brought against it in 2004, winning a six figure sum in damages.

The British newspaper revealed last December it was suing Armstrong in the wake of the United States Anti Doping Agency (USADA) ruling that he had doped his way to seven Tour de France victories.

Armstrong’s original libel action against the newspaper was based on a 2004 article written by the newspaper’s chief sports writer, David Walsh, who had first raised suspicions over Armstrong’s performances in an article in July 1999, the year he won his first Tour de France.

Walsh would go on to co-write, with French journalist Pierre Ballester, the book LA Confidentiel, published in 2004 and which included allegations against Armstrong from people who had been close to him including his former masseur, Emma O’Reilly.

The claims made against Armstrong in the book, which for legal reasons has never been published in the UK, formed the basis of a Sunday Times article published in June the same year. Armstrong sued Walsh and the newspaper’s then deputy sport editor, Alan English, for libel.

According to The Sunday Times [£], in his lawsuit, Armstrong “strenuously” denied the allegations, maintained the article depicted him as “a fraud, a cheat and a liar,” and said Walsh “had an agenda” against him.

At a pre-trial hearing two years later, Mr Justice Gray rejected the newspaper’s claim that the article provided “reasonable or strong grounds” to suspect that Armstrong had been doping.

Instead, the judge agreed with Armstrong’s insistence that the only inference to be drawn from it was that he had been using performance enhancing drugs throughout his career.

The newspaper says that faced with the difficulty of having to prove that, it had no option but to agree a settlement with Armstrong, and paid him £300,000.' http://road.cc/content/news/90878-lance-armstrong-settles-%C2%A31-million-sunday-times-lawsuit

I am shocked that anyone would use science and statistics to prove something that was not true but satisfied their own bias. Shocked, I tell you!

+1. This really is how it reads.

I imagine this is something close to standard operating procedure for a lot of the soft sciences.

If only it were restricted to the soft sciences.

For example, climate science - boy, have their predictions concerning things like Arctic sea ice coverage been ludicrous compared to the last couple of decades' worth of satellite data. Talk about being far too in love with models saying that Arctic sea ice coverage would not noticeably shrink before something like 2030 or 2050. A real disgrace for climate science when it comes to comparing reality as seen through empirical data to such hopelessly flawed models. Makes one wonder if they even know what 'albedo' means.

Thankfully, real scientists continue to collect empirical real-time and near-real-time data on polar sea ice coverage, thus showing just how incompetent those climate science predictions were.

Actually, the Arctic ice was predicted to be completely gone by several years ago.

But I thank you for trying to be excited about one prediction that you think they got right, while completely ignoring the hundreds they got wrong.

This is called Proving The Other Guy's Point, and it's hilarious.

The irony is delicious.

'Actually, the Arctic ice was predicted to be completely gone by several years ago.'

Have a link? Here is the IPCC - 'In a more recent study, there is good agreement between Arctic sea-ice trends and those simulated by control and transient integrations from the Geophysical Fluid Dynamics Laboratory (GFDL) and the Hadley Centre (see Figure 16-6). Although the Hadley Centre climate model underestimates sea-ice extent and thickness, the trends of the two models are similar. Both models predict continued decreases in sea-ice thickness and extent (Vinnikov et al., 1999), so that by 2050, sea-ice extent is reduced to about 80% of area it covered at the mid-20th century.' http://www.ipcc.ch/ipccreports/tar/wg2/index.php?idp=605

Which will be hopelessly wrong, by the way, by several decades. This is just a snapshot of September minimum extent in 4 periods, from 1850 to 2013 simply as illustration - http://nsidc.org/arcticseaicenews/files/2018/01/Figure5b-2.png Of course, the best data is from 1979 to now, based on satellite observation. Notice the difference between 1995 and 2012. 2012 is unusually low, but then the top 10 September minimum extents start from 2007, with only 2009 not being in that top ten list.

But you are welcome to look at the empirical data yourself, of course. Or just read the monthly summary at http://nsidc.org/arcticseaicenews/

You would think that people that feel that climate science is a flawed endeavor would be more than happy to have their beliefs confirmed through actual data. Unless, of course, one is not actually interested in empirical data.

Qui s'excuse s'accuse. (He who excuses himself accuses himself.)

There is no accusing or excusing going on - the various IPCC predictions are hopelessly wrong, compared to satellite data. If one is interested in using empirical data to advance the understanding of our world, what is going on in the Arctic is a clear example of how poorly our best climate models function (certainly concerning that region), clearly demonstrating how completely inadequate they are.

Anyone who is truly interested in pointing out the current flaws in those models compared to empirical data would be highlighting these flaws as a concrete example of how climate science needs vast improvement. The data does not care about anyone's opinion, after all.

A Google search on "wansink" and site or domain "dailymail.co.uk" gets 98 hits. Perhaps that should be used as an inverse indicator of research credibility.

Andrew Gelman's Statistical Modeling blog (MR links to it) has been on to this guy for years.

http://andrewgelman.com/

No, the first post on Wansink is from December 2016, in relation to the above-mentioned blog post: http://andrewgelman.com/2016/12/15/hark-hark-p-value-heavens-gate-sings/

Tim van der Zee and co-authors are doing the heavy lifting here: http://www.timvanderzee.com/the-wansink-dossier-an-overview/

Legal advice for economists considering p-hacking from one of the best contributions to the Examples of Junk Science series: Antitrustworthy Analysis: "Don’t p-hack. It wastes time, it will be vulnerable to cross-examination and it undermines the legitimacy of the valuable contributions that economics can make to a case."

Here's the link to the entire series, where material covered by the "Ignoring Inconsistencies" contribution would also be very relevant to the emerging story of Wansink's questionable analytical methods. These aren't things that happened 10 years ago in the world of food and nutrition research; they are happening in economics and other sciences today.

Thanks! Good link. That comic in the first link ("try to grab the 84...") is exactly true. Way too common a methodology, but I totally understand why it happens.

The common term is "we tortured the data until it confessed".

And there are also some cases of mindless lack of research. For instance, bank regulators, when determining their risk-weighted capital requirements for banks, never researched which assets were dangerous to the banking system; they just went by how risky the assets were in general.
http://perkurowski.blogspot.com/2016/04/here-are-17-reasons-for-why-i-believe.html

How mathematically literate are grad students in the social and behavioral sciences? This is pure conjecture on my part, but my suspicion is that they've had one class on statistics, probably at the undergrad level, with no very high degree of comprehension required to pass; and after that, they've plugged their numbers into the computer and accepted whatever p-value it puts out, with no notion of which programs and which parameters should be used under which circumstances.

It might be interesting to assemble a roomful of sociology researchers, take away their devices, and give them a test on basic undergrad probability. What fraction of them could come up with the correct answer to, say, a basic Bayes' Theorem problem?
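For what it's worth, the sort of question meant here is the classic screening-test problem. A minimal worked sketch, with made-up numbers (1% prevalence, 90% sensitivity, 9% false positive rate), where the intuitive answer of "90%" is far off:

    # Basic Bayes' theorem exercise; all numbers are illustrative, not from any real test.
    prevalence = 0.01           # P(condition)
    sensitivity = 0.90          # P(test positive | condition)
    false_positive_rate = 0.09  # P(test positive | no condition)

    p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    p_condition_given_positive = sensitivity * prevalence / p_positive
    print(f"P(condition | positive test) = {p_condition_given_positive:.2f}")  # ~0.09, not 0.90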

ScienceNews.org had an article (maybe even a couple) a few years back suggesting that it's a lot more than just the social scientists that have problems with statistics. I think this is it but don't have the subscription to check now: https://www.sciencenews.org/article/odds-are-its-wrong

In addition to the problem of journals pushing immediate results that look good, while not caring much about failures to replicate or about doing the types of studies needed to actually apply the statistical analysis properly (meaning more than just one data set), it pointed out that many researchers have just gotten the stat packages and don't know what the tool is actually doing. Hence misapplying concepts. Apparently too many researchers think a 95% confidence interval means their results are true with a 95% level of confidence.
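To make the confidence-interval point concrete, here is a minimal sketch (assuming a normal population with known spread and the usual z-interval): the "95%" describes the long-run fraction of intervals, over repeated samples, that contain the true parameter; it is not the probability that any one published result is true.

    # What "95% confidence" actually refers to: coverage over repeated sampling.
    import numpy as np

    rng = np.random.default_rng(1)
    true_mean, sigma, n, reps = 10.0, 2.0, 25, 10_000

    covered = 0
    for _ in range(reps):
        sample = rng.normal(true_mean, sigma, n)
        half_width = 1.96 * sigma / np.sqrt(n)   # known-sigma z-interval, for simplicity
        covered += (sample.mean() - half_width) <= true_mean <= (sample.mean() + half_width)
    print(f"coverage = {covered / reps:.3f}")    # comes out close to 0.95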

I don't think mathematical literacy has a lot to do with it.

This kind of thing shouldn't pass an honest person's smell test, at least if they have had even one statistics class. And, by the way, that first statistics class ought to cover the point. Maybe there is too much emphasis on calculations and not enough on the underlying logic of what is going on. Come to think of it, there's no "maybe" to it.

"ScienceNew.org had an (maybe even a couple) article a few years back suggesting that it’s a lot more than just the social scientists that have problems with statistics."

Well yes, but we don't need a citation to notice that the same p-hacking problem is common in medical research, health and nutrition research, and to a certain extent epidemiology.

All of these fields including the social sciences have a variety of methods to try to deal with the problem, but usually the best methods are complex and themselves prone to mis-use, and the researcher has to be honest enough to use them rather than pretending that the results were obtained without resorting to p-hacking, data-fishing, etc.

Look at this article: https://www.theguardian.com/science/occams-corner/2013/sep/19/science-religion-not-be-questioned

The author is Henry Gee, a senior editor of Nature, Britain's leading science journal. You would expect him to get p-values right; it is a core part of his important job of gatekeeping science. But no, he makes the basic error:

> If this all sounds rather rarefied, consider science at its most practical. As discussed in Dr McLain's article and the comments subjacent, scientific experiments don't end with a holy grail so much as an estimate of probability. For example, one might be able to accord a value to one's conclusion not of "yes" or "no" but "P<0.05", which means that the result has a less than one in 20 chance of being a fluke. That doesn't mean it's "right"

That gets things the wrong way round. Next Henry Gee will be telling us that five twelfths is equal to 2.4 because division is commutative :-)
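To spell out the direction of the error: p < 0.05 is a statement about the probability of data at least this extreme given that there is no effect, not about the chance that the finding is a fluke. How often a "significant" result is actually a fluke depends on the base rate of true hypotheses and on power, neither of which the p-value knows. A back-of-the-envelope sketch with made-up inputs:

    # Why "p < 0.05 means a less than 1 in 20 chance of being a fluke" is backwards.
    # Illustrative, assumed inputs; the point is only that the answer is not 0.05.
    prior_true = 0.10   # assumed fraction of tested hypotheses that are actually true
    power = 0.50        # assumed P(p < 0.05 | real effect)
    alpha = 0.05        # P(p < 0.05 | no effect)

    p_significant = power * prior_true + alpha * (1 - prior_true)
    p_fluke_given_significant = alpha * (1 - prior_true) / p_significant
    print(f"P(fluke | p < 0.05) = {p_fluke_given_significant:.2f}")  # ~0.47 here, not 0.05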

Are grad students more mathematically literate than the senior figures who gatekeep their careers? Since this is an economics blog, we should think about incentives. Will mathematical literacy help or hinder them in getting published in Nature?

It is sad how Mr. Wansink is being persecuted by minions of a dead orthodoxy. Meanwhile, America faces an unprecedented obesity crisis.

Hahah. Is that you, Wansink?

No, I am neither Mr. Wansink nor do I have any relationship with his seminal research. But it is sad how Big Press' scandalmongering rags are striving to destroy a straightaway researcher.

"Straightaway researcher". Hahaha. Whatever Wansink. Only the accused or someone equally inept at understanding science could/would make such obtuse statements.

He is being accused of what? Having found variables that are related, a.k.a. doing his job. Meanwhile, Big Fast Food shortens American lives.

Finding variables that are related, making data up. 6 one way, 1/2 dozen the other. But whatever it takes to blame "Big Fast Food" for personal choices, right?

I mean, drug dealers are just entrepreneurs...

1) Yes, drug dealers are entrepreneurs. 2) You decided that you couldn't defend the whole "making up data" thing, so you changed the subject. I'm sure it was the personal choice part of my comment that you disagreed with. But having grown up in the South (not known for its healthy eating habits, independent of the actions of those evil corporations) and on fast food, I am somehow able both to avoid consuming excessive amounts of "bad" food and to maintain a simple exercise regimen to keep myself relatively fit and healthy. Much like someone who is addicted to heroin could have... you know... not taken heroin to begin with.

You did not read the word "just". Or pretended you did not read. You are shilling for Big Fast Food. Again, all the emails show is that Mr. Wansink broke some formal protocols. He committed no wrongdoing whatsoever.

I guess presenting data from 3 to 5 year olds as if it were generated from 8 to 11 year olds, and presenting estimates with p-values that could not have been generated by the data, are now just frowned upon rather than being signs of fraud... but whatever. Also, no one is "just" one thing. I'm sure some drug dealers are guitar players. What else were you suggesting when saying they were "just" entrepreneurs?

Just in case you wanted a citation: https://link.springer.com/epdf/10.1186/s40795-017-0167-x?author_access_token=6_HKDpWIJnaRft4Kj0ZUym_BpE1tBhCbnbw3BuzI2RO6e1hXIXeKsXK1dOJH8WHgQJBQ6VxioV_dX3PEK6QG4ctt6O8mLtK6oc-XRIZJOy1DRyuQr-uJx8SLo9UL1EBYWrTjlh7SbE_tu5yojCAyLA%3D%3D

Collecting data costs a lot of money.

p-hacking is simply increasing productivity.

That makes it better, according to economists.

How many times has one heard or read a statement by a social scientist that "I go where the data take me"? Of course, the intent is to convey an absence of bias, because the researcher doesn't go looking for data that confirms a result already predicted. Ironically, "hypothesis first" would suggest bias to most people, not the best practice for conducting research. I'm an advocate, so I go looking for authority that confirms a legal result (the "hypothesis") that is best for my client. The difference between unbiased research and advocacy social science may be clear to Tabarrok and Cowen, but not to me.

The new book about Peter Thiel and Gawker, and the ongoing dispute between Thiel and Gawker (there's a possible claim against Thiel for tortious interference), have revealed some interesting details, including the effort to select overweight women for the jury in the lawsuit by Hulk Hogan against Gawker that Thiel funded, after several mock trials indicated that overweight women were more likely to punish a web site like Gawker that discloses personal and salacious material about people, since overweight women feel that they are the subject of such personal and salacious disclosures. Think about that: Thiel won his lawsuit in large part because of overweight women, women who, fortunately for Thiel, didn't follow the advice of Mr. Wansink to avoid binge pizza eating. https://www.thedailybeast.com/peter-thiel-got-his-revenge-on-gawker-he-may-yet-regret-it

Can we prove that Tyler really HAS visited all the gas station tacquerias of Fairfax County?

Well, since his best recommendation was one in Maryland, that may not be a particularly relevant question. http://marginalrevolution.com/marginalrevolution/2011/03/gas-station-tacos.html

'R&R Tacqueria, 7894 Washington Blvd. (Rt.1); 410-799-0001, Elkridge, Maryland, 13 minutes north of the 495/95 intersection, look for the Shell sign.

This tacqueria is in a gas station, with two small counters and three chairs to sit on. It is the best huarache I have eaten, ever, including in Mexico. It is the best chile relleno I've had in the United States, ever. They serve among the best Mexican soups I have had, ever, and I have been to Mexico almost twenty times. I recommend the tacos al pastor as well. At first Yana and Natasha were skeptics ("Sometimes you exaggerate about food") but now they are converts and the takeaways have vanished. They even sell Mexican Coca-Cola and by the way the place is quite clean and nice, albeit cramped.

The highly intelligent proprietor is a former cargo pilot from Mexico City and speaks excellent English. The restaurant is called R&R after the names of his two sons.'

Though it does seem as if the smarter cooks are not always drawn to strip malls, in reference to a more recent post.

You would think, if he is so smart, he could have given his sons full names, not just a single letter.

I'm OK with the single letter, but the SAME letter? How do they know which one he's calling for?

He could have appended numbers. The marginalization of numbers at the hands of letters continues apace.

Maybe he numbers them like George Foreman did with his boys, all named George.

Isn't Elkridge more than 13 minutes north of the beltway? Inaccurate geographically and the quality assessment probably isn't reliable, either.

"I have been to Mexico almost twenty times": a nation cheers.

The restaurant has moved to a restaurant sized space in the adjacent strip mall at 7840 Washington Blvd. Elkridge, MD. It's still excellent.

What's described here is literally the best practice in polling and survey-based market research, in general.

I remember that in my university's Department of Finance in the early 1990s, the academics used to snidely call p-hacking "data mining".

"Data mining" was a really good phrase, but around maybe 15 or 20 years ago computer science types (I don't think the term "data scientist" had become widespread yet) started popularizing semi-legitimate techniques that they called data mining. So now I call it "data fishing" instead.

And in the last few years, the phrase "data mining" has been eclipsed by "machine learning". There's some good work being done, but there's a limit to how far you can go with pure empiricism and no underlying theory. Having tens of millions of observations and tens of thousands of variables doesn't help if the variables lack explanatory power, the data cover only a few years of observations, and the researcher doesn't know about endogeneity. (The many observations do however enable the researcher to do cross-validation, a clear improvement over within-sample standard error estimation.)

The Google Flu Trends saga is perhaps the perfect example. Google thought it had invented a superior technique for early detection of flu epidemics. Turns out they hadn't, they just got lucky one year. Which doesn't mean that their work should be discarded; additional tools are always welcome. But machine learning isn't going to solve the conundrums and complexities of social science research.
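On the cross-validation point above, a minimal sketch of the difference between in-sample fit and out-of-sample fit (synthetic data with pure-noise predictors, so any apparent in-sample fit is overfitting; names and sizes are arbitrary):

    # In-sample R^2 vs. cross-validated R^2 when the predictors are pure noise.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 50))   # 50 noise "predictors"
    y = rng.normal(size=200)         # outcome unrelated to X

    model = LinearRegression().fit(X, y)
    print("in-sample R^2:", round(model.score(X, y), 3))                          # looks respectable
    print("5-fold CV R^2:", round(cross_val_score(model, X, y, cv=5).mean(), 3))  # near zero or negative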

Here's an article in the NYT about a study conducted at the Mind and Body Lab finding that "exercise beliefs" affect both waistlines and life span: https://www.nytimes.com/2018/02/22/well/move/how-our-beliefs-can-shape-our-waistlines.html

"How mathematically literate are grad students in the social and behavioral sciences? This is pure conjecture on my part, but my suspicion is that they’ve had one class on statistics, probably at the undergrad level, with no very high degree of comprehension required to pass; and after that, they’ve plugged their numbers into the computer and accepted whatever p-value it puts out, with no notion of which programs and which parameters should be used under which circumstances."

"Pure conjecture on my part" X "my suspicion" X "probably" - from someone who wishes to denigrate the mathematical literacy of others.

Have you ever noticed that books that have Ph.D. after the author's name on the cover are all dubious? Is it just me or does everyone know this?

I wouldn’t have a problem if all this guy did was send his minions off to gather lots of data and then look through it to see if they could find some interesting relationship. The problem is that he claimed that what they found is true.

I haven’t read his stuff but if I understand correctly, he would say something like: “we found that if x happens (all you can eat pizza), then people do Y (eat too much pizza). So if you don’t want Y, don’t allow X.”

What he should have said is “Hmm…we found X and Y are related. That’s interesting. Let’s see if we can construct a logical theory to explain why that might be true. If we get that far, then we’ll go do lots of other tests to see if we can falsify that hypothesis.”

What is it with Buzzfeed getting the scoops lately? What are they doing well, suddenly?

@Alex - here's an idea for a follow-up post that I'd love to see you write: exploring what it would look like if we lived in a world where green jelly beans *do* cause acne. How would we find out? How would we know?

(In my mind, the sin was not searching for non-preregistered relationships; the sin was using naive p-values to report confidence in those relationships. Certainly the relationships are worth exploring, and certainly we might find scientific truth where we did not know to look for it. This is a distinction worth clarifying, imo.)
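One concrete way to report such exploratory relationships more honestly (a sketch, not a substitute for preregistration or replication) is to adjust the p-values for the number of comparisons made, for example with a Benjamini-Hochberg false-discovery-rate correction, and report the adjusted values. The raw p-values below are illustrative placeholders for a batch of subgroup tests:

    # Adjusting a batch of exploratory p-values for multiple comparisons (Benjamini-Hochberg).
    from statsmodels.stats.multitest import multipletests

    raw_p = [0.003, 0.02, 0.04, 0.049, 0.11, 0.20, 0.35, 0.50, 0.70, 0.90]  # illustrative
    reject, adjusted_p, _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")
    for p, q, r in zip(raw_p, adjusted_p, reject):
        print(f"raw p = {p:.3f} -> adjusted p = {q:.3f}  keep: {bool(r)}")
    # Most of the raw p < 0.05 "findings" do not survive the correction.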

Comments for this post are closed