1. John Cleese on PC and wokeness. I think the first comment is satire rather than serious, but one can’t be entirely sure these days. The best-known Monty Python episodes these days are entirely acceptable, but some of the now lesser-known works are pretty…out there.
2. “Meanwhile, for-profit companies charge schools thousands of dollars for the training, making the active shooter drill industry worth an estimated $2.7 billion — “all in pursuit of a practice that, to date, is not evidence-based,” according to the researchers.” Link here.
4. Why don’t coaches/managers adjust more? A parable from the NBA, but with much broader applicability. Note that sometimes the star player is the problem too.
The FDA may be too conservative but it does subject new pharmaceuticals to real scientific tests for efficacy. In contrast, many medical and surgical procedures have not been tested in randomized controlled trials. Moreover, dental care is far behind medical care in demanding scientific evidence of efficacy. A long read in The Atlantic spends far too much time on a single case of egregious dental fraud, but its larger point is correct:
Common dental procedures are not always as safe, effective, or durable as we are meant to believe. As a profession, dentistry has not yet applied the same level of self-scrutiny as medicine, or embraced as sweeping an emphasis on scientific evidence.
…Consider the maxim that everyone should visit the dentist twice a year for cleanings. We hear it so often, and from such a young age, that we’ve internalized it as truth. But this supposed commandment of oral health has no scientific grounding. Scholars have traced its origins to a few potential sources, including a toothpaste advertisement from the 1930s and an illustrated pamphlet from 1849 that follows the travails of a man with a severe toothache. Today, an increasing number of dentists acknowledge that adults with good oral hygiene need to see a dentist only once every 12 to 16 months.
The joke, of course, is that there’s no evidence for the 12 to 16 month rule either. Still, give credit to Ferris Jabr for mentioning that the case for fluoridation is also weak by modern standards. Questioning fluoridation has been taboo in American society since anti-fluoridation activists were branded as far-right conspiracy theorists in the 1950s.
The Cochrane organization, a highly respected arbiter of evidence-based medicine, has conducted systematic reviews of oral-health studies since 1999….most of the Cochrane reviews reach one of two disheartening conclusions: Either the available evidence fails to confirm the purported benefits of a given dental intervention, or there is simply not enough research to say anything substantive one way or another.
Fluoridation of drinking water seems to help reduce tooth decay in children, but there is insufficient evidence that it does the same for adults. Some data suggest that regular flossing, in addition to brushing, mitigates gum disease, but there is only “weak, very unreliable” evidence that it combats plaque. As for common but invasive dental procedures, an increasing number of dentists question the tradition of prophylactic wisdom-teeth removal; often, the safer choice is to monitor unproblematic teeth for any worrying developments. Little medical evidence justifies the substitution of tooth-colored resins for typical metal amalgams to fill cavities. And what limited data we have don’t clearly indicate whether it’s better to repair a root-canaled tooth with a crown or a filling. When Cochrane researchers tried to determine whether faulty metal fillings should be repaired or replaced, they could not find a single study that met their standards.
As always, note that the descriptions are mine and reflect my priorities, as the self-descriptions of the applicants may be broader or slightly different. Here goes:
Michelle Rorich, for her work in economic development and Africa, to be furthered by a bike trip from Cairo to Cape Town.
Jeffrey C. Huber, to write a book on tech and economic progress from a Christian point of view.
Mayowa Osibodu, building AI programs to preserve endangered languages.
David Forscey, travel grant to look into issues and careers surrounding protection against election fraud.
Jennifer Doleac, Texas A&M, to develop an evidence-based law and economics, crime and punishment podcast.
Fergus McCullough, University of St. Andrews, travel grant to help build a career in law/history/politics/public affairs.
Justin Zheng, a high school student working on biometrics for cryptocurrency.
Kyle Eschen, comedian and magician and entertainer, to work on an initiative for the concept of “steelmanning” arguments.
Here is the first cohort of winners, and here is the second cohort. Here is the underlying philosophy behind Emergent Ventures. Note by the way, if you received an award very recently, you have not been forgotten but rather will show up in the fourth cohort.
Scientific output is not a linear function of amounts of federal grant support to individual investigators. As funding per investigator increases beyond a certain point, productivity decreases. This study reports that such diminishing marginal returns also apply for National Institutes of Health (NIH) research project grant funding to institutions. Analyses of data (2006-2015) for a representative cross-section of institutions, whose amounts of funding ranged from $3 million to $440 million per year, revealed robust inverse correlations between funding (per institution, per award, per investigator) and scientific output (publication productivity and citation impact productivity). Interestingly, prestigious institutions had on average 65% higher grant application success rates and 50% larger award sizes, whereas less-prestigious institutions produced 65% more publications and had a 35% higher citation impact per dollar of funding. These findings suggest that implicit biases and social prestige mechanisms (e.g., the Matthew effect) have a powerful impact on where NIH grant dollars go and the net return on taxpayers investments. They support evidence-based changes in funding policy geared towards a more equitable, more diverse and more productive distribution of federal support for scientific research. Success rate/productivity metrics developed for this study provide an impartial, empirically based mechanism to do so.
Policy analysts at the Centers for Disease Control and Prevention in Atlanta were told of the list of forbidden words at a meeting Thursday with senior CDC officials who oversee the budget, according to an analyst who took part in the 90-minute briefing. The forbidden words are “vulnerable,” “entitlement,” “diversity,” “transgender,” “fetus,” “evidence-based” and “science-based.”
That’s the WaPo piece everyone is abuzz about. A few observations:
1. This story may well be true, but I’d like more than “…according to an analyst who took part in the 90-minute briefing.” Here is another account of what exactly is known. Wasn’t “not publishing the article until it is better sourced” the evidence-based thing to do?
2. I don’t have a great fondness for the terms “evidence-based” or “science-based.” When they are used on MR, it is often as a form of third-person reference or with a slight mock or ironic touch. When I see the words used by others, my immediate reaction is to think someone is deploying them selectively, without complete self-awareness, or as a bullying tactic, in lieu of an actual argument, or as a way of denying how much their own argument depends on values rather than science. I wouldn’t ban the words for anyone working for me, but seeing them often prompts my editor’s red pen, so to speak. The most er…evidence-based people I know don’t use the term so much, least of all with reference to themselves.
3. In any case, the suggested replacement phrase — “CDC bases its recommendations on science in consideration with community standards and wishes” — I do not find offensive or anti-science, and I can imagine a plausible case that it is an actual improvement. Science is (ought to be) value-free, yet CDC and more broadly federal policy should embody values too. It should not think of itself as “the handmaiden of science.”
4. There is a fine line between “censorship” and “a bureaucratic organization which can be badly damaged by individual freelancing deciding to adopt uniform terminologies.” I don’t doubt both might be going on here, but I’d like to see the extant Twitter takes show a little more subtlety on the broader point. Don’t forget that the executive branch of government reports to the…executive; it is not a freestanding committee for debate, however much it might sometimes like to imagine otherwise.
5. The word “diversity” usually isn’t specific enough, or is channeling unstated preconceptions about how diversity should be interpreted. We should improve our use of this word. I have similar feelings about “vulnerable.”
6. People react to changes rather than levels.
7. “Fetus” — look, it is fine to disagree with the “pro-life view” (I’m not even sure what is the most neutral way of labeling it). But is banning the use of the word “fetus” in institutional documents censorship? What if an employee, during the Obama years, in an official CDC release had referred to a “fetus” as a “child”? Would that have been changed back to fetus? I am inclined to say yes. Is it censorship in only one direction, or are both decisions censorship? Or is this better seen as a disagreement over matters of fact? A disagreement over values? I am genuinely unsure, and I am unsure what a majority of the American public would think. But I would say this is sooner worth a ponder than a rant.
8. If nothing else, Sam Altman can show up in China, post “here is my vulnerable entitlement diversity transgender fetus, who is evidence-based and science-based” on his Weibo account, and then go order some Chairman Mao’s braised pork belly.
9. What are the forbidden words in other parts of the federal government, whether de jure or de facto? Will anyone be showing us a list? Or is that list censored too?
We must have Drug Abuse Resistance Education…I am proud of your work. It has played a key role in saving thousands of lives and futures.
Speaking at the 30th DARE Training Conference, Attorney General Jeff Sessions was enthusiastic and strongly supportive of DARE, the program started in Los Angeles in 1983 that uses police officers to give young children messages about staying drug free and resisting peer pressure.
D.A.R.E. is listed under “What doesn’t work?” on our Review of the Research Evidence.
Rosenbaum summarized the research evidence on D.A.R.E. by titling his 2007 Criminology and Public Policy article “Just say no to D.A.R.E.” As Rosenbaum describes, the program receives over $200 million in annual funding, despite little or no research evidence that D.A.R.E. has been successful in reducing adolescent drug or alcohol use. As Rosenbaum (2007: 815) concludes “In light of consistent evidence of ineffectiveness from multiple studies with high validity, public funding of the core D.A.R.E. program should be eliminated or greatly reduced. These monies should be used to fund drug prevention programs that, based on rigorous evaluations, are shown to be effective in preventing drug use.”
A systematic review by West and O’Neal (2004) examined 11 published studies of D.A.R.E. and reached similar conclusions. D.A.R.E. has little or no impact on drug use, alcohol use, or tobacco use. They concluded that “Given the tremendous expenditures in time and money involved with D.A.R.E., it would appear that continued efforts should focus on other techniques and programs that might produce more substantial effects” (West & O’Neal, 2004: 1028).
Recent reformulations of the D.A.R.E. program have not shown successful results either. For example, the Take Charge of your Life program, delivered by D.A.R.E. officers, was associated with significant increases in alcohol and cigarette use by program participants compared to a control group (Sloboda et al., 2009).
That is my latest Bloomberg column; hardly anyone has a consistent and evidence-based view on this deal. Here is one bit:
Critics who dislike Monsanto for its leading role in developing genetically modified organisms and agricultural chemicals shouldn’t also be citing monopoly concerns as a reason to oppose the merger — that combination of views doesn’t make sense. Let’s say for instance that the deal raised the price of GMOs due to monopoly power. Farmers would respond by using those seeds less, and presumably that should be welcome news to GMO opponents.
Yet on the other side:
What does Bayer hope to get for its $66 billion, $128-a-share offer? The company has argued that it will be able to eliminate some duplicated jobs and expenses, negotiate better deals with suppliers and invest more funds in research and development. Maybe, but the broader reality is less cheery. There is a well-known academic literature, dating to the early 1990s, showing that acquiring firms usually decline in value after tender offers, especially after the biggest deals. Mergers do not seem to make companies more valuable or efficient.
The whole Bayer-Monsanto case is a classic example of how a vociferous public debate can disguise or even reverse the true issues at stake. If Bayer fails to close the deal for Monsanto, Bayer shareholders may be the biggest winners. The biggest losers from a failed deal may be its opponents, who will spend the rest of their lives in a world where misguided judgments of corporate popularity have increasing sway over laws and regulations.
Do read the whole thing.
A strongly worded letter from Krueger, Goolsbee, Romer and Tyson to Sanders and his economic team chastising them for unrealistic, unscientific numbers. (No indent).
Dear Senator Sanders and Professor Gerald Friedman,
We are former Chairs of the Council of Economic Advisers for Presidents Barack Obama and Bill Clinton. For many years, we have worked to make the Democratic Party the party of evidence-based economic policy. When Republicans have proposed large tax cuts for the wealthy and asserted that those tax cuts would pay for themselves, for example, we have shown that the economic facts do not support these fantastical claims. We have applied the same rigor to proposals by Democrats, and worked to ensure that forecasts of the effects of proposed economic policies, from investment in infrastructure, to education and training, to health care reforms, are grounded in economic evidence. Largely as a result of efforts like these, the Democratic party has rightfully earned a reputation for responsibly estimating the effects of economic policies.
We are concerned to see the Sanders campaign citing extreme claims by Gerald Friedman about the effect of Senator Sanders’s economic plan—claims that cannot be supported by the economic evidence. Friedman asserts that your plan will have huge beneficial impacts on growth rates, income and employment that exceed even the most grandiose predictions by Republicans about the impact of their tax cut proposals.
As much as we wish it were so, no credible economic research supports economic impacts of these magnitudes. Making such promises runs against our party’s best traditions of evidence-based policy making and undermines our reputation as the party of responsible arithmetic. These claims undermine the credibility of the progressive economic agenda and make it that much more difficult to challenge the unrealistic claims made by Republican candidates.
Alan Krueger, Princeton University
Chair, Council of Economic Advisers, 2011-2013
Austan Goolsbee, University of Chicago Booth School
Chair, Council of Economic Advisers, 2010-2011
Christina Romer, University of California at Berkeley
Chair, Council of Economic Advisers, 2009-2010
Laura D’Andrea Tyson, University of California at Berkeley Haas School of Business
Chair, Council of Economic Advisers, 1993-1995
Research shows that evidence-based algorithms more accurately predict the future than do human forecasters. Yet, when forecasters are deciding whether to use a human forecaster or a statistical algorithm, they often choose the human forecaster. This phenomenon, which we call algorithm aversion, is costly, and it is important to understand its causes. We show that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster. This is because people more quickly lose confidence in algorithmic than human forecasters after seeing them make the same mistake. In five studies, participants either saw an algorithm make forecasts, a human make forecasts, both, or neither. They then decided whether to tie their incentives to the future predictions of the algorithm or the human. Participants who saw the algorithm perform were less confident in it, and less likely to choose it over an inferior human forecaster. This was true even among those who saw the algorithm outperform the human.
People who defer to the algorithm will outperform those who don’t, at least in the short run. In the long run, however, will reason atrophy when we defer, just as our map-reading skills have atrophied with GPS? Or will more of our limited resource of reason come to be better allocated according to comparative advantage?
1. “Ekki staðalbúnaður í smalamennsku!” (Icelandic, roughly “Not standard equipment for sheep herding!”) With video, of course, and implying the advantages of water transport.
3. Steven Landsburg makes some good points, but Summers may be able to invoke threshold effects.
4. Harvard faculty actually seem to hate the best parts of Obamacare. Bravo to this article. And quick summaries of evidence-based medicine.
5. “I’m the poster child of evil [art] speculation…” An excellent piece, also NYT.
6. How big is the sexism problem in economics? Kimball and anon.
Treasury Secretary Jacob J. Lew has strongly urged federal agencies to finish writing the Volcker Rule by the end of the year — more than a year after they had been expected to do so — and President Obama recently stressed the importance of the deadline.
By the way, five (!) agencies are writing the rule, which is not a good sign. As for the Volcker rule more generally, here are a few points:
1. If restricting activity X makes large banks smaller, that will ease the resolution process, following a financial crack-up. That is a definite plus, although we do not know how much easier resolution will be.
2. It is not clear that banning bank proprietary trading will lower the chance of such a financial crack-up. The overall recent record of real estate lending is not a good one, and as Edward Conard pointed out, restricting banks to the long side of transactions is not obviously a good idea. I do see the moral hazard issue with allowing banks to engage in the potentially risky activity of proprietary trading. Still, so far the data are suggesting that the banks which cracked up during the crisis did so because of overconfidence and hubris, not because of moral hazard problems (i.e., they still were holding lots of the assets they otherwise might have been trying to “game”).
3. There is no strong connection between proprietary trading and our recent financial crises.
3b. Today the bugaboo is “big banks” but once it was “small banks” and for a while “insufficiently diversified banks.” Maybe it really is big banks, looking forward, or maybe we just don’t know. Small banks have their problems too.
4. There is some chance that proprietary trading will be pushed to a more dangerous, harder to regulate corner of our financial institutions.
5. There is some danger that loopholes in the regulation itself — especially as concerns permissible client activities — may undercut the original intent of the regulation. This will depend on exactly how well the regulation is written, but past regulatory history does not make me especially confident here. And the distinction between “speculation” and “hedging” cannot be clearly defined. Should we be writing rules whose central distinctions may be arbitrary? And yet CEOs will have to sign off on compliance (with 950 pp. of regulations) personally. Is that a good use of CEO attention? Here is a good FT piece about how hard (and ambiguous) it will be to enforce the rule globally.
6. I do not myself shed too many tears over the “these markets will become less liquid without banks’ participation” critique, but I would note this is a personal judgment and the scientific status of such a claim remains unclear.
7. Many people, even seasoned commentators, approach the Volcker rule with mood affiliation, starting with how much we should resent our banks or our regulators or how we should join virtually any fight against either “big banks” or regulators. I see many analyses of this issue which spend most of their time on “mood affiliation wind-up,” as I call it, and not so much time on the actual evidence, which is inconclusive to say the least.
8. We still seem unwilling to take actions which would transparently raise the price of credit to homeowners. We instead prefer actions which appear to raise no one’s price of credit and which are extremely non-transparent in their final effects. You can think of the Volcker rule as another entry in this sequence of ongoing choices. That should serve as a warning sign of sorts, and arguably that is a more important truth than the case either for or against the rule.
When I add up all of these factors, I come closer to a “don’t do the Volcker rule” stance in my mind. The case for the rule puts a good deal of stress on #1, but overall it does not fit the textbook model of good regulation. I probably have a more negative opinion of “an extreme willingness to experiment with arbitrary regulatory stabs” than do most of the rule’s supporters, and that difference of opinion is perhaps what divides us, rather than any argument about financial regulation per se.
I really do see how the Volcker rule might work out just fine or even to our advantage. I also see the temptation of arguing “I am against big banks, this is the legislation against big banks which is on the table, so I am going to support it.” But let us at least present to our public audiences just how weak is the evidence-based case for doing this.
Addendum: You will find a different point of view from Simon Johnson here. Here is a counter to his claim that prop trading losses were significant in 2008: “Loan losses didn’t just dwarf trading losses in absolute terms. Loan losses as a share of banks’ total loan portfolios also exceeded trading losses as a share of banks’ trading accounts. Yet no one’s arguing banks should stop lending in order to protect depositors (and rightly so). In short, those expecting the Volcker Rule to be a fix-all for Wall Street’s ills have probably misplaced their hope—the rule seems like a solution desperately seeking a problem.”
I noticed you linked Robin Hanson’s article on MetaMed on Marginal Revolution. I’m the VP of research at MetaMed, and I just wanted to tell you a little bit more about us, because if all you know about us is the Overcoming Bias article you might get some misleading impressions.
Medical practice is basically a mass-produced product. Professional and regulatory bodies (like the AMA) put out guidelines for treatment. At their best, these guidelines follow the standards of evidence-based medicine, which means that on average they will produce the best health outcomes in the general population. (Of course, in practice they often fall short of that standard. For example, checklists are overwhelmingly beneficial by an evidence-based medicine standard, and yet are not universally used.)
But even at their best, the guidelines that are best from a population-health standpoint need not be optimal for an individual patient. If you have the interest and the willingness to pay, investigating your condition in depth, in the context of your entire medical history, genetic data, and personal priorities, may well turn up opportunities to do better than the standardized medical guidelines which at best maximize average health outcomes.
That’s basically MetaMed’s raison d’etre. And it’s a pretty conservative hypothesis, in fact. We may harbor a few grander ambitions (for example, I come from a mathematical background and I’m working on some longer-term projects related to algorithmically automating parts of the diagnostic process, and using machine learning principles on biochemical networks in novel ways) but fundamentally the thing we claim to be able to do is give you finer-grained information than your doctor will. We’re, of course, as yet unproven in the sense that we haven’t had enough clients to provide empirical evidence of how we improve health outcomes, but we’re not making extraordinary claims.
Robin Hanson seems to be implying that MetaMed is claiming to be useful only because we’re members of the “rationalist community.” This isn’t true. We think we’re useful because we give our clients personalized attention, because we’re more statistically literate than most doctors, because we don’t have some of the misaligned incentives that the medical profession does (e.g. we don’t have an incentive to talk up the benefits of procedures/drugs that are reimbursable by insurance), because we have a variety of experts and specialists on our team, etc.
The “rationalist” sensibility is important, to some degree, because, for instance, we’re willing to tell clients that incomplete evidence is evidence in the Bayesian sense, whereas the evidence-based medicine paradigm says that anything that yet hasn’t been tested in clinical trials and found a 5% p-value is completely unknown. For instance, we’re willing to count reasoning from chemical mechanisms as (weak) evidence. There’s a difference in philosophy between “minimize risk of saying a falsehood” and “be as close to accurate as possible”; we strive to do the latter. So there’s a sense in which our epistemic culture allows us to be more flexible and pragmatic. But we certainly aren’t basing our business model on a blanket claim of being better than the establishment just because we come from the rationalist community.
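To make the letter's point about incomplete evidence concrete, here is a minimal sketch of a Bayesian update in odds form. The prior and the likelihood ratios are made-up illustrative numbers, not anything MetaMed has published, and `posterior_probability` is a hypothetical helper name. The only claim is the arithmetic: weak evidence moves the estimate a little, strong evidence moves it a lot, and neither leaves it "completely unknown."

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability with Bayes' rule in odds form.

    likelihood_ratio = P(evidence | hypothesis) / P(evidence | not hypothesis).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Assumed numbers for illustration only: a 20% prior that a treatment helps.
prior = 0.20

# Weak evidence (e.g. reasoning from a chemical mechanism): assumed ratio 1.5.
weak = posterior_probability(prior, 1.5)

# Strong evidence (e.g. a well-run trial): assumed ratio 20.
strong = posterior_probability(prior, 20.0)

print(f"prior={prior:.3f} weak={weak:.3f} strong={strong:.3f}")
```

A likelihood ratio of exactly 1 leaves the prior unchanged, which is the odds-form way of saying the observation carried no information.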
Of course there are no significant details yet, but here are a few points.
1. The evidence that this can be done effectively in a scalable manner is basically zero. Aren’t massive policies (possibly universal?) supposed to be based on evidence? (How about running a large-scale RCT first, à la the RAND Health Insurance Experiment? And by the way, here is a quick look at the evidence we have on pre-school, and here, not nearly skeptical enough in my view. And think in terms of lasting results, not getting kids to read nine months earlier, etc. You can find evidence for persistent math gains in Tulsa, OK, but no CBA.)
2. That doesn’t mean we should do nothing.
3. Let’s say we have “the political will” to do something effective (debatable, of course). Is adding on another layer of education, and building that up more or less from scratch in many cases, better than fixing the often quite broken systems we have now? I know well all the claims about “needing to get kids early,” but is current kindergarten so late in life? Why not have much better kindergartens and first and second grade experiences in the ailing school districts? Or is the claim that by kindergarten “it is too late,” yet a well-executed government early education could fix the relevant problems if applied at ages three to four? Would such a claim mean that we are currently writing off many millions of American children, as it stands now?
4. This is what federalism is for. Let’s have an experiment emanating from the state and/or local level.
5. What should we infer from the fact that no such truly broad-based state-level experiment has happened yet? (Georgia and Oklahoma have come closest.) That the states are lacking in vision, relative to the Presidency? Or that a workable version of the idea is hard to come up with, execute, and sell to voters?
6. In Finland government education doesn’t really touch the kids until they are six years old. Don’t they have a very good system? Some call it the world’s best. Maybe the early years are very important, but perhaps pre-schooling is not the key missing piece of the puzzle. (NB: See the comments for dissenting views on Finland.)