evidence-based

Sarah Constantin replies on MetaMed

Not long ago I linked to this Robin Hanson blog post on MetaMed.  I was sent this reply, which I will put under the fold:

I noticed you linked Robin Hanson’s article on MetaMed on Marginal Revolution.  I’m the VP of research at MetaMed, and I just wanted to tell you a little bit more about us, because if all you know about us is the Overcoming Bias article you might get some misleading impressions.

Medical practice is basically a mass-produced product. Professional and regulatory bodies (like the AMA) put out guidelines for treatment.  At their best, these guidelines follow the standards of evidence-based medicine, which means that on average they will produce the best health outcomes in the general population.  (Of course, in practice they often fall short of that standard.  For example, checklists are overwhelmingly beneficial by an evidence-based medicine standard, and yet are not universally used.)

But even at their best, the guidelines that are best from a population-health standpoint need not be optimal for an individual patient.  If you have the interest and the willingness to pay, investigating your condition in depth, in the context of your entire medical history, genetic data, and personal priorities, may well turn up opportunities to do better than the standardized medical guidelines which at best maximize average health outcomes.

That’s basically MetaMed’s raison d’etre.  And it’s a pretty conservative hypothesis, in fact.  We may harbor a few grander ambitions (for example, I come from a mathematical background and I’m working on some longer-term projects related to algorithmically automating parts of the diagnostic process, and using machine learning principles on biochemical networks in novel ways) but fundamentally the thing we claim to be able to do is give you finer-grained information than your doctor will.  We’re, of course, as yet unproven in the sense that we haven’t had enough clients to provide empirical evidence of how we improve health outcomes, but we’re not making extraordinary claims.

Robin Hanson seems to be implying that MetaMed is claiming to be useful only because we’re members of the “rationalist community.”  This isn’t true.  We think we’re useful because we give our clients personalized attention, because we’re more statistically literate than most doctors, because we don’t have some of the misaligned incentives that the medical profession does (e.g. we don’t have an incentive to talk up the benefits of procedures/drugs that are reimbursable by insurance), because we have a variety of experts and specialists on our team, etc.

The “rationalist” sensibility is important, to some degree, because, for instance, we’re willing to tell clients that incomplete evidence is evidence in the Bayesian sense, whereas the evidence-based medicine paradigm says that anything that yet hasn’t been tested in clinical trials and found a 5% p-value is completely unknown. For instance, we’re willing to count reasoning from chemical mechanisms as (weak) evidence. There’s a difference in philosophy between “minimize risk of saying a falsehood” and “be as close to accurate as possible”; we strive to do the latter.  So there’s a sense in which our epistemic culture allows us to be more flexible and pragmatic.  But we certainly aren’t basing our business model on a blanket claim of being better than the establishment just because we come from the rationalist community.
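The point about incomplete evidence still counting as evidence in the Bayesian sense can be illustrated with a toy update. This is only a sketch: the prior and the likelihood ratio below are made-up numbers chosen for illustration, not anything MetaMed reported, and the function name is hypothetical.

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Update a probability with the odds form of Bayes' rule.

    likelihood_ratio = P(evidence | hypothesis) / P(evidence | not hypothesis).
    A ratio only slightly above 1 represents weak evidence.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: a 10% prior that a treatment helps, and a
# mechanistic argument assumed to be 1.5 times as likely if it does.
prior = 0.10
weak_likelihood_ratio = 1.5

updated = posterior(prior, weak_likelihood_ratio)
print(f"prior={prior:.3f}, posterior={updated:.3f}")
```

Under these assumed inputs the belief moves modestly rather than staying at "completely unknown", which is the contrast the letter draws with a strict significance threshold.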

What do I think of Obama’s universal pre-school proposal?

Of course there are no significant details yet, but here are a few points.

1. The evidence that this can be done effectively in a scalable manner is basically zero.  Aren’t massive policies (possibly universal?) supposed to be based on evidence?  (How about running a large-scale RCT first, à la the RAND Health Insurance Experiment?  And by the way, here is a quick look at the evidence we have on pre-school, and here, not nearly skeptical enough in my view.  And think in terms of lasting results, not getting kids to read nine months earlier, etc.  You can find evidence for persistent math gains in Tulsa, OK, but no cost-benefit analysis.)

2. That doesn’t mean we should do nothing.

3. Let’s say we have “the political will” to do something effective (debatable, of course).  Is adding on another layer of education, and building that up more or less from scratch in many cases, better than fixing the often quite broken systems we have now?  I know well all the claims about “needing to get kids early,” but is current kindergarten so late in life?  Why not have much better kindergartens and first and second grade experiences in the ailing school districts?  Or is the claim that by kindergarten “it is too late,” yet a well-executed government early education could fix the relevant problems if applied at ages three to four?  Would such a claim mean that we are currently writing off many millions of American children, as it stands now?

4. This is what federalism is for.  Let’s have an experiment emanating from the state and/or local level.

5. What should we infer from the fact that no such truly broad-based state-level experiment has happened yet?  (Georgia and Oklahoma have come closest.)  That the states are lacking in vision, relative to the Presidency?  Or that a workable version of the idea is hard to come up with, execute, and sell to voters?

6. In Finland government education doesn’t really touch the kids until they are six years old.  Don’t they have a very good system?  Some call it the world’s best.  Maybe the early years are very important, but perhaps pre-schooling is not the key missing piece of the puzzle.  (NB: See the comments for dissenting views on Finland.)

Addendum: Here are good comments from Reihan.  See also this Brookings study: “This thin empirical gruel will not satisfy policymakers who want to practice evidence-based education.”

More on Online Education

At Cato Unbound I respond to some of the critics of my article Why Online Education Works. Here is one bit:

We do need more studies of offline, online, and blended education models, but the evidence that we do have is supportive of the online model. In 2009, The Department of Education conducted a meta-analysis and review of online learning studies and found:

  • Students in online conditions performed modestly better, on average, than those learning the same material through traditional face-to-face instruction.
  • Instruction combining online and face-to-face elements had a larger advantage relative to purely face-to-face instruction than did purely online instruction.
  • Effect sizes were larger for studies in which the online instruction was collaborative or instructor-directed than in those studies where online learners worked independently.
  • The effectiveness of online learning approaches appears quite broad across different content and learner types. Online learning appeared to be an effective option for both undergraduates (mean effect of +0.30, p < .001) and for graduate students and professionals (+0.10, p < .05) in a wide range of academic and professional studies.

There Will Be Blood

Economists often reduce complex motivations to simple functions such as profit maximization. Writing in The Economist, Buttonwood ably criticizes such simplifications. Buttonwood is too quick, however, to conclude that simplification falsifies. For example, Buttonwood argues:

If there is a shortage of blood, making payments to blood donors might seem a brilliant idea. But studies show that most donors are motivated by an idea of civic duty and that a monetary reward might actually undermine their sense of altruism.

As loyal readers of this blog know, however, the empirical evidence is that incentives for blood donation actually work quite well. Mario Macis, Nicola Lacetera, and Bob Slonim, the authors of the most important work on this subject (references below), write to me with the details:

The decision to donate blood involves complex motivations including altruism, civic duty and moral responsibility. As a result, we agree with Buttonwood that in theory incentives could reduce the supply of blood. In fact, this claim is often advanced in the popular press as well as in academic publications, and as a consequence, more and more often it is taken for granted.

But what is the effect of incentives when studied in the real world with real donors and actual blood donations?

We are unaware of a single study of real blood donations that shows that offering an incentive reduces the overall quantity or quality of blood donations. From our two studies, both in the United States covering several hundred thousand people, and studies by Goette and Stutzer (Switzerland) and Lacetera and Macis (Italy), a total of 17 distinct incentive items have been studied for the effects on actual blood donations. Incentives have included both small items and gift cards as well as larger items such as jackets and a paid-day off of work.  In 16 of the 17 items examined, blood donations significantly increased (and there was no effect for the one other item), and in 16 of the 17 items studied no significant increase in deferrals or disqualifications were found.  No study has ever looked at paying cash for actual blood donations, but several of the 17 items in the above studies involve gift cards with clear monetary value.

Although many lab studies and surveys have found differing evidence focusing on other outcomes than actual blood donations (such as stated preferences), the empirical record when looking at actual blood donations is thus far unambiguous: incentives increase donations.

Given the vast and important policy debate regarding addressing shortages for blood, organ and bone marrow in developed as well as less-developed economies, where shortages are especially severe, it is important to not only consider more complex human motivations, but to also provide reliable evidence, and interpret it carefully. The recent ruling by the 9th Circuit Court of Appeals allowing the legal compensation of bone marrow donors further enhances the importance of the debate and the necessity to provide evidence-based insights.

Here is a list of references:

Goette, L., and Stutzer, A., 2011: “Blood Donation and Incentives: Evidence from a Field Experiment,” Working Paper.

Lacetera, N., and Macis, M. 2012. Time for Blood: The Effect of Paid Leave Legislation on Altruistic Behavior. Journal of Law, Economics and Organization, forthcoming.

Lacetera N, Macis M, Slonim R 2012 Will there be Blood? Incentives and Displacement Effects in Pro-Social Behavior. American Economic Journal: Economic Policy 4: 186-223.

Lacetera N, Macis M, Slonim R.: Rewarding Altruism: A natural Field Experiment, NBER working paper.

“Ethos of the Unit”

This is from a child and adolescent mental health group at University College London, but it could and should also count as “Ethos of the Blogger”:

•All research is provisional
•All research raises as many questions as it answers
•All research is difficult to interpret and to draw clear conclusions from
•Qualitative research may be vital to elaborate experience, suggest narratives for understanding phenomena and generate hypotheses but it can’t be taken to prove anything
•Quantitative research may be able to show hard findings but can rarely (never?) give clear answers to complex questions

And yet, despite all the challenges, it is still worth attempting to encourage an evidence-based approach, since the alternative is to continue to develop practice based only on assumption and belief.

For the pointer I thank Michelle Dawson.

Portuguese drug decriminalization

Caitlin Elizabeth Hughes and Alex Stevens have written a new study:

The issue of decriminalizing illicit drugs is hotly debated, but is rarely subject to evidence-based analysis. This paper examines the case of Portugal, a nation that decriminalized the use and possession of all illicit drugs on 1 July 2001. Drawing upon independent evaluations and interviews conducted with 13 key stakeholders in 2007 and 2009, it critically analyses the criminal justice and health impacts against trends from neighbouring Spain and Italy. It concludes that contrary to predictions, the Portuguese decriminalization did not lead to major increases in drug use. Indeed, evidence indicates reductions in problematic use, drug-related harms and criminal justice overcrowding. The article discusses these developments in the context of drug law debates and criminological discussions on late modern governance.

Questions that are rarely asked: why so many retired cops?

Jim Crozier, a loyal MR reader, asks:

Why do cop movies and TV shows so often begin with an older (and often jaded) officer that is just about to retire? It is quite astounding how often this unrealistic plot trick is employed, and the psychological grounding seems weak at best. 

I don't have the viewing experience to give you an evidence-based response.  I would think the answer might lie in marginal utility theory plus behavioral economics.  Perhaps all his life that officer has failed to achieve some desired end, such as catching a criminal, bringing an evil politician to justice, reforming the corrupt police force, or whatever.  If the officer is near retirement, we know we are watching a very dramatic story which will define the life and career of that officer for ever and ever.  It is harder for the viewer to have the same feeling if the officer has four years, three months remaining on the force.  Failure would not mean final failure.

On the behavioral front, our impressions of experiences, and the memories we form, very often depend on what comes last.  Judges are more impressed by the group which sings last in the Eurovision contest, even though it is randomized.  The viewer thus implicitly knows that the cop really cares about the final segment of his or her career, reinforcing the point about decisiveness and marginal utility.

Viewers, can you do better?

The Chait-Manzi debate

Paul Krugman links to some of the key pieces, or you can trace through Chait's blog or Manzi's; Krugman also has a NYT column today on this.  I won't go through the debate as a whole (i.e., no mention of military spending or ideas as an international public good), which covers many of the basic "U.S. vs. Europe" issues, but here are a few relevant points:

1. For this debate, "levels" are more important than growth rates.  The United States has higher per capita income than most of Europe, although I don't mean to suggest that Europe is an economic disaster.  You also can try to "argue back" some of that difference by citing social indicators or leisure time, but don't focus on the growth rates.

2. If you see the United States compared with Europe, ask if the same analysis also compares the United States to the highly successful Singapore or for that matter Brazil.  If not, be wary.

3. It would be an interesting exercise to construct an "imaginary Europe," so instead of the current GDP of Italy you would sub in the output of a comparable number of Italian-Americans, and so on.  The Swedish-Americans in Minnesota get subbed in for the Swedes in Sweden, and so on.  I've never seen that done but I would like to know the answer, both with respect to per capita income and social indicators.

4. One question is whether the U.S. or Europe does a better job of elevating poor immigrants to higher income levels.  You would think egalitarians would be obsessed with this issue, but they're not.  In fact most of them hardly mention it.

5. There has never, ever been a well-functioning social democracy — in the European sense — with the size, population, and diversity of the United States or if you wish make that any two of those three.  How about any one of those three, noting that Canada isn't really such a large country?  That doesn't mean it's impossible, but keep that in mind the next time you hear talk about evidence-based reasoning.

6. Per capita growth rates or levels can be misleading, for some of the reasons mentioned above.  A country which adds a lot of low wage labor through immigration, for instance, will look worse than it ought to.  And if you cite "higher average productivity" in some parts of Europe, you are neglecting the differences between average and marginal and also the allergies to low-wage jobs in places such as France.

7. One view is to see significant pockets of poverty in Appalachia and decry that there is nothing comparable in Denmark.  Another view is to see those same poor people and compare them to the poor of the European continent, which includes places such as Belarus and Albania.  Both approaches can be misleading exercises.

8. Countries have to start from where they're at.  If you're constructing policy advice, you can either build on what a country is really good at or you can try to revise the internal culture of the country.  If you're going to do the latter, come out and say so.  Most of my policy recommendations are based on the former approach, namely strengthening what (the better-functioning) countries already are good at.  I'm not suggesting that countries never change, but getting such changes right by deliberate policy interventions is very hard to do.  I wish to stress this point applies to the pro-U.S. as much as the pro-Europe side.

I'd like everyone to have a sign, which they would hold up when appropriate: "My policies seek to revise the internal culture of my country."  That's OK, but you're raising the bar for your own ideas and don't fool yourself into thinking otherwise.

Addendum: You'll find related points here.

Consumer Driven Health Care Plans

For about the last 10 years the United States has been experimenting with consumer driven health care plans.  CDH plans typically combine a high-deductible insurance policy with a health savings account or health reimbursement account.  CDH plans now cover well over 8 million individuals, up considerably from 4.5 million in 2007 and these types of plans continue to grow rapidly.  So what have been the results?

The American Academy of Actuaries has recently produced a review of high quality research on these plans.  Here are their conclusions:

The primary indications are that properly designed CDH plans can produce significant (even substantial) savings without adversely affecting member health status.  To the knowledge of the work group, no data-based study has emerged that presents a contrary view.

Cost-savings in the first year of instituting a CDH plan relative to a traditional plan ranged from 12% to 21%, remarkably large figures.  Moreover, costs appear to grow more slowly under CDH plans than under traditional plans.  

The knock on CDH plans has always been that they could cause people to avoid preventative care.  Not only does this appear to be false, it’s the opposite of the truth:

Generally, all of the studies indicated that cost savings did not result from avoidance of inappropriate care and that necessary care was received in equal or greater degree relative to traditional plans.  All of the studies reported a significant increase in preventative services for CDH participants.

Especially interesting is that some of the studies found that CDH plans resulted in better compliance with evidence-based care.

Note that these results come from CDH plans instituted within the current system.  One would expect that the general equilibrium effects of consumer driven health plans would be even larger than the partial equilibrium effects, see Singapore for evidence (but consider Tyler’s remarks). 

The American Academy of Actuaries is a credible organization but I would like to see more of the underlying data.  All of the studies the AAA reviewed used credible methodologies, controlled for selection and were based on substantial data but the major studies so far have been industry funded.

It’s remarkable that in the current debate over how to control health care costs so little attention is being given to the important results of our 10-year experiment with consumer driven health plans.

Why is competition between health insurance companies useful?

Kevin Drum (and Matt Yglesias) asks an excellent and important question:

Tyler is arguing for keeping the insurance industry competitive. But I simply don’t see what that buys us. Even if the health insurance industry were dramatically improved, this wouldn’t especially make healthcare any more efficient. It would only make the insurance industry more efficient. That would be nice, but hardly earthshaking…

Let me be clear: the incentives today are screwy.  Let me also tell you my ideal world.  Insurance companies are judged by honest third party intermediaries.  Insurance companies compete like heck to make customers satisfied.  Insurance companies monitor doctors, read Robin Hanson, and require evidence-based medicine.  Insurance companies which fail at these pursuits either go bankrupt or they must abide by an ex ante contract to permit the exile of their CEOs to Greenland.  Every year prices would fall in real terms, quality would improve, and coverage would be expanded.  Imagine the whole health care sector working like laser eye surgery or cosmetic surgery.

This is not the world we live in, but it is the world we should aim for and I am more than willing to consider how government might get us there.  (Mandating greater price transparency is but one step.)  But if we institute a single-payer system, or highly regulated mandates, we will never have much chance of arriving in that world.  Ever.  We will have a fairly static sector with high coverage levels but rising costs long term and less innovation. 

I believe we know why insurance companies don’t work this way, namely monitoring problems; they screw you over instead of serving you and they can get away with it.  Go ahead, call me a pollyanna, but modern information technology and measurement can indeed resolve many monitoring problems.  We can now monitor central bank performance quite well or show up in Sicily with a credit card and rent a car.  Neither was the case forty years ago.

Here is one summary of how health insurance companies are improving information technology for claims processing, medicine itself, and promoting evidence-based medicine.  I don’t mean this industry-supplied link to be a good summary of the current truth; take it as one vision of what might be possible.  To put the point another way, insurance companies are not just risk assessors or dollar transfer mechanisms; they also can be monitors and buyer agents and that is why competition is potentially so useful.

The policy point is not: "you must die today so that the reign of Milton Friedman can arrive in forty years’ time."  It is more like: "whatever transfers we wish to do today, let us proceed so that such a future remains someday possible."

Medical care is just starting to cure human beings, so don’t think the future will look like the past.  I know that preaching the virtues of insurance company competition is not a popular position in the blogosphere but like Arnold Kling, I see the single-payer advocates and mandate advocates as the conservatives, not the visionaries.

Addendum: A month or two ago, one MR reader left a long and very good comment about all the innovations provided by private health insurance companies.  I can’t find it, can any of you?  Please let us know in the comments or email me.

Addendum: Kevin Drum responds.

Jason Furman has an interesting health care plan

Make people pay more:

This paper proposes a template for a progressive cost sharing plan that would require typical families to pay half of their health costs until they reached 7.5 percent of their income; low-income families would not have any cost sharing.  The analysis shows that this template could reduce total health spending by 13 to 30 percent, reducing premiums by 22 to 34 percent without hurting health outcomes.  Moreover, low- and moderate-income families would face less cost sharing than they do under typical plans today while the premium savings would be more than enough to compensate middle- and upper-income families for the modest increase in their exposure to small risks.  Every family would have an affordable limit on their out-of-pocket payments, in contrast to the situation today, where many families have insurance policies that expose them to unlimited cost sharing.  In addition, the paper suggests the potential inclusion of evidence-based exceptions for highly valuable preventive care and chronic disease treatments as well as other mechanisms to protect the chronically ill.

This plan, of course, can be applied to either markets or government provision.  It finesses the problems with Bush health savings accounts, including the distributional issues and the lack of directness in achieving first-party payment.  It also assumes that the individual desire for comprehensive insulation from risk — clearly the market tendency where markets are present — is the fundamental problem in the health care sector.  If I protect myself from risk, I don’t take into account my diminished incentive to monitor health care costs, which creates larger dilemmas for the market as a whole.
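The cost-sharing template in the abstract quoted above can be sketched as simple arithmetic. This is one reading of it, not Furman's own specification: it assumes "until they reached 7.5 percent of their income" means out-of-pocket payments are capped at 7.5 percent of income, and it treats the low-income cutoff as a hypothetical parameter because the abstract does not give one. The function and parameter names are invented for illustration.

```python
from typing import Optional


def out_of_pocket(
    health_costs: float,
    income: float,
    coinsurance: float = 0.5,        # "pay half of their health costs"
    cap_share: float = 0.075,        # "7.5 percent of their income"
    low_income_cutoff: Optional[float] = None,  # hypothetical; not in the abstract
) -> float:
    """Family's annual out-of-pocket payment under the assumed reading."""
    # Low-income families face no cost sharing at all.
    if low_income_cutoff is not None and income <= low_income_cutoff:
        return 0.0
    # Everyone else pays a share of costs up to an income-based ceiling.
    cap = cap_share * income
    return min(coinsurance * health_costs, cap)


# Illustrative family with a $60,000 income (made-up figure).
for costs in (4_000, 20_000):
    paid = out_of_pocket(costs, income=60_000)
    print(f"costs=${costs:,} -> family pays ${paid:,.0f}")
```

On this reading the family bears half of small bills but never more than the income-based ceiling, which is the "affordable limit on out-of-pocket payments" the abstract describes.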

An alternative plan is simply to tax the health insurance purchases of the relatively wealthy.  But if people overestimate anxiety costs of non-comprehensive coverage, they’ll be too ready to pay the tax and we might prefer Furman’s method of allocating health insurance according to a formula.  How easily supplemental insurance can be prevented remains an open question.

Addendum: Matt Yglesias comments.

Why is Medicine so Primitive?

The practice of modern medicine is surprisingly primitive.  My doctor only recently started to provide printed prescriptions instead of the usual scrawl.  Incorrectly filled prescriptions can be serious, and computer-printed prescriptions are an obvious response, yet even today only one in four physicians uses some form of electronic health records and only one in ten really uses electronic records to follow a patient’s entire history.  My credit card company knows far more about my shopping history than my physician knows about my medical history.

Medicine is primitive in another way.  The number of treatment regimes supported only by tradition and authority is very high.  Here’s a recent example:

For the past 30 years or so, doctors have routinely given pregnant women intravenous infusions of magnesium sulfate to halt contractions that can lead to premature labor.

…[a] team reviewed 23 clinical trials worldwide involving 2,000 women who had received the drug to quell contractions. They found that it did not reduce preterm labor and that more babies died when their mothers took the drug than in a control group where the mothers had not been given it.

…Grimes and Nanda estimate that about 120,000 American women receive mag sulfate each year for premature contractions, and they say some evidence suggests it may be associated with 1,900 to 4,800 fetal deaths annually in the United States.

This would be a shocker except for the fact that stories like this are common – by some accounts a majority of medical procedures are not supported by serious scientific evidence.  Indeed, what are we to make of a profession where evidence-based medicine is only a recent and still far from accepted movement?

Why is medicine so primitive?  One reason is that medicine is the largest area of the economy still dominated by artisanal production.  I will be blunt: We need assembly line medicine, medicine that is routinized, marked and measured. As I have argued before, I would much prefer to be diagnosed by a computerized expert system than by a physician. The HMOs, Kaiser in particular, have done good work on measuring the effectiveness of different procedures, but much more needs to be done to bring medicine into the twentieth century, let alone the twenty-first.