The Liberal Radicalism Mechanism for Producing Public Goods

The mechanism for producing public goods in Buterin, Hitzig, and Weyl's Liberal Radicalism is quite amazing, a quantum leap in public-goods mechanism design not seen since the Vickrey-Clarke-Groves mechanism of the 1970s. In this post, I want to illustrate the mechanism using a very simple example. Let's imagine that there are two individuals and a public good available in quantity g. The two individuals value the public good according to U1(g)=70g – (g^2)/2 and U2(g)=40g – g^2. Those utility functions mean that the public good has diminishing marginal utility for each individual, as shown by the figure at right. The public good can be produced at a constant marginal cost, MC=80.

Now let's solve for the private and socially optimal public good provision in the ordinary way. For the private optimum, each individual will want to set the MB of contributing to g equal to the marginal cost. Taking the derivative of the utility functions we get MB1=70 – g and MB2=40 – 2g (users of Syverson, Levitt & Goolsbee may recognize this public good problem). Notice that for both individuals MB<MC at every quantity, so without coordination, private provision doesn't even get off the ground.

What's the socially optimal level of provision? Since g is a public good we sum the two marginal benefit curves and set the sum equal to the MC, namely 110 – 3g = 80, which solves to g=10. The situation is illustrated in the figure at left.
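This algebra is easy to check symbolically. Here is a minimal sketch in Python using sympy (the library choice is mine, not the paper's):

```python
import sympy as sp

g = sp.symbols("g", positive=True)
MC = 80

# Marginal benefits from U1(g) = 70g - g^2/2 and U2(g) = 40g - g^2
MB1 = sp.diff(70 * g - g**2 / 2, g)  # 70 - g
MB2 = sp.diff(40 * g - g**2, g)      # 40 - 2g

# Private provision: each individual's MB is below MC = 80 even at g = 0,
# so neither contributes on their own.
print(MB1.subs(g, 0), MB2.subs(g, 0))     # 70 40

# Social optimum: sum the MB curves and set the sum equal to MC.
print(sp.solve(sp.Eq(MB1 + MB2, MC), g))  # [10]
```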

We were able to compute the optimal level of the public good because we knew each individual's utility function. In the real world each individual's utility function is private information. Thus, to reach the social optimum we must solve two problems: the information problem and the free-rider problem. The information problem is that no one knows the optimal quantity of the public good. The free-rider problem is that no one is willing to pay for the public good. These two problems are related but they are not the same. My Dominant Assurance Contract, for example, works on solving the free-rider problem, assuming we know the optimal quantity of the public good (e.g., we can usually calculate how big a bridge or dam we need). The LR mechanism, in contrast, solves the information problem, but it requires that a third party such as the government or a private benefactor "top up" private contributions in a special way.

The topping-up function is the key to the LR mechanism. In this two-person, one-public-good example the topping-up function is:

g = (√c1 + √c2)^2

where c1 is the amount that individual one chooses to contribute to the public good and c2 is the amount that individual two chooses to contribute to the public good. In other words, the public benefactor says "you decide how much to contribute and I will top up to amount g" (it can be shown that g > c1 + c2 whenever both contribute).
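In code, the topping-up rule is a one-liner. A minimal sketch of the general n-contributor version (function and variable names are mine):

```python
from math import sqrt

def top_up(contributions):
    """LR topping-up rule: g = (sum of square roots of contributions)^2."""
    return sum(sqrt(c) for c in contributions) ** 2

# With contributions of 1 and 4, the good is funded at (1 + 2)^2 = 9,
# so the benefactor tops up the gap of 9 - 5 = 4.
c = [1, 4]
g = top_up(c)
print(g, g - sum(c))  # 9.0 4.0
```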

Now let's solve for the private optimum using the mechanism. To do so, return to the utility functions U1(g)=70g – (g^2)/2 and U2(g)=40g – g^2, substitute the topping-up function for g, and then take the derivative of U1 with respect to c1 and set it equal to the marginal cost of the public good; do the same for U2. Notice that we are implicitly assuming that the government can use lump-sum taxation to fund any difference between g and c1+c2, or that projects are fairly small relative to total government funding, so that it makes sense for individuals to ignore any effect of their choices on the benefactor's purse. These assumptions seem fairly innocuous; Buterin, Hitzig, and Weyl discuss them at greater length.

Notice that we are solving for the optimal contributions to the public good exactly as before (each individual is maximizing their own selfish utility) only now taking into account the top-up function. Taking the derivatives and setting them equal to the MC produces two equations in two unknowns, which we need to solve simultaneously:

(70 – (√c1 + √c2)^2) · (√c1 + √c2)/√c1 = 80
(40 – 2(√c1 + √c2)^2) · (√c1 + √c2)/√c2 = 80
These equations are solved at c1 = 45/8 and c2 = 5/8. Recall that the privately optimal contributions without the top-up function were 0 and 0 so we have certainly improved over that. But wait, there's more! How much g is produced when the contribution levels are c1 = 45/8 and c2 = 5/8? Substituting these values for c1 and c2 into the top-up function we find that g=10, the socially optimal amount!
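A sketch of that computation in sympy, solving the two first-order conditions numerically:

```python
import sympy as sp

c1, c2 = sp.symbols("c1 c2", positive=True)
g = (sp.sqrt(c1) + sp.sqrt(c2)) ** 2  # the topping-up function

# Each individual's first-order condition: dUi/dci = MC = 80.
foc1 = sp.diff(70 * g - g**2 / 2, c1) - 80
foc2 = sp.diff(40 * g - g**2, c2) - 80

# Numerical solution from a nearby starting point.
print(sp.nsolve([foc1, foc2], [c1, c2], [5, 1]))  # [5.625, 0.625] = [45/8, 5/8]

# Substituting back into the topping-up function recovers the social optimum.
opt = {c1: sp.Rational(45, 8), c2: sp.Rational(5, 8)}
print(g.subs(opt))  # 10
```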

In equilibrium, individual 1 contributes 45/8 to the public good, individual 2 contributes 5/8, and the remainder, 15/4, is contributed by the government. But recall that the government had no idea going in what the optimal amount of the public good was. The government used the contribution levels under the top-up mechanism as a signal to decide how much of the public good to produce and almost magically the top-up function is such that citizens will voluntarily contribute exactly the amount that correctly signals how much society as a whole values the public good. Amazing!

Naturally there are a few issues. The optimal solution is a Nash equilibrium which may not be easy to find as everyone must take into account everyone else’s actions to reach equilibrium (an iterative process may help). The mechanism is also potentially vulnerable to collusion. We need to test this mechanism in the lab and in the field. Nevertheless, this is a notable contribution to the theory of public goods and to applied mechanism design.

Hat tip: Discussion with Tyler, Robin, Andrew, Ank and Garett Jones who also has notes on the mechanism.

Get your money for nothin’ get your chicks for free

The manager of Dire Straits earned a percentage of their royalties and he's selling a big chunk of it to the public. For $3,970 (a little cheaper if you buy in bulk) you can get 1/925 of an asset that has paid out around $296,992 over the last year, an annual return of about 8%* (corrected from earlier). That's pretty good, and the prospectus argues that growth in streaming and a forthcoming Mark Knopfler tour will increase royalties.
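The back-of-the-envelope arithmetic, using the figures quoted above:

```python
# Rough yield check using the numbers from the prospectus as reported above.
price = 3970             # cost of one 1/925 share
royalties = 296_992      # payout over the last year
shares = 925

income_per_share = royalties / shares       # ~$321 per year
print(f"{income_per_share:.2f}")            # 321.07
print(f"{income_per_share / price:.1%}")    # 8.1% annual return
```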

I think it would be pretty cool to hear Sultans of Swing on the radio and shout "turn it up!" because you knew you were earning, but only accredited investors need apply. In related news, Matt Levine has an excellent piece on accredited investor rules and his alternative:

  • Anyone can also invest in any other dumb investment; you just have to go to the local office of the SEC and get a Certificate of Dumb Investment. (Anyone who sells dumb non-approved investments without requiring this certificate from buyers goes to prison.)
  • To get that certificate, you sign a form. The form is one page with a lot of white space. It says in very large letters: “I want to buy a dumb investment. I understand that the person selling it will almost certainly steal all my money, and that I would almost certainly be better off just buying index funds, but I want to do this dumb thing anyway. I agree that I will never, under any circumstances, complain to anyone when this investment inevitably goes wrong. I understand that violating this agreement is a felony.”
  • Then you take the form to an SEC employee, who slaps you hard across the face and says “really???” And if you reply “yes really” then she gives you the certificate.
  • Then you bring the certificate to the seller and you can buy whatever dumb thing he is selling.

Syverson on Productivity

The FRB of Richmond has a great interview with Chad Syverson:

EF: Some have argued that the productivity slowdown since the mid-2000s is due to mismeasurement issues — that some productivity growth hasn’t been or isn’t being captured. What does your work tell us about that?

Syverson: It tells us that the mismeasurement story, while plausible on its face, falls apart when examined. If productivity growth had actually been 1.5 percent greater than it has been measured since the mid-2000s, U.S. gross domestic product (GDP) would be conservatively $4 trillion higher than it is, or about $12,000 more per capita. So if you go with the mismeasurement story, that’s the sort of number you’re talking about and there are several reasons to believe you can’t account for it.

First, the productivity slowdown has happened all over the world. When you look at the 30 Organization for Economic Co-operation and Development countries we have data for, there's no relationship between the size of the measured slowdown and how important IT-related goods — which most people think are the primary source of mismeasurement — are to a country's economy.

Second, people have tried to measure the value of IT-related goods. The largest estimate is about $900 billion in the United States. That doesn’t get you even a quarter of the way toward that $4 trillion.

Third, the value added of the IT-related sector has grown by about $750 billion, adjusting for inflation, since the mid-2000s. The mismeasurement hypothesis says that there are $4 trillion missing on top of that. So the question is: Do we think we're only getting $1 out of every $6 of activity there? That's a lot of mismeasurement.

Finally, there’s the difference between gross domestic income (GDI) and GDP. GDI has been higher than GDP on average since the slowdown started, which would suggest that there’s income, about $1 trillion cumulatively, that is not showing up in expenditures. But the problem is that was also true before the slowdown started. GDI was higher than GDP from 1998 through 2004, a period of relatively high-productivity growth. Moreover, the growth in income is coming from capital income, not wage income. That doesn’t comport with the story some people are trying to tell, which is that companies are making stuff, they’re paying their workers to produce it, but then they’re effectively giving it away for free instead of selling it. But we know that they’re actually making profits. We might not pay directly for a lot of IT services every time we use them, but we are paying for them indirectly.

As sensible as the mismeasurement hypothesis might sound on its face, when you add up everything, it just doesn’t pass the stricter test you would want it to survive.

And he makes an excellent point about the potential productivity growth from AI:

…it seems that with some fairly modest applications of AI, the productivity slowdown goes away. Two applications that we look at in our paper are autonomous vehicles and call centers.

About 3.5 million people in the United States make their living as motor vehicle operators. We think maybe 2 million of those could be replaced by autonomous vehicles. There are 122 million people in private employment now, so just a quick calculation says that's an additional boost of 1.7 percent in labor productivity. But that's not going to happen overnight. If it happens over a decade, that's 0.17 percent per year.

About 2 million people work in call centers. Plausibly, 60 percent of those jobs could be replaced by AI. So when you do the same kind of calculation, that's an additional 1 percent increase in labor productivity; spread out over a decade, it's 0.1 percent per year. So, from those two applications alone, that's about a quarter of a percent annual acceleration for a decade. So you only need maybe six to eight more applications of that size and the slowdown is gone.
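Syverson's back-of-the-envelope numbers check out (a quick sketch; the rounding to 1.7 percent is his):

```python
# Verify the quoted AI productivity arithmetic.
private_employment = 122e6  # workers in private employment

# Autonomous vehicles: ~2 million of 3.5 million motor vehicle operators.
av = 2e6 / private_employment           # ~0.016, i.e., roughly 1.6-1.7% one-time boost
# Call centers: ~60% of 2 million jobs.
cc = 0.6 * 2e6 / private_employment     # ~0.010, i.e., ~1%

print(f"AVs: {av:.1%} total, {av / 10:.2%} per year over a decade")
print(f"call centers: {cc:.1%} total, {cc / 10:.2%} per year over a decade")
print(f"combined: {(av + cc) / 10:.2%} per year")  # ~0.26% -- 'a quarter of a percent'
```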

Read the whole thing. There’s no fluff in this interview. Syverson packs every answer with substantive content.

Revolving Door Patent Examiners

The revolving door between government and private industry creates opportunities for regulatory capture. Dick Cheney's moves between Secretary of Defense, CEO of Halliburton, and Vice President certainly raised eyebrows. Secretaries of the Treasury Robert Rubin, Hank Paulson, and Steve Mnuchin were all former bankers at Goldman Sachs. Former members of Congress who become lobbyists are common, as are bureaucrats and congressional staffers who turn to lobbying on behalf of the industries they previously regulated. At the same time, it seems desirable that government be able to draw top-notch people from the private sector, and it's not surprising that private sector firms would want to hire people with government experience. It's unfortunate (in my view) that government is so entwined with the private sector but that is inevitable in a mixed economy. Nevertheless, it would be useful if we had more data and less anecdote when it comes to the revolving door.

In a new and impressive paper (summary here), Haris Tabakovic and Thomas Wollmann take a detailed look at this issue using patent examiners. Using data on over 1 million patent decisions, they find that examiners grant significantly more patents to firms that later hire them.

It’s possible that examiners want to work for firms that have high quality patents but several considerations suggest that this is not the explanation for the correlation between grant probability and firm hiring. First, the firms doing the hiring are law firms that handle patent applications. We are not talking about USPTO examiners all wanting to work for Google.

Second, in a very clever analysis the authors show that USPTO examiners who leave for the private sector tend to go to a city near their college alma mater. Moreover, examiners who leave are more likely to approve patents to firms located near their alma mater (even when these firms do not subsequently hire them). In other words, it looks as if (on the margin) patent examiners are more generous to firms that they might want to subsequently work for because those firms are located in places desirable to them. Patent examiners who do not leave show no similar bias, which rules out a home-city boosterism effect. All of these effects are after taking into account examiner fixed effects, so it's not that examiners who leave are different on average; it's that examiners who leave act differently when firms are located in regions that are potentially desirable to those examiners.

Finally, the authors show that patent quality, as measured by future citations, is lower for patents granted to firms that later hire the examiner or to firms in the same city that are granted patents by the examiner (i.e., firm-patents the examiner might have given a pass to in order to curry favor). The authors also find some evidence in the patents themselves. Namely, patents that are granted to subsequent employers tend to have claims that are shorter (i.e., stronger) because fewer words were added during the claims process.

The policy implications are less clear. Waiting periods are crude–in other contexts we call these non-compete clauses and most people don’t like them. Note also that these relationships appear to be driven by norms rather than explicit bargaining. USPTO examiners are paid substantially less than their private sector substitutes and that nearly always seems like a bad idea. Paying examiners more would reduce the incentive to rotate.

A Win for Justice

Congratulations to the Institute for Justice for an important victory against the abuse of civil asset forfeiture:

Today, the Institute for Justice dismantled one of the nation’s largest and most egregious civil forfeiture programs.  For decades, Philadelphia’s forfeiture machine terrorized its citizens:  throwing them out of their homes without notice, seizing their cars and other property, and forcing victims to navigate a rigged kangaroo court system to have any chance of getting their property back.  And the property and money forfeited was then given to the very officials who were supposed to be fairly enforcing the law.
After four long years of litigation, IJ cemented a victory for all Philadelphians this morning with two binding consent decrees in which city officials agreed to reforms that:
    1.  Sharply limit when Philadelphia law enforcement can forfeit property;
    2.  Prevent law enforcement from keeping what they seize;
    3.  Establish robust protections for the due process rights of citizens; and
    4.  Create a $3 million fund to compensate innocent people who were ensnared by the city’s abusive system.
My paper, To Serve and Collect (with Makowsky and Stratmann), suggests that this victory will not only reduce civil asset forfeiture but also change police behavior and decision-making, altering the number, type, and racial composition of arrests.

Do Boys Have a Comparative Advantage in Math and Science?

Even with a question mark, my title, Do Boys Have a Comparative Advantage in Math and Science, is likely to appear sexist. Am I suggesting that boys are better at math and science than girls? No, I am suggesting they might be worse.

Consider first the so-called gender-equality paradox, namely the finding that countries with the highest levels of gender equality tend to have the lowest ratios of women to men in STEM education. Stoet and Geary put it well:

Finland excels in gender equality (World Economic Forum, 2015), its adolescent girls outperform boys in science literacy, and it ranks second in European educational performance (OECD, 2016b). With these high levels of educational performance and overall gender equality, Finland is poised to close the STEM gender gap. Yet, paradoxically, Finland has one of the world’s largest gender gaps in college degrees in STEM fields, and Norway and Sweden, also leading in gender-equality rankings, are not far behind (fewer than 25% of STEM graduates are women). We will show that this pattern extends throughout the world…

(Recent papers have found the paradox holds in other measures of education such as MOOCs and in other measures of behavior and personality. Hat tip on these: Rolf Degen.)

Two explanations for this apparent paradox have been offered. First, countries with greater gender equality tend to be richer and have larger welfare states than countries with less gender equality. As a result, less is riding on the choice of career in the richer, gender-equal countries. Even if STEM fields pay more, we would expect small differences in personality that vary with gender to become more apparent as income increases. Paraphrasing John Adams, only in a rich country are people free to pursue their interests more than their needs. If women are somewhat less interested in STEM fields than men, then we would expect this difference to become more apparent as income increases.

A second explanation focuses on ability. Some people argue that more men than women have extraordinary ability levels in math and science because of greater male variability in most characteristics. Let's put that hypothesis to the side. Instead, let's think about individuals and their relative abilities in reading, science, and math; this is what Stoet and Geary call an intra-individual score. Now consider the figure below, which is based on PISA test data from approximately half a million students across many countries. On the left are raw scores (normalized). Focus on the colors: red is reading, blue is science, and green is mathematics. Negative scores (scores to the left of the vertical line) indicate that females score higher than males; positive scores indicate that males score higher on average than females. Females score higher than males in reading in every country surveyed. Females also score higher than males in science and math in some countries.

Now consider the data on the right. In this case, Stoet and Geary ask, for each student, which subject they are relatively best at, and then they average by country. The differences by sex are now even more prominent. Not only are females better at reading, but even in countries where they are better at math and science than boys on average, they are still relatively better at reading.

Thus, even when girls outperformed boys in science, as was the case in Finland, girls generally performed even better in reading, which means that their individual strength was, unlike boys’ strength, reading.

Now consider what happens when students are told: Do what you are good at! Loosely speaking, the situation will be something like this: females will say, I got As in history and English and Bs in science and math; therefore, I should follow my strengths and specialize in something drawing on the same skills as history and English. Boys will say, I got Bs in science and math and Cs in history and English; therefore, I should follow my strengths and do something involving science and math.

Note that this is consistent with the Card and Payne study of Canadian high school students that I discussed in my post, The Gender Gap in STEM is NOT What You Think. Quoting Card and Payne:

On average, females have about the same average grades in UP (“University Preparation”, AT) math and sciences courses as males, but higher grades in English/French and other qualifying courses that count toward the top 6 scores that determine their university rankings. This comparative advantage explains a substantial share of the gender difference in the probability of pursuing a STEM major, conditional on being STEM ready at the end of high school.

and myself:

Put (too) simply the only men who are good enough to get into university are men who are good at STEM. Women are good enough to get into non-STEM and STEM fields. Thus, among university students, women dominate in the non-STEM fields and men survive in the STEM fields.

Finally, Stoet and Geary show that the above considerations also explain the gender-equality paradox, because the intra-individual differences are largest in the most gender-equal countries. In the figure below, on the left are the intra-individual differences in science by gender, which increase with gender equality. A higher score means that boys are more likely to have science as a relative strength (i.e., women may get absolutely better at everything with gender equality, but the figure suggests that they get relatively better at reading). On the right is the share of women going into STEM fields, which decreases with gender equality.

The male dominance in STEM fields is usually seen as due to a male advantage and a female disadvantage (whether genetic, cultural or otherwise). Stoet and Geary show that the result could instead be due to differences in relative advantage. Indeed, the theory of comparative advantage tells us that we could push this even further than Stoet and Geary. It could be the case, for example, that males are worse on average than females in all fields but they specialize in the field in which they are the least worst, namely science and math. In other words, boys could have an absolute disadvantage in all fields but a comparative advantage in math and science. I don’t claim that theory is true but it’s worth thinking about a pure case to understand how the same pattern can be interpreted in diametrically different ways.
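A toy example with made-up scores makes the pure case concrete:

```python
# Hypothetical scores: the boy is worse at BOTH subjects (absolute disadvantage),
# yet math is his relative strength (comparative advantage).
scores = {
    "girl": {"reading": 80, "math": 70},
    "boy":  {"reading": 55, "math": 65},
}

for student, subj in scores.items():
    best = max(subj, key=subj.get)  # each student's relative strength
    print(f"{student}: best subject is {best} {subj}")

# girl: best subject is reading -> specializes in reading-intensive fields
# boy:  best subject is math    -> specializes in STEM,
# even though the girl outscores him in both subjects.
```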

To Serve and Collect

My latest paper (co-authored with Michael Makowsky and Thomas Stratmann) is To Serve and Collect: The Fiscal and Racial Determinants of Law Enforcement (forthcoming in the Journal of Legal Studies):

We exploit local deficits and state-level differences in police revenue retention from civil asset forfeitures to estimate how incentives to raise revenue influence policing. In a national sample, we find that local fine and forfeiture revenue increases at a faster rate with drug arrests than arrests for violent crimes. Revenues also increase at a faster rate with black and Hispanic drug arrests than white drug arrests. Concomitant with higher rates of revenue generation, we find that black and Hispanic drug, DUI, and prostitution arrests, and associated property seizures, increase with local deficits when institutions allow officials to more easily retain revenues from forfeited property. White arrests are broadly insensitive to these institutions, save for smaller increases in prostitution arrests and property seizures. Our results show how revenue-driven law enforcement can distort police behavior.

We find that drug arrests, especially of blacks and Hispanics, generate revenues, so police have the motive and opportunity to engage in revenue-driven policing. What about the means? Arrests for murder or robbery are limited by the number of murders and robberies. Drug arrests, however, are more of a police choice variable, able to be ramped up or down almost at will. Thus, in addition to motive and opportunity, police also have the means for revenue-driven law enforcement.

How can we test for this effect? In some states, police get to keep the revenues they collect from forfeitures, but these states are not randomly assigned. Thus, we use deficits, which are plausibly randomly assigned (relative to our variables of concern), and we identify off of the interaction of the two, i.e., the marginal impact of additional budget deficits in states where seizure revenue is retained.
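A minimal sketch of that interaction specification, with hypothetical variable and file names and none of the paper's controls or fixed effects:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical agency-year panel:
#   drug_arrests    - arrest counts
#   deficit         - local budget deficit
#   retains_revenue - 1 if state law lets police keep forfeiture revenue
df = pd.read_csv("arrests_panel.csv")  # hypothetical file

# The coefficient on deficit:retains_revenue captures the marginal impact
# of additional deficits in states where seizure revenue is retained.
model = smf.ols("drug_arrests ~ deficit * retains_revenue", data=df).fit()
print(model.summary())
```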

We find that black and Hispanic arrests for drugs, DUI, and prostitution are all increasing with deficits in states where seizure revenues are legally retained, while white arrests are broadly insensitive to deficits.

Our identification strategy is somewhat coarse, so we are by no means the final word on this issue, but surely it is time to forbid police departments from keeping the revenues they collect.

We conclude:

The prospects for justice are dimmed when the probability an individual is arrested varies not only by the character of their transgression but also by the potential windfall they present to the public coffer.

Mike Makowsky has an excellent tweetstorm going into further details.

Rejection is Good for Citation (maybe)

A paper in Science covering over 80 thousand articles in 923 scientific journals finds that rejected papers are ultimately cited more than first acceptances.

We compared the number of times articles were cited (as of July 2011, i.e., 3 to 6 years after publication, from ISI Web of Science) depending on their being first-intents or resubmissions. We used methods robust to the skewed distribution of citation counts to ensure that a few highly cited articles were not driving the results. We controlled for year of publication, publishing journal (and thus impact factor), and their interaction. Resubmissions were significantly more cited than first-intents published the same year in the same journal.

The authors argue that the most likely explanation is that peer review increases the quality of manuscripts. Rejection makes you stronger. That's possible, although the data are also consistent with peers being more likely to reject better papers!

Papers in economics are often too long but papers in Science are often too short. Consider the paragraph I quoted above. What is the next piece of information that you are expecting to learn? How many more citations does a resubmission receive! It's bizarre that the paper never gives this number (as far as I can tell). Moreover, take a look at the figure (at right) that accompanies the discussion. The authors say the difference in citations is highly significant, but on this figure (which is on a log scale) the difference looks tiny! This figure is taking up a lot of space. What is it saying?

So what’s the number? Well if you go to the online materials section the authors still don’t state the number but from a table one can deduce that resubmissions receive approximately 7.5% more citations. That’s not bad but we never learn how many citations first acceptances receive so it could be less than 1 extra citation.

There's something else that is odd. The authors say that about 75% of published articles are first-submissions. But top journals like Science and Nature reject 93% or more of submissions. Those two numbers don't necessarily contradict each other. If everyone submits to Science first, or if no one ever resubmits to Science, then 100% of papers published in Science will be first submissions. Nevertheless, the 93% of papers that are rejected at top journals (and lower-ranked journals also have high rejection rates) are going somewhere, so for the system as a whole 75% seems implausibly high.

Econ papers sometimes exhaust me with robustness tests long after I have been convinced of the basic result, but Science papers often leave me puzzled about basic issues of context and interpretation. This is also puzzling. Shouldn't more important papers be longer? Or is the value of time of scientists higher than that of economists, so that it's optimal for scientists to both write and read shorter papers? The length of law review articles would suggest that lawyers have the lowest value of time, except that doesn't seem to be reflected in their consulting fees or wages. There is a dissertation to be written on the optimal length of scientific publications.

Don’t Get Into a Knife Fight with Larry Summers

Larry Summers is not happy with Joseph Stiglitz’s piece The Myth of Secular Stagnation, which argues that the idea of secular stagnation as put forward by Summers and others was little more than a mask for poor economic policy and performance under the Obama administration.

Those responsible for managing the 2008 recovery (the same individuals bearing culpability for the under-regulation of the economy in its pre-crisis days, to whom President Barack Obama inexplicably turned to fix what they had helped break) found the idea of secular stagnation attractive, because it explained their failures to achieve a quick, robust recovery. So, as the economy languished, the idea was revived: Don’t blame us, its promoters implied, we’re doing what we can.

Larry responds:

I am not a disinterested observer, but this is not the first time that I find Stiglitz’s policy commentary as weak as his academic theoretical work is strong.

…In all of my accounts of secular stagnation, I stressed that it was an argument not for any kind of fatalism, but rather for policies to promote demand, especially through fiscal expansion. In 2012, Brad DeLong and I argued that fiscal expansion would likely pay for itself. I also highlighted the role of rising inequality in increasing saving and the role of structural changes toward the demassification of the economy in reducing demand.

…Stiglitz condemns the Obama administration’s failure to implement a larger fiscal stimulus policy and suggests that this reflects a failure of economic understanding. He was a signatory to a November 19, 2008 letter also signed by noted progressives James K. Galbraith, Dean Baker, and Larry Mishel calling for a stimulus of $300-$400 billion – less than half of what the Obama administration proposed. So matters were less clear in prospect than in retrospect.

Indeed, Stiglitz’s piece is difficult to understand as economic commentary because it’s hard to see much daylight between Stiglitz and Summers on actual diagnosis or policy. Stiglitz, for example, points to secular reasons for stagnation when he writes:

The fallout from the financial crisis was more severe, and massive redistribution of income and wealth toward the top had weakened aggregate demand. The economy was experiencing a transition from manufacturing to services, and market economies don’t manage such transitions well on their own.

Gauti Eggertsson is not as entertaining as Summers but he offers useful background.

Hysteria Was Not Treated With Vibrators

You know the story about the male Victorian physicians who unwittingly produced orgasms in their female clients by treating them for "hysteria" with newly invented, labor-saving, mechanical vibrators? It's little more than an urban legend, albeit one transmitted through academic books and articles. Hallie Lieberman and Eric Schatzberg, the authors of a shocking new paper, A Failure of Academic Quality Control: The Technology of Orgasm, don't quite use the word fraud but they come close.

Since its publication in 1999, The Technology of Orgasm by Rachel Maines has become one of the most widely cited works on the history of sex and technology (Maines, 1999). This slim book covers a lot of ground, but Maines’ core argument is quite simple. She argues that Victorian physicians routinely treated female hysteria patients by stimulating them to orgasm using electromechanical vibrators. The vibrator was, according to Maines, a labor-saving technology that replaced the well-established medical practice of clitoral massage for hysteria. She states that physicians did not perceive either the vibrator or manual massage as sexual, because neither method involved vaginal penetration.

This argument has been repeated in dozens of scholarly works and cited with approval in many more. A few scholars have challenged various parts of the book. Yet no scholars have contested her central argument, at least not in the peer-reviewed literature. Her argument even spread to popular culture, appearing in a Broadway play, a feature-length film, several documentaries, and many mainstream books and articles. This once controversial idea has now become an accepted fact.

But there’s only one problem with Maines’ argument: we could find no evidence that physicians ever used electromechanical vibrators to induce orgasms in female patients as a medical treatment. We examined every source that Maines cites in support of her core claim. None of these sources actually do so. We also discuss other evidence from this era that contradicts key aspects of Maines’ argument. This evidence shows that vibrators were indeed used penetratively, and that manual massage of female genitals was never a routine medical treatment for hysteria.

… the 19-year success of Technology of Orgasm points to a fundamental failure of academic quality control. This failure occurred at every stage, starting with the assessment of the work at the Johns Hopkins University Press. But most glaring is the fact that not a single scholarly publication has pointed out the empirical flaws in the book’s core claims in the 19 years since its release.

Wow. Read the whole thing.

Hat tip: Chris Martin on twitter.

Government Medical Research Spending Favors Women

It is commonly believed that medical research spending is biased against women. Here are some representative headlines: Why Medical Research Often Ignores Women (Boston University Today), Gender Equality in Medical Research Long Overdue, Study Finds (Fortune), A Male Bias Reigns in Medical Research (IFL Science). Largely on the basis of claims like these, the NIH set up a committee to collect data on medical research funding and gender, and the committee discovered there was a disparity: government-funded medical research favors women.

The Report on the Advisory Committee on Research on Women’s Health used the following criteria to allocate funding by gender:

All funding for projects that focus primarily on women, such as the Nurses’ Health Study, the Mammography Quality Standards Act, and the Women’s Health Initiative, should be attributed to women. For research, studies, services, or projects that include both men and women, recommended methods to calculate the proportion of funds spent on women’s health are as follow:

a. If target or accrual enrollment data are available, multiply the expenditure by the proportion of female subjects included in the program. For example, if 50 percent of the subjects enrolled in a trial, study, service, or treatment program are women, then 50 percent of the funds spent for that program should be counted as for women’s health. On the other hand, for diseases, disorders, or conditions without enrollment data, expenditures can be calculated based on the relative prevalence of that condition in women.

b. Where both males and females are included, as may be the case for many basic science research projects, multiply the expenditure by 50 percent.
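As a toy illustration of these attribution rules (the dollar figures here are made up):

```python
# Toy illustration of the report's attribution rules (hypothetical numbers).
def female_attributed(expenditure, female_share):
    """Rule (a): attribute spending in proportion to female enrollment."""
    return expenditure * female_share

# A $10M trial with 60% female enrollment counts $6M toward women's health.
print(female_attributed(10_000_000, 0.60))  # 6000000.0

# Rule (b): basic science including both sexes, no enrollment data -> 50/50.
print(female_attributed(10_000_000, 0.50))  # 5000000.0
```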

On the basis of these criteria the report finds that in almost every category there is more female-focused NIH funding than male-focused NIH spending with the totals more than two to one in favor of females ($4.5 billion to $1.5 billion). Now personally I don’t regard this as a terrible “bias” as most spending ($25.7 billion) is for human beings and I don’t see any special reason why spending on women and men should be equal. It does show, however, that the common wisdom is incorrect. The Boston University Today piece I linked to earlier, for example, motivated its claim of bias in funding with the story of a female doctor who died of lung cancer. The NIH data, however, show a large difference in favor of women–$180 million of NIH lung cancer funding was focused on women while just $318 thousand was focused on men ($135 million wasn’t gender focused).

What about clinical trials? Well for NIH-funded clinical trials the results favor women:

Enrollment of women in all NIH-funded clinical research in FY 15 and FY 16 was 50 percent or greater. Enrollment of women in clinical research was highest in the intramural research program at 68 percent for both FY 15 and FY 16.

In the most clinically-relevant phase III trials:

NIH-defined Phase III Clinical Trials are a subset of NIH Clinical Research studies. The proportion of female participants enrolled in NIH-defined Phase III Clinical Trials was 67 percent in FY 15 and 66 percent in FY 16.

Historically, one of the reasons that men have often been more prevalent in early stage clinical trials (trials which are not always meant to treat a disease) is that after the thalidomide disaster the FDA issued a guidance document which stated that women of child-bearing age should be excluded from Phase 1 and early Phase 2 research, unless the studies were testing a drug for a life-threatening illness. That guidance is no longer in effect but the point is that interpreting these results requires some subtlety.

The NIH funds more clinical trials than any other entity, but overall more clinical trials are conducted by industry. FDA data indicate that in the United States overall (the country where by far the most people are enrolled in clinical trials) the ratios are close to equal, 49% female to 51% male, although across the world there are fewer women than men in clinical trials, 43% women to 57% men for the world as a whole, with bigger biases outside the United States.

It would be surprising if industry research were biased against women because women are bigger consumers of health care than men. The Centers for Medicare and Medicaid Services find, for example, that:

Per capita health spending for females was $8,315 in 2012, approximately 23 percent more than for males, $6,788

Also:

Research indicates that women visit the doctor more frequently, especially as they have children, and tend to seek out more preventive care. The National Center for Health Statistics found that women made 30% more visits to physicians’ offices than men between 1995 and 2011.

Nor is it the case that physicians ignore women. In one study of time use of family physicians and thousands of patients:

After controlling for visit and patients characteristics, visits by women had a higher percent of time spent on physical examination, structuring the intervention, patient questions, screening, and emotional counseling.

Of course, you can always find some differences by gender. The study I just cited, for example, found that "More eligible men than women received exercise, diet, and substance abuse counseling." One often-quoted 2008 study found that women in an ER waited an average of 65 minutes, versus 49 minutes for men, to receive a painkiller. Citing that study in 2013, the New York Times decried that:

women were still 13 to 25 percent less likely than men to receive high-strength “opioid” pain medication.

Today, of course, that same study might be cited as evidence of bias against men, as twice as many men as women are dying of opioid abuse. I don't know what the "correct" numbers are, which is why I am reluctant to ascribe differences in the treatment of something as complex as pain to bias.

Overall, spending on medical research and medical care looks to be favorable to women, especially so given that men die younger than women.

Hat tip: Discussants on twitter.

Addendum: I expect lots of pushback and motte and baileying on this post. Andrew Kadar wrote an excellent piece on The Sex-Bias Myth in Medicine in The Atlantic in 1994 but great memes resist data. Also, school summer vacation is not a remnant from when America was rural and children were needed on the farm.

Most Reported School Shootings Never Happened

The headline sounds like something from a right-wing whack job but this is coming from NPR!

This spring the U.S. Education Department reported that in the 2015-2016 school year, “nearly 240 schools … reported at least 1 incident involving a school-related shooting.” The number is far higher than most other estimates.

But NPR reached out to every one of those schools repeatedly over the course of three months and found that more than two-thirds of these reported incidents never happened. Child Trends, a nonpartisan nonprofit research organization, assisted NPR in analyzing data from the government’s Civil Rights Data Collection.

We were able to confirm just 11 reported incidents, either directly with schools or through media reports.

Gambling Can Save Science

Nearly thirty years ago my GMU colleague Robin Hanson asked, Could Gambling Save Science? We now know that the answer is yes. Robin's idea to gauge the quality of scientific theories using prediction markets, what he called idea futures, has been validated. Camerer et al. (2018), the latest paper from the Social Science Replication Project, tried to replicate 21 social-science studies published in Nature or Science between 2010 and 2015. Before the replications were run, the authors ran a prediction market, as they had done in previous replication research, and once again the prediction market did a very good job predicting which studies would replicate and which would not.

Ed Yong summarizes in the Atlantic:

Consider the new results from the Social Sciences Replication Project, in which 24 researchers attempted to replicate social-science studies published between 2010 and 2015 in Nature and Science—the world’s top two scientific journals. The replicators ran much bigger versions of the original studies, recruiting around five times as many volunteers as before. They did all their work in the open, and ran their plans past the teams behind the original experiments. And ultimately, they could only reproduce the results of 13 out of 21 studies—62 percent.

As it turned out, that finding was entirely predictable. While the SSRP team was doing their experimental re-runs, they also ran a “prediction market”—a stock exchange in which volunteers could buy or sell “shares” in the 21 studies, based on how reproducible they seemed. They recruited 206 volunteers—a mix of psychologists and economists, students and professors, none of whom were involved in the SSRP itself. Each started with $100 and could earn more by correctly betting on studies that eventually panned out.

At the start of the market, shares for every study cost $0.50 each. As trading continued, those prices soared and dipped depending on the traders’ activities. And after two weeks, the final price reflected the traders’ collective view on the odds that each study would successfully replicate. So, for example, a stock price of $0.87 would mean a study had an 87 percent chance of replicating. Overall, the traders thought that studies in the market would replicate 63 percent of the time—a figure that was uncannily close to the actual 62-percent success rate.

The traders’ instincts were also unfailingly sound when it came to individual studies. Look at the graph below. The market assigned higher odds of success for the 13 studies that were successfully replicated than the eight that weren’t—compare the blue diamonds to the yellow diamonds.
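The pricing logic is simple: a share pays $1 if the study replicates and $0 otherwise, so the price is the market's implied probability of replication. A sketch (illustrative, not the SSRP's exact trading rules):

```python
# A share pays $1 if the study replicates, $0 otherwise,
# so the market price is an implied probability of replication.
def expected_profit_per_share(price, your_probability):
    """Expected profit from buying one share at `price`, given your own belief."""
    return your_probability * 1.0 - price

# At a price of $0.87 the market implies an 87% chance of replication.
# A trader who believes the true chance is only 60% expects to lose by buying...
print(expected_profit_per_share(0.87, 0.60))  # -0.27
# ...and so would sell instead, pushing the price down toward their belief.
```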

The Global Middle Class

Wash Post: The world is on the brink of a historic milestone: By 2020, more than half of the world’s population will be “middle class,” according to Brookings Institution scholar Homi Kharas.

Kharas defines the middle class as people who have enough money to cover basic needs, such as food, clothing and shelter, and still have enough left over for a few luxuries, such as fancy food, a television, a motorbike, home improvements or higher education.

It’s a critical juncture: After thousands of years of most people on the planet living as serfs, as slaves or in other destitute scenarios, half the population now has the financial means to be able to do more than just try to survive.

“There was almost no middle class before the Industrial Revolution began in the 1830s,” Kharas said. “It was just royalty and peasants. Now we are about to have a majority middle-class world.”

(Kharas’s definition of middle class takes into account differences in prices across countries.)

It's interesting that middle-class values are also expanding, especially in Asia, even as they may be declining in the United States:

According to the World Values Survey (2015), people in countries with burgeoning middle classes do not feel that governments are responsible for their success, but rather that it is thrift, hard work, determination, and perseverance that count.

The Most Momentous Place?

The old city of Jerusalem is astonishingly small for a city with so many momentous places. One can walk from Christianity’s holiest site to the holiest site of Judaism, pausing to look at one of the holiest sites of Islam, in less time than it takes to walk from my office on the campus of George Mason University to the campus Starbucks. Jerusalem is actually smaller than the GMU campus. GMU has had a few big events to its credit–two Nobel Prizes, several presidential speeches and so forth–but few people come here on pilgrimage. GMU doesn’t compete with Jerusalem.

Is there another parcel of land of similar size to the old city of Jerusalem that can lay claim to being similarly momentous? The signing of the Declaration of Independence in Philadelphia was pretty important but not much has happened there since. Cape Canaveral gets a nod but doesn’t span multiple fields of endeavor. Rome was important for a long time but its momentous events have faded compared to those that occurred in Jerusalem.

My best guess for a momentous parcel of land of similar size to old Jerusalem would be Cambridge University in the UK. Cambridge can lay claim to being the place of Newton, Darwin, Maxwell, Babbage, Turing, Oppenheimer, and Crick and Watson, as well as many others in the fields of politics, literature, and the social sciences, including economists such as Keynes, Marshall, and Sen. Overall, Cambridge gives Jerusalem a run for its money. Jerusalem had its momentous period between, say, the building of the first temple in 957 BCE and Muhammad's night journey around 621 CE, a period of roughly 1600 years, while Cambridge has had only an 800-year run since being built in 1231, so controlling for time a case can be made that Cambridge beats Jerusalem. Perhaps you disagree, but Cambridge is still racking up momentous events while Jerusalem hasn't had much in the past 1400 years, so Cambridge is certainly catching up. Of course, one big event could put Jerusalem back on top.

Aside from Cambridge, cases can be made for other universities such as Oxford, Harvard and even newcomer Chicago. But it’s interesting that universities come to mind as perhaps the only places in real competition with Jerusalem. Are there others?