How to Make Judges and Referees Pay

A recent viral tweet, quoted by Elon Musk, points out that bartenders can be fined or even imprisoned if they serve alcohol to patrons who later kill someone while under the influence. Judges, in contrast, enjoy absolute or qualified immunity even when they repeatedly release defendants who go on to kill.

I agree that judges should face stronger incentives to make good decisions, but the obvious problem with penalizing judges who release people who later commit crimes is that judges would then have very little incentive to release anyone—and that too is a bad decision. Steven Landsburg solved this problem in his paper A Modest Proposal to Improve Judicial Incentives, published in my book Entrepreneurial Economics.

Landsburg’s solution is elegant: we must also pay judges a bounty when they release a defendant.

In Landsburg's words:

Whether judges would release more or fewer defendants than they do today would depend on the size of the cash bounty, which could be adjusted to reflect the wishes of the legislature. The advantage of my proposal is not its effect on the number of defendants who are granted bail but the effect on which defendants are granted bail. Whether we favor releasing 1 percent or 99 percent, we can agree that those 1 percent or 99 percent should not be chosen randomly. We want judges to focus their full attention on the potential costs of their decisions, and personal liability has a way of concentrating the mind.

One might object that a cash bounty will cost too much, but recall that the bounty is balanced by penalties when a released defendant commits a future crime. The bounties and penalties can be calibrated so that on average the program is budget-neutral. The key is to get the incentives right on the margin.
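To make the budget-neutrality arithmetic concrete, here is a minimal sketch; the bounty size, the reoffense rate, and the dollar figures are illustrative assumptions, not numbers from Landsburg's paper:

```python
# Toy calibration of a budget-neutral bounty-and-penalty scheme.
# All numbers are illustrative assumptions, not from Landsburg's paper.
release_bounty = 1_000.0   # paid to the judge per defendant released
p_reoffend = 0.10          # average reoffense rate among released defendants

# Budget neutrality on average requires: bounty = p_reoffend * penalty
penalty = release_bounty / p_reoffend
print(f"penalty per reoffense for budget neutrality: ${penalty:,.0f}")

# A judge's expected payoff from releasing a defendant with individual
# reoffense risk p is bounty - p * penalty, positive exactly when p is
# below average -- rewarding release of low risks, penalizing high risks.
for p in (0.02, 0.10, 0.30):
    print(f"risk {p:.0%}: expected payoff {release_bounty - p * penalty:+,.0f}")
```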

The structure of this problem is quite general. Ben Golub, for example, writes:

There should be a retrospective reputational penalty imposed on referees who vote no on a paper because the paper is too simple technically — if that paper ends up being important. It’s an almost definitional indicator of bad judgment.

Quite right, but a penalty for rejection needs to be balanced with a bonus for acceptance. Get the marginal incentive right and quality will follow!

Economists on AI and economic growth and employment

We completed the most comprehensive study of how economists and AI experts think AI will affect the U.S. economy. They predict major AI progress—but no dramatic break from economic trends: GDP growth rates similar to today’s and a moderate decline in labor force participation. However, when asked to consider what would happen in a world with extremely rapid progress in AI capabilities by 2030, they predict significant economic impacts by 2050:

• Annualized GDP growth of 3.5% (compared to 2.4% in 2025)

• A labor force participation rate of 55% (roughly 10 million fewer jobs)

• 80% of wealth held by the top 10% (highest since 1939)

That is from this very good and very detailed Twitter thread, worth reading in its entirety.  Note this:

Only 5.2% of the variance is between scenarios—attributable to disagreement about AI capabilities themselves…

Here is the full paper, over 200 pages long; I will be reading through it.  The list of authors is impressive, with Ezra Karger in the lead, also including Kevin Bryan, Basil Halperin, and many more.  For some while this will stand as the best set of estimates we have.  Here are the related forecasts of Seb Krier.
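For perspective on the 3.5 percent versus 2.4 percent figures above, a quick compounding check, assuming (purely for illustration) constant annualized growth rates from 2025 to 2050:

```python
# How much bigger is GDP in 2050 if growth runs at 3.5% rather than 2.4%?
# Assumes constant annualized rates over 2025-2050, for illustration only.
baseline, rapid_ai = 0.024, 0.035
years = 2050 - 2025

ratio = ((1 + rapid_ai) / (1 + baseline)) ** years
print(f"2050 GDP under rapid AI: {ratio:.2f}x the baseline path")
# ~1.31x: a substantial difference, but no dramatic break from trend.
```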

Is financial economics still economics?

That all sounded wonderful, and that core model and its offshoots dominated financial research for decades. The problem, however, was that it wasn’t true, or at least it wasn’t nearly as true as we had thought and hoped. When financial economists refined the models with more complete specifications, it turned out Beta didn’t predict stock returns much at all. Eugene Fama and Kenneth French delivered one of the final blows to earlier approaches with a 1992 paper showing that Beta had essentially no explanatory power over expected returns. Since Fama himself was one of the original architects of CAPM-like reasoning, and French was also a renowned financial economist, these revisions to the model were credible. For all its original promise, marginalism, along with the concomitant notion of diminishing marginal utility, no longer seemed to help explain asset returns.
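To see the shape of the empirical exercise, here is a minimal simulated sketch of the two-pass test behind results like Fama and French (1992): estimate each stock's Beta from its time series, then regress average returns on those Betas across stocks. The data are simulated with no Beta premium built in, so the estimated premium comes out near zero; this illustrates the method only, not the authors' data or code:

```python
import numpy as np

# Two-pass cross-sectional test, sketched on simulated data in which
# Beta carries no premium (so the estimated premium should be ~zero).
rng = np.random.default_rng(0)
n_stocks, n_months = 500, 240

market = rng.normal(0.006, 0.04, n_months)        # market excess returns
true_beta = rng.uniform(0.5, 1.5, n_stocks)
returns = 0.005 + np.outer(true_beta, market - market.mean()) \
          + rng.normal(0, 0.08, (n_stocks, n_months))

# Pass 1: estimate each stock's Beta from its own time series
betas = np.array([np.polyfit(market, returns[i], 1)[0] for i in range(n_stocks)])

# Pass 2: regress average returns on estimated Betas across stocks
slope, intercept = np.polyfit(betas, returns.mean(axis=1), 1)
print(f"estimated monthly Beta premium: {slope:.4f}")   # ~0.00
```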

Under one plausible account of intellectual history, you can date the decline of marginalism to that 1992 paper. In the most rigorous, data-oriented, and highest-paying field of economics, namely finance, marginalist constructs had every chance to succeed. In fact, they ran the board for several decades. But over time they failed. In the most prestigious field of economics, marginalism has been in full retreat for over 30 years, and it shows no signs of making a comeback.

We already know that financial practice is dominated by the (non-economist) quants. But how about financial economics research, the parts that are still done by economists? What direction is that work moving in?

I was struck by a 2024 paper published in the Journal of Financial Economics, one of the two leading journals of financial economics (Journal of Finance is the other). The authors are Scott Murray, Yusen Xia, and Houping Xiao, and the title is “Charting by Machines.” The core result is pretty simple, and best expressed in the well-written abstract:

“We test the efficient market hypothesis by using machine learning to forecast stock returns from historical performance. These forecasts strongly predict the cross-section of future stock returns. The predictive power holds in most subperiods and is strong among the largest 500 stocks. The forecasting function has important nonlinearities and interactions, is remarkably stable through time, and captures effects distinct from momentum, reversal and extant technical signals. These findings question the efficient market hypothesis and indicate that technical analysis and charting have merit. We also demonstrate that machine learning models that perform well in optimization continue to perform well out-of-sample.” Murray, Xia, and Xiao (2024, p. 1).

Or consider the new paper by Borri, Chetverikov, Liu, and Tsyvinski (2024). They propose a new non-linear, single-factor asset pricing model. In the abstract: “Most known finance and macro factors become insignificant controlling for our single-factor.” Yet you won’t find traditional economic variables discussed in this paper; it is all about the math, in particular a model built on the Kolmogorov-Arnold representation theorem.

In other words, the successful approach to predicting returns is giving up on traditional portfolio theory and using the “theory-less” technique of machine learning. Although this is published in the Journal of Financial Economics, in some significant sense it is not economic reasoning at all. It is calculation, combined with expertise in math and computer science. The modeling is not economic modeling in a manner that has ties to marginalism or standard intuitive microeconomic theory. And the work is predicting excess returns in a pretty robust and successful way…
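As a toy illustration of the genre, the whole exercise fits in a few lines. This is not the Murray, Xia, and Xiao model: the learner, the features, and the data below are all assumptions, and the data here are pure noise, so the honest out-of-sample answer is roughly zero predictability.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Forecast next-month returns from each stock's own 12-month history
# with a nonlinear learner. Data are pure noise, so the out-of-sample
# correlation should be ~zero; the point is the shape of the exercise.
rng = np.random.default_rng(1)
rets = rng.normal(0, 0.1, (1000, 13))      # 1000 stocks, 13 months each

X_train, y_train = rets[:800, :12], rets[:800, 12]
X_test, y_test = rets[800:, :12], rets[800:, 12]

model = GradientBoostingRegressor(max_depth=3, n_estimators=200)
model.fit(X_train, y_train)
ic = np.corrcoef(model.predict(X_test), y_test)[0, 1]
print(f"out-of-sample information coefficient: {ic:.3f}")
```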

There is a recent working paper that is perhaps even more striking, by Antoine Didisheim, Shikun (Barry) Ke, Bryan T. Kelly, and Semyon Malamud. They pick up from Arbitrage Pricing Theory (APT), a well-established idea from financial economics. APT typically looks for “factors” in the data which predict excess returns, and a traditional APT model might have found five or six such factors. Are “inflation” or perhaps “the term structure of interest rates” useful factors? Well, that can be debated, but if so, those results sound pretty intuitive. But those intuitions seem to be disappearing. The authors apply machine learning methods to look for more factors, and as we know, machine learning is very good at finding non-obvious relationships in the data. The largest model they built has 360,000 (!) factors, and it reduces pricing errors by 54.8 percent relative to the classic six-factor model from Fama and French. Bravo to the authors, but what kinds of intuitions do you think can possibly be supported by those 360,000 factors?
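To see how pricing errors can fall even as intuition evaporates, here is a toy sketch in which the “factors” are random nonlinear transformations of stock characteristics with no economic interpretation at all. This shows only the in-sample mechanics and is not the Didisheim, Ke, Kelly, and Malamud estimator; their results are out-of-sample and carefully regularized.

```python
import numpy as np

# In-sample pricing error shrinks mechanically as the number of random,
# uninterpretable "factors" grows; with more factors than stocks it hits
# roughly zero. Illustrative only -- not the cited paper's estimator.
rng = np.random.default_rng(2)
n_stocks = 2000
chars = rng.normal(size=(n_stocks, 10))            # stock characteristics
avg_ret = np.sin(chars[:, 0]) * chars[:, 1] + 0.1 * rng.normal(size=n_stocks)

for k in (6, 60, 600, 6000):
    W = rng.normal(size=(10, k))
    factors = np.tanh(chars @ W)                   # k random nonlinear factors
    coef = np.linalg.lstsq(factors, avg_ret, rcond=None)[0]
    err = np.sqrt(np.mean((avg_ret - factors @ coef) ** 2))
    print(f"k={k:>5}: in-sample RMS pricing error {err:.3f}")
```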

That is from my new book The Marginal Revolution: Rise and Decline, and the Pending Revolution in AI.

The economics of dropout risk

Bryan Caplan keeps hammering this point home; it is good to see follow-up work:

In the United States, college dropout risk is sizable. We provide new empirical evidence that beliefs about the likelihood of earning a bachelor’s degree predict college enrollment, and that the distribution of these beliefs exhibits widespread optimism. We incorporate this distribution of beliefs into an overlapping generations model with college as a risky investment that can be financed via federal loans, grants, family transfers, or earnings. We then examine the welfare impact of access to federal student loans. We find that access can reduce welfare for young adults who are low-skilled, poor, and optimistic, due to their mistaken beliefs.

That is from AEJ: Macroeconomics, by Emily G. Moschini, Gajendran Raveendranathan, and Ming Xu.  Via the excellent Kevin Lewis.
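Here is a back-of-the-envelope version of the mechanism, with made-up numbers (nothing below comes from the paper's calibration): optimism about graduation odds can flip the sign of the expected value of enrolling, and loan access is what lets the negative-expected-value investment proceed.

```python
# Toy expected-value calculation; all dollar figures and probabilities
# are made up for illustration, not taken from the paper.
true_p, believed_p = 0.35, 0.70        # actual vs. believed graduation odds
premium = 300_000                      # lifetime earnings gain from a degree
dropout_gain = 40_000                  # partial gain if you drop out
cost = 150_000                         # tuition plus forgone earnings

def expected_value(p):
    return p * premium + (1 - p) * dropout_gain - cost

print(f"EV at believed odds: {expected_value(believed_p):+,.0f}")  # +72,000
print(f"EV at true odds:     {expected_value(true_p):+,.0f}")      # -19,000
```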

Tuesday assorted links

1. “… presenting Economics as empirical and socially relevant may broaden the profile of those who consider the field.”  But it does not get more people interested.

2. Youth happiness has been rising in many places, possibly most.

3. The NIH as an implicit regulatory body.

4. The Lebron critique of prediction markets.

5. Adam Tooze: “It is a truism of the moment that China is the last adult in the room.”

6. Quantum breakthroughs? And another account.  Will the Satoshi wallet remain safe?

7. Shall we organize scientific literatures around claims rather than papers?

A reminder (for academics)

Yes, there are skills AIs haven’t mastered. But if your skill still appears to be the exclusive province of humans, that might mean the major AI companies do not yet consider it very important to master right away. Eventually it will rise to the top of the list.

Here is more from my Free Press essay on AI.  If not for the copied passage, it seems no one was noticing this book review? (NYT, read the emendation)

New issue of Econ Journal Watch

EJW Volume 23, Issue 1, March 2026

Specification Searching in the Race between Education and Technology: Joseph Francis criticizes a canonical model of the American labor market, which has been used to advocate for more funding for education to reduce inequality. He shows how the model has routinely failed to predict the evolution of the college wage premium. Ad hoc econometric adjustments have been necessary to make the model fit the data, most notably in Claudia Goldin and Lawrence F. Katz’s well-known book. (The commented-on authors are hereby invited to reply in a future issue.)

Globalization and the China Shock: A Reassessment: David Autor, David Dorn, and Gordon Hanson estimated the effect of imports of manufactured goods from China from 1990 to 2007 on employment, wages, and social welfare payments in the USA, concluding that imports from China reduced manufacturing employment and lowered wages of workers in non-manufacturing industries. Robert Kaestner argues that the authors’ focus only on Chinese imports, which are correlated with imports from other countries and likely other omitted variables, muddles the interpretation and usefulness of their results. Kaestner argues that their estimates do not measure the effect of Chinese imports on employment and wages holding all other things equal, and do not even measure the broader equilibrium effect of Chinese imports on outcomes that includes changes in imports from other countries. Overall, the evidence suggests that omitted variable bias is likely, which renders their estimates uninformative. (The commented-on authors are hereby invited to reply in a future issue.)

Learning on machine learning on the housing supply impact of land use reforms: An Urban Studies article reports relatively modest housing-stock gains from liberalization, based on a dataset of reforms identified via machine learning applied to newspaper coverage. Researchers at the American Enterprise Institute challenge the article’s methodology and conclusions, and the Urban Studies authors respond.

An Article in Science on Covid Origins Contains a Fundamental Error: An influential article claimed that Bayesian analysis of the molecular phylogeny of early SARS-CoV-2 cases indicated that the likelihood that two successful introductions to humans had occurred was greater than the likelihood that just one had occurred. After correcting a fundamental error in Bayesian reasoning, the results presented in that paper imply larger likelihood for a single introduction, reducing the plausibility of the wet-market zoonosis account of Covid’s origins. (The commented-on authors were invited to reply and the invitation remains open.)

A Critique of Synthetic Control Method Studies on Covid-19 Policy—Evidence from Sweden: Five studies employing the Synthetic Control Method (SCM) conclude that Sweden would have experienced lower mortality had it imposed a mandatory lockdown in early 2020. Dividing Sweden into four hypothetical countries based on winter holiday timing—a proxy for pre-lockdown viral seeding—Jonas Herby shows that the estimated lockdown effect varies dramatically across regions with identical policies, suggesting SCM captures variation in viral spread rather than a causal policy effect. Sweden’s low excess mortality in the end suggests that Sweden’s state epidemiologist, Anders Tegnell, was right all along. (The commented-on authors are hereby invited to reply in a future issue.)

Central Banking Research Is Increasingly Directed to Environment, Inequality, Gender, and Race: Radu Șimandan and Cristian Valeriu Păun use the Scopus database to show how environment, inequality, gender, and race have soared as topics in research outlets supposedly focused on money and banking. They discuss the hazards of subverting price stability and other traditional central bank mandates.

Power Analysis Is Essential—A Case Study in Rounded Shapes: A Journal of Consumer Research article reported an A/B test in which simply rounding the corners of square buttons increased click-through rate by 55 percent, but provided no power analysis. Ron Kohavi and coauthors show that the original study was highly underpowered. They report that three high-powered A/B replications, each over two thousand times larger, had estimated effects approximately two orders of magnitude smaller than initially claimed (a sketch of the underlying power arithmetic appears after these summaries). (The commented-on authors are hereby invited to reply in a future issue.)

“Impartial spectator” in Adam Smith’s The Theory of Moral Sentiments: In the previous issue, a critique alleged that numerous scholars flatten Smith’s “impartial spectator.” Jack Weinstein responds with “Adam Smith’s Impartial Spectator Is Neither Divine Nor an Ideal Observer,” and the critics renew their case against flattening “impartial spectator.”

The Ideological Profile of France’s Economic Bestsellers: Alexis Sémanne inspects the 100 economics bestsellers for 2024, as listed by a leading French bookseller. He develops seven categories and evaluates each book for its ideological tendency. Few of the books offer a freedom-oriented perspective.

Green Vanities in Europe: John Constable reviews A Green Entrepreneurial State? Exploring the Pitfalls of Green Deals, edited by Magnus Henrekson, Christian Sandström, and Mikael Stenkula, a book which reveals more than the fact that green deals in Europe have been failures.
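On the power-analysis point, here is a minimal calculation in the spirit of the Kohavi critique; the baseline click-through rate and the candidate lifts are illustrative assumptions, not figures from the papers:

```python
import math
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Sample size needed per arm to detect a CTR lift at 80% power, alpha 0.05.
# Baseline CTR and lifts are illustrative, not Kohavi et al.'s figures.
base_ctr = 0.02
for rel_lift in (0.55, 0.005):   # claimed +55% vs. a ~100x smaller effect
    effect = proportion_effectsize(base_ctr * (1 + rel_lift), base_ctr)
    n = NormalIndPower().solve_power(effect_size=effect, power=0.8, alpha=0.05)
    print(f"relative lift {rel_lift:.1%}: ~{math.ceil(n):,} users per arm")
# Tiny effects demand samples thousands of times larger -- exactly why an
# underpowered test can report a wildly inflated estimate.
```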

EJW thanks its referees and others who contribute to its mission.

Sentences to ponder

This matters for the AI question, and the book leaves it unfinished. If the breakthroughs of the past required social conditions, not just cognitive capacity, then what does it mean when the next breakthroughs are produced by systems that have no social conditions at all? A neural net does not need a university chair or financial independence from the church. It does not need to reorganize its commitments. It does not, in any recognizable sense, have commitments. The machine that replaces the marginalist is not a better marginalist. It is a different kind of thing entirely.

That is from Jônadas Techio, presumably with LLMs; this review of The Marginal Revolution is interesting throughout.  And this:

Maybe the book demonstrates only that Cowen personally remains good at something the field no longer needs.

Monday assorted links

1. Was there a great Philadelphia cheese steak stagnation?

2. David French on the enemies of free speech (NYT).  And yes it is Indonesian censorship, nothing to celebrate.

3. Profile of Hussein Aboubakr.  Good piece on one of today’s best thinkers and writers.  Link to Twitter and Substack.  Unlike the work of many writers on these topics, it is not about your opinion of Israel; rather, each piece is interesting and substantive.  Try his essay on Mahfouz.

4. Lab Leak is somewhat declining in plausibility.

5. “China is cracking down on families who opt to bury their dead in empty high-rise properties — known as ‘bone ash apartments’ — rather than pay skyrocketing costs for cemetery plots.” (FT)

6. Do developing countries still need to industrialize?

7. JFV on education and AI.

Grade Caps are Not a Good Solution to Grade Inflation

It’s well known that grade inflation has “degraded” the informational content of grades at many colleges. At Harvard, two-thirds of all undergraduate grades are now A’s—up from about a quarter two decades ago. In response, a Harvard faculty committee has proposed capping A grades at 20 percent of each class (plus a cushion for small courses). That may give professors some cover to resist further inflation, but it doesn’t solve the real problem.

The real problem is not inflation per se. It’s that students are penalized for taking harder courses with stronger peers. A grade cap leaves that distortion intact—and can even amplify it. As Harvard economist Scott Kominers argues:

A grade cap systematically penalizes ambitious students for surrounding themselves with strong classmates. Perverse course-shopping incentives ensue as a result. A student who is prepared for an advanced course but concerned about landing in the bottom 80 percent may choose to drop down preemptively—seeking out a pond where they are a relatively bigger fish. As strong students move into lower-level courses, competition for A grades increases there while harder courses continue to shrink—reducing their A allocation further and driving more students away.

The underlying issue is informational. A grade tries to capture two things—student ability and course difficulty—with a single number. Gans and Kominers show that in general this is impossible: if some students take math and earn B’s while others take political science and earn A’s, there is no way, from grades alone, to tell whether the difference reflects ability or course difficulty.

There is, however, a solution in some cases. Clearly, if every student takes some math and political science courses, informative patterns can emerge. If math students tend to get B’s in math but A’s in political science, while political science students get A’s in their own field but C’s in math, you can begin to separate course difficulty from student ability.

Students don’t all take the same classes. But full overlap isn’t necessary—you just need a connected network. If Alice just takes math courses, Joe takes math and political science courses, and Bob just takes political science courses, then Alice and Bob can be compared through Joe. With enough of these links, the entire system can be stitched together. The more overlap, the more precise the estimates.
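Here is a minimal sketch of that stitching logic, assuming each grade equals student ability minus course difficulty plus noise. The data are toy numbers, and the least-squares estimator is a simplified stand-in, not anyone's actual method:

```python
import numpy as np

# Model each grade as ability - difficulty and recover both from
# overlapping enrollments by least squares. Toy data; a simplified
# stand-in for estimators like the ones discussed below.
students, courses = ["Alice", "Joe", "Bob"], ["math", "polisci"]
grades = [("Alice", "math", 3.0), ("Joe", "math", 3.0),
          ("Joe", "polisci", 4.0), ("Bob", "polisci", 3.7)]

idx = {name: i for i, name in enumerate(students + courses)}
X = np.zeros((len(grades), len(idx)))
y = np.zeros(len(grades))
for row, (s, c, g) in enumerate(grades):
    X[row, idx[s]] = 1.0      # student ability enters positively
    X[row, idx[c]] = -1.0     # course difficulty enters negatively
    y[row] = g

theta = np.linalg.lstsq(X, y, rcond=None)[0]      # identified up to a constant
theta -= theta[[idx[c] for c in courses]].mean()  # normalize: mean difficulty 0

for s in students:
    print(f"{s}: adjusted ability {theta[idx[s]]:.2f}")
for c in courses:
    print(f"{c}: relative difficulty {theta[idx[c]]:+.2f}")
# Alice ties Joe at 3.50 despite her lower raw grade, because her only
# grade came in the harder course; Bob, with the highest raw grade,
# ranks below both. Alice and Bob are compared through Joe, their link.
```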

Valen Johnson proposed a practical method along these lines in 1997. Gans and Kominers embed the same intuition in a much more general framework, showing exactly what can and cannot be inferred, and under what conditions.

The great thing about achievement indexes based on relative comparisons is that they are robust to grade inflation and do not penalize students for taking hard classes or subjects. A political science student who chooses to take a tough math class instead of an easy-A intro to sociology course won’t be penalized, because their low math grade will, in effect, be boosted by the difficulty of the course and the quality of the students. That’s good for the student and also good for disciplines that have lost students over the years because they held the line on grade inflation.

One final point. Harvard’s cap proposal appears to have been developed with little engagement with researchers who have studied problems like these for decades in the mechanism and market design literature—people like Kominers, Gans, Budish, Roth, Maskin, and Sönmez, some of them at Harvard! Nor is this a case where high theory can be dismissed as impractical: the high theory of mechanism design has produced real-world systems including kidney exchanges, school choice mechanisms, physician-resident matching, and even the assignment of students to courses at Harvard, among many other mechanisms. Mechanism design is practical.

Grade inflation is a mechanism design problem—and we know a lot about how to solve it, if we want to solve it.

Do Parents Propagate Inequality Among Children?

The subtitle of the piece is “Evidence From Chinese and Swedish Twins.”  Abstract:

Economists have long studied how parental behavior shapes within-family inequality, yet empirical findings remain mixed. Using twins data from China and Sweden, we examine the predominant mechanisms reported in the literature. Parents in both countries invest similarly during childhood. Inter vivos transfers, however, differ: Chinese parents reinforce income inequality, whereas Swedish parents distribute wealth equally; the reinforcing pattern reflects exchange motives. Bequests are divided equally in both countries. Parental education plays a key role: less educated parents reinforce income inequality, whereas more educated parents transfer wealth equally. Cross-country differences in parental education may thus help explain the mixed findings.

By Aiday Sikhova, Sven Oskarsson, and Rafael Ahlskog.  Via the excellent Kevin Lewis.

Claudia Goldin and the WNBA

After Claudia Goldin became the first woman to win a solo Nobel in economics in 2023, she received hundreds of invitations and requests. She accepted just three.

One of them was advising the WNBA players union as the women prepared to negotiate a new labor deal with the league.

When Goldin replied via email to Terri Carmichael Jackson, executive director of the players union, “I remember just reading it and screaming,” Jackson said. Goldin had one requirement: She refused to be paid.

This month, the two sides reached a collective bargaining agreement that gave Women’s National Basketball Association players a nearly 400% raise. Starting this season, players’ average salary will top $580,000.

It isn’t just the biggest pay increase in U.S. league history. It is, as far as Goldin is aware, the biggest increase any union anywhere has ever negotiated.

“It’s astounding,” the 79-year-old Harvard economist said.

Mike Bass, a spokesman who represents both the National Basketball Association and the WNBA, called the deal “transformational.”

…More recently, as the pay negotiations stretched on, Goldin said she stayed focused not on the countless separate points in the typical lengthy labor deal but on one central equation: the fraction of league revenue going to players’ salary and benefits.

Goldin’s calculations had a calming effect on the players, said Jackson, the union’s executive director.

Here is more from the WSJ.  Via Anecdotal.

*The AI Doc*

The subtitle of the movie is Or How I Became an Apocaloptimist, and here is the trailer.

Overall this film was better and smarter than I was expecting.  Intelligent people were allowed to speak, and to present various sides of the issue.  It was also interesting to see how various people one knows come across on the big screen.

It is easy enough to mock the final section of the movie, which calls for a participatory “civil rights” movement on AI, negotiations with China, and a big voice for trade unions in the decisions.  What Dan Klein calls “the people’s romance.”  The Straussian read there is correct, even though it probably was not intended by the moviemakers.  In reality, for better or worse, the final decisions will continue to be made by the national security establishment.

On a weekend, there were five other people in the theater.

The Candidates’ tournament

Caruana and Sindarov have won today, obviously boosting Caruana’s chances as favorite (he beat Nakamura, the number two rated player in the tournament).  Yet what the chess world needs right now is not a winner, but rather a greater sense of legitimacy for the world title.  Ideally the same person should win a championship match two or three times in a row, and with a decisive margin.  They do not have to be as good as Carlsen, just clearly better than everyone else.  Nepo never quite made it, Ding has retreated from the chess world, and Caruana has yet to win a first title.  Is he young enough to win a few in a row?  Or are we waiting for Nodirbek Abdusattorov (or someone else) to enter the cycle?  I fear decisiveness is not soon on the way.  There are several (relatively) weak players in this tournament, so a variety of players can win just by beating up on the weakies, rather than by demonstrating mastery over their strongest peers.  Legitimacy is likely to remain uncertain, to the detriment of the chess world.  But soon we will know more.