Law professors prefer AI over peer answers

Large language models (LLMs) are increasingly promoted as educational tutors, yet most evaluations focus on domains with a single ground truth. Many disciplines, however, hinge on judgment: reasoning, weighing ambiguity, and reaching defensible conclusions. Law provides a sharp test. We conducted a blinded evaluation of short-answer tutoring in contracts courses with sixteen U.S. law professors. Participants created 40 representative questions, wrote answers, and judged 2,918 anonymized comparisons between human and LLM responses. Professors rated LLMs far higher than their peers (average win rate = 75.33%), with models performing similarly to the best instructor. LLM responses were also rarely flagged as harmful (3.53%, vs 12.06% for professors). Preferences for LLM answers were consistent across evaluators and reflected shared professional standards. Our evaluation can be reliably extended to additional models by employing a separate LLM as a judge, rendering expert agreements an effective, scalable method to evaluate AI tutors in judgment-rich domains.

“far”.  That is from a new paper by Alejandro Salinas, et al.  Via Andrew Curran.  And via John Chamberlain:

Artificial intelligence (AI) and large language models (LLMs) tools are capable of mass-producing academic finance papers that are nearly indistinguishable from human-authored research, according to a new study published in the Journal of Economic Literature.

C’mon people, get ready.  I know it is difficult to admit when your human capital has been devalued, but that time is upon us.  In particular, being prolific is no longer such a comparative advantage in academia.  You might run to the “but I know what questions to ask” cope, but I implore you to solve for the equilibrium.  What is the equilibrium wage for merely asking questions?

Of course academic life and projects will continue, but the real rewards will go to people doing new, innovative, and hitherto impossible projects with AI.

The US Exports Intelligence

Most Americans work in the service sector so it’s not surprising that most export-related jobs are in the service sector. (The U.S. exports about $2.2 trillion of goods and $1.2 trillion of services, but services are more labor intensive than manufacturing so they support more export jobs per dollar.)

Richard Baldwin writes:

In 2022, US service exports supported 8.9 million American jobs.

US manufacturing exports supported 2.2 million.

That’s four-to-one in favour of services. Yet in the national narrative, ‘export jobs’ almost always means things done in steel mills and factories.

…When a household in Germany pays for Netflix, that is an American export. When a Brazilian retailer buys Microsoft cloud capacity, that is an American export. When JPMorgan structures a financial deal in London, or an American consulting firm advises a company in Singapore, those are American exports too.

None of these is shipped in a container. No customs official records them as they clear the customshouse. Yet they are exports since they earn foreign income for America just as surely as the ‘Boeings, Beans and Beef’ that President Trump sold on his recent China trip.

Need I remind you that when OpenAI sells intelligence to people abroad, that is a US export? N.B. this is the future.

World trade in goods expanded roughly five-fold between 1990 and 2020. Trade in digitally enabled services expanded more than eleven-fold over the same period. These are the modern services.

The trade debate is fixated on manufacturing—where America is doing fine—while largely ignoring services, where America is crushing. Increasingly, our most valuable exports travel not on container ships but at the speed of light over fiber.
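The jobs-per-dollar point above can be checked with a back-of-the-envelope sketch in Python. It pairs the export values and the job counts quoted in this post, which is an assumption: the dollar figures may not refer to 2022, and goods exports are broader than manufacturing exports. The function and variable names are illustrative only.

```python
# Figures quoted in the post. Assumptions: the dollar figures and the job
# counts may refer to different years, and "goods" is used here as a rough
# stand-in for "manufacturing".
SERVICES_EXPORTS_USD = 1.2e12
GOODS_EXPORTS_USD = 2.2e12
SERVICES_JOBS = 8.9e6
MANUFACTURING_JOBS = 2.2e6


def jobs_per_billion(jobs, exports_usd):
    """Export-supported jobs per $1 billion of exports."""
    return jobs / (exports_usd / 1e9)


services_intensity = jobs_per_billion(SERVICES_JOBS, SERVICES_EXPORTS_USD)
goods_intensity = jobs_per_billion(MANUFACTURING_JOBS, GOODS_EXPORTS_USD)
jobs_ratio = SERVICES_JOBS / MANUFACTURING_JOBS

print(f"Services: {services_intensity:,.0f} jobs per $1bn of exports")
print(f"Goods:    {goods_intensity:,.0f} jobs per $1bn of exports")
print(f"Jobs ratio, services to manufacturing: {jobs_ratio:.1f}")
```

On these rough inputs, services support several times as many jobs per export dollar as goods, which is how a smaller dollar volume of exports can support about four times as many jobs.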

80,000 Hours: The Book

Forty hours a week, fifty weeks a year, forty years: a career is about 80,000 hours. Yet it’s striking how little serious thought goes into career decisions relative to, say, choosing a mortgage. Indeed, you are almost supposed to tell a story about how a random incident changed your life. One summer a circus came to town—and that’s the whole reason I became an economist! (True story!). Career advice, when it exists, often amounts to the platitude of “follow your passions!” Ugh. If you ask people what their passions are, music, arts and sports top the list but guess what? There aren’t enough jobs in those categories to go around.

Benjamin Todd’s newly updated book, 80,000 Hours, is a unique examination of careers that runs the numbers in a serious way. The book is framed along Effective Altruism lines and it has some good public policy material. On pandemics, for example:

The world has plenty of religious cults, despots and would-be school shooters who might decide they want to take everyone else down with them…. The world [c]ould be one lab leak away from catastrophe.

Given what we know about the pace and accessibility of bioengineering tools, the chance that there will be a pandemic that kills over 100 million people during the next century seems high, plausibly similar or greater than the risk of large-scale nuclear war or climate change above six degrees. An engineered pandemic could also kill over 90% of the population, suggesting its overall scale is significantly larger.

But risks from pandemics are, even now, far more neglected than either of these. In comparison to $6bn–$10bn of philanthropic funding for climate change, and $1.6 trillion of total climate finance, pandemic prevention only receives $1bn of philanthropic funding, and total spending aimed at reducing the chance of worst-case pandemics is probably under $10bn.

See also my paper Pandemic Preparation Without Romance on what to do about it.

The opening chapters present the EA framing but most of the book has good advice even for the purely selfish–advice on building skills, networking and how to actually get a job. From what I have said so far, one might get the impression that the idea is to rationally choose your career at age 16 and then optimize your life around that plan. Not so! Todd rightly divides career paths into explore, build and deploy categories. Most people under-explore. It’s ok to jump around jobs and places, especially when you are young, so long as you are building skills and not just accumulating items for the CV. There’s evidence, for example, that scientists’ best work tends to follow periods of exploration with exploitation.

I also appreciate that Todd specifically warns about armchair theorizing. Pro-and-con lists, for example, are ok but far less useful than getting out of the chair and actively exploring. Go talk with people, try something for a week, go somewhere. Look for cheap tests.

Start with what’s easiest. We often find people who want to, say, try out economics, who then apply for a master’s degree. That’s a huge investment of time. Instead, think about how you can learn…. This could mean first reading an economics textbook, or taking a single course.

You can think about creating a ‘ladder’ of tests. Start with the cheapest ways to test your options, then after each step, re-evaluate. A ladder might look like this:
a. Read our relevant career reviews, all our research on a given topic, and talk to LLMs about what the jobs are like (two to five hours).
b. Speak to someone in the area (two hours).
c. Speak to a friend to get an outside perspective on what’s best (two hours).
d. Speak to three more people who work in the area and read one or two books (twenty hours).
e. Given your findings, look for a relevant project that might take one to four weeks of work – like applying to jobs, volunteering in a related role, or doing a side project in the area – to see what it’s like and how you perform.
f. Only then consider taking on a two- to twenty-four month commitment – like a work placement, internship or graduate study. Being offered a trial position with an organization for a couple of months can be ideal because both you and the organization want to quickly assess your fit.

80,000 Hours is The Random Walk Down Wall Street of career advice, the one book that really matters.

Explore, build rare and valuable skills, point them at a meaningful problem, and passion will follow rather than lead. And for those who don’t want to read a book, speak to an 80,000 Hours advisor. It’s a very cheap test.

Emergent Ventures winners, 54th cohort

Kenny Guo, Krish Chhajer, Luthira Abeykoon Mudiyansela, Toronto, quantum computing.

Jolie Gan, Calgary/SF, a publication on science and meta-science.

Hudson Mitchell-Pullman, 16, San Diego, how users interact with LLMs to learn.

Adnan Manna, 17, Amman, Jordan, finding exoplanets.

Heloise Hoffman, Stanford, biomedical research to cure her own rare disease.

M.F. Libano-Monteiro, Portugal/LSE, economics education through video.

Scott Ellis, Mississauga, Ontario, making movies with AI.

Adrian Martinez, 17, Mission Viejo, CA, math, education.

Aadil Ali, San Francisco, biographies of young achievers.

Chandler Reilly, Denver, Substack on Denver and its economics.

Jeremy Kingsley, London, AI, podcasting and Substack.

Michelle Lin, 15, McLean, VA, curved-surface stitching on deformable materials, and algorithms.

Brunella Tipismana, general career support, writing, Peru/New Haven/NYC.

Theo Cross-Zamirski, Cambridge, UK, platform to power math education.

Beatrice Erkers, Stockholm, progress ideas in Sweden.

Repugnant Economics

I spoke on a panel at AEI with Nobelist Al Roth about his new book, Moral Economics, which covers “repugnant markets,” from prostitution to surrogacy to kidney exchange. A fun book!

My case study was acting. Acting was considered repugnant for over 2,000 years. In Rome, actors could not vote, hold office, or be trusted to give an oath in legal proceedings. So why don’t we find acting repugnant today?

One lesson: weighing costs and benefits is not enough. Roth discusses empirical research showing that legalizing prostitution cut STDs and sexual assaults—against prostitutes and others. But evidence alone won’t shift a repugnance norm. You also have to reframe the activity. Acting, for example, was reframed from body rental to a skill requiring intelligence, training and ability. So I went out of my way to say that I am a fan of Aella—though not her only fan—and that I see no reason why escorting should not be considered a skill, requiring intelligence, training, and ability. I can think of few better ways of raising social welfare than making sex 10% better!

I also spoke on human challenge trials. Roth and I agree: challenge trials could have sped up COVID vaccines and saved tens of thousands of lives. We should be angry this didn’t happen. Why didn’t it? Even though most people think human challenge trials are a good idea, there was a repugnance bottleneck because the minority who did find human challenge trials repugnant were in charge. I discuss how to change this.

Al leads the discussion. My comments start at 25:15.

One way to benefit adolescents

Have school start later:

We examine the impact of California’s Senate Bill 328 (SB 328), the first statewide mandate requiring later school start times for middle and high schools, on adolescent sleep, mental health, and academic outcomes. Using difference-in-differences and event-study designs across five data sources, we find that SB 328 increased the share of students sleeping at least 8 hours per night by 13%, meeting the CDC-recommended minimum for this age group. Average mental health effects are imprecisely estimated, but boys show significant reductions in sadness, hopelessness, and suicidal ideation, and Hispanic students, who experienced the largest sleep-timing shifts, show parallel reductions in difficulty concentrating; together these patterns are consistent with a dose-response relationship between sleep improvement and mental well-being. Math and English scores in grade 8 improved by approximately 0.08–0.10 standard deviations, with the largest gains among Hispanic and economically disadvantaged students. A within-state analysis using teachers’ commute arrival times as a proxy for pre-policy school start times corroborates these findings, and shows academic gains accumulating over 2023–2025 alongside a suggestive decline in high school dropout rates. The absence of effects on chronic absenteeism rules out an attendance-driven mechanism, pointing instead to the direct cognitive benefits of aligning school schedules with adolescents’ biological rhythms.

That is from a new NBER working paper by Jialu (Gloria) Dou, Rania Gihleb, Osea Giuntella & Jakub Lonsky.
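For readers unfamiliar with the method named in the abstract, here is a minimal sketch of a two-group, two-period difference-in-differences calculation in Python. Every number below is made up for illustration and is not an estimate from the paper; the paper's actual designs are richer than this 2x2 comparison. The names `share_sleeping_8h` and `diff_in_diff` are hypothetical.

```python
# Toy 2x2 difference-in-differences. All numbers are hypothetical and
# chosen only to illustrate the arithmetic; they are NOT from the paper.
share_sleeping_8h = {
    ("treated", "before"): 0.30,  # e.g. the policy state, pre-policy
    ("treated", "after"): 0.35,   # the policy state, post-policy
    ("control", "before"): 0.31,  # comparison states, pre-policy
    ("control", "after"): 0.32,   # comparison states, post-policy
}


def diff_in_diff(means):
    """Change in the treated group minus change in the control group."""
    treated_change = means[("treated", "after")] - means[("treated", "before")]
    control_change = means[("control", "after")] - means[("control", "before")]
    return treated_change - control_change


effect = diff_in_diff(share_sleeping_8h)
print(f"Difference-in-differences estimate: {effect:+.3f}")
```

The idea is that the control group's change stands in for what would have happened to the treated group without the policy, so subtracting it isolates the policy's effect, assuming the two groups would otherwise have followed parallel trends.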

MIT fact of the day

Outside of Sloan and the EECS MEng program, still in the midst of admissions, compared with 2024, our departments’ new enrollments for next year are down close to 20%.

That means that, in total, outside of Sloan, we could have about 500 fewer graduate students. Which means we’ll have many fewer students advancing the work of MIT, and undergraduates will have fewer grad students as mentors in their research.

That is from the president of MIT in a recent speech.  It is time to put aside denial about the tsunami coming for higher education.

Hollis Robbins on AI and higher education

There’s a growing idea I’ve seen in some circles that college could be replaced by conversations between an A.I. tutor and a student. When I think about your model, I wonder why college even needs to exist. If I can just seek out a tutor, somebody that I like, and they just charge me a little bit, and we go through these edge-knowledge cases together, what’s the degree for? Couldn’t you, as Hollis Robbins—not only a specialist in African American sonnet traditions but also an idiosyncratic thinker on the subject of A.I. and the future of the academy—just set up your own shop?

I was in Austin, Texas, a couple of times in March with a bunch of twenty-five-year-old billionaires. This is what they’re looking at. Instead of having the credential from the institution, why not have the credential from the professor? If you have a Hollis Robbins education, what would that signal? What would that credential mean as opposed to a degree from a university? There was some conversation about what that would look like, and one guy at the end of the dinner said, “Instead of OnlyFans, it’s like OnlyProfessors.”

Here is much more from The New Yorker.

Another use of AI in research (from my email)

“Another thing we (John [Horton] and I) have thought about is having a swarm of AIs “fight” over a literature. They could take the cumulative datasets available and continuously argue until they understand the question. One line of thinking says they reach a stalemate (as scientists currently do). But we think not. More likely, they push evidentiary understanding to the limit and coalesce around what’s most probable — if not definitive!”

That is from Benjamin Manning.

Justin Wolfers update

Wolfers’s moment of clarity ultimately sent him down a road less traveled by academic economists: creating his own media company.

On Wednesday, Wolfers, 53, announced that he had founded Platypus Economics, an independent media start-up that aims to reach a mainstream audience. The name is a nod to his Australian roots, cheekily referring to the odd-looking mammal native to his birthplace. He’s funding the business himself, using the income from his textbook sales.

…To get his content channels off the ground and build an audience, Wolfers is teaming up with Initial Digital, the digital media division of the Initial Group, an entertainment company that’s backed by the private equity firm TPG.

Here is the full NYT story.

Rose Farts and the Invisible Hand

In Modern Principles, Tyler and I show the invisible hand by telling the story of how the increase in oil prices in the 1970s encouraged millions of adjustments in how goods were produced and allocated, everything from an increased use of brick for driveways to a movement of the flower market from the US, which relied on heating greenhouses, to warmer climes like Colombia and Kenya. See the I, Rose video!

The FT has an amusing update:

“When my sheep break wind, it smells of roses,” he said, recounting one of the more bizarre and far-flung consequences of the decision by US President Donald Trump and Israel’s Prime Minister Benjamin Netanyahu to bomb Iran in February.

Since Tehran hit back by firing drones and missiles at US allies in the Gulf — grounding cargo flights and closing off the Strait of Hormuz through which booming east African trade with the region used to flow — Mahihu has been forced to jettison millions of rose stems.

One farmer in Kenya is now feeding his flowers to his sheep.