Category: Education

“Tyler Cowen’s AI campus”

That is a short essay by Arnold Kling.  Excerpt:

Tyler’s Vision

As a student, you work with a mentor. At the beginning of each term, you and your mentor decide which courses you will take. If there are other students on campus taking them, great. If not, maybe you can take them with students at other schools, meeting remotely.

For each course, an AI can design the syllabus. Tyler gave an example of a syllabus generated by ChatGPT for a course on Tudor England. If you can find a qualified teacher for that course, great. If not, you could try learning it from ChatGPT, which would provide lessons, conversations, and learning assessments (tests).

Tyler thinks that 1/3 of higher ed right now should consist of teaching students how to work with AI. I do that by assigning a vibe-coding project, and by encouraging “vibe reading” and “vibe writing.”

The reason for proposing such a high proportion of effort to learning to work with AI is because we are in a transition period, where the capabilities of AI are changing rapidly. Once capabilities settle down, best practices will become established, and knowledge of how to use AI will be ingrained. For now, it is very hard to keep up.

It is possible, of course, that Tyler and I could be wrong. It could be that the best approach for higher ed is to keep students as far from AI as one can. I can respect someone who favors an anti-AI approach.

But I am disturbed by the lack of humility that often accompanies the anti-AI position in higher education. I have difficulty comprehending how faculty, at UATX and elsewhere, can express their anti-AI views with such vehemence and overconfidence. They come across to me like dinosaurs muttering that the meteor is not going to matter to them.

I believe the talk will be put online, but a few extra points here.

First, the one-third time spent learning how to use AI is not at the expense of studying other topics.  You might for instance learn how to use AI to better understand Homer’s Odyssey.  Or whatever.

Second, I remain a strong believer in spending many hours requiring the students to write (and thus think) without AI.  Given the properties of statistical sampling, the anti-cheating solution here requires that only a small percentage of writing hours be spent locked in a room without AI.

Third, for a small school, which of course includes U. Austin, the choice is so often not “AI education vs. non-AI education,” but rather “AI education vs. the class not being offered at all.”

Why should not a school experiment with two to three percent of its credits being AI offerings in this or other related manners?  Then see how students respond.
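One way to read the sampling argument in the second point is sketched below. This is an assumption of the sketch, not something spelled out in the talk: suppose each proctored, AI-free writing session independently exposes a student who cannot write unaided with some fixed probability. The function name and the 0.5 catch rate are hypothetical.

```python
def detection_probability(catch_rate: float, sessions: int) -> float:
    """Chance of at least one catch across independent proctored sessions.

    Assumes (hypothetically) that every session catches an AI-dependent
    student with the same probability, independently of the others.
    """
    if not 0.0 <= catch_rate <= 1.0:
        raise ValueError("catch_rate must be between 0 and 1")
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    # Complement of "never caught in any session".
    return 1.0 - (1.0 - catch_rate) ** sessions


if __name__ == "__main__":
    # Illustrative only: a 50% per-session catch rate is an assumed number.
    for n in (1, 2, 4, 8):
        print(n, detection_probability(0.5, n))
```

Under these assumptions the chance of escaping detection shrinks geometrically with the number of proctored sessions, which is why a small share of locked-room hours could suffice.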

Not as good as Cowen-Tabarrok

Russia is preparing a new economics textbook for university students that aims to challenge what its authors call a “myth” that democracy drives economic growth and to revive the socialist economic theories of Soviet leader Josef Stalin, the head of a Kremlin-linked advisory body said.

Moscow has ramped up efforts to enforce its view of history and global politics in schools since launching its full-scale invasion of Ukraine in 2022, introducing mandatory patriotic classes and rewriting history curricula to align with the Kremlin’s wartime narratives.

Valery Fadeyev, chairman of Russia’s presidential human rights council, told the RBC news website that he is leading work on the textbook, which could be introduced as early as the next academic year for students of sociology, political science and history.

The 350-400-page book, tentatively titled “Essays on Economics and Economic Science,” is intended to present a broader view of economic development than mainstream liberal theory, Fadeyev said.

Here is the full story, via Frank W.  The Kyiv School of Economics it ain’t…

Grade inflation sentences to ponder

Next, we consider the effects of grade inflation on future outcomes. Passing grade inflation reduces the likelihood of being held back, increases high school graduation, and increases initial enrollment in two-year colleges. Mean grade inflation reduces future test scores, reduces the likelihood of graduating from high school, reduces college enrollment, and ultimately reduces earnings.

Here is the full paper by Jeffrey T. Denning, Rachel Nesbit, Nolan Pope, and Merrill Warnick.  Via Kris Gulati.

My Austin visit

First, I gave a talk at University of Austin and also had some meetings there, including with students.  My talk was a practical guide on how to use AI to offer courses that a college or university otherwise cannot afford (especially important for smaller institutions).  I believe they will be putting it online.

My general sense was that U. Austin undergraduates are on a par with undergraduates at top five schools.  I do not think on the technical side they would compete with Stanford or MIT, but more generally…they were very impressive and asked excellent questions with real curiosity.  And seemed politically saner than typical Ivy League cohorts, though without being “mono” in any particular direction.  Here is Arnold Kling on UATX and its students.

The school does admissions by SAT scores only.

Austin is also one of my favorite places to eat in the United States.  It is especially strong in areas of import to me, including barbecue, cheeseburgers, and Tex-Mex.  Just ask your local friendly LLM.

Those new service sector jobs

Basketball Expert (Fans, Journalist, Commentator, etc.)

Role Overview

We’re looking for Basketball experts — avid fans, sports journalists, commentators, and former or semi-professional players — to evaluate basketball games. You’ll watch basketball games and answer questions in real time assessing the quality, depth, and accuracy of AI insights, helping us refine our AI’s basketball reasoning, storytelling, and strategic understanding.

Key Responsibilities

  • Game Evaluation: Watch basketball games and review AI-generated play-by-play commentary and post-game analysis.

  • Performance Scoring: Rate the accuracy, insight, and entertainment value of AI sports coverage.

  • Context & Understanding: Assess the AI’s grasp of player performance, game flow, and strategic decisions.

  • Error Detection: Identify factual mistakes, poor interpretations, or stylistic inconsistencies.

  • Feedback Reporting: Provide clear written feedback highlighting strengths, weaknesses, and improvement opportunities.

  • Collaboration: Work with analysts and developers to enhance the AI’s basketball-specific reasoning and realism.

From Mercor, pays $45 to $70 an hour.  For background on Mercor, see my very recent CWT with Brendan Foody.  Via Mike Rosenwald, wonderful NYT obituary from him here.

Soumaya Keynes on the bleak labor market for economists

Third was the bleak labour market for newly minted PhD economists, which Wendy Stock of Montana State University told me could be one of the toughest ever. Hiring freezes helped to halve the number of US full-time academic postings between 2019 and 2025. In the most recent year alone, listings fell by more than during the Great Recession. And according to the most recent comparable data, since 2019 recruitment has shrivelled faster for economists than philosophers or linguists. Oof.

Here is the full FT piece, oof throughout.

The Tyranny of the Complainers

Some years ago, Dourado and Russell pointed out a stunning fact about airport noise complaints: A very large number come from a single individual or household.

In 2015, for example, 6,852 of the 8,760 complaints submitted to Ronald Reagan Washington National Airport originated from one residence in the affluent Foxhall neighborhood of northwest Washington, DC. The residents of that particular house called Reagan National to express irritation about aircraft noise an average of almost 19 times per day during 2015.

Since then, total complaint volumes have exploded, but they are still coming from a tiny number of now apparently more “productive” individuals. In 2024, for example, one individual alone submitted 20,089 complaints, accounting for 25% of all complaints! Indeed, the total number of complainants was only 188, but they complained 79,918 times (an average of 425 per individual, or more than one per day).
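The airport figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch: the variable names are illustrative, and the day counts (365 for 2015, 366 for 2024) are assumptions added here.

```python
# Figures as quoted in the text; names are illustrative, not from the source.
DAYS_2015 = 365  # assumption: 2015 is a non-leap year
DAYS_2024 = 366  # assumption: 2024 is a leap year

# 2015: one residence versus all complaints.
total_2015 = 8_760
one_residence_2015 = 6_852
share_2015 = one_residence_2015 / total_2015
per_day_2015 = one_residence_2015 / DAYS_2015

# 2024: top individual versus all complainants.
total_2024 = 79_918
complainants_2024 = 188
top_individual_2024 = 20_089
share_top_2024 = top_individual_2024 / total_2024
avg_per_complainant_2024 = total_2024 / complainants_2024
avg_per_complainant_per_day_2024 = avg_per_complainant_2024 / DAYS_2024

if __name__ == "__main__":
    print(f"2015 share from one residence: {share_2015:.1%}")
    print(f"2015 calls per day from that residence: {per_day_2015:.1f}")
    print(f"2024 share from top individual: {share_top_2024:.1%}")
    print(f"2024 average per complainant: {avg_per_complainant_2024:.0f}")
    print(f"2024 average per complainant per day: {avg_per_complainant_per_day_2024:.2f}")
```

The results agree with the rounded figures in the text: almost 19 calls per day in 2015, and in 2024 about 25% from one individual and about 425 complaints per complainant.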

What I learned recently is that it’s not just airport noise complaints. We see the same pattern in data from the US Department of Education’s Office for Civil Rights, which enforces federal civil rights laws related to education funding. In 2023, for example, 5059 sexual discrimination complaints came from a single individual–from a total of 8151 complaints. Thus, one individual accounted for about 62% of all sexual discrimination complaints in that year.

In the annual reports for 2022-2024, the OCR identifies the type of complaint that the single individual with multiple complaints was making (a sex discrimination complaint), while in previous years it gives data only on the number of complaints from single individuals compared to the total of all types of complaints. I’ve collated this data in this graph, which presents totals compared to multiple complaints from a single individual without regard to the type of complaint. Do note that there are also single individuals filing hundreds of other types of complaints, such as age discrimination complaints, so the data from more recent years may actually be an underestimate.

In any case, it’s clear that a single individual often accounts for 10-30% of all complaints! These complaints have to be investigated so this single individual may be costing taxpayers millions. It’s as if a single individual were pulling a fire alarm thousands of times a year, mobilizing emergency services on demand, and never facing repercussions.

Does this strategy work? Probably. When complaints are summarized for Congress or reported in the media, are totals presented as-is, or adjusted for spam?

Increasingly, public institutions seem to exist to manage the obsessions of a tiny number of neurotic—and possibly malicious—complainers.

My excellent Conversation with Brendan Foody

Here is the audio, video, and transcript.  Here is the episode summary:

At 22, Brendan Foody is both the youngest Conversations with Tyler guest ever and the youngest unicorn founder on record. His company Mercor hires the experts who train frontier AI models—from poets grading verse to economists building evaluation frameworks—and has become one of the fastest-growing startups in history.

Tyler and Brendan discuss why Mercor pays poets $150 an hour, why AI labs need rubrics more than raw text, whether we should enshrine the aesthetic standards of past eras rather than current ones, how quickly models are improving at economically valuable tasks, how long until AI can stump Cass Sunstein, the coming shift toward knowledge workers building RL environments instead of doing repetitive analysis, how to interview without falling for vibes, why nepotism might make a comeback as AI optimizes everyone’s cover letters, scaling the Thiel Fellowship 100,000X, what his 8th-grade donut empire taught him about driving out competition, the link between dyslexia and entrepreneurship, dining out and dating in San Francisco, Mercor’s next steps, and more.

And an excerpt:

COWEN: Now, I saw an ad online not too long ago from Mercor, and it said $150 an hour for a poet. Why would you pay a poet $150 an hour?

FOODY: That’s a phenomenal place to start. For background on what the company does — we hire all of the experts that teach the leading AI models. When one of the AI labs wants to teach their models how to be better at poetry, we’ll find some of the best poets in the world that can help to measure success via creating evals and examples of how the model should behave.

One of the reasons that we’re able to pay so well to attract the best talent is that when we have these phenomenal poets that teach the models how to do things once, they’re then able to apply those skills and that knowledge across billions of users, hence allowing us to pay $150 an hour for some of the best poets in the world.

COWEN: The poets grade the poetry of the models or they grade the writing? What is it they’re grading?

FOODY: It could be some combination depending on the project. An example might be similar to how a professor in English class would create a rubric to grade an essay or a poem that they might have for the students. We could have a poet that creates a rubric to grade how well is the model creating whatever poetry you would like, and a response that would be desirable to a given user.

COWEN: How do you know when you have a good poet, or a great poet?

FOODY: That’s so much of the challenge of it, especially with these very subjective domains in the liberal arts. So much of it is this question of taste, where you want some degree of consensus of different exceptional people believing that they’re each doing a good job, but you probably don’t want too much consensus because you also want to get all of these edge case scenarios of what are the models doing that might deviate a little bit from what the norm is.

COWEN: So, you want your poet graders to disagree with each other some amount.

FOODY: Some amount, exactly, but still a response that is conducive with what most users would want to see in their model responses.

COWEN: Are you ever tempted to ask the AI models, “How good are the poet graders?”

[laughter]

FOODY: We often are. We do a lot of this. It’s where we’ll have the humans create a rubric or some eval to measure success, and then have the models say their perspective. You actually can get a little bit of signal from that, especially if you have an expert — we have tens of thousands of people that are working on our platform at any given time. Oftentimes, there’ll be someone that is tired or not putting a lot of effort into their work, and the models are able to help us with catching that.

And:

COWEN: Let’s say it’s poetry. Let’s say you can get it for free, grab what you want from the known universe. What’s the data that’s going to make the models, working through your company, better at poetry?

FOODY: I think that it’s people that have phenomenal taste of what would users of the end products, users of these frontier models want to see. Someone that understands that when a prompt is given to the model, what is the type of response that people are going to be amazed with? How we define the characteristics of those responses is imperative.

Probably more than just poets that have spent a lot of time in school, we would want people that know how to write work that gets a lot of traction from readers, that gains broad popularity and interest, drives the impact, so to speak, in whatever dimension that we define it within poetry.

COWEN: But what’s the data you want concretely? Is it a tape of them sitting around a table, students come, bring their poems, the person says, “I like this one, here’s why, here’s why not.” Is it that tape or is it written reports? What’s the thing that would come in the mail when you get your wish?

FOODY: The best analog is a rubric. If you have some —

COWEN: A rubric for how to grade?

FOODY: A rubric for how to grade. If the poem evokes this idea that is inevitably going to come up in this prompt or is a characteristic of a really good response, we’ll reward the model a certain amount. If it says this thing, we’ll penalize the model. If it styles the response in this way, we’ll reward it. Those are the types of things, in many ways, very similar to the way that a professor might create a rubric to grade an essay or a poem.

Poetry is definitely a more difficult one because I feel like it’s very unbounded. With a lot of essays that you might grade from your students, it’s a relatively well-scoped prompt where you can probably create a rubric that’s easy to apply to all of them, versus I can only imagine in poetry classes how difficult it is to both create an accurate rubric as well as apply it. The people that are able to do that the best are certainly extremely valuable and exciting.

COWEN: To get all nerdy here, Immanuel Kant in his third critique, Critique of Judgment, said, in essence, taste is that which cannot be captured in a rubric. If the data you want is a rubric and taste is really important, maybe Kant was wrong, but how do I square that whole picture? Is it, by invoking taste, you’re being circular and wishing for a free lunch that comes from outside the model, in a sense?

FOODY: There are other kinds of data they could do if it can’t be captured in a rubric. Another kind is RLHF, where you could have the model generate two responses similar to what you might see in ChatGPT, and then have these people with a lot of taste choose which response they prefer, and do that many times until the model is able to understand their preferences. That could be one way of going about it as well.

Interesting throughout, and definitely recommended.  Note the conversation was recorded in October (we have had a long queue), so a few parts of it sound slightly out of date.  And here is Hollis Robbins on LLMs and poetry.
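The rubric idea from the excerpt (reward a response for some features, penalize it for others) can be sketched in a few lines. This is a toy illustration under stated assumptions, not Mercor's system: the `Criterion` class, the `score` function, and the example criteria are all hypothetical, and real rubrics would be applied by expert graders or model graders rather than by simple string checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """One rubric line: a check on the response and the points it earns."""
    description: str
    check: Callable[[str], bool]
    points: float  # positive rewards the response, negative penalizes it


def score(response: str, rubric: list[Criterion]) -> float:
    """Sum the points of every criterion whose check the response satisfies."""
    return sum(c.points for c in rubric if c.check(response))


# Hypothetical rubric for a prompt asking for a short poem about autumn.
rubric = [
    Criterion("mentions falling leaves",
              lambda r: "leaves" in r.lower(), 2.0),
    Criterion("leans on the 'roses are red' cliche",
              lambda r: "roses are red" in r.lower(), -3.0),
    Criterion("is laid out in at least four lines",
              lambda r: len(r.strip().splitlines()) >= 4, 1.0),
]

poem = "Leaves drift down\nin amber light\nthe year exhales\nand folds up tight"
total = score(poem, rubric)

if __name__ == "__main__":
    print(total)
```

Keeping each criterion as data rather than hard-coding it makes it easy for several graders to disagree on weights while sharing the same scoring function.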

Stories Beyond Demographics

The representation theory of stories, where the protagonist must mirror my gender, race, or sexuality for me to find myself in the story, offers a cramped view of what fiction can do and a shallow account of how it actually works. Stories succeed not through mirroring but by revealing human patterns that cut across identity. Archetypes like Hero, Caregiver, Explorer, and Artist, and structures like Tragedy, Romance, and Quest are available to everyone. That is why a Japanese salaryman can love Star Wars despite never having been to space or met a Wookiee and why an American teenager can recognize herself in a nineteenth-century Russian novel.

Tom Bogle makes this point well in a post on Facebook:

I have no issue with people wanting representation of historically marginalized people in stories. I understand that people want to “see themselves” in the story.

But it is more important to see the stories in ourselves than to see ourselves in the stories.

When we focus on the representation model, we recreate a character to be an outward representation of physical traits. Then the internal character traits of that individual become associated with the outward physical appearance of the character and we pigeonhole ourselves into thinking that we are supposed to relate only to the character that looks like us. Movies and TV shows have adopted the Homer Simpson model of the aloof, detached, and even imbecilic father, and I, as a middle-aged cis het white guy with seven kids, could easily fall into the trap of thinking that is the only character to whom I can relate. It also forces us to change the stories and their underlying imagery in order to fit our own narrative preferences, which sort of undermines the purpose for retelling an old story in the first place.

The archetypal model, however, shifts our way of thinking. Instead of needing to adapt the story of Little Red-Cap (Red Riding Hood) to my own social and cultural norms so that I can see myself in the story, I am tasked with seeing the story play out in myself. How am I Riding Hood? How am I the Wolf? How does the grandmother figure appear in me from time to time? Who has been the Woodsman in my life? How have I been the Woodsman to myself or others? Even the themes of the story must be applied to my patterns of behavior or belief systems, not simply the characters. This model also enables us to retain the integrity of the versions of these stories that have withstood the test of time.

So if your goal is actually to effect real social change through stories, I would encourage you to consider how the archetypal approach may actually be more effective at accomplishing your aims than the representational approach alone (as they are not necessarily in conflict with one another).

Luis Garicano career advice

Take the messy job:

The other option is to go for a messy job, where the output is the product of many different tasks, many of which affect each other.

The head of engineering at a manufacturing plant I know well must decide who to hire, which machines to buy, how to lay them down in the plant, negotiate with the workers and the higher ups the solutions proposed, and mobilise the resources to implement them. That task is extraordinarily hard to automate. Artificial intelligence commoditizes codified knowledge: textbooks, proofs, syntax. But it does not interface in a meaningful way with local knowledge, where a much larger share of the value of messy jobs is created. Even if artificial intelligence excelled at most of the single tasks that make up her job, it could not walk the factory floor to cajole a manager to redesign a production process.

A management consultant whose job consists entirely of producing slide decks is exposed. A consultant who spends half of her time reading the room, building client relationships, and navigating organizational politics has a bundle AI cannot replicate.

Here is the full letter.

What should I ask Henry Oliver?

Yes, I will be doing a Conversation with him.  We will focus on our mutual readings of Shakespeare’s Measure for Measure, with Henry taking the lead.  But I also will ask him about the value of literature, Jane Austen, Adam Smith, Bleak House, his book on late bloomers, and more.

Here is Henry’s (free) Substack.  Here is Henry on Twitter.

So what should I ask him?

A model of girl happiness, a compensatory-use study

A statistical model was used to examine these relationships simultaneously by predicting the likelihood that a girl reports being very happy.

The model includes socioeconomic status, parent–child communication, screen-time limits, and an interaction between limits and communication.

The results reinforce the patterns in the figures. Parent–child communication dominates the model. Girls who report strong communication are about three to four times more likely to report being very happy than those who report none. Socioeconomic status shows a smaller independent association. Screen-time limits contribute little on their own and matter modestly only when strong communication is already present.

If phones were the central problem, limits would emerge as a robust solution across contexts. They do not…

What the compensatory-use model rejects is a stronger claim. It rejects the idea that smartphone exposure itself is the primary driver of youth distress and that prohibition is therefore the central remedy. If that causal story were correct, limits would show large and consistent benefits across households, including among those with the weakest communication and highest distress. They do not.

And to close:

The most reliable way to improve youth well-being is to meet individual needs through connection instead of control.

That work depends on cooperation, not compliance.

Here is the full essay by Owen Kellogg; of course, this is only a single study.
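The excerpt above does not state the model's functional form. Assuming a logistic regression (an assumption of this sketch, as are all the symbol names), a model with the four listed terms could be written as:

```latex
\[
\log\frac{\Pr(\text{very happy}_i)}{1-\Pr(\text{very happy}_i)}
  = \beta_0
  + \beta_1\,\mathrm{SES}_i
  + \beta_2\,\mathrm{Comm}_i
  + \beta_3\,\mathrm{Limits}_i
  + \beta_4\,(\mathrm{Limits}_i \times \mathrm{Comm}_i)
\]
```

On this reading, the reported pattern would correspond to a large $\beta_2$ (communication), a smaller $\beta_1$ (socioeconomic status), a $\beta_3$ near zero (limits alone), and a modest positive $\beta_4$ (limits matter only alongside strong communication).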

Will AI Improve Undergraduate Economics Education?

From the excellent Matt Kahn:

For decades, undergraduate economics educators have followed a well-worn playbook featuring textbooks, lectures and problem sets. Students have passively listened, taken notes and studied for exams. AI disrupts this educational process. Some students are using this tool as a substitute for their own precious time. What is our best response? This paper provides a prospective analysis of how to restructure every phase of the undergraduate economics experience to improve the major and better prepare students for their uncertain future. Departments face a principal/agent issue in implementing major curricular reforms. I discuss the incentive problems that arise both within economics departments and across departments. If we win this competition to reimagine the undergraduate experience, will the Deans reward us?

TC again: No.

Emergent Ventures India, 15th cohort

Adnan Abbasi, 25, founder of Thothica, received his grant to add an archive reader to make rare historical texts accessible using AI-powered translation. Also check out his AI-generated debate between Nehru and Hayek.

Dheemanth Reddy, co-founder of Maya Research, received his grant to build Veena – cutting-edge speech models for English and Indian languages as naturally spoken by Indians.

Ritisha Sethi, 16, a high schooler from Lucknow, received her grant to develop Qubit Quest, her solution to help learn quantum computing through gamification.

Jnanendra K S received his grant to convert vintage cars to EVs in his automotive mechanic shop.

Sankalp Shrivastava, 21, self-taught developer and entrepreneur from Bhopal, received his grant for general career development.

Bharath H G received his grant to build a robotic system that safely cleans manholes remotely.

Sarthak Pandit, an engineering student, received his grant for building a wireless drone recharging system to eliminate manual battery swaps.

Namrata Rajagopal received her grant for Exception Raised – a grants program to enable India’s AI research ecosystem through funding, community, and mentorship. Check out their first cohort.

CEDA (Center for Economic Data and Analysis) at Ashoka University received a grant to build the Economic Enterprises Tool, which integrates datasets to deliver harmonized indicators across India’s enterprises.

Saransh Duharia received his grant for Garudakshak, to build a smart drone detection and neutralization system for civil use.

Aditya Gupta, 21, received his grant to develop a breath diagnostics tool screening for complex gut disorders non-invasively.

Farraz Mir received his grant for a bioinformatics automation project saving researchers time and lowering barriers to entry.

Yasmin Qureshi, 20, received her grant for travel and career development.

Jainul Abedin received his grant to scale Abyom SpaceTech, and develop India’s first reusable rocket and commercial rocket engine testing facility.

Kunjpreet Arora, 27, received his grant for Angirus, to transform plastic and industrial waste into waterproof, low-carbon bricks.

Vrinda Borkar, 30, received her grant for Wingrow Agritech, to develop agricultural markets for small farmers.

Those unfamiliar with Emergent Ventures can learn more here and here. The EV India announcement is here. More about the winners of EV India second, third, fourth, fifth, sixth, seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth, and fourteenth cohorts. To apply for EV India, use the EV application, click the “Apply Now” button and select India from the “My Project Will Affect” drop-down menu.

And here is Nabeel’s AI engine for other EV winners. Here are the other EV cohorts.

If you are interested in supporting the India tranche of Emergent Ventures, please write to me or to Shruti at [email protected].