Naming AI models correctly

Are you confused by all the model names and terminology running around?  Here is a simple guide to what I call them:

o1 pro — The Boss

4o — Little Boss

o3 mini — The Mini Boss

GPT 4o with scheduled tasks — Boss come back later

o1 — Cheapskates’ boss

Deep Research — My research assistant

GPT-4 — Former Boss

DeepSeek — China Boss

Claude — Claude

Llama 3.3, or whichever — Meta.  I never quite got used to calling Facebook “Meta,” so I call the AI model Meta too.  Hope that’s OK!

Grok 3 — Elon

Gemini 2.0 Flash — China Boss suggests “Laggy Larry,” but I don’t actually call it that.

Perplexity.ai — Google

Got that?  Easy as pie!

Kevin Kelly’s fifty travel tips

Here is one of them, in part:

Here in brief is the method I’ve honed to optimize a two-week vacation: When you arrive in a new country, immediately proceed to the farthest, most remote, most distant place you intend to reach during the trip. If there is a small village, remote spa, a friend’s farm, or a wild place you plan on seeing on the trip, go there immediately. Do not stop near the airport. Do not rest overnight in the arrival city. Do not pause to acclimate. If at all possible proceed by plane, bus, jeep, car directly to the furthest point without interruption. Make it an overnight journey if you have to. Then once you reach your furthest point, unpack, explore, and work your way slowly back to the big city, wherever your international departure airport is.

In other words you make a laser-straight rush for the end, and then meander back. Laser out, meander back. This method is somewhat contrary to many people’s first instincts, which are to immediately get acclimated to the culture in the landing city before proceeding to the hinterlands. The thinking is: get a sense of what’s going on, stock up, size up the joint. Then slowly work up to the more challenging, more remote areas. That’s reasonable, but not optimal because most big cities around the world are more similar than different. All big cities these days feel same-same on first arrival. In Laser-Back travel what happens is that you are immediately thrown into Very Different Otherness, the maximum difference that you will get on this trip.

Here are the rest; I mostly agree.

Reforming the NIH

It seems the Trump proposal to simply cut overhead to fifteen percent will not stand up in the courts, at least not without Congressional approval?  Nonetheless a few of you have asked me what I think of the idea.

My preferred reforms for the NIH include the following:

1. Cap pre-specified overhead at 25 percent, down from a range running up to 60 percent.

2. Encourage more coverage of overhead in the proposals themselves, where the researchers are accountable for how the overhead funds are spent.  Severely limit how much the “overhead” cross-subsidizes other university functions, as is currently the case.

3. Fund a greater number of proposals, with the money coming from overhead reductions, as outlined in #1 and #2.

4. Set up a new, fully independent biomedical research arm of the federal government, based on DARPA-like principles.  In fact this was seriously proposed a few years ago, with widespread (but insufficient) support.
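To make the arithmetic behind points 1 and 3 concrete, here is a rough sketch.  It assumes overhead is a flat percentage added on top of direct costs, that every dollar saved goes to new grants, and that all grants are the same size.  The budget and grant size are hypothetical round numbers, not NIH figures, and actual indirect cost rates apply to a modified direct cost base, so treat this as an illustration only.

```python
def grants_funded(budget: int, direct_cost: int, overhead_pct: int) -> int:
    """Number of equal-sized grants a fixed budget can cover.

    Assumption: overhead is a flat percentage added on top of direct costs.
    All dollar figures used below are hypothetical.
    """
    total_cost_per_grant = direct_cost * (100 + overhead_pct) // 100
    return budget // total_cost_per_grant


BUDGET = 1_000_000_000   # hypothetical total budget, in dollars
DIRECT_COST = 500_000    # hypothetical direct cost per grant, in dollars

at_60 = grants_funded(BUDGET, DIRECT_COST, 60)  # top of the current range
at_25 = grants_funded(BUDGET, DIRECT_COST, 25)  # the proposed cap

print(at_60, at_25)                       # grants funded under each rate
print(f"{at_25 / at_60 - 1:.0%} more grants at the 25 percent cap")
```

Under these assumptions, moving from 60 to 25 percent overhead funds 28 percent more grants from the same budget (the ratio 1.60 / 1.25), and only for institutions currently at the top of the range.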

I would note a few additional points, which have been covered in earlier MR posts over the years:

5. The NIH could not get its act together during Covid to make grants quickly enough in a time of crisis.  It performed much worse than did, say, the NSF.

6. A while back the NIH set up a program to make riskier grants.  The program did not in fact make riskier grants.

7. The NIH killed the idea of an independent DARPA-like biomedical research agency, fearing it would limit the size and influence of the NIH itself.

8. The submission forms, their length, and the associated processes are absurd.  Whether or not the costs there are high in an absolute sense, it is a sign the current NIH is far too obsessed with process, as happens to just about every mature bureaucracy.

At this point it is obvious that the NIH cannot reform itself.  It is also obvious that a slower, technocratic approach just gives the interest groups — in this case it is “the states” most of all — time to mobilize to protect the current NIH.  There are universities in many Congressional districts and a fair amount of money at stake.

I do not per se favor a move to fifteen percent overhead, as I do understand the associated costs on scientific research.  Nonetheless I take very seriously the possibility that a radical “thoughtless” cut now stands some chance of getting us to where we ought to be in the longer run, especially since subsequent administrations will get further cracks at this problem.  They can up overhead to 25 percent, and set up the new DARPA-H.  I just don’t see why that is impossible, and it may not even be unlikely.  So what exactly is your discount rate and risk aversion here?

I feel the defenses of the NIH I am reading do not take the entire broader analysis seriously enough.  They do not take sufficiently seriously that the writers themselves have failed to adequately reform the NIH.  And over time, without serious reform, the bureaucratic stultification will only get worse.

What should I ask David Robertson?

Yes, David Robertson the conductor.  He studied with Boulez and Messiaen, and arguably is the second-best Boulez conductor ever.  He also is famous for his recordings of John Adams.  I find him consistently excellent, for instance in his Unsuk Chin, Milhaud, or Porgy and Bess.  Here is his Wikipedia page.  Here is his TEDx talk on conducting.  Here is his home page.  He is very smart.

So what should I ask him?

Which economic tasks are performed with AI?

I have now read through the very impressive paper on AI tasks that has come out of Anthropic, with Kunal Handa as the lead author, and Dario, Jack, and quite a few others among the co-authors.  Here is the paper and part of the abstract:

We leverage a recent privacy-preserving system [Tamkin et al., 2024] to analyze over four million Claude.ai conversations through the lens of tasks and occupations in the U.S. Department of Labor’s O*NET Database. Our analysis reveals that AI usage primarily concentrates in software development and writing tasks, which together account for nearly half of all total usage. However, usage of AI extends more broadly across the economy, with ∼ 36% of occupations using AI for at least a quarter of their associated tasks. We also analyze how AI is being used for tasks, finding 57% of usage suggests augmentation of human capabilities (e.g., learning or iterating on an output) while 43% suggests automation (e.g., fulfilling a request with minimal human involvement).

There is also a new paper on related topics by Jonathan Hartley, Filip Jolevski [former doctoral student of mine], Vitor Melo, and Brendan Moore:

We find, consistent with other surveys that Generative AI tools like large language models (LLMs) are most commonly used in the labor force by younger individuals, more highly educated individuals, higher income individuals, and those in particular industries such as customer service, marketing and information technology. Overall, we find that as of December 2024, 30.1% of survey respondents above 18 have used Generative AI at work since Generative AI tools became public.

Both recommended, the latter supported in part by Emergent Ventures as well.

Emergent Ventures winners, 40th cohort

Akhil Kumar, 19, Toronto, global health issues and general career development.

Janet Shin, Berkeley, neurotech and brain imaging.

Diana Leung, San Francisco, AI and bio and machine learning.

Kyle MacLeod, Oxford University, economics videos on YouTube.

Aarav Sharma, Singapore, high school, to work on exoskeletons and AI.

Megan Gafford, NYC, writings on aesthetics, Substack.

Alice Gribbin, Berkeley, to write a book on Correggio and beauty.

Kaivalya Hariharan, MIT, to work on man-machine collaboration and AI, with previous EV winner Uzay Girit.

Eve Ang, Singapore, high school, biosciences and building exoskeletons.

Alex Chalmers, London area, writing on tech, progress, and policy.

Elanu Karakus, Stanford, Turkey, a smart flower to help bees find flowers.

Ishan Sharma, Washington DC, policy work on geologic hydrogen.

Parker Whitfill, economics PhD student at MIT, evaluations of differing AI systems.

Sympatheticopposition.com, @sympatheticopp, San Francisco, writing and Substack.

Yes there are further EV winners and an additional cohort coming soon!  Apologies for any delays.

Again, here is the AI engine, built by Nabeel Qureshi, for searching through the longer list.  Here are previous cohorts of EV winners.

European Union fact of the day

The IMF estimates that Europe’s internal barriers are equivalent to a tariff of 45 per cent for manufacturing and 110 per cent for services. These effectively shrink the market in which European companies operate: trade across EU countries is less than half the level of trade across US states. And as activity shifts more towards services, their overall drag on growth becomes worse…

Europe has been effectively raising tariffs within its borders and increasing regulation on a sector that makes up around 70 per cent of EU GDP.

This failure to lower internal barriers has also contributed to Europe’s unusually high trade openness. Since 1999, trade as a share of GDP has risen from 31 per cent to 55 per cent in the eurozone, whereas in China it rose from 34 per cent to 37 per cent and in the US from 23 per cent to just 25 per cent. This openness was an asset in a globalising world. But now it has become a vulnerability.

That is from Mario Draghi in the FT.  Canada too!  Even Trudeau is now saying there should be free trade across the provinces.  I do appreciate the willingness of those political units to try to counter the Trump tariff proposals.  But it would go better if they were themselves practitioners of free trade internally, never mind externally.

How should government disclosure be done?

That is my recent Bloomberg column on this all-important topic.  Here is one part:

The risk is that Trump would hoard the most sensitive information and disclose selectively, to manipulate the news cycle or to distract attention from other events. It also could give him more political weapons to use against what he calls the “deep state.” The president himself is hardly a model of transparency, whether the questions concern his tax returns, his medical exams or the possession of classified documents after leaving office.

But again, the issue is governmental disclosure, and so far, Trump’s record is 0 for 1. Before assuming office, he suggested that the US military knew more than it was letting on about the drones that had been sighted above New Jersey and other Northeastern states. Then, after Trump took office, his press secretary said only that they were “authorized” by the government “for research and various other reasons.” There has been no subsequent attempt to clarify matters. Personally, I am more confused than I was a month ago.

Perhaps there are good national security reasons for this silence. The point is that it is foolish to expect full and open disclosure from the president, no matter what his executive order says or what he has earlier promised.

One way to improve the process would be to appoint some independent auditors on a bipartisan basis, perhaps selected from Congress. Ex post, those auditors could judge whether disclosure, with transparent explanations, had actually occurred. They could grade the degree of disclosure, but they would not have the power to prevent it. Otherwise, there is a risk that — to choose an example not quite at random — evidence favoring the “two gunmen” hypothesis for JFK’s assassination is released, but conflicting evidence for the “lone gunman” hypothesis is suppressed. The auditors would issue a report saying whether disclosure was unbalanced or unfair.

And this:

Another problem with the task force is that it is authorized for only six months. Bureaucracies are by nature slow-moving, and can be even more so when they wish to be. A six-month deadline creates incentives to wait things out. Trump could threaten to extend the mandate, and perhaps he will. But then the disclosure campaign would turn out to be just a bargaining chip, rather than a genuine attempt to bring the truth to light.

Definitely recommended.

AI Personality Extraction from Faces: Labor Market Implications

Human capital—encompassing cognitive skills and personality traits—is critical for labor market success, yet the personality component remains difficult to measure at scale. Leveraging advances in artificial intelligence and comprehensive LinkedIn microdata, we extract the Big 5 personality traits from facial images of 96,000 MBA graduates, and demonstrate that this novel “Photo Big 5” predicts school rank, compensation, job seniority, industry choice, job transitions, and career advancement. Using administrative records from top-tier MBA programs, we find that the Photo Big 5 exhibits only modest correlations with cognitive measures like GPA and standardized test scores, yet offers comparable incremental predictive power for labor outcomes. Unlike traditional survey-based personality measures, the Photo Big 5 is readily accessible and potentially less susceptible to manipulation, making it suitable for wide adoption in academic research and hiring processes. However, its use in labor market screening raises ethical concerns regarding statistical discrimination and individual autonomy.

That is from a new paper by Marius Guenzel, Shimon Kogan, Marina Niessner, and Kelly Shue.  I read through the paper and was impressed.  Of course since this is machine learning, I can’t tell you what the five traits are in any simple descriptive sense.  But this is somewhat of a comeback for physiognomy, which even DeepSeek tells me is a pseudoscience.  Via tekl, a fine-looking fellow if there ever was one.

Ukrainian bond data correction

In the previous post, I cited data showing that Ukrainian bond prices had fallen over the last few months.  But it seems that data source was faulty, and the value of Ukrainian bonds has been rising, including recently.  You can find some sources here.  Apologies for the error!

I thank JoshB. for drawing this to my attention.  He writes in the comments on that previous post: “Ukrainian local debt is not widely traded but according to bloomberg data it has done nothing but rally over the last four months. The UKRGB 19.7 8/25 that the linked website claims to be quoting is trading 99.13 dollar price, up from 90 in the fall. The USD external bonds are much more widely traded and they have definitely rallied over the past few months – it has been a popular hedge fund trade. The UKRAIN 1.75 2/29 for example are now $73 up from $61 pre election. The thesis has been that the Ukraine external debt stock is small relative to reconstruction needs and the country will desire market access so it makes sense to favorably restructure the external bond holders. I’m skeptical personally, but the premise of the original post that Ukraine debt is going down in price is wrong.”
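For a sense of scale, here is a quick sketch that turns the prices JoshB. quotes into percentage moves.  It uses only the four quoted prices and ignores coupons; the starting dates ("the fall," "pre election") are imprecise, so these are rough price changes rather than returns.

```python
def pct_change(old: float, new: float) -> float:
    """Simple percentage price change from old to new (coupons ignored)."""
    return (new - old) / old * 100


# Prices as quoted in the comment above; bond labels copied from the comment.
quoted_moves = {
    "UKRGB 19.7 8/25 (local)": (90.0, 99.13),
    "UKRAIN 1.75 2/29 (USD external)": (61.0, 73.0),
}

for bond, (old, new) in quoted_moves.items():
    print(f"{bond}: {pct_change(old, new):.1f}% price change")
```

On the quoted numbers, that is roughly a 10 percent rise for the local bond and nearly 20 percent for the external bond.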

How the System Works

Charles Mann is worried that so few of us have any notion of the giant, interconnected systems that keep us alive and thriving. His new series, How the System Works at The New Atlantis, is a primer to civilization. As you might expect from Mann, it’s beautifully written with arresting facts and images:

The great European cathedrals were built over generations by thousands of people and sustained entire communities. Similarly, the electric grid, the public-water supply, the food-distribution network, and the public-health system took the collective labor of thousands of people over many decades. They are the cathedrals of our secular era. They are high among the great accomplishments of our civilization. But they don’t inspire bestselling novels or blockbuster films. No poets celebrate the sewage treatment plants that prevent them from dying of dysentery. Like almost everyone else, they rarely note the existence of the systems around them, let alone understand how they work.

…Water, food, energy, public health — these embody a gloriously egalitarian and democratic vision of our society. Americans may fight over red and blue, but everyone benefits in the same way from the electric grid. Water troubles and food contamination are afflictions for rich and poor alike. These systems are powerful reminders of our common purpose as a society — a source of inspiration when one seems badly needed.

Every American stands at the end of a continuing, decades-long effort to build and maintain the systems that support our lives. Schools should be, but are not, teaching students why it is imperative to join this effort. Imagine a course devoted to how our country functions at its most basic level. I am a journalist who has been lucky enough to have learned something about the extraordinary mechanisms we have built since Jefferson’s day. In this series of four articles, I want to share some of the highlights of that imaginary course, which I have taken to calling “How the System Works.”

We begin with our species’ greatest need and biggest system — food.

and here’s one telling fact from the first essay:

Today more than 1 percent of the world’s industrial energy is devoted to making ammonia fertilizer. “That 1 percent,” the futurist Ramez Naam says, “roughly doubles the amount of food the world can grow.”

Addendum: Tom Meadowcroft from the comments: I teach chemical engineers, who are expert at understanding, designing and managing processes, and will be running many of these civilizational processes after they graduate. Even amongst that group of very bright thinkers, there is remarkably little knowledge as to how we achieve clean water, reliable electricity, fuel for transport and industry, dispose of sewage, and grow and distribute food. These same young adults can all tell you about colonial mindsets, how the world is going to burn, and how various groups are victimized. Our K-12 education system has very warped priorities and remarkably ignorant people at the front of the classroom.

Passive listeners on Spotify

I have been reading the new Liz Pelly book Mood Machine: The Rise of Spotify and the Costs of the Perfect Playlist.  It is a very intelligent and well-done book, though it is more pessimistic than I am about the future of music.

One central lesson of the book is just how many “passive” music listeners there are.  In an earlier era they might have been content with muzak, even on the car radio (my father used to do that).  But with Spotify, and many other related internet music services, the passive listeners can be very readily identified.  They do not mind being fed AI-produced slop, or payola-driven songs in their feeds.  For instance, some song producers, often serving up musical slop, will accept lower royalty rates in return for algorithmic promotion.  The passive listeners accept this arrangement without complaint — maybe they just want background mood, or maybe they are not listening at all, and do not want the music to be too intrusive.

Obviously Spotify, or whichever service one has in mind, can track your behavior in this regard.  Passive listeners can expect a stream of very low quality in the future, meaning quality as I would define it, not as they would.

Is it bad if so many listeners are passive?  Well, it is not my idea of the ideal philosophic republic.

Still, given that they exist, I like the idea of setting them aside, segregated into their own easily manipulated club.  After all, they don’t seem to care about Chuck Berry and Brian Eno.  Insofar as we succeed in segregating them, I would think many of the remaining algorithms become better and more in tune with what their users want.  After all, the noise from the passive listeners has been removed from the calculations.

So I think of algorithms as a way of rewarding the good guys, and avoiding some of the pooling equilibria.  What you call musical “slop,” I call the separating equilibrium.