Category: Web/Tech
Which economic tasks are performed with AI?
I have now read through the very impressive paper on AI tasks to have come out of Anthropic, with Kunal Handa as the lead author, joined by Dario, Jack, and quite a few others as well. Here is the paper and part of the abstract:
We leverage a recent privacy-preserving system [Tamkin et al., 2024] to analyze over four million Claude.ai conversations through the lens of tasks and occupations in the U.S. Department of Labor’s O*NET Database. Our analysis reveals that AI usage primarily concentrates in software development and writing tasks, which together account for nearly half of all total usage. However, usage of AI extends more broadly across the economy, with ∼ 36% of occupations using AI for at least a quarter of their associated tasks. We also analyze how AI is being used for tasks, finding 57% of usage suggests augmentation of human capabilities (e.g., learning or iterating on an output) while 43% suggests automation (e.g., fulfilling a request with minimal human involvement).
There is also a new paper on related topics by Jonathan Hartley, Filip Jolevski [former doctoral student of mine], Vitor Melo, and Brendan Moore:
We find, consistent with other surveys, that Generative AI tools like large language models (LLMs) are most commonly used in the labor force by younger individuals, more highly educated individuals, higher income individuals, and those in particular industries such as customer service, marketing and information technology. Overall, we find that as of December 2024, 30.1% of survey respondents above 18 have used Generative AI at work since Generative AI tools became public.
Both recommended, the latter supported in part by Emergent Ventures as well.
How to teach people how to work with AI
I have a very concrete and specific proposal for teaching people how to work with AI. It is also a proposal for how to reform higher education.
Give them some topics to investigate, and have them run a variety of questions, exercises, programming, paper-writing tasks — whatever — through the second- or third-best model, or some combination of slightly lesser models.
Have the students grade and correct the outputs of those models. The key is to figure out where the AIs are going wrong.
Then have the best model grade the grading of the students. The professor occasionally may be asked to contribute here as well, depending on how good the models are in absolute terms.
In essence, the students are learning how to grade and correct AI models.
Just keep on doing this.
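To make the mechanics concrete, here is a minimal sketch of how such an exercise loop might be wired up, assuming access to two chat models through the OpenAI Python SDK; the model names, prompts, and grading rubric are placeholders of my own, and any comparable API would do just as well.

```python
# A minimal sketch of the "students grade the AI, a stronger AI grades the grading" loop.
# Model names and prompts are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()

LESSER_MODEL = "gpt-4o-mini"  # the second- or third-best model the students correct
BEST_MODEL = "gpt-4o"         # the stronger model that grades the students' grading


def ask(model: str, prompt: str) -> str:
    """Send a single-turn prompt to a chat model and return its text reply."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def draft_answer(task: str) -> str:
    """Step 1: the lesser model attempts the task; the student then grades this draft."""
    return ask(LESSER_MODEL, task)


def grade_the_grading(task: str, draft: str, student_critique: str) -> str:
    """Step 3: the stronger model evaluates the student's critique of the draft."""
    meta_prompt = (
        f"Task: {task}\n\n"
        f"Model answer:\n{draft}\n\n"
        f"Student critique of that answer:\n{student_critique}\n\n"
        "Grade the critique: which genuine errors did the student catch, "
        "which did they miss, and what did they wrongly flag as errors?"
    )
    return ask(BEST_MODEL, meta_prompt)
```

The professor's occasional contribution would come at the end, spot-checking the stronger model's verdicts.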
Of course the students need to learn subject matter as well, but perhaps this process should be what…one-third of all higher education?
You might think the models are too good for human grading and correction to matter at all, whether now or in the future. That may be true at some point. That is the same scenario, however, where it does not matter what we teach the students. Might as well teach them for the world-states in which they are of some use at all.
This is all apart from the PhD you might get in gardening, but even there I think you will need to spend some time learning how to correct and judge AI models.
The three levels of AI understanding
If you are trying to contextualize someone’s opinion on current AI, I suggest asking three basic questions about their perspective. In particular, you should know which of the following three realities they are aware of. Here goes:
1. How good are the best models today?
Most people do not know this, even if you are speaking with someone at a top university.
2. How rapidly are the best current models able to self-improve?
In my view, their output is good enough that reinforcement learning can work, synthetic data can work, test-time scaling can be scaled up, they can grade themselves, and they are on a steady glide toward ongoing self-improvement. You can debate the rate, but “compound interest” gets you there sooner or later.
It is fine if someone disagrees with that, but at the very least you want them to have considered this possibility. Most observers really have not.
3. How will the best current models be knit together in stacked, decentralized networks of self-improvement, broadly akin to “the republic of science” for human beings?
This one is far more speculative, as it is not possible to observe much directly in the public sphere. And most of what will happen does not exist yet, not even as plans on the drawing board. Still, you want a person to have given this question some thought. I believe, for one, that this will move the AIs into the realm of being truly innovative. Stack them, have some of them generate a few billion new ideas a week, have the others grade those ideas…etc.
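Purely as an illustration of the kind of stacking I have in mind, here is a toy sketch; the generator and grader calls are abstract placeholders standing in for different AI models, not any particular API.

```python
# Toy sketch of a stacked generate-and-grade network of models.
# `generators` and `graders` stand in for calls to different AI models;
# they are hypothetical placeholders, not a real API.
from typing import Callable, List, Tuple


def run_round(
    generators: List[Callable[[str], str]],   # each proposes an idea for a topic
    graders: List[Callable[[str], float]],    # each scores an idea from 0 to 1
    topic: str,
    ideas_per_generator: int = 100,
    keep_top: int = 10,
) -> List[Tuple[float, str]]:
    # Layer 1: every generator model proposes many candidate ideas.
    candidates = [
        gen(topic) for gen in generators for _ in range(ideas_per_generator)
    ]
    # Layer 2: every idea is scored by several grader models; average the scores.
    scored = [
        (sum(grade(idea) for grade in graders) / len(graders), idea)
        for idea in candidates
    ]
    # Only the best-graded ideas survive to the next round, or to human review.
    return sorted(scored, reverse=True)[:keep_top]
```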
I find it is difficult to have good AI conversations with people who are not conversant with at least the first two realities on this list. The very best conversations are with people who have also spent time thinking about number three.
What should I ask Jennifer Pahlka?
Yes, I will be doing a Conversation with her. From Wikipedia:
Jennifer Pahlka (born December 27, 1969) is an American businesswoman and political advisor. She is the founder and former executive director of Code for America. She served as U.S. Deputy Chief Technology Officer from June 2013 to June 2014 and helped found the United States Digital Service. Previously she had worked at CMP Media with various roles in the computer game industry. She was the co-chair and general manager of the Web 2.0 conferences. In June 2023, she released the book Recoding America: Why Government Is Failing in the Digital Age and How We Can Do Better.
Recently she has been working on the Niskanen Center state capacity project. So what should I ask her?
Dwarkesh’s Question
One question I had for you while we were talking about the intelligence stuff was, as a scientist yourself, what do you make of the fact that these things have basically the entire corpus of human knowledge memorized and they haven’t been able to make a single new connection that has led to a discovery? Whereas if even a moderately intelligent person had this much stuff memorized, they would notice — Oh, this thing causes this symptom. This other thing also causes this symptom. There’s a medical cure right here.
Shouldn’t we be expecting that kind of stuff?
It’s a very good question. In 2023, I quipped, “I think they have, we just haven’t asked them.” Maybe, but less clear today. Dwarkesh reports that there have been no good answers.
It’s later than you think
Here is a short essay by Hollis Robbins on AI and education, excerpt:
Every faculty member should begin to write a detailed memo specifying the following: “What specific knowledge do I possess that AGI does not? What unique insights or capabilities can I offer that exceed AGI systems? Which students, and in which topics, would benefit enough to pay to learn from me and why?” Faculty who cannot produce this memo with concrete, defensible answers have no place in the institution. There is no middle ground.
Every dean must immediately audit their course catalog against one criterion: what advanced knowledge or skills does this course offer that AGI cannot replicate? Each course must demonstrate specific knowledge transfer or skill development that exceeds AGI capabilities. It will become obvious that the highest value courses are those aligned with specific faculty expertise. General education courses focused on basic knowledge transfer become indefensible. If the information is general enough to be called “general education,” AGI can deliver it more effectively than any human instructor. This will eliminate most of the current curriculum.
Universities will retain faculty in three categories: those advancing original research beyond AGI capabilities, those who teach the use of advanced equipment and sophisticated physical skills, and those handling previously undiscovered source materials or developing novel interpretations that outstrip AGI’s analysis. In the sciences, this means laboratory-based faculty who validate AGI-generated research proposals and offer advanced hands-on training with advanced equipment. In engineering and the arts, it’s faculty who guide students in high-level physical manipulation, augmented by AI tools. In the humanities, it’s scholars working with newly discovered primary sources, untranslated manuscripts, or archaeological evidence not yet processed by AI, as well as those creating fundamentally new interpretive frameworks that transcend AGI’s pattern-recognition capacities.
The curriculum narrows dramatically. Most lecture courses disappear. What remains are advanced research seminars where faculty share findings from new source materials or original experiments, intensive laboratory and studio sessions for hands-on skills, and research validation practicums where students learn to test AGI hypotheses. This represents a 60-70% reduction in current faculty positions, with remaining roles requiring fundamentally different capabilities than traditional academic work.
There is more of interest at the link.
“Can America Win the AI War with China?”
A long video chat with Geoffrey Cain, who is more hawkish than I am. Bari Weiss moderates. One argument I make is that America may prefer that China does well with AI, because the non-status quo effects of AI may disrupt their system more than ours. I also argue that, for all the AI rivalry with China (which to be sure is real), much of the future may consist of status quo powers America and China working together to put down smaller-scale AI troublemakers around the rest of the world. Interesting throughout.
Deep Research
I have had it write a number of ten-page papers for me, each of them outstanding. I think of the quality as comparable to having a good PhD-level research assistant, and sending that person away with a task for a week or two, or maybe more.
Except Deep Research does the work in five or six minutes. And it does not seem to make errors, due to the quality of the embedded o3 model.
It seems it can cover just about any topic?
I asked for a ten-page paper explaining Ricardo’s theory of rent, and how it fits into his broader theory of distribution. It is a little long, but that was my fault; here is the result. I compared it to a number of other sources online, and thought it was better, and so I am using it for my history of economic thought class.
I do not currently see signs of originality, but the level of accuracy and clarity is stunning, and it can write and analyze at any level you request. The work also shows the model can engage in a kind of long-term planning, and that will generalize to some very different contexts and problems as well — that is some of the biggest news associated with this release.
Sometimes the model stops in the middle of its calculations and you need to kick it in the shins a bit to get it going again, but I assume that problem will be cleared up soon enough.
If you pay for o1 pro, you get I think 100 queries per month with Deep Research.
Solve for the equilibrium, people, solve for the equilibrium.
Gradual Empowerment?
The subtitle is “Systemic Existential Risks from Incremental AI Development,” and the authors are Jan Kulveit, et al. Several of you have asked me for comments on this paper. Here is the abstract:
This paper examines the systemic risks posed by incremental advancements in artificial intelligence, developing the concept of “gradual disempowerment,” in contrast to the abrupt takeover scenarios commonly discussed in AI safety. We analyze how even incremental improvements in AI capabilities can undermine human influence over large-scale systems that society depends on, including the economy, culture, and nation-states. As AI increasingly replaces human labor and cognition in these domains, it can weaken both explicit human control mechanisms (like voting and consumer choice) and the implicit alignments with human interests that often arise from societal systems’ reliance on human participation to function. Furthermore, to the extent that these systems incentivise outcomes that do not line up with human preferences, AIs may optimize for those outcomes more aggressively. These effects may be mutually reinforcing across different domains: economic power shapes cultural narratives and political decisions, while cultural shifts alter economic and political behavior. We argue that this dynamic could lead to an effectively irreversible loss of human influence over crucial societal systems, precipitating an existential catastrophe through the permanent disempowerment of humanity. This suggests the need for both technical research and governance approaches that specifically address the risk of incremental erosion of human influence across interconnected societal systems.
This is one of the smarter arguments I have seen, but I am very far from convinced. When were humans ever in control to begin with? (Robin Hanson realized this a few years ago and is still worried about it, as I suppose he should be. There is not exactly a reliable competitive process for cultural evolution — boo hoo!)
Note the argument here is not that a few rich people will own all the AI. Rather, humans seem to lose power altogether. But aren’t people cloning DeepSeek for ridiculously small sums of money? Why won’t our AI future be fairly decentralized, with lots of checks and balances, and plenty of human ownership to boot?
Rather than focusing on “humans in general,” I say look at the marginal individual human being. That individual — forever as far as I can tell — has near-zero bargaining power against a coordinating, cartelized society aligned against him. With or without AI. Yet that hardly ever happens, extreme criminals being one exception. There simply isn’t enough collusion to extract much from the (non-criminal) potentially vulnerable lone individuals.
I do not in this paper see a real argument that a critical mass of the AIs are going to collude against humans. It seems already that “AIs in China” and “AIs in America” are unlikely to collude much with each other. Similarly, “the evil rich people” do not collude with each other all that much either, much less across borders.
I feel if the paper made a serious attempt to model the likelihood of worldwide AI collusion, the results would come out in the opposite direction. So, to my eye, “checks and balances forever” is by far the more likely equilibrium.
o1 pro
Often I don’t write particular posts because I feel the point is obvious to everybody. Yet it rarely is.
So here is my post on o1 pro, soon to be followed by o3 pro; Deep Research, which uses elements of o3, is also being rolled out. (So far it is amazing, btw.)
o1 pro is the smartest publicly issued knowledge entity the human race has created (aside from Deep Research!). Adam Brown, who does physics at a world class level, put it well in his recent podcast with Dwarkesh. Adam said that if he had a question about something, the best answer he would get is from calling up one of a handful of world experts on the topic. The second best answer he would get is from asking the best AI models.
Except, at least for the moment, you don’t need to make that plural. There is a single best model, at least when it comes to tough questions (it is more disputable which model is the best and most creative writer or poet).
I find it very difficult to ask o1 pro an economics question it cannot answer. I can do it, but typically I have to get very artificial. It can answer, and answer well, any question I might normally pose in the course of typical inquiry and pondering. As Adam indicated, I think only a relatively small number of humans in the world can give better answers to what I want to know.
In an economics test, or any other kind of naturally occurring knowledge test I can think of, it would beat all of you (and me).
Its rate of hallucination is far below what you are used to from other LLMs.
Yes, it does cost $200 a month. It is worth that sum to converse with the smartest entity yet devised. I use it every day, many times. I don’t mind that it takes some time to answer my questions, because I have plenty to do in the meantime.
I also would add that if you are not familiar with o1 pro, your observations about the shortcomings of AI models should be discounted rather severely. And o3 pro is due soon, presumably it will be better yet.
The reality of all this will disrupt many plans, most of them not directly in the sphere of AI proper. And thus the world wishes to remain in denial. It amazes me that this is not the front page story every day, and it amazes me how many people see no need to shell out $200 and try it for a month, or more.
How did China’s internet become so cool amongst America’s youth?
That is the topic of my latest Bloomberg column. Here is part of the argument:
TikTok was briefly shut down earlier this month, and the site faces an uncertain legal future. America’s internet youth started to look elsewhere — and where did they choose? They flocked to a Chinese video site called RedNote, also known as Xiaohongshu, the name of the parent company. RedNote has more than 300 million users in China, but until recently barely received attention in the US.
And when young Americans visited RedNote, they were undoubtedly struck by an obvious fact: It is not the kind of site their parents would frequent. The opening page is full of Chinese characters, as well as shots of provocatively dressed women, weird animal and baby photos, and many images that, at least to this American viewer, make no sense whatsoever. Yet Chinese and American youth interact frequently there, for example trading tips for making steamed eggs properly.
I don’t plan on spending much of my time there, but that’s part of the point — and helps explain its appeal to American youth.
And this:
As for the AI large-language models, DeepSeek is a marvel. Quite aside from its technical achievements and low cost, the model has real flair. Its written answers can be moody, whimsical, arbitrary and playful. Of all the major LLMs, I find it the most fun to chat with. It wrote this version of John Milton’s Paradise Lost — as a creation myth for the AIs. Or here is DeepSeek commenting on ChatGPT, which it views as too square. It is hardly surprising that this week DeepSeek was the top download on Apple’s app store.
The model also has a scrappy and unusual history, having been birthed as a side project from a Chinese hedge fund. Whether or not that counts as “cool,” it does sound like something a scriptwriter would have come up with. And at least on American topics, DeepSeek seems more candid than the major US models. That qualifier is important: Don’t ask DeepSeek about Taiwan, the Uighurs or Tiananmen Square.
The most fundamental reason China is seen as cool is that…China is cool, at least in some subset of products.
The forward march of computer use, AI edition
I must admit, though, that the thing that scared me most about HudZah was that he seemed to be living in a different technological universe than I was. If the previous generation were digital natives, HudZah was an AI native.
HudZah enjoys reading the old-fashioned way, but he now finds that he gets more out of the experience by reading alongside an AI. He puts PDFs of books into Claude or ChatGPT and then queries the books as he moves through the text. He uses Granola to listen in on meetings so that he can query an AI after the chats as well. His friend built Globe Explorer, which can instantly break down, say, the history of rockets, as if you had a professional researcher at your disposal. And, of course, HudZah has all manner of AI tools for coding and interacting with his computer via voice.
It’s not that I don’t use these things. I do. It’s more that I was watching HudZah navigate his laptop with an AI fluency that felt alarming to me. He was using his computer in a much, much different way than I’d seen someone use their computer before, and it made me feel old and alarmed by the number of new tools at our disposal and how HudZah intuitively knew how to tame them.
It also excited me. Just spending a couple of hours with HudZah left me convinced that we’re on the verge of someone, somewhere creating a new type of computer with AI built into its core. I believe that laptops and PCs will give way to a more novel device rather soon.
That is from Ashlee Vance, the entire story is very interesting.
Chris Barber asks me to give AI-related advice
…for people in various stages and situations of life. You will find the discussion here.
Will transformative AI raise interest rates?
We want to know if AGI is coming. Chow, Halperin, and Mazlish have a paper called “Transformative AI, Existential Risk, and Real Interest Rates” arguing that, if we believe the markets, it is not coming for some time. The reasoning is simple. If we expect to consume much more in the future, and people smooth their consumption over time, then people will want to borrow more now. The real interest rate would rise. The reasoning also works if AI is unaligned, and has a chance of destroying all of us. People would want to spend what they have now. They would be disinclined to save, and real interest rates would have to rise in order to induce people to lend.
The trouble is that “economic growth” is not really one thing. It consists both of expanding the quantity of units we consume for a given amount of resources and of expanding what we are capable of consuming at all. Take the television – it has simultaneously become cheaper and greatly improved in quality. One can easily imagine a world in which goods stay the same price, but greatly improve in quality. Thus, the marginal utility gained from one dollar increases in the future, and we would want to save more, not less. The coming of AGI could be heralded by falling interest rates and high levels of saving.
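To make the two channels explicit, here is the standard Euler-equation logic the argument turns on, written as a sketch under CRRA utility; the notation is mine, not the paper's.

```latex
% Consumption-smoothing (Euler) condition with CRRA utility, a sketch:
\[
  u'(c_t) = \beta (1+r)\, u'(c_{t+1}),
  \qquad u(c) = \frac{c^{1-\sigma}}{1-\sigma}
  \quad\Rightarrow\quad
  r \approx \rho + \sigma g ,
\]
% where \rho is the rate of time preference and g is expected consumption growth.
% Higher expected growth g (the transformative-AI scenario) raises r: the paper's channel.
%
% If instead growth arrives as quality improvements that raise the marginal utility
% of a dollar of spending, let \lambda_t denote that marginal utility of expenditure:
\[
  1 + r = \frac{1}{\beta}\,\frac{\lambda_t}{\lambda_{t+1}},
  \qquad \lambda_{t+1} > \lambda_t \;\Rightarrow\; r < \rho ,
\]
% so saving rises and the real rate falls: the channel described just above.
```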
Questions about LLMs (from my email)
From Naveen:
So much talk of “AI safety” and too little in the way of practical questions like these that are going to be important in the near future.
Should law enforcement be able to subpoena AI assistants about your information? For example, I use the free GPT-3.5/4 version and it already has a lot of my personal information on it.
The other day, when I asked an insurance-claims-related question in a new chat window without reminding it of the fact that my car was recently totaled, it included in the answer that “but that wouldn’t apply to you, since your car was declared non-repairable and you were declared as not at-fault.” So it remembers personal information I mentioned weeks ago even though I never told it to commit that to memory.
ChatGPT is such a rudimentary free AI system compared to the personal AI assistants we will get in the near future which will have all my travel data, health data, financial data, mental health data, personal data and what I’ve been up to.
Should law enforcement be allowed to subpoena such AI assistants? Should there be legislation mandating data retention so law enforcement can access it much like telephone records or the opposite — mandating data encryption so it can’t be accessed?