Summary of a new DeepMind paper

Super intriguing idea in this new @GoogleDeepMind  paper – shows how to handle the rise of AI agents acting as independent players in the economy.

It says that if left unchecked, these agents will create their own economy that connects directly to the human one, which could bring both benefits and risks.

The authors suggest building a “sandbox economy,” which is a controlled space where agents can trade and coordinate without causing harm to the broader human economy.

A big focus is on permeability, which means how open or closed this sandbox is to the outside world. A fully open system risks crashes and instability spilling into the human economy, while a fully closed system may be safer but less useful.

They propose using auctions where agents fairly bid for resources like data, compute, or tools. Giving all agents equal starting budgets could help balance power and prevent unfair advantages.

For larger goals, they suggest mission economies, where many agents coordinate toward one shared outcome, such as solving a scientific or social problem.

The risks they flag include very fast agent negotiations that humans cannot keep up with, scams or prompt attacks against agents, and powerful groups dominating resources.

To reduce these risks, they call for identity and reputation systems using tools like digital credentials, proof of personhood, zero-knowledge proofs, and real-time audit trails.

The core message is that we should design the rules for these agent markets now, so they grow in a safe and fair way instead of by accident.

That is from Rohan Paul, though the paper is by Nenad Tomasev, et al.  It would be a shame if economists neglected what is perhaps the most important (and interesting) mechanism design problem facing us.
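
To make the auction idea concrete, here is a minimal sketch of one possible mechanism: a sealed-bid second-price auction among agents that all start with the same budget.  The second-price rule, the `Agent` class, the function name, and the credit amounts are all illustrative assumptions; the summary above does not say which auction format the paper favors.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    budget: float  # remaining budget, in hypothetical sandbox credits


def run_second_price_auction(agents, bids):
    """Run one sealed-bid second-price auction for a single resource.

    `bids` maps agent name -> bid. Each bid is capped at the agent's
    remaining budget, so no agent can outspend its endowment.
    Returns (winner_name, price_paid), or None if nobody bids.
    """
    capped = {}
    for agent in agents:
        bid = min(bids.get(agent.name, 0.0), agent.budget)
        if bid > 0:
            capped[agent.name] = bid
    if not capped:
        return None

    # Highest bid wins; ties are broken alphabetically by name.
    ranked = sorted(capped.items(), key=lambda item: (-item[1], item[0]))
    winner = ranked[0][0]
    # The winner pays the second-highest bid (zero if unopposed).
    price = ranked[1][1] if len(ranked) > 1 else 0.0

    for agent in agents:
        if agent.name == winner:
            agent.budget -= price
    return winner, price


if __name__ == "__main__":
    # Equal starting budgets, as the summary suggests.
    agents = [Agent("A", 100.0), Agent("B", 100.0), Agent("C", 100.0)]
    print(run_second_price_auction(agents, {"A": 150.0, "B": 60.0, "C": 40.0}))
    print(run_second_price_auction(agents, {"A": 80.0, "B": 50.0, "C": 30.0}))
```

In this toy run, agent A wins the first round but its bid is capped at its budget, and the payment it makes leaves it unable to dominate the second round.  That is one simple way equal endowments can limit the concentration of resources, at least in a sketch like this one.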

Do Markets Believe in Transformative AI?

No:

Economic theory predicts that transformative technologies may influence interest rates by changing growth expectations, increasing uncertainty about growth, or raising concerns about existential risk. Examining US bond yields around major AI model releases in 2023-4, we find economically large and statistically significant movements concentrated at longer maturities. The median and mean yield responses across releases in our sample are negative: long-term Treasury, TIPS, and corporate yields fall and remain lower for weeks. Viewed through the lens of a simple, representative agent consumption-based asset pricing model, these declines correspond to downward revisions in expected consumption growth and/or a reduction in the perceived probability of extreme outcomes such as existential risk or arrival of a post-scarcity economy. By contrast, changes in consumption growth uncertainty do not appear to drive our results.

That is from a new NBER working paper by Isaiah Andrews and Maryam Farboodi.
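
For readers who want the mechanics, here is a rough sketch of the textbook consumption-based formula for the risk-free rate, with power utility and lognormal consumption growth.  This is the standard classroom version, not necessarily the authors' exact model; the extra `hazard` term for an extinction-type risk and all of the parameter values are illustrative assumptions.

```python
def risk_free_rate(delta, gamma, mu, sigma, hazard=0.0):
    """Approximate real risk-free rate in a textbook CRRA model.

    delta  : pure rate of time preference
    gamma  : coefficient of relative risk aversion
    mu     : expected log consumption growth
    sigma  : standard deviation of log consumption growth
    hazard : assumed per-period probability that the future does not
             arrive at all, treated as extra discounting
    """
    return delta + hazard + gamma * mu - 0.5 * gamma**2 * sigma**2


if __name__ == "__main__":
    baseline = risk_free_rate(delta=0.01, gamma=2.0, mu=0.02, sigma=0.02, hazard=0.005)
    lower_growth = risk_free_rate(delta=0.01, gamma=2.0, mu=0.018, sigma=0.02, hazard=0.005)
    lower_hazard = risk_free_rate(delta=0.01, gamma=2.0, mu=0.02, sigma=0.02, hazard=0.003)
    print(f"baseline:      {baseline:.2%}")
    print(f"lower growth:  {lower_growth:.2%}")
    print(f"lower hazard:  {lower_hazard:.2%}")
```

In this simple setup a fall in long yields can come from lower expected growth or from a lower perceived chance of an extreme outcome, which matches the two channels the abstract emphasizes.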

What should I ask Cass Sunstein?

Yes, I will be doing a Conversation with him soon.  Most of all (but not exclusively) about his three recent books Liberalism: In Defense of Freedom; Manipulation: What It Is, Why It Is Bad, What To Do About It; and Imperfect Oracle: What AI Can and Cannot Do.

So what should I ask him?  Here is my previous CWT with Cass.

AI Agents for Economic Research

The objective of this paper is to demystify AI agents – autonomous LLM-based systems that plan, use tools, and execute multi-step research tasks – and to provide hands-on instructions for economists to build their own, even if they do not have programming expertise. As AI has evolved from simple chatbots to reasoning models and now to autonomous agents, the main focus of this paper is to make these powerful tools accessible to all researchers. Through working examples and step-by-step code, it shows how economists can create agents that autonomously conduct literature reviews across myriads of sources, write and debug econometric code, fetch and analyze economic data, and coordinate complex research workflows. The paper demonstrates that by “vibe coding” (programming through natural language) and building on modern agentic frameworks like LangGraph, any economist can build sophisticated research assistants and other autonomous tools in minutes. By providing complete, working implementations alongside conceptual frameworks, this guide demonstrates how to employ AI agents in every stage of the research process, from initial investigation to final analysis.

By Anton Korinek.

Are we building an “animal internet”?

Should we?

Human owners of parrots in the study reported that the birds seemed happier when they could interact online with other parrots and not just with people…

Scientists are using digital technology to revolutionise animal communication and move towards an “animal internet”, using new products such as phones for dogs and touchscreens for parrots.

Experiments by Glasgow university have enabled several species, from parrots and monkeys to cats and dogs, to enjoy long-distance video and audio calls. They have also developed technology for monkeys and lemurs in zoos to trigger soothing sounds, smells or video images on demand.

Ilyena Hirskyj-Douglas, who heads the university’s Animal-Computer Interaction Group, started by developing a DogPhone that enables animals to contact their owners when they are left alone.

Her pet labrador Zack calls her by picking up and shaking an electronic ball containing an accelerometer. When this senses movement, it sets up a video call on a laptop, allowing Zack to interact with her whenever he chooses. She can also use the system to call him. Either party is free to pick up or ignore the call.

…When a parrot wanted to connect with a distant friend, a touchscreen showed a selection of other birds available online. The parrots learned to activate the screen, designed specially for them, by touching it gently with their tongues rather than pecking aggressively with their beaks.

“We had 26 birds involved,” said Hirskyj-Douglas. “They would use the system up to three hours a day, with each call lasting up to five minutes.” The interactions ranged from preening and playing with toys to loud vocal exchanges.

Here is more from Clive Cookson at the FT.  Via Malinga.

The polity that is Albania

Albania has become the first country in the world to have an AI minister — not a minister for AI, but a virtual minister made of pixels and code and powered by artificial intelligence.

Her name is Diella, meaning sunshine in Albanian, and she will be responsible for all public procurement, Prime Minister Edi Rama said Thursday.

During the summer, Rama mused that one day the country could have a digital minister and even an AI prime minister, but few thought that day would come around so quickly.

Here is the full story; we will see how this develops.  Diella also is tokenized.  Via MN.

A new RCT on banning smartphones in the classroom

Widespread smartphone bans are being implemented in classrooms worldwide, yet their causal effects on student outcomes remain unclear. In a randomized controlled trial involving nearly 17,000 students, we find that mandatory in-class phone collection led to higher grades — particularly among lower-performing, first-year, and non-STEM students — with an average increase of 0.086 standard deviations. Importantly, students exposed to the ban were substantially more supportive of phone-use restrictions, perceiving greater benefits from these policies and displaying reduced preferences for unrestricted access. This enhanced student receptivity to restrictive digital policies may create a self-reinforcing cycle, where positive firsthand experiences strengthen support for continued implementation. Despite a mild rise in reported fear of missing out, there were no significant changes in overall student well-being, academic motivation, digital usage, or experiences of online harassment. Random classroom spot checks revealed fewer instances of student chatter and disruptive behaviors, along with reduced phone usage and increased engagement among teachers in phone-ban classrooms, suggesting a classroom environment more conducive to learning. Spot checks also revealed that students appear more distracted, possibly due to withdrawal from habitual phone checking, yet, students did not report being more distracted. These results suggest that in-class phone bans represent a low-cost, effective policy to modestly improve academic outcomes, especially for vulnerable student groups, while enhancing student receptivity to digital policy interventions.

That is from a recent paper by Alp Sungu, Pradeep Kumar Choudhury, and Andreas Bjerre-Nielsen.  Note with grades there is “an average increase of 0.086 standard deviations.”  I have no problem with these policies, but it mystifies me why anyone would put them in their top five hundred priorities, or is that five thousand?  Here is my earlier post on Norwegian smartphone bans, with comparable results.
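
To get a feel for how small 0.086 standard deviations is, here is a quick back-of-the-envelope calculation.  It assumes grades are roughly normally distributed, which is my simplification and not a claim from the paper; the function name is made up for illustration.

```python
from statistics import NormalDist


def percentile_after_shift(start_percentile, effect_sd):
    """Percentile a student reaches after a shift of `effect_sd`
    standard deviations, assuming normally distributed grades."""
    standard_normal = NormalDist()
    z = standard_normal.inv_cdf(start_percentile / 100.0)
    return 100.0 * standard_normal.cdf(z + effect_sd)


if __name__ == "__main__":
    # A median student under the reported average effect.
    print(round(percentile_after_shift(50.0, 0.086), 1))
```

Under that assumption, a student at the median moves to roughly the 53rd percentile.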

My Hope Axis podcast with Anna Gát

Here is the YouTube, here is transcript access, here is their episode summary:

The brilliant @tylercowen joins @TheAnnaGat for a lively, wide-ranging conversation exploring hope from the perspective of insiders and outsiders, the obsessed and the competitive, immigrants and hard workers. They talk about talent and luck, what makes America unique, whether the dream of Internet Utopia has ended, and how Gen-Z might rebel. Along the way: Jack Nicholson, John Stuart Mill, road trips through Eastern Europe, the Enlightenment of AI, and why courage shapes the future.

Excerpt:

Tyler Cowen: But the top players I’ve met, like Anand or Magnus Carlsen or Kasparov, they truly hate losing with every bone in their body. They do not approach it philosophically. They can become very miserable as a result. And that’s very far from my attitudes. It shaped my life in a significant way.

Anna Gát: I was so surprised. I was like, what? But actually, what? In Maggie Smith-high RP—what? This never occurred to me that losing can be approached philosophically.

Tyler Cowen: And I think always keeping my equanimity has been good for me, getting these compound returns over long periods of time. But if you’re doing a thing like chess or math or sports that really favors the young, you don’t have all those decades of compound returns. You’ve got to motivate yourself to the maximum extent right now. And then hating losing is super useful. But that’s just—those are not the things I’ve done. The people who hate losing should do things that are youth-weighted, and the people who have equanimity should do things that are maturity and age-weighted with compounding returns.

Excellent discussion, lots of fresh material.  Here is the Hope Axis podcast more generally.  Here is Anna’s Interintellect project, worthy of media attention.  Most of all it is intellectual discourse, but it also seems to be the most successful “dating service” I am aware of.

How to think about AI progress

The Zvi has a good survey post on what is going on with the actual evidence.  I have a more general point to make, which I am drawing from my background in Austrian capital theory.

There are easy projects, and there are hard projects.  You might also say short-term vs. long-term investments.

The easier, shorter-term projects get done first.  For instance, the best LLMs now have near-perfect answers for a wide range of queries.  Those answers will not be getting much better, though they may be integrated into different services in higher-productivity ways.

Those improvements will yield an ongoing stream of benefits, but you will not see much incremental progress in the underlying models themselves.  Ten years from now, the word “strawberry” still will have three r’s, and the LLMs still will tell us that.  There are other questions, such as “what is the meaning of life?” where the AI answers also will not get much better.  I do not mean that statement as AI pessimism, rather the answers can only get so good because the question is not ideally specified in the first place.

Then there are the very difficult concrete problems, such as in the biosciences or with math olympiad problems, and so on.  Progress in these areas seems quite steady and I would call it impressive.  But it will take quite a few years before that progress is turned into improvements in daily life.  Again, that does not have to be AI pessimism.  Just look at how we run our clinical trials, or how long the FDA approval process takes for new drugs, or how many people are reluctant to accept beneficial vaccines.  I predict that AI will not speed up those processes nearly as much as it ideally might.

So the AI world before us is rather rapidly being bifurcated into two sectors:

a) progress already is extreme, and is hard to improve upon, and

b) progress is ongoing, but will take a long time to be visible to actual users and consumers

And so people will complain that AI progress is failing us, but mostly they will be wrong.  They will be victims of cognitive errors and biases.  The reality is that progress is continuing apace, but it swallows up and renders ordinary some of its more visible successes.  What is left behind for future progress can be pretty slow.

AI-led job interviews

We study the impact of replacing human recruiters with AI voice agents to conduct job interviews. Partnering with a recruitment firm, we conducted a natural field experiment in which 70,000 applicants were randomly assigned to be interviewed by human recruiters, AI voice agents, or given a choice between the two. In all three conditions, human recruiters evaluated interviews and made hiring decisions based on applicants’ performance in the interview and a standardized test. Contrary to the forecasts of professional recruiters, we find that AI-led interviews increase job offers by 12%, job starts by 18%, and 30-day retention by 17% among all applicants. Applicants accept job offers with a similar likelihood and rate interview, as well as recruiter quality, similarly in a customer experience survey. When offered the choice, 78% of applicants choose the AI recruiter, and we find evidence that applicants with lower test scores are more likely to choose AI. Analyzing interview transcripts reveals that AI-led interviews elicit more hiring-relevant information from applicants compared to human-led interviews. Recruiters score the interview performance of AI-interviewed applicants higher, but place greater weight on standardized tests in their hiring decisions. Overall, we provide evidence that AI can match human recruiters in conducting job interviews while preserving applicants’ satisfaction and firm operations.

That is from a new paper by Brian Jabarian and Luca Henkel.

The “Marvel Universe” of faith

In a recent video posted to the AI Bible’s Youtube channel, buildings crumble and terrified-looking people claw their way through the rubble. Horns blare, and an angel appears floating above the chaos. Then come monsters, including a seven-headed dragon that looks like something out of a Dungeons and Dragons rulebook.

The visuals in this eight-minute video, which depicts a section of the Book of Revelation, are entirely generated by artificial intelligence tools. At times it feels like a high-budget Hollywood movie, at times more like a scene from a video game, and at times like fantasy art. Despite the somewhat muddled visual styles, viewers seem to like what they see – it has racked up over 750,000 views in the two months since it was posted.

The viewers are mostly under 30, and skew male.

Here is the full story, via Ari Armstrong.

Pathbreaking paper on AI simulations of human behavior

By Benjamin Manning and John Horton:

Useful social science theories predict behavior across settings. However, applying a theory to make predictions in new settings is challenging: rarely can it be done without ad hoc modifications to account for setting-specific factors. We argue that AI agents put in simulations of those novel settings offer an alternative for applying theory, requiring minimal or no modifications. We present an approach for building such “general” agents that use theory-grounded natural language instructions, existing empirical data, and knowledge acquired by the underlying AI during training. To demonstrate the approach in settings where no data from that data-generating process exists—as is often the case in applied prediction problems—we design a highly heterogeneous population of 883,320 novel games. AI agents are constructed using human data from a small set of conceptually related, but structurally distinct “seed” games. In preregistered experiments, on average, agents predict human play better than (i) game-theoretic equilibria and (ii) out-of-the-box agents in a random sample of 1,500 games from the population. For a small set of separate novel games, these simulations predict responses from a new sample of human subjects better even than the most plausibly relevant published human data.

Here is a good Twitter thread.  A broader AI lesson here is that you often have to put in a lot of work to get the best from your LLMs.  And these results ought to have implications for the methods of psychology and some of the other social sciences as well.

Where are the trillion dollar biotech companies?

In today’s market, even companies with multiple approved drugs can trade below their cash balances. Given this, it is truly perplexing to see AI-biotechs raise mega-rounds at the preclinical stage – Xaira with a billion-dollar seed, Isomorphic with $600M, EvolutionaryScale with $142M, and InceptiveBio with $100M, to name a few. The scale and stage of these rounds reflect some investors’ belief that AI-biology pairing can bend the drug discovery economics I described before.

To me, the question of whether AI will be helpful in drug discovery is not as interesting as the question of whether AI can turn a 2-billion-dollar drug development into a 200-million-dollar drug development, or whether 10 years to approve a drug can become 5 years to approve a drug. AI will be used to assist drug discovery in the same way software has been used for decades, and, given enough time, we know it will change everything [4]. But is “enough time” 3 years or half a century?

One number that is worth appreciating is that 80% of all costs associated with bringing a drug to market come from clinical-stage work. That is, if we ever get to molecules designed and preclinically validated in under 1 year, we’ll be impacting only a small fraction of what makes drug discovery hard. This productivity gain cap is especially striking given that the majority of the data we can use to train models today is still preclinical, and, in most cases, even pre-animal. A perfect model predictive of in vitro tox saves you time on running in vitro tox (which is less than a few weeks anyway!), doesn’t bridge the in vitro to animal translation gap, and especially does not affect the dreaded animal-to-human jump. As such, perfecting predictive validity for preclinical work is the current best-case scenario for the industry. Though we don’t have a sufficient amount and types of data to solve even that.

Here is the full and very interesting essay, from the excellent Lada Nuzhna.
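
The cap on productivity gains is easy to see with some simple arithmetic.  The sketch below takes the 80% clinical share from the essay and assumes, for illustration only, that the remaining 20% is entirely preclinical cost that AI could reduce; the function name and the $2 billion starting figure (borrowed from the essay's example) are just a way to run the numbers.

```python
def remaining_cost(total_cost, clinical_share, preclinical_reduction):
    """Total cost left after cutting only the non-clinical share.

    clinical_share        : fraction of cost from clinical-stage work
    preclinical_reduction : fraction of the non-clinical cost removed
                            (1.0 means preclinical work becomes free)
    """
    preclinical_share = 1.0 - clinical_share
    return total_cost * (clinical_share + preclinical_share * (1.0 - preclinical_reduction))


if __name__ == "__main__":
    total = 2000.0  # in millions of dollars
    for cut in (0.5, 0.9, 1.0):
        left = remaining_cost(total, clinical_share=0.80, preclinical_reduction=cut)
        print(f"preclinical cost cut by {cut:.0%}: ${left:,.0f}M remaining")
```

Even if preclinical work became free, the $2 billion drug would still cost about $1.6 billion under these assumptions, nowhere near $200 million.  Getting there requires changing the clinical stage itself.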

I podcast with Jacob Watson-Howland

He is a young and very smart British photographer.  He sent me this:

Links to the episode:

YouTube: https://youtu.be/4nMg0Qg7KRI

Spotify: https://open.spotify.com/episode/2wDyCaXhGN5ruN1SNkcaBt?si=3c054eb0396f4787

Apple: https://podcasts.apple.com/gb/podcast/watson-howland/id1813625992?i=1000724051310

Jacob’s episode summary is at the first link.