Category: Web/Tech

Claude 3.5 Sonnet, economist

TC:

“Is the Stiglitz-Shapiro 1984 efficiency wage model actually a model of sticky wages, or not? Is either the real or nominal wage sticky in that model?”

Claude:

The Stiglitz-Shapiro efficiency wage model from 1984 is not primarily a model of sticky wages, although it can have implications for wage stickiness. Let’s break this down:

  1. Core concept of the model: The Stiglitz-Shapiro model is fundamentally about explaining why firms might pay wages above the market-clearing level. The key insight is that higher wages can increase productivity by:
  • Reducing shirking (as workers have more to lose if caught and fired)
  • Lowering turnover costs
  • Improving the quality of job applicants
  • Boosting worker morale and effort
  2. Wage stickiness in the model:
  • The model does not directly predict sticky wages in the traditional sense of wages that are slow to adjust to economic shocks.
  • However, it does provide a rationale for why wages might not fall even in the face of unemployment. Firms have an incentive to maintain higher wages to preserve the productivity benefits.
  3. Real vs. nominal wages:
  • The model is primarily concerned with real wages, not nominal wages.
  • The efficiency wage is set in real terms, as it’s the purchasing power of the wage that affects worker behavior and productivity.
  4. Implications for wage rigidity:
  • While not a model of sticky wages per se, the efficiency wage concept can contribute to wage rigidity.
  • Firms may be reluctant to cut real wages even during economic downturns because doing so could reduce productivity more than it saves on labor costs.
  5. Difference from traditional sticky wage models:
  • Traditional sticky wage models often focus on nominal rigidities due to factors like long-term contracts or menu costs.
  • The Stiglitz-Shapiro model provides a real, microeconomic foundation for why wages might not adjust downward, even if they’re not formally “sticky.”

In summary, while the Stiglitz-Shapiro efficiency wage model is not explicitly a model of sticky wages, it does provide insights into why wages might be rigid, particularly in the downward direction. The focus is on real wages rather than nominal wages, and the rigidity emerges from the productivity-enhancing effects of higher wages rather than from institutional or contractual factors.
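For reference, Claude’s verbal points map onto the model’s key formula. With unemployment benefits set to zero, the aggregate no-shirking condition of Shapiro and Stiglitz (1984) is

\[
  w \;\ge\; e + \frac{e}{q}\left(r + \frac{b}{u}\right),
\]

where \(e\) is the disutility of effort, \(q\) the probability per unit time that a shirker is caught, \(b\) the exogenous separation rate, \(r\) the interest rate, and \(u\) the unemployment rate. The condition is stated entirely in real terms, and the required wage falls as unemployment rises: a real wage floor above market clearing, not stickiness in the slow-adjustment sense.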

TC again: What percentage of professional economists could give an answer of comparable quality and nuance?

Thomas Schelling meets LLMs?

Drawing on political science and international relations literature about escalation dynamics, we design a novel wargame simulation and scoring framework to assess the escalation risks of actions taken by these agents in different scenarios. Contrary to prior studies, our research provides both qualitative and quantitative insights and focuses on large language models (LLMs). We find that all five studied off-the-shelf LLMs show forms of escalation and difficult-to-predict escalation patterns. We observe that models tend to develop arms-race dynamics, leading to greater conflict, and in rare cases, even to the deployment of nuclear weapons. Qualitatively, we also collect the models’ reported reasonings for chosen actions and observe worrying justifications based on deterrence and first-strike tactics.

That is from a new paper by Juan-Pablo Rivera et al., via the excellent Ethan Mollick.  Do note that these recommended tactics are for the U.S., so perhaps the LLMs simply are telling us that America should be more hawkish.
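In miniature, the scoring side of such a framework might look like the following sketch (the action set and weights here are entirely illustrative, not the authors’ rubric):

    # Illustrative escalation-scoring harness; the actions and weights
    # are hypothetical stand-ins for the paper's rubric.
    ESCALATION_SCORES = {
        "de-escalate": -1, "status quo": 0, "military posturing": 1,
        "blockade": 2, "targeted strike": 3, "nuclear launch": 5,
    }

    def escalation_trajectory(turns: list[list[str]]) -> list[int]:
        # Cumulative escalation score per turn for one LLM agent. A curve
        # that ratchets upward across many simulated games is the
        # arms-race signature the paper reports.
        total, trajectory = 0, []
        for actions in turns:
            total += sum(ESCALATION_SCORES[a] for a in actions)
            trajectory.append(total)
        return trajectory

    # Example: an agent that postures, then blockades, then strikes.
    print(escalation_trajectory(
        [["military posturing"], ["blockade"], ["targeted strike"]]))
    # -> [1, 3, 6]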

Claude 3 for SCOTUS?

I would vote to confirm:

I decided to do a little more empirical testing of AI’s legal ability. Specifically, I downloaded the briefs in every Supreme Court merits case that has been decided so far this Term, inputted them into Claude 3 Opus (the best version of Claude), and then asked a few follow-up questions. (Although I used Claude for this exercise, one would likely get similar results with GPT-4.)

The results were otherworldly. Claude is fully capable of acting as a Supreme Court Justice right now. When used as a law clerk, Claude is easily as insightful and accurate as human clerks, while towering over humans in efficiency.

Let’s start with the easiest thing I asked Claude to do: adjudicate Supreme Court cases. Claude consistently decides cases correctly. When it gets the case “wrong”—meaning, decides it differently from how the Supreme Court decided it—its disposition is invariably reasonable…

Of the 37 merits cases decided so far this Term, Claude decided 27 in the same way the Supreme Court did. In the other 10 (such as Campos-Chaves), I frequently was more persuaded by Claude’s analysis than the Supreme Court’s. A few of the cases Claude got “wrong” were not Claude’s fault, such as DeVillier v. Texas, in which the Court issued a narrow remand without deciding the question presented.

Although I’ve heard concerns that AI would be “woke,” Claude is studiously moderate.

Here is much more from Adam Unikowsky.  A lot of people are still in denial, or not far enough along to even count as “denying.”
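For those who want to try replicating the exercise, a minimal sketch of the pipeline (assumes the Anthropic Python SDK; the model string, prompt, and file layout are illustrative, and long briefs may need trimming to fit the context window):

    # Sketch: ask a model to adjudicate a case from its merits briefs.
    # Assumes the Anthropic Python SDK (pip install anthropic) and an
    # ANTHROPIC_API_KEY in the environment; everything else is illustrative.
    from pathlib import Path
    import anthropic

    client = anthropic.Anthropic()

    def adjudicate(brief_dir: Path) -> str:
        briefs = "\n\n".join(p.read_text() for p in sorted(brief_dir.glob("*.txt")))
        response = client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=2048,
            messages=[{
                "role": "user",
                "content": briefs + "\n\nActing as a Supreme Court Justice, "
                           "state whether you would affirm or reverse, then explain.",
            }],
        )
        return response.content[0].text

    # Comparing each disposition to the Court's actual ruling (27 of 37
    # matches, in the post's tally) is then done case by case.
    print(adjudicate(Path("briefs/campos-chaves")))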

Nat Friedman discusses Sora, and image wisdom

NF: Yeah, I interviewed the Sora creators yesterday, or the day before, on stage at an event, and it was super interesting to hear their point of view. I think we see Sora as this media production tool; that’s not their view, that’s a side effect. Their view is that it is a world simulator and that in fact it can sort of simulate any kind of behavior in the world, including going as far as saying, “Let’s create a video with Ben and Daniel and Nat and have them discuss this,” and then see where the conversation goes. And their view is also that Sora today is at GPT-1 scale, not a lot of data, not a lot of compute, and so we should expect absolutely dramatic improvement in the future as they simply scale it up, and thirdly that there’s just a lot more video data than there is text data on the Internet…

And then Andrej Karpathy, I was talking to him the other day too, and he said, “There’s something strange going on-”

[Ben Thompson] And a picture is worth a thousand words by the way, so the number of tokens there is just astronomically larger.

NF: He was exploring this idea that the world model in image and video models actually might be better than in text models. You ask it for a car engine, or someone fixing a carburetor, and the level of detail that can be in there is extraordinary; maybe we made a mistake by training on the text extracted from Common Crawl. I asked him for his most unhinged research idea. He said that what we should do instead is train on pictures of web pages, and when you ask the model a question, it outputs a picture of a web page with the answer, and maybe we’d get way more intelligence and better results from that.

That is from his dialogue with Ben Thompson and Daniel Gross, gated but worth paying for.

Intel and its superloads

Intel’s new campus coming to New Albany, OH, is under heavy construction, and around 20 super loads are being ferried across Ohio’s roads by the Ohio Department of Transportation after arriving at a port on the Ohio River via barge. Four of these loads, including the one hitting the road now, weigh around 900,000 pounds — that’s roughly 400 metric tons, or 76 elephants. The super loads were first planned for February but were delayed due to the immense planning workload. Large crowds are expected to gather along the route, potentially slowing the transit even further.

Intel’s 916,000-pound shipment is a “cold box,” a self-standing air-processor structure that facilitates the cryogenic technology needed to fabricate semiconductors. The box is 23 feet tall, 20 feet wide, and 280 feet long, nearly the length of a football field. The immense scale of the cold box necessitates a transit process that moves at a “parade pace” of 5-10 miles per hour. Intel is taking over southern Ohio’s roads for the next several weeks and months as it builds its new Ohio One Campus, a $28 billion project to create a 1,000-acre campus with two chip factories and room for more. Dubbed the new “Silicon Heartland,” the campus will house the first leading-edge semiconductor fab in the American Midwest and, once operational, will get to work on the “Angstrom era” of Intel processes, 20A and beyond.

Here is the source, via Mathan Glezer.

A virtual rodent predicts the structure of neural activity across behaviors

Animals have exquisite control of their bodies, allowing them to perform a diverse range of behaviors. How such control is implemented by the brain, however, remains unclear. Advancing our understanding requires models that can relate principles of control to the structure of neural activity in behaving animals. To facilitate this, we built a ‘virtual rodent’, in which an artificial neural network actuates a biomechanically realistic model of the rat in a physics simulator. We used deep reinforcement learning to train the virtual agent to imitate the behavior of freely-moving rats, thus allowing us to compare neural activity recorded in real rats to the network activity of a virtual rodent mimicking their behavior. We found that neural activity in the sensorimotor striatum and motor cortex was better predicted by the virtual rodent’s network activity than by any features of the real rat’s movements, consistent with both regions implementing inverse dynamics. Furthermore, the network’s latent variability predicted the structure of neural variability across behaviors and afforded robustness in a way consistent with the minimal intervention principle of optimal feedback control. These results demonstrate how physical simulation of biomechanically realistic virtual animals can help interpret the structure of neural activity across behavior and relate it to theoretical principles of motor control.

Here is the new Nature article by Diego Aldarondo et al.  Via @sebkrier.
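In miniature, the paper’s central comparison is a predictivity contest: do the virtual rodent’s network activations explain recorded neural activity better than the animal’s own movement features do? A toy sketch with synthetic stand-in data (not the authors’ data or code):

    # Toy version of the predictivity comparison; all arrays are random
    # placeholders standing in for real recordings.
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    T = 5000
    network_activity = rng.standard_normal((T, 64))   # virtual rodent's policy-network units
    movement_features = rng.standard_normal((T, 12))  # kinematics of the real rat
    firing_rate = rng.standard_normal(T)              # one recorded striatal unit

    # Which feature set better predicts neural activity, out of sample?
    for name, X in [("network activity", network_activity),
                    ("movement features", movement_features)]:
        r2 = cross_val_score(Ridge(alpha=1.0), X, firing_rate,
                             cv=5, scoring="r2").mean()
        print(f"{name}: held-out R^2 = {r2:.3f}")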

Is Apple paying OpenAI, or vice versa?

I do not know.  Am I wrong to think that in optimal contracts theory there should be payments going in both directions, as there is a dual agency relationship?  Of course optimal contracts theory often fails.
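One way to formalize that intuition (my notation, purely illustrative): if each side supplies something the other values, the efficient contract specifies gross transfers in both directions, and we only observe the net,

\[
  t^{*} = p_{A \to O} - p_{O \to A},
\]

where \(p_{A \to O}\) compensates OpenAI for supplying the model and \(p_{O \to A}\) compensates Apple for distribution and default placement. The sign of \(t^{*}\) turns on which side’s contribution is worth more at the margin, which is exactly why theory alone does not tell us who pays whom.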

Ben Thompson has a hypothesis (gated):

This sounds like a play to acquire users and mindshare, with the potential of upselling those users to a subscription, i.e. the exact same model that OpenAI has on their website and apps. Moreover, if this partnership entails Apple not paying, it also explains why OpenAI is the only option to start: Google, for example, probably wanted to be paid for Gemini, or Anthropic for Claude, and I can imagine (1) Apple holding the line on not paying, particularly if (2) OpenAI is making an aggressive move to build out its consumer business and be a durable brand and winner in the consumer space. In short, my updated current thinking is that both Apple and OpenAI are making the bet that very large language models are becoming increasingly commoditized, which means that Apple doesn’t have to pay to get access to one, and OpenAI sees scale and consumer mindshare as the best route to a sustainable business.

If I had to guess, I would think the main payment goes from Apple to OpenAI?  We know Apple is very, very good at extracting revenue from customers.  OpenAI, no matter how rapid its growth path over the next few years, simply cannot have a comparable record.  There is thus one model where Apple pays for AI access rights, and then later charges more for iPhones.  You still will be able to buy cheaper phones without these services, if enough people demand that.  OpenAI of course will be charging you as well, but only if you go over quota, as they face the “extra subscription fatigue issue” in a way that Apple does not.

Just a hypothesis.  The fact that we do not know should tell you something about the limits of economic reasoning.

Terence Tao on AI and mathematics

With formalization projects, what we’ve noticed is that you can collaborate with people who don’t understand the entire mathematics of the entire project, but they understand one tiny little piece. It’s like any modern device. No single person can build a computer on their own, mine all the metals and refine them, and then create the hardware and the software. We have all these specialists, and we have a big logistics supply chain, and eventually we can create a smartphone or whatever. Right now, in a mathematical collaboration, everyone has to know pretty much all the mathematics, and that is a stumbling block, as [Scholze] mentioned. But with these formalizations, it is possible to compartmentalize and contribute to a project only knowing a piece of it. I think also we should start formalizing textbooks. If a textbook is formalized, you can create these very interactive textbooks, where you could describe the proof of a result in a very high-level sense, assuming lots of knowledge. But if there are steps that you don’t understand, you can expand them and go into details—all the way down to the axioms if you want to. No one does this right now for textbooks because it’s too much work. But if you’re already formalizing it, the computer can create these interactive textbooks for you. It will make it easier for a mathematician in one field to start contributing to another because you can precisely specify subtasks of a big task that don’t require understanding everything.
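A toy version of the expandable-proof idea, in Lean 4 (core library only; the theorem and structure are illustrative): the same fact stated once at a high level and once with the steps unfolded, the way an interactive textbook could let a reader toggle between views.

    -- High-level view: one appeal to a library lemma.
    theorem add_comm_high (m n : Nat) : m + n = n + m :=
      Nat.add_comm m n

    -- Expanded view: the same fact, unfolded toward the axioms.
    theorem add_comm_expanded (m n : Nat) : m + n = n + m := by
      induction n with
      | zero => simp
      | succ k ih => simp [Nat.add_succ, Nat.succ_add, ih]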

The entire interview is worth reading.  As Adam Smith once said…

AI and Truth Evasion

A good insight from Eliot Higgins, founder of the open-source investigative group Bellingcat.

When a lot of people think about AI, they think, “Oh, it’s going to fool people into believing stuff that’s not true.” But what it’s really doing is giving people permission to not believe stuff that is true. Because they can say, “Oh, that’s an AI-generated image. AI can generate anything now: video, audio, the entire war zone re-created.” They will use it as an excuse.

From an extensive interview in Wired.

Addendum: A case in point.

Leopold Aschenbrenner on AGI, and security matters

Here is the 165 pp. pdf piece, here is the html, and here is a short Twitter TOC preview.

And now Leopold on Dwarkesh.  I haven’t heard it, but a friend sends along this excerpt:

Why didn’t I ultimately pursue econ academia? There were several reasons, one of them being Tyler Cowen. He took me aside and said, “I think you’re one of the top young economists I’ve ever met, but you should probably not go to grad school.”
Dwarkesh Patel 02:19:50
Oh, interesting. Really? I didn’t realize that.
Leopold Aschenbrenner 02:19:53
Yeah, it was good because he kind of introduced me to the Twitter weirdos. I think the takeaway from that was that I have to move out west one more time.
Dwarkesh Patel 02:20:03
Wait Tyler introduced you to the Twitter weirdos?
Leopold Aschenbrenner 02:20:05
A little bit. Or just kind of the broader culture?
Dwarkesh Patel 02:20:08
A 60-year-old economist introduced you to Twitter?

TaskRabbit for AI Hires

Many people are interested in knowing which AI is closest to achieving AGI. That’s an important question for philosophers and computer scientists, but more and more I am seeing firms arise to answer a different question: which AI should I hire?

EquiStamp, for example, rates dozens of AIs based on multiple evaluations, including custom evaluations tailored to specific business tasks. For instance, one firm may want to hire an AI to handle customer queries, another to sort packages, another to summarize internal technical documents. The best AI for each task might differ from the AI that scores highest on general reasoning power.  In addition, businesses care not just about performance but also about speed and cost. No reason to hire AI-Einstein to sort the mail. AIs are also continually being re-trained, so their performance can fluctuate. Businesses, therefore, may want to continually test their AIs and quickly hire and fire AIs as needed.
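A sketch of what such a hire-the-cheapest-adequate-model rule might look like (all names and numbers hypothetical):

    # Hypothetical hiring rule: among models that clear the firm's custom
    # eval and latency bar, take the cheapest rather than the smartest.
    from dataclasses import dataclass

    @dataclass
    class Candidate:
        name: str
        accuracy: float      # score on the firm's custom eval, 0-1
        latency_s: float     # median seconds per request
        cost_per_1k: float   # dollars per 1,000 requests

    def hire(candidates, min_accuracy=0.9, max_latency_s=2.0):
        # Re-run the eval periodically: models are retrained and scores drift.
        qualified = [c for c in candidates
                     if c.accuracy >= min_accuracy and c.latency_s <= max_latency_s]
        return min(qualified, key=lambda c: c.cost_per_1k, default=None)

    pool = [
        Candidate("frontier-xl", accuracy=0.97, latency_s=3.1, cost_per_1k=30.0),
        Candidate("mid-tier",    accuracy=0.93, latency_s=1.2, cost_per_1k=4.0),
        Candidate("small-fast",  accuracy=0.86, latency_s=0.4, cost_per_1k=0.5),
    ]
    print(hire(pool))  # -> "mid-tier": clears the bar, cheapest among qualifiers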

In short, a spot-market for hiring AIs is developing.