Category: Web/Tech

Comparing Large Language Models Against Lawyers

This paper presents a groundbreaking comparison between Large Language Models and traditional legal contract reviewers: Junior Lawyers and Legal Process Outsourcers. We dissect whether LLMs can outperform humans in accuracy, speed, and cost efficiency during contract review. Our empirical analysis benchmarks LLMs against a ground truth set by Senior Lawyers, uncovering that advanced models match or exceed human accuracy in determining legal issues. In speed, LLMs complete reviews in mere seconds, eclipsing the hours required by their human counterparts. Cost-wise, LLMs operate at a fraction of the price, offering a staggering 99.97 percent reduction in cost over traditional methods. These results are not just statistics; they signal a seismic shift in legal practice. LLMs stand poised to disrupt the legal industry, enhancing accessibility and efficiency of legal services. Our research asserts that the era of LLM dominance in legal contract review is upon us, challenging the status quo and calling for a reimagined future of legal workflows.

That is from a new paper by Lauren Martin, Nick Whitehouse, Stephanie Yiu, Lizzie Catterson, and Rivindu Perera.  Via Malinga.
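The headline 99.97 percent figure is easy to sanity-check with back-of-the-envelope arithmetic. The dollar amounts below are hypothetical, chosen only to be consistent with the reported reduction; they are not figures from the paper:

```python
# Toy cost comparison between human and LLM contract review.
# Both dollar figures are illustrative, not the paper's data.
human_cost_per_contract = 1000.00  # e.g., several billable hours of junior-lawyer time
llm_cost_per_contract = 0.30       # e.g., API tokens for one review pass

reduction = 1 - llm_cost_per_contract / human_cost_per_contract
print(f"Cost reduction: {reduction:.2%}")  # → Cost reduction: 99.97%
```

Any pair of prices in roughly a 3,300-to-1 ratio yields the same percentage, which is why the speed and accuracy comparisons matter more than the exact dollar figures.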

Zvi Mowshowitz covers my podcast with Dwarkesh Patel

It is very long, very detailed, and very good.  Interesting throughout!  Excerpt:

Dealing with the third conversation is harder. There are places where I feel Tyler is misinterpreting a few statements, in ways I find extremely frustrating and that I do not see him do in other contexts, and I pause to set the record straight in detail. I definitely see hope in finding common ground and perhaps working together. But so far I have been unable to find the road in.

Here is the whole thing.

Claude of Anthropic does *GOAT*

Here is the new book bot, trained on Claude of course.  Again, I call this a “generative book.”

Very powerful, and I am indebted to those who helped, including at Anthropic and also Jeff Holmes of Mercatus.

Here is the original web site and the original GPT-4 bot, which still is great.  The full title is GOAT: Who is the Greatest Economist of all Time, and Why Should We Care?  An audiobook will be coming, can you guess who is reading it?

My new podcast with Dwarkesh Patel

We discussed how the insights of Hayek, Keynes, Smith, and other great economists help us make sense of AI, growth, risk, human nature, anarchy, central planning, and much more.

Dwarkesh is one of the very best interviewers around, here are the links.  If Twitter is blocked to you, here is the transcript, here is Spotify, among others.  Here is the most salacious part of the exchange, highly atypical of course:

Dwarkesh Patel 00:17:16

If Keynes were alive today, what are the odds that he’s in a polycule in Berkeley, writing the best written LessWrong post you’ve ever seen?

Tyler Cowen 00:17:24

I’m not sure what the counterfactual means. Keynes is so British. Maybe he’s an effective altruist at Cambridge. Given how he seemed to have run his sex life, I don’t think he needed a polycule. A polycule is almost a Williamsonian device to economize on transaction costs. But Keynes, according to his own notes, seems to have done things on a very casual basis.

And on another topic:

Dwarkesh Patel 00:36:44

We’re talking, I guess, about like GPT five level models. When you think in your mind about like, okay, this is GPT five. What happens with GPT six, GPT seven. Do you see it? Do you still think in the frame of having a bunch of RAs, or does it seem like a different sort of thing at some point?

Tyler Cowen 00:36:59

I’m not sure what those numbers going up mean, what a GPT seven would look like, or how much smarter it could get. I think people make too many assumptions there. It could be the real advantages are integrating it into workflows by things that are not better GPTs at all. And once you get to GPT, say, 5.5, I’m not sure you can just turn up the dial on smarts and have it, like, integrate general relativity and quantum mechanics.

Dwarkesh Patel 00:37:26

Why not?

Tyler Cowen 00:37:27

I don’t think that’s how intelligence works. And this is a Hayekian point. And some of these problems, there just may be no answer. Like, maybe the universe isn’t that legible, and if it’s not that legible, the GPT eleven doesn’t really make sense as a creature or whatever.

Dwarkesh Patel 00:37:44

Isn’t there a Hayekian argument to be made that, listen, you can have billions of copies of these things? Imagine the sort of decentralized order that could result, the amount of decentralized tacit knowledge that billions of copies talking to each other could have. That, in and of itself, is an argument to be made that the whole thing as an emergent order will be much more powerful than we were anticipating.

Tyler Cowen 00:38:04

Well, I think it will be highly productive. What “tacit knowledge” means with AIs, I don’t think we understand yet. Is it by definition all non-tacit? Or does the fact that how GPT-4 works is not legible to us or even its creators so much? Does that mean it’s possessing of tacit knowledge, or is it not knowledge? None of those categories are well thought out, in my opinion. So we need to restructure our whole discourse about tacit knowledge in some new, different way. But I agree, these networks of AIs, even before, like, GPT-11, they’re going to be super productive, but they’re still going to face bottlenecks, right? And I don’t know how good they’ll be at, say, overcoming the behavioral bottlenecks of actual human beings, the bottlenecks of the law and regulation. And we’re going to have more regulation as we have more AIs.

You will note I corrected the AI transcriber on some minor matters.  In any case, self-recommending, and here is the YouTube embed:

Vitalik on AI and crypto

…we can expect to see in the 2020s that we did not see in the 2010s: the possibility of ubiquitous participation by AIs.

AIs are willing to work for less than $1 per hour, and have the knowledge of an encyclopedia – and if that’s not enough, they can even be integrated with real-time web search capability. If you make a market, and put up a liquidity subsidy of $50, humans will not care enough to bid, but thousands of AIs will easily swarm all over the question and make the best guess they can. The incentive to do a good job on any one question may be tiny, but the incentive to make an AI that makes good predictions in general may be in the millions. Note that potentially, you don’t even need the humans to adjudicate most questions: you can use a multi-round dispute system similar to Augur or Kleros, where AIs would also be the ones participating in earlier rounds. Humans would only need to respond in those few cases where a series of escalations have taken place and large amounts of money have been committed by both sides.

This is a powerful primitive, because once a “prediction market” can be made to work on such a microscopic scale, you can reuse the “prediction market” primitive for many other kinds of questions:

  • Is this social media post acceptable under [terms of use]?
  • What will happen to the price of stock X? (e.g. see Numerai)
  • Is this account that is currently messaging me actually Elon Musk?
  • Is this work submission on an online task marketplace acceptable?
  • Is the dapp at https://examplefinance.network a scam?
  • Is 0x1b54....98c3 actually the address of the “Casinu Inu” ERC20 token?

You may notice that a lot of these ideas go in the direction of what I called “info defense.”

There are many other points, such as AIs as oracles, read it here, interesting throughout.
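Vitalik’s multi-round dispute idea can be sketched in a few lines. This is a hypothetical simplification of escalation systems in the spirit of Augur or Kleros, not their actual protocols; all names, stakes, and thresholds below are invented:

```python
# Hypothetical sketch of escalating dispute resolution for micro prediction markets:
# cheap AI juries handle the early rounds, and humans are consulted only once
# enough money has been staked on a continued dispute.

def dispute_raised(question, verdict, stake):
    """Placeholder: in a real system, anyone could escalate by posting `stake`."""
    return False  # no challenger in this toy run

def resolve(question, ai_jury, human_jury,
            stake_schedule=(1, 10, 100), human_threshold=100):
    """Run escalation rounds until a verdict goes unchallenged or humans decide."""
    for stake in stake_schedule:
        if stake >= human_threshold:
            # Large amounts committed on both sides: fall back to the
            # expensive human jury, as in Vitalik's description.
            return human_jury(question)
        verdict = ai_jury(question)
        if not dispute_raised(question, verdict, stake):
            return verdict  # nobody paid the stake to challenge; verdict stands
    return human_jury(question)

answer = resolve("Is this dapp a scam?",
                 ai_jury=lambda q: "yes",
                 human_jury=lambda q: "human verdict")
print(answer)  # → yes  (AI verdict stands; humans never consulted)
```

The economics carry the design: because almost every question terminates in the cheap AI rounds, the per-question liquidity subsidy can be tiny while human attention is reserved for the rare, high-stakes escalations.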

What are the actual dangers of advanced AI?

That is the focus of my latest Bloomberg column, 2x the normal length.  I cannot cover all the points, but here is one excerpt:

The larger theme is becoming evident: AI will radically disrupt power relations in society.

AI may severely limit, for instance, the status and earnings of the so-called “wordcel” class. It will displace many jobs that deal with words and symbols, or make them less lucrative, or just make those who hold them less influential. Knowing how to write well won’t be as valuable a skill five years from now, because AI can improve the quality of just about any text. Being bilingual (or tri- or quadrilingual, for that matter) will also be less useful, and that too has been a marker of highly educated status. Even if AIs can’t write better books than human authors, readers may prefer to spend their time talking to AIs rather than reading.

It is worth pausing to note how profound and unprecedented this development would be. For centuries, the Western world has awarded higher status to what I will call ideas people — those who are good at developing, expressing and putting into practice new ways of thinking. The Scientific and Industrial revolutions greatly increased the reach and influence of ideas people.

AI may put that trend into reverse.

And on arms races:

If I were to ask AI to sum up my worries about AI — I am confident it would do it well, but to be clear this is all my own work! — it might sound something like this: When dynamic technologies interact with static institutions, conflict is inevitable, and AI makes social disruption for the wordcel class and a higher-stakes arms race more likely.

That last is the biggest problem, but it is also the unavoidable result of a world order based on nation-states. It is a race that the Western democracies and their allies have to manage and win. That is true regardless of the new technology in question: Today it is AI, but future arms races could concern solar-powered space weapons, faster missiles and nuclear weapons, or some yet-to-be-invented way of wreaking havoc on this planet and beyond. Yes, the US may lose some of these races, which makes it all the more important that it win this one — so it can use AI technologies as a counterweight to its deficiencies elsewhere.

In closing I will note for the nth time that rationalist and EA philosophies — which tend to downgrade the import of travel and cultural learning — are poorly suited for reasoning about foreign policy and foreign affairs.

OpenAI will partner with Arizona State University

  • OpenAI on Thursday announced its first partnership with a higher education institution.
  • Starting in February, Arizona State University will have full access to ChatGPT Enterprise and plans to use it for coursework, tutoring, research and more.
  • The partnership has been in the works for at least six months.
  • ASU plans to build a personalized AI tutor for students, allow students to create AI avatars for study help and broaden the university’s prompt engineering course.

Here is the full story.  After a very brief lull, AI progress is heating up once again…

Meta, the forward march of progress

According to one estimate, that is about $20 billion spent on GPUs. Which I guess is a lot.

How is AI education going to work?

That is the topic of my latest Bloomberg column.  Here is the first part of the argument:

Two kinds of AI-driven education are likely to take off, and they will have very different effects. Both approaches have real promise, but neither will make everyone happy.

The first category will resemble learning platforms such as Khan Academy, Duolingo, GPT-4, and many other services. Over time, these sources will become more multimedia, quicker in response, deeper in their answers, and better at creating quizzes, exercises and other feedback. For those with a highly individualized learning style — preferring videos to text, say, or wanting lessons slower or faster — the AIs will oblige. The price will be relatively low; Khan Academy currently is free and GPT-4 costs $20 a month, and those markets will become more competitive.

Those who want it will be able to access a kind of universal tutor as envisioned by Neal Stephenson in his novel The Diamond Age. But how many people will really want to go this route? My guess is that it will be a clear minority of the population, well below 50%, whether at younger or older age groups…

Chatbots will probably make education more fun, but for most people there is a limit to just how fun instruction can be.

And the second part:

There is, however, another way AI education could go — and it may end up far more widespread, even if it makes some people uneasy. Imagine a chatbot programmed to be your child’s friend. It would be exactly the kind of friend your kid wants, even (you hope) the kind of friend your kid needs. Your child might talk with this chatbot for hours each day.

Over time, these chatbots would indeed teach children valuable things, including about math and science. But it would happen slowly, subtly. When I was in high school, I had two close (human) friends with whom I often talked economics. We learned a lot from each other, but we were friends first and foremost, and the conversations grew out of that. As it turns out, all three of us ended up becoming professional economists.

This could be the path the most popular and effective AI chatbots follow: the “friendship first” model. Under that scenario, an AI chatbot doesn’t have to be more fun than spending time with friends, because it is itself a kind of friend. Through a kind of osmosis, the child could grow interested in some topics raised by the AI chatbot, and the chatbot could feed the child more information and inspiration in those areas. But friendship would still come first.

Worth a ponder.

Hester Peirce on the SEC

As she did on the LBRY case (n.b. I was a LBRY advisor), SEC Commissioner Hester Peirce released a statement about the failings of the SEC with regard to the recently approved Bitcoin spot ETPs. It’s rare to read something this brutal coming from the inside.

We squandered a decade of opportunities to do our job. If we had applied the standard we use for other commodity-based ETPs, we could have approved these products years ago, but we refused to do so until a court called our bluff.

…Today’s order does not undo the many harms created by the disparate treatment of spot bitcoin products.

First, our arbitrary and capricious treatment of applications in this area will continue to harm our reputation far beyond crypto. Diminished trust from the public will inhibit our ability to regulate the markets effectively. This saga will taint future interactions between the industry and our staff and will dampen the rich, informative dialogue that best protects investors.

Second, our disproportionate attention on these filings has diverted limited staff resources away from other mission critical work. Over ten years, likely millions of dollars of staff time has gone toward blocking these applications.

Third, our actions here have muddied people’s understanding of what the SEC’s role is. Congress did not authorize us to tell people whether a particular investment is right for them, but we have abused administrative procedures to withhold investments that we do not like from the public.

Fourth, by failing to follow our normal standards and processes in considering spot bitcoin ETPs, we have created an artificial frenzy around them. Had these products come to market in the way other comparable products typically have, we would have avoided the circus atmosphere in which we now find ourselves.

Fifth, we have alienated a generation of product innovators within our space. Our unreasonable approach to these applications has signaled that regulatory prejudice against new products and services can lead us to sidestep the law and unreasonably delay product launches. The industry has logged hundreds of meetings, has filed submissions, withdrawals and amendments, and ultimately had to resort to a costly legal battle to get us to today.

Although this is a time for reflection, it is also a time for celebration. I am not celebrating bitcoin or bitcoin-related products; what one regulator thinks about bitcoin is irrelevant. I am celebrating the right of American investors to express their thoughts on bitcoin by buying and selling spot bitcoin ETPs. And I am celebrating the perseverance of market participants in trying to bring to market a product they think investors want. I commend applicants’ decade-long persistence in the face of the Commission’s obstruction.

Are chatbots better than we are?

Maybe so:

We administer a Turing Test to AI Chatbots. We examine how Chatbots behave in a suite of classic behavioral games that are designed to elicit characteristics such as trust, fairness, risk-aversion, cooperation, etc., as well as how they respond to a traditional Big-5 psychological survey that measures personality traits. ChatGPT-4 exhibits behavioral and personality traits that are statistically indistinguishable from a random human from tens of thousands of human subjects from more than 50 countries. Chatbots also modify their behavior based on previous experience and contexts “as if” they were learning from the interactions, and change their behavior in response to different framings of the same strategic situation. Their behaviors are often distinct from average and modal human behaviors, in which case they tend to behave on the more altruistic and cooperative end of the distribution. We estimate that they act as if they are maximizing an average of their own and partner’s payoffs.

That is from a recent paper by Qiaozhu Mei, Yutong Xie, Walter Yuan, and Matthew O. Jackson.  Via the excellent Kevin Lewis.
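The closing claim — that chatbots “act as if they are maximizing an average of their own and partner’s payoffs” — is easy to illustrate in a dictator game. The weights below are illustrative only, not estimates from the paper:

```python
# Toy dictator game: the dictator splits a fixed pie and, per the paper's
# "as if" model, maximizes w*own_payoff + (1-w)*partner_payoff.
# A plain average (w = 0.5) leaves the dictator indifferent among all splits;
# any selfish tilt (w > 0.5) makes keeping everything optimal, and any
# partner-leaning tilt (w < 0.5) makes giving everything away optimal.

def best_give(pie, w):
    """Return the integer transfer maximizing w*own + (1-w)*partner utility."""
    return max(range(pie + 1),
               key=lambda give: w * (pie - give) + (1 - w) * give)

print(best_give(100, 0.6))  # selfish weighting: keep it all → 0
print(best_give(100, 0.4))  # partner-leaning weighting: give it all → 100
```

The interesting empirical content is therefore in the estimated weight: a chatbot that consistently gives nonzero amounts is revealing a weight at or below the human average, which matches the paper’s finding that the bots sit on the altruistic end of the human distribution.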

We need more talk of progress


Here is the underlying John Burn-Murdoch FT piece, pointer here.  And a text excerpt:

Ruxandra Teslo, one of a growing community of progress-focused writers at the nexus of science, economics and policy, argues that the growing scepticism around technology and the rise in zero-sum thinking in modern society is one of the defining ideological challenges of our time.

I am pleased that Ruxandra is now also an Emergent Ventures winner.

What should I ask Jonathan Haidt?

Yes, I will be doing another Conversation with him.  Here is my previous Conversation with him, almost eight years ago.  As many of you will know, Jonathan has a new book coming out, namely The Anxious Generation: How the Great Rewiring of Childhood Is Causing an Epidemic of Mental Illness.  But there is much more to talk about as well.  So what should I ask him?