Category: Web/Tech

Scenarios for the Transition to AGI

By Anton Korinek and Donghyun Suh, in a new NBER working paper:

We analyze how output and wages behave under different scenarios for technological progress that may culminate in Artificial General Intelligence (AGI), defined as the ability of AI systems to perform all tasks that humans can perform. We assume that human work can be decomposed into atomistic tasks that differ in their complexity. Advances in technology make ever more complex tasks amenable to automation. The effects on wages depend on a race between automation and capital accumulation. If automation proceeds sufficiently slowly, then there is always enough work for humans, and wages may rise forever. By contrast, if the complexity of tasks that humans can perform is bounded and full automation is reached, then wages collapse. But declines may occur even before that point, if large-scale automation outpaces capital accumulation and makes labor too abundant. Automating productivity growth may lead to broad-based gains in the returns to all factors. By contrast, bottlenecks to growth from irreproducible scarce factors may exacerbate the decline in wages.

The best paper on these topics so far?  And here is a recent Noah Smith piece on employment as AI proceeds.  And a recent Belle Lin WSJ piece, via Frank Gullo, “Tech Job Seekers Without AI Skills Face a New Reality: Lower Salaries and Fewer Roles.”  And here is a proposal for free journalism school for everybody (NYT, okie-dokie!).
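The “race” in the abstract is easy to see in a toy version of the model.  Below is a minimal sketch in the spirit of standard task-based automation models, not the paper’s actual specification: tasks on the unit interval, an automated fraction produced with capital, the rest with labor, and capital accumulating out of saved output.  All parameters and automation paths are illustrative.

```python
import numpy as np

def simulate(beta_path, s=0.05, delta=0.05, K0=1.0, L=1.0):
    """Wages along an automation path in a stylized task model.

    Tasks lie on [0, 1]; the automated fraction beta is produced with
    capital, the rest with labor; output is the Cobb-Douglas aggregate
    over tasks, and capital accumulates out of saved output.
    """
    K, wages = K0, []
    for beta in beta_path:
        Y = (K / beta) ** beta * (L / (1 - beta)) ** (1 - beta)
        wages.append((1 - beta) * Y / L)  # wage = marginal product of labor
        K = (1 - delta) * K + s * Y       # the accumulation side of the race
    return wages

T = 200
slow = np.linspace(0.30, 0.60, T)                    # gradual automation
fast = np.concatenate([np.linspace(0.30, 0.90, 20),  # rapid automation burst...
                       np.full(T - 20, 0.90)])       # ...then a plateau

w_slow, w_fast = simulate(slow), simulate(fast)
print(f"slow: wage {w_slow[0]:.2f} -> min {min(w_slow):.2f} -> end {w_slow[-1]:.2f}")
print(f"fast: wage {w_fast[0]:.2f} -> min {min(w_fast):.2f} -> end {w_fast[-1]:.2f}")
```

Under gradual automation the wage rises; under the burst it collapses while capital is still scarce and recovers only as accumulation catches up, which is the abstract’s point that declines can arrive well before full automation.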

Gatekeeping is Apple’s Brand Promise

Steve Sinofsky, former president of Microsoft’s Windows division and now a VC, has an excellent deep dive on the EU’s Digital Markets Act (DMA). The Act is squarely aimed at Apple, despite the fact that Apple is not a monopoly and has a significantly smaller share of the phone market than Android. Apple’s history is well known: in contrast with Microsoft, it went for a closed system in which Apple controlled entry to a much greater extent. The same was true of iPhone versus Android.

iPhone was successful, but it was not as successful as Android, which came shortly after, because of the constraints Steve put in place to be the best, not to have the highest share or the greatest number of units. Android was to smartphones what Microsoft was to personal computers. Android sought out the highest share, the greatest variety of hardware at the lowest prices, and the most open platform for both phone makers and developers. By making Android open source, Google even out-Microsofted Microsoft by providing what hardware makers had always wanted: complete control. A lot more manufacturers, people, and companies appreciated that approach than Apple’s. That’s why something like 7 out of 10 smartphones in the world run Android.

Android has the kind of success Microsoft would envy, but Apple would not, primarily because with that success came almost all the same issues that Microsoft sees (still) with the Windows PC. The security, privacy, abuse, and fragility problems of the PC show up on Android at a rate comparable to the PC’s relative to Macintosh and iPhone. Only this time it is not for lack of motivation among bad actors to exploit iPhone; rather, it is the foresight of the Steve Jobs vision for computing. He pushed for a new kind of computer that further encapsulated and abstracted the machine to make it safer, more reliable, more private, more secure, longer-lasting on battery, more accessible, more consistent, and easier to use. These attributes did not happen by accident. They were the product of design and architecture from the very start. They are the brand promise of iPhone, just as the brand promise of Android is openness, ubiquity, low price, and choice.

The lesson of the first two decades of the PC and the first almost two decades of smartphones is that these ends of the spectrum are not accidental. The choices are not mutually compatible. You don’t get both. I know this is horrible to say, and everyone believes there is either malicious intent to lock people into a closed environment or unintentional incompetence that permits bad software to invade an ecosystem. Neither is the case. Quite simply, there is a choice between engineering and architecting for one or the other, and once you start you can’t go back. More importantly, the market values and demands both.

That is, unless you’re a regulator in Brussels. Then you sit in an amazing government building and decide that it is entirely possible to declare, by fiat, that the iPhone should have all the attributes of openness.

Apple’s promise to iPhone users is that it will be a gatekeeper. Gatekeeping is what allows Apple to promise greater security, privacy, usability, and reliability. Gatekeeping is Apple’s brand promise. Gatekeeping is what consumers are buying. The EU’s DMA is an attempt to make Apple more “open,” but it can do so only at the expense of turning Apple into Android, devaluing the brand promise and, ironically, reducing competition.

Read the whole thing for more details and history including useful comparisons with the US antitrust trial against Microsoft.

Austin Vernon on drones and defense (from my email)

I think drones still favor the defensive. On the front line they make movement, and hence offense, very difficult.

In the strategic sense, we’ve already seen Ukraine adjust to the propeller-drone and cruise-missile attacks. The first few months were terrible for them, but then they organized a defense system with mobile anti-drone teams. The interception percentage for drones traveling a fair distance over Ukraine is extremely high, 98%-type numbers. Most of the Russian focus is now on more “front line” targets like Odessa, because the Ukrainians don’t have as much time and space to make the interception. They are downing maybe 60%-70% of those drones.

The Russians are slow to adapt, but they eventually do. There is no reason to believe they won’t get better at intercepting these slow drones. Expensive cruise missiles with high success rates can end up being a better deal when strategic drones suffer 98% loss rates. The slow drones are better suited to near-front-line attacks. It also wouldn’t surprise me if the drones evolved to be more expensive, adding features like quiet engines, thermal-signature obfuscation, and lower radar cross sections.

I also think it’s worth pointing out that the Houthis have tried unmanned surface vehicles, and they’ve all been quickly destroyed. Same with their slower drones. The hardest weapons to defend against have been conventional anti-ship missiles and the newer ballistic anti-ship missiles. You can argue that the intercepting missiles are too expensive, but the US is moving toward using more APKWS guided rockets against these strategic drone targets. Those only cost $30,000 each, and we already procure tens of thousands of them each year. The adaptation game is ongoing, but the short-range FPV drones seem quite durable, while the impact of the strategic slow-speed drones looks less sustainable.

Here is my original post.
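A quick back-of-envelope on Vernon’s cost point.  The interception rates below are the ones from his email; the unit costs are my own rough assumptions, not figures from the original post.

```python
def attacker_cost_per_hit(unit_cost, interception_rate):
    """Expected attacker spend per weapon that actually gets through."""
    return unit_cost / (1 - interception_rate)

# Interception rates from the email; unit costs are illustrative assumptions.
slow_drone     = attacker_cost_per_hit(50_000, 0.98)     # deep strikes, ~98% intercepted
cruise_missile = attacker_cost_per_hit(1_500_000, 0.50)  # assumed rate, for comparison

print(f"slow drone:     ${slow_drone:,.0f} per successful strike")      # $2,500,000
print(f"cruise missile: ${cruise_missile:,.0f} per successful strike")  # $3,000,000
```

At 98% attrition, the “cheap” drone’s cost per successful strike lands in the same range as an expensive missile’s, while each interception can cost the defender as little as a $30,000 APKWS rocket.  That is the sustainability argument in a nutshell.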

Marc Andreessen and I talk AI at an a16z American Dynamism event

a16z has released the talks from that event, and we are releasing ours too, as a bonus episode of CWT.  But note it is shorter than usual, and not in the typical CWT format — this was done for an audience of actual DC human beings!

Excerpt:

COWEN: Why is open-source AI in particular important for national security?

ANDREESSEN: For a whole bunch of reasons. One is, it is really hard to do security without open source. There are actually two schools of thought on information security, computer security broadly, that have played out over the last 50 years. There was one school of security that says you want to basically hide the source code. This seems intuitive because, presumably, you hide the source code precisely so that bad guys can’t find the flaws in it, right? Presumably, that would be the safe way to do things.

Then over the course of the last 30 or 40 years, basically, what’s evolved is the realization in the field (and I think very broadly) that actually, that’s a mistake. In the software field, we call that “security through obscurity,” right? We hide the code. People can’t exploit it. The problem, of course, is: okay, but that means the flaws are still in there, right?

If anybody actually gets to the code, they just basically have a complete index of all the problems. There’s a whole bunch of ways for people to get the code. They hack in. It’s actually very easy to steal software code from a company. You hire the janitorial staff to stick a USB stick into a machine at 3:00 in the morning. Software companies are very easily penetrated. It turned out, security through obscurity was a very bad way to do it. The much more secure way to do it is actually open source.

Basically, put the code in public and then basically build the code in such a way that when it runs, it doesn’t matter whether somebody has access to the code. It’s still fully secure, and then you just have a lot more eyes on the code to discover the problems. In general, open source has turned out to be much more secure. I would start there. If we want secure systems, I think this is what we have to do.

Marc is always in top form.
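The design philosophy Marc describes has a classical name in cryptography, Kerckhoffs’s principle: a system should remain secure even when everything about it except the key is public.  A minimal illustration (my example, not anything from the talk), using a published and widely reviewed cipher:

```python
# The algorithm and this source code can be completely public; the secrecy
# lives entirely in the key, so reading the code gives an attacker no way in.
from cryptography.fernet import Fernet  # published, heavily audited design

key = Fernet.generate_key()             # the only secret in the system
token = Fernet(key).encrypt(b"the payload")
assert Fernet(key).decrypt(token) == b"the payload"
```

Publishing the code also puts more eyes on it, so flaws get found and fixed rather than lying dormant, which is the other half of the argument above.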

Claims about compute

Hard for me to judge this one, but do not underestimate elasticity of supply.  And by the way, initial reports on Devin are very positive.

TikTok divestiture

I’ve blogged this in the past, and don’t have much to add to my previous views.  I will say this, however: if TikTok truly is breaking laws on a major scale, let us start a legal case with fact-finding and an adversarial process.  Surely such a path would uncover the wrongdoing under consideration, or at least strongly hint at it.  Alternatively, how about some research, such as, say, RCTs, showing the extreme reach and harmful influence of TikTok?  Is that asking for too much?

Now maybe all that has been done and I am just not aware of it.  Alternatively, perhaps this is another of those bipartisan rushes to judgment that we are likely to regret in the longer run.  In which case this would be filed under “too important to be left to legal fact-finding and science,” a class of issues which is sadly already too large.

Claude 3 Opus and AGI

As many MR readers will know, I don’t think the concept of AGI is especially well-defined.  Can the thing dribble a basketball “with smarts”?  Probably not.  Then its intelligence isn’t general.  You might think that kind of intelligence “doesn’t matter,” and maybe I agree, but that is begging the question.  It is easier and cleaner to just push the notion of “general” out of the center of the argument.  God aside (if such a being exists), intelligence never is general.

In the structure of current debates, the concept of “AGI” plays a counterproductive role.  You might think the world truly changes once we reach such a thing.  That means the doomsters will be reluctant to admit AGI has arrived, because imminent doom is not evident.  The Gary Marcus-like skeptics also will be reluctant to admit AGI has arrived, because they have been crapping on the capabilities for years.  In both cases, the stances on AGI tell you more about the temperaments of the commentators than about any capabilities of the beast itself.

I would say this: there is yet another definition of AGI, a historical one.  Five years ago, if people had seen Claude 3 Opus, would they have thought we had AGI?  Just as a descriptive matter, I think the answer to that question is yes, and better yet someone on Twitter suggested more or less the same.  In that sense we have AGI right now.

Carry on, people!  Enjoy your 2019-dated AGI.  Canada has to wait.

Concluding remarks: Forget that historical relativism!  True AGI never will be built, and that holds for OpenAI as well.  Humans in 2019 were unimaginative, super-non-critical morons, impressed by any piddling AI that can explain a joke or understand a non-literal sentence.  Licensing can continue and Elon is wrong.

The Continuing Influence of Fast Grants

Fast Grants, the rapid COVID funding mechanism created by Tyler, Patrick Collison and Patrick Hsu continues to inspire change around the world. Jano Costard, the Head of Challenges at SPRIND, the German Federal Agency for Disruptive Innovation writes:

Lots to learn from Fast Grants! Can we implement it in a public institution that faces a different set of rules (and legacy)? We tried with the Challenge program at the German Federal Agency for Disruptive Innovation, SPRIND, and succeeded, mostly.

While Fast Grants gave out grants in the first round within 48 hours, we haven’t been that speedy. Our last Challenge took 2 weeks and 2 days from deadline to final decision in a two-stage evaluation procedure. The last two days were spent on pitches, and the teams were informed of the decision the following night. So it is more comparable to the 2-week decision time Fast Grants had for later rounds.

During Covid, speed was of the utmost importance. But speed remains crucial now. Teams we fund have applications with other public funders that are still undecided after more than 2 years. These delays accumulate and matter even for pressing but slowly advancing threats like climate change. No cleantech solution that is still in the lab today will have a meaningful impact on achieving our climate goals for 2030! It’s not only the R&D that takes time; getting to meaningful scale quickly will be much harder. That’s why there is no time to waste at the start of the process.

Fast Grants had two important advantages when it came to implementation: private funds and limited legacy. Public institutions often face additional rules and procedures that slow down processes. But this is not inevitable.

For SPRIND Challenges, we implemented a funding mechanism that left room for unbureaucratic processes and provided solutions to challenges that public funders or procurers typically face. This mechanism, called pre-commercial procurement, was established by the European Commission in 2007 but had been used in Germany only once before we started using it in 2021. This, too, is due to legacy in processes. Institutions execute their work in part based on an implicit understanding of how things need to be, of what is allowed and what is not. This can lead them to ignore new and beneficial instruments just because “this can’t be true.” Even worse, if new mechanisms are adopted by an institution with a strong inherent understanding of what can and cannot work, they run the risk of being overburdened with previous processes and requirements. In the end, a funding mechanism is just a tool. It needs to be used right.

SPRIND had the benefit of being a newly established public institution with important liberties to do things differently, and it is led by a director, @rafbuff, who, at the time, had no experience in the public sector. So, did we find the ultimate approach to research and innovation funding with SPRIND Challenges? Certainly not! Improvements are necessary but sometimes hard to achieve (looking at you, state-aid law!).

Impressive! And check out SPRIND, they are funding some interesting projects!

Approaching Human-Level Forecasting with Language Models

Forecasting future events is important for policy and decision making. In this work, we study whether language models (LMs) can forecast at the level of competitive human forecasters. Towards this goal, we develop a retrieval-augmented LM system designed to automatically search for relevant information, generate forecasts, and aggregate predictions. To facilitate our study, we collect a large dataset of questions from competitive forecasting platforms. On a test set published after the knowledge cut-offs of our LMs, we evaluate the end-to-end performance of our system against the aggregates of human forecasts. On average, the system nears the crowd aggregate of competitive forecasters, and in some settings surpasses it. Our work suggests that using LMs to forecast the future could provide accurate predictions at scale and help to inform institutional decision making.

That is from a new paper by Danny Halawi, Fred Zhang, Chen Yueh-Han, and Jacob Steinhardt.  I hope you are all investing in that charisma…
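The system described in the abstract maps onto a simple retrieve-predict-aggregate loop.  Here is a schematic sketch of that shape; `search_news`, `llm`, and the prompt are placeholders of mine, not the authors’ implementation.

```python
from statistics import median

def forecast(question: str, search_news, llm, n_samples: int = 5) -> float:
    """Retrieval-augmented forecasting: search, reason, predict, aggregate."""
    # 1. Retrieve recent articles relevant to the question.
    articles = search_news(question, max_results=10)
    context = "\n\n".join(a["summary"] for a in articles)

    # 2. Ask the LM for a probability, several times at nonzero temperature.
    prompt = (
        f"Question: {question}\n\nRelevant news:\n{context}\n\n"
        "Reason step by step, then give the probability that the event "
        "occurs as a single number between 0 and 1 on the final line."
    )
    samples = [float(llm(prompt, temperature=0.7).strip().splitlines()[-1])
               for _ in range(n_samples)]

    # 3. Aggregate the sampled forecasts; the median is robust to outliers.
    return median(samples)
```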

GPT as ethical advisor

This study investigates the efficacy of an AI-based ethical advisor using the GPT-4 model. Drawing from a pool of ethical dilemmas published in the New York Times column “The Ethicist”, we compared the ethical advice given by the human expert and author of the column, Dr. Kwame Anthony Appiah, with AI-generated advice. The comparison is done by evaluating the perceived usefulness of the ethical advice across three distinct groups: random subjects recruited from an online platform, Wharton MBA students, and a panel of ethical decision-making experts comprising academics and clergy. Our findings revealed no significant difference in the perceived value of the advice between human-generated and AI-generated ethical advice. When forced to choose between the two sources of advice, the random subjects recruited online displayed a slight but significant preference for the AI-generated advice, selecting it 60% of the time, while the MBA students and the expert panel showed no significant preference.

That is a 2023 piece by Christian Terwiesch and Lennart Meincke, via the excellent Kevin Lewis.  And here is my earlier 2019 CWT with Dr. Kwame Anthony Appiah.
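A side note on the statistics: whether a 60/40 split is “slight but significant” depends entirely on sample size, which the abstract does not report.  A quick check with illustrative sample sizes:

```python
from scipy.stats import binomtest

# How many subjects does a 60/40 preference need before it is statistically
# distinguishable from a coin flip? (Illustrative n's; the abstract gives none.)
for n in (50, 100, 200):
    k = round(0.60 * n)  # subjects choosing the AI-generated advice
    p = binomtest(k, n, p=0.5).pvalue
    print(f"n={n}: {k}/{n} prefer AI, two-sided p = {p:.3f}")
# Roughly p ≈ 0.20 at n=50, ≈ 0.06 at n=100, ≈ 0.006 at n=200, so the
# reported significance implies a sample of a couple hundred or more.
```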

Daniel Gross on the printing press and GPT

In a way, everyone’s been wondering, trying to analogize ChatGPT with the printing press, but in reality it’s almost the opposite.

The entire thing is happening in the inverse of that, where the printing press was a technology to disseminate information through a book basically and convince people to do things, and the kind of anti-book is the LLM agent, which summarizes things very succinctly. If anything, it awakens people to the fact that they have been complicit in a religion for a very long time, because it very neatly summarizes these things for you and puts everything in latent space and suddenly you realize, “Wait a minute, this veganism concept is very connected to this other concept.” It’s a kind of Reformation in reverse, in a way, where everyone has suddenly woken up to the fact that there’s a lot of things that are wrong…

So yeah, it takes away all the subtlety from any kind of ideology and puts it right in your face, and yeah, people are having a reaction to it.

That is from the Ben Thompson (gated) interview with Daniel and Nat Friedman, self-recommending.

Your friendly AI assistant (it’s happening)

Klarna’s AI assistant, powered by @OpenAI, has in its first 4 weeks handled 2.3 million customer service chats, and the data and insights are staggering:

– Handles 2/3 of our customer service enquiries
– On par with humans on customer satisfaction
– Higher accuracy, leading to a 25% reduction in repeat inquiries
– Customers resolve their errands in 2 minutes vs. 11 minutes
– Live 24/7 in over 23 markets, communicating in over 35 languages

It performs the equivalent job of 700 full-time agents…

Link here.
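The 700-agent figure is easy to sanity-check against the chat volume.  A quick back-of-envelope; the per-day workload is my inference, not Klarna’s.

```python
# Workload implied by "2.3M chats in 4 weeks = the work of 700 agents".
chats, weeks, agents = 2_300_000, 4, 700
per_week = chats / weeks / agents
print(f"{per_week:.0f} chats per agent per week")     # ~821
print(f"{per_week / 5:.0f} chats per agent per day")  # ~164, at 5 days/week
# ~164 chats a day is a full shift at a few minutes per chat, consistent
# with the 2-minute resolution time quoted above.
```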