The case against the import of GPT-3

From an email from Agustin Lebron, noting that I will impose no further indentation:

“One thing that’s worth noting:

The degree of excitement about GPT-3 as a replacement for human workers, or as a path to AGI, is strongly inversely correlated with:

(a) How close the person is to the actual work. If you look at the tweets from Altman, Sutskever and Brockman, they’re pumping the brakes pretty hard on expectations.
(b) How much the person has actually built ML systems.

It’s a towering achievement to be able to train a system this big. But to me it’s clearly a dead-end on the way to AGI:

– The architecture itself is 3 years old: https://arxiv.org/abs/1706.03762. It is not an exaggeration to say that GPT-3’s architecture can be described as “take that 2017 paper and make 3 numbers (width, # layers, # heads) much bigger”. The fact that there hasn’t been any improvement in architecture in 3 years is quite telling.

– In the paper itself, the authors clearly say they’re quite near fundamental limits in being able to train an architecture like this. GPT-3 isn’t a starting point, it’s an end-point.

– If you look at more sober assessments (http://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html, https://minimaxir.com/2020/07/gpt3-expectations/), without the tweet selection bias, it starts to look less impressive.

– Within my fairly heterogeneous circle of ML-expert friends, there’s little disagreement about dead-end-ness.

The most interesting thing about GPT-3 is the attention and press that it’s gotten. I’m still not sure what to make of that, but it’s very notable.

Again, it’s incredibly impressive and also piles of fun, but I’m willing to longbet some decent money that we’re not replacing front-end devs with attention-layer-stacks anytime soon.”

I remain bullish, but it is always worth considering other opinions.

Comments

Well, yeah.

Thankfully Tyler is faster than most of us to recognize when he's made a mistake or misjudgment, or at least the possibility that he's done so (it's not clear whether Tyler agrees with that email or not).

Tyler still thinks blockchain, Ethereum, and Vitalik Buterin are going to save mankind with cryptocurrency when it's all really just a solution waiting for a problem at best--or a pump-and-dump scheme at worst. If Tyler had been raising the status of cutting-edge biomedical researchers over the last 5 years instead of shitcoins, maybe we would have had a better response to covid.

Can GPT-3 tell us if critical race theory is a religion, and if so can it tell us how to rescue people from its talons?

Now THAT would be useful.

Making up cute little stories not so much.

Using GPT-3, gwern graciously generated for me two versions of Robot Steve Sailer's review of Robin DiAngelo's "White Fragility" (because I was unable to force myself through the book):

https://twitter.com/gwern/status/1286105615249612807

The second try is better than the first. I'd say GPT-3 is better at imitating Ms. DiAngelo's prose style than mine, but I suspect most writers would say the same thing. I can imagine that if Gore Vidal and William F. Buckley were still around they'd simultaneously bring out essays about how GPT-3, while it's bad at imitating them, really gets to the essence of the vapidity of their arch-rival's huffed up prose style.

It has mostly to do with how predictable you are. If every article is about stifling political correctness, coastal elites, and the managerial-professional class with an occasional splash of "long march through the institutions," GPT-3 can probably imitate that style pretty easily. More creative writers who have eclectic interests will be AI-proof for now.

Likewise, I bet GPT-3 could imitate Bernie Sanders' (in fairness, probably most politicians') interview and townhall answers pretty well. He is very disciplined in his word choice and the sorts of topics he is willing to speak on so that is perfect material for a chatbot to work with.

Quantity counts too. A chatbot would likely be able to do a good job imitating Trump's style at this point.

John McWhorter remarks about how limited President Trump's vocabulary is and how few phrases he uses.

Being more proof against AI by not really having either a recognizable style and voice or consistent ideas and themes is probably a bit of a Pyrrhic victory in the world of writing.

I think you can have a recognizable style and voice as well as consistent ideas without being boring and predictable. Good writers do things like connect ideas that no one thought to connect before, vary their word choice, bring in cultural or literary references and even tell jokes. ML algorithms are particularly bad with humor and people have commented that GPT-3 hasn't made significant advances in this area.

Foxes and hedgehogs again... Although I suspect ML will probably not have too hard a time with those sorts of "showing off" essays that connect random and obscure ideas without too much of a wider argument. There are some predictable patterns in those.

Wait, so you drop an Isaiah Berlin reference while denouncing "showing off" in writing? I'm not sure if we are disagreeing.

One of the interesting things it seems like you could do here is separate style from content - write up an essay with Sailer-themed content in DiAngelo's prose style. You could then actually add a bit of testability to claims that X is a particularly bad or good stylist, without the loading of a particular writer's content choices biasing readers.

(For bonus lols, sample unpopular politician speeches and drop 'em through a popular-politician style filter and see how much difference it makes to subjective evaluations...)

Does a Christopher Hitchens lecture hold up when delivered in the voice and style of Gilbert Gottfried? I suspect not.

Hey, that's my wife's car! B*tch!!!

GPT-3 is getting a lot of attention because the API lets a lot of software engineers experiment with it and see what it's like. There has arguably been previous research of comparable performance, but the only people who get to use it are working in AI labs at Google, Facebook, etc. None of those research labs have made their product available to external software engineers via API.

I don't know if GPT-3 itself will become a promising product. But the situation reminds me of Xerox PARC. Some exciting research is well-known at the top corporate research lab. But it hasn't broken out into a popular consumer product. Maybe the research isn't there yet... or maybe the corporate research lab isn't the right place for the research to turn into new products, and to really grow, the technology needs to get into the hands of some startup that is still hungry to succeed.

Agree. The lesson of 20 years of developing consumer software products: build platforms that users can innovate on. GPT-3 adopted this approach for ML. Users are innovating by experimenting with inputs and the media are selecting for the most compelling. Darwinian evolution in everything.

Just like praising the taxpayer-funded NHS system for its world-leading response in handling the pandemic, a hallmark of true integrity is realizing when one was simply uninformed about a subject, and trying to correct a previous false impression based on that lack of information. Though it is amusing to think how AI could replace humans in making grants within the framework of effective altruism, GPT-3 is clearly not up to the task.

It would not take a very good AI to award a prize to the NHS Recovery Trial for providing the first proven fatality-reducing treatment for COVID-19. Being first is pretty easy to recognize. GPT-3 may even have an unfair advantage, with its pharmacology-tutor ability.

", a hallmark of true integrity is realizing when one was simply uninformed about a subject"

prior, are you talking about your own posting history here?

Chalk it up as one extra piece in the puzzle of what things we can get computers to do. We've made huge progress on computer vision and now we have very significant progress on NLP. If anything, these might cause people to reassess which tasks are truly mindless.

Yawn.

I’ve been hearing “but the architecture hasn’t changed in X years,” and “this idea is simply applying 50-year-old math with more horsepower, that will never work,” and “we need a whole new approach to make progress” since the start of the ML revolution. Every prediction of these people has been wrong.

Applying more horsepower has been shockingly effective. I expect it will continue to be.

Do you understand that this is essentially a blank slate argument? That is, you are arguing that simple nodes, and enough of them, will spontaneously form intelligence?

I think not. This is why some people drop the zinger that such an idea is based on a 1948 view of the brain.

We now know that the brain has a lot of specialized structures and subsystems that handle things like vision and physics in very specialized ways.

It takes a certain kind of audacity to think that a grown neural network can suddenly understand color in a shadow, or grasp that the robot it sees in the street is actually a Halloween costume.

To think it's just "horsepower."

Say these words: “My model and my predictions have been wrong for the last 10 years. Completely and shockingly wrong. But this time is different; I’ll be right for sure.”

But that's not true. It is not surprising to me at all that ML algorithms have scaled specifically at what they do.

But it's not like a car with more horsepower becomes a space ship. In both cases there are other concerns for the "harder problem" that are not even addressed.

I would enjoy firing you.

Have you ever heard of "fu money"?

I'm going to drop this here, but I think it has broad applications for the whole page, and how we process the "opinion" of Agustin Lebron above:

Are we processing his text syntactically or semantically?

That is, do we think about whether the (syntactic) text is smooth, and he sounds like an expert? Should it be set in opposition to how smooth or expert a dissenting opinion seems?

Or, can we build a (semantic) mental model of the ML system and what it's doing? Can we look at the antagonistic models structurally, architecturally, as well?

(When a person says a thing, can we drop the person and understand the thing?)

I think the way human brains really work, the way AGI must really work, is that we can build fully semantic adversarial arguments and compare them ... according to the rules of the universe.

To get really geeky for a moment and step back a level or two, antagonistic arguments on the Group Mind of the internet are *only* useful when they are processed semantically.

By content, and certainly not with reductions like "whose status is raised."

Given that Elon has left OpenAI due to declining "transparency" so it can get venture capital, my guess is it will become less relevant, needing to limit progress to create scarcity to be able to charge prices higher than "free".

Elon had by then fully understood that AI can't build factories, and without factories you have nothing of real value.

The direction of OpenAI seems to be down the same dead end that led to the US being so dependent on Asia, and to a lesser degree Europe, for "high tech" goods, and all the claims of IP theft driving Trump et al to disrupt supply chains and, in my view, strengthen China and weaken the US.

AI is a useful part of a product you manufacture, but without the factories you build to build factories, AI is almost worthless. You will always be hostage to those building the factories you need to run any AI.

Just remember, if AI and robots replace workers, either everything will be free, or the robots will buy all production and AI will build products to suit the needs and desires of AI, not humans.

People often bring up selection bias in what samples get publicized, but I don't think it would be that big of a leap to incorporate that selection bias in the model itself. The human brain does something like this already - unconscious processing does all sorts of work, but it's only surfaced to the conscious level when it's considered "good enough".

Sure, that's supervised learning. The question is how domain-specific the algorithm used for learning is.
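
For what it's worth, the "only surface what's good enough" idea amounts to something like best-of-n sampling with an external scorer bolted on. A minimal sketch in Python, where generate() and score() are hypothetical stand-ins (nothing here is a real GPT-3 API):

import random
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str], float],
              n: int = 16,
              threshold: float = 0.5) -> List[str]:
    # Draw n candidate completions, then surface only the ones that clear
    # the quality bar, best first. "generate" stands in for a language-model
    # sampling call and "score" for whatever judgment (reranker, heuristic,
    # human curation) decides what gets shown publicly.
    candidates = [generate(prompt) for _ in range(n)]
    keepers = [c for c in candidates if score(c) >= threshold]
    return sorted(keepers, key=score, reverse=True)

# Toy usage with placeholder functions, purely illustrative.
dummy_generate = lambda p: p + " " + random.choice(["foo", "bar", "baz"])
dummy_score = lambda text: 1.0 if "bar" in text else 0.0
print(best_of_n("GPT-3 writes:", dummy_generate, dummy_score, n=8))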

I work in this area, although I haven't got a demo yet.
The point of pursuing this architecture with many more parameters is that many people presumed that the benefit of adding parameters would level off; instead it has risen perfectly linearly with parameters (please read Gwern's post on this). It is true that GPT-3 doesn't innovate on architecture, but believe me, the world does not lack DNN architecture innovation. There are hundreds of papers every year introducing new structures for memory, attention, binding, inference and reasoning into connectionist structures. Here's a recent one I like, but the important thing is to look at the sheer volume and pace of the related work.

https://www.aaai.org/Papers/AAAI/2020GB/AAAI-MinerviniP.4499.pdf

The very richest places (like OpenAI) are correctly using their resources to test the advantages of pure scaling, but every other little lab (and there are hundreds of them) is working on efficiencies and architecture.

How much of a step it is towards "AGI" is a tricky question because people have vastly different conceptions of what the "G" means. But replacing front-end devs? Or copy-writers? I see no reason why low-end online freelancers aren't already in competition with this.

Some of the pushback from the community comes from the fact that this has been on the way for at least five years, without getting any attention, and now suddenly everyone turns to it and asks "is this AGI!?". Of course not. But it isn't a single machine that one company methodically built up by stacking TPUs, it's the bow wave of a huge and varied global research movement.
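
For the scaling claim above, the usual reference is the power-law fit from Kaplan et al. (2020), "Scaling Laws for Neural Language Models." A minimal sketch, using approximate constants from that paper purely to show the shape of the trend, not exact GPT-3 figures:

# Approximate parameter-count scaling law from Kaplan et al. (2020):
# loss(N) ~ (N_c / N) ** alpha_N, with N the non-embedding parameter count.
# The constants below are rough values reported in that paper.
N_C = 8.8e13
ALPHA_N = 0.076

def predicted_loss(n_params: float) -> float:
    # Predicted test loss (nats per token) for a model with n_params parameters.
    return (N_C / n_params) ** ALPHA_N

for n in [1.2e8, 1.5e9, 1.3e10, 1.75e11]:  # roughly GPT-2 small up to GPT-3 scale
    print(f"{n:.1e} params -> predicted loss ~ {predicted_loss(n):.2f}")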

A "front-end dev" at a minimum can take a sketch of a page layout and connect it to a back end database. For instance below. We are back to that syntactic to semantic jump. How does the zip code entry field connect to the Air Quality graphic? Is it something you can "train?"

https://www.airnow.gov/?city=Huntington%20Beach&state=CA&country=USA

(I do not live in Huntington Beach, location used for illustration.)

I agree with most of this, but in the broadest possible sense, the other "selection bias" that happens here is simply the weighting towards optimistic prognostications about AGI.

First, researchers are almost by definition over-optimistic about their ideas. I'm finishing up my PhD, and my advisor is *wildly* over-optimistic about each new idea he tries. If you took his statements as forecasts, they would be terrible, but that's not the point. The fact is, the overwhelming majority of things you try are going to fail; that's why cutting-edge quantitative research is hard. Maybe there are some perfectly rational calculators out there who can grapple with that, but starting each new project with the technically correct assessment that it is overwhelmingly unlikely to succeed is crippling for most talented researchers.

Then, there's the loudness of voices in the public space. Most of us here aren't really qualified to comment directly on the progress of AGI. I'm finishing my PhD in statistics so I'm closer than the overwhelming majority, and yet as I don't specifically work on these systems I don't have much insight beyond the experts.

And the incentives are much stronger for optimistic voices. Like, there's not much of a career to be had in bland predictions like "Eh, I think we're going to face a bunch of unexpected challenges, and AGI is much further off than people think". Now, anecdotally, I find that is a pretty common sentiment among ML researchers that I talk to! But there's little reason that voice would be amplified, whereas there's lots to discuss for the minority with strong opinions that it is going to succeed.

Now, there are some fair counterpoints. Like, that broad sample of ML researchers with bland but slightly pessimistic views on AGI are working on closely related systems, but they by and large aren't working directly on the AGI stuff itself. So maybe they just aren't qualified to comment. Of course, the head honcho researchers of these projects tend to be a bit more circumspect, but maybe that's just politics (what's the advantage to sticking your neck out there for a tricky position).

But broadly, the optimism you hear in this popular discourse about AGI vastly outpaces the day-to-day discussions I hear among actual ML researchers. But there is zero reason for anyone to amplify the voice of some researcher working steadily away at his academic job in some adjacent ML space who is a bit pessimistic about the track towards AGI.

Speaking as "just a programmer" I think anyone with experience looking at "what the code is really doing" can read Rodney Brooks and understand the differences between approaches, as well as why he says, not only haven't we built an AGI, we still don't know how to begin.

Somewhat random entry point to Brooks' blog:

https://rodneybrooks.com/agi-has-been-delayed/

Again, it’s incredibly impressive and also piles of fun, but I’m willing to longbet some decent money that we’re not replacing front-end devs with attention-layer-stacks anytime soon.

Yes.

My (current) take: GPT-3 is both Rubicon and Waterloo.

Lacker's blog contains this entry:

Q: Which colorless green ideas sleep furiously?
A: Ideas that are colorless, green, and sleep furiously are the ideas of a sleep furiously.

As we know, "colorless green ideas sleep furiously" is a sentence that Noam Chomsky created back in the 1950s (1957? the year of Sputnik?) to make the point that syntax and semantics are orthogonal. The sentence is syntactically fine, but semantic nonsense.

Some years later poet John Hollander wrote a poem entitled "Coiled Alizarin":

Curiously deep, the slumber of crimson thoughts:
While breathless, in stodgy viridian,
Colorless green ideas sleep furiously.

Humans evolved from much simpler animals over a few hundred million years. It took many random evolutionary changes over millions of lifetimes.

AI will evolve much faster. The generations are shorter, the changes are better than random, and an enormous variety of possibilities are being tested.

GPT-3 is another step along the path to machine AGI that will surpass human general intelligence, but it might take 100 years to do so.

The brain is not magic. Engineers and scientists can build something similar. Machines already play Chess and Go better than humans. They will surpass us at one task after another.

The analogy with evolution breaks down a bit because, with machine learning, you need smart people tweaking algorithms and setting parameters according to the specific problem that the algorithm is designed to solve. The human engineers, unlike the computer, understand the context and can choose an appropriate algorithm and parameterization based on that knowledge and understanding.

In the case of supervised learning, you also need intelligent humans to assemble and curate the training data that is fed into the algorithm.

In other words, this is clearly intelligent design and not evolution. It is just anthropomorphism that tempts us into thinking that doing certain tasks such as arranging words in a sentence based on some clever algorithms and reams of training data is close to intelligence. It is no more intelligent than an algorithm that determines the prime factors of a number. Whatever intelligence is, it is more than the ability to complete a few tasks.

An analogy from biology that may be worth exploring is that of mimicry. In mimicry, one animal takes on superficial characteristics of another in order to attract prey or scare off predators. However, it doesn't actually become the other animal and lacks many important strengths and characteristics of the animal it is mimicking.

So far, AI applications are intelligently designed programs that mimic certain characteristics of humans. The mimics may even fool some humans, just as animals in the natural world get tripped up by mimicry that is the product of natural selection. Hopefully, by the time we have much more sophisticated algorithms and applications, we will also have better definitions of "intelligence" so we know when or if we have crossed the threshold into real intelligence.

Could someone in this area of work explain to me why none of the computer game companies (to my knowledge) have tried to use any of these technologies? Games like Civilization for example would seem to be desperately in need of better AI...but have not moved forward in 20 years. It seems like low hanging fruit...but obviously(?) not. Anyone know why?

Millions of concurrent users require extremely high throughput. And so game AI relies on "simple" and "efficient" algorithms.

https://youtu.be/6VBCXvfNlCM

Well, I intentionally picked Civilization as an example of a game where there aren't millions of concurrent users and there is no need for speedy response. People play with hours in between moves or alone at home. But I take your point. It still seems like game developers would be pursuing this in order to protect their own futures... and I think that they might find a market for people willing to play against an intelligent AI.

I just finished my PhD in ML. I agree with the post; however, the interesting and amazing points of GPT-3 are (1) continued scaling in performance with scaling of parameters, and (2) the engineering (and financial) challenges of training such large models. I think the mainstream media are over-hyping it thanks to a combination of good performance, great results, the inherent OpenAI mission, and a lack of context around such models. We are in fact reaching a computational bottleneck in deep learning; let's see what the ML community proposes next :)

Sounds like professional jealousy to me. Of course the ideas are simple, but the fact that we can get such impressive performance with such a simple model is actually the point. It means progress towards AI is probably easier than we thought. There has always been this claim that we don't see real progress in AI because we never had enough computing power to throw at it. It seems this claim was broadly confirmed by GPT-3.

This is an area I know quite a bit about, since I work directly in research and development in this area (with high probability, you, the reader, use my products). I agree with the post in general. I don't care about the architecture comments. In sum, the set of tasks we can do effectively will expand with these advances over time. Systems like your Google Assistant will have lower error rates, for example. But don't hail our new robot overlords just yet.

Lebron's comments are wrong.

Sutskever et al are most certainly not changing their mind about the prospects for AI scaling. What they are gingerly criticizing is the slight bit of hype that GPT-3, as-is, will immediately start dumping out reams of perfect code putting developers out of work. Obviously, it will not. Equally obviously, future systems will be far better, and those might.

His first point is so wrong I would describe it as a brazen lie if Lebron was any kind of DL expert. GPT is in fact obsolete, and this is a point in its favor, not against, as he somehow twists it. There have been constant advances in Transformer tech. (How does Lebron get this so wrong when the very section of the paper he cites as proof that scaling has ended is about outlining the ways forward like, oh say, bidirectional training?)

Lebron cites approvingly Lacker's post, but omits any argument for why Lacker's poor use of GPT-3 shows anything. He also cites Max Woolf's post, which if you read through, makes such cutting criticisms as API calls taking a bit of time. Wow. Devastating stuff. I'm sure decades from now, we will still be held back by GPT-3 API calls taking a few too many milliseconds. That's how all of this works, right?

Lebron cites his 'circle of ML friends'. Would this be the same circle which did not predict GPT-3, thinks that Sutskever et al think scaling is a dead end, and which is unaware of all the Transformer improvements like bidirectionality and work on linear attention? If so, so much the worse for Lebron and his friends. Perhaps he should get better friends.

Verdict: Lebron should spend less time retailing obsolete Twitter and Reddit submissions, and more time reading Arxiv papers.

Okay, so I'm just a programmer, but I think I read Brooks and understand him. At the bottom of this page he puts a table, showing what he sees as the strengths and weaknesses of the Symbolic, Neural, Robotic and Behavioral approaches:

https://rodneybrooks.com/forai-steps-toward-super-intelligence-i-how-we-got-here/

I read Lebron as concurring: not just that Neural doesn't have it all, but that no approach seems to right now. I also get that Neural approaches will get better. Perhaps you think a lot better.

But are you saying that "better" or "more horsepower" in their direction of growth will give them the kind of emergent behaviors (in abstract symbolic manipulation, in planning) we need for AGI?

Or are you saying that they will get better, perhaps lots better, in their domain?

Maybe Damon Olsbot is just bored and gone, but fwiw I think that an AI expert should be able to explain, to the level of a programmer, why neural nets do or do not have the ability to "jump" not just to better learning, but to wholly different kinds of behaviors.

Consider: You have a 4 gallon bucket and a 1 gallon bucket, how do you measure out 7 gallons?

I don't think any Neural solution can do this on its own. AI "services" like Alexa might, but only if the human operators have mapped the question into a solution architecture explicitly. Otherwise "I don't know how to do that."
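
For contrast, here is a minimal sketch of the kind of explicit symbolic search that solves the bucket puzzle outright, tracking the two buckets plus a running total dispensed into some larger container; the point being drawn above is that a pure neural sequence model has no such search procedure built in:

from collections import deque

CAP_A, CAP_B, GOAL = 4, 1, 7  # 4-gallon bucket, 1-gallon bucket, target of 7 gallons

def successors(state):
    a, b, out = state
    yield "fill A", (CAP_A, b, out)
    yield "fill B", (a, CAP_B, out)
    yield "dispense A", (0, b, out + a)   # empty bucket A into the target container
    yield "dispense B", (a, 0, out + b)
    t = min(a, CAP_B - b)                 # pour A into B until B is full or A is empty
    yield "pour A->B", (a - t, b + t, out)
    t = min(b, CAP_A - a)
    yield "pour B->A", (a + t, b - t, out)

def solve():
    # Breadth-first search over states (gallons in A, gallons in B, gallons dispensed).
    start = (0, 0, 0)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if state[2] == GOAL:
            return plan
        for action, nxt in successors(state):
            if nxt[2] <= GOAL and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None

print(solve())  # e.g. ['fill A', 'dispense A', 'fill A', 'pour A->B', 'dispense A']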

“...take that 2017 paper and make 3 numbers (width, # layers, # heads) much bigger”.

This makes GPT-3 *more* important, not less. I (and others in software) were very surprised that merely bumping these numbers up resulted in such an improvement. My assumption was that we'd hit diminishing returns much sooner than we actually did with this technique.
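
To put rough numbers on "bumping these numbers up": the standard back-of-the-envelope estimate for a Transformer's non-embedding parameter count is about 12 x layers x width squared. The sketch below applies it to commonly cited figures for the 2017 base model and GPT-3 175B; it ignores embeddings and the 2017 model's encoder-decoder split, so treat the output as illustrative only:

def approx_transformer_params(n_layers: int, d_model: int) -> int:
    # Rough non-embedding count: ~4*d^2 attention weights plus ~8*d^2
    # feed-forward weights (with the usual 4x hidden expansion) per layer.
    return 12 * n_layers * d_model ** 2

# Commonly cited configurations; attention heads are omitted because,
# for a fixed width, the head count does not change the parameter count.
configs = {
    "Transformer base (2017)": dict(n_layers=6, d_model=512),
    "GPT-3 175B (2020)": dict(n_layers=96, d_model=12288),
}
for name, cfg in configs.items():
    print(f"{name}: ~{approx_transformer_params(**cfg):,} parameters")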

An improvement, by what measure? Is that a relevant measure to predict real life usefulness? Commercial viability?

There has been no progress towards artificial general intelligence (AGI) in the last 30 years. Machine learning is a high-dimensional non-linear interpolation method, which interpolates well only in spaces where it has learned from high-density data. Given how it interpolates, it necessarily extrapolates poorly, which is dangerous in many applications, but more profoundly makes it incapable of learning by analogy from one context to another, which surely must be an important part of any general intelligence. Machine learning not only isn't approaching AGI; in its failures it continues to demonstrate that it will never be a path to AGI. This, of course, is why it was abandoned in the 1990s by AI researchers. Machine learning is a wonderful thing, capable of many feats of pattern recognition in any limited and well-defined context. It is not and will never be artificial intelligence. My only problem with the field of research that should be labelled Machine Learning is that it adopted the AI misnomer.
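
The interpolation-versus-extrapolation claim is easy to illustrate on a toy problem. The sketch below (using scikit-learn, chosen only as a convenient stand-in for "a neural network," not as anything GPT-3-specific) fits a small network to a sine wave on a limited interval and then queries it outside that interval, where the fit typically falls apart:

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Train only on x in [-pi, pi]: dense data there, nothing beyond it.
x_train = rng.uniform(-np.pi, np.pi, size=(2000, 1))
y_train = np.sin(x_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(x_train, y_train)

# Inside the training range the fit is close; outside it, it usually is not.
for x in [0.5, 2.0, 4.0, 8.0]:
    pred = model.predict([[x]])[0]
    print(f"x={x:4.1f}  true={np.sin(x):+.2f}  predicted={pred:+.2f}")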

An amusing comment on Twitter:

"GPT-3 has intensified my fear of being governed by machines that are always slightly wrong about the most important things, but in so subtle ways that human intellects are usually not good enough to spot the differences."

https://twitter.com/Plinz/status/1286489101005283330?s=19

Maybe I'd worry about the "slightly."

I’m worried about the enthusiastic people thinking that the current examples of what GPT-3 produces are anywhere near acceptable replacements for e.g. documentation, search engines or user support.

GPT-3 basically remixes *everything* ever written. Including everything false or unacceptable. Its products are thus necessarily unreliable in terms of truthfulness or usefulness. It doesn't have, and cannot from its knowledge determine, a measure to judge such things with.

As long as it isn’t combined with separate algorithms and machinery to assign truth and utility values to assertions (utility with respect to what goals? Hah!), I doubt it will become useful.

On the other hand, God help us all if someone manages to embody GPT-3 in such a way, without regard for AI safety concerns. Then we’re probably doomed. GPT-3 directly interacting with the real world, being able to run experiments against it and having *some* goal that moves it to do so may well develop into an AGI.

Yep, that's one of the greatest dangers of over-hype as with this GPT-3 stuff: some people might actually believe the over-hype and assign important tasks to GPT-3 or some other fancy ML-based product that they were snowed into buying.
