My excellent Conversation with Seth Godin
Here is the audio, video, and transcript from a very good session. Here is part of the episode summary:
Seth joined Tyler to discuss why direct marketing works at all, the marketing success of Trader Joe’s vs Whole Foods, why you can’t reverse engineer Taylor Swift’s success, how Seth would fix baseball, the brilliant marketing in ChatGPT’s design, the most underrated American visual artist, the problem with online education, approaching public talks as a team process, what makes him a good cook, his updated advice for aspiring young authors, how growing up in Buffalo shaped him, what he’ll work on next, and more.
Here is one excerpt:
COWEN: If you were called in as a consultant to professional baseball, what would you tell them to do to keep the game alive?
GODIN: [laughs] I am so glad I never was a consultant.
What is baseball? In most of the world, no one wants to watch one minute of baseball. Why do we want to watch baseball? Why do the songs and the Cracker Jack and the sounds matter to some people and not to others? The answer is that professional sports in any country that are beloved, are beloved because they remind us of our parents. They remind us of a different time in our lives. They are comfortable but also challenging. They let us exchange status roles in a safe way without extraordinary division.
Baseball was that for a very long time, but then things changed. One of the things that changed is that football was built for television and baseball is not. By leaning into television, which completely terraformed American society for 40 years, football advanced in a lot of ways.
Baseball is in a jam because, on one hand, like Coke and New Coke, you need to remind people of the old days. On the other hand, people have too many choices now.
COWEN: What is the detail you have become most increasingly pessimistic about?
GODIN: I think that our ability to rationalize our lazy, convenient, selfish, immoral, bad behavior is unbounded, and people will find a reason to justify the thing that they used to do because that’s how we evolved. One would hope that in the face of a real challenge or actual useful data, people would say, “Oh, I was wrong. I just changed my mind.” It’s really hard to do that.
There was a piece in The Times just the other day about the bibs that long-distance runners wear at races. There is no reason left for them to wear bibs. It’s not a big issue. Everyone should say, “Oh, yeah, great, done.” But the bib defenders coming out of the woodwork, explaining, each in their own way, why we need bibs for people who are running in races — that’s just a microcosm of the human problem, which is, culture sticks around because it’s good at sticking around. But sometimes we need to change the culture, and we should wake up and say, “This is a good day to change the culture.”
COWEN: So, we’re all bib defenders in our own special ways.
GODIN: Correct! Well said. Bib Defenders. That’s the name of the next book. Love that.
COWEN: What is, for you, the bib?
GODIN: I think that I have probably held onto this 62-year-old’s perception of content and books and thoughtful output longer than the culture wants to embrace, the same way lots of artists have held onto the album as opposed to the single. But my goal isn’t to be more popular, and so I’m really comfortable with the repercussions of what I’ve held onto.
Recommended, interesting throughout. And here is Seth’s new book The Song of Significance: A New Manifesto for Teams.
It was just a simulation run…designed to create that problem
Meaning an entire, prepared story as opposed to an actual simulation. No ML models were trained, etc. It's not evidence for anything other than that the Air Force is now actively considering these sorts of possibilities.
— Edouard Harris (@harris_edouard) June 1, 2023
Please don’t be taken in by the b.s.! The rapid and uncritical spread of this story is a good sign of the “motivated belief” operating in this arena. And if you don’t already know the context here, please don’t even bother to try to find out, you are better off not knowing. There may be more to this story yet — context is that which is scarce — but please don’t jump to any conclusions until the story is actually out and confirmed.
Funny how people accuse “the AI” of misinformation, right?
Addendum: Here is a further update, apparently confirming that the original account was in error.
The art of prompting is just at its beginning
If prompted correctly, even GPT 3.5 can achieve draws against Stockfish 8 (an older but very powerful Chess engine), empirically demonstrating how LLMs are reasoning engines, not just text generation engines. https://t.co/an9eY27Re9
— Siqi Chen (@blader) May 29, 2023
And here are some results for Minecraft. I would like to see confirmations, but these are credible sources and this is all quite important if true.
Playing repeated games with Large Language Models
They are smart, but not ideal cooperators it seems, at least not without the proper prompts:
Large Language Models (LLMs) are transforming society and permeating into diverse applications. As a result, LLMs will frequently interact with us and other agents. It is, therefore, of great societal value to understand how LLMs behave in interactive social settings. Here, we propose to use behavioral game theory to study LLM’s cooperation and coordination behavior. To do so, we let different LLMs (GPT-3, GPT-3.5, and GPT-4) play finitely repeated games with each other and with other, human-like strategies. Our results show that LLMs generally perform well in such tasks and also uncover persistent behavioral signatures. In a large set of two players-two strategies games, we find that LLMs are particularly good at games where valuing their own self-interest pays off, like the iterated Prisoner’s Dilemma family. However, they behave sub-optimally in games that require coordination. We, therefore, further focus on two games from these distinct families. In the canonical iterated Prisoner’s Dilemma, we find that GPT-4 acts particularly unforgivingly, always defecting after another agent has defected only once. In the Battle of the Sexes, we find that GPT-4 cannot match the behavior of the simple convention to alternate between options. We verify that these behavioral signatures are stable across robustness checks. Finally, we show how GPT-4’s behavior can be modified by providing further information about the other player as well as by asking it to predict the other player’s actions before making a choice. These results enrich our understanding of LLM’s social behavior and pave the way for a behavioral game theory for machines.
Here is the full paper by Elif Akata, et al.
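The iterated Prisoner's Dilemma setup the abstract refers to can be sketched in a few lines. The payoff values and the "grim trigger" strategy below, used here as a stand-in for the "always defect after a single defection" signature the authors attribute to GPT-4, are illustrative assumptions, not the paper's exact protocol.

```python
# Toy iterated Prisoner's Dilemma: standard payoffs, two simple strategies.
# C = cooperate, D = defect. Payoffs are (my_move, their_move) -> my payoff.
PAYOFFS = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def grim_trigger(my_history, their_history):
    """Cooperate until the opponent defects once, then defect forever --
    a proxy for the unforgiving behavior described in the abstract."""
    return "D" if "D" in their_history else "C"

def tit_for_tat(my_history, their_history):
    """Copy the opponent's last move; cooperate on the first round."""
    return their_history[-1] if their_history else "C"

def play(strategy_a, strategy_b, rounds=10):
    """Play a finitely repeated game and return each player's total score."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# Two conditionally cooperative strategies sustain mutual cooperation,
# earning 3 points per round each:
print(play(grim_trigger, tit_for_tat, rounds=10))  # (30, 30)
```

The point of the grim-trigger example is that a single defection permanently destroys cooperation, which is why such a strategy does well against exploiters but, as the paper finds, can be costly in coordination games that require forgiveness or alternation.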
Two podcasts I have enjoyed recently. First, The Social Radars, hosted by Jessica Livingston and Carolynn Levy, co-founder and managing director of Y Combinator respectively. Jessica and Carolynn bring a lot of experience and trust with them, so the entrepreneurs they interview open up about the realities of startups. Here is a great interview with David Lieb, creator of Google Photos and before that the highly successful failure, Bump.
I’ve also enjoyed the physician brothers Daniel and Mitch Belkin at The External Medicine Podcast. Here’s a superb primer on drug development from the great Derek Lowe.
ChatGPT as marketing triumph
That is the topic of my latest Bloomberg column, here is one excerpt:
Consider the name itself, ChatGPT. Many successful tech products have happy, shiny, memorable names: Instagram, Roblox, TikTok. Someone comes up with the name, perhaps in a brainstorming session, and then it is taken to the broader world. With luck, it will prove sticky and continue to invoke a warm glow.
ChatGPT, as a name, does not have a comparably simple history. Originally the service was GPT-2, then GPT-3. GPT-3.5 was then released, and ChatGPT was announced in November 2022. ChatGPT-3.5 was cumbersome, and by that point it was running the risk of becoming obsolete numerically. So ChatGPT became the popular moniker. When GPT-4 was released in March, there was the chance to switch to just GPT-4 and treat ChatGPT as a kind of interim brand, connected with the 3.5 edition, but no — ChatGPT it is.
I still prefer to call it GPT-4, to make it clear which version of the service I am using. GPT-3.5 remains available for free and is widely used, and so my inner persnickety nag feels the two should not be confused. But they are confused, and almost everyone (present company excluded) just says ChatGPT. Again, this is not textbook marketing, but users seem fine with it.
At this point, you might be wondering what that GPT stands for. It is “Generative Pre-Trained Transformer.” How many of its users could tell you that? How many of its users could explain what it means? I can imagine a marketing team sitting around a conference table insisting that GPT is too opaque and just won’t do.
But at OpenAI, software engineers and neural net experts are the dominant influences. So Americans get a name that has not been heavily market-tested and does not necessarily fit well into an advertising slogan. Maybe they like it, especially if they are sick of more and more of the economy being run by marketers.
There is more at the link. The part of GPT-4 that comes across as “focus group tested” — the neutering of Sydney and the like — is also the part that is not entirely popular, even if it is politically wise.
MMS: Massively Multilingual Speech.
– Can do speech2text and text2speech in 1100 languages.
– Can recognize 4000 spoken languages.
– Code and models available under the CC-BY-NC 4.0 license.
– half the word error rate of Whisper.
— Yann LeCun (@ylecun) May 22, 2023
And trained on The New Testament, I might add, a text which exists in a great number of languages.
How effective was the IAEA?
Here is the OpenAI call for international regulation, most of all along the lines of the International Atomic Energy Agency. I am not in general opposed to this approach, but I think it requires very strong bilateral supplements, from the United States of course. Which in turn requires U.S. supremacy in the area, as was the case with nuclear weapons. From a 564 pp. official work on the topic:
For nearly forty years after its birth in 1957 the IAEA remained essentially irrelevant to the nuclear arms race. (p.22)
There is also this:
However, in the late 1950s and early 1960s it was not the failure of the IAEA’s functions as a ‘pool’ or ‘bank’ or supplier of nuclear material that inflicted the most serious blow on the organization, on its safeguards operation and eventually on Cole himself. For a variety of reasons, the Agency’s chief patron, the USA, chose to arrange nuclear supplies bilaterally rather than through the IAEA. One reason was that the IAEA had been unable to develop an effective safeguards system. Another was that in a bilateral arrangement it was the US Administration, under the watchful eyes of Congress, that chose the bilateral partner rather than leaving the choice to an international organization that would have to respond to the needs of any Member State whatever its political system, persuasion or alliance. But the most serious setback came in 1958 when, for overriding political reasons, the USA chose the bilateral route in accepting the safeguards of EURATOM as equivalent to — in other words as an acceptable substitute for — those of the IAEA.
It is frequently suggested that the IAEA has been partially captured by the nuclear sector itself. I do not consider that bad news, but it is a sobering thought for those expecting too much from this approach. Do note that it took years to set up the agency, and furthermore when North Korea wanted to acquire nuclear weapons the country simply left the agency and broke its earlier agreement. Perhaps the greatest gain from this approach is that the non-crazy nations have a systematic multilateral framework to work within, should they decide to defer to the external, bilateral pressure from the United States?
On the other side, my fear is that the international agreement will lead to excess regulation at the domestic level.
There is also this:
The fact that Iraq’s nuclear weapon programme had been under way for several years, perhaps a decade, without being detected by the IAEA, led to sharp criticism of the Agency and posed the most serious threat to the credibility of its safeguards since they had first been applied some 30 years earlier.
All of these issues could use much more intelligent discussion.
Is software eating Japan? (from my email)
I came across a great series of posts by Richard Katz about Japan and information technology. It shows that software has not eaten Japan (for now?).
Some interesting facts:
– “by 2025, 60% of Japan’s large companies will be operating core systems that are more than 20 years old. Would anyone today use a 2005 PC?”
– “It is worth noting that the OECD shows Japan suffering an actual drop in output per employee in ICT business services from 2005 through 2021, its ICT productivity having peaked out in 1999. By contrast, Korea’s productivity grew 41% in the same period”
– “Among high school boys, Japan ranks number one in using the Internet to search for information on a particular topic several times a day, and Japan’s girls come in second among OECD girls. Japanese boys also rank number one in single-player online games. On the other hand, these boys score at, or near, the bottom on other activities, such as multi-player online games or uploading their own content”.
– In an OECD study, Japan is third from the bottom when it comes to the share of students “who foresee having a career, not just in ICT, but in any area of science or engineering” [that was quite shocking to me, to be honest. Positively shocking, on the other hand, was that Portugal leads the pack]
– “What’s even more remarkable is that Japan’s top performers on the math and science tests come in dead last in the share who foresee themselves working in science or engineering”.
– Maybe this provides a bit of an explanation: “In 2021, the average annual income of a Japanese ICT staffer was just ¥4.38 million ($34,466), down 4% from 2019. That was 2% below the median salary in Japan, whereas in the US and China, ICT salaries are 8-10% above the median.”
That is all from Krzysztof Tyszka-Drozdowski.
Ezra Klein and Jean Twenge on teen mental health
They did an NYT podcast, here is the transcript, here is the podcast itself. Excerpt:
EZRA KLEIN: Violent crime actually gets at why I wanted to ask this before we got into smartphones, because I’m much more familiar with the debate over crime than I am with the debate over suicide trend lines. And the thing that has always been striking to me, and I think really underplayed in our national discourse over crime, is that we don’t really understand it, that if you go into the ’80s and the ’90s, you see crime goes way, way up in the ’70s, way, way up in all kinds of different jurisdictions, more or less all at the same time.
And then it begins going way, way down in all kinds of different places all at the same time. So you have stories that people know, like there’s New York with Rudy Giuliani and broken windows policing and stop and frisk. But it also goes down in all these places that didn’t do what New York City did. It’s so widespread, both the rise and the fall, that you end up having researchers trying to think of even broader explanations, like whether or not lead and the amount of lead in young kids’ bloodstreams — and thus, the effect on their executive function when they got older, maybe that’s the causal mechanism.
And it made me wonder if there isn’t a chance that suicide and teen mental health is like that, that it has this kind of all the way up, all the way down in all places, and we don’t really understand why pattern to it.
Do read/listen to the whole thing, interesting throughout.
My Conversation with Simon Johnson
Here is the audio, video, and transcript. Here is part of the episode description:
What’s more intense than leading the IMF during a financial crisis? For Simon Johnson, it was co-authoring a book with fellow economist (and past guest) Daron Acemoglu. Written in six months, their book Power and Progress: Our Thousand-Year Struggle Over Technology and Prosperity, argues that widespread prosperity is not the natural consequence of technological progress, but instead only happens when there is a conscious effort to bend the direction and gains from technological advances away from the elite.
Tyler and Simon discuss the ideas in the book as well as Simon’s earlier work on finance and banking, including at what size a US bank is small enough to fail, the future of deposit insurance, when we’ll see a central bank digital currency, his top proposal for reforming the IMF, how quickly the Industrial Revolution led to widespread prosperity, whether AI will boost wages, how he changed his mind on the Middle Ages, the key difference in outlook between him and Daron, how he thinks institutions affect growth, how to fix northern England’s economic climate, whether the UK should join NAFTA, improving science policy, the Simon Johnson production function, whether MBAs are overrated, the importance of communication, and more.
And here is one excerpt:
COWEN: If institutions are the key to economic growth, as many people have argued — Daron and yourself to varying degrees — why, then, is prospective economic growth so hard to predict?
In 1960, few people thought South Korea would be the big winner. It looked like their institutions were not that good. It was a common view: oh, the Philippines, Sri Lanka — then Ceylon — would do quite well. They had English language to some extent. They seemed to have okay education. And those two nations have more or less flopped. South Korea took off. Its per capita income is now roughly equal to that of France or Japan. Doesn’t that mean it’s not about institutions? Because institutions are pretty sticky.
JOHNSON: Yes, I think of institutions as being part of the hysteresis effect, if you can get it in a positive way, that if you grow and you strengthen institutions, which South Korea has done, it makes it much harder to relapse. There are plenty of countries that had spurts of growth without strong institutions and found it hard to sustain that.
You make a very good point about the early 1960s, Tyler. There wasn’t that much discussion that I’ve seen about institutions per se, but education — yes, absolutely. Culture — people made the same comparisons. They said, “Confucian culture is no good or won’t lead to growth. That’s a problem, for example, for South Korea.” That turned out to be wrong.
I think institutions are sticky. I think history matters a lot for them. They’re not predestination, though. You could absolutely carve your own way, but the carve-your-own way is harder when you start with institutions that are more problematic, less democratic, more autocratic control, less protection of property rights.
All of these things can go massively wrong, but building better institutions and making them sustainable, like Eastern Europe — the parts of the former Soviet Empire that managed to escape the Soviet influence after 1989, 1991 — I think those countries have worked long and hard, with very mixed results in some places, to build better institutions. And the EU has helped them in that regard, unquestionably.
I enjoyed this session with Simon.
Might AI possibly boost social trust?
That is the theme of my latest Bloomberg column, here is one bit:
The point is not that LLMs won’t be used to create propaganda — they will — but that they offer users another option to filter it out. With LLMs, users can get the degree of objectivity that they desire, at least after they learn how they work.
Still, this is a safe prediction: Within a year or two, there will be a variety of LLMs, some of them open source, and people will be able to use them to generate the kinds of answers they want. You might wonder how this is an improvement on the status quo, which now affords viewers a choice of polarized media sources such as Fox News or MSNBC, not to mention voices of all sorts on Twitter.
I hold out some hope for improvement in part because LLMs operate on a “pull” basis — that is, you ask them for what you want. Even if you are working with a “right-wing” LLM, you can always ask it for a left-wing or centrist perspective, and vice versa. It would be like watching Fox and having a button on your remote that you can click to get an opposing or contrasting view — within seconds. This is a vast improvement over cable TV, in terms of immediacy if nothing else. LLMs also make it very easy to generate a debate or a “compare and contrast” answer on just about any issue.
Again, there is no knowing how much balance people will want. But at least AI will make it easier to get balance if they so choose. That seems better than having a particular cable TV channel or a pre-established Twitter feed as the baseline default.
The impersonal nature of many LLMs may also be a force for ideological balance. Currently, many left-wingers don’t want to switch to Fox (or visit their neighbor) to hear a different perspective because they find the personalities offensive or obnoxious. LLMs offer the potential to sample points of view in their driest, least provocative and most analytically argued forms.
We’ll see how this goes, and any analysis or prediction here should be quite hedged, but I am more optimistic than many people on this front.
Patrick Collison interviews Sam Altman
I haven’t heard this one yet, but if ever there was such a thing as self-recommending…link here. Hat tip Alex T.!
Open source chat, no censorship
Here it is, like it or not…
If you are looking for a chat LLM without any forced 'alignment' or 'moralizing' censorship, I recommend `WizardLM-13B-Uncensored` which was literally just released today.
Been playing with it all morning. It is my favorite open-source chat model so far. https://t.co/9vrPyktaIz https://t.co/xIthzHyUyK
— hardmaru (@hardmaru) May 10, 2023
Generative AI and firm values
What are the effects of recent advances in Generative AI on the value of firms? Our study offers a quantitative answer to this question for U.S. publicly traded companies based on the exposures of their workforce to Generative AI. Our novel firm-level measure of workforce exposure to Generative AI is validated by data from earnings calls, and has intuitive relationships with firm and industry-level characteristics. Using Artificial Minus Human portfolios that are long firms with higher exposures and short firms with lower exposures, we show that higher-exposure firms earned excess returns that are 0.4% higher on a daily basis than returns of firms with lower exposures following the release of ChatGPT. Although this release was generally received by investors as good news for more exposed firms, there is wide variation across and within industries, consistent with the substantive disruptive potential of Generative AI technologies.
A significant effect, here is the new NBER working paper from Andrea L. Eisfeldt, Gregor Schubert, and Miao Ben Zhang.
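The Artificial Minus Human construction the abstract describes can be sketched simply: go long the firms most exposed to Generative AI and short the least exposed, then take the difference in average returns. The function below is a hedged illustration of that long-short logic; the exposure scores and daily returns are made-up toy numbers, not the paper's data.

```python
# Toy sketch of an "Artificial Minus Human" long-short portfolio return:
# long high-exposure firms, short low-exposure firms, equal-weighted legs.

def amh_return(daily_returns, exposure, top_frac=0.5):
    """Long-short daily return: rank firms by exposure, go long the top
    fraction and short the rest, equal weights within each leg."""
    ranked = sorted(daily_returns, key=lambda f: exposure[f], reverse=True)
    k = max(1, int(len(ranked) * top_frac))
    longs, shorts = ranked[:k], ranked[k:]
    long_leg = sum(daily_returns[f] for f in longs) / len(longs)
    short_leg = sum(daily_returns[f] for f in shorts) / len(shorts)
    return long_leg - short_leg

# Hypothetical workforce-exposure scores and one day's returns:
exposure = {"A": 0.9, "B": 0.7, "C": 0.2, "D": 0.1}
returns = {"A": 0.010, "B": 0.006, "C": 0.004, "D": 0.000}

print(amh_return(returns, exposure))  # long leg (A, B) minus short leg (C, D)
```

On these toy numbers the spread is about 0.6% for the day; the paper's finding is that the analogous real portfolio earned roughly 0.4% higher daily returns for more-exposed firms after the ChatGPT release.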