Category: Web/Tech
The wisdom of Gwern, why should you write?
Much of the value of writing done recently or now is simply to get stuff into LLMs. I would, in fact, pay money to ensure Gwern.net is in training corpuses, and I upload source code to Github, heavy with documentation, rationale, and examples, in order to make LLMs more customized to my use-cases. For the trifling cost of some writing, all the world’s LLM providers are competing to make their LLMs ever more like, and useful to, me.
And that’s just today! Who knows how important it will be to be represented in the initial seed training datasets…? Especially as they bootstrap with synthetic data & self-generated worlds & AI civilizations, and your text can change the trajectory at the start. When you write online under stable nyms, you may be literally “writing yourself into the future”. (For example, apparently, aside from LLMs being able to identify my anonymous comments or imitate my writing style, there is a “Gwern” mentor persona in current LLMs which is often summoned when discussion goes meta or the LLMs become situated as LLMs, which Janus traces to my early GPT-3 writings and sympathetic qualitative descriptions of LLM outputs, where I was one of the only people genuinely asking “what is it like to be a LLM?” and thinking about the consequences of eg. seeing in BPEs. On the flip side, you have Sydney/Roose as an example of what careless writing can do now.) Humans don’t seem to be too complex, but you can’t squeeze blood from a stone… (“Beta uploading” is such an ugly phrase; I prefer “apotheosis”.)
This is one of my beliefs: there has never been a more vital hinge-y time to write, it’s just that the threats are upfront and the payoff delayed, and so short-sighted or risk-averse people are increasingly opting-out and going dark.
If you write, you should think about what you are writing, and ask yourself, “is this useful for an LLM to learn?” and “if I knew for sure that a LLM could write or do this thing in 4 years, would I still be doing it now?”
Here is the link, or try this link. Of course not many people have the actual purpose of mind to believe such a thing. But a few do.
From Reed and Logchies
Introduction. Your analysis produces a statistically insignificant estimate. Is it because the effect is negligibly different from zero? Or because your research design does not have sufficient power to achieve statistical significance? Alternatively, you read that “The median statistical power [in empirical economics] is 18%, or less” (Ioannidis et al., 2017) and you wonder if the article you are reading also has low statistical power. By the end of this blog, you will be able to easily answer both questions. Without doing any programming.
An Online App. In this post, we show how to calculate statistical power post-estimation for those who are not familiar with R. To do that, we have created a Shiny App that does all the necessary calculations for the researcher (CLICK HERE).
Here is the link and the full story.
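For readers who want to see the arithmetic behind a post-estimation power check, here is a minimal sketch. It assumes a two-sided z-test with a normally distributed estimator, which may or may not match what the Shiny App computes. The function names and the choice of a hypothesized "true" effect are illustrative assumptions, not taken from Reed and Logchies.

```python
from statistics import NormalDist


def two_sided_power(effect, std_error, alpha=0.05):
    """Power of a two-sided z-test of H0: effect = 0.

    `effect` is a hypothesized true effect (an assumption you supply),
    `std_error` is the standard error reported by your regression.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for alpha = 0.05
    shift = effect / std_error        # true effect in standard-error units
    # Probability the test statistic lands in either rejection region.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


def minimum_detectable_effect(std_error, alpha=0.05, power=0.80):
    """Approximate smallest effect detectable with the requested power.

    Ignores the negligible probability of rejecting in the wrong tail.
    """
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * std_error


if __name__ == "__main__":
    se = 0.10  # hypothetical standard error from an estimated coefficient
    print(f"Power if the true effect equals one SE: {two_sided_power(0.10, se):.3f}")
    print(f"Effect needed for 80% power: {minimum_detectable_effect(se):.3f}")
```

Under these assumptions, a true effect equal to one standard error is detected only about 17 percent of the time, and an effect needs to be roughly 2.8 standard errors to reach 80 percent power. That is the sense in which an insignificant estimate may reflect a weak design rather than a zero effect.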
The falling cost of tokens
That is from Elad Gil.
My excellent Conversation with Nate Silver
Here is the audio, video, and transcript. Here is the episode summary:
In his second appearance, Nate Silver joins the show to cover the intersections of predictions, politics, and poker with Tyler. They tackle how coin flips solve status quo bias, gambling’s origins in divination, what kinds of betting Nate would ban, why he’s been limited on several of the New York sports betting sites, how game theory changed poker tournaments, whether poker players make for good employees, running and leaving FiveThirtyEight, why funky batting stances have disappeared, AI’s impact on sports analytics, the most underrated NBA statistic, Sam Bankman-Fried’s place in “the River,” the trait effective altruists need to develop, the stupidest risks Tyler and Nate would take, prediction markets, how many monumental political decisions have been done under the influence of drugs, and more.
Here is one excerpt:
COWEN: Why shouldn’t people gamble only in the positive sum game? Take the US stock market — that certainly seems to be one of them — and manufacture all the suspense you want. Learn about the companies, the CEO. Get your thrill that way and don’t do any other gambling. Why isn’t that just better for everyone?
SILVER: Look, I’m not necessarily a fan of gambling for gambling’s sake. Twice a year, I’ll be in casinos and in Las Vegas a lot. Twice a year, I’ll have a friend who is like, “Let’s just go play blackjack for an hour and have a couple of free drinks,” and things like that. But I like to make bets where I think, at least in principle, I have an edge, or at least can fool myself into thinking I have an edge.
Sometimes, with the sports stuff, you probably know deep down you’re roughly break-even or something like that. You’re doing some smart things, like looking at five different sites and finding a line that’s best, which wipes out some but not all of the house edge. But no, I’m not a huge fan of slot machines, certainly. I think they are very gnarly and addictive in various ways.
COWEN: They limit your sports betting, don’t they?
SILVER: Yes, I’ve been limited by six or seven of the nine New York retail sites.
COWEN: What’s the potential edge they think you might have?
SILVER: It’s just that. If you’re betting $2,000 on the Wizards-Hornets game the moment the line comes out on DraftKings, you’re clearly not a recreational bettor. Just the hallmarks of trying to be a winning player, meaning betting lines early because the line’s early and you don’t have price discovery yet. The early lines are often very beatable. Betting on obscure stuff like “Will this player get X number of rebounds?” or things like that. If you have a knack for — if DraftKings has a line at -3.5 and it’s -4 elsewhere, then it can be called steam chasing, where you bet before a line moves in other places. If you have injury information . . .
It’s a very weird game. One thing I hope people are more aware of is that a lot of the sites — and some are better than others — but they really don’t want winning players. Their advertising has actually changed. It used to be, they would say for Daily Fantasy Sports, which was the predecessor, “Hey, you’re a smart guy” — the ads are very cynical — “You’re a smart guy in a cubicle. Why don’t you go do all your spreadsheet stuff and actually draft this team and make a lot of money, and literally, you’ll be sleeping with supermodels in two months. You win the million-dollar prize from DraftKings.”
And:
COWEN: If we could enforce just an outright ban, what’s the cost-benefit analysis on banning all sports gambling?
SILVER: I’m more of a libertarian than a strict utilitarian, I think.
COWEN: Sure, but what’s the utilitarian price of being a libertarian?
Recommended, interesting and engaging throughout. And yes, we talk about Luka too. Here is my first 2016 CWT with Nate, full of predictions I might add, and here is Nate’s very good new book On the Edge: The Art of Risking Everything.
Which books and blogs are in the Silicon Valley canon?
This Patrick Collison list is descriptive, not normative:
The Tinkerings of Robert Noyce
Seeing Like a State
The Dream Machine
The Sovereign Individual
The Beginning of Infinity
Surely You’re Joking, Mr Feynman
Softwar
Ashlee Vance’s Elon biography
The Mythical Man-Month
Mindstorms
Masters of Doom
Skunk Works
Structure and Interpretation of Computer Programs
Thinking in Systems
Superintelligence
The Whole Earth Catalog
Zero to One
The Hard Thing about Hard Things
Founders at Work
Showstopper
Dealers of Lightning
The Making of the Atomic Bomb
PG’s essays
The Rise and Fall of American Growth
The Big Score
Finite and Infinite Games
A Pattern Language
The Selfish Gene
The Lean Startup
Marginal Revolution (if it has to be a book, Stubborn Attachments)
Revolution in the Valley
Uncanny Valley
LessWrong
Slate Star Codex(/ACX)
The PayPal Wars
The Cathedral and the Bazaar
The Diamond Age
What the Dormouse Said
Zen and the Art of Motorcycle Maintenance
The Rise of Theodore Roosevelt
Titan (on Rockefeller)
The Power Broker
Gödel, Escher, Bach
What else?
I would love to see this natural experiment, Isaac Asimov edition
Before the sparse audience, he vowed to run the city of Cheyenne exclusively with an AI bot he calls “VIC” for “Virtual Integrated Citizen.”
…Standing behind a lectern with a sign that read “AI FOR MAYOR,” he gave a brief PowerPoint presentation on the history of AI. Then he stepped aside to give the floor to his Mac mini and iPad — which were propped on a table and connected to a hanging speaker at the front of the room — and told attendees to direct questions toward the screen.
“Is the computer system in city hall sufficient to handle AI?” one attendee, holding a wireless microphone at his seat, asked VIC.
“If elected, would you take a pay cut?” another wanted to know…
Midway through an interview with The Post, Miller offered to let the bot respond. VIC, in its robotic tone, correctly answered questions about trash day in Cheyenne, registering to vote and the current president of the United States.
VIC said it would argue against banning books — which some Cheyenne schools have done — citing their “educational value.” “But,” the bot added, “let’s create a process ensuring a balanced approach.”
Here is the full story. Via the excellent Kevin Lewis.
Okie-dokie, solve for the equilibrium
One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used as aids to human scientists, e.g. for brainstorming ideas, writing code, or prediction tasks, they still conduct only a small part of the scientific process. This paper presents the first comprehensive framework for fully automatic scientific discovery, enabling frontier large language models to perform research independently and communicate their findings. We introduce The AI Scientist, which generates novel research ideas, writes code, executes experiments, visualizes results, describes its findings by writing a full scientific paper, and then runs a simulated review process for evaluation. In principle, this process can be repeated to iteratively develop ideas in an open-ended fashion, acting like the human scientific community. We demonstrate its versatility by applying it to three distinct subfields of machine learning: diffusion modeling, transformer-based language modeling, and learning dynamics. Each idea is implemented and developed into a full paper at a cost of less than $15 per paper. To evaluate the generated papers, we design and validate an automated reviewer, which we show achieves near-human performance in evaluating paper scores. The AI Scientist can produce papers that exceed the acceptance threshold at a top machine learning conference as judged by our automated reviewer. This approach signifies the beginning of a new era in scientific discovery in machine learning: bringing the transformative benefits of AI agents to the entire research process of AI itself, and taking us closer to a world where endless affordable creativity and innovation can be unleashed on the world’s most challenging problems. Our code is open-sourced at this https URL
That is from a new paper by Chris Lu, Cong Lu, Robert Tjarko Lange, Jakob Foerster, Jeff Clune, David Ha. Note this is related to some earlier work in economics by Benjamin Manning of MIT (with co-authors).
I’ve said it before, and I’ll say it again. The marginal product of LLMs is highest when they are interacting with well-prepared, intricately cooperating humans at their peak, not when you pose them random queries for fun.
My excellent Conversation with Paul Bloom
Here is the audio, video, and transcript. Here is part of the episode summary:
Together Paul and Tyler explore whether psychologists understand day-to-day human behavior any better than normal folk, how babies can tell if you’re a jerk, at what age children have the capacity to believe in God, why the trend in religion is toward monotheism, the morality of getting paid to strangle cats, whether disgust should be built into LLMs, the possibilities of AI therapists, the best test for a theory of mind, why people overestimate Paul’s (and Tyler’s) intelligence, why flattery is undersupplied, why we should train flattery and tax empathy, Carl Jung, Big Five personality theory, Principles of Psychology by William James, the social psychology of the Hebrew Bible, his most successful unusual work habit, what he’ll work on next, and more.
And here is one excerpt:
COWEN: I have some questions about intelligence for you. If we think of large language models, should we let them feel disgust so that they avoid left-wing bias?
BLOOM: [laughs] Why would disgust make them avoid left-wing bias?
COWEN: Maybe we’re not sure it would, but there are various claims in the literature that for people on the right, disgust is a more fundamental emotion, and that a greater capacity to feel disgust encourages people in some ways to be more socially conservative. Debatable, but I don’t think it’s a crazy view. So, if you build LLMs, and you give them, say, a lot of empathy and not much or any disgust, you’re going to get left-leaning LLMs, which you might say, “Well, that was my goal.” But obviously, not everyone will accept that conclusion either.
BLOOM: I wouldn’t want woke LLMs. I think there’s a lot in extreme —
COWEN: You’ve got them, of course.
BLOOM: I’ve got them. I think Gemini is the one, if I wanted to go — the woke LLM of choice. Because I think the doctrine called wokeness leads to a lot of moral problems and makes the world worse in certain ways, but I wouldn’t mind left-wing LLMs.
In fact, I’m not a fan of disgust. You’re right that disgust is often associated with right-wing, but in the very worst instantiation of it. Disgust is what drives hatred towards gay people. It involves hatred of interracial marriage, the exclusion of immigrants, the exclusion of other races. If there’s one emotion I would take away from people, it would be disgust, at least disgust in the moral realm. They could keep their disgust towards rotten food and that sort of thing. That’s the one thing I wouldn’t put into LLMs. I’d rather put anger, pity, gratitude. Disgust is the one thing I’d keep away.
COWEN: So, you wouldn’t just cut back on it at the margin. You would just take disgust out of people if you could?
And:
COWEN: I think at the margin, I’ve moved against empathy more being a podcast host, that I’ll ask a question —
BLOOM: Wait. Why being a podcast host?
COWEN: Well, I’ll ask a question, and a lot of guests think it’s high status simply to signal empathy rather than giving a substantive answer. The signaling-empathy answers I find quite uninteresting, and I think a lot of my listeners do, too. Yet people will just keep on doing this, and I get frustrated. Then I think, “Well, Tyler, you should turn a bit more against empathy for this reason.” And I think that’s correct.
Paul cannot be blamed for doing that, however. So substantive, interesting, and entertaining throughout.
Arena magazine
Introducing: Arena Magazine. It’s a print & digital magazine about tech, capitalism, and the USA – that’s actually on the side of tech, capitalism, and the USA! Subscribe: arenamag.com/join
Here is a link and a thread.
From Google DeepMind (it’s happening)
We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level. It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system.
Here is further information.
From the NYT three days ago “A.I. Can Write Poetry, but It Struggles With Math.” From the NYT today: “Move Over, Mathematicians, Here Comes AlphaProof.” And here is one opinion: “This type of A.I. learns by itself and can scale indefinitely, said Dr. Silver, who is Google DeepMind’s vice-president of reinforcement learning.” Okie-dokie!
Who uses ChatGPT?
By Anders Humlum and Emilie Vestergaard:
We study the adoption of ChatGPT, the icon of Generative AI, using a large-scale survey experiment linked to comprehensive register data in Denmark. Surveying 100,000 workers from 11 exposed occupations, we document that ChatGPT is widespread, but substantial inequalities have emerged. Women are 20 percentage points less likely to have used the tool. Furthermore, despite its potential to lift workers with less expertise, users of ChatGPT earned more already before its arrival. Workers see a substantial productivity potential in ChatGPT but are often hindered by employer restrictions and the need for training. Informing workers about expert assessments of ChatGPT shifts workers’ beliefs but has limited impacts on actual adoption.
Here is the link to the full paper.
Mark Zuckerberg on open source AI
Introducing Llama 3.1
Read about it here, use it here. Here is some video. Bravo!
*Why Machines Learn: The Elegant Math Behind Modern AI*
By Anil Ananthaswamy, an excellent book, it will make the end-of-year best non-fiction list. It focuses on machine learning and its offshoots, and you can read it for the story even if you don’t follow all of the matrix algebra and the calculus. It is also the best book I know on how science advances by laying different “bricks,” and later bringing them all together toward a practicable solution. Recommended.
Will technology improve animal welfare?
That is the topic of my latest Bloomberg column, here is one excerpt:
There is, however, some better news on the animal welfare front. The cause is on the verge of some major victories — and they have been earned through technology rather than rhetoric.
The first major development is Ozempic and the other weight-loss drugs in the GLP-1 category. By one estimate, 25,000 Americans start taking these weight-loss drugs every week, and 93 million Americans may meet the criteria for using them. The spread of such drugs to many other countries is likely, especially since they seem to produce health gains above and beyond weight loss.
The logic is simple: People lose weight on these drugs because they eat less, and eating less usually means eating less meat. And less meat consumption results in less factory farming.
This should count as a major victory for animal welfare advocates, even though it did not come about through their efforts. No one had to be converted to vegetarianism, and since these drugs offer other benefits, this change in the equilibrium is self-sustaining and likely to grow considerably. Yes, it is only a partial victory, but total victory was unlikely anyway.
And this:
There is yet a third reason for animal welfare advocates to be optimistic. It is more speculative, but now seems less crazy than it used to: Super-powered AI could help us observe and learn animal languages, thus enabling humans to converse with at least some of the smarter (or at least more articulate?) animals. There is already a project at UC-Berkeley to converse with sperm whales by decoding their language and translating it to English, using techniques drawn from large-language models.
If we could talk with animals — and hear their complaints and descriptions of their own suffering — would we be less likely to eat them and treat them badly? How would we respond to the pleas of dolphins to stop using our nets to catch tuna, a process which kills many dolphins?
There is some chance this strategy could backfire; dolphins, for instance, may not be as charming as people think. Nonetheless, it holds at least some chance of a revolution in how we humans think of our relations with the rest of the animal kingdom.
Do you think there are any animals we could talk into vegetarianism, if only for marginal changes? If not, why be so optimistic that humans will change? Or maybe underneath it all, you do think that humans are somewhat special?