Do women disagree less in science?

This paper examines the authorship of post-publication criticisms in the scientific literature, with a focus on gender differences. Bibliometrics from journals in the natural and social sciences show that comments that criticize or correct a published study are 20-40% less likely than regular papers to have a female author. In preprints in the life sciences, prior to peer review, women are missing by 20-40% in failed replications compared to regular papers, but are not missing in successful replications. In an experiment, I then find large gender differences in willingness to point out and penalize a mistake in someone’s work.

That is from a new paper by David Klinowski.  Via the excellent Kevin Lewis.

My Conversation with Jonathan GPT Swift

Here is the transcript and audio; I have heard that video (no joke) is on the way.  Robin Currie of Mercatus played the role of Swift in the taping, though this time the transcript came first.  Here is part of the summary:

GPT Swift discusses his support for the Church of Ireland, his shift from the Whigs to the Tories, and his opposition to William Wood’s copper coinage in Ireland. He also talks about his works, including Gulliver’s Travels and A Modest Proposal, and his skepticism of moral and intellectual progress. Swift addresses rumors about his relationship with Esther Johnson and his fascination with scatological themes in his works. He also discusses his early life in England, his intellectual mentor Sir William Temple, and his jovial attitude towards death.

Here is one excerpt:

COWEN: Queen Anne: overrated or underrated? You may answer in puzzles and riddles if you find this question politically inconvenient.

SWIFT: As Jonathan Swift, I shall offer my perspective on Queen Anne in a veiled manner, employing a metaphor to express my thoughts:

Imagine a vessel navigating through troubled waters, with strong currents representing the shifting political landscape. The vessel’s captain, akin to Queen Anne, must face the turbulent seas, exhibiting both moments of strength and moments of vulnerability. Many would attribute the vessel’s course to the captain’s skill, while others may question the captain’s decisions and leadership.

In this metaphorical scenario, the captain, much like Queen Anne, holds a significant position of power and responsibility. However, the captain’s competence may be a subject of debate. Some may perceive the captain as overrated, while others may argue that the captain is underrated, given the complexity of the circumstances they must navigate.

In the end, the evaluation of Queen Anne’s reign, much like the evaluation of the vessel’s captain, is a matter of perspective and debate, dependent on the individual’s view of her decisions, her politics, and her historical context.

How about this for humor:

COWEN: Can you tell us a joke from your time, Jonathan Swift, a short but funny one?

SWIFT: I shall share a short jest that was popular during my time:

Why do we never tell secrets on a farm?

Because the potatoes have eyes and the corn has ears.

COWEN: So, what is a joke from your time that *you* would find funny?

SWIFT: My humor often leaned towards satire and irony. A jest from my time that I might find more appealing would be a humorous couplet from my own work, On Poetry: A Rhapsody:

“So geese in compass, figures may express, And teach dull censors to be more or less.”

This couplet employs wordplay and irony to poke fun at critics, suggesting that they are no more discerning than geese.

Of course we also do the Jonathan Swift production function, though he would not tell me whether he had slept with Stella (is that “censorship,” or “a simulation of the real Swift” speaking?).  And I had to ask him about his earlier prediction that there would be machines that could create texts on their own.

As for method, here is TC:

Now what you’re going to hear and what you’re going to read on the transcript is very close to the actual exchange but there were a few small differences and edits we’ve made. Very often the GPT would begin the answer with, “As Jonathan Swift.” We simply took that out. Some of the longer answers, there were resummaries at the end. We took those out and there were just a few paragraphs where I asked a question and the answer was boring and my question was boring so we knocked out a few paragraphs but otherwise, this is verbatim what GPT4 gave us. I did not keep on repeating prompts trying to get the answer I wanted. This is really awfully close to the dialogue.

Do read the whole thing.  It is too “textbook-y” in parts, but overall I was extremely impressed.

A brief observation on AGI risk and employee selection (from my email)

  • Stunting growth now in the development of artificial intelligence just makes the probability of a bad future outcome more likely, as the people who are prosocial and thoughtful are more likely to be discouraged from the field if we attach a stigma to it. My view is that most people are good and care about others and our collective future. We need to maintain this ratio of “good people” in AI research. We can’t have this become the domain of malevolent actors. It’s too important for humanity.

That is from Ben R.

What should I ask Kevin Kelly?

From Wikipedia:

Kevin Kelly (born 1952) is the founding executive editor of Wired magazine, and a former editor/publisher of the Whole Earth Review. He has also been a writer, photographer, conservationist, and student of Asian and digital culture.

Among Kelly’s personal involvements is a campaign to make a full inventory of all living species on earth, an effort also known as the Linnaean enterprise. He is also sequencing his genome and co-organizes the Bay Area Quantified Self Meetup Group.

His Out of Control is a wonderful Hayekian book.  His three-volume Vanishing Asia is one of the greatest picture books of all time.  His new book (I haven’t read it yet) is Excellent Advice for Living: Wisdom I Wish I’d Known Earlier.  Here is Kevin on Twitter, here is his home page.

I will be doing a Conversation with him, so what should I ask?

Machine Learning as a Tool for Hypothesis Generation

While hypothesis testing is a highly formalized activity, hypothesis generation remains largely informal. We propose a systematic procedure to generate novel hypotheses about human behavior, which uses the capacity of machine learning algorithms to notice patterns people might not. We illustrate the procedure with a concrete application: judge decisions about who to jail. We begin with a striking fact: The defendant’s face alone matters greatly for the judge’s jailing decision. In fact, an algorithm given only the pixels in the defendant’s mugshot accounts for up to half of the predictable variation. We develop a procedure that allows human subjects to interact with this black-box algorithm to produce hypotheses about what in the face influences judge decisions. The procedure generates hypotheses that are both interpretable and novel: They are not explained by demographics (e.g. race) or existing psychology research; nor are they already known (even if tacitly) to people or even experts. Though these results are specific, our procedure is general. It provides a way to produce novel, interpretable hypotheses from any high-dimensional dataset (e.g. cell phones, satellites, online behavior, news headlines, corporate filings, and high-frequency time series). A central tenet of our paper is that hypothesis generation is in and of itself a valuable activity, and hope this encourages future work in this largely “pre-scientific” stage of science.

Here is the full NBER working paper by Jens Ludwig and Sendhil Mullainathan.

The Era of Planetary Defense Has Begun

In Modern Principles of Economics, Tyler and I use asteroid defense as an example of a public good (see video below). As of the 5th edition, this public good wasn’t being provided by either markets or governments. But thanks to NASA, the era of planetary defense has begun. In September of 2022 NASA smashed a spacecraft into an asteroid. A new set of five papers in Nature has now demonstrated that not only did NASA hit its target, the mission was a success in diverting the asteroid:

DART, which was the size of a golf cart, collided with a Great Pyramid-sized asteroid called Dimorphos. The impact caused the asteroid’s orbit around another space rock to shrink — Dimorphos now completes an orbit 33 minutes faster than before the impact, researchers report today in Nature.

…As DART hurtled towards Dimorphos at more than 6 kilometres per second, the first part that hit was one of its solar panels, which smashed into a 6.5-metre-wide boulder. Microseconds later, the main body of the spacecraft collided with the rocky surface next to the boulder — and the US$330-million DART shattered to bits….the spacecraft hit a spot around 25 metres from the asteroid’s centre, maximizing the force of its impact….large amounts of the asteroid’s rubble flew outwards from the impact. The recoil from this force pushed the asteroid further off its previous trajectory. Researchers estimate that this spray of rubble meant Dimorphos’ added momentum was almost four times that imparted by DART.

…Although NASA has demonstrated this technique on only one asteroid, the results could be broadly applicable to future hazards…if a dangerous asteroid were ever detected heading for Earth, a mission to smash into it would probably be able to divert it away from the planet.

Is Economics Self-Correcting?

The subtitle of that article is Replications in the American Economic Review, and the authors are Jörg Ankel-Peters, Nathan Fiala, and Florian Neubauer.  Here is the abstract:

Replication and constructive controversy are essential for scientific progress. This paper reviews the impact of all replications published as comments in the American Economic Review between 2010 and 2020. We investigate the citation rates of comments and whether a comment affects its original paper’s citation rates. We find that most comments are barely cited, and they have no impact on the original papers’ subsequent citations. This finding holds for original papers for which the comment diagnoses a substantive problem. We conclude from these citation patterns that replications do not update the economics literature. In an online opinion survey, we elicited viewpoints of both comment authors and original authors and find that in most cases, there is no consensus regarding the replication’s success and to what extent the original paper’s contribution sustains. This resonates with the conventional wisdom that robustness and replicability are hard to define in economics.

If you see a critical comment in the AER, the odds that it is correct, and significantly so, are really pretty high, given the barriers to getting in.  Yet no one seems to care.  (Note that the lack of caring is connected to the Bayesian inference that the published critical comments likely are correct.)  This is to me one of the more significant indictments of the economics profession as we know it today.  And it is not obvious how we might change this state of affairs.

Here is the argument in tweet storm form.

Home bias in top economics journals

Two of the top economics journals have institutional ties to a specific university, the Quarterly Journal of Economics (QJE) to Harvard University and the Journal of Political Economy (JPE) to the University of Chicago. Researchers from Harvard, but also nearby Massachusetts Institute of Technology (MIT), and from Chicago (co-)author a disproportionate share of articles in their respective home journal. Such home ties and publication bias may harm, but also benefit, article quality. We study this question in a difference-in-differences framework, using data on both current and past author affiliations and cumulative citation counts for articles published between 1995 and 2015 in the QJE, JPE, and American Economic Review (AER), which serves as a benchmark. We find that median article quality is lower in the QJE if authors have ties to Harvard and/or MIT than if authors are from other top-10 universities, but higher in the JPE if authors have ties to Chicago. We also find that home ties matter for the odds of journals to publish highly influential and low impact papers. Again, the JPE appears to benefit, if anything, from its home ties, while the QJE does not.

That is from a new paper by Dirk Bethmann, Felix Bransch, Michael Kvasnicka, and Abdolkarim Sadrieh.

AGI is Coming

Ars Technica: On Monday, researchers from Microsoft introduced Kosmos-1, a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions. The researchers believe multimodal AI—which integrates different modes of input such as text, audio, images, and video—is a key step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human.

In 2020 Metaculus forecasters were predicting weak general AI by around 2053. Now they are predicting weak general AI by 2028 and strong general AI which includes:

  • Has general robotic capabilities, of the type able to autonomously, when equipped with appropriate actuators and when given human-readable instructions, satisfactorily assemble a (or the equivalent of a) circa-2021 Ferrari 312 T4 1:8 scale automobile model. A single demonstration of this ability, or a sufficiently similar demonstration, will be considered sufficient.

by 2040.

I never expected to witness the birth of aliens. It is a very strange time to be alive. If you think the world isn’t changing in a very uncertain and discontinuous way you just aren’t paying attention.

ChatGPT vs. the experts (Department of Uh-Oh)

ChatGPT’s answers are generally considered to be more helpful than humans’ in more than half of questions, especially for finance and psychology areas.

Most of all, ChatGPT does better in terms of concreteness.  Note also that ChatGPT uses more nouns and deploys a more neutral tone than do the human experts.  ChatGPT fares worst in the medical domain, but its biggest problem (from the point of view of the human evaluators) is giving too much information and not enough simple instructions.  Hmm…  In any case, here is the link.

I wonder how well the upgrades are going to do.

My Progress Studies conversation with Jon Baskin of The Point

No, not Nilsson’s The Point, the other one!  Here is the piece, here is one excerpt:

JB: What did you learn from reading Plato?

TC: Reading Plato, it struck me—I’m thirteen years old here—but it struck me how much the whole rest of the world is by no means on board with the things I thought were good. So Socrates is constructing his ideal republic, and you can debate whether Plato really favored it, or whether Socrates really favored it, but those ideas are out there. You ban the poets? You want to be like Sparta? Those are big issues.

And even then, when I read Plato, I saw it as a dialogic mode of thinking, rather than believing that Plato was endorsing everything Socrates said. The dialogues are rich and fruitful. And people hold a lot of different points of view. So to try to refine dialogic modes of thinking about progress, and indeed everything: that was the biggest lesson I got from Plato. And then just how smart some of the early people were. And that’s not unrelated to progress studies. In the modern world there are a whole bunch of ways we’re clearly much smarter, like programming computers. Are we smarter in every way? Will we produce our own Adam Smith or Plato? Tough questions.

And:

JB: I want to talk a little bit about Tolstoy. Max Weber, in his famous 1917 lecture “Science as a Vocation,” says that Tolstoy is the person who most sharply raises the question of whether the advances of science and technology have any meaning that go beyond the purely practical and technical. And he quotes Tolstoy saying that, basically, for the person who puts progress at the center of their life, life can never be satisfying, because they’ll always die in the middle of progress. How would you respond to this charge from Tolstoy about progress?

TC: I’m pretty happy and Tolstoy was not, would be my gut-level response. 

Self-recommended!

My Conversation with Brad DeLong

Here is the audio and transcript, here is part of the summary:

Tyler and Brad discuss what can really be gleaned from the fragmentary economics statistics of the late 19th century, the remarkable changes that occurred from 1870–1920, the astonishing flourishing of German universities in the 19th century, why investment banking allowed America and Germany to pull ahead of Britain economically, what enabled the Royal Society to become a force for progress, what Keynes got wrong, what Hayek got right, whether the middle-income trap persists, his favorite movie and novel, blogging vs. Substack, the Slouching Towards Utopia director’s cut, and much more.

And here is one excerpt:

COWEN: What do you take to be the best understanding of the 17th-century Scientific Revolution, if indeed you view it as a 17th-century revolution?

DELONG: I always think Joel Mokyr is absolutely magnificent on this. I think he understates the role that having printing by movable type played in creating the community of scientific practice and knowledge seeking.

There’s one thing that happens that is extremely unusual. Back before 1870, there’s no possibility at all that humanity is going to be able to bake the economic pie sufficiently large that everyone can have enough. Which means that, principally, politics and governance are going to be some elite constituting itself and elbowing other elites out of the way, and then finding a way to run a force-and-fraud domination and exploitation scheme on society so that they at least can have enough. When Proudhon wrote in the 1840s that property is theft, it was not metaphor. It was really fact.

What does this elite consist of? Well, it’s a bunch of thugs with spears, the people who have convinced the thugs with spears that they’re their bosses, and their tame accountants, bureaucrats, and propagandists. Which means, most of the time, when you have a powerfully-moving-forward set of people thinking about ideas, whether the idea is true is likely to be secondary to whether the idea is useful to helping me keep my place as a tame propagandist in the force-and-fraud domination and exploitation elite machine.

This is a point I’ve stolen from Ernest Gellner, and I think it is very true. Yet, somehow, the Royal Society decides, no. The Royal Society decides nothing except through experiment — what we are going to demand that nature tell us, or tell one of us, or at least someone writes us a letter saying they’ve done the experiment about what is true. That is a miraculous and completely unexpected transformation, and one to which I think we owe a huge amount.

Many interesting points are discussed.

Alex Epstein responds on *Fossil Future*

I don’t agree with many (any?) of his points in his response, and it is conspicuously lacking in arguments about climate itself.  But here is my original review of Epstein’s book.  And his lengthy response, which I did tell him I would post on MR.  Here is one excerpt:

Second, I think Tyler’s comments involve the use of two fallacies that, together, are often used to dismiss powerful challenges to dangerous establishment ideas:

  1. “Strawmanning” powerful new ideas to make them appear unreasonable.
  2. “Softmanning” the challenged establishment ideas to make them appear reasonable.

I hope that seeing these fallacies illustrated in detail will help readers spot them in other contexts.

If you like disagreement, you will find some there.

Emergent Ventures winners, 24th cohort

Shakked Noy, MIT economics, to do RCTs on GPTs as teaching and learning tools.

Gabriel Birnbaum, Bay Area, from Fortaleza, Brazil, to investigate lithography as a key technology used in the manufacturing of microchips.

Moritz Wallawitsch, Berkeley. RemNote is his educational technology company; to develop a complementary podcast and for general career development.

Katherine Silk, Boston/Cambridge, general career support and to support advice for early-stage startups.

Benjamin Schneider, Brooklyn.  To write a book on the new urbanism.

Joseph Walker, Sydney, Australia, to run and expand the Jolly Swagman podcast.

Avital Balwit, Bay Area, travel grant and general career development.

Benjamin Chang, Cambridge, MA. General career support, “I will develop novel RNA riboswitches for gene therapy control in human cells using machine learning.”

Daniel Kang, Berkeley/Champaign-Urbana, biometrics and crypto.

Aamna Zulfifiqar, Karachi, Pakistan, to attend UK higher education to study economics.

Jeremy Stern, Glendale, CA, Tablet magazine.  To write a book.

James Meech, PhD student, Cambridge, UK, to work on a random number generator for better computer architectures.

Arthur Allshire, University of Toronto, background also in Ireland and Australia, robotics and support to attend conferences.

Jason Hausenloy, 17, Singapore, travel and general career development, issues surrounding artificial intelligence.

Sofia Sanchez, Metepec, Mexico, biology and agricultural productivity, to spend a summer at a Stanford lab.

Ukraine tranche:

Andrey Liscovich, eastern Ukraine, formerly of Harvard, to provide equipment for public transportation, communication and emergency power generation to civilian authorities of frontline-adjacent areas in Ukraine which have lost vital infrastructure.

Chris Nicholson, Bay Area, working as a broker to maintain internet connectivity in Ukraine.

Andrii Nikolaiev, Arsenii Nikolaiev, Zarina Kodyrova, Kvanta, to advance Ukrainian mathematics, help and train math Olympiad winners.

As usual, India and Africa/Caribbean tranches will be reported separately.