AI Scientists in the Lab
Today, we introduce Periodic Labs. Our goal is to create an AI scientist.
Science works by conjecturing how the world might be, running experiments, and learning from the results.
Intelligence is necessary, but not sufficient. New knowledge is created when ideas are found to be consistent with reality. And so, at Periodic, we are building AI scientists and the autonomous laboratories for them to operate.
…Autonomous labs are central to our strategy. They provide huge amounts of high-quality data (each experiment can produce GBs of data!) that exists nowhere else. They generate valuable negative results which are seldom published. But most importantly, they give our AI scientists the tools to act.
…One of our goals is to discover superconductors that work at higher temperatures than today’s materials. Significant advances could help us create next-generation transportation and build power grids with minimal losses. But this is just one example — if we can automate materials design, we have the potential to accelerate Moore’s Law, space travel, and nuclear fusion.
Our founding team co-created ChatGPT, DeepMind’s GNoME, OpenAI’s Operator (now Agent), the neural attention mechanism, MatterGen; have scaled autonomous physics labs; and have contributed to important materials discoveries of the last decade. We’ve come together to scale up and reimagine how science is done.
The AIs can work 24 hours a day, 365 days a year, and with labs under their control the feedback will be quick. In nine hours, AlphaZero taught itself chess and then trounced the then world champion Stockfish 8 (Elo around 3378, compared to Magnus Carlsen’s high of 2882). That was in 2017. In general, experiments are more open-ended than chess, but not necessarily in every domain. Moreover, context windows and capabilities have grown tremendously since 2017.
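To give a rough sense of what that rating gap means, here is a minimal sketch using the standard Elo expected-score formula. It assumes, purely for illustration, that engine and human ratings sit on one comparable scale, which is not strictly true; the function name is my own.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the Elo model (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the text; treating them as directly comparable is an assumption.
carlsen_peak = 2882
stockfish_8 = 3378

print(f"Expected score per game for the human side: {expected_score(carlsen_peak, stockfish_8):.3f}")
```

Under those assumptions the human side would expect only about 0.05 points per game.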
In other AI news, AI can be used to generate dangerous proteins like ricin and current safeguards are not very effective:
Microsoft bioengineer Bruce Wittmann normally uses artificial intelligence (AI) to design proteins that could help fight disease or grow food. But last year, he used AI tools like a would-be bioterrorist: creating digital blueprints for proteins that could mimic deadly poisons and toxins such as ricin, botulinum, and Shiga.
Wittmann and his Microsoft colleagues wanted to know what would happen if they ordered the DNA sequences that code for these proteins from companies that synthesize nucleic acids. Borrowing a military term, the researchers called it a “red team” exercise, looking for weaknesses in biosecurity practices in the protein engineering pipeline.
The effort grew into a collaboration with many biosecurity experts, and according to their new paper, published today in Science, one key guardrail failed. DNA vendors typically use screening software to flag sequences that might be used to cause harm. But the researchers report that this software failed to catch many of their AI-designed genes—one tool missed more than 75% of the potential toxins.
Solve for the equilibrium?
New data on social media
It has gone largely unnoticed that time spent on social media peaked in 2022 and has since gone into steady decline, according to an analysis of the online habits of 250,000 adults in more than 50 countries carried out for the FT by the digital audience insights company GWI. And this is not just the unwinding of a bump in screen time during pandemic lockdowns — usage has traced a smooth curve up and down over the past decade-plus.
Across the developed world, adults aged 16 and older spent an average of two hours and 20 minutes per day on social platforms at the end of 2024, down by almost 10 per cent since 2022. Notably, the decline is most pronounced among the erstwhile heaviest users — teens and 20-somethings…
Additional data from GWI trace the shift. The shares of people who report using social platforms to stay in touch with their friends, express themselves or meet new people have fallen by more than a quarter since 2014. Meanwhile, reflexively opening the apps to fill up spare time has risen, reflecting a broader pernicious shift from mindful to mindless browsing.
Here is more from John Burn-Murdoch in the FT. I was just doing an Aspen podcast two nights ago, where I spoke of social media as a problem that, in time, largely would solve itself. You also may recall my recent post about declining rates of depression for young adults. That said, you might wonder what exactly is the correct definition of social media (MR comments section?), and whether this study is tracking the proper conception of it.
For the pointer I thank Adrian Kelly.
Mark Skousen on recession warning signs
The White House and Wall Street were exuberant last week when the Commerce Department’s Bureau of Economic Analysis revised upward its second-quarter estimate of gross domestic product to show 3.8% growth in real terms, compared with a negative number in the first quarter. “US economy notches fastest growth pace in nearly two years in second quarter,” reported Yahoo Finance, “suggesting robust growth despite uncertainty set off by President Donald Trump’s tariff policy.”
Dig into the numbers, however, and you find the trade war is in fact wreaking economic havoc. Buried in last week’s BEA report is a much more reliable measure of the economy—gross output, or GO. It measures spending at all stages of production, totaling an estimated $63 trillion this year—more than twice GDP of $30 trillion.
GO revealed that economic growth is slowing to a crawl, ahead only 1.2% in real terms. If you include all transactions in wholesale and retail trade, the adjusted GO is up only 0.3%. More important, overall business spending fell sharply, by an annualized 5.6% in real terms. These results are much more consistent with the weak labor-market data.
Here is more from the WSJ.
Innovation and the Great Divergence
Abstract: Recent developments in historical national accounting suggest that the timing of the Great Divergence hinges on the different trends in northwest Europe and the Yangzi Delta region of China. The positive trend of GDP per capita in northwest Europe after 1700 was a continuation of a process that began in the fourteenth century, while the negative trend in the Yangzi Delta continued a pattern of alternating periods of growing and shrinking, but reaching a new lower level. These GDP per capita trends were driven by different paths of innovation. TFP growth was strongly positive in Britain after the Black Death, in the Netherlands during the sixteenth century and again in Britain from the mid-seventeenth century. Although TFP growth was positive in China during the Northern Song dynasty, it was predominantly negative during the Ming and Qing dynasties, in the Yangzi Delta as well as in China as a whole.
By Stephen Broadberry and Runzhuo Zhai, via the excellent Samir Varma.
Reading Orwell in Moscow
In this paper, I measure the effect of conflict on the demand for frames of reference, or heuristics that help individuals explain their social and political environment by means of analogy. To do so, I examine how Russia’s full-scale invasion of Ukraine in February 2022 reshaped readership of history and social science books in Russia. Combining roughly 4,000 book abstracts retrieved from the online catalogue of Russia’s largest bookstore chain with data on monthly reading patterns of more than 100,000 users of the most popular Russian-language social reading platform, I find that the invasion prompted an abrupt and substantial increase in readership of books that engage with the experience of life under dictatorship and acquiescence to dictatorial crimes, with a predominant focus on Nazi Germany. I interpret my results as evidence that history books, by offering regime-critical frames of reference, may serve as an outlet for expressing dissent in a repressive authoritarian regime.
That is from a job market paper by Natalia Vasilenok, political science at Stanford. Via.
Academic Human Capital in European Countries and Regions, 1200-1793
We present new annual time-series data on academic human capital across Europe from 1200 to 1793, constructed by aggregating individual-level measures at three geographic scales: cities, present-day countries (as of 2025), and historically informed macro-regions. Individual human capital is derived from a composite index of publication outcomes, based on data from the Repertorium Eruditorum Totius Europae (RETE) database. The macro-regional classifications are designed to reflect historically coherent entities, offering a more relevant perspective than modern national boundaries. This framework allows us to document key patterns, including the Little Divergence in academic human capital between Northern and Southern Europe, the effect of the Black Death and the Thirty Years’ War on academic human capital, the respective contributions of academies and universities, regional inequality within the Holy Roman Empire, and the distinctiveness of the Scottish Enlightenment.
Here is the full paper by Matthew Curtis, David de la Croix, Filippo Manfredini, and Mara Vitale. Via the excellent Samir Varma.
Does automation reduce stigma?
By removing human cashiers, self-checkout registers may alter feelings of embarrassment experienced by customers. Using high-frequency scanner data from supermarkets in the Washington, D.C. area with staggered adoption of self-checkout, we conduct event study analyses on consumer purchasing behavior. On the extensive margin, we find positive but noisy effects of self-checkout adoption on sales of some stigmatized items. On the intensive margin, we show that stigmatized items are much more likely to be purchased at self-checkout than at cashier registers, especially condoms and pregnancy tests. We estimate that customers are willing to pay 8.5 cents in additional time cost for the privacy of purchasing stigmatized items at self-checkout.
Here is the full paper by Rebecca Cardinali, et al. Via the excellent Kevin Lewis.
I even draw distinctions across automated models. For instance, if I have “a stupid question,” I am more likely to ask Grok, since I would rather GPT maintain a higher opinion of what I do and do not know.
Is it the phones?
Or perhaps we should just credit Sydney Sweeney? That is from Chris Said.
Human growth sentences to ponder
The most striking finding is that males born in the 1960s appear to have had a later or smaller adolescent growth spurt than those born a decade earlier. Combining the NHANES surveys and their precursors, I show that males born in the 1960s were the same height in childhood as those born a decade earlier, but then fell behind and were around half an inch shorter in adolescence. By adulthood, the heights of the two cohorts were nearly identical. These patterns are consistent with the 1960s cohort experiencing a slower growth tempo in adolescence through either a later or smaller adolescent growth spurt, followed by catch-up growth by growing longer into early adulthood (later “age at final height”). Similar patterns are not evident in the height of females; however, females born in the 1960s experienced menarche (first menstrual period) later than those born a decade earlier.
That is from a new paper by Nicholas Reynolds. Via the excellent Samir Varma.
A new RCT on banning smartphones in the classroom
Widespread smartphone bans are being implemented in classrooms worldwide, yet their causal effects on student outcomes remain unclear. In a randomized controlled trial involving nearly 17,000 students, we find that mandatory in-class phone collection led to higher grades — particularly among lower-performing, first-year, and non-STEM students — with an average increase of 0.086 standard deviations. Importantly, students exposed to the ban were substantially more supportive of phone-use restrictions, perceiving greater benefits from these policies and displaying reduced preferences for unrestricted access. This enhanced student receptivity to restrictive digital policies may create a self-reinforcing cycle, where positive firsthand experiences strengthen support for continued implementation. Despite a mild rise in reported fear of missing out, there were no significant changes in overall student well-being, academic motivation, digital usage, or experiences of online harassment. Random classroom spot checks revealed fewer instances of student chatter and disruptive behaviors, along with reduced phone usage and increased engagement among teachers in phone-ban classrooms, suggesting a classroom environment more conducive to learning. Spot checks also revealed that students appear more distracted, possibly due to withdrawal from habitual phone checking, yet, students did not report being more distracted. These results suggest that in-class phone bans represent a low-cost, effective policy to modestly improve academic outcomes, especially for vulnerable student groups, while enhancing student receptivity to digital policy interventions.
That is from a recent paper by Alp Sungu, Pradeep Kumar Choudhury, and Andreas Bjerre-Nielsen. Note with grades there is “an average increase of 0.086 standard deviations.” I have no problem with these policies, but it mystifies me why anyone would put them in their top five hundred priorities, or is that five thousand? Here is my earlier post on Norwegian smartphone bans, with comparable results.
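To put 0.086 standard deviations in perspective, here is a back-of-the-envelope sketch. It assumes grades are roughly normally distributed, which the abstract does not state, and asks where a student starting at the median would land after a shift of that size. The function names are my own.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def median_student_percentile(effect_sd: float) -> float:
    """Percentile reached by a median student after a shift of effect_sd standard deviations.

    Assumes a normal grade distribution (an illustrative assumption, not from the paper).
    """
    return 100.0 * normal_cdf(effect_sd)


print(f"{median_student_percentile(0.086):.1f}")
```

On that assumption, the median student moves from the 50th percentile to somewhere between the 53rd and 54th.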
The politics of depression in young adults
From a recent paper by Catherine Gimbrone, et al.:
From 2005 to 2018, 19.8% of students identified as liberal and 18.1% identified as conservative, with little change over time. Depressive affect (DA) scores increased for all adolescents after 2010, but increases were most pronounced for female liberal adolescents (b for interaction = 0.17, 95% CI: 0.01, 0.32), and scores were highest overall for female liberal adolescents with low parental education (Mean DA 2010: 2.02, SD 0.81/2018: 2.75, SD 0.92). Findings were consistent across multiple internalizing symptoms outcomes. Trends in adolescent internalizing symptoms diverged by political beliefs, sex, and parental education over time, with female liberal adolescents experiencing the largest increases in depressive symptoms, especially in the context of demographic risk factors including parental education.
Here is the link. This is further evidence for what is by now a well-known proposition.
AI-led job interviews
We study the impact of replacing human recruiters with AI voice agents to conduct job interviews. Partnering with a recruitment firm, we conducted a natural field experiment in which 70,000 applicants were randomly assigned to be interviewed by human recruiters, AI voice agents, or given a choice between the two. In all three conditions, human recruiters evaluated interviews and made hiring decisions based on applicants’ performance in the interview and a standardized test. Contrary to the forecasts of professional recruiters, we find that AI-led interviews increase job offers by 12%, job starts by 18%, and 30-day retention by 17% among all applicants. Applicants accept job offers with a similar likelihood and rate interview, as well as recruiter quality, similarly in a customer experience survey. When offered the choice, 78% of applicants choose the AI recruiter, and we find evidence that applicants with lower test scores are more likely to choose AI. Analyzing interview transcripts reveals that AI-led interviews elicit more hiring-relevant information from applicants compared to human-led interviews. Recruiters score the interview performance of AI-interviewed applicants higher, but place greater weight on standardized tests in their hiring decisions. Overall, we provide evidence that AI can match human recruiters in conducting job interviews while preserving applicants’ satisfaction and firm operations.
That is from a new paper by Brian Jabarian and Luca Henkel.
The evolution of the economics job market
In the halcyon days of 2015-19, openings on the economics job market hovered at around 1900 per year. In 2020, Covid was a major shock, but the market bounced back quickly in 2021 and 2022. Since then, though, the market has clearly been in a funk. 2023, my job market year, saw a sudden dip in postings. 2024 was even worse, with openings falling 16% lower than the 2015-19 average.
At the time, the sudden fall in 2023 seemed mysterious—it was an otherwise healthy year for the broader labor market. In hindsight, it seems like the 2021-22 recovery masked some underlying weakness. The 2020 job market had 500 fewer openings than the 2014-19 average; 2021 and 2022 together produced only around 100 more jobs than the 2014-19 average. In other words, the recovery never made up for the pandemic; by this crude logic, around 400 economist jobs were “destroyed”.
…And of course, all of this decline occurred before the litany of disasters that have recently hit the Econ job market. In May, Jerome Powell announced that the Federal Reserve—perhaps the largest employer of economists in America—would cut its workforce by 10%. The federal government has frozen hiring, as has the World Bank. Hit by the dual threat of fines and looming cuts to federal funding, Harvard, MIT, the University of Washington, Notre Dame, Northwestern University, among others, have announced hiring freezes and budget cuts.
Here is more from Oliver Kim, who also offers a much broader discussion of the meaning of all this.
One look at negative emotional contagion
This paper studies how peers’ genetic predisposition to depression affects own mental health during adolescence and early adulthood using data from the National Longitudinal Study of Adolescent to Adult Health (Add Health). I exploit variation within schools and across grades in same-gender grademates’ average polygenic score—a linear index of genetic variants—for major depressive disorder (the MDD score). An increase in peers’ genetic risk for depression has immediate negative impacts on own mental health. A one standard deviation increase in same-gender grademates’ average MDD score significantly increases the probability of being depressed by 1.9 and 3.8 percentage points for adolescent girls (a 7.2% increase) and boys (a 25% increase), respectively. The effects persist into adulthood for females, but not males. I explore several potential mechanisms underlying the effects and find that an increase in peers’ genetic risk for depression in adolescence worsens friendship, increases substance use, and leads to lower socioeconomic status. These effects are stronger for females than males. Overall, the results suggest that there are important social-genetic effects in the context of mental health.
That is from a recent paper by Yeongmi Jeong, via the excellent Kevin Lewis.
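The abstract reports each effect both in percentage points and as a relative increase, which together imply a baseline depression rate for each group. Here is a small sketch of that arithmetic; the implied baselines are my own back-of-the-envelope inference, not numbers reported in the abstract, and the function name is hypothetical.

```python
def implied_baseline(pp_change: float, pct_change: float) -> float:
    """Baseline rate (in percent) implied by a percentage-point change and its relative size.

    If a rise of pp_change points equals a pct_change percent increase,
    then baseline = pp_change / (pct_change / 100).
    """
    return pp_change / (pct_change / 100.0)


# Figures quoted in the abstract: (percentage-point change, relative increase in percent).
effects = {"girls": (1.9, 7.2), "boys": (3.8, 25.0)}

for group, (pp, pct) in effects.items():
    print(f"{group}: implied baseline of about {implied_baseline(pp, pct):.1f}%")
```

If the arithmetic holds, the larger relative effect for boys reflects both a larger absolute effect and a lower starting rate (roughly 15 percent versus roughly 26 percent for girls).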
It would take more than one paper to establish these claims
Nonetheless these are interesting results, worthy of further examination:
The measurement of intelligence should identify and measure an individual’s subjective confidence that a response to a test question is correct. Existing measures do not do that, nor do they use extrinsic financial incentive for truthful responses. We rectify both issues, and show that each matters for the measurement of intelligence, particularly for women. Our results on gender and confidence in the face of risk have wider applications in terms of the measurement of “competitiveness” and financial literacy. Contrary to received literature, women are more intelligent than men, compete when they should in risky settings, and are more literate.
That is from the September JPE, by Glenn W. Harrison, Don Ross, and J. Todd Swarthout. Here are ungated versions of the paper. Here is Bryan Caplan on the limitations of any single paper.