
Profile of Youyang Gu, data scientist

In mid-April, while he was living with his parents in Santa Clara, Calif., Gu spent a week building his own Covid death predictor and a website to display the morbid information. Before long, his model started producing more accurate results than those cooked up by institutions with hundreds of millions of dollars in funding and decades of experience.

“His model was the only one that seemed sane,” says Jeremy Howard, a renowned data expert and research scientist at the University of San Francisco. “The other models were shown to be nonsense time and again, and yet there was no introspection from the people publishing the forecasts or the journalists reporting on them. People’s lives were depending on these things, and Youyang was the one person actually looking at the data and doing it properly.”

The forecasting model that Gu built was, in some ways, simple. He had first considered examining the relationship among Covid tests, hospitalizations, and other factors but found that such data was being reported inconsistently by states and the federal government. The most reliable figures appeared to be the daily death counts. “Other models used more data sources, but I decided to rely on past deaths to predict future deaths,” Gu says. “Having that as the only input helped filter the signal from the noise.”

The novel, sophisticated twist of Gu’s model came from his use of machine learning algorithms to hone his figures.

Here is the full Bloomberg piece by Ashlee Vance; I am especially pleased because Youyang was an Emergent Ventures winner.  Here is Youyang Gu on Twitter.

Diversity in policing

In the wake of high-profile police shootings of Black Americans, it is important to know whether the race and gender of officers and civilians affect their interactions. Ba et al. overcame previous data constraints and found that Hispanic and Black officers make far fewer stops and arrests and use force less than white officers, especially against Black civilians. These differences are largest in majority-Black neighborhoods in the city of Chicago (see the Perspective by Goff). Female officers also use less force than male officers. These effects are supportive of the efficacy of increasing diversity in police forces.

That is a new paper in Science by Bocar A. Ba, Dean Knox, Jonathan Mummolo, and Roman Rivera.  Via Anecdotal.

Half Doses of Moderna Produce Neutralizing Antibodies

A new phase II study from Moderna shows that half-doses (50 μg) appear to be as good as full doses (100 μg) at generating correlates of protection such as neutralizing antibodies.

In this randomized, controlled phase 2 trial, the SARS-CoV-2 vaccine candidate mRNA-1273, administered as a two-dose vaccination regimen at 50 and 100 μg, exhibited robust immune responses and an acceptable safety profile in healthy adults aged 18 years and older. Local and systemic adverse reactions were mostly mild-to-moderate in severity, were ≤4 days of median duration and were less commonly reported in older compared with younger adults. Anti-SARS-CoV-2 spike binding and neutralizing antibodies were induced by both doses of mRNA-1273 within 28 days after the first vaccination, and rose substantially to peak titers by 14 days after the second vaccination, exceeding levels of convalescent sera from COVID-19 patients. The antibodies remained elevated through the last time point assessed at 57 days. Neutralizing responses met criteria for seroconversion within 28 days after the first vaccination in the majority of participants, with rates of 100% observed at 14 and 28 days after the second vaccination. While no formal statistical testing was done, binding and neutralizing antibody responses were generally comparable in participants who received the 100 μg mRNA-1273 and the 50 μg dose at all time points and across both age groups. Overall, the results of this randomized, placebo-controlled trial extend previous immunogenicity and safety results for mRNA-1273 in the phase 1 study in an expanded cohort including participants older than 55 years of age [16, 19].

[These data] confirm that a robust immune response is generated at both 50 and 100 μg dose levels.

As I wrote earlier, halving the dose is equivalent to instantly doubling the output of every Moderna factory.
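The arithmetic behind that claim can be sketched in a few lines. This is only an illustration: the supply figure is hypothetical, and the sketch assumes the active ingredient is the binding constraint on output (fill-and-finish, vials, and logistics are ignored).

```python
def doses_from_supply(total_ug: float, dose_ug: float) -> float:
    """Number of doses that a fixed quantity of vaccine material yields."""
    if dose_ug <= 0:
        raise ValueError("dose_ug must be positive")
    return total_ug / dose_ug


# Hypothetical supply, chosen only to make the arithmetic visible.
supply_ug = 1_000_000.0

full_doses = doses_from_supply(supply_ug, 100.0)  # full 100 μg dose
half_doses = doses_from_supply(supply_ug, 50.0)   # half 50 μg dose

# Ratio of half-dose output to full-dose output from the same supply.
output_ratio = half_doses / full_doses
print(full_doses, half_doses, output_ratio)
```

Whatever the actual supply is, the ratio is the same, since the supply cancels out of the division.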

See my piece in the Washington Post on getting to V-day sooner for an overview of dose stretching strategies.

Addendum: France says one dose is sufficient for previously COVID infected.

What are the most important statistical ideas of the past 50 years?

We argue that the most important statistical ideas of the past half century are: counterfactual causal inference, bootstrapping and simulation-based inference, overparameterized models and regularization, multilevel models, generic computation algorithms, adaptive decision analysis, robust inference, and exploratory data analysis. We discuss common features of these ideas, how they relate to modern computing and big data, and how they might be developed and extended in future decades. The goal of this article is to provoke thought and discussion regarding the larger themes of research in statistics and data science.

By Andrew Gelman and Aki Vehtari, via Lampros Tzontsos.

Monopsony it ain’t, rather output loss

We use highly consistent national-coverage price and wage data to provide evidence on wage increases, labor-saving technology introduction, and price pass-through by a large low-wage employer facing minimum wage hikes. Based on 2016-2020 hourly wage rates of McDonald’s Basic Crew and prices of the Big Mac sandwich collected simultaneously from almost all US McDonald’s restaurants, we find that in about 25% of instances of minimum wage increases, restaurants display a tendency to keep constant their wage ‘premium’ above the increasing minimum wage. Higher minimum wages are not associated with faster adoption of touch-screen ordering, and there is near-full price pass-through of minimum wages, with little heterogeneity related to how binding minimum wage increases are for restaurants. Minimum wage hikes lead to increases in real wages (expressed in Big Macs an hour of Basic Crew work can buy) that are one fifth lower than the corresponding increases in nominal wages.

That is a new paper from Orley Ashenfelter and Štěpán Jurajda.  I will ask again my standard question: don’t we have ways of helping poorer individuals that boost output rather than harming it?  Why don’t we focus on those?

Fortunately roadblocks are arising.

Via Ilya Novak.
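To make the abstract's real-wage measure concrete, here is a minimal worked example. All wage and price figures are hypothetical, and it reads "one fifth lower" as real-wage growth equal to 0.8 times nominal-wage growth, which is my interpretation rather than the authors' stated formula.

```python
def pct_change(before: float, after: float) -> float:
    """Proportional change from before to after."""
    return after / before - 1.0


def big_macs_per_hour(hourly_wage: float, big_mac_price: float) -> float:
    """Real wage expressed as Big Macs one hour of work can buy."""
    return hourly_wage / big_mac_price


# Hypothetical numbers for illustration only.
wage_before, wage_after = 10.00, 11.00
price_before = 5.00

nominal_growth = pct_change(wage_before, wage_after)

# Assumed reading of the abstract: real growth is one fifth lower than nominal.
real_growth_target = 0.8 * nominal_growth

# Big Mac price increase that would be consistent with that real growth.
price_after = price_before * (1 + nominal_growth) / (1 + real_growth_target)

real_growth = pct_change(
    big_macs_per_hour(wage_before, price_before),
    big_macs_per_hour(wage_after, price_after),
)

print(f"nominal wage growth: {nominal_growth:.2%}")
print(f"implied price growth: {pct_change(price_before, price_after):.2%}")
print(f"real wage growth: {real_growth:.2%}")
```

Under these assumptions a 10 percent nominal raise paired with a price increase of a little under 2 percent leaves workers with roughly an 8 percent gain in Big Macs per hour.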

The effects of fluoride in drinking water

Water fluoridation is a common but debated public policy. In this paper, we use Swedish registry data to study the causal effects of fluoride in drinking water. We exploit exogenous variation in natural fluoride stemming from variation in geological characteristics at water sources to identify its effects. First, we reconfirm the long-established positive effect of fluoride on dental health. Second, we estimate a zero effect on cognitive ability in contrast to several recent debated epidemiological studies. Third, fluoride is furthermore found to increase labor income. This effect is foremost driven by individuals from a lower socioeconomic background.

That is from a forthcoming JPE paper by Linuz Aggeborn and Mattias Öhman.

Gender and the dynamics of economics seminars

This paper reports the results of the first systematic attempt at quantitatively measuring the seminar culture within economics and testing whether it is gender neutral. We collected data on every interaction between presenters and their audience in hundreds of research seminars and job market talks across most leading economics departments, as well as during summer conferences. We find that women presenters are treated differently than their male counterparts. Women are asked more questions during a seminar and the questions asked of women presenters are more likely to be patronizing or hostile. These effects are not due to women presenting in different fields, different seminar series, or different topics, as our analysis controls for the institution, seminar series, and JEL codes associated with each presentation. Moreover, it appears that there are important differences by field and that these differences are not uniformly mitigated by more rigid seminar formats. Our findings add to an emerging literature documenting ways in which women economists are treated differently than men, and suggest yet another potential explanation for their under-representation at senior levels within the economics profession.

That is from a new paper by Pascaline Dupas, Alicia Sasser Modestino, Muriel Niederle, Justin Wolfers, and the Seminar Dynamics Collective.  Via Kris Gulati.

Why Does Teacher Quality Matter?

From Mike Insler, Alexander F. McQuoid, Ahmed Rahman, and Katherine A. Smith, here is an apparently major result:

This work disentangles aspects of teacher quality that impact student learning and performance. We exploit detailed data from post-secondary education that links students from randomly assigned instructors in introductory-level courses to the students’ performances in follow-on courses for a wide variety of subjects. For a range of first-semester courses, we have both an objective score (based on common exams graded by committee) and a subjective grade provided by the instructor. We find that instructors who help boost the common final exam scores of their students also boost their performance in the follow-on course. Instructors who tend to give out easier subjective grades however dramatically hurt subsequent student performance. Exploring a variety of mechanisms, we suggest that instructors harm students not by “teaching to the test,” but rather by producing misleading signals regarding the difficulty of the subject and the “soft skills” needed for college success. This effect is stronger in non-STEM fields, among female students, and among extroverted students. Faculty that are well-liked by students—and thus likely prized by university administrators—and considered to be easy have particularly pernicious effects on subsequent student performance.

Via the excellent Kevin Lewis.

Short selling and the price discovery process

We show that stock prices are more accurate when short sellers are more active. First, in a large panel of NYSE-listed stocks, intraday informational efficiency of prices improves with greater shorting flow. Second, at monthly and annual horizons, more shorting flow accelerates the incorporation of public information into prices. Third, greater shorting flow reduces post-earnings-announcement drift for negative earnings surprises. Fourth, short sellers change their trading around extreme return events in a way that aids price discovery and reduces divergence from fundamental values. These results are robust to various econometric specifications, and their magnitude is economically meaningful.

That is from 2013 research by Boehmer and Wu.  Next, beware of any paper doing cross-sectional comparisons across 111 countries, but here is how the resulting correlations go (link):

We find that when short-selling is possible, aggregate stock returns are less volatile and there is greater liquidity. When countries start to permit short-selling, aggregate stock price increases, implying a lower cost of capital. There is no evidence that short-sale restrictions affect either the level of skewness of returns or the probability of a market crash. Collectively, our empirical evidence suggests that allowing short-selling enhances market quality.

While I would not draw very firm conclusions from that, it is not going to help the case against short-selling.  Here is a general literature survey from 2020.  Lots is murky, but again the evidence is not supporting the often rather polemic critics.  A very general point is that short selling is less different from “plain selling” than you might think, all the more so if you consider dynamic portfolio strategies (which can replicate just about any underlying desired net position).

Furthermore, these days there is more choice than ever before.  If the asset you have in mind is somehow too fragile to withstand short-selling pressure, but is valuable nonetheless, staying private never has been easier and with some degree of liquidity to boot.

Why U.S. Immigration Barriers Matter for the Global Advancement of Science

From Ruchir Agarwal, Ina Ganguli, Patrick Gaule, and Geoff Smith:

This paper studies the impact of U.S. immigration barriers on global knowledge production. We present four key findings. First, among Nobel Prize winners and Fields Medalists, migrants to the U.S. play a central role in the global knowledge network— representing 20-33% of the frontier knowledge producers. Second, using novel survey data and hand-curated life-histories of International Math Olympiad (IMO) medalists, we show that migrants to the U.S. are up to six times more productive than migrants to other countries—even after accounting for talent during one’s teenage years. Third, financing costs are a key factor preventing foreign talent from migrating abroad to pursue their dream careers, particularly talent from developing countries. Fourth, certain ‘push’ incentives that reduce immigration barriers – by addressing financing constraints for top foreign talent – could increase the global scientific output of future cohorts by 42 percent. We conclude by discussing policy options for the U.S. and the global scientific community.

Via the excellent Kevin Lewis.

Money is good, at more margins than you might have thought

Past research has found that experienced well-being does not increase above incomes of $75,000/y. This finding has been the focus of substantial attention from researchers and the general public, yet is based on a dataset with a measure of experienced well-being that may or may not be indicative of actual emotional experience (retrospective, dichotomous reports). Here, over one million real-time reports of experienced well-being from a large US sample show evidence that experienced well-being rises linearly with log income, with an equally steep slope above $80,000 as below it. This suggests that higher incomes may still have potential to improve people’s day-to-day well-being, rather than having already reached a plateau for many people in wealthy countries.

That is from a new paper by Matthew A. Killingsworth.  Via the excellent Kevin Lewis.
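What "linear in log income" implies can be sketched numerically: each doubling of income adds the same increment of well-being, whether the starting point is below or above $80,000. The intercept and slope below are hypothetical placeholders, not estimates from the paper.

```python
import math

# Hypothetical coefficients for illustration; not taken from the paper.
INTERCEPT = 0.0
SLOPE = 0.1


def well_being(income: float) -> float:
    """Well-being modeled as a linear function of log income."""
    return INTERCEPT + SLOPE * math.log(income)


# Gain from doubling income below the $80,000 mark...
gain_low = well_being(80_000) - well_being(40_000)
# ...and from doubling it above that mark.
gain_high = well_being(160_000) - well_being(80_000)

print(gain_low, gain_high)
```

Both gains equal the slope times log 2, so under this functional form there is no plateau, although each extra dollar matters less as income rises.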

U.S.A. fact of the day

The U.S. also experienced its most violent year in decades with an unprecedented rise in homicides. The Gun Violence Archive reported that more than 19,000 people died in shootings or firearm-related incidents in 2020, the highest figure in over two decades.

New Orleans-based crime analyst Jeff Asher took a closer look at the number of murders in 57 major American cities and he found that the number of offenses grew in 51 of them. He only focused on agencies where data was available and most of them had figures through November or December of 2020. Growth in violent crime varied by city with Seattle seeing a 74 percent spike in homicides between 2019 and 2020 while Chicago and Boston saw their offenses grow 55.5 percent and 54 percent, respectively. Elsewhere, Washington D.C. and Las Vegas saw growth in their murder offences, albeit at a slower pace of less than 20 percent.

New York’s homicide count went up by nearly 40 percent with Mayor Bill de Blasio stating that the figures should worry all New Yorkers and it has to stop.

Here is the full link.  Via Noah.

Our aggregate demand shortfall?

From Ben Casselman: “…unlike with most measures of the economy, retail sales are actually ABOVE their prepandemic level. Up 2.6% from February, and 2.9% over the past year. So not a clean story like with jobs.”

And from Larry Summers: “Total household income is 8% above what anybody thought it would be before Covid.”

There are very real macroeconomic problems right now, but please keep the following in mind while drawing up a “demand-based” stimulus plan.  Focus on public health!

Ireland fact of the day, coming soon to a state near you

Some experts estimate this could mean, if we do not accelerate the pace of vaccination, one million deaths for the United States.

Who Runs the AEA?

That is a new paper by Kevin D. Hoover and Andrej Svorenčík:

The leadership structure of the American Economic Association is documented using a biographical database covering every officer and losing candidate for AEA offices from 1950 to 2019. The analysis focuses on institutional affiliations by education and employment. The structure is strongly hierarchical. A few institutions dominate the leadership, and their dominance has become markedly stronger over time. Broadly two types of explanations are explored: that institutional dominance is based on academic merit or that it is based on self-perpetuating privilege. Network effects that might explain the dynamic of increasing concentration are also investigated.

I wonder how the AEA budget will hold up now that interviews can be done by Zoom and meeting attendance is not required.

Via the excellent Kevin Lewis.