In mid-April, while he was living with his parents in Santa Clara, Calif., Gu spent a week building his own Covid death predictor and a website to display the morbid information. Before long, his model started producing more accurate results than those cooked up by institutions with hundreds of millions of dollars in funding and decades of experience.
“His model was the only one that seemed sane,” says Jeremy Howard, a renowned data expert and research scientist at the University of San Francisco. “The other models were shown to be nonsense time and again, and yet there was no introspection from the people publishing the forecasts or the journalists reporting on them. People’s lives were depending on these things, and Youyang was the one person actually looking at the data and doing it properly.”
The forecasting model that Gu built was, in some ways, simple. He had first considered examining the relationship among Covid tests, hospitalizations, and other factors but found that such data was being reported inconsistently by states and the federal government. The most reliable figures appeared to be the daily death counts. “Other models used more data sources, but I decided to rely on past deaths to predict future deaths,” Gu says. “Having that as the only input helped filter the signal from the noise.”
The novel, sophisticated twist of Gu’s model came from his use of machine learning algorithms to hone his figures.