Model this newsroom estimator

The New York Times’s performance review system has for years given significantly lower ratings to employees of color, an analysis by Times journalists in the NewsGuild shows.

The analysis, which relied on data provided by the company on performance ratings for all Guild-represented employees, found that in 2021, being Hispanic reduced the odds of receiving a high score by about 60 percent, and being Black cut the chances of high scores by nearly 50 percent. Asians were also less likely than white employees to get high scores.

In 2020, zero Black employees received the highest rating, while white employees accounted for more than 90 percent of the roughly 50 people who received the top score.

The disparities have been statistically significant in every year for which the company provided data, according to the journalists’ study, which was reviewed by several leading academic economists and statisticians, as well as performance evaluation experts.

…Management has denied the discrepancies in the performance ratings for nearly two years…

And from the economists:

Multiple outside experts consulted by the reporters consistently said the methodology used in the Guild’s most recent analysis was reasonable and appropriate and that the approach used by the company appeared either flawed or incomplete. Some went further, suggesting the company’s approach seemed tailor-made to avoid detecting any evidence of bias.

Rachael Meager, an economist at the London School of Economics, was blunt: “LMAO, that’s so dumb,” she wrote when Guild journalists described the company’s methodology to her. “That’s what you would do if you want to obliterate signal,” she added, using a word that in economics refers to meaningful information.

“This is so stupid as to border on negligence,” added Dr. Meager, who has published papers on evaluating statistical evidence in leading economics journals.

Peter Hull, a Brown University economist who has studied statistical techniques for detecting racial bias, also questioned the company’s approach and recommended a way to test it: running simulations in which bias was intentionally added. The company’s method repeatedly failed to detect racial disparities in those tests.

Here is the full article, prepared by the NYT Guild Equity Committee, including Ben Casselman.  Of course we now live in a world where very few people will be surprised by this.  Where exactly does the moral authority lie here for making editorial judgments about content concerning race?
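Hull's suggested check, quoted above, is easy to sketch: inject a known penalty into simulated ratings and count how often an estimator detects it. Everything below is hypothetical and is not the Guild's or the company's actual method: the group sizes, the 30 percent base rate, the injected odds ratio of 0.4 (echoing the roughly 60 percent reduction in odds quoted above), and the simple two-group odds-ratio test are all assumptions made for illustration.

```python
import math
import random


def simulate_high_scores(n, base_rate, odds_ratio, rng):
    """Count high scores among n people whose odds of a high score
    equal the baseline odds multiplied by odds_ratio."""
    base_odds = base_rate / (1 - base_rate)
    odds = base_odds * odds_ratio
    p = odds / (1 + odds)
    return sum(rng.random() < p for _ in range(n))


def log_odds_ratio_z(high_a, n_a, high_b, n_b):
    """z-statistic for the log odds ratio of group B relative to group A.
    Adds 0.5 to every cell so that an empty cell cannot cause a division by zero."""
    a, b = high_a + 0.5, n_a - high_a + 0.5
    c, d = high_b + 0.5, n_b - high_b + 0.5
    log_or = math.log((c / d) / (a / b))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or / se


def detection_rate(injected_odds_ratio, trials=500, seed=0):
    """Share of simulated datasets in which group B looks significantly
    worse off than group A (one-sided test, z below -1.96)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Hypothetical sizes: 1,000 people in group A, 300 in group B.
        high_a = simulate_high_scores(1000, 0.30, 1.0, rng)
        high_b = simulate_high_scores(300, 0.30, injected_odds_ratio, rng)
        if log_odds_ratio_z(high_a, 1000, high_b, 300) < -1.96:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("bias injected:", detection_rate(0.4))
    print("no bias:      ", detection_rate(1.0))
```

A reasonable estimator should flag the injected bias in nearly every run and stay quiet when none is injected. The point of Hull's test is that a method which fails the first condition cannot be trusted when it reports finding nothing.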

Is The Army racially egalitarian? (model this)

This paper links the universe of Army applicants between 1990 and 2011 to their federal tax records and other administrative data and uses two eligibility thresholds in the Armed Forces Qualification Test (AFQT) in a regression discontinuity design to estimate the effects of Army enlistment on earnings and related outcomes. In the 19 years following application, Army service increases average annual earnings by over $4,000 at both cutoffs. However, whether service increases long-run earnings varies significantly by race. Black servicemembers experience annual gains of $5,500 to $15,000 11–19 years after applying while White servicemembers do not experience significant changes. By providing Black servicemembers a stable and well-paying Army job and by opening doors to higher-paid postservice employment, the Army significantly closes the Black-White earnings gap in our sample.

Here is the full paper by Kyle Greenberg, via the excellent Kevin Lewis.
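For readers who want the mechanics, here is a minimal sketch of a regression discontinuity estimate on simulated data. It is not the paper's specification. The cutoff of 50, the $4,000 jump, the noise level, and the bandwidth are all hypothetical, and the design is "sharp" (everyone above the cutoff is treated), which is a simplification: if crossing a threshold only raises the probability of enlisting, the jump in earnings would have to be rescaled by the jump in enlistment.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Sharp RD: fit a separate line on each side of the cutoff, using only
    observations within `bandwidth` of it, and take the gap between the two
    fitted values at the cutoff itself."""
    pairs = list(zip(scores, outcomes))
    # Recenter scores so each intercept is the fitted value at the cutoff.
    left = [(s - cutoff, y) for s, y in pairs if cutoff - bandwidth <= s < cutoff]
    right = [(s - cutoff, y) for s, y in pairs if cutoff <= s <= cutoff + bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Simulated applicants; every number here is made up for illustration.
rng = random.Random(0)
CUTOFF = 50
TRUE_JUMP = 4000.0
scores = [rng.uniform(1, 99) for _ in range(20000)]
earnings = [
    20000 + 150 * s + (TRUE_JUMP if s >= CUTOFF else 0.0) + rng.gauss(0, 3000)
    for s in scores
]

estimate = rd_estimate(scores, earnings, CUTOFF, bandwidth=10)
print(f"estimated jump at the cutoff: {estimate:,.0f}")
```

The identifying idea is that applicants just below and just above a cutoff should be similar in everything except eligibility, so a jump in later earnings at that point can be attributed to service rather than to who chooses to apply.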

Sentences to ponder model this

Consistent with beauty-blind admissions, alumni’s beauty is uncorrelated with the rank of the school they attended in China. In the US, White men who attended high-ranked schools are better looking, especially attendees of private schools. A one percentage point increase in beauty rank corresponds to a half-point increase in the school rank.

Here is more, via the excellent Kevin Lewis.

Model this Afghanistan policy

But in both the sanctions and the seizures, you can see an almost Kafka-esque madness in the American position. They are expending all this effort to ameliorate the consequences of a sanctions regime they are implementing. They are desperately brokering deals to preserve foreign reserves that they are freezing. When I ask why they continue to impose these policies at all, the administration says that the Taliban has American prisoners, that it is a brutal regime that murders opponents and represses women, that it has links to terrorists, and that our sanctions grant us much-needed leverage.

Here is more from Ezra Klein (NYT) on the debacle of starvation unfolding in Afghanistan.

Model this and who are the real liberals anyway?

– Fifty-nine percent (59%) of Democratic voters would favor a government policy requiring that citizens remain confined to their homes at all times, except for emergencies, if they refuse to get a COVID-19 vaccine. Such a proposal is opposed by 61% of all likely voters, including 79% of Republicans and 71% of unaffiliated voters.

– Nearly half (48%) of Democratic voters think federal and state governments should be able to fine or imprison individuals who publicly question the efficacy of the existing COVID-19 vaccines on social media, television, radio, or in online or digital publications. Only 27% of all voters – including just 14% of Republicans and 18% of unaffiliated voters – favor criminal punishment of vaccine critics.

– Forty-five percent (45%) of Democrats would favor governments requiring citizens to temporarily live in designated facilities or locations if they refuse to get a COVID-19 vaccine. Such a policy would be opposed by a strong majority (71%) of all voters, with 78% of Republicans and 64% of unaffiliated voters saying they would Strongly Oppose putting the unvaccinated in “designated facilities.”

That is from a Rasmussen poll.  You might consider Rasmussen a right-leaning institution, but these kinds of results should not be possible even in somewhat slanted polls (methodology here).  Furthermore, this poll came out January 13, and it hasn’t exactly received a ton of attention from mainstream media.  Can you model that too?  Wouldn’t it be awful even if this poll were off by 2x?

One lesson is that it is not always good for your party if it is on the winning side of the culture wars.

Model this what is wrong with physicians?

Compared to differences among their male patient counterparts, female patients randomly assigned a female doctor rather than a male doctor are 5.0% more likely to be evaluated as disabled and receive 8.5% more subsequent cash benefits on average. There is no analogous gender-match effect for male patients.

And is it the male or female physicians who are at fault here?  Or is this diagnostic differential somehow optimal?

Here is the full NBER paper by Marika Cabral and Marcus Dillender.

Model this Apple pricing decision

Apple has one new product that’s already so back-ordered it won’t arrive in time for Christmas. It’s a polishing cloth. Priced at $19.

Unveiled in October after Apple showed off its new line of gadgets, the soft, light gray square is made of “nonabrasive material” and embossed with Apple’s logo. During tests, the rag worked like other microfiber cloths that list for less than half that price. So…why $19?

As it happens, Apple’s pricing strategy rarely allows accessories to fall below that threshold. The 6.3-inch swatch of fabric sits beside 17 other Apple-branded items on the company’s website—a mélange of charging cables, dongles and adapters—each priced at $19. Some, such as the wired earbuds and charging adapter, were once included with new iPhones.

Those $19 Apple items—together with the Apple Watch, AirPods and other small gadgets—are part of the company’s growing Wearables, Home and Accessories category, which had more than $8 billion in revenue in the quarter that ended in October.

Almost every Apple price ends in the number “9.”  Would it matter if we all carried around $30 bills?  There is further discussion in this Dalvin Brown WSJ piece.

Via the excellent Samir Varma.

Model this art market illiquidity what would Hayek say?

She [an artist] tries to do a show a year, one every three years at each of the three galleries.  The idea, she explained, is for your prices not to have a sudden rise, precisely because they can crash, but rather for your dealers to increase them slowly as your work receives exposure through venues like group shows, exhibitions, and biennials.  Auctions can be dangerous for just that reason.  At the time we spoke, Wilcox’s works on paper (19″ x 24″) were selling for around $6,000; her largest paintings (12′ x 6′), for $45,000.  Dealers take 50 percent.  Prices are based on size, not judgments of quality, because you don’t want to influence buyers’ opinions.  Smaller works are cheaper, but more expensive per square inch (kind of like real estate).  Large paintings are easier to sell in Los Angeles than London or New York, because the houses are bigger.

That is an excerpt from William Deresiewicz, The Death of the Artist: How Creators are Struggling to Survive in the Age of Billionaires and Big Tech, an excellent book (ignore the subtitle).
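The per-square-inch claim can be checked directly from the figures in the excerpt. The sketch below uses only the sizes, prices, and 50 percent dealer cut quoted above; the function and variable names are mine.

```python
def price_per_square_inch(price, width_in, height_in):
    """Price divided by area, with both dimensions in inches."""
    return price / (width_in * height_in)


DEALER_CUT = 0.5  # "Dealers take 50 percent."

# Works on paper: 19" x 24" at about $6,000.
small = price_per_square_inch(6_000, 19, 24)
# Largest paintings: 12' x 6', converted to inches, at $45,000.
large = price_per_square_inch(45_000, 12 * 12, 6 * 12)

print(f"works on paper:  ${small:.2f} per square inch")
print(f"large paintings: ${large:.2f} per square inch")
print(f"artist's take:   ${6_000 * (1 - DEALER_CUT):,.0f} and ${45_000 * (1 - DEALER_CUT):,.0f}")
```

So the small works run about three times the price per square inch of the large ones, roughly $13 against $4.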

Model this NBA coaches and Adam Smith

Of all the head coaches in the NBA in either 2019-20 or 2020-21, there was a combined one All-Star appearance as a player, by Doc Rivers. At the league’s high point of former players as coaches, in 2001-02, there were 13 different former All-Stars walking the sidelines who had combined for 60 appearances.

Here is more from Kevin Pelton at ESPN.  Is it that data analytics matter so much more?  A general increase in the division of labor?  Or are today’s stars so prominent, perhaps because of social media, that a team does not need recognizable coaches to bring in the fans?

The Pelton post considers further issues in mechanism design, such as whether a single free throw should be used to determine both points late in NBA games.  I would think that leads to an overinvestment in fouling from teams that are behind?
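One way to see the worry is to compare the points distribution from a trip to the line under each rule. This is a sketch under strong assumptions: the shooter makes each attempt independently with the same probability, and the 75 percent figure is hypothetical rather than taken from the Pelton piece.

```python
def two_shot_distribution(p):
    """Current rule: two independent attempts worth one point each."""
    return {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}


def single_shot_distribution(p):
    """Proposed rule: one attempt decides both points."""
    return {0: 1 - p, 2: p}


def mean_and_variance(dist):
    """Mean and variance of points for a {points: probability} mapping."""
    mean = sum(points * prob for points, prob in dist.items())
    variance = sum((points - mean) ** 2 * prob for points, prob in dist.items())
    return mean, variance


p = 0.75  # hypothetical free-throw percentage
for name, dist in [
    ("two shots", two_shot_distribution(p)),
    ("one shot for two", single_shot_distribution(p)),
]:
    mean, variance = mean_and_variance(dist)
    print(f"{name}: mean {mean:.2f}, variance {variance:.3f}, P(zero points) {dist[0]:.4f}")
```

Expected points are the same under both rules, but the variance doubles and the chance that the fouled team comes away with nothing rises from (1 - p)² to 1 - p. A trailing team wants variance, so under these assumptions fouling becomes a more attractive gamble.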

Medical ethics? (model this)

Steven Joffe, MD, MPH, a medical ethicist at the University of Pennsylvania, said he doesn’t believe clinicians “should be lowering our standards of evidence because we’re in a pandemic.”

Link here.  That sentence is a good litmus test for whether you think clearly about trade-offs, statistical and speed trade-offs included, procedures vs. final ends of value (e.g., human lives), and how obsessed you are with mood affiliation (can you see through his question-begging invocation of “lowering our standards”?).  It is stunning to me that a top researcher at an Ivy League school literally cannot think properly about his subject area at all, and furthermore has no compunction admitting this publicly.  As Alex wrote just earlier today: “Waiting for more data isn’t “science,” it’s sometimes an excuse for an unscientific status-quo bias.”

To be clear, we should run more and better RCTs of Ivermectin, the topic at hand for Joffe (and in fact Fast Grants is helping to fund exactly that).  But of course the “let’s go ahead and actually do this” decision should be different in a pandemic, just as the “just how much of a hurry are we in here anyway?” calculus should differ as well.  I do not know enough to judge whether Ivermectin should be in hospital treatment protocols, as it is in many countries, but I do not condemn this simply on the grounds of it representing a “lower standard.”  It might instead reflect a “higher standard” of concern for human lives, and you will note the drug is not considered harmful as it is being administered.

If you apply the standards of Joffe’s earlier work, we should not be proceeding with these RCTs, including presumably vaccine RCTs, until we have assured that all of the participants truly understand the difference between “research” and “treatment” as part of the informed consent protocols.  No “therapeutic misconception” should be allowed.  Really?

If the pandemic has changed my mind about anything, it is the nature of expertise.

Model this

Nancy Pelosi warned that a Covid-19 vaccine should not be authorised for use in the US based on data from British trials, amid fears that the Trump administration is planning to rush out an inoculation before election day.

The Democratic speaker of the House of Representatives on Friday cast doubt on the British system for testing and approving medicines, further politicising the race to develop a vaccine for Covid-19.

“We need to be very careful about what happens in the UK. We have very stringent rules in terms of the Food and Drug Administration here, about the number of clinical trials, the timing, the number of people and all the rest,” Ms Pelosi told reporters in Washington.

Here is the full FT story, and here is a nice NYT piece, by Zeke Emanuel and others, on the superiority of the British clinical trials system, especially with respect to Covid-19.

Model this New York City police force

Last night, and some previous nights, many storefronts in Manhattan were trashed, there was looting in Soho, or how about this description from Rachel Olding at The Daily Beast?:

Hard to describe how rampant the looting was tonight in Midtown Manhattan and how lawless it was. Complete anarchy. Literally hundreds of stores up and down Broadway, Fifth Ave, Sixth Ave. Kids ruling the streets like it was a party.

Now, those are among the most visible and “high value” spots in the whole city and the NYPD has over 38,000 police to draw upon.  So what is the best model of why all that trouble happened and indeed was allowed to happen?  I see a few candidates:

1. Those police are not sufficiently well trained.

2. Those police are trained but they are afraid of confronting protestors and so they don’t do it.

3. The mayor de facto doesn’t want the police to be too involved, as that might be unpopular with swing voters in the primaries or even the general election.

4. The police union insists, de facto, that not many police be sent directly into such confrontations.

5. There is a general lack of accountability, and so there is failure at multiple levels, and so many good things simply do not happen, but for reasons which are not always entirely concrete.

6. The police do not have the right technology to handle these kinds of problems.

Which is it, and which other hypotheses am I neglecting?

As a more general observation, if this problem cannot be solved, complaining about Trump holding the Bible and the tear gas on the way to the church ultimately will fall upon deaf ears.  Ultimately the American public are not going to side against “the thin blue line” (i.e., the police), so to win all those important civil liberties victories you also need the police doing the proper job effectively. Maybe I picked the wrong Google terms but “why didn’t New York police stop rioters” does not in fact yield anything substantive on the question I am asking.  How can that be?  While you’re at it, model that too!

Addendum: One reader hypothesis is that the police are sending a signal to the mayor for criticizing them. Another is here: “Similar to Baltimore, the police in Minneapolis will make it clear that looting and widespread private property destruction will be tolerated for the remainder of the protests as a way to conflate protesters and looters and “teach a lesson to” their liberal civilian bosses