The economics of Twitter moderation

Social media platforms ban users and remove posts to moderate their content. This “speech policing” remains controversial because little is known about its consequences and the costs and benefits for different individuals. I conduct two field experiments on Twitter to examine the effect of moderating hate speech on user behavior and welfare. Randomly reporting posts for violating the rules against hateful conduct increases the likelihood that Twitter removes them. Reporting does not affect the activity on the platform of the posts’ authors or their likelihood of reposting hate, but it does increase the activity of those attacked by the posts. These results are consistent with a model in which content moderation is a quality decision for platforms that increases user engagement and hence advertising revenue. The second experiment shows that changing users’ perceived content removal does not change their willingness to pause using social media, a measure of consumer surplus. My results imply that content moderation does not necessarily moderate users, but it marginally increases advertising revenue. It can be consistent with both profit- and welfare-maximization if out-of-platform externalities are small.

That is from a new paper by Rafael Jiménez Durán, via the excellent Kevin Lewis.