Category: Data Source

How to extract information from on-line reviews, or why Star Wars is still a thing

Online reviews promise to provide people with immediate access to the wisdom of the crowds. Yet, half of all reviews on Amazon and Yelp provide the most positive rating possible, despite human behaviour being substantially more varied in nature. We term the challenge of discerning success within this sea of positive ratings the ‘positivity problem’. Positivity, however, is only one facet of individuals’ opinions. We propose that one solution to the positivity problem lies with the emotionality of people’s opinions. Using computational linguistics, we predict the box office revenue of nearly 2,400 movies, sales of 1.6 million books, new brand followers across two years of Super Bowl commercials, and real-world reservations at over 1,000 restaurants. Whereas star ratings are an unreliable predictor of success, emotionality from the very same reviews offers a consistent diagnostic signal. More emotional language was associated with more subsequent success.

Here is more from Matthew D. Rocklage, Derek D. Rucker, and Loran F. Nordgren, via the excellent Kevin Lewis.

Facts about Nigerian-Americans

Second-generation black Americans have been inadequately studied in prior quantitative research. The authors seek to ameliorate this research gap by using the Current Population Survey to investigate education and wages among second-generation black Americans with a focus on Nigerian Americans. The latter group has been identified in some qualitative studies as having particularly notable socioeconomic attainments. The results indicate that the educational attainment of second-generation Nigerian Americans exceeds other second-generation black Americans, third- and higher generation African Americans, third- and higher generation whites, second-generation whites, and second-generation Asian Americans. Controlling for age, education, and disability, the wages of second-generation Nigerian Americans have reached parity with those of third- and higher generation whites. The educational attainment of other second-generation black Americans exceeds that of third- and higher generation African Americans but has reached parity with that of third- and higher generation whites only among women. These results indicate significant socioeconomic variation within the African American/black category by gender, ethnicity, and generational status that merits further research.

Here is the full paper by Sakamoto et al.

Further estimates on the cost of climate change and global warming

Sea level rise will cause spatial shifts in economic activity over the next 200 years. Using a spatially disaggregated, dynamic model of the world economy, this paper estimates the consequences of probabilistic projections of local sea level changes. Under an intermediate scenario of greenhouse gas emissions, permanent flooding is projected to reduce global real GDP by 0.19 percent in present value terms. By the year 2200, a projected 1.46 percent of the population will be displaced. Losses in coastal localities are much larger. When ignoring the dynamic response of investment and migration, the loss in real GDP in 2200 increases from 0.11 percent to 4.5 percent.

That newly published paper is from Klaus Desmet, Robert E. Kopp, Scott A. Kulp, Dávid Krisztián Nagy, Michael Oppenheimer, Esteban Rossi-Hansberg and Benjamin H. Strauss in American Economic Journal: Macroeconomics.  Am I wrong to feel a little…underwhelmed by those estimates?  Here is an earlier recent paper on other cost estimates.

The mortality of scholars

After recovering from a severe mortality crisis in the seventeenth century, life expectancy among scholars started to increase as early as in the eighteenth century, well before the Industrial Revolution. Our finding that members of scientific academies—an elite group among scholars—were the first to experience mortality improvements suggests that 300 years ago, individuals with higher social status already enjoyed lower mortality. We also show, however, that the onset of mortality improvements among scholars in medicine was delayed, possibly because these scholars were exposed to pathogens and did not have germ theory knowledge that might have protected them. The disadvantage among medical professionals decreased toward the end of the nineteenth century.

Here is more from Robert Stelter, David de la Croix, and Mikko Myrskylä.  Via the excellent Kevin Lewis.

The influence of hidden researcher decisions in applied microeconomics

Another one from the Department of Uh-Oh:

Researchers make hundreds of decisions about data collection, preparation, and analysis in their research. We use a many‐analysts approach to measure the extent and impact of these decisions. Two published causal empirical results are replicated by seven replicators each. We find large differences in data preparation and analysis decisions, many of which would not likely be reported in a publication. No two replicators reported the same sample size. Statistical significance varied across replications, and for one of the studies the effect’s sign varied as well. The standard deviation of estimates across replications was 3–4 times the mean reported standard error.

Here is the paper by numerous authors, via Scott Cunningham.

Are Americans getting worse?

Maybe so:

Morbidity and mortality have been increasing among middle-aged and young-old Americans since the turn of the century. We investigate whether these unfavorable trends extend to younger cohorts and their underlying physiological, psychological, and behavioral mechanisms. Applying generalized linear mixed effects models to 62,833 adults from the National Health and Nutrition Examination Surveys (1988-2016) and 625,221 adults from the National Health Interview Surveys (1997-2018), we find that for all gender and racial groups, physiological dysregulation has increased continuously from Baby Boomers through late-Gen X and Gen Y. The magnitude of the increase is higher for White men than other groups, while Black men have a steepest increase in low urinary albumin (a marker of chronic inflammation). In addition, Whites undergo distinctive increases in anxiety, depression, and heavy drinking, and have a higher level than Blacks and Hispanics of smoking and drug use in recent cohorts. Smoking is not responsible for the increasing physiological dysregulation across cohorts. The obesity epidemic contributes to the increase in metabolic syndrome, but not in low urinary albumin. The worsening physiological and mental health profiles among younger generations imply a challenging morbidity and mortality prospect for the United States, one that may be particularly inauspicious for Whites.

Here is the full article, via an excellent loyal MR reader.

Testing Todd

Emmanuel Todd, that is.  Here is a recent paper from Jerg Gutmann and Stefan Voigt:

Many years ago, Emmanuel Todd came up with a classification of family types and argued that the historically prevalent family types in a society have important consequences for its economic, political, and social development. Here, we evaluate Todd’s most important predictions empirically. Relying on a parsimonious model with exogenous covariates, we find mixed results. On the one hand, authoritarian family types are, in stark contrast to Todd’s predictions, associated with increased levels of the rule of law and innovation. On the other hand, and in line with Todd’s expectations, communitarian family types are linked to racism, low levels of the rule of law, and late industrialization. Countries in which endogamy is frequently practiced also display an expectedly high level of state fragility and weak civil society organizations.

Via the excellent Kevin Lewis.

Socioeconomic roots of academic faculty

Using a survey of 7218 professors in PhD-granting departments in the United States across eight disciplines in STEM, social sciences, and the humanities, we find that the estimated median childhood household income among faculty is 23.7% higher than the general public, and faculty are 25 times more likely to have a parent with a PhD. Moreover, the proportion of faculty with PhD parents nearly doubles at more prestigious universities and is stable across the past 50 years.

Here is the full paper, via all over Twitter.

On the GDP-Temperature relationship and its relevance for climate damages

I have worried about related issues for some while, and now that someone has done the hard work I find the results disturbing and possibly significant:

Econometric models of temperature impacts on GDP are increasingly used to inform global warming damage assessments. But theory does not prescribe estimable forms of this relationship. By estimating 800 plausible specifications of the temperature-GDP relationship, we demonstrate that a wide variety of models are statistically indistinguishable in their out-of-sample performance, including models that exclude any temperature effect. This full set of models, however, implies a wide range of climate change impacts by 2100, yielding considerable model uncertainty. The uncertainty is greatest for models that specify effects of temperature on GDP growth that accumulate over time; the 95% confidence interval that accounts for both sampling and model uncertainty across the best-performing models ranges from 84% GDP losses to 359% gains. Models of GDP levels effects yield a much narrower distribution of GDP impacts centered around 1–3% losses, consistent with damage functions of major integrated assessment models. Further, models that incorporate lagged temperature effects are indicative of impacts on GDP levels rather than GDP growth. We identify statistically significant marginal effects of temperature on poor country GDP and agricultural production, but not rich country GDP, non-agricultural production, or GDP growth.

That is from Richard G Newell, Brian C. Prest, and Steven E. Sexton.  Via the excellent Kevin Lewis.

When Did Growth Begin?

The subtitle of the paper is “New Estimates of Productivity Growth in England from 1250 to 1870” and it is by Paul Bouscasse, Emi Nakamura, and Jon Steinsson:

We provide new estimates of the evolution of productivity in England from 1250 to 1870. Real wages over this period were heavily influenced by plague-induced swings in the population. We develop and implement a new methodology for estimating productivity that accounts for these Malthusian dynamics. In the early part of our sample, we find that productivity growth was zero. Productivity growth began in 1600—almost a century before the Glorious Revolution. Post-1600 productivity growth had two phases: an initial phase of modest growth of 4% per decade between 1600 and 1810, followed by a rapid acceleration at the time of the Industrial Revolution to 18% per decade. Our evidence helps distinguish between theories of why growth began. In particular, our findings support the idea that broad-based economic change preceded the bourgeois institutional reforms of 17th century England and may have contributed to causing them. We also estimate the strength of Malthusian population forces on real wages. We find that these forces were sufficiently weak to be easily overwhelmed by post-1800 productivity growth.

Via Anton Howes.  Here is a related tweet storm from Steinsson.
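To put the per-decade figures in the abstract on a more familiar annual footing, here is a back-of-the-envelope sketch. It assumes simple compounding at a constant rate within each phase, which is a simplification and not the authors' estimation method; the function names are hypothetical and only the 4%, 18%, 1600, and 1810 figures come from the abstract.

```python
def annualized(per_decade_rate: float) -> float:
    """Convert a per-decade growth rate into the equivalent compound annual rate."""
    return (1.0 + per_decade_rate) ** (1.0 / 10.0) - 1.0


def cumulative_factor(per_decade_rate: float, start_year: int, end_year: int) -> float:
    """Total growth factor from compounding a per-decade rate over a span of years."""
    decades = (end_year - start_year) / 10.0
    return (1.0 + per_decade_rate) ** decades


if __name__ == "__main__":
    # Figures quoted in the abstract; constant-rate compounding is an assumption.
    print(f"4% per decade  = {annualized(0.04):.3%} per year")
    print(f"18% per decade = {annualized(0.18):.3%} per year")
    print(f"Cumulative factor, 1600 to 1810: {cumulative_factor(0.04, 1600, 1810):.2f}x")
```

Under that assumption, 4% per decade is roughly 0.4% per year and 18% per decade is roughly 1.7% per year, while the modest phase alone would a little more than double productivity between 1600 and 1810.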

Google Trends as a measure of economic influence

That is a new research paper by Tom Coupé; here is one excerpt:

I find that search intensity rankings based on Google Trends data are only modestly correlated with more traditional measures of scholarly impact…

The definition of who counts as an economist is somewhat loose, so:

Plato, Aristotle and Karl Marx constitute the top three. They are followed by B. R. Ambedkar, John Locke and Thomas Aquinas, with Adam Smith taking the seventh place. Smith is followed by Max Weber, John Maynard Keynes and the top-ranking Nobel Prize winner, John Forbes Nash Jr.

…John Forbes Nash Jr., Arthur Lewis, Milton Friedman, Paul Krugman and Friedrich Hayek are the most searched for Nobel Prize winners for economics, while Tjalling Koopmans, Reinhard Selten, Lawrence Klein, James Meade and Dale T. Mortensen have the lowest search intensity.

Here are the Nobelist rankings.  Here are the complete rankings; if you are wondering, I come in at #104, just ahead of William Stanley Jevons, one of the other Marginal Revolution guys, and considerably ahead of Walras and Menger, early co-bloggers (now retired) on this site.  Gary Becker is what…#172?  Ken Arrow is #184.  The internet is a funny place.

I guess I found this on Twitter, but I have forgotten whom to thank – sorry!

Does expertise make consumers emotionally numb?

I consider this a speculative idea, but of interest.  Here is the paper abstract:

Expertise provides numerous benefits. Experts process information more efficiently, remember information better, and often make better decisions. Consumers pursue expertise in domains they love and chase experiences that make them feel something. Yet, might becoming an expert carry a cost for these very feelings? Across more than 700,000 consumers and 6 million observations, developing expertise in a hedonic domain predicts consumers becoming more emotionally numb – i.e., having less intense emotion in response to their experiences. This numbness occurs across a range of domains – movies, photography, wine, and beer – and across diverse measures of emotion and expertise. It occurs in cross-sectional real-world data with certified experts, and in longitudinal real-world data that follows consumers over time and traces their emotional trajectories as they accrue expertise. Further, this numbness can be explained by the cognitive structure experts develop and apply within a domain. Experimentally inducing cognitive structure led novice consumers to experience greater numbness. However, shifting experts away from using their cognitive structure restored their experience of emotion. Thus, although consumers actively pursue expertise in domains that bring them pleasure, the present work is the first to show that this pursuit can come with a hedonic cost.

That is by Matthew D. Rocklage, Derek D. Rucker, and Loran F. Nordgren.  For the pointer I thank the excellent Kevin Lewis.

The importance of superspreading events

Empirical observation throughout the SARS-CoV-2 pandemic has shown the outsized role of superspreading events in the propagation of SARS-CoV-2, wherein the average infected person does not transmit the virus. Our results suggest the same dynamics likely influenced the initial establishment of SARS-CoV-2 in humans, as only 29.7% of simulated epidemics from the primary analysis went on to establish self-sustaining epidemics. The remaining 70.3% of epidemics went extinct…Furthermore, the large and highly connected contact networks characterizing urban areas seem critical to the establishment of SARS-CoV-2. When we simulated epidemics where the number of connections was reduced by 50% or 75% (without rescaling per-contact transmissibility), to reflect emergence in a rural community, the epidemics went extinct 94.5% or 99.6% of the time, respectively…The high extinction rates we inferred suggest that spillover of SARS-CoV-2-like viruses may be frequent, even if pandemics are rare.

Here is the paper, via the excellent Kevin Lewis.
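Why would most introductions fizzle out even when the virus spreads well on average? A minimal sketch, assuming a textbook branching process with negative binomial offspring counts, shows the logic. This is a swapped-in illustration and not the contact-network simulation used in the paper; the values R0 = 2.5 and dispersion k = 0.1 are illustrative assumptions, and the function names are hypothetical. Halving R0 is only a crude stand-in for halving the number of connections.

```python
def prob_no_transmission(r0: float, k: float) -> float:
    """P(an infected person infects nobody) for a negative binomial
    offspring distribution with mean r0 and dispersion k."""
    return (1.0 + r0 / k) ** (-k)


def extinction_probability(r0: float, k: float,
                           tol: float = 1e-12, max_iter: int = 100_000) -> float:
    """Probability that an outbreak started by one case dies out.

    Iterates q <- G(q) from q = 0, where G(s) = (1 + (r0/k)(1 - s))^(-k) is the
    offspring probability generating function; this converges to the smallest
    fixed point of G, which is the extinction probability.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = (1.0 + (r0 / k) * (1.0 - q)) ** (-k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


if __name__ == "__main__":
    r0, k = 2.5, 0.1  # illustrative assumptions, not estimates from the paper
    print(f"Share of cases infecting nobody: {prob_no_transmission(r0, k):.3f}")
    print(f"Extinction probability at R0 = {r0}: {extinction_probability(r0, k):.3f}")
    print(f"Extinction probability at R0 = {r0 / 2}: {extinction_probability(r0 / 2, k):.3f}")
```

With these assumed parameters most infected people transmit to no one, so most single introductions go extinct, and cutting the reproduction number pushes the extinction probability higher still. The exact numbers should not be expected to match the paper's 70.3%, 94.5%, or 99.6%, which come from a different and richer model.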

Have the Danes rendered families less important?

Many American policy analysts point to Denmark as a model welfare state with low levels of income inequality and high levels of income mobility across generations. It has in place many social policies now advocated for adoption in the U.S. Despite generous Danish social policies, family influence on important child outcomes in Denmark is about as strong as it is in the United States. More advantaged families are better able to access, utilize, and influence universally available programs. Purposive sorting by levels of family advantage create neighborhood effects. Powerful forces not easily mitigated by Danish-style welfare state programs operate in both countries.

Here is the full paper by James J. Heckman and Rasmus Landersø.

CEO Stress, Aging, and Death

We estimate the long-term effects of experiencing high levels of job demands on the mortality and aging of CEOs. The estimation exploits variation in takeover protection and industry crises. First, using hand-collected data on the dates of birth and death for 1,605 CEOs of large, publicly-listed U.S. firms, we estimate the resulting changes in mortality. The hazard estimates indicate that CEOs’ lifespan increases by two years when insulated from market discipline via anti-takeover laws, and decreases by 1.5 years in response to an industry-wide downturn. Second, we apply neural-network based machine-learning techniques to assess visible signs of aging in pictures of CEOs. We estimate that exposure to a distress shock during the Great Recession increases CEOs’ apparent age by one year over the next decade. Our findings imply significant health costs of managerial stress, also relative to known health risks.

That is from a new NBER working paper by Mark Borgschulte, Marius Guenzel, Canyao Liu, and Ulrike Malmendier.