The language of restaurant reviews

Jennifer Schuessler at The New York Times reports on the work and new book of Dan Jurafsky:

In a study of more than a million Yelp restaurant reviews, Mr. Jurafsky and the Carnegie Mellon team found that four-star reviews tended to use a narrower range of vague positive words, while one-star reviews had a more varied vocabulary. One-star reviews also had higher incidence of past tense, pronouns (especially plural pronouns) and other subtle markers that linguists have previously found in chat room discussions about the death of Princess Diana and blog posts written in the months after the Sept. 11 attacks.

In short, Mr. Jurafsky said, authors of one-star reviews unconsciously use language much as people do in the wake of collective trauma. “They use the word ‘we’ much more than ‘I,’ as if taking solace in the fact that this bad thing happened, but it happened to us together,” he said.

Another finding: Reviews of expensive restaurants are more likely to use sexual metaphors, while the food at cheaper restaurants tends to be compared to drugs.
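The vocabulary finding above can be illustrated with a toy sketch. This is not Jurafsky's actual methodology — just a minimal, invented-data example of one crude way to measure how varied a review's vocabulary is (type-token ratio) by star rating:

```python
from collections import defaultdict

def type_token_ratio(text):
    """Distinct words divided by total words -- a crude
    measure of how varied a piece of text's vocabulary is."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0

# Toy reviews: (stars, text) -- invented examples, not real Yelp data.
reviews = [
    (5, "great food great service great place great time"),
    (5, "amazing food amazing staff really amazing night"),
    (1, "we waited an hour then they lost our order and overcharged us"),
    (1, "the waiter ignored us and the soup arrived cold and salty"),
]

ratios = defaultdict(list)
for stars, text in reviews:
    ratios[stars].append(type_token_ratio(text))

for stars in sorted(ratios):
    avg = sum(ratios[stars]) / len(ratios[stars])
    print(f"{stars} stars: mean type-token ratio {avg:.2f}")
```

On this invented data the one-star reviews come out more lexically varied than the five-star ones, mirroring the pattern the study describes, though a real analysis would need to control for review length.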

Previous MR posts on Jurafsky are here.

Comments

Good restaurants are all alike; every crappy restaurant is crappy in its own way.

For me to give a one-star review requires something pretty traumatic; I'm not Michelin. Also, when something is so awful that you give a place such a ding, it still feels like an attack, so you need to spend more effort justifying yourself so others don't think you're a crank.

It is surprising to see how many one-star reviews are from people who could not even get into the restaurant. It seems like bad form, a review out of spite.

This is like one-star reviews on Amazon, where it's not about the product but about the dodgy third-party seller the buyer chose, who turned out to provide bad service.

+1. I have given one-star ratings far more sparingly than four stars.

Someone is saying somewhere tonight: Thank goodness they don't apply the algorithms to website postings.

I remember a fun trip to Virginia. Good restaurant review.

The waiter actually made a special dessert.

Why on earth would anyone analyze a million Yelp reviews? The basic idea seemed good, Zagat for the internet age, but Yelp has failed, largely because its users are the world's least discerning people. The whole point of a review is to separate good from bad, but basically everything on Yelp is clustered in the upper middle, and indisputably terrible places often outscore genuinely good ones.

Actually, the problems are a lot deeper than that. Aggregating numerical votes into averages provides readers the wisdom of that crowd at a glance. Aggregating lengthy texts is useless.

The site isn't even very technically sophisticated. Yelp could overcome many of its users' shortcomings simply by analyzing how each person votes and putting them into clusters of generally like-minded people, the way Netflix does with movies. But it doesn't. It just presents me with each person's opinion as if they're all equally relevant to my likely opinion.
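The Netflix-style grouping described above is essentially user-user collaborative filtering. A minimal sketch, using invented ratings and plain cosine similarity (one of several similarity measures such a system could use):

```python
import math

# Hypothetical ratings: user -> {restaurant: stars}. Invented data.
ratings = {
    "alice": {"taqueria": 5, "bistro": 2, "diner": 4},
    "bob":   {"taqueria": 5, "bistro": 1, "diner": 5},
    "carol": {"taqueria": 1, "bistro": 5, "diner": 2},
}

def cosine_similarity(a, b):
    """Similarity of two users over the restaurants both rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[r] * b[r] for r in common)
    na = math.sqrt(sum(a[r] ** 2 for r in common))
    nb = math.sqrt(sum(b[r] ** 2 for r in common))
    return dot / (na * nb)

def most_similar(user):
    """Rank other reviewers by how closely their votes track this user's."""
    others = [(cosine_similarity(ratings[user], ratings[o]), o)
              for o in ratings if o != user]
    return sorted(others, reverse=True)

print(most_similar("alice"))
```

Here Bob's votes track Alice's far more closely than Carol's, so a site doing this could weight Bob's reviews more heavily when Alice browses. A production system would also mean-center each user's ratings to correct for generous versus stingy raters.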

Since when are ratings on Yelp taken seriously? Yelp is all about extorting advertising dollars from small businesses, especially restaurants.

http://www.eastbayexpress.com/oakland/yelp-and-the-business-of-extortion-20/Content?oid=1176635

I generally agree with you Scoop, except I do find Yelp useful in two cases. First, when a restaurant is at 4.5 (as opposed to just 4) stars, I think it's usually correctly identified as a great place. Second, when a restaurant is 3 or fewer stars, I think it's usually correctly identified as pretty bad. Of course, the problem is the majority are, like you said, clustered into the upper middle and rated 3.5 or 4 stars.

Yes, there is some measure of information from the variation in ratings, but I read the reviews. The things that many people repeat are the things worth noting. It is easy to look past superlatives and easy to identify well thought out reviews.

I've found Yelp to be useful comparing restaurants within a given area. Or as a "sanity check" if I see a place that looks interesting. I haven't often been steered wrong.

Mr. Jurafsky and the Carnegie Mellon team could probably invest their time in something a little more meaningful, perhaps a study of the number of times the word "elite" is attached to "Republican Guard" in media accounts concerning Iran.

The Iraqi Republican Guard was their best force by far, but they also died horrible screaming deaths. The Iranian RG isn't any different.

These forces aren't formed, trained, and equipped to repel invasions. They are built to suppress domestic insurrection.

I get your point, though, about the media. Why single out this example when their sole purpose is to capture your attention? Nearly every news article has a headline that is contradicted by information within the article, usually hidden at the bottom.

Or why CNN thinks the last UN Secretary-General is named "Kofi Annan of Ghana".

My sister worked at Yelp for two years. Everything bad you can say about them is true.

The reviews are mostly fake. The reviewers do not have consistent preferences (e.g., rainy days cause low reviews for restaurants). The businesses are being extorted for ad dollars and they respond to this extortion by hiring companies to write fake reviews.

The quality of Yelp has declined rapidly as a result.

In my experience Yelp has been pretty good about accurately showing phone #s and hours of operation at a glance. That's value added.

I call bullshit on "mostly fake".
No one (and no system) could write that many fake reviews.

If it works the way he says, it's not one person or one company. It could be thousands of small reputation-enhancement companies serving millions of small businesses. That's very plausible. In fact, it would be in the interests of such companies to submit lots of negative reviews to stir the pot and scare business owners into enlisting their services. The incentives favor generating lots of strongly positive and negative reviews, all of which are fake.

I read Yelp reviews for a restaurant we went to tonight. One reviewer had reviews from 357 places. Her reviews of various places were mixed. Her reviews were both detailed and specific about the menu items.

Is it unreasonable for me to conclude this is a genuine review from an experienced diner or have I been Turinged?

I should have said, "mostly useless".

People will give a restaurant a five star review because the waiter was cute or a one star review because the waitress was late getting them a third round of free refill iced teas.

The reviews are heavily polarized, with 1-star and 5-star ratings being far more common than 2s, 3s, and 4s. The real world isn't that polarized. Most people have a moderately good time at a local eatery; they don't love it or hate it. They also rarely bother to write a review.

Imagine if pollsters, instead of randomly sampling the population, just reviewed the comments on a CNN article about a candidate. Would we call that a useful poll?

What you write sounds plausible, but consider that people choose restaurants based on their tastes and they likely know something about them. So reviews have an inherent upward bias. But when someone's expectations are disappointed, they have an emotional plummet to the bottom.

I consider myself an honest rater, and I have very few 3 ratings. One restaurant has disappointed me with its service and cleanliness multiple times (1), but their food is unique and amazing (5). Must I average these selection criteria with equal weights?

My choice to repeatedly go back there reveals a preference ranking higher than a 3, regardless of any math I might perform.

Maybe people choose to rate places only at the extremes: 1 = awful, 4 = good but not perfect, 5 = great.

None of what you say means that the overall estimates wouldn't be, on average, correct.

The thing that gives pause is the "reputation enhancement" services, but again, if you compare locally (as in, within a few city blocks rather than across an entire city), the restaurants you'll be comparing will have roughly the same amount of money to spend on such services, so you'd expect results to be more accurate.

You can also just look at the incidence of negative reviews. Places with a ton of 0's are probably not worth it even if there are lots of 5's to compensate the overall average.

Actually, 1 or 5 may be the most informative ratings -- "I would return" or "I would not return." 2, 3, and 4 are too muddled.

It's hard to write in the present tense, as all good writers supposedly should. I like to think of myself as a good writer (I did well in English composition), but even I find it hard. Writing in the dental preterite is easier (of course I Googled that: http://grammar.about.com/od/pq/g/preteriterm.htm).

This is entirely predictable from work the psychologist Robert Zajonc did in the 1960s - http://en.wikipedia.org/wiki/Mere-exposure_effect There is a smaller vocabulary of positive words (and they occur more often). There is a larger vocabulary of negative words (and they occur less often).

For that matter, this quote from Leo Tolstoy:
“All happy families are alike; each unhappy family is unhappy in its own way.”

Positive restaurant reviews are a dime a dozen.

However, restaurants that consistently receive low stars on sites like Yelp typically are trash.

So, while reviews skew disproportionately towards the extremes, you can still use these sites to weed out the most hopeless of eating establishments.

What they're not good for is finding the truly exceptional restaurants and diamonds in the rough.

What a large data set! How do you conduct a serious study on over a million Yelp reviews? How many reviews were actually read by a real person?

It's called natural language processing.

http://en.wikipedia.org/wiki/Natural_language_processing

It's actually pretty good at identifying interesting things when you're looking at larger documents (think identifying latent topic groupings for academic journal articles, which is what JSTOR uses it for), but AFAIK its properties are less proven for smaller language chunks, like tweets or internet comments. Yelp reviews probably fall somewhere in the middle.

That said, there are still probably a lot of interesting things you can get just from word counts on material like this.
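The word-count approach mentioned above can be sketched in a few lines. This is a toy illustration with invented review text, not the study's actual pipeline: it counts how often plural first-person pronouns appear, one of the markers the quoted study associates with one-star reviews.

```python
from collections import Counter

# Toy corpora standing in for one-star and five-star reviews (invented).
one_star = "we waited and we left and they never apologized to us"
five_star = "i loved the tasting menu and i will be back soon"

def word_counts(text):
    """Bag-of-words: the simplest unit of this kind of analysis."""
    return Counter(text.lower().split())

def pronoun_rate(text, pronouns=("we", "us", "our")):
    """Share of tokens that are plural first-person pronouns --
    one of the markers linked to one-star reviews in the study."""
    counts = word_counts(text)
    total = sum(counts.values())
    return sum(counts[p] for p in pronouns) / total

print(f"one-star  'we/us/our' rate: {pronoun_rate(one_star):.2f}")
print(f"five-star 'we/us/our' rate: {pronoun_rate(five_star):.2f}")
```

A real analysis over a million reviews would use proper tokenization and part-of-speech tagging rather than whitespace splitting, but even raw counts like these can surface the "we" versus "I" pattern the article describes.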

I don't bother with any restaurant review that fails to mention Pakistani taxi drivers.
