Why is it hard to measure the value of soccer players?

Max Mendez Beck emails me:

Given the advent of statistics in sports that occurred in the last five years, I am struck by how well soccer works as a metaphor for current epistemological debates regarding the use (and primacy) of quantitative versus qualitative data in social science research. While the three major American sports (football, basketball, and baseball) have been overtaken by a quantitative obsession (count how many tables and numbers you see on an average ESPN show), soccer is emblematic of a sport that is quite difficult to measure quantitatively.

Consider how easy it is to determine who did well in an average NBA game without needing to even watch it. You can just look at points, assists, rebounds, steals, turnovers, etc. In soccer, individual statistics are almost nonexistent. Even as major sports channels have attempted to incorporate quantitative measures into their soccer broadcasts–for example, by showing the number of kilometers a player has covered when he gets subbed out (a pretty uninformative statistic on its own)–these numbers have not caught on with the regular fan.

In basketball, everyone debates who “the best ever” is by referring to career averages in points, field goal percentage, PER, etc. In soccer, the only statistic that is ever used is goals scored, and goals scored is only one small dimension of a player, even smaller if he is not a striker. It would be silly to judge Andrés Iniesta or Zinedine Zidane on how many goals they scored in a season.

So what is it about soccer that makes it so hard to quantify? Or what makes American sports so easy to measure? One obvious answer is the length of the units that can be easily separated and analyzed. In basketball it's a maximum of 24 seconds, in baseball it's essentially a pitch (or an at bat), and in football it's each snap. For soccer, the only apparent unit to separate out is the 45-minute halftime mark. Changes in possession could be another measure, but even then a team’s single possession could be several minutes long.

However, the real challenge comes in measuring individual accomplishments. Just recently I was watching a Barcelona game and Iniesta clearly was having an amazing game (as was mentioned several times by the announcer), and yet the things that made him have a great game were only describable in words and not numbers. There was a beautiful and sudden “regate” or dribble around a defender before he passed it on to a teammate for a quick counter attack. There was the beautiful pass between defenders that led to an assist for the first goal. There was the sudden change in direction and over the top pass to the other side of the field that put the defenders on their heels. Many of these moves are incredibly situational; they have to do with the rhythm of the game and the need to speed it up or slow it down. Nothing in the boxscore could truly capture these attributes.

So the question is: Is soccer something that can’t be measured in numbers?

Here are various readings on the topic.


Right before the World Cup a couple of years ago, Benjamin Morris published a long article on advanced soccer stats, with 15 graphs summarizing the 2010-2013 seasons.


These sophisticated new stats proved that the best two soccer players over those 4 seasons had been ... Messi and Ronaldo. Just like everybody with a TV had already known.

As Yogi Berra said, you can observe a lot just by watching.

I believe the best use of well thought out advanced stats in any sport usually isn't to determine who THE BEST is, but to identify players and aspects of the game that have a larger influence on winning than we realize. The baseball player who draws a lot of walks, the possession-retaining block (rather than swat) in basketball, and the hockey forward who maintains possession into the attacking zone have all gained in status because of the quantitative push we have seen in the past decade and a half.

I mention ice hockey because of its conspicuous absence in the source and because efforts to quantify it suffer from very similar problems to soccer.

Complicating the comparison, though, is the frequency of substitutions and the consequent varied personnel deployments. A lot of so-called "advanced" hockey statistics, such as Corsi/Fenwick, are most valuable when viewed in a relative context (i.e., having this player on the ice improves the Corsi of everyone else relative to when they skate with their regular linemates). Since the personnel combinations are much more limited in soccer, it's not clear whether one would be able to disaggregate the teammate-quality effects to the same degree hockey statisticians manage to.
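
To make the "relative" idea concrete, here is a minimal Python sketch of an on-ice versus off-ice shot-attempt share. The data layout (a list of shot attempts, each tagged with whether our team took it and which of our skaters were on the ice) and the function names are assumptions made for illustration, not a standard format or anyone's official definition.

```python
from typing import List, Optional, Set, Tuple

# One shot attempt: (taken_by_our_team, our skaters on the ice at the time).
# This layout is a hypothetical simplification for illustration.
Attempt = Tuple[bool, Set[str]]


def corsi_for_pct(attempts: List[Attempt]) -> Optional[float]:
    """Share of shot attempts taken by our team, in percent."""
    if not attempts:
        return None
    taken = sum(1 for ours, _ in attempts if ours)
    return 100.0 * taken / len(attempts)


def relative_corsi(attempts: List[Attempt], player: str) -> Optional[float]:
    """On-ice Corsi-for % minus off-ice Corsi-for % for one player."""
    on_ice = [a for a in attempts if player in a[1]]
    off_ice = [a for a in attempts if player not in a[1]]
    on_pct, off_pct = corsi_for_pct(on_ice), corsi_for_pct(off_ice)
    if on_pct is None or off_pct is None:
        return None  # no comparison possible without both samples
    return on_pct - off_pct


# Toy data with made-up skaters X, Y, Z.
toy_attempts: List[Attempt] = [
    (True, {"X", "Y"}), (True, {"X", "Z"}), (False, {"X", "Y"}),
    (True, {"Y", "Z"}), (False, {"Y", "Z"}), (False, {"Y", "Z"}),
]
rel_x = relative_corsi(toy_attempts, "X")
```

The sketch also shows where soccer runs into trouble: if a player is on the pitch for nearly every minute, the `off_ice` sample is tiny or empty and the comparison tells you very little.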

But those two are strikers. A bit different from each other, but their value is mostly in scoring.

Xavi and Iniesta look bad on that chart, but Spain's dominance in international soccer had little to do with Villa and Torres, the guys who scored the goals for them. You can still find stats that make them look good, like percentage of passes completed, but that's not all there is to midfielders either.

There are a lot of soccer statistics being tracked. Just because mainstream sports media doesn't report them doesn't mean fans don't discuss them, and things like pass %, key passes, shot %, interceptions and tackles can give you a much better picture.

I don't think so. Soccer fans talk a lot about qualitative stuff, HOW their teams play, how they WISHED they played, what TYPE of players suit their club, plus the actual facts of results, player transfers etc. Nobody talks about pass %.

You're partly right. However, it's worth mentioning that fans' discussions are mostly qualitative, whether it's soccer or basketball. Like Travis said, there are a lot of soccer stats being tracked and countless websites to check them. Unfortunately, broadcasters make little use of them, and I think it has more to do with the European philosophy (which is highly influential in soccer) than the game itself. If soccer were an "American sport", broadcasting would certainly be heavily influenced by American broadcasts, but because it's Eurocentric, broadcasts are designed around Europe's style. Americans love sports stats, and not long ago many Europeans made fun of this fact. Broadcasting of stats in European-dominated sports has always been neglected, even in stats-obsessed disciplines like Formula 1.

Sports fans are extremely biased: they like to use stats when the stats support their argument and like to dismiss them (favoring qualitative stuff like flair, flow, style, etc.) when they do not.

I don't think it's cultural. Rather, soccer doesn't have stoppages like baseball, football, or basketball.

First, individual plays are easier to measure. It's why baseball has the most stats and the longest history of incorporating them. Football, likewise, has lots of stats that are thrown around. Basketball has plenty of stoppages though not plays per se. Second, individual plays mean there are pauses where the announcer can stop and talk about statistics. In soccer, such breaks are quite limited (substitutions, free kicks).

Hockey also has limited stats because it's continuous like soccer.

I don't think soccer and hockey are _quite_ as continuous as we'd think at first glance. There are plenty of events in those matches that resemble an American football punt (and therefore a possession change). In soccer you could certainly count the number of times a team starts a new possession with a pass back to the keeper, or a goal kick, or similar. And hockey has times where a defenseman chills behind his own net while everyone changes lines. I would assume player-tracking analysis (if/where it exists) could find these smaller chunks pretty easily.

Australian rules football is also continuous, but it has masses of stats. The stats, however, are generally presented at quarter and half time breaks, and in the newspapers the following day. It's normal for the most popular newspapers to have 8-10 pages of football stats on a Monday.

Soccer is a strange sport.

Well, it might be that basketball has more stats than soccer, but not by much. I don't know where you guys get your information, but there are heat maps about where the most shots on goal originated, and there are stats on passes, passes completed, where each player was located, tackling losses, etc. Just because British or American soccer does not report that doesn't mean that there are no statistics.
But since most games are pretty singular many stats are probably not as useful as in basketball.

I think the answer is fairly obviously no. Of course these things can be measured. It surprises me that so little of this stuff is recorded, if that is indeed the case. The simple question I ask is: how do you know someone had a good game? In order to know you must have data. How many passes? How many defenders beaten? How long was each pass? How long was each run? How many assists? How many shots on goal? Etc. All these things can easily be measured. It is odd that they are not. Perhaps it is cultural. Is the USA more quantitative and scientific in its approach to sport than Europe? Probably. Is Europe more artistic? If soccer were more popular in the USA then things would probably be different.

In a way, you are missing the point - the number of passes is pretty much meaningless when the final score is 1:0. Especially when the team that won made half the passes of the losing team, and the person that made the pass to the person that scored the goal made fewer passes in that game than anyone else. Just as it is quite possible that the person that scored the only goal may have also been the person with the fewest shots on goal of all players taking shots that entire game.

Soccer really is not about statistics, in part because statistics are effective when using aggregates, not single results.

And it should be further noted that in professional soccer, the number of substitutions is severely restricted, unlike American favored sports. In other words, there is no effective way to substitute players for different situations, and thus much less need for measuring individual player performance to determine their best use in a particular setting - a good team must be balanced for 90 minutes in a way unfamiliar to how most sports played in America work.

"the number of passes is pretty much meaningless when the final score is 1:0"

For the result of that game, yes. But for predicting future performance (the point of sports stats), then probably not.

"there is no effective way to substitute players for different situations, and thus much less need for measuring individual player performance to determine their best use in a particular setting"

This doesn't make a lot of sense. Of course you want to know what players perform best and how they are best used. You want to know which players to start and which positions to start them in (and what players you should try to acquire to build your team). The number of substitutions you can make is pretty much irrelevant to the value of measuring player performance.

Not even for "predicting future performance". It's just not true that a win is normally made up of a larger number of passes. The way this is addressed in soccer is that fans observe and discuss how much "possession" their team has: some teams like to play with possession, others like to NOT HAVE POSSESSION and score on the counter attack. And this can work - I don't have the stats, natch, but there is probably little correlation between possession and results. And, as I say, the way that soccer fans address this is to discuss their opinion of the merits (results + beauty) of holding possession or not. No argument is helped by specifying precisely how much possession their team has / should have.

There's massive correlation between possession and results. To take the best two teams in the biggest 3 leagues: Madrid and Barca are top in La Liga possession, Bayern and Dortmund top two in Bundesliga possession, City and Arsenal 2nd and 3rd in Premier League possession. Tactical exceptions exist (Atletico Madrid, Leicester, Manchester United going the other way), but, fundamentally: 1) in order to score, you need to have the ball, and your opponents need to have the ball to score on you 2) possession is an effective proxy for player skill for the above reason.

I didn't mean to imply a simple relationship between possession (or passes) and success. However, I think that low-level events of the game almost certainly correlate with skill and predict success in a measurable way.

Multiple subs allow for extensive stat accumulation regarding team performance when the player is in vs on the bench. Think plus minus in hockey for a crude example.

Do you watch soccer? How do you think they analyze a player and compare players? Number of passes is a good indicator, and pass accuracy is an even better one.

If you're looking for a forward, you can ask for a player with a good heading game because that would fit your system, or ask for a player who can score from outside. Stats can help you define the player you're looking for.

You can even compare goalies by the number of free-kick goals they concede or left-backs by the number and accuracy of long passes they make per game...

'How do you think they analyze a player and compare players?'

By watching them play? Or is that just a little bit too difficult to grasp? Two goalies may have similar statistics, but this says little about how good a goalie is without seeing them within an entire context - such as whether they are playing for a good or bad team.

The best example of this just might be Oliver Kahn - he pretty much brought the German national team to the World Cup championship game alone in 2002, and is certainly one of the greatest goalies to play soccer. His statistics? Quite good when playing for one of the better national teams, and excellent when playing for one of Germany's better teams, but extraordinary when losing at the World Cup final in 2002 on a fairly poor team. Leading to the question - how much of the statistics were due to the teams he played on, and how much due to his undoubted heart, and not simply his skills? (As a note - Oliver Kahn was the sort of goalkeeper that faced down strikers by making it obvious who was the person in charge of the situation. In 2002, this worked up to the point that the German national team faced Brazil in the final, where he let in two of the three goals that Germany conceded in the entire tournament.)

The people who make decisions concerning which players to acquire also look at things that are hard to quantify statistically. Unless one makes a statistical category for most threatening goalie that a striker could go up against, there is no easy way to value Kahn's greatest skill in a manner that could be compared. To paraphrase Bobby Fischer, Kahn liked breaking the egos of those trying to score against him, and is probably the only goalie who can be counted as playing an offensive role, not simply a defensive one.

Uniqueness is not something handled well by statistics, and a lot of the best soccer stars tend to fit into the unique category. With a player like Kahn, you can reach the World Cup finals. And with a player like Maradona, you can win there - with a goal considered the greatest in World Cup history (along with one in the same game that was probably the worst called one). Then there is a player like Pele - winning World Cup championships becomes a familiar feeling.


You truly believe soccer coaches don't look at stats? Jeez!

The thing with soccer is that what the player does when not in possession of the ball is just as important as shots, passes and recovered balls: did he un-mark himself, did he draw defenders, did he press, etc.

This also happens in basketball. I can't remember his name now, but I read an article about a player whose stats were unremarkable but whose teams seemed to win more often when he played.

Football also has several positions that are difficult to measure. Offensive linemen can be judged by the overall offensive results, but you can't measure their individual contributions. Good cornerbacks get the ball thrown their way less, something that also happens to the players on the bench.

1:0, 0:0, 1:1, 0:1, 1:0, 2:1, 0:0 1:1 ......

As noted, there simply isn't a lot to measure statistically in the beautiful game, apart from shots on goal and goals allowed - and as seen above, there really aren't a lot of goals.

And beauty concerns appreciation, not statistics. As noted in the title of the autobiography by a player whose greatness can be measured in statistics, if one only desires a pale reflection of what made him such a great player.

This sort of anti-statistical pushback from traditionalists is typical as more advanced analysis starts becoming influential in a sport. Players, coaches, and fans don't really understand what is being done and dismiss those doing the stats as nerds who don't know what the game is really about. This happened (and is still happening) in baseball, but ultimately the stat nerds prevailed, as teams that employed advanced metrics performed better than those that didn't. Now everybody uses the advanced techniques. The same thing will likely happen in soccer.

Not everything in a sport can be measured, but a lot more can be measured than you think, and it can be measured a lot better. And this makes a big difference, since you can't really base decisions on unmeasurable factors.

And now no one watches baseball. I guess that's a win for nerds, but it doesn't say much for nerds' ability to develop and maintain popular interest in something once the nerds take over. But that might be the perfect encapsulation of nerdom destroying long term value by monomaniacal focus on short term disputes.

Think about it: we are at peak influence for nerds, and "nerd" is still the best thing you can call someone to destroy their status.

Baseball teams are signing extremely lucrative broadcast deals with regional networks, and total attendance dwarfs the other professional leagues (though that is because there are 162 games per season). Where football beats it is in having a concentration of games on Sunday and the culture that goes with it. I think the stats provide a secondary source of interest in the sport for some people, similar to fantasy football leagues.


Baseball is more watched, successful, and valuable than in the past in absolute terms. It may not have the hold on culture it once did, but that is due to increased competition and variety in sports (a trend which is duplicated throughout culture and entertainment).

'since you can’t really base decisions on unmeasurable factors'

Well, I already mentioned the role of limited substitutions compared to American sports (a total of three substitutions in 90 minutes of play for German soccer games - four injuries means playing with 10 men on the field). The same is pretty much true when it comes to coaching a soccer team - no player 20 meters away is likely to hear what the coach is shouting during a Bundesliga game. In other words, during a half of 45 minutes, the team on the field is the team playing the game, basing their decisions on unmeasurable factors, with extremely limited input, much less control, from the coach. Which leads to a kind of grim irony - when a German team does badly, the person with the least effect on actual game performance is the one replaced. Coaches aren't hard to find, after all - good players, on the other hand, are. Mainly because of a number of unmeasurable factors.

You might also want to look at Goalimpact


The idea is that one can try to compute a Shapley Value of each player in terms of expected goal difference per minute of playing time (adjusting for some factors).

I think they are stretching when they try to connect their algorithm to Shapley Value. From their description, what they are doing is what in basketball is called adjusted plus-minus: look at the outcome of the game (meaning not who won or lost but rather what the goal difference per minute was) and see who was on the pitch and calculate what each individual player's contribution was. This is a big regression; its strength is that it's a theoretically correct way to evaluate players' contributions but its weakness is very large standard errors partly due to collinearity and partly due to just plain large residuals requiring large amounts of data to reduce the standard errors -- in NBA basketball it takes 2 or 3 years of data to get semi-reliable results, and by the time you're looking at 2 or 3 years of data you've now got new complications because most players will change over that period of time (young players typically improve, old players typically decline).
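
For readers who haven't seen adjusted plus-minus, here is a minimal Python sketch of the regression idea described above, written under loud assumptions: the `Stint` layout, the function names, and the toy data are all made up for illustration, and the small ridge penalty is my own addition to keep the linear system solvable. It is not Goalimpact's algorithm or any league's official method.

```python
from typing import Dict, List, Sequence, Tuple

# A stint is a stretch of play with fixed lineups:
# (team_a players, team_b players, minutes, goal difference for team_a).
# Hypothetical layout, chosen only for this sketch.
Stint = Tuple[Sequence[str], Sequence[str], float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def adjusted_plus_minus(stints: List[Stint], ridge: float = 1e-6) -> Dict[str, float]:
    """Minutes-weighted ridge regression of goal difference per minute
    on lineup indicators (+1 for team_a players, -1 for team_b players)."""
    players = sorted({p for a, b, _, _ in stints for p in list(a) + list(b)})
    idx = {p: i for i, p in enumerate(players)}
    n = len(players)
    xtx = [[0.0] * n for _ in range(n)]
    xty = [0.0] * n
    for team_a, team_b, minutes, goal_diff in stints:
        row = [0.0] * n
        for p in team_a:
            row[idx[p]] = 1.0
        for p in team_b:
            row[idx[p]] = -1.0
        rate = goal_diff / minutes
        for i in range(n):
            xty[i] += minutes * row[i] * rate
            for j in range(n):
                xtx[i][j] += minutes * row[i] * row[j]
    for i in range(n):
        xtx[i][i] += ridge  # penalty makes the collinear system solvable
    return dict(zip(players, solve_linear(xtx, xty)))


# Noise-free toy data: fractional "goal differences" are expected values.
toy_stints: List[Stint] = [
    (["A", "B"], ["C", "D"], 30.0, 2.4),
    (["A", "C"], ["B", "D"], 30.0, 1.2),
    (["A", "D"], ["B", "C"], 30.0, 0.0),
]
ratings = adjusted_plus_minus(toy_stints)
```

The toy data has no noise, so the ordering of the players comes out cleanly. With real match data the residuals are large and the lineups overlap heavily, which is exactly where the wide standard errors mentioned above come from.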

No, the algorithm doesn't use regression, although it doesn't seem to be pure Shapley either. It does use large datasets, and it uses aging curves to circumvent the issue you describe. See


As said above, they call it the Beautiful Game for a reason. The flowing rhythm of a game is an artistic performance as much as a sport. A deft pass or gentle control of a wayward ball will elicit murmurs of pleasure from the crowd much like a musician or actor will please an audience with an exceptional note or aside.

The very vocal - and clever - involvement of fans in the game, with songs that can mock or praise, adds to the atmosphere, making it much like the raucous atmosphere of Shakespeare's plays centuries ago.

Players can be loved for their intangible contributions to these performances. The stats don't measure it because the stats aren't sophisticated enough to measure such complexity yet.

This is close to the right answer, but a better way to put it is that soccer becomes much less enjoyable the more you examine it.

If you treat it merely as a performance, and sing while you are at the game, and "love" players for their "intangible contributions"... you are probably going to have a good time. If you start putting the on-field "action" under a microscope, it's going to dawn on you that you might as well be trying to watch grass grow, but there are soccer players in the way.

If by historical accident rugby had become the most popular sport on the continent then rugby would be called the beautiful game. The idea that the beautiful game is anything more than a polemical construct is silly.

Yet, the beautiful game is, somehow, the most played and watched sport on the planet. Possibly, because it is truly a beautiful game - and the requirements to play rugby and soccer are similar enough to avoid pointing to that as a reason for how history turned out.

However, it is probably as good a time as any to reveal that I find soccer a remarkably boring game in general, and have little understanding why so many people find it beautiful. But I accept the fact that a couple of billion people may know more about what they consider beautiful than I do.

Statistics & soccer do have an uneasy history, and the mainstream sports media isn't great at covering the sport in stats terms; I always cringe when I read defenders described by their goals and assists - these are clearly not key, reliable metrics for defensive players.

When John Henry's Fenway Sports Group took over at Liverpool, they tried to bring a stats focus to recruitment, and famously failed. Damien Comolli, the erstwhile technical director in charge of recruitment, is still widely reviled for his sabermetrics-esque approach.

I personally enjoyed this 538 piece on Lionel Messi: http://fivethirtyeight.com/features/lionel-messi-is-impossible/

Nice article on Messi. I think that Messi must be hard to see on the field since he's so small--maybe that's his secret. Many years ago, the American football "Washington Redskins" had some wide receivers (catchers of the football) called the "Smurfs" who were effective because they were hard to see, as well as some running backs (who run with the football) who were small and hard to tackle. Messi might be from this mold. Hey, the Skins are in the NFL playoffs I believe? Time to check out how they did...

As Valdano described Messi, he, like Maradona, is a "culo-bajo" (low ass): they have short legs, which gives them a lower center of gravity and allows them to change direction more quickly. That's why he seems to zig-zag and make defenders look like statues.

I'd add that the fact that soccer is a low-scoring game introduces three complications. (a) Luck matters comparatively more than in other sports. In soccer, it is not unusual to see a game in which a team completely dominates the other, yet loses. How common is that in basketball or baseball? (b) For the same reason, a (not necessarily unexpected) goal can have a dramatic effect on how the game is played. Furthermore, there may be path dependence: for example, a team that is losing may adopt a more risky strategy, and thus may end up conceding more goals. This is hard to model. (c) One can collect a lot of statistics, but given that soccer is a game with so many players and so few scoring opportunities, measuring players' performance requires adjusting for the fact that not all players have the same opportunities to become "relevant". For example, if one team completely dominates another, how can you judge the performance of its goalkeeper? I don't know if current statistics deal with this.
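
Point (a) can be illustrated with a small Python sketch. It assumes, purely for illustration, that each team's score is an independent Poisson count, and the scoring rates (1.8 vs 0.9, then the same 2:1 ratio with ten times as many scoring events) are made-up numbers. This is a toy model, not a claim about how soccer or basketball scores are actually distributed.

```python
import math
from typing import List, Tuple


def poisson_pmf(rate: float, max_goals: int) -> List[float]:
    """P(score = k) for k = 0..max_goals under a Poisson(rate) model."""
    probs = [math.exp(-rate)]
    for k in range(1, max_goals + 1):
        probs.append(probs[-1] * rate / k)
    return probs


def outcome_probabilities(
    rate_a: float, rate_b: float, max_goals: int = 100
) -> Tuple[float, float, float]:
    """Return (P(A wins), P(draw), P(B wins)) for independent Poisson scores.

    Scores above max_goals are ignored, which is harmless for the
    small rates used here.
    """
    pmf_a = poisson_pmf(rate_a, max_goals)
    pmf_b = poisson_pmf(rate_b, max_goals)
    win = draw = loss = 0.0
    for goals_a, p_a in enumerate(pmf_a):
        for goals_b, p_b in enumerate(pmf_b):
            p = p_a * p_b
            if goals_a > goals_b:
                win += p
            elif goals_a == goals_b:
                draw += p
            else:
                loss += p
    return win, draw, loss


# Same 2:1 strength ratio, different amounts of scoring (hypothetical rates).
low_scoring = outcome_probabilities(1.8, 0.9)
high_scoring = outcome_probabilities(18.0, 9.0)
```

Under these assumptions the stronger side fails to win far more often in the low-scoring setting than in the high-scoring one, even though the ratio of strengths is identical. That is the sense in which luck "matters more" when goals are rare.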

My vague impression is that soccer statistics have been improving rapidly.

Steve, watch a soccer highlight show sometime and you'll see they haven't. Believe me, if they had, ESPN would turn their highlight show into a stat fest. Instead they mostly just talk about future match ups and momentum.

ESPN are colossal idiots, hence their absence. But they do have two of the most stats-literate football writers on their website (Mike L. Goodman and Michael Caley).

Analytics have progressed massively in football. But the revolution has mainly been a proprietary one at clubs and bookies, rather than in public. However there are some very good public stats sites online, such as Statsbomb and Analytics FC.

Another difficulty is positioning and influence when not in possession of the ball. Consider the games in which Messi does not touch the ball much. Statistics would capture a "weak" game. But maybe he did not touch the ball much because he had two dedicated defenders controlling him, which allowed Neymar and Suarez to score. In other words, how do we measure the influence of a player whose effect is to change the rivals' position, strategy and allocation of defenders?

'In other words, how do we measure the influence of a player whose effect is to change the rivals' position, strategy and allocation of defenders?'

Measuring such things statistically would end up being a matter of judgment. And if one is already at that level, who needs to care about statistics when the decisions are already being based on judgment?

Basically, this American need for statistics, as compared to just watching the performance of the players of a team on the field, is another area where the U.S. tends to be the exception.

There's no reason why football can't have every stat that hockey does. Time on ice/pitch, goals, assists, corners, shots, shot %, saves, save %, shots blocked, turnovers, takeaways, fouls. Plus even a few more, like cross completion, shots from inside/outside the box, time in possession.

You could quite easily turn this into relatively meaningful analysis like "Aberdeen have the highest cross completion in the league but Thistle have one of the tallest back fives in Scotland and only concede at 6% of set plays."

One of the problems holding football statistics back is that the early attempts were pretty useless and didn't give fans any useful information. They pretty soon realised that the "possession" stat was largely meaningless to the final score and even stats supposedly more reflective proved largely irrelevant when limited to the course of one game. Much was made of Andrea Pirlo completing more passes than the entire England team at Euro 2012 yet the game went 120 minutes without a goal before the Italians won on penalties.

There certainly are stats that could provide valuable information to help the fan better understand the match they are about to watch: "Oh, St. Johnstone take way more shots from outside of the box than most teams and 63% of their attacks go through Michael O'Halloran". The problem is that broadcasters haven't yet learnt which stats give genuine insight and instead focus on signaling their technical prowess. This isn't helped by the studio pundits who (in Scotland at least) fail miserably to inject any analysis other than repeating what has just been shown on the replay ("You can see that he caught him late there.") or regurgitating the same cliches ("Tannadice is a difficult ground to go to"). The kind of stats the broadcasters love are ones which have no bearing on the outcome whatsoever, such as "Albion Rovers haven't won at Celtic Park since 1942". Results from sixty years ago have no bearing on today's match, they've actually only played four times in that period, and everyone knows that Celtic have historically been a far better side than Albion Rovers, so it doesn't even surprise anyone.

As a lover of football and a lover of stats (but not necessarily football stats), I firmly believe there is a place for stats in football but broadcasters and journalists need to learn which stats will help their customers understand the game better.

P.S. Barcelona are a horrid organisation and represent all that is soulless and wrong with 21st century football, but that's perhaps a comment for another time (and has potential for very interesting economic analysis).

This is interesting. I was wondering if hockey had the same issues as soccer. Apparently not. This makes me wonder if the lack of stats in soccer might be a "European phenomenon", in the sense that the sports statistics craze that started in the US has yet to make it across the pond (and maybe never will). It seems like European sports announcers would rather see soccer as an "art" that cannot be quantified, as opposed to US announcers who are hell bent on measuring things as "objectively" and precisely as possible.

Maybe it's a class thing. I realize that soccer isn't as blue collar on the Continent as it is in England (e.g., the German kid who scored the winning goal in the last World Cup is the son of a Herr Professor), but still, soccer in England is lower class than rugby or cricket.

Statistics-crazed baseball in America was always pretty bourgeois. Granted, it was drifting working class in the 1890s when the National League was dominated by the Irish-dominated Baltimore Orioles, but then in 1901 Ban Johnson founded the American League to draw respectable families to the ballpark, and the American League has been dominant pretty much ever since.

My guess is that baseball selects against book-learning because reading a lot of books when you are young weakens your eyes, which are key to hitting a baseball. So baseball hitters aren't intellectuals. (Pitchers -- e.g., Jim Brosnan, Jim Bouton, R.A. Dickey -- are a little less anti-reading.) Still, smart statistical thinkers like Henry Chadwick, Branch Rickey, and Bill James eventually fit in well with baseball because it's a pretty middle class game.

Does reading when young really weaken the eyes?

I don't think so. It is probably merely "time spent reading books is not time spent playing baseball". And vice versa.

Time spent indoors out of the sun correlates with near-sightedness.

I'm also a pretty big hockey fan, and European hockey media don't focus on, or even record, stats to anywhere near the same extent as in North America. There's definitely a cultural element to it.

Maybe Americans are too obsessed with measuring stuff?

Pfffft. Where is your data to back that up?

I think there are two quite distinct issues that are being conflated here. The first is whether association football is simply more difficult to analyse statistically in a meaningful way than most US sports (or, to put it another way, whether the main American sports are much easier to quantify). I think the answer here is very clear: with modern technology there's no difference. At one time the stop/start nature of most US sports did make statistical analysis easier. Interestingly, cricket and tennis, which are also stop/start games, have always had lots of statistical analysis, particularly cricket. With modern technology this is no longer the case, and there are loads of football stats being collected by Opta and others. Not only that, but these are now used massively by coaches to decide things like which formation to play, who to start in what position, when to make a substitution and of whom, which players to buy and so on. Of course these don't capture the genius of players like Messi or Iniesta, but that kind of observation is true in any sport, and truly great players like that are actually very rare. There is a certain kind of knowledge about a sport that can't be captured in stats, but contrariwise there's an awful lot that can be.

The other, different issue is that of the part that stats play in sports culture on either side of the Atlantic, whether you're talking about broadcasting or fan discussions. Here I think there is a big difference. While the professionals in European football are up to their ears in stats you don't find fans talking about them when they discuss players or performances, which is quite a contrast to American sports fans. In the media there is an increasing reliance on them but it's still nothing like what you get in US coverage and qualitative analysis is still much more common. This is true even in sports like cricket that are very easy to analyse statistically. Interesting question as to why that is, I don't know if Canada or Australia are similar to the US in this regard or more like Europe (rugby league is another game that lends itself to statistical analysis).

'when to make a substitution and of whom'

Considering a maximum of three substitutions in 90 minutes of play, and the rational anticipation of at least one injury per game requiring a substitution being held in reserve for most of a game, basically each team can substitute a total of 1 player per half.

To be honest, such constraints make the idea of using statistics for substitutions, as compared to merely making a decision based on experience, seem pretty irrelevant.

You've not convinced me on the first point. And I don't think the start/stop nature of a sport is the relevant metric. Baseball talent is easy to measure because each player steps to the plate to try to (a) score a run, or (b) barring that, increase the probability of a run scoring that inning by getting on base. Similarly, the pitcher is trying to (a) make an out, or (b) barring that, induce weak contact. In a regulation game, there are 27 contexts in which a player comes to the plate (1st inning, no outs; 1st inning, one out, etc.), and in each of these contests, the batter and pitcher stand in a defined space with a defined distance between them for a contest btw/ individuals with easily defined and competing goals.

The analogy to soccer would be if the sport consisted of penalty shoot-outs. One-on-one contest btw/ two individuals w/ competing goals. Most of soccer is contextual, and evolves in-play, with contributions that are not as clearly demarcated.

That's exactly right. One way to "test" this would be to compare the use of basketball stats across countries, say USA vs. Spain. Since the game is the same, the difference in stats tracking must be due to "culture" or "preferences". My impression is that it's a mix of both. Fans and sports announcers in Spain do mention more stats when talking about basketball than about soccer. However, the use of basketball stats is a lot heavier in the US than in Spain.

Baseball's statistics mania influences other sports in America.

The Moneyball craze in America is real but overblown because the sabermetricians hitched a ride on the rise of steroids: back in the 20th Century, their leading advice was to stop playing wiry banjo hitters with good hands and no power, instead play hulks who could hit homers and therefore would draw walks. This advice worked great in an era of no steroid testing.

Bill James almost never wrote about steroids until 2009 when he published a pathetic piece in Slate saying he thought Barry Bonds' maple bats really deserve more credit for Bonds' insane late career stats. Nate Silver admitted he thought Manny Ramirez was unlikely to be using steroids. Steroids only come up once in Moneyball, in a footnote.

There's literature of everything. A nice football statistics literature review: http://www.footballscience.net/special-topics/performance-analysis/ Remarkable ideas:

a) "Ball possession as a sequence: Possession lengths of 3 to 7 passes seemed more likely to produce goals than shorter and longer length possessions (60)." So, possession measured in number of passes is a variable with a distribution. Peak goal performance may be at medium values. This may work because clumsy teams lose the ball quickly, while teams without imagination (defensive ones) control the ball too much without doing anything. This may explain the poor correlation between match-averaged ball possession and who wins.

b) There's an interesting divide between low and high intensity activity on the field. This may explain why the total distance players run is meaningless. What matters may be the distance run at max intensity.

c) Moreover, players from successful teams were found to be leaner and more muscular than their unsuccessful counterparts (48) Lago-Penas, C., et al., Anthropometric and physiological characteristics of young soccer players according to their playing positions: relevance for competition success. Journal of Strength and Conditioning Research, 2011. 25(12): 3358-3367.
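
Point (a) above treats possession length, counted in passes, as a variable with its own distribution. A minimal sketch of how one might tabulate goal rates by possession length is below. The bucket boundaries follow the 3-to-7 range quoted in (a); the function names and the sample data are made up purely for illustration and are not taken from the cited review.

```python
from collections import defaultdict


def bucket_for(num_passes):
    """Map a possession's pass count to a coarse bucket (boundaries are an assumption)."""
    if num_passes <= 2:
        return "0-2"
    if num_passes <= 7:
        return "3-7"
    return "8+"


def goal_rate_by_pass_count(possessions):
    """possessions: iterable of (num_passes, ended_in_goal) tuples.

    Returns a dict mapping each bucket to the fraction of its possessions
    that ended in a goal.
    """
    totals = defaultdict(int)
    goals = defaultdict(int)
    for num_passes, ended_in_goal in possessions:
        bucket = bucket_for(num_passes)
        totals[bucket] += 1
        goals[bucket] += int(ended_in_goal)
    return {bucket: goals[bucket] / totals[bucket] for bucket in totals}


# Invented toy data: (passes in the possession, did it end in a goal?)
sample_possessions = [
    (1, False), (2, False), (4, True), (5, False),
    (6, True), (9, False), (12, False), (3, False),
]
print(goal_rate_by_pass_count(sample_possessions))
```

With real event data the buckets would need far more possessions each before the rates meant anything, since goals are rare events.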

Article C is an illusion-crushing reality. This factor is considered by football coaches but it will never arrive to TV because it kills footballs fans dreams. Football fans like to believe top football players are normal guys. That's why Messi and Maradona are more popular than players that have a more athletic image. Messi has great training. I'm sure if he is measured in a lab his numbers will be great and in the top 0.001% of athletic performance for humans... at least for humans that run kicking a football. However, his genes and the football clothes give him an average guy look. The average TV & beer drinking fan dreams that they can be good in the next neighborhood match because both the fan and Messi look the same wearing loose football clothes. The fans will dream even more that their children look just like Messi in a Barcelona shirt.

That's why football fans hate statistics. They don't want to know how fast or how athletic are their football heroes. I think American football fans are resigned to look at sports stars as unreachable. If statistics do exist, football investors and coaches know them. Football still sells the illusion to fans that everyone can do it.

The book Soccernomics does a good job covering a lot of what is out there.


Messi and Maradona are popular because they are the best players of their respective generations. Cristiano Ronaldo is also a standout, and had Messi not existed would be considered the best in the world. CR is very athletic, but the real reason for his relative unpopularity is that he is also a world class jerk.

'This factor is considered by football coaches but it will never arrive to TV because it kills footballs fans dreams.'

Well, maybe English language soccer TV, but this is considered an utterly banal observation in Germany - of course players in better shape will be more likely to win games. Which is a major, and very public, reason why the German national teams do so well - one very often commented on over decades by the same announcers that praise individual players for the beauty of their game.

It isn't either/or, at least when one has experience of a less simplistic way of looking at a sport.

That said, the lack of advanced stats in soccer is something of a godsend for the sport's popularity in the US, because the insistence of American soccer supporters (that is to say, fans) on describing the jogo bonito (because why wouldn't you throw out some random Portuguese term) with words like regate, pitch, and service is already off-putting in the extreme.

I don't think that any US -v- UK cultural explanations have much merit. People have been analyzing cricket statistics since at least 1864 when Wisden Cricketers' Almanack was first published. At least where I'm from in the UK both sports had a large common pool of fans and amateur players (cricket is played in the summer and soccer in the winter). The history isn't nearly as long, but I think that there is also a useful use of statistics in Rugby.

I tend to agree with the above comments that with soccer it's much harder to quantify what was important about a game. In particular, in a soccer match a player who had a mediocre game but scored a brilliant goal after 30 seconds of inspired play might win the match 1-0. In cricket, a brilliant shot by an otherwise mediocre batsman would add six runs to a team score of several hundred. Stats are much more useful if game performance is based upon consistent behavior. They aren't if games can be won and lost in brief moments of outstanding performance.

Baseball, easily the most quantified sport in the world, is also relatively low scoring, with the outcome of individual games often determined by freak occurrences such as a mediocre player having one outstanding moment. The baseball statistics definitely offer only the long term view.

Of course, baseball seasons are very long: each team plays 162 games. Individual games are determined in large part by random chance, but over the course of baseball's very long season, the cream rises to the top.
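
The point about season length can be made concrete with a binomial calculation. The sketch below assumes, purely for illustration, a team that wins each game independently with a fixed probability of 0.55, and asks how likely it is to finish with a winning record over seasons of different lengths. The 0.55 figure and the independence assumption are mine, not the commenter's.

```python
from math import comb


def prob_winning_record(n_games, p_win):
    """Exact binomial probability of winning more than half of n_games,
    assuming independent games with a constant win probability p_win."""
    threshold = n_games // 2 + 1  # smallest win count that is a strict majority
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(threshold, n_games + 1)
    )


# Compare a single game, a short series, and a full 162-game season.
for n in (1, 7, 162):
    print(n, round(prob_winning_record(n, 0.55), 3))
```

Under these assumptions the better team is barely favored in any one game but is much more likely to come out ahead over 162 games, which is the sense in which the long season lets the cream rise.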

"Baseball easily the most quantified sport": you aren't familiar with many sports, obviously. The amount of stats in the TV coverage is due to TV needing to fill the gaps in play. Cricket is similar to baseball in this respect, and in its stats. Other sports have enormous amounts of stats gathered by automated means, but they don't appear in the coverage so much.

Good cricket stats go back to 1812.

The biggest events are wickets, not runs. There are at most 20 wickets taken by a team in a first class game (ten in a one day game), and the average number of wickets is less. If you fluke the wicket of a number three batsman when your team only takes 5 wickets in the match, it can have a 25% influence on the outcome of a game. It's not the same as fluking a goal in a one-nil soccer win, but it is not as different as you make out.

Henry Chadwick, the key 19th Century figure in the development of baseball statistics, was an English cricket fan who moved at age 12 to Brooklyn.

It's appropriate that a discussion of stats in sports should appear in an economics blog because arcane sports statistics are used in the US to make financial decisions on player signings and contracts, not primarily to reveal nuances of the game. Oddly, some obvious statistics aren't generally published. For instance, in American football, when a pass receiver drops an easily catchable ball it's recorded as an incompletion in the quarterback's resume. Certainly the coaches are aware that an end with butterfingers is unlikely to stay in the lineup but, unlike a stone-handed third baseman, no error is given for the dropped ball in the game synopsis. When the quarterback is forced to eat the ball a sack, or even a fraction of a sack is awarded to a defensive player but a corresponding "missed block" stat isn't kept for the offensive lineman that failed to protect the passer.

American football culture downplays statistics and emphasizes coaches watching huge numbers of hours of videotape of games. Presumably, coaches emerge from the film room with fairly statistically accurate impressions, even if statistics aren't used.

Dropped Pass Leaders in the NFL

Good article on how offensive linemen are graded.
I'm a casual fan, but I hear quite a bit of talk of offensive linemen grades for various games and during free agency when I visit my team's forum.

Non-traditional soccer stats are getting more and more common. They haven't broken through to ESPN, but it's a matter of time before "key passes" (shot assists), expected goals, successful dribbles, tackles/interceptions/etc. break through. However, as has been mentioned, so much of what an elite defender or midfielder does is tied into subtle team and positional contexts that assessing the value of either just from available statistical evidence is near-impossible.

The closest analogue is basketball -- it's hard for individual numbers to capture the value of a good screen, or a well-orchestrated hedge defending a pick and roll. However, in basketball, the frequency of substitutions and sheer number of minutes played allows advanced on-off metrics to capture this value. In soccer, with fewer matches and drastically fewer substitutions, on-off metrics can't have the necessary precision to be useful. This isn't to say that the existing stats don't help in understanding player value, because they definitely do, just that we're a long way from creating a serviceable one-number metric.
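
To show what an on-off metric involves, here is a minimal sketch that computes a team's goal difference per 90 minutes with a given player on the pitch minus the same figure with him off it. The stint format, the player labels, and the numbers are all hypothetical. Real implementations adjust for teammates and opponents, which this does not attempt.

```python
def on_off_differential(stints, player):
    """stints: list of dicts with keys 'players' (set of names), 'minutes',
    'goals_for' and 'goals_against'.

    Returns (goal difference per 90 with player on) minus (same with player
    off), or None if the player has no on-minutes or no off-minutes.
    """
    on = {"minutes": 0, "goal_diff": 0}
    off = {"minutes": 0, "goal_diff": 0}
    for stint in stints:
        bucket = on if player in stint["players"] else off
        bucket["minutes"] += stint["minutes"]
        bucket["goal_diff"] += stint["goals_for"] - stint["goals_against"]
    if on["minutes"] == 0 or off["minutes"] == 0:
        return None
    on_rate = 90 * on["goal_diff"] / on["minutes"]
    off_rate = 90 * off["goal_diff"] / off["minutes"]
    return on_rate - off_rate


# Invented stints: player "A" is on for 150 minutes and off for only 30.
toy_stints = [
    {"players": {"A", "B"}, "minutes": 60, "goals_for": 2, "goals_against": 0},
    {"players": {"B", "C"}, "minutes": 30, "goals_for": 0, "goals_against": 1},
    {"players": {"A", "C"}, "minutes": 90, "goals_for": 1, "goals_against": 1},
]
print(on_off_differential(toy_stints, "A"))
```

In the toy data the "off" side rests on 30 minutes and a single goal, so the differential is dominated by noise. That is the precision problem described above: with three substitutions per match, a regular starter accumulates very few off-minutes.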

A big problem with measuring soccer players’ values is that the game is always evolving (Tactics, Fashion, Steroids, Skill Innovation, Training, Etc.). This causes problems with stats because ones that once were relevant can become obsolete over time. In addition, there is a division of labor on the field with positions being destroyed by creative destruction. For example, I made my trade as a stopper and eventually moved to outside midfield. This was a smart move as the AMERICAN game moved to a flat back four and the stopper position was destroyed. Right now the greatest gains in soccer have come from speed:


I believe for this reason we have only begun to see the destruction of the focus on ball retention. Arsene Wenger has actually recently said that, “It's the first year in the Premier League where the possession doesn't give you the win as much.”
“I keep my philosophy, but as well I'm an observer and I go through the stats of every game at the moment. I am trying to understand; is it something new, is something happening there, is something going on that was not before?”


In my opinion Wenger is the best manager of all-time based off his profit signal. Please note, he is an economist.

Every sport is always evolving.

Next up. Who is the best Old Master painter, to six decimal places. Americans, sigh.

David Brooks, "Baseball or Soccer?"


The original question is silly because the answer is so obvious: what sort of statistics could one collect about soccer/football in the first place?

And the answers are equally obvious; only a few of the comments here have provided decent answers. One can (and there are several organizations that do this, e.g. Opta Sports) collect information about touches, passes completed and incomplete, balls won, tackles, etc. See e.g. FourFourTwo's StatsZone: http://www.fourfourtwo.com/statszone

But even without looking at the stats, it's also obvious that those stats are only going to nibble at the edges of describing what happened in the game. They fail to give a good description of what a player accomplished or even how the team performed.

One needs optical tracking; it's not enough to count events, we also need to track *where* on the field that event happened and where the twenty-two players were at the time (and what direction they were facing and what their velocity was). E.g. a sloppy pass from a fullback to a midfielder might under some circumstances be a minor nuisance: the midfielder merely has to alter his direction and run a few yards to corral the errant pass. But under other circumstances that exact same pass in the exact same spot can be disastrous, if there's an alert defender lurking who can steal the pass and get a breakaway goal.

Needless to say there are companies doing this too, several for the NBA and I believe several for soccer/football too. In the old days coaching staffs would watch video; nowadays there are companies that do that for them, in some cases using human recorders to tabulate the information and in other cases using optical tracking and computers.

The video data are of course enormously complex: twenty-two players plus the ball tracked two-dimensionally or even three-dimensionally. I don't know the state of the art in soccer/football but in the NBA researchers are only starting to figure out how to model and analyze these sorts of data. Fivethirtyeight.com does a good job of presenting recent public work but I suspect that most of the most advanced work in both basketball and soccer/football is done by individual teams and kept proprietary. The Sloan Sports Analytics Conference is another good place to see where the (public) state of the art is; the much smaller NESSIS conference put on by the Harvard statistics department also covers this sort of stuff.

What is the utility function?

There is the value to winning games, and then there is value to the club owners in terms of ticket sales, TV and merchandise revenue. The value of expensive players in the latter case seems like a much easier thing to show.

Soccer has just as many stats as most other sports. For example, every pass has been recorded for all European top divisions in the last 5 years. They have so far prevented the use of the tracking/monitoring devices that have become standard in rugby and some other sports, but they are not behind in other respects.

Large amounts of stats have not made it into most soccer coverage. Mostly the stats are used by teams themselves and sports academics. I guess this is due to the audience. In most of the world Soccer is the only major team sport, and most of the audience doesn't watch any other major sport. It is possibly the world's most isolated and insular sport. It is not a fast growing sport, and the audience in most countries is in the second or third generation. It's possibly also one of the most conservative sports.

Couldn't this guy have done a little research before making his claims? We know that SEVERAL soccer clubs use analytics and advanced statistics. They include: Arsenal, Brentford, FC Midtjylland, Man City, Liverpool, Leicester... hell, Arsene Wenger has recently mentioned expected goals, which right now is the gold standard in soccer analytics, in a post-match interview. Sam Allardyce is famed for using analytics to develop his long-ball tactic. Whether clubs have used analytics successfully is another matter, but this kind of thinking is certainly being done.

Here's an article about the analytics revolution at Midtjylland. He recently hired a prominent analytics blogger, Ted Knutson, to run that side of the business for Midtjylland and his other team, Brentford.


Public soccer analytics bloggers:

http://statsbomb.com/ (a collection of them)


Guys who don't do analytics so much as use them in their analysis:


Honestly, this is just a circular argument of ignorance. Check out www.statsbomb.com and www.analyticsfc.co.uk

That's where the public use of analytics in football is at. The private use at clubs is a good deal farther than that.

Keep in mind that sabermetrics emerged in late 20th Century America as a passion among some smart white guys in part because it was a Safe Space for engaging in pattern recognition without risking one's career. Nobody gets fired like Larry Summers or James D. Watson did for holding a politically incorrect view derived from baseball statistics.

Here's an episode of Hot Takedown (the FiveThirtyEight sports podcast) on exactly this topic: http://espn.go.com/espnradio/play?id=13464281

It seems to me there are two separate issues. The first is the difficulty of coding events in soccer without injecting too much subjectivity.

However, I think that if people really wished to generate useful statistics there are plenty of workable coding strategies. For instance, one can record the fraction of goals scored by a team after a player gains control of the ball within so many meters of the goal, or how frequently they complete a pass when at least y players of the opposing team are within z meters, and so on. Recent advances in computer vision should allow these kinds of statistics to be generated in an automated manner (it should be fine if it only roughly guesses at things like possession).
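
As a rough illustration of the second coding strategy (pass completion when at least y opponents are within z meters), here is a minimal sketch. It assumes tracking data is already available as pitch coordinates in meters; the record layout, field names, and sample values are hypothetical.

```python
from math import hypot


def completion_rate_under_pressure(passes, min_opponents, radius_m):
    """passes: list of dicts with 'passer_xy' (x, y), 'opponents_xy'
    (list of (x, y)) and 'completed' (bool), all positions in meters.

    Returns the completion rate over passes attempted with at least
    min_opponents opponents within radius_m of the passer, or None if
    no pass qualifies.
    """
    attempted = 0
    completed = 0
    for p in passes:
        px, py = p["passer_xy"]
        nearby = sum(
            1 for ox, oy in p["opponents_xy"]
            if hypot(ox - px, oy - py) <= radius_m
        )
        if nearby >= min_opponents:
            attempted += 1
            completed += int(p["completed"])
    return completed / attempted if attempted else None


# Invented tracking records for three passes.
toy_passes = [
    {"passer_xy": (50, 30), "opponents_xy": [(52, 30), (51, 31), (80, 10)], "completed": True},
    {"passer_xy": (20, 20), "opponents_xy": [(40, 40)], "completed": True},
    {"passer_xy": (60, 40), "opponents_xy": [(61, 40), (63, 42), (62, 38)], "completed": False},
]
# y = 2 opponents within z = 5 meters
print(completion_rate_under_pressure(toy_passes, min_opponents=2, radius_m=5))
```

The hard part in practice is producing reliable positions from video in the first place; once those exist, statistics of this kind are simple counts.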

I think the real effect you are seeing is that soccer doesn't have any statistics that are of interest to the average individual. You can look at baseball and basketball statistics and, without running any complicated analysis on them, use them to roughly compare players. Sure better teammates make it easier to score baskets but Michael Jordan's points per game was going to be high no matter who his other players are.

Statistics for soccer won't have any such nice property. Given the importance of passing and teamwork, what really matters are the results of complex models, e.g., how much more likely the team was to score conditional on a certain player gaining possession in a certain area of the field. Of course, if some simple calculation like this produced really good results you could just use that as your statistic, but one expects that, just as in analyzing data in the social sciences, complicated analysis to eliminate various confounders will be required. Fan interest won't justify creating those statistics, but when they get cheap enough to produce algorithmically, bookmakers might (though they will stay secret).
