Scoring dynamics across professional team sports: tempo, balance and predictability
© Merritt and Clauset; licensee Springer 2014
Received: 20 January 2014
Accepted: 5 February 2014
Published: 28 February 2014
Despite growing interest in quantifying and modeling the scoring dynamics within professional sports games, relatively little is known about what patterns or principles, if any, cut across different sports. Using a comprehensive data set of scoring events in nearly a dozen consecutive seasons of college and professional (American) football, professional hockey, and professional basketball, we identify several common patterns in scoring dynamics. Across these sports, scoring tempo - when scoring events occur - closely follows a common Poisson process, with a sport-specific rate. Similarly, scoring balance - how often a team wins an event - follows a common Bernoulli process, with a parameter that effectively varies with the size of the lead. Combining these processes within a generative model of gameplay, we find they both reproduce the observed dynamics in all four sports and accurately predict game outcomes. These results demonstrate common dynamical patterns underlying within-game scoring dynamics across professional team sports, and suggest specific mechanisms for driving them. We close with a brief discussion of the implications of our results for several popular hypotheses about sports dynamics.
(See supplementary material 1)
Professional team sports like American football, soccer, hockey, basketball, etc. provide a rich and relatively well-controlled domain by which to study fundamental questions about the dynamics of competition. In these sports, most environmental irregularities are eliminated, players are highly trained, and rules are enforced consistently. These features produce a level playing field on which competition outcomes are determined largely by a combination of skill and luck (ideally more the former than the latter).
Modern sports in particular produce large quantities of detailed data describing not only competition outcomes and team characteristics, but also the individual events within a competition, e.g., scoring events, referee calls, timeouts, ball possessions, court positions, etc. The availability of such data has enabled many quantitative analyses of individual sports [1–12]. Relatively little work, however, has asked what patterns or principles, if any, cut across different sports, or whether there are fundamental processes governing some dynamical aspects of all such competitions. These questions are the focus of this study, and our results shed light on several other phenomena, including the roles of skill and luck in determining outcomes, and the extent to which events early in the game influence events later in the game.
Game theory provides an attractive quantitative framework for understanding the principles and dynamics of competition. Given a set of payoffs for different actions, formal game theory can identify the optimal strategy or probability distribution over actions against an intelligent adversary. In simple decision spaces, like penalty shots in soccer or serve-and-return play in tennis, professional athletes appear to behave as game theory predicts (although some do not). However, most professional team sports exhibit large and complex decision spaces, with many possible actions of uncertain payoffs, and execution is carried out by an imperfectly coordinated team. Game theory provides less guidance within such complex games, and the resulting dynamics are often better described using tools from dynamical systems [17, 18].
Using such an approach, we investigate the within-game scoring dynamics of four team sports: college and professional (American) football, professional hockey, and professional basketball. Our primary goals are (i) to quantify and identify the common empirical patterns in scoring dynamics of these sports, and (ii) to understand the competitive processes that produce these patterns. We do not consider non-stationary effects across games, e.g., evolving team rosters or skill sets, playing field variables, etc. Instead, we focus explicitly on the sequence of scoring events within games. For each sport, we study three measurable quantities: scoring event tempo, balance, and predictability. We take an inferential approach to investigating their cross-sport patterns and present a generative model of competition dynamics that can be fitted directly to scoring event data within games. We apply this model to a comprehensive data set of 1,279,901 scoring events across 9 or 10 years of consecutive seasons in our four team sports.
There are many claims in both the academic literature and the popular press about scoring dynamics within sports, and sports are often used as exemplars of decision making and dynamics in complex competitive environments [16, 19–21]. Our results on common patterns in scoring dynamics and the processes that generate them serve to clarify, and in several cases directly contradict, many of these claims, and provide a systematic perspective on the general phenomenon.
1.1 Summary of results
Across all sports, scoring tempo - when scoring events occur - is remarkably well-described by a Poisson process, in which scoring events occur independently with a sport-specific rate at each second on the game clock. This rate is fairly stable across the course of gameplay, except in the first and last few seconds of a scoring period, where it is much lower or much higher, respectively, than normal. This common pattern implies that scoring events are largely memoryless, i.e., the timing of events earlier in the game has little or no impact on the timing of future events. Memorylessness contrasts with the dynamics of strategic games like chess or Go, in which events early in a game constrain and drive later events. Instead, professional sports appear to exhibit little strategic entailment, and events are driven instead by short-term optimization for scoring as quickly as possible.
The scoring balance between teams - how often a team wins a scoring event - is well-described by a common Bernoulli process, with a bias parameter that effectively varies over gameplay and across sports. Football and hockey exhibit a common pattern in which the probability of scoring again while in the lead effectively increases with lead size. In basketball, however, this probability decreases with lead size (a phenomenon identified in earlier work). The former pattern is consistent with the outcome of each scoring event being determined by a memoryless coin flip whose bias depends on the difference in the teams’ inherent skill levels. The pattern in basketball is also consistent with such a process, but where on-court team skill varies inversely with lead size as a result of teams deploying their weaker players when they are in the lead and their stronger players when they are not. This player management strategy produces substantially more unpredictable games than in other sports, with winning teams losing their lead and losing teams regaining it much more often than we would normally expect.
A summary of our results, in question-and-answer format
Does scoring in games of different team sports follow common patterns?
Yes. The patterns of when points are scored and who gets them are remarkably similar across sports.
What is the common pattern?
Events occur randomly (a Poisson process). Which team wins the points is a coin flip (a Bernoulli process) that depends on the relative skill difference of the teams on the field.
What might cause this pattern?
A strong focus on short-term maximization of scoring opportunities, while blocking the other team from the same. There is no evidence of strategic planning across plays, as in games like chess or Go. Teams largely react to events as they occur.
What determines how often scoring occurs?
Each sport has a characteristic rate (see Table 3), which increases dramatically at the end of scoring periods.
What determines who wins an event?
Skill and luck, in that order.
Do events early in a game influence events later in a game?
No. Each scoring event or ‘play’ is effectively independent, once we control for relative team skill (and lead size in basketball). Gameplay is effectively ‘memoryless.’
Can a team be ‘hot,’ where they score in streaks?
No. Just like players, teams do not get ‘hot.’ Scoring streaks are caused by getting lucky.
When is it easier or harder to score?
Every moment is equally easy or difficult. But, teams try harder at the end of a period.
Which sport is the most unpredictable?
Pro basketball, where lead sizes (spreads) tend to shrink back to zero. This tendency generates many ‘ties’ as a game unfolds.
Do other sports exhibit this pattern?
No. Pro basketball is the only sport where the spread tends to shrink. In football and hockey, the spread tends to grow over time.
Does being behind help you win, as has been argued?
No. Being behind helps you lose. Being ahead and being lucky helps you win.
We combine these insights within a generative model of gameplay and demonstrate that it accurately reproduces the observed evolution of lead-sizes over the course of games in all four sports, and also makes highly accurate predictions of game outcomes, when only the first few scoring events have occurred. Cursory comparisons suggest that this model achieves accuracy comparable to or better than several commercial odds-makers, despite this model knowing nothing about teams, players, or strategies, and instead relying exclusively on the observed tempo and balance patterns in scoring events.
2 A null model for competition dynamics
We first introduce the limiting case of an ideal competition, which provides a useful tool by which to identify and quantify interesting deviations within real data, and to generate hypotheses as to what underlying processes might produce them. Although we describe this model in terms of two teams accumulating points, it can in principle be generalized to other forms of competition.
In an ideal competition, events unfold on a perfectly neutral or ‘level’ playing field, in which there are no environmental features that could give one side a competitive advantage over the other. Furthermore, each side is perfectly skilled, i.e., they possess complete information both about the state of the game, e.g., the position of the ball, the location of the players, etc., and about the set of possible strategies, their optimum responses, and their likelihood of being employed. This is an unrealistic assumption, as real competitors are imperfectly skilled, and possess both imperfect information and incomplete strategic knowledge of the game. However, increased skill generally implies improved performance on these characteristics, and the limiting case would be perfect skill. Finally, each side exhibits a slightly imperfect ability to execute any particular chosen strategy, which captures the fact that no side can control all variables on the field. In other words, two perfectly skilled teams competing on a level playing field will produce scoring events by chance alone, e.g., a slight miscalculation of velocity, a fumbled pass, shifting environmental variables like wind or heat, etc.
An ideal competition thus eliminates all of the environmental, player, and strategic heterogeneities that normally distinguish and limit a team. The result, particularly from the spectator’s point of view, is a competition whose dynamics are fundamentally unpredictable. Such a competition would be equivalent to a simple stochastic process, in which scoring events arrive randomly, via a Poisson process with rate λ, points are awarded to each team with equal probability, as in a fair Bernoulli process with parameter c = 1/2, and the number of those points is an iid random variable from some sport-specific distribution.
The evolution of the difference in these scores thus follows a finite-length unbiased random walk on the integers, moving left or right with equal probability, starting at a score difference of zero at time t = 0.
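The ideal competition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-second rate of 0.03, the 2,880-second game length, and the uniform choice among point values 1, 2 and 3 are hypothetical placeholders, not values estimated from the data.

```python
import random

def simulate_ideal_game(rate, duration, point_values, rng):
    """Simulate one ideal game; return (second, lead) after each scoring event.

    rate         -- probability of a scoring event in any one second (lambda)
    duration     -- game length T in seconds
    point_values -- possible point values, drawn uniformly here (a placeholder
                    for a sport-specific empirical distribution)
    """
    lead = 0  # score difference, starts at zero at t = 0
    history = []
    for second in range(duration):
        if rng.random() < rate:            # one independent trial per second
            points = rng.choice(point_values)
            if rng.random() < 0.5:         # fair Bernoulli process, c = 1/2
                lead += points
            else:
                lead -= points
            history.append((second, lead))
    return history

rng = random.Random(0)
games = [simulate_ideal_game(0.03, 2880, [1, 2, 3], rng) for _ in range(500)]
final_leads = [g[-1][1] if g else 0 for g in games]
mean_final_lead = sum(final_leads) / len(final_leads)
mean_events = sum(len(g) for g in games) / len(games)
print(f"mean events per game: {mean_events:.1f}")
print(f"mean final lead: {mean_final_lead:+.2f}")
```

Because the walk is unbiased, the mean final lead should sit near zero, while the mean number of events per game should sit near the rate times the game length.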
Real competitions will deviate from this ideal because they possess various non-ideal features. The type and size of such deviations are evidence for competitive mechanisms that drive the scoring dynamics away from the ideal.
3 Scoring event data
Summary of data for each sport, including total number of seasons, teams, competitions, and scoring events
A brief overview of each sport’s primary game mechanics is provided in Additional file 1 as Appendix A. In general, games in these sports are competitions between two teams of fixed size, and points are accumulated each time one team places the ball or puck in the opposing team’s goal. Playing fields are flat, featureless surfaces. Gameplay is divided into three or four scoring periods within a maximum of 48 or 60 minutes (not including potential overtime). The team with the greatest score at the end of this time is declared the winner.
4 Game tempo
A game’s ‘tempo’ is the speed at which scoring events occur over the course of play. Past work on the timing of scoring events has largely focused on hockey, soccer and basketball [4, 6, 10], with little work examining football or contrasting patterns across sports. However, these studies show strong evidence that game tempo is well approximated by a homogeneous Poisson process, in which scoring events occur at each moment in time independently with some small and roughly constant probability.
Analyzing the timing of scoring events across all four of our sports, we find that the Poisson process is a remarkably good model of game tempo, yielding predictions that are in good or excellent agreement with a variety of statistical measures of gameplay. Furthermore, these results confirm and extend previous work [10, 19], while contrasting with others [12, 25], showing little or no evidence for the popular belief in ‘momentum’ or ‘hot hands,’ in which scoring once increases the probability of scoring again very soon. However, we do find some evidence for modest non-Poissonian patterns in tempo, some of which are common to all four sports.
4.1 The Poisson model of tempo
Tempo summary statistics for each sport, along with simple derived values for the expected number of events per game and seconds between events
Under a Poisson model, the number of scoring events per game follows a Poisson distribution with parameter λT, and the maximum likelihood estimate of λ is the average number of events observed in a game divided by the number of intervals T (which varies per sport). Furthermore, the time between consecutive events follows a simple geometric (discrete exponential) distribution, with mean 1/λ, and the two-point correlation between these delays is zero at all time scales.
We quantify this correlation with the two-point correlation function of the inter-arrival times,

$$C(n) = \frac{\sum_k \left(t_k - \langle t \rangle\right)\left(t_{k+n} - \langle t \rangle\right)}{\sum_k \left(t_k - \langle t \rangle\right)^2},$$

where $t_k$ is the $k$th inter-arrival time, $n$ indicates the gap between it and a subsequent event, and $\langle t \rangle$ is the mean time between events. If $C(n)$ is positive, short intervals tend to be followed by other short intervals (or, large intervals by large intervals), while a negative value implies alternation, with short intervals followed by long, or vice versa. Across all four sports, the correlation function is close or very close to zero for all values of $n$ (Figure 2 insets), in excellent agreement with the Poisson process, which predicts $C(n) = 0$ for all $n > 0$, representing no correlation in the timing of events (a result also reported previously for basketball). However, in CFB, NFL and NHL games, we find a slight negative correlation for very small values of $n$, suggesting a slight tendency for short intervals to be closely followed by longer ones, and vice versa.
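To make the two tempo diagnostics concrete, the sketch below estimates the per-second rate from event counts and computes a normalized correlation between inter-arrival times that are n events apart. The function names, the toy counts, the rate of 0.02 and the normalization by the total variance of the gaps are all assumptions chosen for illustration.

```python
import random

def estimate_rate(event_counts, seconds_per_game):
    """MLE of the per-second rate: mean events per game / seconds per game."""
    return sum(event_counts) / (len(event_counts) * seconds_per_game)

def gap_correlation(gaps, n):
    """Normalized correlation between inter-arrival times n events apart."""
    mean = sum(gaps) / len(gaps)
    dev = [g - mean for g in gaps]
    numerator = sum(dev[k] * dev[k + n] for k in range(len(dev) - n))
    denominator = sum(d * d for d in dev)
    return numerator / denominator

def geometric_gap(rate, rng):
    """Seconds until the next event when each second scores with prob. rate."""
    gap = 1
    while rng.random() >= rate:
        gap += 1
    return gap

# Toy event counts for three hypothetical 3600-second games.
toy_rate = estimate_rate([5, 7, 6], 3600)

# Synthetic memoryless gaps: their correlation should be close to zero.
rng = random.Random(1)
gaps = [geometric_gap(0.02, rng) for _ in range(20000)]
mean_gap = sum(gaps) / len(gaps)   # expected to be near 1 / 0.02 = 50
c1 = gap_correlation(gaps, 1)
c5 = gap_correlation(gaps, 5)
print(f"rate={toy_rate:.5f} mean gap={mean_gap:.1f} C(1)={c1:+.3f} C(5)={c5:+.3f}")
```

On real data, the same `gap_correlation` call would be applied to the observed sequence of gaps between scoring events rather than to synthetic draws.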
4.2 Common patterns in game tempo
Our results above provide strong support for a common Poisson-like process for modeling game tempo across all four sports. We also find some evidence for mild non-Poissonian processes, which we now investigate by directly examining the scoring rate as a function of clock time. Within each sport, we tabulate the fraction of games in which a scoring event (associated with any number of points) occurred in the t-th second of gameplay.
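The tabulation just described might look like the following sketch, which assumes each game is represented as a list of event times in seconds (a hypothetical data layout) and counts a game at most once for any given second.

```python
from collections import Counter

def scoring_rate_by_second(games, duration):
    """Fraction of games with at least one scoring event in each second.

    games    -- list of games, each a list of event times in seconds
    duration -- number of seconds on the game clock
    """
    counts = Counter()
    for event_times in games:
        for t in set(event_times):   # count each game once per second
            counts[t] += 1
    return [counts[t] / len(games) for t in range(duration)]

# Four toy games on a 12-second clock (illustrative values only).
toy_games = [[3, 10], [3], [10, 11], [0, 3]]
rates = scoring_rate_by_second(toy_games, 12)
print(rates)
```

Plotting such a list against the clock would show the early dips and end-of-period spikes discussed in the following subsections.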
4.2.1 Early phase: non-linear increase in tempo
When a period begins, players are in specific and fixed locations on the field, and the ball or puck is far from any team’s goal. Thus, without regard to other aspects of the game, it must take some time for players to move out of these initial positions and to establish scoring opportunities. This would reduce the probability of scoring relative to the game average by limiting access to certain player-ball configurations that require time to set up. Furthermore, and potentially most strongly in the first of these phases (beginning at t = 0), players and teams may still be ‘warming up,’ in the sense of learning the capabilities and tendencies of the opposing team and players, and which tactics to deploy against the opposing team’s choices. These behaviors would also reduce the probability of scoring by encouraging risk-averse behavior in establishing and taking scoring opportunities.
We find evidence for both mechanisms in our data. Both CFB and NFL games exhibit short and modest-sized dips in scoring rates in periods 2 and 4, reflecting the fact that player and ball positions are not reset when the preceding quarters end, but rather gameplay in the new quarter resumes from its previous configuration. In contrast, CFB and NFL periods 1 and 3 show significant drops in scoring rates, and both of these quarters begin with a kickoff from fixed positions on the field. Similarly, NBA and NHL games exhibit strong but short-duration dips in scoring rate at the beginning of each of their periods, reflecting the fact that each period begins with a tossup or face-off, in which players are located in fixed positions on the court or rink. NBA and football games also exhibit some evidence of the ‘warming up’ process, with the overall scoring rate being slightly lower in period 1 than in other equivalent periods. In contrast, NHL games exhibit a prolonged warmup period, lasting well past the end of the first period. This pattern may indicate more gradual within-game learning in hockey, perhaps as a result of the large diversity of on-ice player combinations caused by teams rotating their four ‘lines’ of players every few minutes.
4.2.2 Middle phase: constant tempo
Once players have moved away from their initial locations and/or warmed up, gameplay proceeds fluidly, with scoring events occurring without any systematic dependence on the game clock. This produces a flat, stable or stationary pattern in the probability of scoring events. A slight but steady increase in tempo over the course of this phase is consistent with learning, perhaps as continued play sheds more light on the opposing team’s capabilities and weaknesses, causing a progressive increase in scoring rate as that knowledge is accumulated and put into practice.
A stable scoring rate pattern appears in every period in NFL, CFB and NBA games, with slight increases observed in periods 1 and 2 in football, and in periods 2-4 in basketball. NHL games exhibit stable scoring rates in the second half of period 2 and throughout period 3. Within a given game, but across scoring periods, scoring rates are remarkably similar, suggesting little or no variation in overall strategies across the periods of gameplay.
4.2.3 End phase: sharply increased tempo
The end of a scoring period often requires players to reset their positions, and any effort spent establishing an advantageous player configuration is lost unless that play produces a scoring event. This impending loss-of-position will tend to encourage more risky actions, which serve to dramatically increase the scoring rate just before the period ends. The increase in scoring rate should be largest in the final period, when no additional scoring opportunities lie in the future. In some sports, teams may effectively slow the rate at which time progresses through game clock management (e.g., using timeouts) or through continuing play (at the end of quarters in football). This effectively compresses more actions than normal into a short period of time, which may also increase the rate, without necessarily adding more risk.
We find evidence mainly for the loss-of-position mechanism, but the rules of these games suggest that clock management likely also plays a role. Relative to the mean tempo, we find a sharply increased rate at the end of each sport’s games, in agreement with a strong incentive to score before a period ends. (This increase indicates that a ‘lolly-gag strategy,’ in which a leading team in possession intentionally runs down the clock to prevent the trailing team from gaining possession, is a relatively rare occurrence.) Intermediate periods in NFL, CFB and NBA games also exhibit increased scoring rates in their final seconds. In football, this increase is greatest at the end of period 2, rather than period 4. The increased rate at the ends of periods 1 and 3 in football is also interesting, as here the period’s end does not reset the player configuration on the field, but rather teams switch goals. This likely creates a mild incentive to initiate some play before the period ends (which is allowed to finish, even if the game clock runs out). NHL games exhibit no discernible end-phase pattern in their intermediate periods (1 and 2), but show an enormous end-game effect, with the scoring rate growing to more than three times its game mean. This strong pattern may be related to the strategy in hockey of the losing team ‘pulling the goalie,’ in which the goalie leaves their defensive position in order to increase the chances of scoring. Regardless of the particular mechanism, the end-phase pattern is ubiquitous.
In general, we find a common set of modest non-Poissonian deviations in game tempo across all four sports, although the vast majority of tempo dynamics continue to agree with a simple Poisson model.
5 Game balance
A game’s ‘balance’ is the relative distribution of scoring events (not points) among the teams. Perfectly balanced games, however, do not always result in a tie. In our model of competition, each scoring event is awarded to one team or the other by a Bernoulli process, and in the case of perfect balance, the probability is equal, at c = 1/2. The expected fraction of scoring events won by a team is also 1/2, and its distribution depends on the number of scoring events in the game. We estimate this null distribution by simulating perfectly balanced games for each sport, given the empirical distribution of scoring events per game (see Figure 1). Comparing the simulated distribution against the empirical distribution of c provides a measure of the true imbalance among teams, while controlling for the stochastic effects of events within games.
Across all four sports, we find significant deviations in this fraction relative to perfect balance. NFL and CFB games exhibited more variance than expected, while NHL and NBA games exhibited the least. Within a game, scoring balance exhibits unexpected patterns. In particular, NBA games exhibit an unusual ‘restoring force’ pattern, in which the probability of winning the next scoring event decreases with the size of a team’s lead (a pattern observed in earlier work). In contrast, NFL, CFB and NHL games exhibit the opposite effect, in which the probability of winning the next scoring event appears to increase with the size of the lead - a pattern consistent with a heterogeneous distribution of team skill.
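A minimal sketch of this null simulation is given below. It assumes the number of scoring events per game is already available as a list of positive integers; here that list is filled with hypothetical values between 4 and 12 rather than the empirical distribution.

```python
import random

def simulate_balance_null(events_per_game, rng, c=0.5):
    """Fraction of events won by team r in each game under Bernoulli(c).

    events_per_game -- number of scoring events in each game (all > 0)
    c               -- probability that r wins any one event (1/2 = balanced)
    """
    fractions = []
    for n_events in events_per_game:
        wins = sum(1 for _ in range(n_events) if rng.random() < c)
        fractions.append(wins / n_events)
    return fractions

rng = random.Random(2)
# Placeholder for the empirical distribution of events per game.
events_per_game = [rng.randint(4, 12) for _ in range(10000)]
null_fractions = simulate_balance_null(events_per_game, rng)

mean_fraction = sum(null_fractions) / len(null_fractions)
tie_share = sum(1 for f in null_fractions if f == 0.5) / len(null_fractions)
print(f"mean fraction won: {mean_fraction:.3f}, share of even splits: {tie_share:.3f}")
```

Even with perfectly balanced teams, only a minority of simulated games split their events evenly, which illustrates why perfectly balanced games do not always end in a tie.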
5.1 Quantifying balance
The fraction of all events in the game that were won by a randomly selected team provides a simple measure of the overall balance of a particular game in a sport. Let r and b index the two teams and let $E_r$ ($E_b$) denote the total number of events won by team r (b) in its game with b (r). The maximum likelihood estimator for a game’s bias is simply the fraction of all scoring events in the game won by r, $\hat{c} = E_r / (E_r + E_b)$.
In CFB and NFL, the distributions of scoring balances are similar, but the shape for CFB is broader than for NFL, suggesting that CFB competitions are less balanced than NFL competitions. This is likely a result of the broader range of skill differences among teams at the college level, as compared to the professionals. Like CFB and NFL, NHL games also exhibit substantially more blowouts and fewer ties than expected, which is consistent with a heterogeneous distribution of team skills. Surprisingly, however, NBA games exhibit less variance in the final relative lead size than we expect for perfectly balanced games, a pattern we will revisit in the following section.
5.2 Scoring while in the lead
Although many non-Bernoulli processes may occur within professional team sports, here we examine only one: whether the size of a lead L, the difference in team scores or point totals, provides information about the probability of a team winning the next event. Previous work considered this question for scoring events and lead sizes within NBA games, but not other sports. Across all four of our sports, we tabulated the fraction of times the leading team won the next scoring event, given it held a lead of size L. This function is symmetric about L = 0, where it passes through probability 1/2 and where the identity of the leading team may change.
Although the positive function for CFB, NFL and NHL games may superficially support a kind of ‘hot hands’ or cumulative advantage-type mechanism, in which lead size tends to grow superlinearly over time, we do not believe this explains the observed pattern. A more plausible mechanism is a simple heterogeneous skill model, in which each team has a latent skill value $\pi$, and the probability that team r wins a scoring event against b is determined by a Bernoulli process with $c = \pi_r / (\pi_r + \pi_b)$. (This model is identical to the popular Bradley-Terry model of win-loss records of teams, except here we apply it to each scoring event within a game.)
For a broad class of team-skill distributions, this model produces a scoring function with the same sigmoidal shape seen here, and the linear pattern near L = 0 is the result of averaging over the distribution of biases c induced by the team skill distribution. The function flattens out at large |L|, assuming the value representing the largest skill difference possible among the league teams. This explanation is supported by the stronger correlation in CFB games (+0.005 probability per point in the lead) versus NFL games (+0.002 probability per point), as CFB teams are known to exhibit much broader skill differences than NFL teams, in agreement with our results above in Figure 4.
NBA games, however, present a puzzle, because no distribution of skill differences can produce a negative correlation under this latent-skill model. Earlier work suggested this negative pattern could be produced by possession of the ball changing after each scoring event, or by the leading team ‘coasting’ and thereby playing below their true skill level. However, the change-of-possession rule also exists in CFB and NFL games (play resumes with a faceoff in NHL games), but only NBA games exhibit the negative correlation. Coasting could occur for psychological reasons, in which losing teams play harder, and leading teams less hard, as has been suggested. Again, however, the absence of this pattern in other sports suggests that the mechanism is not psychological.
A plausible alternative explanation is that NBA teams employ various strategies that serve to change the ratio c as a function of lead size. For instance, when a team is in the lead, they often substitute out their stronger and more offensive players, e.g., to allow them to rest or avoid injury, or to manage floor spacing or skill combinations. When a team is down by an amount that likely varies across teams, these players are put back on the court. If both teams pursue such strategies, the effective ratio c will vary inversely with lead size such that the leading team becomes effectively weaker compared to the non-leading team. In contrast to NBA teams, teams in CFB, NFL and NHL seem less able to pursue such a strategy. In football, substitutions are relatively uncommon, implying that c should not vary much over the course of a game. In hockey, each team rotates through most of its players every few minutes, which limits the ability for high- or low-skilled players to effectively change over the course of a game.
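The tabulation of scoring while in the lead can be sketched as follows. The event format, a list of `(team, points)` pairs per game with teams labeled `'r'` and `'b'`, is an assumption for illustration, and this version reports only the leading team's perspective (lead sizes greater than zero), skipping events that occur while the game is tied.

```python
from collections import defaultdict

def leader_scores_next(games):
    """Fraction of events won by the leading team, keyed by lead size.

    games -- list of games, each a list of (team, points) events in order,
             with team equal to 'r' or 'b'
    """
    won = defaultdict(int)
    total = defaultdict(int)
    for events in games:
        lead = 0  # lead from r's perspective
        for team, points in events:
            if lead != 0:                       # tied games have no leader
                leader = 'r' if lead > 0 else 'b'
                total[abs(lead)] += 1
                if team == leader:
                    won[abs(lead)] += 1
            lead += points if team == 'r' else -points
    return {size: won[size] / total[size] for size in sorted(total)}

# Two toy games (illustrative values only).
toy_games = [
    [('r', 2), ('r', 3), ('b', 2), ('b', 3), ('b', 1)],
    [('b', 2), ('r', 1), ('b', 2)],
]
result = leader_scores_next(toy_games)
print(result)
```

The trailing team's curve follows by symmetry, since the probability that the trailing team wins the next event is one minus the leader's value at the same lead size.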
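The claim that heterogeneous skill alone can produce a rising scoring function can be checked with a small simulation. The sketch below assumes log-normally distributed skills, one point per event and 30 events per game; all of these are hypothetical choices, and the control run with zero skill spread should give a flat function near 1/2.

```python
import math
import random
from collections import defaultdict

def simulate_skill_model(n_games, n_events, skill_sigma, rng):
    """Leader-scores-next fraction by lead size under a latent-skill model."""
    won = defaultdict(int)
    total = defaultdict(int)
    for _ in range(n_games):
        # Latent skills (log-normal spread is an assumption for illustration).
        pi_r = math.exp(rng.gauss(0.0, skill_sigma))
        pi_b = math.exp(rng.gauss(0.0, skill_sigma))
        c = pi_r / (pi_r + pi_b)           # Bradley-Terry style bias
        lead = 0
        for _ in range(n_events):
            r_scores = rng.random() < c
            if lead != 0:
                leader_is_r = lead > 0
                total[abs(lead)] += 1
                if r_scores == leader_is_r:
                    won[abs(lead)] += 1
            lead += 1 if r_scores else -1   # one point per event
    return {size: won[size] / total[size] for size in sorted(total)}

rng = random.Random(3)
mixed = simulate_skill_model(5000, 30, 0.5, rng)   # heterogeneous skills
flat = simulate_skill_model(5000, 30, 0.0, rng)    # identical skills
print(f"heterogeneous: p(1)={mixed[1]:.3f}, p(6)={mixed[6]:.3f}")
print(f"identical:     p(1)={flat[1]:.3f}")
```

No memory is built into either run: within a game every event uses the same bias c, so the upward trend in the heterogeneous case arises only because large leads are more likely to be held by the stronger team.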
6 Modeling lead-size dynamics
The previous insights identify several basic patterns in scoring tempo and balance across sports. However, we still lack a clear understanding of the degree to which any of these patterns is necessary to produce realistic scoring dynamics. Here, we investigate this question by combining the identified patterns within a generative model of scoring over time, and test which combinations produce realistic dynamics in lead sizes. In particular, we consider two models of tempo and two models of balance. For each of the four pairs of tempo and balance models for each sport, we generate via Monte Carlo a large number of games and measure the resulting variation in lead size as a function of the game clock, which we then compare to the empirical pattern.
Our two scoring tempo models are as follows. In the first (Bernoulli) model, each second of time produces an event with the empirical probability observed for that second across all games (shown in Figure 3). In the second (Markov), we draw an inter-arrival time from the empirical distribution of such gaps (shown in Figure 2), advance the game clock by that amount, and generate a scoring event at that clock time.
Our two balance models are as follows. In the first (Bernoulli) model, for each match we draw a uniformly random value c from the empirical distribution of scoring balances (shown in Figure 4) and for each scoring event, the points are won by team r with that probability and by team b otherwise. In the second (Markov), a scoring event is awarded to the leading team with the empirically estimated probability for the current lead size L (shown in Figure 5). Once a scoring event is generated and assigned, that team’s score is incremented by a point value drawn iid from the empirical distribution of point values per scoring event for the sport (see Additional file 1, Appendix B).
The four combinations of tempo and balance models thus cover our empirical findings for patterns in the scoring dynamics of these sports. The simpler models (called Bernoulli) represent dynamics with no memory, in which each event is an iid random variable, albeit drawn from a data-driven distribution. The more complicated models (called Markov) represent dynamics with some memory, allowing past events to influence the ongoing gameplay dynamics. In particular, these are first-order Markov models, in which only the events of the most recent past state influence the outcome of the random variable at the current state.
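One of the four combinations, the Bernoulli tempo model paired with the Markov balance model, can be sketched as below. The constant tempo of 0.05 events per second, the 1,200-second clock, the uniform point values and the two linear lead-size functions are hypothetical stand-ins for the empirical quantities; only the +0.005 slope echoes a value quoted in the text.

```python
import random

def simulate_game(tempo, p_leader, point_values, rng):
    """Return the lead (from r's perspective) at every second of one game.

    tempo        -- tempo[t] is the probability of an event in second t
    p_leader     -- function mapping lead size to P(leader wins next event)
    point_values -- possible point values, drawn uniformly (placeholder)
    """
    lead = 0
    trajectory = []
    for p_event in tempo:
        if rng.random() < p_event:
            if lead == 0:
                r_scores = rng.random() < 0.5
            else:
                leader_scores = rng.random() < p_leader(abs(lead))
                r_scores = leader_scores == (lead > 0)
            points = rng.choice(point_values)
            lead += points if r_scores else -points
        trajectory.append(lead)
    return trajectory

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def growing(size):      # leader favored more as the lead grows
    return min(0.9, 0.5 + 0.005 * size)

def restoring(size):    # leader favored less as the lead grows
    return max(0.1, 0.5 - 0.005 * size)

rng = random.Random(4)
tempo = [0.05] * 1200
final_growing = [simulate_game(tempo, growing, [1, 2, 3], rng)[-1] for _ in range(400)]
final_restoring = [simulate_game(tempo, restoring, [1, 2, 3], rng)[-1] for _ in range(400)]
var_growing = variance(final_growing)
var_restoring = variance(final_restoring)
print(f"final lead variance: growing={var_growing:.0f}, restoring={var_restoring:.0f}")
```

Computing the variance at every clock second, rather than only at the final one, would give the lead-size variation curve that is compared against the empirical pattern.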
That being said, some small deviations remain. For instance, the Markov model slightly overestimates the lead-size variation in the first half, and slightly underestimates it in the second half of CFB games. In NFL games, it provides a slight overestimate in the first half, but then converges on the empirical pattern in the second half. NHL games exhibit the largest and most systematic deviation, with the Markov model producing more variation than observed, particularly in the game’s second half. However, it should be noted that the low-scoring nature of NHL means that what appears to be a visually large overestimate here (Figure 6) is small when compared to the deviations seen in the other sports. NBA games exhibit a similar pattern to CFB games, but the crossover point occurs at the end of period 3, rather than at period 2. These modest deviations suggest the presence of still other non-ideal processes governing the scoring dynamics, particularly in NHL games.
We emphasize that the Markov model’s accuracy for CFB, NFL and NHL games does not imply that individual matches follow this pattern of favoring the leader. Instead, the pattern provides a compact and efficient summary of scoring dynamics conditioned on unobserved characteristics like team skill. Our model generates competition between two featureless teams, and the Markov model provides a data-driven mechanism by which some pairs of teams may behave as if they have small or large differences in latent skill. It remains an interesting direction for future work to investigate precisely how player and team characteristics determine team skill, and how team skill impacts scoring dynamics.
7 Predicting outcomes from gameplay
The accuracy of our generative model in the previous section suggests that it may also produce accurate predictions of the game’s overall outcome, after observing only the events in the first t seconds of the game. In this section, we study the predictability of game outcome using the Markov model for scoring balance, and compare its accuracy to the simple heuristic of guessing the winner to be the team currently in the lead at time t. Thus, we convert our Markov model into an explicit Markov chain on the lead size L, which allows us to simulate the remaining seconds conditioned on the lead size at time t. For concreteness, we define the lead size L relative to team r, such that L < 0 implies that b is in the lead.
The transition probabilities of this chain are built, for the particular sport, from the empirical probability function for scoring as a function of lead size (Figure 5), from r’s perspective, and the empirical distribution (Additional file 1, Appendix B) for the point value.
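One way to write the transition probabilities described here is sketched below. The notation is assumed for illustration and does not come from the text: P is the transition matrix over lead sizes, p(L) is the empirical probability that r wins a scoring event when the lead is L, and q(k) is the empirical probability that an event is worth k points.

```latex
P_{L,\,L+k} = p(L)\, q(k), \qquad
P_{L,\,L-k} = \bigl[\,1 - p(L)\,\bigr]\, q(k).
```

Under these assumptions each row of P sums to one, since q(k) sums to one over point values and the two outcomes (r scores, b scores) are exhaustive.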
The probability that team r is the predicted winner depends on the probability distribution over lead sizes at time T. Because scoring events are conditionally independent, this distribution is given by the n-th power of the transition matrix, where n is the expected number of scoring events in the remaining clock time T − t, multiplied by a vector representing the initial state, i.e., the lead size at time t. Given a choice of time t, we estimate n as the expected number of events given the empirical tempo function (Figure 3, also the Bernoulli tempo model in Section 6) and the remaining clock time. We then convert this distribution, which we calculate numerically, into a prediction by summing probabilities for each of three outcomes: r wins (states L > 0), r ties b (state L = 0), and b wins (states L < 0). In this way, we capture the information contained in the magnitude of the current lead, which is lost when we simply predict that the current leader will win, regardless of lead size.
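The numerical calculation described here can be sketched in a few lines of plain Python. Several details are assumptions made for illustration only: lead sizes are truncated to a finite range and clamped at the edges, the expected number of remaining events is rounded to an integer, the tempo function is represented as a list of per-second event probabilities, and all function names are hypothetical.

```python
def build_transition(max_lead, p_score, point_dist):
    """Row-stochastic matrix over lead sizes -max_lead..max_lead.

    p_score(lead) is the probability that r wins an event at that lead;
    point_dist maps a point value k to its probability. Leads that would
    leave the range are clamped to the boundary state (an assumption).
    """
    size = 2 * max_lead + 1
    P = [[0.0] * size for _ in range(size)]
    for i in range(size):
        p = p_score(i - max_lead)
        for k, qk in point_dist.items():
            P[i][min(size - 1, i + k)] += p * qk
            P[i][max(0, i - k)] += (1.0 - p) * qk
    return P

def expected_events(tempo, t):
    """Expected number of events from second t to the end of the game."""
    return sum(tempo[t:])

def predict(P, max_lead, current_lead, n_events):
    """Return (Pr r wins, Pr tie, Pr b wins) after n_events more events."""
    size = len(P)
    dist = [0.0] * size
    dist[current_lead + max_lead] = 1.0  # initial state: the lead at time t
    for _ in range(n_events):
        new = [0.0] * size
        for i, mass in enumerate(dist):
            if mass:
                for j, pij in enumerate(P[i]):
                    new[j] += mass * pij
        dist = new
    return sum(dist[max_lead + 1:]), dist[max_lead], sum(dist[:max_lead])

if __name__ == "__main__":
    tempo = [0.01] * 3600          # hypothetical constant tempo
    P = build_transition(60, lambda lead: 0.5, {3: 0.4, 7: 0.6})
    n = round(expected_events(tempo, 1800))
    print(predict(P, 60, 7, n))
```

Repeated vector-matrix multiplication is used instead of forming the matrix power explicitly; the two give the same distribution, and the former avoids a matrix library.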
We test the accuracy of the Markov chain using an out-of-sample prediction scheme, in which we repeatedly divide each sport’s game data into a training set of a randomly selected 3/4 of all games and a test set of the remaining 1/4. From each training set, we estimate the empirical functions used in the model and compute the Markov chain’s transition matrix. Then, across the games in each test set, we measure the mean fraction of times the Markov chain’s prediction is correct. This fraction is equivalent to the popular AUC statistic, where AUC = 0.5 denotes an accuracy no better than guessing.
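The repeated train/test scheme can be sketched as follows. The game representation (a dictionary with a `lead_at_t` and a `winner` field), the function names, and the number of repetitions are hypothetical; the `fit` and `predict` callables stand in for estimating the empirical functions and querying the resulting chain.

```python
import random

def split_games(games, train_frac, rng):
    """Randomly split games into a training set and a test set."""
    shuffled = games[:]
    rng.shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def mean_accuracy(games, fit, predict, n_repeats=10, seed=0):
    """Average test-set accuracy over repeated random 3/4 : 1/4 splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, test = split_games(games, 0.75, rng)
        model = fit(train)  # estimated from the training games only
        correct = sum(predict(model, g["lead_at_t"]) == g["winner"] for g in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

def fit_nothing(train):
    """The 'leader wins' heuristic has no parameters to estimate."""
    return None

def leader_wins(model, lead):
    """Predict r if r leads; otherwise b (ties are broken toward b here)."""
    return "r" if lead > 0 else "b"

if __name__ == "__main__":
    toy = [{"lead_at_t": l, "winner": w}
           for l, w in [(3, "r"), (-2, "b"), (1, "b"), (-1, "r"),
                        (5, "r"), (-7, "b"), (2, "r"), (-4, "b")]]
    print(mean_accuracy(toy, fit_nothing, leader_wins))
```

Fitting inside the loop, on the training games alone, is what keeps each evaluation out-of-sample.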
Instead of evaluating the model at some arbitrarily selected time, we investigate how outcome predictability evolves over time. Specifically, we compute the AUC as a function of the cumulative number of scoring events in the game, using the empirically observed times and lead sizes in each test-set game to parameterize the model’s predictions. When the number of cumulative events is small, game outcomes should be relatively unpredictable, and as the clock runs down, predictability should increase. To provide a reference point for the quality of these results, we also measure the AUC over time for a simple heuristic of predicting the winner as the team in the lead after the event.
In all cases, the Markov chain substantially outperforms the ‘leader wins’ heuristic, even in the low-scoring NHL games. This occurs in part because small leads are less informative than large leads for guessing the winner, and the heuristic does not distinguish between these.
Although there is increasing interest in quantitative analysis and modeling in sports [31–35], many questions remain about what patterns or principles, if any, cut across different sports, what basic dynamical processes provide good models of within-game events, and the degree to which the outcomes of games may be predicted from within-game events alone. The comprehensive database of scoring events we use here to investigate such questions is unusual for its scope (every league game over 9-10 seasons), its breadth (covering four sports), and its depth (timing and attribution information on every point in every game). As such, it offers a number of new opportunities to study competition in general, and sports in particular.
Across college (American) football (CFB), professional (American) football (NFL), professional hockey (NHL) and professional basketball (NBA) games, we find a number of common patterns in both the tempo and balance of scoring events. First, the timing of events in all four sports is remarkably well-approximated by a simple Poisson process (Figures 1 and 2), in which each second of gameplay produces a scoring event independently, with a probability that varies only modestly over the course of a game (Figure 3). These variations, however, follow a common three-phase pattern, in which a relatively constant rate is depressed at the beginning of a scoring period, and increases dramatically in the final few seconds of the period. The excellent agreement with a Poisson process implies that teams employ very few strategically-chosen chains of events or time-sensitive strategies in these games, except in a period’s end-phase, when the incentive to score is elevated. These results provide further support to some past analyses [10, 19], while contrasting with others [12, 25], showing no evidence for the popular notion of ‘hot hands,’ in which scoring once increases the chance of scoring again soon.
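The picture of each second producing an event independently has a simple testable consequence: the gaps between consecutive events follow a geometric distribution (the discrete analogue of the exponential inter-arrival times of a Poisson process), with mean equal to the inverse of the per-second probability. The sketch below checks this on simulated data. The constant rate `p` and the function names are assumptions for illustration, and the three-phase variation in rate described above is ignored.

```python
import random

def event_times(n_seconds, p, rng):
    """Seconds at which an event occurs, each with independent probability p."""
    return [t for t in range(n_seconds) if rng.random() < p]

def inter_arrival_times(times):
    """Gaps, in seconds, between consecutive events."""
    return [b - a for a, b in zip(times, times[1:])]

def geometric_pmf(gap, p):
    """Probability that the next event arrives exactly `gap` seconds later."""
    return (1.0 - p) ** (gap - 1) * p

if __name__ == "__main__":
    p = 0.03  # hypothetical per-second scoring probability
    gaps = inter_arrival_times(event_times(500_000, p, random.Random(1)))
    print("mean gap:", sum(gaps) / len(gaps), "expected:", 1.0 / p)
    for gap in (1, 10, 50):
        observed = sum(g == gap for g in gaps) / len(gaps)
        print(gap, round(observed, 4), round(geometric_pmf(gap, p), 4))
```

A systematic excess of short gaps relative to `geometric_pmf` would be the kind of signature a 'hot hands' effect should leave.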
Second, we find a common pattern of imbalanced scoring between teams in CFB, NFL and NHL games, relative to an ideal model in which teams are equally likely to win each scoring event (Figure 4). CFB games are much less balanced than NFL games, suggesting that the transition from college to professional tends to reduce the team skill differences that generate lopsided scoring. This reduction in variance is likely related both to the fact that only the stronger college-level players successfully move up into the professional teams, and to the way the NFL Draft tends to distribute the stronger of these new players to the weaker teams.
Furthermore, we find that all three of these sports exhibit a pattern in which lead sizes tend to increase over time. That is, the probability of scoring while in the lead tends to be larger the greater the lead size (Figure 5), in contrast to the ideal model in which lead sizes increase or decrease with equal probability. As with overall scoring balance, the size of this effect in CFB games is much larger (about 2.5 times larger) than in NFL games, which is consistent with a reduction in the variance of the distribution of skill across teams. That is, NFL teams are generally closer in team skill than CFB teams, and this produces gameplay that is much less predictable. Both of these patterns are consistent with a kind of Bradley-Terry-type model in which each scoring event is a contest between the teams.
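For reference, the standard Bradley-Terry form expresses the probability that one competitor beats another through positive strength parameters. Applied to a single scoring event, and assuming hypothetical latent skills \pi_r and \pi_b for the two teams (symbols introduced here for illustration, not used elsewhere in the text), it reads:

```latex
\Pr(r \text{ wins the event}) = \frac{\pi_r}{\pi_r + \pi_b},
\qquad \pi_r, \pi_b > 0 .
```

Equal skills give a probability of 1/2, which recovers the ideal model; a broader distribution of skills across teams produces more lopsided scoring, in line with the CFB versus NFL comparison.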
NBA games, however, present the opposite pattern: team scores are much closer than we would expect from the ideal model, and the probability of scoring while in the lead effectively decreases as the lead size grows (Figure 5; a pattern identified in earlier work). This pattern produces a kind of ‘restoring force’ such that leads tend to shrink until they turn into ties, producing games that are substantially more unpredictable. Unlike the pattern in CFB, NFL and NHL, no distribution of latent team skills, under a Bradley-Terry-type model, can produce this kind of negative correlation between the probability of scoring and lead size.
A recent analysis of similar NBA game data argued that increased psychological motivation drives teams that are slightly behind (e.g., by one point at halftime) to win the game more often than not. That is, losing slightly is good for winning. Our analysis places this claim in a broader, more nuanced context. The effective restoring force is superficially consistent with the belief that losing in NBA games is ‘good’ for the team, as losing does indeed empirically increase the probability of scoring. However, we find no such effect in CFB, NFL or NHL games (Figure 5), suggesting either that NBA players are more poorly motivated than players in other team sports or that some other mechanism explains the pattern.
One such mechanism is for NBA teams to employ strategies associated with substituting weaker players for stronger ones when they hold various leads, e.g., to allow their best players to rest or avoid injury, manage floor spacing and offensive/defensive combinations, etc., and then reverse the process when the other team leads. In this way, a team will play more weakly when it leads, and more strongly when it is losing, because of personnel changes alone rather than changes in morale or effort. If teams have different thresholds for making such substitutions, and differently skilled best players, the averaging across these differences would produce the smooth pattern observed in the data. Such substitutions are indeed common in basketball games, while football and hockey teams are inherently less able to alter their effective team skill through such player management, which may explain the restoring force’s presence in NBA games and its absence in CFB, NFL or NHL games. It would be interesting to determine whether college basketball games exhibit the same restoring force, and the personnel management hypothesis could be tested by estimating the on-court team’s skill as a function of lead size.
The observed patterns we find in the probability of scoring while in the lead are surprisingly accurate at reproducing the observed variation in lead-size dynamics in these sports (Figure 6), and suggest that this one pattern provides a compact and mostly accurate summary of the within-game scoring dynamics of a sport. However, we do not believe these patterns indicate the presence of any feedbacks, e.g., ‘momentum’ or cumulative advantage. Instead, for CFB, NFL and NHL games, this pattern represents the distribution of latent team skills, while for NBA games, it represents strategic decisions about which players are on the court as a function of lead size.
This pattern also makes remarkably good predictions about the overall outcome of games, even when given information about only the first ℓ scoring events. Under a controlled out-of-sample test, we found that CFB, NFL and NHL game outcomes are highly predictable, even after only a few events. In contrast, NBA games were significantly less predictable, although reasonable predictions here can still be made, despite the impact of the restoring force.
Given the popularity of betting on sports, it is an interesting question as to whether our model produces better or worse predictions than those of established odds-makers. To explore this question, we compared our model against two such systems, the online live-betting website Bovada and the odds-maker website Sports Book Review (SBR). Neither site provided comprehensive coverage or systematic access, and so our comparison was necessarily limited to a small sample of games. Among these, however, our predictions were very close to those of Bovada, and, after 20% of each game’s events had occurred, were roughly 10% more accurate than SBR’s money lines across all sports. Although the precise details are unknown for how these commercial odds were set, it seems likely that they rely on many details omitted by our model, such as player statistics, team histories, team strategies and strengths, etc. In contrast, our model uses only information relating to the basic scoring dynamics within a sport, and knows nothing about individual teams or game strategies. In that light, its accuracy is impressive.
These results suggest several interesting directions for future work. For instance, further elucidating the connection between team skill and the observed scoring patterns would provide an important connection between within-game dynamics and team-specific characteristics. These, in turn, could be estimated from player-level characteristics to provide a coherent understanding of how individuals cooperate to produce a team and how teams compete to produce dynamics. Another missing piece of the dynamics puzzle is the role played by the environment and the control of space for creating scoring opportunities. Recent work on online games with heterogeneous environments suggests that these spatial factors can have a large impact on scoring tempo and balance, but time series data on player positions on the field would further improve our understanding. Finally, our data omit many aspects of gameplay, including referee calls, timeouts, fouls, etc., which may provide for interesting strategic choices by teams, e.g., near the end of the game, as with clock management in football games. Progress on these and other questions would shed more light on the fundamental question of how much of gameplay may be attributed to skill versus luck.
Finally, our results demonstrate that common patterns and processes do indeed cut across seemingly distinct sports, and these patterns provide remarkably accurate descriptions of the events within these games and predictions of their outcomes. However, many questions remain unanswered, particularly as to what specific mechanisms generate the modest deviations from the basic patterns that we observe in each sport, and how exactly teams exerting such great efforts against each other can conspire to produce gameplay so reminiscent of simple stochastic processes. We look forward to future work that further investigates these questions, which we hope will continue to leverage the powerful tools and models of dynamical systems, statistical physics, and machine learning with increasingly detailed data on competition.
We thank Dan Larremore, Christopher Aicher, Joel Warner, Mason Porter, Peter Mucha, Pete McGraw, Dave Feldman, Sid Redner, Alan Gabel, Owen Newkirk, Oskar Burger, Rajiv Maheswaran and Chris Meyer for helpful conversations. This work was supported in part by the James S McDonnell Foundation.
- Klaassen FJGM, Magnus JR: Are points in tennis independent and identically distributed? Evidence from a dynamic binary panel data model. J Am Stat Assoc 2001, 96: 500–509. 10.1198/016214501753168217
- Albert J, Bennett J, Cochran JJ: Anthology of statistics in sports, vol 16. SIAM, Philadelphia; 2005.
- Ben-Naim E, Vazquez F, Redner S: What is the most competitive sport? J Korean Phys Soc 2007, 50: 124–126. 10.3938/jkps.50.124
- Thomas AC: Inter-arrival times of goals in ice hockey. J Quant Anal Sports 2007, 3(3).
- Duch J, Waitzman JS, Amaral LAN: Quantifying the performance of individual players in a team activity. PLoS ONE 2010, 5: Article ID 10937.
- Heuer A, Müller C, Rubner O: Soccer: is scoring goals a predictable Poissonian process? Europhys Lett 2010, 89: Article ID 38007.
- Buttrey SE, Washburn AR, Price WL: Estimating NHL scoring rates. J Quant Anal Sports 2011, 7(3): Article ID 24.
- Radicchi F: Who is the best player ever? A complex network analysis of the history of professional tennis. PLoS ONE 2011, 6: Article ID 17249.
- Radicchi F: Universality, limits and predictability of gold-medal performances at the Olympics games. PLoS ONE 2012, 7: Article ID 40335.
- Gabel A, Redner S: Random walk picture of basketball scoring. J Quant Anal Sports 2012, 8.
- Goldman M, Rao JM: Effort vs. concentration: the asymmetric impact of pressure on NBA performance. Proceedings MIT Sloan sports analytics conference 2012, 1–10.
- Yaari G, David G: ‘Hot hand’ on strike: bowling data indicates correlation to recent past results, not causality. PLoS ONE 2012, 7: Article ID 30112.
- Myerson RB: Game theory: analysis of conflict. Harvard University Press, Cambridge; 1997.
- Palacios-Huerta I: Professionals play minimax. Rev Econ Stud 2003, 70(2):395–415. 10.1111/1467-937X.00249
- Walker M, Wooders J: Minimax play at Wimbledon. Am Econ Rev 2001, 91(5):1521–1538. 10.1257/aer.91.5.1521
- Romer D: Do firms maximize? Evidence from professional football. J Polit Econ 2006, 114(2):340–365. 10.1086/501171
- Reed D, Hughes M: An exploration of team sport as a dynamical system. Int J Perform Anal Sport 2006, 6(2):114–125.
- Galla T, Farmer JD: Complex dynamics in learning complicated games. Proc Natl Acad Sci USA 2013, 110: 1232–1236. 10.1073/pnas.1109672110
- Ayton P, Fischer I: The hot hand fallacy and the gambler’s fallacy: two faces of subjective randomness? Mem Cogn 2004, 32(8):1369–1378. 10.3758/BF03206327
- Balkundi P, Harrison DA: Ties, leaders, and time in teams: strong inference about network structure’s effects on team viability and performance. Acad Manag J 2006, 49: 49–68. 10.5465/AMJ.2006.20785500
- Berger J, Pope D: Can losing lead to winning? Manag Sci 2011, 57(5):817–827. 10.1287/mnsc.1110.1328
- Vergin RC: Winning streaks in sports and the misperception of momentum. J Sport Behav 2000, 23: 181.
- Merritt S, Clauset A: Environmental structure and competitive scoring advantages in team competitions. Sci Rep 2013, 3: Article ID 3067.
- Barney J: Firm resources and sustained competitive advantage. J Manag 1991, 17: 99–120.
- Yaari G, David G: The hot (invisible?) hand: can time sequence patterns of success/failure in sports be modeled as repeated independent trials. PLoS ONE 2011, 6: Article ID 24532.
- Boas ML: Mathematical methods in the physical sciences. 3rd edition. Wiley, Hoboken; 2006.
- Box GEP, Jenkins GM, Reinsel GC: Time series analysis: forecasting and control. Wiley, Hoboken; 2013.
- Thompson P: Learning by doing. In Handbook of economics of technical change. Edited by: Hall B, Rosenberg N. Elsevier, Philadelphia; 2010:429–476.
- Bradley RA, Terry ME: Rank analysis of incomplete block designs: I. the method of paired comparisons. Biometrika 1952, 39(3/4):324–345. 10.2307/2334029
- Bradley AP: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit 1997, 30(7):1145–1159. 10.1016/S0031-3203(96)00142-2
- Arkes J, Martinez J: Finally, evidence for a momentum effect in the NBA. J Quant Anal Sports 2011, 7.
- Bourbousson J, Sève C, McGarry T: Space-time coordination dynamics in basketball: Part 2. The interaction between the two teams. J Sports Sci 2012, 28(3):349–358.
- de Saá Guerra Y, Martín González JM, Sarmiento Montesdeoca S, Rodríguez Ruiz D, Arjonilla López N, García Manso JM: Basketball scoring in NBA games: an example of complexity. J Syst Sci Complex 2013, 26(1):94–103. 10.1007/s11424-013-2282-3
- Everson P, Goldsmith-Pinkham PS: Composite Poisson models for goal scoring. J Quant Anal Sports 2008, 4(2): Article ID 13.
- Neiman T, Loewenstein Y: Reinforcement learning in professional basketball players. Nat Commun 2011, 2: Article ID 569.
- Price DDS: A general theory of bibliometric and other cumulative advantage processes. J Am Soc Inf Sci 1976, 27(5):292–306. 10.1002/asi.4630270505
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.