In most sports, a team wins by scoring more points than its opponent during a game. However, the methods by which sports teams score are as varied as the sports themselves. Some sports rank competitors by time (for example, road running and cross-country skiing), some by distance or height (such as the long jump and high jump), and some by judges' marks (figure skating), while team sports such as baseball, football, field hockey, and soccer count discrete scoring events such as runs, goals, or points.
Despite increased interest in quantifying and modeling competition in sport, relatively little is known about what patterns or principles cut across different sports. Here, we leverage a comprehensive dataset of scoring events in nearly every professional hockey and basketball game over the last decade to investigate this question.
Our dataset is unique in its scope (covering every game over 9 to 10 seasons), breadth (including all major leagues), and depth (including timing and attribution information for each point scored). We find that within-game scoring balance and tempo, measured as the probability of a scoring event occurring per unit of game time, closely follow a Poisson process with a sport-specific rate. Similarly, within-game lead-size variation follows a Bernoulli process whose parameter effectively varies with the size of the lead.
Furthermore, our generative model accurately reproduces the observed evolution of these phenomena and predicts both future lead-size fluctuations and the score at any point in the game. Our results also suggest that professional sports exhibit little strategic entailment and are instead driven largely by short-term optimization to score as quickly as possible.
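To make the structure of such a generative model concrete, the following is a minimal sketch, not the model fitted in this work. It assumes scoring events arrive as a Poisson process with a constant rate and that each event is awarded to one team by a Bernoulli trial whose probability depends on the current lead. The function names, the linear form of `restoring_bias`, the one-point event value, and all numerical values are hypothetical and chosen only for illustration.

```python
import random


def simulate_game(rate, duration, p_score, points=1, seed=None):
    """Simulate one game; return a list of (time, lead) after each scoring event.

    rate     -- expected scoring events per unit time (illustrative value)
    duration -- game length in the same time units
    p_score  -- function mapping the current lead of team A to the
                probability that team A wins the next scoring event
    """
    rng = random.Random(seed)
    t, lead, history = 0.0, 0, []
    while True:
        # Exponential waiting times between events give Poisson arrivals.
        t += rng.expovariate(rate)
        if t >= duration:
            break
        # Bernoulli trial with a lead-dependent parameter.
        if rng.random() < p_score(lead):
            lead += points
        else:
            lead -= points
        history.append((t, lead))
    return history


def restoring_bias(lead, strength=0.01):
    """Hypothetical bias: the leading team is slightly less likely to score next."""
    p = 0.5 - strength * lead
    return min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    # Assumed numbers: a 3600-second game with one event per 600 seconds on average.
    for time, lead in simulate_game(1 / 600, 3600, restoring_bias, seed=1):
        print(f"t={time:7.1f}s  lead={lead:+d}")
```

Passing a different `p_score` function (for example, a constant 0.5 for a perfectly balanced game) changes only the Bernoulli step, which keeps the tempo and balance assumptions separate and easy to vary independently.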