An Introduction to Sabermetrics
by Jim Albert
Sabermetrics is the mathematical and statistical analysis of baseball records. To understand the field of sabermetrics, one should first be familiar with the game of baseball. This sport is one of the most popular games in the United States; it is often called the national pastime. Baseball began in the eastern United States in the mid-1800’s. Professional baseball started near the end of the 19th century; the National League was founded in 1876 and the American League in 1900. Currently in the United States, there are 28 professional teams in the American and National Leagues and millions of people watch games in ballparks and on television.
The game of baseball
The game of baseball is played between two teams, each consisting of nine players. The nine players are a pitcher, a catcher, first baseman, second baseman, shortstop, third baseman, left fielder, center fielder and right fielder. A game of baseball consists of nine innings. One inning is divided into two halves; in the top half of the inning, one team plays in the field and the second team comes to bat, and in the bottom half, the teams reverse roles. The team that is batting during a particular half-inning is trying to score runs. The team with the higher number of runs at the end of the nine innings is the winner of the game.
During an inning, a player on the team in the field, called a pitcher, throws a baseball toward a player of the team at-bat, called the batter. The batter will try to hit the ball using a wooden stick (called a bat) in a location out of the reach of the players in the field. By hitting the ball, the batter has the opportunity to run around four bases that lie in the field. If a player advances around all of the bases, he has scored a run. If a batter hits a ball that can be caught, or that can be thrown to first base before he runs to that base, then he is said to be out, and cannot score a run.
A batter is also out if he fails to hit the baseball three times or if three good pitches (called strikes) have been thrown.
The objective for the batting team during an inning is to score as many runs as possible before obtaining three outs.
The basic batting statistics
One notable aspect of the game of baseball is the wealth of numerical information that is recorded about the game.
The effectiveness of batters and pitchers is typically assessed by particular numerical measures. The usual measure of hitting effectiveness for a player is the batting average, which is computed by dividing the number of hits by the number of at-bats. This statistic gives the proportion of opportunities (at-bats) in which the batter succeeds (gets a hit). The batter with the highest batting average during a baseball season is called the best hitter that year. Batters are also evaluated on their ability to reach one, two, three, or four bases on a single hit; these hits are called respectively singles, doubles, triples, and home runs. The slugging average is computed by dividing the total number of bases (in short, total bases) by the number of at-bats. Since it weights hits by the number of bases reached, this measure reflects the ability of a batter to hit the ball for distance. The most valued hit in baseball is the home run, where a player advances four bases on one hit. The number of home runs is recorded for all players and the batter with the largest number of home runs at the end of the season is given special recognition.
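The two averages described above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring routine; the function names and the season line are invented for the example, and the slugging denominator is taken to be at-bats.

```python
def batting_average(hits, at_bats):
    """Proportion of at-bats that end in a hit."""
    return hits / at_bats


def total_bases(singles, doubles, triples, home_runs):
    """Weight each hit by the number of bases reached."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def slugging_average(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at-bats."""
    return total_bases(singles, doubles, triples, home_runs) / at_bats


# Hypothetical season line, invented purely for illustration.
singles, doubles, triples, home_runs, at_bats = 100, 30, 5, 15, 500
hits = singles + doubles + triples + home_runs

avg = batting_average(hits, at_bats)
slg = slugging_average(singles, doubles, triples, home_runs, at_bats)
print(f"batting average {avg:.3f}, slugging average {slg:.3f}")
```

Two hitters with the same batting average can have very different slugging averages, because only the second measure distinguishes a single from a home run.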
The basic pitching statistics
A number of statistics are also used in the evaluation of pitchers. For a particular pitcher, one counts the number of games in which he was declared the winner or loser and the number of runs allowed. Pitchers are usually rated in terms of the average number of “earned” runs allowed for a nine inning game. Other statistics are useful in understanding pitching ability. A pitcher records a strikeout when the batter fails to hit the ball in the field and records a walk when he throws four inaccurate pitches (balls) to the batter. A pitcher who can throw the ball very fast can record a high number of strikeouts. A pitcher who is “wild” or relatively inaccurate will record a large number of walks.
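The rating of a pitcher by earned runs per nine-inning game can be written as a one-line calculation. The sketch below assumes the usual definition of the earned run average, nine times earned runs divided by innings pitched; the pitcher's season totals are hypothetical.

```python
def earned_run_average(earned_runs, innings_pitched):
    """Average number of earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_pitched


# Hypothetical season totals, invented for illustration.
era = earned_run_average(earned_runs=80, innings_pitched=240)
print(f"ERA: {era:.2f}")
```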
Better measure of hitting ability — runs created
One goal of sabermetrics is to find good measures of hitting and pitching performance. Bill James (1982) compares the batting records of two players: Johnny Pesky, who played in the 1940’s and 1950’s, and Dick Stuart, who played mainly in the 1960’s. Pesky was a batter who hit for a high batting average but hit few home runs. Stuart, in contrast, had a modest batting average, but hit a high number of home runs. Who was the more valuable hitter? James argues that a hitter should be evaluated by his ability to create runs for his team. From an empirical study of a large collection of team hitting data, he established the following formula for predicting the number of runs scored in a season based on the number of hits, walks, at-bats, and total bases recorded in a season.
$$\text{RUNS} = \frac{(\text{HITS} + \text{WALKS}) \times (\text{TOTAL BASES})}{\text{AT-BATS} + \text{WALKS}}$$
This formula reflects two important aspects of scoring runs in baseball. The number of hits and walks of a team reflects the team’s ability to get runners on base. The number of total bases of a team reflects the team’s ability to move runners that are already on base. This runs created formula can be used at an individual level to compute the number of runs that a player creates for his team. In 1942, Johnny Pesky had 620 at-bats, 205 hits, 42 walks, and 258 total bases; using the formula, he created 96 runs for his team. Dick Stuart in 1960 had 532 at-bats with 160 hits, 34 walks, and 309 total bases for 106 runs created. The conclusion is that Stuart in 1960 was a slightly better hitter than Pesky in 1942 since he created a few more runs for his team.
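The runs created arithmetic for the two players can be checked directly. The sketch below applies the formula, hits plus walks, times total bases, divided by at-bats plus walks, to the season lines quoted above; it reads the 160 in Stuart's line as his hit total, and the function name is chosen here for illustration.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """James' basic runs created formula, as stated in the text."""
    return (hits + walks) * total_bases / (at_bats + walks)


# Season lines quoted in the text.
pesky_1942 = runs_created(hits=205, walks=42, total_bases=258, at_bats=620)
stuart_1960 = runs_created(hits=160, walks=34, total_bases=309, at_bats=532)

print(f"Pesky:  {pesky_1942:.1f} runs created")
print(f"Stuart: {stuart_1960:.1f} runs created")
```

Rounded to whole runs, these agree with the figures of 96 and 106 given in the text.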
An alternative approach to evaluating batting performance is based on a linear weights formula. George Lindsey (1963) was the first person to assign run values to each event that could occur while a team was batting. By the use of recorded data from baseball games and probability theory, he developed the formula
RUNS = (.41) 1B + (.82) 2B + (1.06) 3B + (1.42) HR
where 1B, 2B, 3B, and HR are respectively the number of singles, doubles, triples, and home runs hit in a game. One notable aspect of this formula is that it recognizes that a batter can create runs in three ways. There is a direct run potential when a batter gets a hit and gets on base. In addition, the batter can advance runners that are already on base. Also, by not making an out, the hitter allows a new batter a chance of getting a hit, and this produces an indirect run potential. Thorn and Palmer (1993) present a more sophisticated version of the linear weights formula which predicts the number of runs produced by an average baseball team based on all of the offensive events recorded during the game. Like James’ runs created formula, the linear weights rule can be used to evaluate a player’s batting performance.
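Lindsey's linear weights formula is a weighted sum of hit counts, which makes it easy to sketch. The weights below are the ones quoted in the text; the dictionary layout and the game totals are assumptions made for this example.

```python
# Run values per hit type, as quoted in the text from Lindsey (1963).
LINDSEY_WEIGHTS = {"1B": 0.41, "2B": 0.82, "3B": 1.06, "HR": 1.42}


def linear_weights_runs(hit_counts):
    """Weighted sum of singles, doubles, triples, and home runs."""
    return sum(LINDSEY_WEIGHTS[kind] * count for kind, count in hit_counts.items())


# Hypothetical team totals for one game, invented for illustration.
game = {"1B": 6, "2B": 2, "3B": 0, "HR": 1}
predicted_runs = linear_weights_runs(game)
print(f"predicted runs: {predicted_runs:.2f}")
```

Keeping the weights in a single table means a richer version of the rule, with more offensive events, would only need more entries and not a new function.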
Runs to wins
Although scoring runs is important in baseball, the basic objective is for a team to score more runs than its opponent. To learn about the relationship between runs scored and the number of wins, James (1982) looked at the number of runs produced, the number of runs allowed, the number of wins and the number of losses during a season for a large number of recent major league teams. James noted that the ratio of a team’s wins to losses was approximately equal to the square of the ratio of runs scored to the runs allowed. Equivalently,
$$\frac{\text{WINS}}{\text{WINS} + \text{LOSSES}} = \frac{\text{RUNS}^2}{\text{RUNS}^2 + \text{OPPOSITION RUNS}^2}.$$
This relationship can be used to measure a batter’s performance in terms of the number of wins that he creates for his team.
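A short sketch shows how the relationship between runs and wins converts extra runs into extra wins. It uses the form implied by James' observation, a winning fraction equal to runs squared over the sum of runs squared and opposition runs squared. The team totals, the 162-game season length, and the 10 additional runs credited to a batter are all hypothetical values chosen for this example.

```python
def winning_fraction(runs, opposition_runs):
    """Expected fraction of games won from runs scored and allowed."""
    return runs ** 2 / (runs ** 2 + opposition_runs ** 2)


GAMES = 162  # assumed season length

# Hypothetical team: 800 runs scored, 700 allowed.
base_wins = GAMES * winning_fraction(800, 700)

# Same team if one batter had created 10 more runs.
improved_wins = GAMES * winning_fraction(810, 700)

extra_wins = improved_wins - base_wins
print(f"expected wins: {base_wins:.1f}, with 10 more runs: {improved_wins:.1f}")
print(f"extra wins from 10 runs: {extra_wins:.2f}")
```

For a team of this kind, about ten additional runs translate into roughly one additional win.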
Better measure of pitching ability
Sabermetrics has also developed better ways of evaluating pitching ability. The standard pitching statistics, the number of wins and the earned runs per game (ERA), are flawed. The number of wins of a pitcher can just reflect the fact that he pitches for a good offensive (run scoring) team. The ERA does measure the rate of a pitcher’s efficiency, but it does not tell you about the actual benefit of this pitcher over an entire season. Thorn and Palmer (1993) developed the pitching runs formula
$$\text{PITCHING RUNS} = \text{Innings Pitched} \times \frac{\text{League ERA}}{9} - \text{ER}.$$
The factor (League ERA/9) measures the average runs allowed per inning for all teams in the league. This value is multiplied by the number of innings pitched by that pitcher; the product represents the number of runs that pitcher would allow over the season if he were average. Last, one subtracts the actual earned runs (ER) the pitcher allowed for that season. If the pitching runs value is greater than 0, then this pitcher is better than average. This new measure appears to be useful in measuring the efficiency and durability of a pitcher.
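The pitching runs calculation follows the description above step by step: innings pitched times the league's runs per inning (League ERA/9), minus the earned runs actually allowed. The sketch below is a direct transcription; the league ERA and the pitcher's totals are hypothetical.

```python
def pitching_runs(innings_pitched, league_era, earned_runs):
    """Runs saved relative to an average pitcher over the same innings."""
    league_runs_per_inning = league_era / 9
    average_pitcher_runs = innings_pitched * league_runs_per_inning
    return average_pitcher_runs - earned_runs


# Hypothetical season: 200 innings, 80 earned runs, league ERA of 4.50.
saved = pitching_runs(innings_pitched=200, league_era=4.50, earned_runs=80)
print(f"pitching runs: {saved:+.1f}")
```

Because the result scales with innings pitched, a pitcher with a good ERA over few innings earns fewer pitching runs than one who sustains the same ERA over a full season.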
Player game percentage
Good measures of hitting, pitching, and fielding performance of baseball players have been developed. However, these statistics do not directly measure a player’s contribution to a win for his team. Bennett and Flueck (1984) used data from two baseball seasons to estimate the probability the home team wins a game given the run differential (the home team runs minus visiting team runs), the half inning (top or bottom of the inning), the number of outs, and the on-base situation. Using these estimated probabilities, one can see how the probability of winning changes for each game event. One can measure a player’s contribution to winning a game by summing the changes in win probabilities for each play in which the player has participated. This statistic, called the Player Game Percentage, was used by Bennett (1993) to evaluate the batting performance of Joe Jackson. This player was banished from baseball for allegedly throwing the 1919 World Series. A statistical analysis using the Player Game Percentage showed that Jackson played to his full potential during this series.
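The idea of summing changes in win probability can be sketched as follows. This is only an outline of the bookkeeping: the play log and its win probabilities are invented, all players are assumed to be on the same team, and any scaling that Bennett and Flueck apply to the final statistic is not reproduced here.

```python
from collections import defaultdict


def player_game_percentage(plays):
    """Sum the change in the team's win probability over each player's plays."""
    totals = defaultdict(float)
    for player, prob_before, prob_after in plays:
        totals[player] += prob_after - prob_before
    return dict(totals)


# Hypothetical play log: (player, win probability before, win probability after).
plays = [
    ("Batter A", 0.50, 0.56),  # single with nobody out
    ("Batter B", 0.56, 0.52),  # strikeout
    ("Batter A", 0.52, 0.61),  # run-scoring double later in the game
]

pgp = player_game_percentage(plays)
for player, value in pgp.items():
    print(f"{player}: {value:+.2f}")
```

In a full analysis the before and after probabilities would be looked up from a table indexed by run differential, half inning, outs, and on-base situation, as described above.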
People are often interested in comparing batters or pitchers from different eras. In making these comparisons, it is important to view batting or pitching statistics in the context in which they were achieved. For example, Bill Terry led the National League in 1930 with a batting average of .401, a mark that has been surpassed since by only one hitter. In 1968 Carl Yastrzemski led the American League in hitting with an average of .301. It appears on the surface that Terry was clearly the superior hitter. However, when viewed relative to the hitters who played during the same time, both hitters were approximately 27 percent better than the average hitter (Thorn and Palmer, 1993). The hitting accomplishments of Terry in 1930 and Yastrzemski in 1968 were actually very similar. Likewise, there are significant differences in hitting among ballparks, and hitting statistics need to be adjusted for the ballpark in which they were compiled to make accurate comparisons between players.
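The comparison across eras can be illustrated with a simple ratio of a player's average to the league average. The text does not give the league averages, so the sketch backs them out from the 27 percent figure. The resulting values are implied by that simple ratio only; they are not recorded league averages, and the exact normalization used by Thorn and Palmer may differ.

```python
def relative_average(player_average, league_average):
    """Player's batting average as a multiple of the league average."""
    return player_average / league_average


def implied_league_average(player_average, relative):
    """League average implied by a player's average and his relative standing."""
    return player_average / relative


RELATIVE = 1.27  # "approximately 27 percent better", from the text

terry_1930 = 0.401
yastrzemski_1968 = 0.301

# Implied values under the simple-ratio assumption, not recorded league data.
league_1930 = implied_league_average(terry_1930, RELATIVE)
league_1968 = implied_league_average(yastrzemski_1968, RELATIVE)

print(f"implied 1930 league average: {league_1930:.3f}")
print(f"implied 1968 league average: {league_1968:.3f}")
```

The gap between the two implied league averages is what makes a .301 average in 1968 comparable to a .401 average in 1930.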
Learning from selected data
Watching a baseball game raises questions that motivate interesting statistical analyses. During the broadcast of a game, a baseball announcer will typically report selected hitting data for a player. For example, it may be reported that Barry Bonds has 10 hits in his most recent 20 at-bats. What have you learned about Bonds’ batting average on the basis of this information? Clearly, Bonds’ batting average can’t be as large as 10/20 = .500 since this data was chosen to maximize the reported percentage. Casella and Berger (1994) construct the likelihood function for a player’s true batting average on the basis of this selected information and find the maximum likelihood estimate. They conclude that this selected data only provides a little insight into the “complete data” batting average that is obtained from batting records over the entire season.
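A small simulation shows why a selectively reported figure overstates a hitter's average. This is not the likelihood analysis of Casella and Berger; it is only a sketch in which a hypothetical .300 hitter's season is simulated and the best stretch of 20 consecutive at-bats is picked out, as an announcer might do. The true average, the season length, and the random seed are all assumptions.

```python
import random


def best_window_average(outcomes, window):
    """Highest hit proportion over any `window` consecutive at-bats."""
    current = sum(outcomes[:window])
    best = current
    for i in range(window, len(outcomes)):
        # Slide the window one at-bat forward.
        current += outcomes[i] - outcomes[i - window]
        best = max(best, current)
    return best / window


rng = random.Random(0)  # fixed seed so the run is repeatable
true_average = 0.300    # hypothetical hitter
season = [1 if rng.random() < true_average else 0 for _ in range(500)]

season_average = sum(season) / len(season)
reported = best_window_average(season, 20)
print(f"season average: {season_average:.3f}")
print(f"best 20 at-bat stretch: {reported:.3f}")
```

The best stretch is never below the season average, which is why the reported figure says little about the full-season average on its own.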
Another interesting question concerns the existence of streakiness in hitting data. During a season it is observed that some ballplayers will experience periods of “hot” hitting where they will get a high proportion of hits. Other hitters will go through slumps or periods of hitting with very few hits. But these periods of hot and cold hitting may be just a reflection of the natural variability observed in coin tossing. Is there statistical evidence for a “hot hand” among baseball hitters where the probability of obtaining a hit is dependent on recent at-bats? Albright (1993) looked at a large collection of baseball hitting data and used a number of statistics, such as the number of runs (unbroken sequences of hits or of outs), to detect streakiness in hitting data. His main conclusion was that there is generally little statistical evidence for a hot hand in baseball hitting.
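One standard way to use the number of runs as a streakiness statistic is the Wald-Wolfowitz runs test, sketched below. Here a run means an unbroken sequence of hits or of outs, not a run scored. The test is named as an illustration and is not claimed to be Albright's exact procedure; the two sequences are artificial.

```python
import math


def count_runs(outcomes):
    """Number of unbroken stretches of identical outcomes."""
    if not outcomes:
        return 0
    return 1 + sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)


def runs_z_score(outcomes):
    """Standardized number of runs under independent at-bats.

    Assumes the sequence contains at least one hit (1) and one out (0).
    """
    n1 = sum(outcomes)        # hits
    n0 = len(outcomes) - n1   # outs
    n = n0 + n1
    expected = 1 + 2 * n0 * n1 / n
    variance = 2 * n0 * n1 * (2 * n0 * n1 - n) / (n ** 2 * (n - 1))
    return (count_runs(outcomes) - expected) / math.sqrt(variance)


# Artificial sequences: 1 = hit, 0 = out.
streaky = [1] * 10 + [0] * 10   # one hot stretch, one cold stretch
alternating = [1, 0] * 10       # no streaks at all

print(count_runs(streaky), round(runs_z_score(streaky), 2))
print(count_runs(alternating), round(runs_z_score(alternating), 2))
```

A streaky hitter produces fewer runs than independence predicts, so a strongly negative score points toward a hot hand, while a score near zero is consistent with coin tossing.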
Currently there is great interest among fans and the media in situational baseball data. The hitting performance of batters is recorded for a number of different situations, such as day versus night games, on grass fields and artificial turf fields, against pitchers who throw right-handed and left-handed, and during home and away games. There are two basic questions in the statistical analysis of this type of data. First, are there particular situations that can explain a significant amount of variation in the hitting data? Second, are there ballplayers that perform particularly well or poorly in a given situation? Albert (1994) analyzed a large body of published situational data and used Bayesian hierarchical models to combine data from a large group of players. His basic conclusion is that there do exist some important situations. For example, batters hit on average 20 points higher when facing a pitcher of the opposite arm, and hit 8 points higher when they are playing in their home ballpark. However, there is generally little statistical evidence for individual differences in these situational effects.
Major league baseball is currently divided into six divisions and one goal of any team is to finish first in its division. Suppose that part of the season has been completed. Using the teams’ records from this partial season, is it possible to predict accurately the winners of the divisions? Barry and Hartigan (1993) use a choice model for the probability that a team wins an individual game. This model allows for different strengths between the teams, different home advantages, and team strengths that can change randomly with time. The authors use this model to simulate the results of future baseball games and estimate the probability that each team will win its division.
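The simulation approach can be sketched with a much simpler model than the one Barry and Hartigan fit. The sketch below uses a Bradley-Terry style choice probability with a single home advantage factor and team strengths that stay fixed over time, so it omits the randomly changing strengths of the published model. The teams, current win totals, strengths, home edge, and remaining schedule are all hypothetical.

```python
import random


def win_probability(home_strength, away_strength, home_edge):
    """Choice-model probability that the home team wins one game."""
    boosted = home_strength * home_edge
    return boosted / (boosted + away_strength)


def simulate_division(current_wins, strengths, schedule,
                      home_edge=1.1, trials=2000, seed=0):
    """Estimate each team's chance of finishing with the most wins."""
    rng = random.Random(seed)
    titles = {team: 0.0 for team in current_wins}
    for _ in range(trials):
        wins = dict(current_wins)
        for home, away in schedule:
            p = win_probability(strengths[home], strengths[away], home_edge)
            wins[home if rng.random() < p else away] += 1
        top = max(wins.values())
        leaders = [team for team, w in wins.items() if w == top]
        for team in leaders:  # split credit when teams tie for first
            titles[team] += 1 / len(leaders)
    return {team: count / trials for team, count in titles.items()}


# Hypothetical three-team division part way through a season.
teams = ["A", "B", "C"]
current_wins = {"A": 50, "B": 48, "C": 45}
strengths = {"A": 1.0, "B": 1.1, "C": 0.9}
schedule = [(h, a) for h in teams for a in teams if h != a for _ in range(10)]

title_odds = simulate_division(current_wins, strengths, schedule)
for team, odds in title_odds.items():
    print(f"team {team}: {odds:.3f}")
```

Ties for first place are split evenly among the tied teams so that the estimated probabilities sum to one.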
Currently, major league baseball games are recorded in very fine detail. Information about every single ball pitched, fielded and hit during a game is noted, creating a large database of baseball statistics. This database is used in a number of ways. Public relations departments of teams use the data to publish special statistics about their players. The statistics are used to help determine the salaries of major league ballplayers. Specifically, statistical information is used as evidence in salary arbitration, a legal proceeding which sets salaries. A number of teams have employed full-time professional statistical analysts and some managers use statistical information in deciding on strategy during a game. Bill James and other baseball statisticians have shown that it is possible to answer a variety of questions about the game of baseball by means of statistical analyses.
- Albert, J. (1994), “Exploring baseball hitting data: what about those breakdown statistics?”, Journal of the American Statistical Association, 89, 1066-1074.
- Albright, S. C. (1993), “A statistical analysis of hitting streaks in baseball,” Journal of the American Statistical Association, 88, 1175-1183.
- Barry, D., and Hartigan, J. A. (1993), “Choice Models for Predicting Divisional Winners in Major League Baseball,” Journal of the American Statistical Association, 88, 766-774.
- Bennett, J. M. (1993), “Did Shoeless Joe Jackson Throw the 1919 World Series?”, The American Statistician, 47, 241-250.
- Bennett, J. M. and Flueck, J. A. (1984), “Player Game Percentage,” in Proceedings of the Social Statistics Section, American Statistical Association, 378-380.
- Casella, G. and Berger, R. (1994), “Estimation With Selected Binomial Information or Do You Really Believe that Dave Winfield is Batting .471?”, Journal of the American Statistical Association, 89, 1080-1090.
- James, B. (1982), The Bill James Baseball Abstract, New York: Ballantine Books.
- Lindsey, G. (1963), “An Investigation of Strategies in Baseball,” Operations Research, 11, 447-501.
- Thorn, J. and Palmer, P. (1993), Total Baseball, New York: Harper Collins.