One of the most popular advanced statistics of the sabermetrics era is Wins Above Replacement (WAR). WAR is a catch-all summary of a player’s contributions in all facets of the game, estimating how many wins a player contributes relative to a potential replacement. For example, a position player with a WAR of 4.3 is worth 4.3 more wins than a replacement-level player (i.e., a minor leaguer) at the same position. WAR is also cumulative, allowing for player analysis at the game, season, and career levels. According to FanGraphs, Shohei Ohtani leads MLB with a WAR of 4.5 (2.9 batting WAR, 1.6 pitching WAR), meaning that over the course of the season the Angels have won between 4 and 5 more games than they would be expected to win if his innings and at-bats had gone to a replacement-level player.
It is important to note what defines a “replacement-level player”. Rather than comparing a player’s value to league average, the WAR calculation uses borderline Major Leaguers (typically players who bounce between AAA and MLB) and readily available fill-in free agents as a baseline. For the purposes of this guide, we will be referencing FanGraphs’ calculation of WAR, known as fWAR.
Position Players: WAR accounts for production on offense, defense, and on the basepaths. Statistics such as weighted on-base average (wOBA), ultimate zone rating (UZR), and ultimate base running (UBR) measure a player’s contribution in terms of runs created and runs saved. WAR also factors in positional, park, and league adjustments, allowing for comparisons between players at different positions without having to further scale the metric.
Pitchers: The calculation of a pitcher’s WAR uses either RA9, runs allowed per nine innings (think of ERA without the “E”), or FIP, fielding independent pitching. The measure used is dependent upon who is computing WAR, a potential flaw which we will explore later.
These standalone statistics are blended to produce a single-value measure of a player’s contribution.
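As a rough illustration of that blending for a position player, the sketch below sums run values and converts them to wins. The component names, the example numbers, and the figure of roughly 10 runs per win are assumptions made for illustration; the actual conversion changes from season to season, and this is not a reproduction of FanGraphs’ code.

```python
from dataclasses import dataclass

# Assumption: about 10 runs per win, a common rule of thumb.
# The real runs-per-win value is recalculated each season.
RUNS_PER_WIN = 10.0


@dataclass
class PositionPlayerRuns:
    """Run values for one player-season (hypothetical inputs)."""
    batting: float       # runs above average at the plate
    baserunning: float   # runs above average on the basepaths
    fielding: float      # runs above average in the field
    positional: float    # adjustment for the difficulty of the position
    league: float        # adjustment between leagues
    replacement: float   # gap between an average and a replacement-level player


def position_player_war(runs: PositionPlayerRuns,
                        runs_per_win: float = RUNS_PER_WIN) -> float:
    """Sum every run component, then convert runs to wins."""
    total_runs = (runs.batting + runs.baserunning + runs.fielding
                  + runs.positional + runs.league + runs.replacement)
    return total_runs / runs_per_win


# Made-up season: a strong bat, slightly below-average glove.
example = PositionPlayerRuns(batting=25.0, baserunning=2.0, fielding=-3.0,
                             positional=-5.0, league=1.0, replacement=20.0)
print(round(position_player_war(example), 1))
```

Because every component is expressed in runs, a weak area (here, fielding) simply subtracts from the total rather than needing its own scale.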
While batting average and on-base percentage both convey whether a batter reaches base safely, neither provides sufficient context as to how the batter got on. Slugging percentage offers a partial solution by weighting hits differently. However, its weights are skewed and, like batting average, it fails to account for other ways to reach base, such as walks. On-base plus slugging (OPS) also falls short, since it assumes that one percentage point of on-base percentage is worth the same as one point of slugging percentage. Weighted on-base average, or wOBA, provides the contextual information these statistics lack.
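To make those shortcomings concrete, here is a small sketch that computes the traditional rate statistics from counting stats using their standard definitions. The two hitters are invented for illustration: they share a batting average but have very different profiles.

```python
def traditional_rates(ab, singles, doubles, triples, hr, bb, hbp, sf):
    """Return AVG, OBP, SLG, and OPS from basic counting stats."""
    hits = singles + doubles + triples + hr
    # Slugging counts bases: a home run is worth exactly four singles.
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = hits / ab
    obp = (hits + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab
    # OPS adds the two rates as if a point of each were worth the same.
    return {"AVG": avg, "OBP": obp, "SLG": slg, "OPS": obp + slg}


# Hypothetical hitters with the same number of hits and at-bats.
singles_hitter = traditional_rates(ab=500, singles=150, doubles=0, triples=0,
                                   hr=0, bb=20, hbp=0, sf=0)
power_hitter = traditional_rates(ab=500, singles=100, doubles=30, triples=0,
                                 hr=20, bb=70, hbp=0, sf=0)

for name, line in [("singles", singles_hitter), ("power", power_hitter)]:
    print(name, {stat: round(value, 3) for stat, value in line.items()})
```

Batting average cannot tell these two hitters apart, while slugging percentage separates them but ignores the walks entirely.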
Rather than just telling you a batter reached safely, wOBA assigns a value to how they reached base in relation to projected runs scored. Each plate appearance outcome (excluding intentional walks) is given a weight that varies from year to year, adjusting for the run expectancy of each outcome in the context of that season. In terms of interpretation, wOBA is set to the same scale as on-base percentage. Thus, the league-average wOBA is very close to the league-average on-base percentage. Unlike other calculations, wOBA is NOT adjusted for park effects, meaning hitter-friendly parks will inflate wOBA.
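The calculation can be sketched as a weighted sum of outcomes divided by the relevant plate appearances, following the general form FanGraphs publishes. The weights below are illustrative placeholders, not the values for any particular season, and the stat line is made up.

```python
# Illustrative weights only. The real weights are recomputed every season
# from that season's run environment.
ILLUSTRATIVE_WEIGHTS = {
    "uBB": 0.69,  # unintentional walk
    "HBP": 0.72,  # hit by pitch
    "1B": 0.89,
    "2B": 1.27,
    "3B": 1.62,
    "HR": 2.10,
}


def woba(ab, singles, doubles, triples, hr, bb, ibb, hbp, sf,
         weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted on-base average for one stat line."""
    ubb = bb - ibb  # intentional walks are excluded
    numerator = (weights["uBB"] * ubb
                 + weights["HBP"] * hbp
                 + weights["1B"] * singles
                 + weights["2B"] * doubles
                 + weights["3B"] * triples
                 + weights["HR"] * hr)
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator


# Hypothetical season line.
value = woba(ab=500, singles=100, doubles=30, triples=0, hr=20,
             bb=70, ibb=5, hbp=5, sf=5)
print(round(value, 3))
```

Note how a home run is worth a little more than twice a single here, rather than the four-to-one ratio slugging percentage implies.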
Ultimate Zone Rating, or UZR, quantifies a player’s defensive contribution by measuring how many runs a defender saves. UZR accounts for errors, range, outfield arm, and double-play ability. As with other measures, each component of UZR is calculated relative to an average defender. For example, Outfield Arm Runs (ARM) is the part of UZR that quantifies the number of runs above average an outfielder saves by preventing runners from advancing to the next base. There are two important considerations to make when referencing UZR:
Ultimate Base Running, or UBR, accounts for a player’s value on the basepaths on plays other than stolen-base attempts. In its simplest form, UBR assigns value (again, in terms of runs added above average) to advancing an additional base. For example, tagging up on a fly ball out is counted in a player’s UBR value.
Much like wOBA does for getting on base, Fielding Independent Pitching (FIP) aims to provide further context behind a pitcher’s performance. FIP is similar to ERA, except it attempts to isolate the events a pitcher controls and exclude the ones he does not. In other words, FIP focuses on strikeouts, (unintentional) walks, hit-by-pitches, and home runs. It excludes balls hit into the field of play, removing the influence of both good and bad defense.
As with wOBA, a pitcher’s FIP is interpreted relative to a better-known measure. FIP is scaled so that the league-average FIP matches the league-average ERA. Thus, FIP can be compared directly to ERA. For example, if a pitcher surrenders a high batting average on balls in play (BABIP), his FIP will likely be lower than his ERA, as most batted balls are excluded from the FIP equation.
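A minimal sketch of that scaling, using the commonly published form of the FIP equation: a season constant is chosen so that league-wide FIP equals league-wide ERA, and then added to each pitcher’s raw value. All league totals and the pitcher’s line below are made up for illustration.

```python
def fip_core(hr, walks, hbp, k, ip):
    """Unscaled FIP: only home runs, walks, hit-by-pitches, and strikeouts.

    `walks` follows the text above and counts unintentional walks.
    `ip` is innings pitched as a true decimal (e.g. 180 1/3 innings = 180.333).
    """
    return (13 * hr + 3 * (walks + hbp) - 2 * k) / ip


def fip_constant(lg_era, lg_hr, lg_walks, lg_hbp, lg_k, lg_ip):
    """Constant that puts league FIP on the same scale as league ERA."""
    return lg_era - fip_core(lg_hr, lg_walks, lg_hbp, lg_k, lg_ip)


def fip(hr, walks, hbp, k, ip, constant):
    return fip_core(hr, walks, hbp, k, ip) + constant


# Hypothetical league totals (not real data).
league = dict(lg_era=4.00, lg_hr=5000, lg_walks=15000,
              lg_hbp=2000, lg_k=40000, lg_ip=43000)
constant = fip_constant(**league)

# Hypothetical pitcher: many strikeouts, few walks.
pitcher_fip = fip(hr=20, walks=50, hbp=5, k=200, ip=190.0, constant=constant)
print(round(constant, 2), round(pitcher_fip, 2))
```

Nothing in the pitcher’s line depends on what happened to balls in play, which is why a high BABIP raises ERA but leaves FIP untouched.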
WAR is far from a perfect measure. Reducing player analysis to a single statistic is a practice prone to error. No single statistic provides sufficient context to truly judge a player’s value. Individual baseball plays are far too complex for one number to tell the whole story.
However, one of the biggest issues with WAR is not how much or how little it tells us about a player. The calculation of WAR is not standardized, so the value you get depends on which version of WAR you are referencing. Baseball Reference (bWAR) and FanGraphs (fWAR) differ in their measure of fielding runs: bWAR uses Defensive Runs Saved (DRS), while fWAR uses UZR. Although these statistics aim to accomplish the same goal of quantifying defensive ability, the slight differences can result in drastically different WAR values for some position players. The two sites diverge for pitchers as well: bWAR starts from runs allowed (RA9), while fWAR starts from FIP. The resulting values cannot be used interchangeably. Thus, when assessing a player’s WAR, we must be careful to consider both the context of the measure and its method of calculation.
Finally, a growing question involves Shohei Ohtani, the current leader in WAR for the 2023 season. How does WAR account for two-way players such as Ohtani? As a follow-up to this article, I will be diving into how Ohtani’s WAR is calculated and whether the measure gives an accurate assessment of the phenom’s value.