Once again, it’s on. Jon Steiner continues his weekly feature on the Cleveland Indians, mixed in with a little math. Do enjoy…
Baseball games are inherently complex things, as anyone who’s ever had to look up the infield fly rule can attest. But no matter the minutiae, the games always boil down to one thing: who won? Of course, it’s easy to figure this out. All we have to do is look at the box score, tally up the runs, and see who scored more. No matter how complex a game gets, there’s nothing more black and white than wins and losses.
But the reason more and more baseball fans care about statistics is that they might be able to tell us what to expect (or more precisely, why to expect what we expect) rather than simply what happened. After all, just because your team won today doesn’t mean they will win tomorrow.
And for this reason, predicting wins and losses can be a tricky thing. So much goes into a single baseball game: usually around 80 plate appearances, 250 pitches, countless in-game decisions regarding pitch selection, defensive alignments, pitching changes and baserunning.
What we do know is that you need to score more runs than your opponent to win games, right? I mean, I’ve never seen a team win a game when it scored fewer runs than its opponent. Or, as Rick Manning might say, “Whoever scores more runs tonight might just come out on top of this thing!”
So even though the individual games can get complex, if we can estimate how many runs a team will score and how many runs it will allow over the course of a season, we should be able to get a feel for what their record should be.
Which brings me to one of the foundational formulas of the statistical community. It’s called Pythagorean Expectation, and it’s really pretty simple:
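In its classic form, which is assumed here (Bill James’s original version squares both terms; later refinements use a slightly smaller exponent such as 1.83), the formula can be written as:

```latex
\[
\text{Expected Win\%} = \frac{RS^{2}}{RS^{2} + RA^{2}}
\]
```

Here RS is a team’s runs scored and RA is its runs allowed over the season.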
Basically, the formula attempts to estimate what a team’s record would be based solely on two inputs: runs scored and runs allowed. Let’s check its accuracy in the AL for 2009:
Sure, there are some outliers in the chart—there always will be. The Mariners’ winning percentage was 60 points better than we would have expected based on runs scored and allowed, while the Blue Jays were 40 points worse, but overall, this looks like a fairly accurate tool.
Let’s use this formula to examine some hypothetical situations; I promise this will be worth it. Start by assuming that the 2010 Indians score exactly 800 runs and allow exactly 800 (optimistic, I know!). What would we expect their record to be? Well, when we plug 800 into both the RS and RA inputs, we end up with a .500 winning percentage: an 81-win team. Makes sense. Now what if they score only 790 runs but still allow 800? When we crank the numbers, we come up with a .494 expected winning percentage. Multiply that by 162 games, and we get 80 wins and 82 losses. A change of one win. Now let’s make the pitching staff worse by 10 runs, getting into the neighborhood of reality for the Indians. We’ll plug in 790 for RS again, but 810 for RA. The new expected winning percentage? .488. The W-L? 79-83. Another change of one win. In fact, it looks like every 10 runs we score is worth one win, while every 10 runs we allow costs us one win.
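If you’d like to check that arithmetic yourself, here is a minimal sketch in Python. It assumes the classic exponent of 2 and a 162-game season; the function names are my own and purely illustrative.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    The exponent of 2 is an assumption (the classic form of the formula).
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_record(runs_scored, runs_allowed, games=162):
    """Expected (wins, losses), rounded to the nearest whole win."""
    wins = round(pythagorean_win_pct(runs_scored, runs_allowed) * games)
    return wins, games - wins


if __name__ == "__main__":
    # The three hypothetical Indians seasons from the text.
    for rs, ra in [(800, 800), (790, 800), (790, 810)]:
        pct = pythagorean_win_pct(rs, ra)
        wins, losses = expected_record(rs, ra)
        print(f"RS={rs} RA={ra}: {pct:.3f} expected win pct, {wins}-{losses}")
```

Run as a script, this reproduces the .500, .494, and .488 figures and the 81-81, 80-82, and 79-83 records above.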
This is why, when a player adds 10 runs to his team’s performance (or prevents 10 runs via pitching or defense), we generally say that he’s “added” a win over the course of a season. It’s a simple rule with some complicated consequences.
Believe it or not, this fact—that 10 runs equal a win—tells us just about everything we need to know about a player’s value. After all, if we can evaluate all the things a player does offensively, defensively, and pitchingwise (a new word? Get that Merriam guy on the phone!), we can estimate how many runs and wins he’s worth. Furthermore, if we know his salary, we can figure out how much he’s making per win!
But here’s the trick: what should we compare a player’s performance against? I mean, it’s great to know that a player added 5 wins to his team, but what would a different player have added? After all, it’s not like your option is to play Grady Sizemore or not have a CF at all. No, there has to be some baseline against which we can compare a player, and that’s what we need to identify.
Well, it would be nice to compare a player to the average player at his position—a perfectly reasonable thing to do. After all, we know the average hitter’s wOBA, the average pitcher’s FIP, and the average fielder’s UZR (see these posts if you need reminders). But here’s the problem: we don’t know the average player’s salary, since it can vary so much. And since we eventually want to determine how much money a player makes per win added, the average player just won’t work. After all, some league average players make the league minimum; some (*cough, Hafner, cough*) make much more than the league minimum.
So rather than using the average player as our benchmark, we’re going to use what’s called a replacement player. Here’s how that works. A replacement player is basically a player who makes the league minimum ($400,000) and performs at the level of your “AAAA” player. If you need an example, think of Trevor Crowe or Josh Barfield. According to Dave Cameron, replacement players are “usually available every winter as minor league free agents, via the Rule 5 draft, or as cheap trade acquisitions where a team can acquire one of these players without giving up any real talent in return.” Typically, a fourth outfielder (not a platoon player) is a good example of a replacement player.
Now if you were to construct a roster completely out of replacement level players, your team would have won about 43 games in the AL last season*—a terrible team to be sure, but remember that every player on your team would be far below average. So any wins an actual team generates over that 43 would be called Wins Above Replacement (WAR). We’ve got what we want!
*Interestingly, by my calculations, a replacement level team in the NL would have won nearly 50 games. Odd but true. The AL was that much better than the NL.
If you’re interested in calculating a player’s WAR yourself, I’ll direct you to my favorite primer from FanGraphs (scroll to the bottom), but there are many great ones out there on the interwebs. For our purposes, I think it should suffice to say the following:
- A position player’s WAR is based on the number of runs, compared to a replacement level player, that he produces hitting (measured by a version of wOBA), the number of runs he saves or costs defensively (measured by UZR), and the number of runs he creates or costs from his baserunning (not just steals). That number is then adjusted for the defensive position he plays, since league average offense at SS is obviously more valuable than league average offense in RF.
- A pitcher’s WAR is based on his FIP—scaled to total runs allowed rather than earned runs allowed. Basically, it’s the number of runs the pitcher was responsible for allowing, removing his defense from the picture (remember, that’s accounted for in the defense calculation for position players).
- WAR is just the number of runs above a replacement player that a given player produced divided by 10 (since 10 runs equals one win).
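The three points above can be sketched in a few lines of Python. This is a deliberate simplification, not how FanGraphs actually computes WAR: the function name, the four-way breakdown, and the sample numbers are all hypothetical, and every input is assumed to be already expressed as runs relative to a replacement player.

```python
RUNS_PER_WIN = 10.0  # the 10-runs-equal-one-win rule of thumb


def position_player_war(batting_runs, fielding_runs, baserunning_runs,
                        positional_adjustment):
    """Rough WAR for a position player.

    Inputs are runs relative to a replacement-level player (assumed);
    positional_adjustment is positive for premium positions like SS
    and negative for easier ones like RF.
    """
    total_runs = (batting_runs + fielding_runs
                  + baserunning_runs + positional_adjustment)
    return total_runs / RUNS_PER_WIN


if __name__ == "__main__":
    # Made-up player: 35 batting runs, 5 fielding runs, 2 baserunning
    # runs, and a -2 run positional adjustment.
    print(position_player_war(35.0, 5.0, 2.0, -2.0))
```

For that made-up player, the 40 total runs above replacement work out to 4 WAR.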
Things are more complicated than that for sure, but those three points cover most of the important bases. Let’s check whether the 2009 WAR leaderboard matches what we already believe about player value:
I’d say that list looks pretty much like what we’d expect if we asked people to name the 10 best performances last year. Zobrist might surprise you, until you realize he hit 30 HRs, batted close to .300, and played some great defense at multiple positions for the Rays.
So what do these numbers mean? Let’s look at Albert Pujols as an example. Last year, Pujols helped the Cardinals win eight to nine more games than they would have won with a replacement level first baseman playing for them (Andy Marte?). This includes his above average defense, his amazing offense, and his strong baserunning. I can’t think of a stat that tells me more about a player’s contribution to his team than WAR.
Shin-Soo Choo provided us with five wins more than a replacement level RF would have. That’s quite valuable, when you consider that the best player in baseball last year (Greinke) provided only 9.4 wins above replacement. But beyond Choo, the news wasn’t so great. Sure, Cabrera had a breakout year, but that was offset by Sizemore dropping from 6.4 WAR in 2008 to only 2.1 in 2009 due to injury and ineffectiveness. At Hafner’s peak in 2006, he was worth 6.2 WAR, unheard of for someone with no defensive value. Last year, it was a measly 1.3. Peralta peaked in 2008 with a 3.9 WAR; now he’s basically good for one win a year over your everyday minor leaguer. Oh yeah, and two of our top five performers were traded for players who probably won’t be in the opening day lineup.
So what is WAR good for? Absolutely nothing? Not quite. Theoretically, if we add up all the wins over replacement contributed by an entire roster, we should be able to find out how many wins a team should have had over the course of a season. Furthermore, if we have a good feel for the WARs of our current roster, we can predict how many wins the 2010 Indians might end up with. Let’s start by using WAR to evaluate all the 2009 AL teams:
| Team | Wins | Pos. WAR | Pit. WAR | Replacement | Expected Wins |
It looks complex, but it’s not. Basically, if you add the wins above replacement contributed by position players and pitchers to the replacement level, you have the number of games your team should have won. So the final column is the sum of the previous three. Some teams get luckier or unluckier than others, but WAR tells us that the Yankees, Rays and Red Sox had the most talented rosters in the AL (100, 94 and 94 expected wins respectively), while the Orioles, Royals, and Indians got the fewest wins above replacement from their players. Any surprise that those three teams also held the three worst records in the league?
Let’s look at the Indians a bit more closely. Last season the Indians won only 65 games, but according to WAR, they should have won 73. Not a powerhouse by any stretch, but if there’s one thing that I’ve tried to drive home over the past month or so, it’s that the Indians were not nearly as bad last season as their record suggested. They ran into some horrible luck. They lost too many close games. They struggled with injuries, trades, uncharacteristically poor performances, and some dubious managing decisions from the Grinder.
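The bookkeeping behind that 73 can be sketched as follows. The 43-win replacement level, the 73 expected wins, and the 65 actual wins come from the discussion above; the 20/10 split between position players and pitchers is an illustrative assumption of mine, not the actual table values, and the function names are hypothetical.

```python
REPLACEMENT_WINS_AL = 43  # approximate 2009 AL replacement-level team


def expected_wins_from_war(position_war, pitching_war,
                           replacement_wins=REPLACEMENT_WINS_AL):
    """Expected wins = replacement level + position WAR + pitching WAR."""
    return replacement_wins + position_war + pitching_war


def wins_vs_expectation(actual_wins, expected_wins):
    """Negative means the team won fewer games than its WAR suggests."""
    return actual_wins - expected_wins


if __name__ == "__main__":
    # Illustrative split only: 20 position WAR + 10 pitching WAR.
    expected = expected_wins_from_war(20.0, 10.0)
    print(expected)                           # expected wins
    print(wins_vs_expectation(65, expected))  # gap versus the actual 65 wins
```

With those inputs the expected total is 73 wins, eight more than the Indians actually managed.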
So what might change this year? Well, Grady should bounce back to form; if healthy, I’d mark him down for 5 to 6 WAR. So long as Choo’s 2009 wasn’t a fluke (and I don’t think it was) I’d give him 4 to 5 WAR. Cabrera should be good for another 3, or more if his defense is as good as we think it might be. And between Hafner, Branyan, and LaPorta/Brantley, I’m hoping for about 7 WAR combined. That would put us at 19 to 21 WAR from six lineup spots, or about what we had last year for our entire roster of position players. In other words, I believe that our position players have a chance to be significantly better than they were last season.
The question, of course, is our pitching staff. I can’t remember the last time this team didn’t have a starter I’d call above average, but this season is starting to look that way.
From 2004 to 2007 Westbrook averaged around 3.5 WAR per season, but he’s obviously a giant question mark for the 2010 campaign. In 2007, Fausto was worth 4.2 wins above replacement, but over the past two seasons he’s been worth less than one due to his control issues. Laffey and Huff have typically been worth between one and two WAR in their careers, which is fine so long as they aren’t relied upon to be major cogs in the middle of the rotation. Sowers has basically been the definition of a replacement-level pitcher, compiling only 3 wins above replacement over four combined seasons. Who knows what to expect from Masterson or Talbot, but I wouldn’t expect more than 2 to 3 WAR from either. Then there’s the bullpen, which last season contributed only 1.2 wins above replacement—led by Kerry Wood’s 0.4! It has to get better this year, doesn’t it?
Based on the above, here are my guesses for the value of the 2010 squad (beware: the following prediction includes mandatory March optimism):
And there you have it! 82 wins. As a long-lost WFNY commenter would say, “Book it!”
Next time we’ll look at how WAR correlates with player salaries, and how the Indians (or any “small market team”) should be thinking in terms of contracts, resources, and Hafner-sized mistakes.
As always, feel free to ask questions and I’ll do my best to point you toward an answer. See you next time!
Thanks to the guys at WFNY for picking me up as an occasional contributor. Much of the research in this series is built on ideas from The Book: Playing the Percentages in Baseball, the ongoing work at FanGraphs, StatCorner, The Hardball Times, and Tom Tango’s blog, and the countless other blogs and books that refuse to stop thinking and arguing about baseball.