April 23, 2014

SABR-Toothed Triber: Runs, Wins and 2010 Predictions

Once again, it’s on.  Jon Steiner continues with his weekly feature regarding the Cleveland Indians mixed in with a little math.  Do Enjoy…

Wow.  That’s a nondescript title.  Let’s have some fun with it.

Baseball games are inherently complex things, as anyone who’s ever had to look up the infield fly rule can attest.  But no matter the minutiae, the games always boil down to one thing: who won?  Of course, it’s easy to figure this out.  All we have to do is look at the box score, tally up the runs, and see who scored more.  No matter how complex a game gets, there’s nothing more black and white than wins and losses.

But the reason more and more baseball fans care about statistics is that they might be able to tell us what to expect (or more precisely, why to expect what we expect) rather than simply what happened.  After all, just because your team won today doesn’t mean they will win tomorrow.

And for this reason, predicting wins and losses can be a tricky thing.  So much goes into a single baseball game: usually around 80 plate appearances, 250 pitches, countless in-game decisions regarding pitch selection, defensive alignments, pitching changes and baserunning.

What we do know is that you need to score more runs than your opponent to win games, right?  I mean, I’ve never seen a team win a game when it scored fewer runs than its opponent.  Or, as Rick Manning might say, “Whoever scores more runs tonight might just come out on top of this thing!”

So even though the individual games can get complex, if we can estimate how many runs a team will score and how many runs it will allow over the course of a season, we should be able to get a feel for what their record should be.

Which brings me to one of the foundational formulas of the statistical community.  It’s called Pythagorean Expectation, and it’s really pretty simple:

Expected W% = RS^2 / (RS^2 + RA^2)

Here RS is runs scored and RA is runs allowed.

Basically, the formula attempts to estimate what a team’s record would be based solely on two inputs: runs scored and runs allowed.  Let’s check its accuracy in the AL for 2009:

Team        W    L    W%     RS   RA   Pythag%
Yankees     103  59   0.636  915  753  0.596
Angels      97   65   0.599  883  761  0.574
Red Sox     95   67   0.586  872  736  0.584
Rangers     87   75   0.537  784  740  0.529
Twins       87   76   0.534  817  765  0.533
Tigers      86   77   0.528  743  745  0.499
Mariners    85   77   0.525  640  692  0.461
Rays        84   78   0.519  803  754  0.531
White Sox   79   83   0.488  724  732  0.495
Blue Jays   75   87   0.463  798  771  0.517
A’s         75   87   0.463  759  761  0.499
Indians     65   97   0.401  773  865  0.444
Royals      65   97   0.401  686  842  0.399
Orioles     64   98   0.395  741  876  0.417

Sure, there are some outliers in the chart—there always will be.  The Mariners’ winning percentage was more than 60 points better than we would have expected based on runs scored and allowed, while the Blue Jays were more than 50 points worse, but overall, this looks like a fairly accurate tool.

Let’s use this formula to examine some hypothetical situations; I promise this will be worth it.  Start by assuming that the 2010 Indians score exactly 800 runs and allow exactly 800 (optimistic, I know!).  What would we expect their record to be?  Well, when we plug 800 into the RS and RA inputs, we end up with a .500 winning percentage—an 81 win team.  Makes sense.  Now what if they score only 790 runs, but still allow 800 runs?  Well, when we crank the numbers, we come up with a .494 expected winning percentage.  Multiply that by 162 games, and we get 80 wins and 82 losses.  A change of one win.  Now let’s make the pitching staff worse by 10 runs—getting into the neighborhood of reality for the Indians.  We’ll plug in 790 for RS again, but 810 for RA.  The new expected winning percentage?  .488.  The W-L?  79-83.  Another change of one win.  In fact, it looks like every 10 additional runs we score are worth one win, while every 10 additional runs we allow cost us one win.
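If you’d rather not crank the numbers by hand, here is a minimal Python sketch of the same calculation.  It assumes the classic exponent-2 form of Pythagorean Expectation (which reproduces the Pythag% column in the 2009 table), and the function names are just made up for illustration:

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythag_wins(runs_scored, runs_allowed, games=162):
    """Expected wins over a season, rounded to the nearest whole win."""
    return round(pythag_pct(runs_scored, runs_allowed) * games)


# The three hypothetical 2010 Indians scenarios from the paragraph above.
for rs, ra in [(800, 800), (790, 800), (790, 810)]:
    pct = pythag_pct(rs, ra)
    wins = pythag_wins(rs, ra)
    print(f"RS={rs} RA={ra}  W%={pct:.3f}  record={wins}-{162 - wins}")
```

Swap in any team’s runs scored and allowed to see how far its actual record strayed from the estimate.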

This is why, when a player adds 10 runs to his team’s performance (or prevents 10 runs via pitching or defense), we generally say that he’s “added” a win over the course of a season.  It’s a simple rule with some complicated consequences.

Believe it or not, this fact—that 10 runs equal a win—tells us just about everything we need to know about a player’s value.  After all, if we can evaluate all the things a player does offensively, defensively, and pitchingwise (a new word?  Get that Merriam guy on the phone!), we can estimate how many runs and wins he’s worth.  Furthermore, if we know his salary, we can figure out how much he’s making per win!

But here’s the trick: what should we compare a player’s performance against?  I mean, it’s great to know that a player added 5 wins to his team, but what would a different player have added?  After all, it’s not like your option is to play Grady Sizemore or not have a CF at all.  No, there has to be some baseline against which we can compare a player, and that’s what we need to identify.

Well, it would be nice to compare a player to the average player at his position—a perfectly reasonable thing to do.  After all, we know the average hitter’s wOBA, the average pitcher’s FIP, and the average fielder’s UZR (see these posts if you need reminders).  But here’s the problem: we don’t know the average player’s salary, since it can vary so much.  And since we eventually want to determine how much money a player makes per win added, the average player just won’t work.  After all, some league average players make the league minimum; some (*cough, Hafner, cough*) make much more than the league minimum.

So rather than using the average player as our benchmark, we’re going to use what’s called a replacement player.  Here’s how that works.  A replacement player is basically a player who makes the league minimum ($400,000) and performs at the level of your “AAAA” player.  If you need an example, think of Trevor Crowe or Josh Barfield.  According to Dave Cameron, replacement players are “usually available every winter as minor league free agents, via the Rule 5 draft, or as cheap trade acquisitions where a team can acquire one of these players without giving up any real talent in return.”  Typically, a fourth outfielder (not a platoon player) is a good example of a replacement player.

Now if you were to construct a roster completely out of replacement level players, your team would have won about 43 games in the AL last season*—a terrible team to be sure, but remember that every player on your team would be far below average.  So any wins an actual team generates over that 43 would be called Wins Above Replacement (WAR).  We’ve got what we want!

*Interestingly, by my calculations, a replacement level team in the NL would have won nearly 50 games.  Odd but true.  The AL was that much better than the NL.

If you’re interested in calculating a player’s WAR yourself, I’ll direct you to my favorite primer from FanGraphs (scroll to the bottom), but there are many great ones out there on the interwebs.  For our purposes, I think it should suffice to say the following:

  • A position player’s WAR is based on the number of runs, compared to a replacement level player, that he produces hitting (measured by a version of wOBA), the number of runs he saves or costs defensively (measured by UZR), and the number of runs he creates or costs from his baserunning (not just steals).  That number is then adjusted for the defensive position he plays, since league average offense at SS is obviously more valuable than league average offense in RF.
  • A pitcher’s WAR is based on his FIP—scaled to total runs allowed rather than earned runs allowed.  Basically, it’s the number of runs the pitcher was responsible for allowing, removing his defense from the picture (remember, that’s accounted for in the defense calculation for position players).
  • WAR is just the number of runs above a replacement player that a given player produced divided by 10 (since 10 runs equals one win).
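The last bullet is easy to sketch in a few lines of Python.  To be clear, this is a simplification: the function name and the run values are hypothetical placeholders rather than any real player’s numbers, and a full WAR calculation involves more adjustments than this.

```python
RUNS_PER_WIN = 10  # the rule of thumb: roughly 10 runs equal one win


def position_player_war(batting, fielding, baserunning, positional):
    """Add up a player's run components (relative to a replacement
    level player, plus a positional adjustment) and convert to wins."""
    runs_above_replacement = batting + fielding + baserunning + positional
    return runs_above_replacement / RUNS_PER_WIN


# Hypothetical shortstop: +30 batting, +5 fielding, +2 baserunning,
# +7 positional adjustment, for 44 runs above replacement.
print(position_player_war(30, 5, 2, 7))  # 4.4
```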

Things are more complicated than that for sure, but those three points cover most of the important bases.  Let’s check to see if the 2009 WAR leaderboard correlates with what we think we already believe about player value:

Player WAR
Zack Greinke 9.4
Ben Zobrist 8.6
Albert Pujols 8.5
Justin Verlander 8.2
Tim Lincecum 8.2
Joe Mauer 8.1
Chase Utley 7.6
Derek Jeter 7.4
Roy Halladay 7.3
Evan Longoria 7.2

I’d say that list looks pretty much like what we’d expect if we asked people to name the 10 best performances last year.  Zobrist might surprise you, until you realize he hit 30 HRs, batted close to .300, and played some great defense at multiple positions for the Rays.

So what do these numbers mean?  Let’s look at Albert Pujols as an example.  Last year, Pujols helped the Cardinals win eight to nine more games than they would have won with a replacement level first baseman playing for them (Andy Marte?).  This includes his above average defense, his amazing offense, and his strong baserunning.  I can’t think of a stat that tells me more about a player’s contribution to his team than WAR.

So let’s look at the Tribe, and see where our best and worst performers were in 2009 (here are the full lists for position players and pitchers):

Best Five:

Name WAR
Shin-Soo Choo 5.0
Cliff Lee 4.1
Asdrubal Cabrera 3.1
Victor Martinez 2.8
Grady Sizemore 2.1

Worst Five:

Name WAR
Jensen Lewis -0.3
Andy Marte -0.4
Michael Brantley -0.5
Chris Gimenez -0.7
Tomo Ohka -0.8

Shin-Soo Choo provided us with five wins more than a replacement level RF would have.  That’s quite valuable, when you consider that the best player in baseball last year provided only 9.4 wins above replacement (Greinke).  But beyond Choo, the news wasn’t so great.  Sure, Cabrera had a breakout year, but that was offset by Sizemore dropping from 6.4 WAR in 2008 to only 2.1 in 2009 due to injury and ineffectiveness.  At Hafner’s peak in 2006, he was worth 6.2 WAR—unheard of for someone with no defensive value.  Last year, it was a measly 1.3.  Peralta peaked in 2008 with a 3.9 WAR; now he’s basically good for one win a year over your everyday minor leaguer.  Oh yeah, and two of our top five performers were traded for players who probably won’t be in the opening day lineup.

So what is WAR good for?  Absolutely nothing?  Not quite.  Theoretically, if we add up all the wins over replacement contributed by an entire roster, we should be able to find out how many wins a team should have had over the course of a season.  Furthermore, if we have a good feel for the WARs of our current roster, we can predict how many wins the 2010 Indians might end up with.  Let’s start by using WAR to evaluate all the 2009 AL teams:

Team        Wins  Pos. WAR  Pit. WAR  Replacement  Expected Wins
Yankees     103   38.4      18.7      43           100
Angels      97    29.7      16.6      43           89
Red Sox     95    27.7      23.6      43           94
Rangers     87    22.1      18.5      43           84
Twins       87    21.6      16.4      43           81
Tigers      86    21.9      16.5      43           81
Mariners    85    21.1      16.4      43           81
Rays        84    34.4      16.9      43           94
White Sox   79    11        22.6      43           77
Blue Jays   75    21        18.1      43           82
A’s         75    17.5      19.3      43           80
Indians     65    19.8      10.5      43           73
Royals      65    6.9       19.9      43           70
Orioles     64    15.6      7.4       43           66

It looks complex, but it’s not.  Basically, if you add the wins above replacement contributed by position players and pitchers to the replacement level, you have the number of wins your team should have won.  So the final column is the sum of the previous three.  Some teams get luckier or unluckier than others, but WAR tells us that the Yankees, Rays and Red Sox had the most talented rosters in the AL (100, 94 and 94 expected wins respectively), while the Orioles, Royals, and Indians players contributed the fewest wins above replacement in the AL.  Any surprise that those three teams also held the three worst records in the league?
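For anyone who wants to check my addition, here is a rough Python sketch of how that final column is built.  The WAR figures are copied from the table for four teams; the 43-win replacement level is my own estimate from earlier, and the function and variable names are only illustrative:

```python
AL_REPLACEMENT_WINS = 43  # estimated wins for an all-replacement AL roster

# team: (position-player WAR, pitcher WAR, actual wins), from the table
teams = {
    "Yankees": (38.4, 18.7, 103),
    "Rays": (34.4, 16.9, 84),
    "Indians": (19.8, 10.5, 65),
    "Orioles": (15.6, 7.4, 64),
}


def war_expected_wins(pos_war, pit_war, replacement=AL_REPLACEMENT_WINS):
    """Replacement-level wins plus everything the roster added on top."""
    return replacement + pos_war + pit_war


for name, (pos_war, pit_war, actual) in teams.items():
    expected = war_expected_wins(pos_war, pit_war)
    print(f"{name:8s} expected {expected:5.1f}  actual {actual:3d}  "
          f"difference {actual - expected:+.1f}")
```

A positive difference means a team won more games than its WAR total suggests; a negative one means it fell short.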

Let’s look at the Indians a bit more closely.  Last season the Indians won only 65 games, but according to WAR, they should have won 73.  Not a powerhouse by any stretch, but if there’s one thing that I’ve tried to drive home over the past month or so, it’s that the Indians were not nearly as bad last season as their record suggested.  They ran into some horrible luck.  They lost too many close games.  They struggled with injuries, trades, uncharacteristically poor performances, and some dubious managing decisions from the Grinder.

So what might change this year?  Well, Grady should bounce back to form; if healthy, I’d mark him down for 5 to 6 WAR.  So long as Choo’s 2009 wasn’t a fluke (and I don’t think it was) I’d give him 4 to 5 WAR.  Cabrera should be good for another 3—more if his defense is as good as we think it might be.  And between Hafner, Branyan, and LaPorta/Brantley, I’m hoping for about 7 WAR combined.  That would put us at 19 to 21 WAR from six lineup spots, or about what we had last year for our entire roster of position players.  In other words, I believe that our position players have a chance to be significantly better than they were last season.

The question, of course, is our pitching staff.  I can’t remember the last time this team didn’t have a single starter I’d call above average, but this season is shaping up that way.

From 2004 to 2007 Westbrook averaged around 3.5 WAR per season, but he’s obviously a giant question mark for the 2010 campaign.  In 2007, Fausto was worth 4.2 wins above replacement, but over the past two seasons he’s been worth less than one due to his control issues.  Laffey and Huff have typically been worth between one and two WAR in their careers, which is fine so long as they aren’t relied upon to be major cogs in the middle of the rotation.  Sowers has basically been the definition of a replacement-level pitcher, compiling only 3 wins above replacement over four combined seasons.  Who knows what to expect from Masterson or Talbot, but I wouldn’t expect more than 2 to 3 WAR from either.  Then there’s the bullpen, which last season contributed only 1.2 wins above replacement—led by Kerry Wood’s 0.4!  It has to get better this year, doesn’t it?

Based on the above, here are my guesses for the value of the 2010 squad (beware: the following prediction includes mandatory March optimism):

                2010  2009
WAR (Position)  25.6  19.8
WAR (Starters)  9.8   9.3
WAR (Bullpen)   3.4   1.2
AL Replacement  43
Record          81.8

And there you have it!  82 wins.  As a long-lost WFNY commenter would say, “Book it!”

Next time we’ll look at how WAR correlates with player salaries, and how the Indians (or any “small market team”) should be thinking in terms of contracts, resources, and Hafner-esque sized mistakes.

As always, feel free to ask questions and I’ll do my best to point you toward an answer.  See you next time!


Thanks to the guys at WFNY for picking me up as an occasional contributor.  Much of the research in this series is built on ideas from The Book: Playing the Percentages in Baseball, the ongoing work at FanGraphs, StatCorner, The Hardball Times, and Tom Tango’s blog, and the countless other blogs and books that refuse to stop thinking and arguing about baseball.

  • http://www.waitingfornextyear.com Scott

    Awesome tie-in of the last few weeks. Regarding Lee, is his WAR prorated? Also, I would have to assume that Masa Kobayashi was one of the five worst last season; did he not make the cut?

  • Jon Steiner

    @ Scott:

    Yep. That’s Lee’s WAR with the Indians only. His total season WAR was 6.6. Nifty pitcher to have around.

    Kobayashi’s WAR for the season was -0.1 last year; so as you’d expect, he cost us more runs than a replacement level guy would have (all while making $3 million!). However, we cut him loose about halfway through the season if memory serves, which means he didn’t have quite enough opportunities to really show off all his detrimental “abilities.”

    Ohka on the other hand, pitched 71 innings with a 6.50 FIP. Just awful.

  • Josh

    This is incredibly interesting….great post

  • http://www.waitingfornextyear.com Denny

    I didn’t read the post but wanted to comment that Russell Branyan in every picture is fantastic.

  • http://www.waitingfornextyear.com Scott

    “Ohka on the other hand, pitched 71 innings with a 6.50 FIP. Just awful.”

    I was at the game where Pujols hit two home runs in back-to-back at-bats off of Ohka. Didn’t realize how much WAR was in the making…

  • Jon Steiner

    @ Denny:

    Those pictures are solely Scott’s creation, so I can take no credit. But juxtaposing MC Escher with Branyan may be one of the more genius moves of our generation–“is he going up or down?” relates well to both men.

    Scott told me he spent all week “on the photoshopper” in the hopes that his creation would carry him to victory in the Blog Sports Ohio competition.

    I’d say, based on his photoshoppery, it’s looking pretty darned good.

  • http://www.waitingfornextyear.com Denny

    I knew it! Luckily, I’ve got youth on my side in my quest to be the greatest photoshopper in the history of people who write for this website. Scott’s such an old curmudgeon that it’ll take him years to figure out photoshop. Looks like Branyan is his Tim Couch Bear. His muse, if you will. He may be highly motivated now, with the Love Muscle tugging at his heart-strings.

    But shoppery certainly will help him in the FSO gig. His writing is much better than the other guys’ already, so he’s got that going for him. Which is nice.

  • Tommy

    So are there things that have been shown to correlate with overperforming or underperforming the win expectation based on runs? It has been a source of frustration for me over the last few years to see the Indians always projected to contend by these systems while the record has always fallen short of what we “should” have done.

    The exceptions being ’05 and ’07, which also happen to be the only times that the Tribe’s bullpen was not horrid. Is there a correlation between bullpen success and winning close games? Or any other aspect of the game that would somewhat consistently correlate to winning close games, therefore consistently making more efficient use of our runs and outperforming the win expectations?

    I’m also looking forward to seeing if Wedge and his style had any effect on our consistent underperformance of our expectations. Have you seen any evidence to suggest this was the case?

  • Jon Steiner

    @ Tommy:

    Really good stuff. First, I should say that my “projections” are not really all that scientific. In fact, they’re wild guesses based on optimism, roster moves, and last year’s figures.

    Second, WAR is my favorite metric for player evaluation, but it doesn’t do great with bullpens for exactly the reason you indicate. Outs late in a close game are more important than outs in the first inning, and for this reason, people often talk about “leverage index” (LI) and “win probability added” (WPA) as better measures for relievers. Maybe I’ll flesh those out in another post.

    Finally, a lot has been done on winning and losing close games. Strong defense has shown some correlation with an ability to win close games, but since we don’t have a good measure of “managerial effectiveness” it’s hard to measure someone like Wedge’s (deleterious?) effects on these stats. But if teams consistently outperform or underperform their projections, people often point to the manager. Think of the Angels. For the last nine years or so they outperformed their projection, and no one has a great answer other than Mike Scioscia.

    I guess I would also say that while the Indians did underperform under Wedge, they weren’t as bad as the Angels are good. Here are the Indians’ actual winning percentages and Pythagorean winning percentages (based on runs scored and allowed) since 2002:

    2002: aW% – .457 pW% – .438
    2003: aW% – .420 pW% – .447
    2004: aW% – .497 pW% – .501
    2005: aW% – .574 pW% – .602
    2006: aW% – .481 pW% – .553
    2007: aW% – .595 pW% – .575
    2008: aW% – .500 pW% – .528

    So they “overperformed” in 2002 and 2007, “underperformed” in 2003, 2005, 2006, and 2008, and were dead on in 2004. I would tend to think a lot of that year-to-year variation is just statistical noise, but if we need a scapegoat for four “bad years” and only two “good years”, there’s always the Grinder…

  • MrCleaveland

    That photoshop of the Escher sketch is fantastic. (Get it? Escher sketch. Etch-a-Sketch. Ha ha ha! Thanks. It’s a gift.)

  • sleepless in cincinnati

    What is it good for?
    Absolutely nothin
    Say it again

    I couldn’t resist.
