With all of the recent discussion over Clint Hurdle's bullpen management, I spent the afternoon rereading a few chapters on bullpen usage from one of the important books in baseball research, Baseball Between the Numbers: Why Everything You Know About The Game Is Wrong. In an article titled "Is Joe Torre a Hall of Fame Manager?" James Click researches the question of whether "there may be some managers who separate themselves from the pack by employing their given resources (bullpen) better than most." Both the methodology he used and the question he asked are the basis for this study.
The assumption underlying Click's research is straightforward: managers should have relief aces, not closers. Relief aces should be deployed in high leverage (high leverage = high pressure) situations, whenever those situations arise. This means that aces may be used in the seventh, eighth, or ninth inning. They are not locked into ninth inning appearances.
At the end of the season, the relief ace should have the highest average entering leverage on the staff. By "entering leverage," I mean the measure of leverage at the moment he takes the mound. Since relief aces should be most often used to extinguish high-leverage threats, they should have the highest entering leverage on the staff. Ideally, at the end of the season, the relief ace should have the highest entering leverage, the second-best reliever the second-highest entering leverage, and so on.
In order to examine how efficiently managers employed their bullpens, Click looked at two statistics: Fair Run Average (FRA) and Leverage (I use gmLI, entering game leverage). He does not report how he used these metrics, but he concludes that there is no evidence that managers can be significantly distinguished as being better or worse in this regard. He posits that the lack of distinction in this area is largely due to modern baseball's lack of innovation in bullpen management.
I think Click's conclusion is probably accurate. I find it very unlikely that Clint Hurdle is much better or worse than his peers in maximizing his bullpen resources.
However, out of curiosity, I decided to collect the same data that Click used, FRA and Leverage, for the 2012 season. What I wanted to find out was where the Pirates ranked in the National League in terms of matching their pitchers to leverage. Basically the idea is that the pitchers with the lowest/highest FRA should have the highest/lowest leverage scores.
(Click's description of Fair Run Average, or FRA: "FRA is just like ERA except that it corrects for inherited and bequeathed runners ... For example, if a pitcher leaves the game with a man on first and 2 outs and the following reliever allows a home run, ERA would credit that first run to the pitcher who already left. By contrast, FRA attributes some of the credit to each pitcher since one was responsible for putting him on base and the other for letting him score.") (On edit: Baseball Prospectus has changed FRA since the publication of Click's chapter. The new definition is as follows: "Fair Run Average takes things a step further. Pitchers receive credit for good sequencing, thus rewarding pitchers who seem to work out of jams more often than usual. Fair Run Average also considers batted ball distribution, base-out state, and team defensive quality (as measured by Fielding Runs Above Average)." Thanks to Tom Tango for pointing this out.)
What I Did
1. Collected all the relief pitchers who had at least 25 innings pitched in 2012.
3. Created scatter plots, ran linear regression trendlines through them, and calculated R-Squared.
4. Transformed R-Squared into r in order to report correlations.
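For readers who want to reproduce the trendline and correlation steps, here is a minimal sketch in Python. It is not the spreadsheet workflow I actually used, and the FRA and gmLI values in it are invented purely for illustration. The function names (`fit_trendline`, `r_from_r_squared`) are hypothetical helpers, not part of any library. One detail worth noting: taking the square root of R-Squared loses the sign of the relationship, so the sketch recovers the sign from the slope of the trendline.

```python
import math

def fit_trendline(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable regression, R-squared equals the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

def r_from_r_squared(r_squared, slope):
    """Recover signed r: sqrt() drops the sign, so copy it from the slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# Made-up (FRA, gmLI) pairs for illustration only; NOT real 2012 data.
fra = [2.1, 2.9, 3.4, 4.0, 4.6, 5.2]
gmli = [1.8, 1.5, 1.6, 1.1, 0.9, 1.0]

slope, intercept, r_squared = fit_trendline(fra, gmli)
r = r_from_r_squared(r_squared, slope)
print(f"slope={slope:.3f}  R^2={r_squared:.3f}  r={r:.3f}  |r|={abs(r):.3f}")
```

With data like this, where leverage falls as FRA rises, the slope and r come out negative; the correlations in this post are reported as positive magnitudes.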
Interpretation of Results
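The higher the correlation, the more effectively a team matched its relief pitchers to leverage. (The correlations are reported as positive numbers, but the underlying relationship is negative: lower FRA should go with higher leverage.)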
We need to be careful about drawing any firm conclusions from these results. We are dealing with very small sample sizes when we deal with relief pitchers. It may very well be the case that a team's best reliever actually ended the season with a higher FRA than a lesser reliever. One or two bad outings can give a misleading impression about a reliever's talent.
Moreover, it is certainly the case that there is more to bullpen management than just responding to leverage. Matchups matter, as do considerations of overuse.
What the data does show us, then, is which teams got closest to getting it "right," for whatever reason. Part of it may be good management, some of it may be pitchers posting (un)expected FRAs.
First is the scatter plot of all MLB relief pitchers with 25 or more innings pitched.
As you can see, the trendline is moving in the expected direction. As FRA increases, average entering leverage decreases. The correlation, .407, is not terribly strong, but it's not nothing either.
Next are the scatter plots for both the National League and American League. The National League has a stronger correlation than the American League: .434 to .379.
This result is somewhat odd to me. One would expect American League teams to have more bullpen flexibility, since AL managers do not have to worry about pinch hitting for pitchers.
Below is a table of the correlations for every National League team in 2012. The Pirates rank somewhat below NL average, .37 compared to .434. The San Diego Padres have a very strong correlation. What's most interesting (or disturbing) is the extreme range of values.
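A rough sketch of how a per-team table like this could be built follows. The team codes, pitcher names, and numbers are placeholders rather than real 2012 data, and `matching_score_by_team` is a hypothetical helper. Flipping the sign so that a higher score means better matching is my assumption about how to read the values; it matches the positive figures reported here whenever the trendline slopes downward.

```python
import math
from collections import defaultdict

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def matching_score_by_team(rows, min_pitchers=3):
    """Group (team, pitcher, FRA, gmLI) rows by team and score each team.

    Score = -r, so a team whose low-FRA relievers saw the highest
    leverage gets a score near +1. Teams with fewer than
    `min_pitchers` qualifying relievers are skipped.
    """
    by_team = defaultdict(list)
    for team, _pitcher, fra, gmli in rows:
        by_team[team].append((fra, gmli))
    scores = {}
    for team, pairs in by_team.items():
        if len(pairs) < min_pitchers:
            continue
        fra_vals, gmli_vals = zip(*pairs)
        scores[team] = -pearson_r(fra_vals, gmli_vals)
    # Best-matched teams first.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

# Placeholder rows for illustration only; NOT real 2012 data.
rows = [
    ("AAA", "P1", 2.0, 2.0), ("AAA", "P2", 3.0, 1.5),
    ("AAA", "P3", 4.0, 1.0), ("AAA", "P4", 5.0, 0.5),
    ("BBB", "P5", 2.0, 1.0), ("BBB", "P6", 3.0, 1.4),
    ("BBB", "P7", 4.0, 0.9), ("BBB", "P8", 5.0, 1.2),
    ("CCC", "P9", 3.0, 1.1), ("CCC", "P10", 4.0, 1.0),
]

scores = matching_score_by_team(rows)
for team, score in scores.items():
    print(f"{team}: {score:.2f}")
```

Keeping the sign matters for a table like this: the square root of R-Squared is always positive, so it cannot tell a team that matched its best relievers to high leverage apart from one that did the opposite.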
Here is a look at the team with the highest correlation (.87), the San Diego Padres.
And here are the Pittsburgh Pirates (.37). Moving left to right, the dots represent Grilli, Lincoln, Hanrahan, Watson, Cruz, Resop, and Hughes.
I do not feel comfortable drawing any strong conclusions from this analysis in terms of evaluating managers. Certainly their decisions play some role in the differences we see but, as I mentioned above, since we are dealing with small sample sizes FRA may not be identifying the truly most talented pitchers.
However, I think we can make the observation that, for whatever reason, some teams seem to have gotten much closer to getting it right than others. Whether this reflects an underlying, repeatable managerial skill is open to debate, though Click's research suggests that it does not.
In the future, one way to redo this study is to substitute projected FRA for actual FRA. That would give us a better idea of pitchers' underlying talent than one year's worth of pitching.
In terms of the Pirates, what we can say is that the team's bullpen ended up slightly below league average in terms of being matched up with leverage. In hindsight, Brad Lincoln should have seen more high-leverage situations, Grilli should have seen more of them than Hanrahan, and Resop more than Hughes.