BABIP is Your Friend
I want to take the time today to talk about BABIP, a baseball term which stands for Batting Average on Balls In Play. I get the sense that sometimes that acronym, BABIP, is something of a touchy subject for baseball fans, and I want to use this platform to try to bridge that gap to some degree.
The main series of questions I want to answer in this piece are as follows:
What is BABIP? What does BABIP measure? How is BABIP useful? They're all interrelated concepts, so I'll try to break it down as best I can.
As I mentioned at the top, BABIP (I pronounce it phonetically, for what it's worth) quite literally stands for Batting Average on Balls In Play. While that sounds like some newfangled sabermetric stat, I hesitate to really call it purely a "stat," per se.
To me, BABIP is more of a measurement tool than a statistic. Yes, it's expressed in a number, but like any baseball-related statistic, it requires a little more nuance to properly contextualize what it's really measuring. BABIP to me is an inexact, but important, rough outline of a player's batted ball profile.
What you may have surmised from its name is that BABIP only measures batted balls that are in play. That means strikeouts and walks have no bearing on it, but neither do home runs. Home runs are considered out of play, and they aren't subject to what BABIP measures for that reason.
Mathematically, the formula to calculate BABIP is: (H - HR) / (AB - K - HR + SF), where H is hits, HR is home runs, AB is at-bats, K is strikeouts and SF is sacrifice flies. I'm not a huge math guy, which may come as a shock to my readers and Twitter followers, but I'm truly not very strong with math or formulas. I respect the math and science that goes into these metrics, but you won't see me inventing a mathematical proof any time soon.
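For anyone who would rather see the arithmetic than read it, here is a minimal Python sketch. It uses the form published by FanGraphs and MLB, (H - HR) / (AB - K - HR + SF), which leaves sacrifice bunts out of the denominator. The function name and the season line are made up for illustration; they are not any real player's numbers.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play, so BABIP is undefined")
    return (h - hr) / balls_in_play


# Hypothetical season line, invented purely for illustration.
season = babip(h=180, hr=20, ab=600, k=120, sf=5)
print(f"{season:.3f}")  # 160 hits on 465 balls in play -> 0.344
```

Notice that the 20 homers come off both the top and the bottom, and the 120 strikeouts come off the bottom only. That is the whole "balls in play" idea in two lines.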
The best I can do is try to understand what a measure like BABIP is really measuring, as opposed to what it doesn't measure, and try to contextualize why BABIP is a useful tool for baseball fans.
People might say, "Why should I care about BABIP? It doesn't include strikeouts, walks and homers. Strikeouts, walks and homers are part of the game!" That's absolutely true, and no proper application of BABIP claims to measure anything other than batted balls in play.
BABIP is not trying to be anything more than a measure of a specific set of baseball plays. It is not meant to encompass everything, nor is it a magic stat that means a significant amount on its own. It is simply meant to scratch the surface of batted ball profiles, and help us get a better understanding of what happens when a ball is put in play.
Something unique about BABIP and batted ball profiles in general is that every single player, pitcher and batter, has a unique and individualized batted ball profile. Pitcher A may not have a batted ball profile similar to Pitcher B's, because they have different skill-sets. Same for Hitter C and Hitter D.
The theory on BABIP for hitters is that they have somewhat more control over their BABIP and batted ball profiles than pitchers do, because hitters can influence the direction, speed and quality of contact made on the balls they put in play. Maybe not to a large degree, but still more than a pitcher can.
Essentially, a batter's approach to a pitch can influence a batted ball more than a pitcher can, because a pitcher is at the mercy of the defense behind him, the batter's skill, the execution of his pitches, etc. Luck is inherently a part of baseball, which to me is a beautiful thing.
The best players in the world need a certain degree of luck to succeed. Just as a hard line drive can be caught, and a soft blooper or dribbler can go for a base hit, luck is inherently as big a part of baseball as grass and dirt.
The proper use of BABIP, in my opinion, is looking at it in the grand scheme of things. Look at a player's career BABIP and see if it falls in line with their current season BABIP. Sometimes, for a variety of reasons, it does not.
Maybe a pitcher is getting a higher rate of ground balls in a given year and getting weaker contact more consistently, leading to a lower BABIP. In theory, weaker contact should lead to a lower BABIP.
I've mentioned batted ball profiles, and I want to expand on that as well. Batted ball profiles typically include ground ball percentage, fly ball percentage and line drive percentage, along with other more minor batted ball types such as infield fly balls, bunt hits, infield hits, etc.
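To make "batted ball profile" a little more concrete, here is a rough Python sketch that turns raw counts into the three main percentages. It assumes every ball in play gets tagged as exactly one of ground ball, fly ball or line drive, which is a simplification, and the counts are hypothetical.

```python
def batted_ball_profile(ground_balls, fly_balls, line_drives):
    """Return GB%, FB% and LD% as shares of all classified balls in play."""
    total = ground_balls + fly_balls + line_drives
    if total == 0:
        raise ValueError("no batted balls to classify")
    return {
        "GB%": 100 * ground_balls / total,
        "FB%": 100 * fly_balls / total,
        "LD%": 100 * line_drives / total,
    }


# Hypothetical counts, invented for illustration.
profile = batted_ball_profile(ground_balls=200, fly_balls=160, line_drives=105)
for label, pct in profile.items():
    print(f"{label}: {pct:.1f}")
```

The three numbers always add up to 100 under this simplification, so a rise in one has to come out of the others.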
Consistency in baseball is really hard, obviously. It's very difficult to hit line drives all the time if you're a hitter, or get weak contact if you're a pitcher. And of course, even if you do exactly what you were trying to do, success isn't guaranteed. Luck is a fickle beast, and because of that, so is BABIP.
I'll use David Wright as an example. He is a player who has put up a career BABIP of .341, which is extraordinarily good. From 2005 to 2013 for example, his .343 BABIP is tied for 15th among hitters. How does he do it? Mostly on the strength of a career 22.5% line drive percentage. David Wright hits the baseball hard a lot. He's pretty skilled, and his skill-set is predisposed to a pretty solid BABIP. Since the BA in BABIP literally stands for batting average, BABIP is as limited as AVG is, treating all base hits equally like AVG does.
Wright's worst offensive season in 2011? Career-low BABIP of .302, and career-low line drive percentage of 18.0%. He also had career lows in AVG, OBP and SLG that year, which isn't very surprising. It's all there in the batted ball profile. He did not hit the ball as hard that year as he typically does, and Lady Luck did not bless him with a high BABIP.
I keep coming back to luck because I can't stress enough how much luck has to do with it. I wish I could say Chris Johnson of the Braves is an extremely skilled hitter with his career .361 BABIP and his outstanding 24.9% line drive rate (career-best 27.0% last year, unsurprisingly along with a career-best and MLB-best .394 BABIP). He's very good at hitting for batting average, though there's not much else to his offensive game. His skill-set predisposes him to good batted ball luck, and luck is always a factor.
I'm rambling. I do that pretty often. This post is long, and frankly I'm a little worried I haven't explained any of this very well at all. I'm glad you're still reading, and I really hope you're not more confused about BABIP than you were when you started this article and saw my terrible photoshop.
The bottom line is that BABIP is a useful concept when properly applied. I think you can say that about just about any metric in baseball. And like any metric, basic or advanced, the bigger a sample size you're working with, the higher degree of reliability it has. BABIP measures what happens when the ball is put in play. It doesn't measure anything else, and quite frankly, it doesn't tell you nearly everything that happens when that ball is put in play, either.
Sample size is always important. Small sample size is dangerous for reliability purposes. For hitters, you might need somewhere in the range of 800-1000 total balls in play to properly and reliably make statements about a player's BABIP. For pitchers it's a lot more. FanGraphs estimates it at about 2000 balls in play before a pitcher's BABIP becomes reliable. In a nutshell, single-season pitcher BABIP is very, very unstable and unreliable. Always beware of small sample size in statistical analysis. Accuracy and objectivity are the goals, and without a proper sample, inaccuracy can run rampant.
BABIP is imprecise. It's rough, just as batting average is. It means very little in a vacuum, but it can be very meaningful when compared to the quality of contact in a player's batted ball profile that season and those same things over the course of a career. You don't have to be a good player to have above-average BABIP luck, and you don't have to be a bad player to get below-average BABIP luck. Random variation is a big part of baseball, and BABIP is simply a rough estimation of the degree of luck at play.
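Here is a back-of-the-envelope Python sketch of why small samples are so shaky. It assumes each ball in play is an independent coin flip with the same chance of falling for a hit, which real baseball certainly is not, so treat it as an illustration of the idea rather than a reproduction of the FanGraphs estimates. The function name is my own.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Binomial standard error of an observed BABIP (simplifying assumption)."""
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


# How much a "true" .300 BABIP can wobble from luck alone at various sample sizes.
for n in (100, 500, 1000, 2000):
    se = babip_standard_error(0.300, n)
    print(f"{n:>5} balls in play: .300 give or take {se:.3f}")
```

Under that assumption the wobble is roughly 46 points of BABIP at 100 balls in play and still about 10 points at 2000. Quadrupling the sample only cuts the noise in half, which is why it takes so long for the number to settle down.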
An above-average or below-average BABIP on its own doesn't necessarily mean a player is lucky or unlucky, especially if that player does it consistently over many seasons. But a statistical outlier, such as a pitcher with a career .310 BABIP suddenly putting up a .272, can be a red flag that says, "Hey, that guy's getting lucky, and he's probably due to regress back towards the .300 mark or so." Fluky outliers happen, and typically, if they don't make sense, they correct themselves over time.
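If you want a rough way to size up a gap like a .272 season against a .310 career mark, one option is to measure it in standard errors. The Python sketch below does that under the same coin-flip simplification as any basic binomial estimate. The 550 balls in play is a number I made up for a hypothetical pitcher season, and the function name is mine as well.

```python
import math


def babip_gap_in_standard_errors(season_babip, career_babip, season_balls_in_play):
    """How far a season BABIP sits from the career mark, in binomial standard errors."""
    se = math.sqrt(career_babip * (1 - career_babip) / season_balls_in_play)
    return (season_babip - career_babip) / se


# Season and career BABIP come from the example above; 550 balls in play is hypothetical.
gap = babip_gap_in_standard_errors(0.272, 0.310, 550)
print(f"{gap:.2f} standard errors from the career mark")
```

With these inputs the season lands just under two standard errors below the career number. That is unusual, but it is still the kind of gap random variation can produce on its own, so expecting some drift back toward the career mark is a reasonable bet rather than a guarantee.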
I hope that makes sense.
I'll leave you with these silly but informative YouTube clips by FanGraphs writer Bradley Woodrum, this one on pitcher BABIP, and this one on hitter BABIP. He's much more concise and a lot smarter than I am, and if you're a visual learner it'll probably help.
I truly hope by the end of this paragraph you have a greater understanding and appreciation of the concept of BABIP in general, and please don't hesitate to tweet me with any questions or comments. I'm happy to help.