Speed, Smarts and Defense
I have been at a conference (not SABR) for the last few days, and spent some time doing a small research project, so I apologize for the lack of posts. To make up for it, I'll dive into the new research right away.
Announcers are fond of saying that a fast player is a good defender and vice versa, especially in the outfield. If someone covers a lot of ground, they say he's fast and assume that it also makes him a good baserunner. If someone steals lots of bases, they'll point to that as proof that he covers a lot of ground in the outfield. Defense is notoriously difficult to measure, and even SB statistics can mislead as high totals often come at the expense of high CS totals. Due to the dense fog shrouding both stats, the Joe Morgans of the world can get away with inserting their personal anecdotes in the place of evidence. How would he explain someone like Andruw Jones, lauded for his glove and range, but without gaudy SB totals? I wanted a better answer.
I figured that it would be relatively simple to run a regression analysis for the best defensive statistics available against various measures of speed or discipline on offense to see if they move together. Somewhat surprisingly, offensive skills do little to predict defensive ability in the outfield, according to my research. I took a sample of every CF to play 100 games in a season since the 2001 season and recorded his Rate2. I then plotted it against various offensive stats (AVG, OPS+, triples, SBs, SB% and BB rate) in order to test a few different stereotypes. I chose CF because it is the most demanding defensive position in the outfield, meaning the sample wouldn't be watered down by players with one skill who are stuck on an OF corner in spite of it.
Note: Rate2 is always the dependent variable, always on the Y-axis.
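For anyone who wants to try this at home, the setup can be sketched in a few lines of Python. This is only a rough sketch of a one-variable least-squares fit, not the exact tool I used for the graphs; the function name `fit_line` and the sample numbers are made up for illustration and are not the data from this post.

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares fit of y on x. Returns (slope, intercept, r)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # the "red line" on each graph
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)          # Pearson correlation
    return slope, intercept, r

# Placeholder numbers, invented for illustration only (NOT the post's data).
stolen_bases = [5, 12, 20, 31, 44]     # offensive stat on the X-axis
rate2 = [104, 101, 99, 100, 96]        # Rate2 on the Y-axis

slope, intercept, r = fit_line(stolen_bases, rate2)
print(f"slope={slope:.3f} intercept={intercept:.1f} r={r:.3f} r^2={r * r:.3f}")
```

Swap in any of the offensive stats for the first list and keep Rate2 as the second, and you get the slope of the trend line, the correlation, and the r-squared value for that pairing.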
AVG vs. Rate2
I started with AVG versus rate2 as a control for the experiment, because there is not much reason to think that hitting for a high average would make a player successful in the field. There are plenty of examples of high average players with less than stellar defensive reputations/skill sets, such as Tony Gwynn. But he was tremendously fat, you say. Yes, exactly, he was able to hit for a high average despite carrying the equivalent of a fishtank hanging over his belt, so it shouldn't have anything to do with fielding ability. If there is a high correlation here, it would indicate that offensive skills predict defensive skills, Joe Morgan could make even stronger statements about fielding, and this whole exercise would be a waste of time.
We're off to a good start, as the nearly flat red line indicates that a good batting average does not predict a good fielder. AVG can fluctuate wildly, as demonstrated by players like Scott Podsednik and Johnny Damon, both of whom showed up multiple times in the data. Potentially, if someone had a high enough AVG, he would play in the field often, which could theoretically make him a better fielder over time. This graph provides evidence against that theory, so that confounding effect is unlikely to be at work and we can move along.
OPS+ vs. Rate2
OPS+ measures a player's offensive output relative to league average, scaled so that 100 is average (roughly player OPS over league OPS, although the published version compares OBP and SLG to the league separately and adjusts for park). I used it because it's easily accessible and seldom strays outside the 50-150 range, making it fit more cleanly in a linear graph.
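For the curious, the commonly published version of OPS+ is built from OBP and SLG separately rather than from a single OPS ratio. Here is a minimal sketch of that arithmetic with the park adjustment left out; the function name `ops_plus` and the sample figures are hypothetical, chosen only to show the scale.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """Unadjusted OPS+: 100 * (OBP/lgOBP + SLG/lgSLG - 1).

    Park factors are deliberately omitted here, so this will not
    exactly match published OPS+ values.
    """
    return 100 * (obp / lg_obp + slg / lg_slg - 1)

# Hypothetical hitter and league lines, for illustration only.
league_obp, league_slg = 0.330, 0.420

# A hitter exactly at league average lands on 100.
print(f"{ops_plus(0.330, 0.420, league_obp, league_slg):.0f}")

# A hitter 10% better than league in both OBP and SLG lands near 120.
print(f"{ops_plus(0.363, 0.462, league_obp, league_slg):.0f}")
```

Because being 10% better in both components shows up as roughly 20 points of OPS+, the stat spreads hitters out a bit more than a plain OPS ratio would.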
Once again, there is no statistically significant evidence to suggest a correlation. I only included OPS+ because it is a more complete measure of total offensive ability than AVG, but didn't really expect there to be any strong correlation. Now that we have established that offensive and defensive ability are not inherently linked, we can move on to the section comparing speed indicators on offense to OF defense.
SB vs. Rate2
If Morgan, et al. are correct, then it stands to reason that the ability to steal a base should predict the ability to field the ball well in CF, since both rely on the same fundamental skill: footspeed. My expectation was that there would be a weakly positive relationship between the two skills, as it makes some intuitive sense that great base-stealers and CFs need to be fleet of foot, but this was where I was most surprised.
The strongest negative relationship of any of the data I examined came from the SB set. Notice the high outliers for SBs on the right end of the graph. The owners of the four highest SB totals were all substantially below average defenders. In other words, if good base stealers have to be fast, then speed actually predicts a worse defensive CF, contrary to every bit of conventional wisdom and intuition. Perhaps I'm playing fast and loose here with the definition of a good base stealer, but we'll look more carefully at that later.
Specifically, the correlation between the two sets of data is -17.3%, not strongly negative, but surprising nonetheless. Another way to explain it is to say that the portion of the variation in defensive ability that can be accounted for by SB totals (the r-squared value) is .03, or about 3%.
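The step from the correlation to the r-squared value, and a rough check on whether a correlation that size means anything, can be sketched like this. The sample size of 100 is an assumption taken from the figure mentioned in the conclusion, and the helper names are my own.

```python
from math import sqrt

def r_squared(r):
    """Share of the variance in Y accounted for by a one-variable linear fit."""
    return r * r

def t_statistic(r, n):
    """t statistic for testing 'no linear correlation', with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

r_sb = -0.173   # SB vs. Rate2 correlation reported above
n = 100         # ASSUMPTION: sample size mentioned in the conclusion

print(round(r_squared(r_sb), 3))       # share of Rate2 variance tied to SB totals
print(round(t_statistic(r_sb, n), 2))  # compare against roughly +/-1.98 (5%, two-sided)
```

Under that assumption the t statistic comes out around -1.74, short of the roughly 1.98 needed for significance at the usual 5% level, which is why I keep calling these relationships surprising rather than proven.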
Triples vs. Rate2
Assume for a moment that base stealing has nothing to do with speed, and it just depends on a runner's ability to read the pitcher's motion. Far-fetched as that may sound, there is probably some kernel of truth in it. If that is the case, perhaps triples are a better measure of speed and would do more to predict a player's ability in the outfield. After all, there are examples of less speedy players with respectable SB totals, but very few leadfooted triplers. In Twins terms, think of Torii Hunter stealing bases by taking off extremely early to make him much faster than he really is. For the triples argument, imagine Fatthew Lecroy chugging around second to try to waddle out a three-bagger (or just remember when it actually happened and smile).
Just like Steve Buscemi in Fargo, this graph only strikes me as kinda funny lookin'. Because there is little variation in triples (it's extremely rare for anyone to go above 20) and the values can only be integers (thank you, Donny), it ends up looking like a poorly completed standardized test. Once you get past the initial chortle over the graph's shape, you'll notice that there is another slightly negative correlation between a speed stat and defense. Again, not statistically significant, but surprising anyway.
SB% vs. Rate2
Another possible explanation for the data in the previous two graphs is that the SB and the triple are dangerous stats that require potentially reckless behavior to accomplish. Runners will often stay at second in the name of caution rather than try to stretch a double, which is often the smart play: the extra base adds some run probability, but the high chance of being thrown out will often outweigh it. In that respect, perhaps a smart player is a better defensive player. If that hypothesis were true, then stats that indicate intelligence in a player would predict defense, which makes some sense considering the importance of taking good paths to the ball, knowing when and where to make throws, etc. SB success rate is one of those stats that supposedly indicates intelligence, so let's see how that varies with defense.
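To put the risk-versus-reward idea in numbers, here is a small sketch of SB% and of the break-even success rate a runner needs for an attempt to pay off. The run values below are hypothetical placeholders, not measured run expectancies, and the function names are mine.

```python
def success_rate(sb, cs):
    """SB% = SB / (SB + CS)."""
    attempts = sb + cs
    if attempts == 0:
        raise ValueError("no stolen base attempts")
    return sb / attempts

def break_even_rate(run_gain, run_cost):
    """Success rate p where p * run_gain equals (1 - p) * run_cost."""
    return run_cost / (run_gain + run_cost)

# HYPOTHETICAL run values, for illustration only (not measured figures):
RUN_GAIN = 0.2   # runs added by taking the extra base
RUN_COST = 0.4   # runs lost by making an out on the bases

needed = break_even_rate(RUN_GAIN, RUN_COST)
actual = success_rate(sb=30, cs=10)   # made-up season line
print(f"needs {needed:.1%}, runner succeeds {actual:.1%}")
```

With those placeholder values an out costs twice what the extra base gains, so a runner has to succeed about two times in three just to break even, which is the sense in which a high SB% could signal a smart player.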
Surprisingly to me, this graph shows another slightly negative correlation, this time to the tune of -4.5%. At the risk of sounding like a broken record, it's not a statistically significant relationship, but it is surprising, since a strongly positive relationship would seem natural to anyone who watches baseball. Someone who steals and does it well should be able to make a good CF since he's fast and smart. But there must be some other skill at play, since the relationship just isn't there.
BB rate vs. Rate2
Walk rate is a similar "smart stat" to SB%, but it takes a couple of additional factors into consideration. First, someone who draws lots of walks should have good vision at the plate to see the pitches and not swing at them. Vision would be valuable as a CF in terms of seeing the ball off of the bat and picking it up. Also, BB rate could speak to a player's ability to react quickly, choosing not to swing in a fraction of a second. Such a skill would help a fielder in every stage of fielding the ball, from picking it up off of the bat, to making a good path to the ball and finally actually making the catch. But is there a relationship here?
Well, we're making some progress here, as the thin red line goes in the right direction, but it's still a random smattering of points with no real relationship. To put that in more concrete terms, there's only a 9.8% correlation, and BB rate explains slightly less than 1% of the variance in Rate2.
Conclusion
Maybe the lack of a relationship between offense and defense is because I used a park- and league-adjusted defensive stat but mostly unadjusted offensive ones. Maybe it's because the sample size was 100 players and that was too small. Maybe I somehow picked the only speed/smart stats that don't correlate with defense. But the more likely story is that people who use offensive ability to predict defensive ability fundamentally misunderstand what it takes to play in the OF. Since the relationship between pure offensive ability (OPS+) and rate2 is actually stronger than that of speed or smart stats, there is no statistical reason to believe that anything a player does in the top half of the inning will predict what he does in the bottom half.