Is Batting Average Bad for Your Health?

Matt Bruce

In a recent article here, Dave Paisley showed a linear relation between On-Base Percentage Plus Slugging (OPS) and Runs Created (RC). He suggested, to my complete agreement, that it's far easier to add two numbers together than to retake algebra. Forgive me in advance, then, for putting some especially devious algebra (and some probability) on the table here.

The fact that OPS and RC were the competing elements suggests a high level of sophistication among you readers. I like that. I wish my friends and colleagues were so cultured. Unfortunately I live and work with people who stick to the Triple Crown numbers. (That's batting average, home runs and RBI for those of you who have been reading Bill James too long.)

The relative importance of on-base percentage and slugging percentage may be interesting, but it is the relative importance of batting average -- if any -- that is worth demonstrating to the potential statheads among the masses. To that end, imagine two theoretical players. Both have a .800 OPS, including a .400 on-base percentage and a .400 slugging percentage.

One, we'll call him the Singles Guy, has a .400 batting average. That is, he singles twice per five plate appearances but otherwise makes outs. The other, call him Slumping Thome, has a .100 batting average. That is, in 30 plate appearances he will draw 10 walks, hit two home runs and make 18 outs. If the Singles Guy isn't out-producing Slumping Thome, then traditional baseball thought is almost completely irrelevant. If batting average is worth anything, then the .400 hitter should run rings around the .100 hitter, no?
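For anyone who wants to check that the two hitters really do share an OBP and SLG, here is a minimal Python sketch. The function name slash_line and its parameters are made up for illustration, and it assumes the only outcomes are the four used in this article (no sacrifices, hit-by-pitches or other hits).

```python
from fractions import Fraction


def slash_line(singles, homers, walks, outs):
    """Return (AVG, OBP, SLG) for a hitter with only these four outcomes."""
    plate_appearances = singles + homers + walks + outs
    at_bats = plate_appearances - walks      # walks do not count as at-bats
    hits = singles + homers
    total_bases = singles + 4 * homers
    avg = Fraction(hits, at_bats)
    obp = Fraction(hits + walks, plate_appearances)
    slg = Fraction(total_bases, at_bats)
    return avg, obp, slg


# Singles Guy: two singles per five plate appearances.
singles_guy = slash_line(singles=2, homers=0, walks=0, outs=3)
# Slumping Thome: 10 walks, two home runs and 18 outs per 30 plate appearances.
slumping_thome = slash_line(singles=0, homers=2, walks=10, outs=18)

for name, (avg, obp, slg) in [("Singles Guy", singles_guy),
                              ("Slumping Thome", slumping_thome)]:
    print(f"{name}: AVG {float(avg):.3f}  OBP {float(obp):.3f}  "
          f"SLG {float(slg):.3f}  OPS {float(obp + slg):.3f}")
```

Using exact fractions keeps rounding out of the comparison, so the two players' OBP and SLG can be compared for strict equality.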

To make my thought experiment as simple as possible I imagined two different lineups, each stocked with clones of the same type of player. The only four possible outcomes of a plate appearance in this world are outs, walks, singles and home runs. Runners go from first to second on a single but score from second or third. This is "station-to-station ball" for mathematical simplicity. If you think that "taking the extra base" is too big an offensive factor to ignore, then write to me with math to back up the assertion.

How many runs will each lineup expect to score in an inning? You can solve this algebraically by working backwards. If a Singles Guy steps up to the plate with two on and two out, there is a 60% chance that he will end the inning but a 40% chance of a hit that will put one more run on the board and restore the same situation. Let X be the number of runs that the Singles Guys should expect to score after this situation.

X=0+.4*(X+1)
X=.4X+.4
.6X=.4
X=0.667

For the Singles Guys, a two-on, two-out situation leads to an expected run output of 0.667. (Note: This is how many runs they expect to score and not just the probability of scoring at least once.) From there we move to one on, two out. Let Y...oh never mind, here are the spreadsheet numbers:

Singles Guys
        0 Outs   1 Out    2 Outs
2 On    2.0000   1.3333   0.6667
1 On    1.2160   0.6933   0.2667
0 On    0.6912   0.3413   0.1067

Slumping Thomes
        0 Outs   1 Out    2 Outs
3 On    2.5745   1.7719   0.9185
2 On    1.7905   1.1319   0.5185
1 On    1.2225   0.7319   0.3185
0 On    0.7985   0.4519   0.1852
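If you don't have the spreadsheet handy, the same numbers can be rebuilt with a short Python sketch. It is not the spreadsheet itself, just one way to solve the same equations: it applies the "expected runs = chance of each outcome times (runs scored now + expected runs afterward)" rule over and over until the values settle. The function names are hypothetical. It also assumes, beyond the station-to-station rule for singles, that a walk advances only forced runners and a home run clears the bases, so the bases always fill in order from first.

```python
def run_expectancy(p_single, p_walk, p_homer, sweeps=200):
    """Expected runs for the rest of the inning, keyed by (runners, outs)."""
    p_out = 1.0 - p_single - p_walk - p_homer
    # Three outs end the inning, so every (runners, 3) entry stays at 0.
    exp_runs = {(n, o): 0.0 for n in range(4) for o in range(4)}
    for _ in range(sweeps):                  # repeat until the values settle
        for outs in (2, 1, 0):
            for n in range(4):
                # Single: runners on second and third score, first goes to second.
                single = max(n - 1, 0) + exp_runs[(min(n + 1, 2), outs)]
                # Walk: forces in a run only when the bases are loaded.
                walk = (1 if n == 3 else 0) + exp_runs[(min(n + 1, 3), outs)]
                # Home run: everybody scores and the bases are empty.
                homer = (n + 1) + exp_runs[(0, outs)]
                out = exp_runs[(n, outs + 1)]
                exp_runs[(n, outs)] = (p_single * single + p_walk * walk
                                       + p_homer * homer + p_out * out)
    return exp_runs


def print_table(title, table, max_runners):
    print(title)
    print("       0 Outs  1 Out   2 Outs")
    for n in range(max_runners, -1, -1):
        row = "  ".join(f"{table[(n, outs)]:.4f}" for outs in range(3))
        print(f"{n} On   {row}")


singles_guys = run_expectancy(p_single=0.4, p_walk=0.0, p_homer=0.0)
slumping_thomes = run_expectancy(p_single=0.0, p_walk=1 / 3, p_homer=1 / 15)

print_table("Singles Guys", singles_guys, max_runners=2)
print_table("Slumping Thomes", slumping_thomes, max_runners=3)
```

Repeated sweeps stand in for the "working backwards" algebra: each pass uses the previous pass's values on the right-hand side, and because every plate appearance has a 60% chance of an out, the values converge quickly.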

The startling conclusion here is that a lineup full of Slumping Thomes should expect to score more runs than a lineup full of Singles Guys, even though every player involved has the same OBP and SLG and the Thomes have far lower batting averages. Multiplying the 0-on, 0-out figures by nine innings, the Singles Guys will score about 6.2 runs a game and the Thomes about 7.2. Over a full season, an extra run a game can turn a .500 team into a division champion.

Fans of little-ball will ask, "Ah, but isn't the higher output a result of the potential 'big inning' from the Thomes?" Suppose you're down a run going into the bottom of the ninth. Which lineup is more likely to tie the game? Believe it or not, it's still the Thomes. The algebra (or the spreadsheet formula) is simpler here. The Thomes have a .3440 chance of scoring at least one run in a given inning while the Singles Guys have just a .3174 chance of scoring at least once.
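The at-least-one-run odds can be checked the same way, by tracking the chance of a scoreless inning instead of expected runs. This is a sketch under the same assumptions as the rest of the thought experiment (station-to-station singles, walks advance only forced runners, any home run scores), and chance_of_scoring is a made-up name.

```python
def chance_of_scoring(p_single, p_walk, p_homer, sweeps=50):
    """Probability of at least one run in an inning, starting 0 on, 0 out."""
    p_out = 1.0 - p_single - p_walk - p_homer
    # shutout[(runners, outs)] = chance that no run scores from here on.
    # Once the third out is made without a run, the shutout is certain.
    shutout = {(n, o): (1.0 if o == 3 else 0.0)
               for n in range(4) for o in range(4)}
    for _ in range(sweeps):
        for outs in (2, 1, 0):
            for n in range(4):
                # A single scores a run as soon as two runners are aboard.
                single = shutout[(n + 1, outs)] if n < 2 else 0.0
                # A walk scores a run only with the bases loaded.
                walk = shutout[(n + 1, outs)] if n < 3 else 0.0
                # A home run always scores, so it adds nothing to the shutout.
                out = shutout[(n, outs + 1)]
                shutout[(n, outs)] = (p_single * single + p_walk * walk
                                      + p_out * out)
    return 1.0 - shutout[(0, 0)]


singles_chance = chance_of_scoring(p_single=0.4, p_walk=0.0, p_homer=0.0)
thomes_chance = chance_of_scoring(p_single=0.0, p_walk=1 / 3, p_homer=1 / 15)

print(f"Singles Guys score at least once:    {singles_chance:.4f}")
print(f"Slumping Thomes score at least once: {thomes_chance:.4f}")
```

The Singles Guys need three singles before three outs to push a run across, while the Thomes can score either with one swing or with four walks, which is where their edge comes from.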

Just for kicks I ran some simulated games between the two teams. Like anything involving random numbers, the results are streaky. Both teams will win four or five in a row every now and then. My results (available on request if you really want): The Thomes win 59 out of 100 games.
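If you'd like to run your own season, here is a rough Monte Carlo sketch in Python. It is not the simulation behind the 59-of-100 figure, and several details are my assumptions: ties after nine innings are settled with extra innings, both teams always bat all nine, and the seed, game count and function names are arbitrary.

```python
import random

SINGLES_GUYS = (0.4, 0.0, 0.0)           # (p_single, p_walk, p_homer)
SLUMPING_THOMES = (0.0, 1 / 3, 1 / 15)


def simulate_inning(rng, p_single, p_walk, p_homer):
    """Play one half-inning and return the runs scored."""
    runners, outs, runs = 0, 0, 0        # bases always fill in order from first
    while outs < 3:
        roll = rng.random()
        if roll < p_single:
            runs += max(runners - 1, 0)  # runners on second and third score
            runners = min(runners + 1, 2)
        elif roll < p_single + p_walk:
            if runners == 3:
                runs += 1                # bases-loaded walk forces in a run
            else:
                runners += 1
        elif roll < p_single + p_walk + p_homer:
            runs += runners + 1          # home run clears the bases
            runners = 0
        else:
            outs += 1
    return runs


def first_team_wins(rng, team_a, team_b):
    """Nine innings each, then extra innings until the tie is broken."""
    score_a = sum(simulate_inning(rng, *team_a) for _ in range(9))
    score_b = sum(simulate_inning(rng, *team_b) for _ in range(9))
    while score_a == score_b:
        score_a += simulate_inning(rng, *team_a)
        score_b += simulate_inning(rng, *team_b)
    return score_a > score_b


rng = random.Random(2024)                # fixed seed so reruns match
games = 10_000
thome_wins = sum(first_team_wins(rng, SLUMPING_THOMES, SINGLES_GUYS)
                 for _ in range(games))
thome_win_rate = thome_wins / games
print(f"Thomes win {thome_wins} of {games} games ({thome_win_rate:.1%})")
```

A 100-game sample is small enough to wander several games in either direction, so a larger run like this one gives a steadier read on how big the Thomes' edge really is.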

The data suggests an even stronger conclusion than I would have professed before. Hardcore statheads readily agree that batting average is useless as a stat. It's far bolder to suggest that evaluating players based on batting average is counterproductive. Yet here it is. I say that even between two players with the same On-Base % and same Slugging %, the player with the lower batting average is the one you want.

Tell me why I'm wrong, but if you do, back it up with math.

Matt Bruce is currently on tour with the Up With America troupe, although his lilting rendition of "Rex Hudler was a Career Scrub" isn't exactly bringing down the house. Suggest something more snappy, like "I (Heart) Andro," at