## The 600 Home Run Almanac

July 28, 2010

Posted by tomflesher in Baseball, Economics.

People are interested in players who hit 600 home runs, at least judging by the Google searches that point people here. With that in mind, let’s take a look at some quick facts about the 600th home run and the people who have hit it.

Age: Six players have hit #600. Sammy Sosa was the oldest at 39 years old in 2007. Ken Griffey, Jr. was 38 in 2008, as were Willie Mays in 1969 and Barry Bonds in 2002. Hank Aaron was 37. Babe Ruth was the youngest at 36 in 1931. Alex Rodriguez, who is 35 as of July 27, will almost certainly be the youngest player to reach 600 home runs. If both Manny Ramirez and Jim Thome hang on to hit #600 over the next two to three seasons, Thome (who was born in August of 1970) will probably be 42 in 2012; Ramirez (born in May of 1972) will be 41 in 2013. (In an earlier post that’s when I estimated each player would hit #600.) If Thome holds on, then, he’ll be the oldest player to hit his 600th home run.

Productivity: Since 2000 (which encompasses Rodriguez, Ramirez, and Thome in their primes), the league-average rate of home runs per plate appearance has been about .028. That is, a home run was hit in about 2.8% of plate appearances. Over the same time period, Rodriguez’ rate was .064 – more than double the league average. Ramirez hit .059 – again, over double the league rate. Thome, for his part, hit at a rate of .065 home runs per plate appearance. From 2000 to 2009, then, Thome was slightly more productive than Rodriguez.

Standing Out: Obviously it’s unusual for them to be that far above the curve. There were 1,877,363 plate appearances (trials) from 2000 to 2009. The margin of error (strictly speaking, the standard error) for a proportion like the rate of home runs per plate appearance is

$\sqrt{\frac{p(1-p)}{n-1}} = \sqrt{\frac{.028(.972)}{1,877,362}} = \sqrt{\frac{.027}{1,877,362}} \approx \sqrt{\frac{14}{1,000,000,000}} = .00012$

Ordinarily, we expect a random individual chosen from the population to land within the space of $p \pm 1.96 \times MoE$ 95% of the time. That means our interval is

$.028 \pm .00024$

All three players are well outside that confidence interval. (However, it’s likely that home run hitting is highly correlated with other factors that make this test less useful than it is in other situations.)
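For anyone who wants to check the arithmetic, here is a minimal Python sketch of the interval calculation. It uses only the figures quoted above; the function and variable names are my own, made up for illustration, and it assumes the same simple proportion formula with $n - 1$ in the denominator.

```python
import math


def proportion_interval(p, n, z=1.96):
    """Return (moe, low, high) for a proportion p observed over n trials.

    Uses the same formula as the text: sqrt(p * (1 - p) / (n - 1)),
    then widens it by z = 1.96 for a 95% interval.
    """
    moe = math.sqrt(p * (1 - p) / (n - 1))
    half_width = z * moe
    return moe, p - half_width, p + half_width


# Figures quoted in the post (2000 to 2009).
LEAGUE_RATE = 0.028
PLATE_APPEARANCES = 1_877_363
PLAYER_RATES = {"Rodriguez": 0.064, "Ramirez": 0.059, "Thome": 0.065}

moe, low, high = proportion_interval(LEAGUE_RATE, PLATE_APPEARANCES)
print(f"MoE: {moe:.5f}, interval: {low:.5f} to {high:.5f}")

for name, rate in PLAYER_RATES.items():
    # How many times the league rate, and whether the rate sits above the interval.
    print(f"{name}: {rate / LEAGUE_RATE:.2f}x league rate, above interval: {rate > high}")
```

Each player's rate comes out more than twice the league rate and far above the top of the interval, which is the point of the comparison.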

Alex’s Drought: Finally, just how likely is it that Alex Rodriguez will go this long without a home run? He hit his last home run in his fourth plate appearance on July 22. He had a fifth plate appearance in which he doubled. Since then, he’s played in five games totalling 22 plate appearances, so he’s gone 23 plate appearances without a home run. Assuming his rate of .064 home runs per plate appearance, how likely is that? We’d expect (.064*23) = about 1.5 home runs in that time, but how unlikely is this drought?

The binomial distribution is used to model strings of successes and failures in tests where we can say clearly whether each trial ended in a “yes” or “no.” We don’t need to break out that tool here, though – if the probability of a home run is .064, the probability of anything else is .936. The likelihood of a string of 23 non-home runs is

$.936^{23} = .218$

In other words, if Rodriguez were hitting at his usual rate, a drought this long would happen only about 22% of the time. The better guess is that, as Rodriguez has said, he’s distracted by the switch to marked baseballs and the media pressure to finally hit #600.
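The drought arithmetic is easy to reproduce. The sketch below is only an illustration: the function names are hypothetical, and it assumes every plate appearance is an independent trial with the same .064 chance of a home run, which is the same assumption the calculation above makes. It also confirms that the shortcut matches the binomial probability of zero home runs in 23 trials.

```python
import math


def drought_probability(hr_rate, plate_appearances):
    """Probability of zero home runs in a run of independent plate appearances."""
    return (1 - hr_rate) ** plate_appearances


def binomial_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


HR_RATE = 0.064   # Rodriguez' home runs per plate appearance, 2000 to 2009
DROUGHT_PA = 23   # plate appearances since his last home run

expected_hr = HR_RATE * DROUGHT_PA
p_drought = drought_probability(HR_RATE, DROUGHT_PA)
p_binomial = binomial_pmf(0, DROUGHT_PA, HR_RATE)

print(f"Expected home runs in {DROUGHT_PA} PA: {expected_hr:.2f}")
print(f"Chance of no home runs (shortcut): {p_drought:.3f}")
print(f"Chance of no home runs (binomial): {p_binomial:.3f}")
```

With $k = 0$ the binomial coefficient and the $p^k$ term both equal 1, so the binomial formula collapses to $.936^{23}$. That is why the full distribution isn't needed here.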