Trends in DH use
June 11, 2010
Posted by tomflesher in Baseball, Economics.
Tags: Baseball, baseball-reference.com, designated hitter, economics, Interleague play, Mets, regression, sports economics, Stuff Keith Hernandez Says
Last night, Keith Hernandez was talking about how the Mets are scheduled to play in American League parks starting, well, today. He pointed out that the Mets will be in a bit of a pickle because they aren’t built, as AL teams are, to carry one big hitter to be the full-time DH. Instead, an NL team will be forced to spread the wealth among lighter hitters who are carried for their defensive acumen as well as their offensive prowess. Keith then corrected himself and said that AL managers are using the DH differently – to rest individual players instead of having an everyday DH.
That pinged my “Stuff Keith Hernandez says” meter, and so I decided to crunch some numbers and see if that’s true. I interpreted Keith’s statement as implying that the number of designated hitters should be increasing, since managers are moving away from an everyday DH and toward spreading the DH assignments around a bit more. The crunching also needs to account for interleague play, which should obviously increase the number of DHes. So, after controlling for interleague play, does DH use show an increasing trend with time?
To set up the regression, I modified an existing data set to include a variable for the number of players with at least one at-bat as a designated hitter (culled from baseball-reference.com/play-index). B-R.com didn’t have a listing for 1973, so I noted that 1974 had 106 DHes and 1975 had 107 and made an educated guess (one that would be consistent with Keith’s statement) that 1973 had 105. Then, I added a binary variable Inter, which takes value 1 if there was interleague play that year and 0 otherwise. Next, I created time variables DHt (starts at 1 in 1973 and increases by 1 each year) and Intert (starts at 1 in 1997 and increases by 1 each year), along with the squares of both. My dependent variable is the number of players with at least one at-bat as a designated hitter (DHes) divided by the number of teams playing with the DH rule (DHTms). Finally, armed with this dataset, I pushed the numbers through R and came out with this result:
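For anyone who wants to rebuild the regressors, here is a minimal Python sketch of the variable construction described above. The regression itself was run in R; this is only an illustration, and the function names are made up for the example. Coding Intert as 0 in seasons before interleague play is an assumption, since the post only says where it starts.

```python
# First seasons are taken from the post; everything else is illustrative.
DH_START = 1973            # DHt = 1 in this season
INTERLEAGUE_START = 1997   # Intert = 1 in this season


def time_variables(year):
    """Return the time and dummy regressors for a single season."""
    dht = year - DH_START + 1
    inter = 1 if year >= INTERLEAGUE_START else 0
    # Assumption: Intert is 0 before interleague play begins.
    intert = year - INTERLEAGUE_START + 1 if inter else 0
    return {
        "DHt": dht,
        "DHtsq": dht ** 2,
        "Inter": inter,
        "Intert": intert,
        "Intertsq": intert ** 2,
    }


def dh_per_team(dhes, dh_teams):
    """Dependent variable: players with a DH at-bat per team using the DH rule."""
    return dhes / dh_teams


for year in (1973, 1996, 1997):
    print(year, time_variables(year))
```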
Term | Estimate | Std Error | t value | p value | Signif |
B0 | 0.00483 | 0.06735 | 0.07200 | 0.94295 | 0.05706 |
DHt | -0.19479 | 0.07961 | -2.44700 | 0.01610 | 0.98390 |
DHtsq | 0.00600 | 0.00299 | 2.00600 | 0.04753 | 0.95247 |
DHTms | 0.74367 | 0.03300 | 22.53400 | 0.00000 | 1.00000 |
Inter | 3.08814 | 0.65227 | 4.73400 | 0.00001 | 0.99999 |
Intert | 0.44171 | 0.19733 | 2.23800 | 0.02734 | 0.97266 |
Intertsq | -0.04639 | 0.01321 | -3.51200 | 0.00066 | 0.99934 |
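As a quick sanity check on how the columns relate, the sketch below recomputes each t value as the estimate divided by its standard error. It also prints 1 minus the p value, which appears to be what the Signif column holds. The numbers are copied from the table; the small differences from the reported t values come from the table's rounding.

```python
# (estimate, std error, reported t value, reported p value), copied from the table
rows = {
    "B0":       (0.00483, 0.06735, 0.072, 0.94295),
    "DHt":      (-0.19479, 0.07961, -2.447, 0.01610),
    "DHtsq":    (0.00600, 0.00299, 2.006, 0.04753),
    "DHTms":    (0.74367, 0.03300, 22.534, 0.00000),
    "Inter":    (3.08814, 0.65227, 4.734, 0.00001),
    "Intert":   (0.44171, 0.19733, 2.238, 0.02734),
    "Intertsq": (-0.04639, 0.01321, -3.512, 0.00066),
}


def t_stat(estimate, std_error):
    """t value for the null hypothesis that the coefficient is zero."""
    return estimate / std_error


for name, (est, se, t_reported, p_reported) in rows.items():
    print(f"{name:9s} t = {t_stat(est, se):8.3f} "
          f"(reported {t_reported:8.3f})   1 - p = {1 - p_reported:.5f}")
```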
Some caveats are in order. First of all, according to a Breusch-Pagan test, the error terms are absolutely heteroskedastic (that is, their variance isn’t constant; it changes with something in, or missing from, my data). Second, I have an R² of .9884, meaning that the model explains almost 99% of the variance in DHes per team. That’s a lot of explanatory value, and usually means you’re doing a regression that looks like “Right shoes = B0 + B1 Price + B2 Left shoes + error term” – that is, one where some obviously highly correlated term is doing nearly all of the work. I’m not sure what that term might be, though. Also, there isn’t really enough data from interleague play to run robust time series analysis on it.
However, we can make some statements. First of all, interleague play adds about 43 designated hitters, or about 2.68 per National League team, although that probably varies by the number of series played. Second, holding everything else fixed, DHes per team decreased over time until they hit a minimum around 1989 and then began increasing again. What do you know? Keith might have been right after all.
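The arithmetic behind those two statements can be sketched from the table's coefficients. This is a reconstruction, not necessarily how the figures were originally computed. The team counts (14 American League, 16 National League) are an assumption, and the turning point uses only the DHt and DHtsq terms, ignoring the interleague time trend.

```python
# Coefficients copied from the regression table
B_DHT = -0.19479
B_DHTSQ = 0.00600
B_INTER = 3.08814

# Assumptions, not stated in the post: league sizes at the time of writing
AL_TEAMS = 14
NL_TEAMS = 16

# Vertex of the quadratic B_DHT * t + B_DHTSQ * t**2 (a minimum because B_DHTSQ > 0)
t_min = -B_DHT / (2 * B_DHTSQ)
year_min = 1973 + (t_min - 1)   # DHt = 1 corresponds to 1973

# Inter is measured in DHes per DH team, so scale up by the number of AL teams
extra_dhes = B_INTER * AL_TEAMS
extra_per_nl_team = extra_dhes / NL_TEAMS

print(f"turning point at DHt = {t_min:.2f}, i.e. about {year_min:.1f}")
print(f"interleague play: about {extra_dhes:.1f} extra DHes, "
      f"{extra_per_nl_team:.2f} per NL team")
```

Under these assumptions the turning point lands in the late 1980s, and the interleague effect comes out near the 43 extra DHes quoted above. Rounding to 43 before dividing by 16 gives roughly the per-team figure in the text.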