## Hit Batsman Roundup, 2010
*December 26, 2010*

*Posted by tomflesher in Baseball.*

Tags: Brett Carroll, hit batsman, hit by pitch, Hunter Pence, Kevin Youkilis, Omar Infante, Raul Ibanez, regression, Rickie Weeks, Scott Podsednik, spurious correlation, Victor Martinez


There’s very little more subtle and involved than the quiet elegance of a batter getting beaned. In fact, that particular strategy was invoked 1549 times in 2010, with 419 batters getting plunked at least once.

The absolute leader this season was not **Kevin Youkilis** or **Brett Carroll** but **Rickie Weeks**, who led with 25 HBP in 754 plate appearances. Put another way, Weeks got hit in 3.32% of his plate appearances. That’s almost once every 30 plate appearances, or nearly four times the MLB-wide rate of 0.83% of the time. (Incidentally, that’s total HBP divided by total plate appearances. The more skewed mean percentage is 0.58%.) What leads to such a high number of plunkings?
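The arithmetic behind those figures is easy to check. The sketch below uses plain Python rather than the R used for the regression. The Weeks and Pence lines are the figures quoted in this post. The other two rows are made-up players, included only to illustrate how a pooled rate (total HBP over total PA) can differ from the mean of individual players' rates.

```python
# Weeks' 2010 line and the MLB-wide rate, as quoted in the post.
weeks_hbp, weeks_pa = 25, 754
weeks_rate = weeks_hbp / weeks_pa
league_rate = 0.0083

print(f"Weeks: {weeks_rate:.2%} of PA, one HBP every {weeks_pa / weeks_hbp:.1f} PA")
print(f"Ratio to league rate: {weeks_rate / league_rate:.2f}")

# (name, HBP, PA). Only the Weeks and Pence rows are real;
# the last two are hypothetical and exist purely for illustration.
players = [
    ("Rickie Weeks", 25, 754),
    ("Hunter Pence", 0, 658),
    ("hypothetical reliever", 0, 3),
    ("hypothetical call-up", 1, 40),
]

def pooled_rate(rows):
    """Total HBP divided by total PA: every plate appearance counts equally."""
    return sum(hbp for _, hbp, _ in rows) / sum(pa for _, _, pa in rows)

def mean_rate(rows):
    """Average of each player's own HBP/PA: every player counts equally."""
    return sum(hbp / pa for _, hbp, pa in rows) / len(rows)

print(f"Pooled: {pooled_rate(players):.4f}  Mean of rates: {mean_rate(players):.4f}")
```

In the mean of rates, a player with three plate appearances carries as much weight as an everyday starter, which is one plausible reason the two league-wide numbers differ.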

I would assume that a few things would go into the decision to hit a batter intentionally:

- Pitchers are less likely to be hit by other pitchers.
- If a hitter is likely to get on base anyway, he’s more likely to be hit – you don’t lose anything by putting him on base, and you control the damage by limiting him to one base.
- If a batter is likely to hit for extra bases, he’s more likely to be hit.
- If a batter is likely to steal a base, he’s less likely to be hit, but there is an offsetting effect for caught stealing.
- American League batters are more likely to be hit because of the moral hazard effect of pitchers not having to bat.

With that in mind, I set up a regression in R using every player who had at least one plate appearance in 2010. I added binary variables for Pitcher (1 if the player’s primary position is pitcher, 0 otherwise) and Lg (1 if the player played the entire season in the American League, 0 otherwise), then regressed *HBP/PA* on *Pitcher, Lg, BB, HR, OBP, SLG, SB,* and *CS*. The results were somewhat surprising:
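To make the variable coding concrete, here is a minimal Python sketch of how the two dummy variables and the response could be built, followed by an ordinary least squares fit on a cut-down model (intercept, Pitcher, Lg only). It is not the R code used for this post. The player records, the field names (`pos`, `leagues`, `hbp`, `pa`) and the small `ols` helper are all assumptions made for illustration. The full model would add BB, HR, OBP, SLG, SB and CS as further columns in the same way.

```python
def design_row(player):
    """Return [intercept, Pitcher, Lg] for one player record."""
    pitcher = 1.0 if player["pos"] == "P" else 0.0
    # Lg is 1 only if every league the player appeared in was the AL.
    lg = 1.0 if set(player["leagues"]) == {"AL"} else 0.0
    return [1.0, pitcher, lg]

def hbp_rate(player):
    """Response variable: HBP per plate appearance."""
    return player["hbp"] / player["pa"]

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical records, NOT real 2010 data.
players = [
    {"pos": "2B", "leagues": ["NL"], "hbp": 12, "pa": 600},
    {"pos": "1B", "leagues": ["AL"], "hbp": 6, "pa": 550},
    {"pos": "P", "leagues": ["NL"], "hbp": 0, "pa": 60},
    {"pos": "P", "leagues": ["AL"], "hbp": 0, "pa": 4},
    {"pos": "OF", "leagues": ["AL", "NL"], "hbp": 3, "pa": 400},
]

X = [design_row(p) for p in players]
y = [hbp_rate(p) for p in players]
intercept, b_pitcher, b_lg = ols(X, y)
print(intercept, b_pitcher, b_lg)
```

Note that a player traded between leagues mid-season gets `Lg = 0` under this coding, matching the "entire season in the American League" definition.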

    Call:
    lm(formula = hbppa ~ Pitcher + Lg + BB + HR + OBP + SLG + SB + CS)

    Residuals:
           Min         1Q     Median         3Q        Max
    -0.0154027 -0.0059081 -0.0018096  0.0001845  0.1397065

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  6.847e-03  9.815e-04   6.975 5.77e-12 ***
    Pitcher     -5.399e-03  9.136e-04  -5.909 4.81e-09 ***
    Lg          -1.614e-03  7.054e-04  -2.289   0.0223 *
    BB          -1.412e-05  3.257e-05  -0.434   0.6647
    HR           1.122e-04  7.956e-05   1.411   0.1587
    OBP          8.570e-03  3.477e-03   2.465   0.0139 *
    SLG         -3.451e-03  2.468e-03  -1.398   0.1624
    SB          -6.749e-05  8.693e-05  -0.776   0.4377
    CS           1.770e-04  2.646e-04   0.669   0.5036
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.01042 on 935 degrees of freedom
    Multiple R-squared:  0.08839,  Adjusted R-squared:  0.08059
    F-statistic: 11.33 on 8 and 935 DF,  p-value: 2.07e-15
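The summary figures at the foot of the output hang together, which is a useful sanity check. The Python sketch below recomputes the adjusted R-squared and the F-statistic from the reported multiple R-squared, using the standard textbook formulas. The only inputs are numbers printed in the output. The sample size is inferred from the residual degrees of freedom rather than stated anywhere in the post.

```python
r2 = 0.08839      # Multiple R-squared, from the output
k = 8             # number of regressors (excluding the intercept)
resid_df = 935    # residual degrees of freedom, from the output

# Inferred, not stated in the post: n = residual df + regressors + intercept.
n = resid_df + k + 1

# Adjusted R-squared penalizes R-squared for the number of regressors.
adj_r2 = 1 - (1 - r2) * (n - 1) / resid_df

# Overall F-statistic for the null that all slope coefficients are zero.
f_stat = (r2 / k) / ((1 - r2) / resid_df)

print(f"n = {n}")
print(f"Adjusted R-squared = {adj_r2:.5f}")
print(f"F = {f_stat:.2f} on {k} and {resid_df} DF")
```

Both recomputed values agree with the printed 0.08059 and 11.33 to the precision shown.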


That’s right – only *Pitcher, Lg,* and *OBP* are significant at the 5% level, and *HR* and *SLG* are only marginally significant (80% level). *BB, SB,* and *CS* aren’t even close. Why not?
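As a cross-check on which terms clear which threshold, the Python sketch below recomputes each t value as estimate divided by standard error and converts it to a two-sided p-value. One assumption to flag: it uses the normal approximation to the t distribution, which is reasonable with 935 degrees of freedom but will differ from R's figures in the third or fourth decimal place. The estimates and standard errors are copied from the regression output.

```python
import math

# (estimate, standard error) copied from the regression output.
coefs = {
    "Pitcher": (-5.399e-03, 9.136e-04),
    "Lg":      (-1.614e-03, 7.054e-04),
    "BB":      (-1.412e-05, 3.257e-05),
    "HR":      (1.122e-04, 7.956e-05),
    "OBP":     (8.570e-03, 3.477e-03),
    "SLG":     (-3.451e-03, 2.468e-03),
    "SB":      (-6.749e-05, 8.693e-05),
    "CS":      (1.770e-04, 2.646e-04),
}

def two_sided_p(t):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2))

def classify(p):
    if p < 0.05:
        return "significant at 5%"
    if p < 0.20:
        return "marginal (80% level)"
    return "not significant"

results = {}
for name, (est, se) in coefs.items():
    t = est / se
    p = two_sided_p(t)
    results[name] = classify(p)
    print(f"{name:8s} t = {t:6.3f}  p = {p:.4f}  {results[name]}")
```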

Well, for one, the number of stolen bases and times caught stealing are relatively small no matter what. There probably isn’t enough data. For another, there probably just isn’t as much intent to hit batters as we’d like to pretend.

Then there’s the league effect: American Leaguers are **less** likely to be hit, the opposite of what the moral hazard argument predicts. This baffles me a little bit.

Also, keep in mind that this model shouldn’t be expected to, and cannot, explain all or even most of the variation in hit batsmen. The R-squared is about .09, meaning that it explains about 9% of the variation. It ignores probably the most important factor, physics, entirely. (That is, the model doesn’t have any way to account for accidental plunkings.) As a side note, other regressions show there might be an effect for plate appearances, meaning you’re more likely to get hit by chance alone if you take enough pitches.
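The "by chance alone" point can be illustrated with a back-of-the-envelope calculation. The Python sketch below assumes, purely for illustration, that every plate appearance is an independent trial with the same MLB-wide 0.83% chance of a hit-by-pitch. That is a simplification this post's model does not make, and real batters clearly differ. Under that assumption, the chance of at least one plunking rises quickly with plate appearances.

```python
LEAGUE_RATE = 0.0083  # MLB-wide HBP/PA quoted in the post

def p_at_least_one(pa, rate=LEAGUE_RATE):
    """P(at least one HBP in `pa` plate appearances), assuming independent
    plate appearances with a constant per-PA rate (an illustrative assumption)."""
    return 1 - (1 - rate) ** pa

for pa in (10, 100, 300, 658):
    print(f"{pa:4d} PA: {p_at_least_one(pa):.1%} chance of at least one HBP")
```

On these assumptions a full-time player would be expected to get hit at least once in a season almost regardless of anyone's intent.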

Finally, there are some guys who manage to do the opposite of Weeks’ feat. Houston outfielder **Hunter Pence** went 156 games and 658 plate appearances without getting plunked at all. Honorable mentions go to **Raul Ibanez**, **Scott Podsednik**, **Victor Martinez**, and **Omar Infante**, all of whom went over 500 plate appearances without a beaning. Now THAT’S plate discipline.