How often should Youk take his base? June 30, 2010
Posted by tomflesher in Baseball, Economics.Tags: Baseball, baseball-reference.com, binomial distribution, Brett Carroll, Greek God of Take Your Base, hit batsmen, hit by pitch, Kevin Youkilis, R
Kevin Youkilis is sometimes called “The Greek God of Walks.” I prefer to think of him as “The Greek God of Take Your Base,” since he seems to get hit by pitches at an alarming rate. In fact, this year, he’s been hit 7 times in 313 plate appearances. (Rickie Weeks, however, is leading the pack with 13 in 362 plate appearances. We’ll look at him, too.) There are three explanations for this:
- There’s something about Youk’s batting or his hitting stance that causes him to be hit. This is my preferred explanation. Youkilis has an unusual batting grip that thrusts his lead elbow over the plate, and as he swings, he lunges forward, which exposes him to being plunked more often.
- Youkilis is such a hitting machine that he gets hit often in order to keep him from swinging for the fences. This doesn’t hold water, to me. A pitcher could just as easily put him on base safely with an intentional walk, so unless there’s some other incentive to hit him, there’s no reason to risk ejection by throwing at Youkilis. This leads directly to…
- Youk is a jerk. This is pretty self-explanatory, and is probably a factor.
First of all, we need to figure out whether it’s likely that Kevin is being hit by chance. To figure that out, we need to make some assumptions about hit batsmen and evaluate them using the binomial distribution. I’m also excited to point out that Youk has been overtaken as the Greek God of Take Your Base by someone new: Brett Carroll. (more…)
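As a rough sketch of the binomial check described above: if every plate appearance carried the same independent chance of a hit-by-pitch, the probability of being plunked at least k times in n plate appearances is a binomial tail. Only the 7-in-313 and 13-in-362 counts come from this post; the rate `ASSUMED_HBP_RATE` is a placeholder chosen purely for illustration, and `binomial_tail` is a hypothetical helper name.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    # Sum the probabilities of 0..k-1 successes, then take the complement.
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


# ASSUMPTION: a made-up league-wide HBP rate per plate appearance.
# Swap in a real figure before drawing any conclusion.
ASSUMED_HBP_RATE = 0.009

youkilis = binomial_tail(7, 313, ASSUMED_HBP_RATE)   # 7 HBP in 313 PA
weeks = binomial_tail(13, 362, ASSUMED_HBP_RATE)     # 13 HBP in 362 PA

print(f"P(at least 7 HBP in 313 PA)  = {youkilis:.5f}")
print(f"P(at least 13 HBP in 362 PA) = {weeks:.7f}")
```

A small tail probability would suggest that chance alone is a poor explanation, though the answer depends heavily on the assumed rate.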
Edwin Jackson, Fourth No-Hitter of 2010 June 25, 2010
Posted by tomflesher in Baseball, Economics.Tags: baseball-reference.com, BayesBall, Dallas Braden, Diamondbacks, Edwin Jackson, no-hitters, poisson distribution, Rays, Roy Halladay, Ubaldo Jimenez
Tonight, Edwin Jackson of the Arizona Diamondbacks pitched a no-hitter against the Tampa Bay Rays. That’s the fourth no-hitter of this year, following Ubaldo Jimenez and the perfect games by Dallas Braden and Roy Halladay.
Two questions come to mind immediately:
- How likely is a season with 4 no-hitters?
- Does this mean we’re on pace for a lot more?
The second question is pretty easy to dispense with. Taking a look at the list of all no-hitters (which interestingly enough includes several losses), it’s hard to predict a pattern. No-hitters aren’t uniformly distributed over time, so saying that we’ve had 4 no-hitters in x games doesn’t tell us anything meaningful about a pace.
The first is a bit more interesting. I’m interested in the frequency of no-hitters, so I’m going to take a look at the list of frequencies here and take a page from Martin over at BayesBall in using the Poisson distribution to figure out whether this is something we can expect.
The Poisson distribution takes the form
$$P(n) = \frac{\lambda^n e^{-\lambda}}{n!}$$
where $\lambda$ is the expected number of occurrences and we want to know how likely it would be to have $n$ occurrences based on that.
Using Martin’s numbers – 201506 opportunities for no-hitters and an average of 4112 games per season from 1961 to 2009 – I looked at the number of no-hitters since 1961 (120) and determined that an average season should return about 2.44876 no-hitters. That means
$$\lambda = \frac{120}{201506} \times 4112 \approx 2.44876$$
and
$$P(n) = \frac{2.44876^n \, e^{-2.44876}}{n!}$$
Above is the distribution. p is the probability of exactly n no-hitters being thrown in a single season of 4112 games; cdf is the cumulative probability, or the probability of n or fewer no-hitters; p49 is the predicted number of seasons out of 49 (1961-2009) that we would expect to have n no-hitters; obs is the observed number of seasons with n no-hitters; cp49 is the predicted number of seasons with n or fewer no-hitters; and cobs is the observed number of seasons with n or fewer no-hitters.
It’s clear that 4 or even 5 no-hitters is a perfectly reasonable number to expect.
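For anyone who wants to regenerate the predicted columns, here is a minimal sketch using only the figures quoted in this post (120 no-hitters, 201506 opportunities, 4112 games per season, 49 seasons). The observed counts (obs, cobs) are not reproduced, since they come from the season-by-season list; the function and variable names are my own.

```python
from math import exp, factorial

NO_HITTERS = 120          # no-hitters since 1961, as quoted above
OPPORTUNITIES = 201506    # opportunities for a no-hitter, as quoted above
GAMES_PER_SEASON = 4112   # average games per season, 1961-2009
SEASONS = 49              # 1961 through 2009

# Expected no-hitters in one season of 4112 games.
lam = NO_HITTERS / OPPORTUNITIES * GAMES_PER_SEASON


def poisson_pmf(n, lam):
    """Probability of exactly n events when lam events are expected."""
    return lam**n * exp(-lam) / factorial(n)


cdf = 0.0
print(" n       p     cdf    p49   cp49")
for n in range(8):
    p = poisson_pmf(n, lam)
    cdf += p  # running total: probability of n or fewer
    print(f"{n:2d}  {p:.4f}  {cdf:.4f}  {p * SEASONS:5.2f}  {cdf * SEASONS:5.2f}")
```

The p49 and cp49 columns are simply p and cdf scaled by the 49 seasons in the sample.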
At the other end… June 22, 2010
Posted by tomflesher in Baseball.Tags: Andre Ethier, As, Cedrick Bowers, Diamondbacks, Esmerling Vasquez, extra innings, free baseball, home runs, Joey Votto, Michael Wuertz, Ramon Hernandez, Rangers, Reds, Scott Rolen, weird lines
Although AJ Burnett had a bad first inning last night, the Oakland As had a bad tenth inning. After taking a 2-2 game into extra innings, the Cincinnati Reds knocked three out of the park against pitchers Michael Wuertz and Cedrick Bowers. The first was hit by Ramon Hernandez; Joey Votto and Scott Rolen went deep back to back. Although extra-inning home runs aren’t very rare (there have been 35 so far this year), only three pitchers have surrendered more than one, and neither of the other two (Chad Durbin and Matt Belisle) gave them both up on the same night.
Last year, everyone’s favorite balk-off artist, Arizona’s Esmerling Vasquez, gave up two home runs in extra innings against the Texas Rangers on June 25th. Those were two of 83 free-baseball homers in 2009. Extra-innings home runs are more common in the tops of innings, because in a tied game a home run for the home team is a walk-off whereas the road team will get the chance to capitalize on their momentum, but I would have expected the proportions to differ much more than they do. In 2009, for example, of those 83, only 44 were hit by the away team with 39 hit by the home team (and 33 of those 39 were game-enders).
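To put the 2009 split in percentage terms, here is a quick sketch using only the counts quoted above (the variable names are mine):

```python
# 2009 extra-inning home runs, as quoted in the post.
AWAY_HR = 44       # hit by the road team (tops of innings)
HOME_HR = 39       # hit by the home team (bottoms of innings)
WALK_OFF_HR = 33   # home-team homers that ended the game

total = AWAY_HR + HOME_HR
away_share = AWAY_HR / total
home_share = HOME_HR / total
walk_off_share = WALK_OFF_HR / HOME_HR  # share of home-team homers

print(f"away share:     {away_share:.1%}")
print(f"home share:     {home_share:.1%}")
print(f"walk-off share: {walk_off_share:.1%} of home-team homers")
```

The away and home shares land within a few points of an even split, which is the surprise noted above.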
So far, no batter has more than one extra-innings home run this year, but last year there were several. Andre Ethier led the pack with 3, with a bunch of batters who had 2.
AJ Burnett: Statistical Anomaly June 21, 2010
Posted by tomflesher in Baseball.Tags: 2 outs, Adam LaRoche, AJ Burnett, baseball-reference.com, first inning, home run, Justin Upton, Mark Reynolds, weird lines
Tonight, A.J. Burnett had a weird first inning in a game that’s still going on as I write this. He got the first two outs fairly easily, and then surrendered home runs to Justin Upton, Adam LaRoche, and Mark Reynolds. Before he knew it, he was down 5-0 in the bottom of the first. That can’t happen very often.
I queried Baseball-Reference.com’s event finder for home runs, then narrowed it down to first inning home-runs with two outs this year. Prior to tonight, there had been 82. None of them came in three-homer games – that answers that.
Just for fun, I checked 2009 as well. In total, there were 209 2-out, first-inning home runs in 2009. Only one of those home runs happened in a three-homer game, so it didn’t happen then, either.
Poor AJ.
Carlos Zambrano, Ace Pinch Hitter? June 21, 2010
Posted by tomflesher in Baseball.Tags: Baseball, baseball-reference.com, bullpen, Carlos Zambrano, Cubs, Joba Chamberlain, Lou Piniella, Micah Owings, RE24, relief, setup man, starter, Ubaldo Jimenez
Earlier this year, Chicago Cubs manager Lou Piniella experimented with moving starting pitcher and relatively big hitter Carlos Zambrano to the bullpen, briefly making him the Major Leagues’ best-paid setup man. Zambrano is back in the rotation as of the beginning of June. I’m curious what the effect of moving him to the bullpen was.
The thing is that not only is Zambrano an excellent pitcher (though he was slumping at the time), he’s also regarded as a very good hitter for a pitcher. He’s a career .237 hitter, with a slump last year at “only” .217 in 72 plate appearances (17th most in the National League), which was 6th in the National League among pitchers with at least 50 plate appearances. He didn’t walk enough (his OBP was 13th on the same list), but he was 9th of the 51 pitchers on the list in terms of Base-Out Runs Added (RE24) with about 5.117 runs below a replacement-level batter. Ubaldo Jimenez was also up there with a respectable .220 BA, .292 OBP, but -8.950 RE24.
It should be pointed out that pitcher RE24 is almost always negative for starters – the best RE24 on that list is Micah Owings with -2.069. Zambrano’s run contribution was negative, sure, but it was a lot less negative than most starters. Zambrano also lost a bit of flexibility as an emergency pinch hitter (something that Owings is going through right now due to his recent move to the bullpen) – he’s more valuable as a reliever, so they won’t use him to pinch hit. As a result, he loses at-bats, which not only keeps him from amassing hits but also lets him get rusty.
It’s hard to precisely value the loss of Zambrano’s contribution, although he’s already on pace for -6.1 batting RE24. It’s likely, in my opinion, that his RE24 will rise as he continues hitting over the course of the year. His pitching value is also negative, however, which is unusual. He’s always been very respectable among Cubs starters. It’s possible that although he was pitching very well in relief, the fact that he has the ability to go long means that it’s inefficient to use him as a reliever. This is the opposite of, say, Joba Chamberlain, who is overpowering in relief but struggles as a starter.
As a starter, Zambrano has never been a net loss of runs. He needs to stay out of the bullpen, and Joba needs to stay there.
Modeling Run Production June 19, 2010
Posted by tomflesher in Baseball, Economics.Tags: Baseball, economics, regression, run production, sports economics
A baseball team can be thought of as a factory which uses a single crew to operate two machines. The first machine produces runs while the team bats, and the second machine produces outs while the team is in the field. This is a somewhat abstract way to look at the process of winning games, because ordinarily machines have a fixed input and a fixed output. In a box factory, the input comprises man-hours and corrugated board, and the output is a finished box. Here, the input isn’t as well-defined.
Runs are a function of total bases, certainly, but total bases are functions of things like hits, home runs, and walks. Basically, runs are a function of getting on base and of advancing people who are already on base. Obviously, the best measure of getting on base is On-Base Percentage, and Slugging Average (expected number of bases per at-bat) is a good measure of advancement.
OBP wraps up a lot of things – walks, hits, and hit-by-pitch appearances – and SLG corrects for the greater effects of doubles, triples, and home runs. That doesn’t account for a few other things, though, like stolen bases, sacrifice flies, and sacrifice hits. It also doesn’t reflect batter ability directly, but that’s okay – the stats we have should represent batter ability since the defensive side is trying to prevent run production. The model might look something like this, then:
This is the simplest model we can start with – each factor contributes a discrete number of runs. If we need to (and we probably will), we can add terms to capture concavity of the marginal effect of different stats, or (more likely) an interaction term for SLG and, say, SB, so that a stolen base is worth more on a team where you’re more likely to be brought home by a batter because he’s more likely to give you extra bases. As it is, however, we can test this model with linear regression. The details of it are behind the cut. (more…)
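One linear specification consistent with the description above, assuming the regressors are exactly the five stats named (OBP, SLG, stolen bases, sacrifice flies, and sacrifice hits), might be written as follows. The coefficient symbols and the choice of regressors are illustrative assumptions, not necessarily the model estimated behind the cut.

```latex
% Baseline sketch: runs R as a linear function of the named stats.
R = \beta_0 + \beta_1\,\mathrm{OBP} + \beta_2\,\mathrm{SLG}
    + \beta_3\,\mathrm{SB} + \beta_4\,\mathrm{SF} + \beta_5\,\mathrm{SH}
    + \varepsilon

% Possible extension: an interaction term, so that a stolen base is
% worth more on a team with a higher slugging average.
R = \beta_0 + \beta_1\,\mathrm{OBP} + \beta_2\,\mathrm{SLG}
    + \beta_3\,\mathrm{SB} + \beta_4\,\mathrm{SF} + \beta_5\,\mathrm{SH}
    + \beta_6\,(\mathrm{SLG} \times \mathrm{SB}) + \varepsilon
```

In the second form the marginal value of a stolen base is $\beta_3 + \beta_6\,\mathrm{SLG}$, which is the sense in which a steal is worth more on a team that hits for extra bases.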
Leadoff Home Runs June 19, 2010
Posted by tomflesher in Baseball.Tags: baseball-reference.com, Jose Reyes, leadoff home runs, Mets, Nate McLouth, Phil Hughes, Subway Series, Yankees
Jose Reyes led off today’s Mets-Yankees game with a home run off Phil Hughes. That’s the eleventh leadoff home run of the year. That’s a little over half as many as there were last year on June 19, when Nate McLouth hit the 19th leadoff home run of 2009.
Last year, there were 51 leadoff home runs over roughly 6 months (early April through the first week of October), which puts uniformly distributed homers at 8.5 per month (so McLouth’s #19 on June 19 was about 2.25 behind pace). So far, with eleven over 2.5 months, that puts us on pace for 26.4, or, to be generous, about 30 leadoff home runs.
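The pace arithmetic above can be checked with a few lines. This is a sketch under the same simplifying assumptions as the text: a six-month season, homers spread evenly over it, and June 19 treated as 2.5 months in. The names are mine.

```python
MONTHS_IN_SEASON = 6.0    # "roughly 6 months", as in the text
MONTHS_ELAPSED = 2.5      # early April to June 19, as in the text


def full_season_pace(count_so_far, months_elapsed):
    """Extrapolate a season total, assuming an even monthly rate."""
    return count_so_far / months_elapsed * MONTHS_IN_SEASON


# 2009: 51 leadoff homers over the season, 19 of them by June 19.
rate_2009 = 51 / MONTHS_IN_SEASON                 # homers per month
expected_by_june_19 = rate_2009 * MONTHS_ELAPSED  # where an even pace would be
behind_pace_2009 = expected_by_june_19 - 19

# 2010: eleven leadoff homers through June 19.
pace_2010 = full_season_pace(11, MONTHS_ELAPSED)

print(f"2009 rate: {rate_2009:.1f} per month")
print(f"2009 gap on June 19: {behind_pace_2009:.2f} behind an even pace")
print(f"2010 pace: {pace_2010:.1f} leadoff home runs")
```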
The change probably isn’t indicative of anything other than chance, but in 2008 #24 of 52 came on June 20, and in 2007 they were already up to 28 of 59 by June 19. Over the past few years there’s been a slowing of leadoff home runs which may be due to chance or may be due to some other factor. Who knows? It’s way too small a sample to say anything about.
Appearances as Pitcher and DH June 17, 2010
Posted by tomflesher in Baseball.Tags: baseball-reference.com, Cardinals, designated hitter, Diamondbacks, Felipe Lopez, Jeff Kunkel, Mark Loretta, pitcher, Wade Boggs
Earlier this year, Felipe Lopez pitched in relief for the St. Louis Cardinals in their 20-inning game against the Mets. Last year, he also played DH during an interleague game for Arizona. That made me curious how many players have at least one appearance each at DH and pitcher. I generated this table at Baseball Reference to check.
Several of these – for example, the bottom two in the list – were pitchers who started games at DH to allow their managers to insert hitting specialists when the DH came up. This led to a rule that the DH has to come to bat at least once unless the opposing team changes pitchers.
More interesting are the three at the top of the list – Jeff Kunkel, Wade Boggs, and Mark Loretta – all of whom have two seasons in which they both DHed and pitched. Loretta pitched an inning in 2001 and a single out in 2009, with Kunkel pitching for Texas in 1988 and 1989 and Boggs pitching for the Yankees in 1997 and the Rays in 1999. Hopefully the Cards will find an excuse to DH Lopez at some point this year just to even things out.
June 15 Wins Above Expectation June 16, 2010
Posted by tomflesher in Baseball.Tags: Angels, Baseball, Rays, Tigers, wins above expectation
Wins Above Expectation is a statistic determined using team wins and the Pythagorean expectation, which is in turn determined using runs scored by and against each team. The Pythagorean expectation is the ratio of runs scored squared to the sum of runs scored squared and runs against squared. It’s interpreted as an expected winning percentage.
Wins Above Expectation (WAE) is then the difference between Wins and Expected Wins, which are simply the Pythagorean Expectation multiplied by Games played. It’s a useful measure because it can be interpreted as wins that are due to efficiency (in economic terms) or, more simply, play that’s some combination of smart, clutch, and non-wasteful. It rewards winning close games and penalizes teams that win lots of laughers but lose close games, since the runs piled up in a blowout raise the expected win total even though they were all spent winning only one game.
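The two definitions above translate directly into code. Here is a minimal sketch with the exponent of 2 described in the text; the team line at the bottom is a made-up example, not one of the actual June 15 figures.

```python
def pythagorean_expectation(runs_scored, runs_against, exponent=2.0):
    """Expected winning percentage from runs scored and runs against."""
    rs = runs_scored**exponent
    ra = runs_against**exponent
    return rs / (rs + ra)


def wins_above_expectation(wins, games, runs_scored, runs_against):
    """Actual wins minus Pythagorean expected wins."""
    expected_wins = pythagorean_expectation(runs_scored, runs_against) * games
    return wins - expected_wins


# HYPOTHETICAL team: 36 wins in 64 games, 300 runs scored, 300 runs against.
# Equal runs give an expectation of .500, so expected wins are 32.
wae = wins_above_expectation(wins=36, games=64, runs_scored=300, runs_against=300)
print(f"WAE = {wae:+.2f}")
```

A positive WAE means the team has won more games than its run totals predict, and a negative one means it has won fewer.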
Using Baseball-Reference.com, I crunched the numbers for AL teams up to June 15. As usual, the Los Angeles Angels of Anaheim lead the league in WAE with 3.68, with Detroit’s 2.39 a close second, but the Tampa Bay Rays are a surprising last with -1.96 WAE. Obviously, this early in the season it’s too soon to conclude anything based on this, but the complete data is behind the cut. (more…)