NL Cy Young: Heating up early May 31, 2010
Posted by tomflesher in Baseball.Tags: Baseball, baseball-reference.com, Cy Young, Dallas Braden, Mark Buehrle, Roy Halladay, Ubaldo Jimenez
There’s considerable debate, following Roy Halladay’s perfect game, as to whether he or Ubaldo Jimenez should be considered the top contender for the National League’s Cy Young Award. Of course, it’s way too early to make those sorts of decisions, but let’s take a look at some of the data quickly.
Jimenez is sitting at 3.7 Wins Above Replacement and 38 Runs Above Replacement in 10 starts:
| Year | Age | Tm | Lg | IP | GS | R | Rrep | Rdef | aLI | RAR | WAR | Salary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | 26 | COL | NL | 71.1 | 10 | 7 | 45 | 0 | 1.0 | 38 | 3.7 | $1,250,000 |
| 5 Seasons | | | | 577.2 | 93 | 241 | 362 | 0 | 1.0 | 121 | 12.2 | $2,392,000 |
Halladay trails considerably, with 22 RAR and 2.4 WAR:
| Year | Age | Tm | Lg | IP | GS | R | Rrep | Rdef | aLI | RAR | WAR | Salary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | 33 | PHI | NL | 86.0 | 11 | 23 | 45 | 3 | 1.0 | 22 | 2.4 | $15,750,000 |
| 13 Seasons | | | | 2132.2 | 298 | 893 | 1407 | 19 | 1.0 | 514 | 49.8 | $88,991,666 |
Of course, 10 or 11 starts is far too small a sample to draw conclusions from this early in the season. Halladay has a perfect game; Jimenez has a no-hitter. Still, there’s no reason to believe that a perfect game, in and of itself, is enough to get Doc a Cy Young Award. After all, Mark Buehrle didn’t win the Cy last year, and Dallas Braden isn’t even in contention.
If both players keep pitching at or near this level, Halladay becomes a realistic contender, because at that point his marginal contribution may make the difference between whether the Phillies make the playoffs or not. As it stands right now, the NL East is entirely too volatile to make that decision.
(Incidentally, I love Baseball-Reference.com’s new stat sharing and player link tools!)
What is the effect of the Designated Hitter? May 30, 2010
Posted by tomflesher in Baseball.Tags: baseball-reference.com, designated hitter, R, regression
Intuitively, the designated hitter rule seems like it should increase scoring. By getting on base more often than the pitcher would have, the designated hitter helps produce runs by hitting, by being on base so that other players can drive him in, and by not accumulating outs by bunting or striking out as often as the pitcher does. However, there should be a corresponding effect from having pitchers left in the game longer: a better pitcher who remains in the game might get more outs than a reliever who came in simply because the manager pinch-hit for the starting pitcher because he needed offense.
Behind the cut, I’ll explain the testing I did to determine whether the effect of a DH is positive (hint: it is) and look at how big an effect is actually there.
Roy Halladay's Perfect Game May 30, 2010
Posted by tomflesher in Baseball.Tags: baseball-reference.com, Braden's perfect game, Dallas Braden, Halladay's perfect game, Perfect Games, Roy Halladay
Just what the Doctor ordered.
Andy at Baseball-Reference.com has an interesting blog entry about Doc’s perfect game. Roy Halladay was 0-3 in the game with two strikeouts, threw 115 pitches to 27 batters, and had a 98 Game Score.
Compared to Dallas Braden, Doc was much, much more likely to achieve this. Halladay’s opposing OBP is minuscule, both for his career and this year, and the complementary probability of getting any given batter out is $1 - \mathrm{OBP}$ in each case. Using his career numbers, his probability of getting 27 consecutive batters out would be $(1 - \mathrm{OBP}_{\text{career}})^{27}$.
Interestingly, the last 3 perfect games have all had Florida teams as the victims.
Addendum on Pythagorean Expectation May 20, 2010
Posted by tomflesher in Baseball, Economics.Tags: Baseball, economics, Pythagorean expectation, statistics
I noted below that the sample size of 13 games is too small to make a determination as to whether the proportions for the conditions expected to predict the winning team – the home team, the team with the higher Pythagorean expectation, the team with more runs scored, and the team with the higher run differential – are significantly different from chance. If chance were the only determinant of the winner, then we would expect each proportion to be .5, since you’d expect a randomly-selected home team to win half the games, a randomly-selected team with higher run differential to win half the games, and so on.
Making the standard statistical assumptions, the standard error of a proportion is $\sqrt{\frac{p(1-p)}{n}}$. Three of the proportions were .46, meaning that the standard error would be $\sqrt{\frac{.46 \times .54}{13}}$, which simplifies to .1382. Using 12 degrees of freedom, a t-table shows that the critical value for 95% confidence is 2.18. Thus, the binomial confidence interval method tells us we can be 95% sure that the true value of the proportion lies within the range .46 ± 2.18*.1382 = .46 ± .30 = .16 … .76. Clearly, this range is far too wide to conclude that the proportion is significantly different from .5.
For the simple measure of more runs, the proportion was .31, meaning that the standard error is $\sqrt{\frac{.31 \times .69}{13}}$, or .1283. The 95% confidence interval around .31 is .31 ± 2.18*.1283 = .31 ± .2797 = .03 … .59. Again, .5 is included in this range.
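The interval arithmetic is easy to check by hand or in a few lines of code. Here is a minimal sketch, assuming the quantity being scaled is $\sqrt{p(1-p)/n}$ with n = 13 games and taking the 2.18 critical value straight from the t-table lookup in the post; the function name and labels are made up for illustration.

```python
import math


def proportion_interval(p_hat, n, critical_value):
    """Return (standard_error, lower, upper) for a sample proportion.

    Uses the usual sqrt(p(1-p)/n) approximation; with n = 13 this is
    rough, which is part of the point being made above.
    """
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    half_width = critical_value * se
    return se, p_hat - half_width, p_hat + half_width


T_CRIT = 2.18   # taken from the post: 95% confidence, 12 degrees of freedom
N_GAMES = 13    # one-game playoffs in the sample

for label, p_hat in [("proportion .46", 0.46), ("more runs scored", 0.31)]:
    se, lower, upper = proportion_interval(p_hat, N_GAMES, T_CRIT)
    contains_half = lower <= 0.5 <= upper
    print(f"{label}: SE={se:.4f}, interval=({lower:.2f}, {upper:.2f}), "
          f"contains .5: {contains_half}")
```

Hard-coding the critical value keeps the sketch dependency-free; a statistics library could supply it instead.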
How Useful is the Pythagorean Expectation? May 18, 2010
Posted by tomflesher in Baseball.Tags: baseball-reference.com, one-game playoffs, Pythagorean expectation, wins above expectation
The Pythagorean expectation is a method used to approximate how many wins a baseball team “should” have based on its offense (runs scored) and its defense (runs allowed). As the linked article points out, there are some problems with the formula. As far as I’m concerned, the most useful application of an expected win percentage is to compare teams that are otherwise similar. Let’s say, for example, that I have two teams that have identical records and I want to predict which team will win an upcoming series. In that case, an expected win percentage would be useful to indicate which team has more firepower over time.
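For reference, the classic form of the formula is runs scored squared divided by the sum of runs scored squared and runs allowed squared. The sketch below assumes that exponent of 2 (other exponents, such as 1.83, are also in common use, and the post does not say which one it relied on). The two teams and their run totals are hypothetical, chosen only to show how two otherwise similar clubs could be compared.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical season totals for two teams with identical records.
team_a = pythagorean_expectation(runs_scored=800, runs_allowed=700)
team_b = pythagorean_expectation(runs_scored=750, runs_allowed=740)

# The team with the higher expected percentage is the one the
# formula credits with more "firepower" over time.
favorite = "Team A" if team_a > team_b else "Team B"
print(f"Team A: {team_a:.3f}, Team B: {team_b:.3f}, favorite: {favorite}")
```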
What’s the perfect way to test this? One-game playoffs. Behind the cut, I have the results of some number-crunching I did to test whether the Pythagorean expectation generates useful results.
Quickie: Dallas Braden's Perfect Game May 11, 2010
Posted by tomflesher in Baseball.Tags: Baseball, Braden's perfect game, Buehrle's perfect game, Dallas Braden, Oakland As, probability, sabermetrics, Tampa Bay Rays
Dallas Braden of the Oakland A’s pitched a perfect game Sunday, on Mother’s Day. Under the methods discussed last year after Buehrle’s perfect game, Braden – who’s been active for four seasons – has an OBP-against of .328. That means he has a probability for any given plate appearance of .672 of the batter not reaching base.
Since he sat down 27 batters consecutively, the probability of that event happening is $(.672)^{27}$, or .0000218; equivalently, given his current stats, a bit over 2 in every 100,000 games that Braden pitches should be perfect games.
Over the same period (2007-2010), the American League OBP has hovered between .331 (this year) and .338 (2007). .336 was the mode (2008, 2009), so I’ll use it to estimate that the chance for a perfect game facing the league average team would be $(.664)^{27}$, or .0000158; equivalently, about 1.6 out of every 100,000 games should be a perfect game. As you can see, it’s more likely for Braden than the average pitcher, but not by much.
Nice job, Dallas!
As a side note, the Tampa Bay Rays were the victim of BOTH perfect games. Their team OBP was .343 in 2009, with a probability not to get on base of .657, meaning that the probability of getting 27 batters seated consecutively is about 1.2 in 100,000. Since many other teams have lower team OBPs, it’s very surprising that the Rays were the victims of both games.
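The three figures in this post all come from the same calculation: raise the probability of retiring a batter to the 27th power. Here is a minimal sketch of it, assuming (as the method does) that every plate appearance is independent and has the same out probability; the function name and labels are made up for illustration.

```python
def perfect_game_probability(obp_against, batters=27):
    """Chance of retiring `batters` hitters in a row, treating each
    plate appearance as independent with out probability 1 - OBP."""
    return (1 - obp_against) ** batters


# OBP figures quoted in the post.
cases = {
    "Braden (career OBP-against)": 0.328,
    "AL average (2008-2009 OBP)": 0.336,
    "2009 Rays (team OBP)": 0.343,
}

for label, obp in cases.items():
    p = perfect_game_probability(obp)
    print(f"{label}: {p:.7f} (about {p * 100_000:.1f} per 100,000 games)")
```

Because the exponent is so large, small differences in OBP move the result noticeably, which is why a .015 gap between Braden and the Rays nearly halves the odds.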
Quickie: MLB Playoffs by Pitching Statistics February 23, 2010
Posted by tomflesher in Baseball.Tags: Baseball, OLS, playoffs, probit, regression
It’s cold out today. Last night, Buffalo was covered in a thin layer of freezing rain. I’m trying to stay warm by turning up my hot stove the way only an economist can – crunching the numbers on playoffs.
I’m re-using the dataset from my Cy Young Predictor a few entries ago in the interest of parsimony. It contains dummy variables teamdivwin and teamwildcard which take value 1 if the pitcher’s team won the division or the wildcard respectively. I then created a variable playoffs which took the value of the sum of teamdivwin and teamwildcard – just a playoff dummy variable.
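As a rough illustration of that last step, here is a small sketch of building the playoff dummy from the two existing dummies. The variable names teamdivwin, teamwildcard and playoffs come from the post; the rows are invented, and the list-of-dicts layout is only an assumption about how the data might be held.

```python
def add_playoff_dummy(rows):
    """Add playoffs = teamdivwin + teamwildcard to each row.

    A team cannot win both its division and the wild card, so the
    sum stays in {0, 1} and behaves as an ordinary dummy variable.
    """
    for row in rows:
        row["playoffs"] = row["teamdivwin"] + row["teamwildcard"]
        if row["playoffs"] not in (0, 1):
            raise ValueError(f"inconsistent playoff flags: {row}")
    return rows


# Invented example rows, not taken from the actual dataset.
pitchers = [
    {"pitcher": "A", "teamdivwin": 1, "teamwildcard": 0},
    {"pitcher": "B", "teamdivwin": 0, "teamwildcard": 1},
    {"pitcher": "C", "teamdivwin": 0, "teamwildcard": 0},
]

for row in add_playoff_dummy(pitchers):
    print(row["pitcher"], row["playoffs"])
```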
Using a Probit model and a standard OLS regression model, I estimated the effects of individual pitching stats on playoffs. Neither model has very strong predictive value (linear has R-squared of about .05), which is unsurprising since it doesn’t take the team’s batting into account at all. None of the coefficient values are shocking – in the American League (designated as lg = 1), teams have a higher probability of making the playoffs because there are fewer teams, and although complete games appear to have a negative effect, the positive shutout effect more than makes up for that in both models. I’m interested in whether complete game wins and complete game losses have differential effects – that will probably be my next snowy-day project.
Results are behind the cut.
Cy Young gives me a headache. January 15, 2010
Posted by tomflesher in Baseball, Economics.Tags: Baseball, baseball-reference.com, Bill James, Cy Young predictor, economics, Eric Gagne, linear regression, R, Rob Neyer, sabermetrics, Tim Lincecum, Weighted saves, Weighted shutouts
As usual, I’ve started my yearly struggle against a Cy Young predictor. Bill James and Rob Neyer’s predictor (which I’ve preserved for posterity here) did a pretty poor job this year, having predicted the wrong winner in both leagues and even getting the order very wrong compared to the actual results. Inside, I’d like to share some of my pain, since I can’t seem to do much better.
Three Catchers, Four Starters, and Other Playoff Thoughts October 26, 2009
Posted by tomflesher in Baseball.Tags: 2009 ALCS, Angels, Phillies, pinch hitters, pinch runners, rosters, world series, Yankees
Last night, the LA Angels lost Game 6 of the 2009 ALCS to the New York Yankees. Mike Scioscia started left-handed pitcher Joe Saunders; he carries, as is becoming the norm, three catchers including light-hitting third catcher Bobby Wilson. Joe Girardi also carries three catchers, although his array includes defensive specialist Jose Molina, sometime-DH Jorge Posada, and Francisco Cervelli, who hit .298 in 94 at-bats this season. Though Mike Napoli was hot during the postseason, Scioscia’s group of catchers wasn’t as specialized as it was in 2005, when he carried big-hitter Bengie Molina, Jose Molina for his glove, and Josh Paul for emergencies. Here, he appeared to be carrying three catchers solely because none of them are big hitters. In retrospect, although Napoli and Mathis are both a big part of the Angels clubhouse, Scioscia should have made a move during the regular season to replace one of them with a catcher who was more of the Bengie Molina or Jorge Posada mold – someone whose glove or arm is slightly defective, but who can hit the ball when necessary. Instead, Scioscia was forced to burn two pinch hitters and a second catcher in his attempt to win the game last night, whereas Girardi has in previous games been able to use the traditional approach of starting Molina and using Posada to pinch hit, or starting Posada and using Molina as a defensive replacement late in the game. In a perfect world, Scioscia could have traded Kendry Morales away and acquired Victor Martinez to use mainly at first base and as an emergency third catcher, replacing Wilson’s more or less dead weight with a big bat but not forgoing any real utility.
In addition, Scioscia started Joe Saunders. This isn’t a crime in and of itself. However, in the ALCS, he started John Lackey, Saunders, Jered Weaver, and Scott Kazmir. Girardi, meanwhile, is using Joe Torre’s time-honored trick of carrying only three starters (CC Sabathia, Andy Pettitte, and AJ Burnett) and using traditional long-relief men like David Robertson in addition to standard situational relief like Joba Chamberlain, Damaso Marte, and Mariano Rivera. In Game 6, Saunders went only 3.1 innings. Weaver performed well in relief and, frankly, should have been left there for the duration of the series. Instead, Scioscia spread his men too thin and was left making an all-hands-on-deck call in the late games where he used both Weaver and Kazmir in relief. Saunders pitched brilliantly in Game 2, and Scioscia should have been prepared to maximize his usage of Lackey, Saunders, and Kazmir while leaving Weaver in the bullpen. Granted, Saunders pitched like crap last night, but all pitchers have their off nights.
Finally, Girardi will probably do quite well in the World Series, as he’s experienced in managing under National League rules. Hideki Matsui, with his legs in bad shape, will be almost entirely useless in the Phillies’ park. In a perfect world, Girardi would be able to dump fifth-outfielder Freddy Guzman and use Matsui in the field. However, that seems unlikely, so Matsui will remain an overpaid pinch-hitter. With Jerry Hairston, Jr., on the bench, Guzman’s utility as a pinch runner is moderate at best. It would be a gutsy move, but I think Girardi would do best to dump Guzman and bring Shelley Duncan in as a pinch hitter and emergency outfielder.
Still, Girardi gets paid the big bucks to do his job, so I’m sure every move he makes is well-considered.
Probability of a perfect game July 24, 2009
Posted by tomflesher in Baseball.Tags: Baseball Analysts, Buerhle's perfect game, links
Sky over at Baseball Analysts ran some probabilities using on-base percentages to calculate particular pitchers’ probabilities of pitching a perfect game once and over a career. The method’s simple enough that it’s easy to calculate for any pitcher.