
Addendum on Pythagorean Expectation May 20, 2010

Posted by tomflesher in Baseball, Economics.

I noted below that the sample size of 13 games is too small to determine whether the proportions of games won by the teams expected to win – the home team, the team with the higher Pythagorean expectation, the team with more runs scored, and the team with the higher run differential – are significantly different from chance. If chance were the only determinant of the winner, then we would expect each proportion to be .5, since you’d expect a randomly-selected home team to win half the games, a randomly-selected team with higher run differential to win half the games, and so on.

Making the standard statistical assumptions, the standard error of a proportion is \sqrt{\frac{p(1-p)}{n}} . Three of the proportions were .46, so the standard error would be \sqrt{\frac{.46(.54)}{13}} = \sqrt{\frac{.2484}{13}} , which simplifies to \sqrt{.0191} = .1382 . Using 12 degrees of freedom, a t-table shows that the critical value for 95% confidence is 2.18, so the margin of error is 2.18*.1382 = .30. Thus, the binomial confidence interval method tells us we can be 95% sure that the true value of the proportion lies within the range .46 ± .30 = .16 … .76. Clearly, this range is far too wide to reject the hypothesis that the true proportion is .5.

For the simple measure of more runs, the proportion was .31, so the standard error is \sqrt{\frac{.31(.69)}{13}} = \sqrt{\frac{.2139}{13}} , or \sqrt{.0165} = .1283 . The 95% confidence interval around .31 is .31 ± 2.18*.1283 = .31 ± .2797 = .03 … .59. Again, .5 is included in this range.
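
The arithmetic above can be checked with a few lines of Python. This is only a sketch, not code from the original analysis: the function name `proportion_interval` is made up for illustration, and the critical value 2.18 is hard-coded from the t-table rather than computed.

```python
import math

# Critical t value for 95% confidence with 12 degrees of freedom.
# Assumption: taken from the t-table value quoted in the post, not computed here.
T_CRIT_12_DF = 2.18


def proportion_interval(p, n, t_crit=T_CRIT_12_DF):
    """Return (standard_error, lower, upper) for a sample proportion p over n games."""
    se = math.sqrt(p * (1 - p) / n)   # standard error of the proportion
    half_width = t_crit * se          # half-width of the confidence interval
    return se, p - half_width, p + half_width


if __name__ == "__main__":
    # The two distinct proportions reported in the post, each over 13 games.
    for label, p in [("three conditions at .46", 0.46), ("more runs scored", 0.31)]:
        se, lower, upper = proportion_interval(p, 13)
        includes_half = lower <= 0.5 <= upper
        print(f"{label}: SE={se:.4f}, interval=({lower:.2f}, {upper:.2f}), "
              f"includes .5: {includes_half}")
```

Both intervals contain .5, which is the whole point: with 13 games, none of these proportions can be distinguished from a coin flip.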

Quickie: Dallas Braden's Perfect Game May 11, 2010

Posted by tomflesher in Baseball.

Dallas Braden of the Oakland A’s pitched a perfect game Sunday, on Mother’s Day. Under the methods discussed last year after Buehrle’s perfect game, Braden – who’s been active for four seasons – has an OBP-against of .328. That means that in any given plate appearance, the probability that the batter does not reach base is .672.

Since he sat down 27 batters consecutively, the probability of that event happening is (.672)^{27} , or .0000218; equivalently, given his current stats, a bit over 2 in every 100,000 games that Braden pitches should be perfect games.

Over the same period (2007-2010), the American League OBP has hovered between .331 (this year) and .338 (2007). The mode was .336 (2008, 2009), so I’ll use it to estimate that the chance of a perfect game against a league-average team would be (.664)^{27} , or .0000158; equivalently, about 1.6 out of every 100,000 games should be a perfect game. As you can see, it’s more likely for Braden than for the average pitcher, but not by much.

Nice job, Dallas!

As a side note, the Tampa Bay Rays were the victims of BOTH perfect games. Their team OBP was .343 in 2009, so the probability of a given batter not getting on base was .657, meaning that the probability of sitting down 27 consecutive Rays batters is about 1.2 in 100,000. Since many other teams have lower team OBPs, it’s very surprising that the Rays were the victims of both games.
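
For anyone who wants to replay the perfect-game numbers in this post, here is a minimal Python sketch. It is not the original calculation, and the function name is invented for illustration. It assumes, as the estimates above implicitly do, that every plate appearance is independent and has the same probability of ending with the batter off base.

```python
BATTERS_IN_PERFECT_GAME = 27


def perfect_game_probability(obp_against):
    """Chance of retiring 27 straight batters, given an on-base percentage against.

    Assumption: plate appearances are independent and identically distributed.
    """
    return (1 - obp_against) ** BATTERS_IN_PERFECT_GAME


if __name__ == "__main__":
    # OBP figures quoted in the post.
    cases = [
        ("Braden, career OBP-against", 0.328),
        ("AL average (modal value, 2008-2009)", 0.336),
        ("2009 Rays team OBP", 0.343),
    ]
    for label, obp in cases:
        prob = perfect_game_probability(obp)
        print(f"{label}: {prob:.7f} (about {prob * 100_000:.1f} per 100,000 games)")
```

Because the exponent is 27, small differences in OBP compound quickly, which is why a 15-point gap between Braden and the Rays' lineup moves the estimate from roughly 2 to roughly 1 per 100,000.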

Quickie: MLB Playoffs by Pitching Statistics February 23, 2010

Posted by tomflesher in Baseball.

It’s cold out today. Last night, Buffalo was covered in a thin layer of freezing rain. I’m trying to stay warm by turning up my hot stove the way only an economist can – crunching the numbers on playoffs.

I’m re-using the dataset from my Cy Young Predictor a few entries ago in the interest of parsimony. It contains dummy variables teamdivwin and teamwildcard which take value 1 if the pitcher’s team won the division or the wildcard respectively. I then created a variable playoffs which took the value of the sum of teamdivwin and teamwildcard – just a playoff dummy variable.
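
As a rough illustration of how the playoffs dummy is built, here is a short Python sketch. The rows are hypothetical placeholders, not records from the actual dataset, and the helper name `add_playoff_dummy` is invented for this example. Since a team cannot be both a division winner and the wildcard, the sum can only be 0 or 1.

```python
def add_playoff_dummy(rows):
    """Return copies of the rows with playoffs = teamdivwin + teamwildcard."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row["playoffs"] = row["teamdivwin"] + row["teamwildcard"]
        result.append(new_row)
    return result


if __name__ == "__main__":
    # Hypothetical pitchers: the real dataset is not reproduced in this post.
    pitchers = [
        {"name": "Pitcher A", "teamdivwin": 1, "teamwildcard": 0},
        {"name": "Pitcher B", "teamdivwin": 0, "teamwildcard": 1},
        {"name": "Pitcher C", "teamdivwin": 0, "teamwildcard": 0},
    ]
    for row in add_playoff_dummy(pitchers):
        print(row["name"], row["playoffs"])
```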

Using a Probit model and a standard OLS regression model, I estimated the effects of individual pitching stats on playoffs. Neither model has very strong predictive value (linear has R-squared of about .05), which is unsurprising since it doesn’t take the team’s batting into account at all. None of the coefficient values are shocking – in the American League (designated as lg = 1), teams have a higher probability of making the playoffs because there are fewer teams, and although complete games appear to have a negative effect, the positive shutout effect more than makes up for that in both models. I’m interested in whether complete game wins and complete game losses have differential effects – that will probably be my next snowy-day project.

Results are behind the cut.

(more…)

Cy Young gives me a headache. January 15, 2010

Posted by tomflesher in Baseball, Economics.

As usual, I’ve started my yearly struggle against a Cy Young predictor. Bill James and Rob Neyer’s predictor (which I’ve preserved for posterity here) did a pretty poor job this year, having predicted the wrong winner in both leagues and even getting the order very wrong compared to the actual results. Inside, I’d like to share some of my pain, since I can’t seem to do much better.

(more…)

Barry Bonds (with bonus Collusion discussion) March 25, 2009

Posted by tomflesher in Academia, Baseball, Economics.

Sorry about the infrequent updates. It’s a busy time in the semester.

Barry Bonds is, without a doubt, one of the most controversial figures in baseball. He’s currently trying, again, what he tried last year – shopping himself around for the league’s minimum salary. (Thanks to the Sports Law Blog for the link.) Inside, I’d like to briefly discuss collusion and look at the incentives involved with this situation.

(more…)

Sabernomics on A-Rod and Steroid Use February 11, 2009

Posted by tomflesher in Baseball.

At Sabernomics, JC Bradbury crunches the home run numbers for Alex Rodriguez during the seasons in which he admits steroid use:

So, what were A-Rod’s steroids worth? 2.37 home runs over two seasons, or a little over one home run a season. At least, that is the estimate based on the method I laid out above; however, it’s probably best to say that there was no observed effect.

In the comments section, Bradbury crunches the walk numbers to control for the possibility that a more powerful A-Rod was less selective at the plate and, again, finds no observable effect. There are some moderately outlandish hypotheses that could account for this, such as the league’s pitchers cycling steroids coincident with Rodriguez, so that a roided-up A-Rod would hit against roided-up pitchers and a clean A-Rod would hit against clean pitchers, but, well, Occam’s Razor.

Arbitration in MLB – "File and Go" and Market Inefficiency January 27, 2009

Posted by tomflesher in Baseball.

Ed Edmonds at the Sports Law Blog wrote up a piece on Tampa Bay’s “File-and-Go” strategy for arbitration. The blog references an MLB.com article; more information is available at USA Today, but I’ve preserved the text of the article here. Some thoughts on arbitration as market inefficiency, plus a haiku, behind the cut.

(more…)

Statistical evidence that the Rays are outclassed. October 27, 2008

Posted by tomflesher in Baseball.

The series thus far.

Q.E.D.

Poor Kazmir. October 17, 2008

Posted by tomflesher in Baseball.

Last night, Scott Kazmir pitched 6 scoreless innings in ALCS game 5, giving up 2 hits and 3 walks but striking out 7 batters. That adds up to a game score of 72 points. His bullpen then proceeded to give up 8 runs, allowing the Red Sox to come back and win the game (thus extending the series to game 6).
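
For reference, the 72 can be reproduced from the line above. The sketch below uses Bill James's original game score formula as I understand it (start at 50, add for outs, late innings, and strikeouts; subtract for hits, runs, and walks); the formula is not spelled out in this post, so treat the weights as an assumption, and the function name is just for illustration.

```python
def game_score(outs, innings_completed, strikeouts, hits,
               earned_runs, unearned_runs, walks):
    """Bill James game score for a starting pitcher (original version, as assumed here)."""
    score = 50
    score += outs                                # +1 for each out recorded
    score += 2 * max(0, innings_completed - 4)   # +2 per inning completed after the 4th
    score += strikeouts                          # +1 per strikeout
    score -= 2 * hits                            # -2 per hit allowed
    score -= 4 * earned_runs                     # -4 per earned run
    score -= 2 * unearned_runs                   # -2 per unearned run
    score -= walks                               # -1 per walk
    return score


if __name__ == "__main__":
    # Kazmir's line from the post: 6 scoreless innings (18 outs), 2 H, 3 BB, 7 K.
    kazmir = game_score(outs=18, innings_completed=6, strikeouts=7,
                        hits=2, earned_runs=0, unearned_runs=0, walks=3)
    print(kazmir)
```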

Has Scotty suffered the greatest postseason indignity ever? Nope. Not even close. That honor belongs to Mike Mussina of the 1997 Orioles.

(more…)

Bailouts! September 25, 2008

Posted by tomflesher in Baseball.

That’s right… in the interest of keeping up with this week’s news about the $700b bailout of the financial sector, I’m going to take a look at key instances of bailouts by the bullpen.

(more…)