How Useful is the Pythagorean Expectation?

May 18, 2010

Posted by tomflesher in Baseball.
Tags: baseball-reference.com, one-game playoffs, Pythagorean expectation, wins above expectation

The Pythagorean expectation is a method used to approximate how many wins a baseball team “should” have based on its offense (runs scored) and its defense (runs allowed). As the linked article points out, there are some problems with the formula. As far as I’m concerned, the most useful application of an expected win percentage is to compare teams that are otherwise similar. Let’s say, for example, that I have two teams that have identical records and I want to predict which team will win an upcoming series. In that case, an expected win percentage would be useful to indicate which team has more firepower over time.
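As a quick sketch of the calculation: the percentages in the tables further down are consistent with the common form of the formula, runs scored squared divided by the sum of runs scored squared and runs allowed squared. The exponent of 2 is inferred from those figures rather than stated outright, and the function name is only illustrative.

```python
def pythagorean_expectation(runs_for, runs_against, exponent=2):
    """Expected winning percentage from season runs scored and allowed.

    The exponent of 2 is an assumption inferred from the tabulated values.
    """
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# 1946 Cardinals: 700 runs scored, 539 allowed
print(f"{pythagorean_expectation(700, 539):.2%}")  # 62.78%
# 2009 Tigers: 743 runs scored, 745 allowed
print(f"{pythagorean_expectation(743, 745):.2%}")  # 49.87%
```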

What’s the perfect way to test this? One-game playoffs. Behind the cut, I have the results of some number-crunching I did to test whether the Pythagorean expectation generates useful results.

There have been 13 tiebreakers in Major League Baseball. Under the earlier rules, some of them were played as a series rather than a single game. In all of those cases, though, the team that won the first game went on to win the series, so for simplicity’s sake I’ll treat all of the playoffs as though they were one-game series.

I created a spreadsheet with data from Wikipedia and Baseball-Reference.com, with variables for each winning and losing team’s runs for and against, then used that data to generate the Pythagorean expectation and run differential. I created binary variables to test whether the winning team was the home team (wHome = 1), the team with the higher Pythagorean expectation (wEx = 1), the team with more runs scored in the season (wRuns = 1), and the team with the higher run differential (wDelta = 1). My expectation was that, more often than not, the home team would win, and so would the team with the higher Pythagorean expectation, more runs, and a higher differential. The results were quite surprising.
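The binary variables can be sketched in a few lines. This is a minimal illustration, not the spreadsheet itself: the helper names are hypothetical, the exponent of 2 is an assumption, and ties are treated as 0 (strict greater-than), which is also an assumption.

```python
def pyth(runs_for, runs_against, exponent=2):
    # Pythagorean expectation; exponent of 2 is assumed
    return runs_for ** exponent / (runs_for ** exponent + runs_against ** exponent)


def indicators(winner, loser, winner_was_home):
    """winner and loser are (runs_for, runs_against) season totals."""
    w_rf, w_ra = winner
    l_rf, l_ra = loser
    return {
        "wHome": int(winner_was_home),
        "wEx": int(pyth(w_rf, w_ra) > pyth(l_rf, l_ra)),
        "wRuns": int(w_rf > l_rf),
        "wDelta": int((w_rf - w_ra) > (l_rf - l_ra)),
    }


# 1948: the Indians (840 scored, 567 allowed) beat the Red Sox (907, 720) on the road
print(indicators((840, 567), (907, 720), winner_was_home=False))
# {'wHome': 0, 'wEx': 1, 'wRuns': 0, 'wDelta': 1}
```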

The data for the winners:

Year	Winner	wRunsF	wRunsA	wDiff	wPyth
1946	Cardinals	700	539	161	62.78%
1948	Indians	840	567	273	68.70%
1951	Giants	781	641	140	59.75%
1959	Dodgers	705	670	35	52.54%
1962	Giants	878	690	188	61.82%
1978	Yankees	735	582	153	61.46%
1980	Astros	637	589	48	53.91%
1995	Mariners	796	708	88	55.83%
1998	Cubs	831	792	39	52.40%
1999	Mets	853	711	142	59.01%
2007	Rockies	860	758	102	56.28%
2008	White Sox	811	729	82	55.31%
2009	Twins	817	765	52	53.28%

And the losers, in the same year order:

Loser	lRunsF	lRunsA	lDiff	lPyth
Dodgers	695	561	134	60.55%
Red Sox	907	720	187	61.34%
Dodgers	855	672	183	61.81%
Braves	724	623	101	57.46%
Dodgers	842	697	145	59.34%
Red Sox	796	657	139	59.48%
Dodgers	663	591	72	55.72%
Angels	801	697	104	56.91%
Giants	845	739	106	56.66%
Reds	865	711	154	59.68%
Padres	741	666	75	55.32%
Twins	829	745	84	55.32%
Tigers	743	745	-2	49.87%

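And the results, one row per game in the same order, with the share of games where each variable equals 1 in the last row: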

wHome	wEx	wRuns	wDelta
1	1	1	1
0	1	0	1
0	0	0	0
0	0	0	0
1	1	1	1
0	1	0	1
0	0	0	0
1	0	0	0
1	0	0	0
0	0	0	0
0	1	1	1
1	0	0	0
1	1	1	1
0.46	0.46	0.31	0.46

As you can see, none of the values I expected to be good predictors of winning a one-game playoff actually were. Each one picked the winner in fewer than half of the 13 games, and runs scored did so in fewer than a third.
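For anyone who wants to check the arithmetic, the bottom row of the results table is just the column mean of each indicator. A short sketch, with the rows copied from the table (the variable names in the code are illustrative):

```python
# Indicator rows copied from the results table: (wHome, wEx, wRuns, wDelta)
rows = [
    (1, 1, 1, 1), (0, 1, 0, 1), (0, 0, 0, 0), (0, 0, 0, 0),
    (1, 1, 1, 1), (0, 1, 0, 1), (0, 0, 0, 0), (1, 0, 0, 0),
    (1, 0, 0, 0), (0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 0, 0),
    (1, 1, 1, 1),
]
names = ("wHome", "wEx", "wRuns", "wDelta")

# zip(*rows) transposes the rows into columns, one per indicator
means = {name: sum(col) / len(col) for name, col in zip(names, zip(*rows))}

for name, mean in means.items():
    print(f"{name}: {mean:.2f}")
# wHome: 0.46
# wEx: 0.46
# wRuns: 0.31
# wDelta: 0.46
```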

I’m not sure what to think of this. When I played with the Pythagorean expectation a while back, I used it to generate a stat called Wins Above Expectation (which appears to have been invented independently by several other people as well). My interpretation was that a team with a higher WAE value was generating more wins with the same number of runs, so it must be scoring runs when they count and not wasting them on laugher wins. The same interpretation seems to be valid here: a lower expectation given the same record shows some sort of intelligence or ability to play a clutch game, so it makes sense that the team with the lower expectation would win.
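To make the WAE idea concrete, here is a rough sketch that assumes WAE is actual wins minus the wins implied by the Pythagorean expectation over the same number of games. That definition, the exponent of 2, and the two teams below are all assumptions for illustration; none of the numbers come from the tables.

```python
def wins_above_expectation(wins, games, runs_for, runs_against, exponent=2):
    """Actual wins minus Pythagorean-expected wins (assumed definition)."""
    expected_pct = runs_for ** exponent / (
        runs_for ** exponent + runs_against ** exponent
    )
    return wins - expected_pct * games


# Two hypothetical teams tied at 90 wins in 162 games
team_a = wins_above_expectation(90, 162, runs_for=800, runs_against=700)
team_b = wins_above_expectation(90, 162, runs_for=750, runs_against=700)

# Team B reached the same record with fewer runs, so its WAE is higher
print(team_b > team_a)  # True
```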

Still, remember that the sample size is only 13 games. We can’t draw conclusions (or do a useful regression analysis) based on such a small n.
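One way to put a number on that caution is an exact two-sided binomial test against a coin flip. This is a sketch under the assumption that each game is an independent 50/50 trial under the null hypothesis; the function is hand-rolled from the standard library and is not part of the original spreadsheet.

```python
from math import comb


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than k.
    """
    probs = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in probs if q <= cutoff))


# 6 of 13 for wHome, wEx, and wDelta; 4 of 13 for wRuns
print(f"{binom_two_sided_p(6, 13):.2f}")  # 1.00
print(f"{binom_two_sided_p(4, 13):.2f}")  # 0.27
```

Neither result is anywhere near conventional significance, so 13 games cannot distinguish these predictors from coin flips in either direction.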

The World's Worst Sports Blog