## Kirk’s Big Spring
*March 20, 2015*

*Posted by tomflesher in Baseball, Economics.*

Tags: BABIP, KBB, Kirk Nieuwenhuis, Spring training

Kirk Nieuwenhuis is having an incredible spring. All the usual caveats are in play – it’s spring training, so the stats are useless – but Kirk’s production has been exceptional. His slash line is .469/.553/.625 on 38 plate appearances. Let’s hit some sanity checks on Kirk’s production.

First of all, his BAbip is off the charts. This spring, Kirk’s batting average on balls in play is .536, which is ridiculously high. Kirk won’t be able to maintain that into the season. If he’s still got a .536 BAbip by the trade deadline, I’ll eat my hat and post the video. Kirk’s BAbip has been pretty streaky, though. During his rough April, Kirk had a .300 BAbip, about the league average over the season; after coming back up in late June, he had a .377 BAbip over the remainder of the season, broken up as .625 over five June games with 11 at-bats, .267 over 28 at-bats in July, .400 over 23 August at-bats, and .348 over 32 at-bats in September.

From 2012 to 2013, Kirk’s BAbip dropped from about .379 to .246, and then shot back up to .370 in 2014. Taking first differences of those numbers (a drop of .133, then a rise of .124) and assuming the next change keeps the same ratio to the last one, we’d expect Kirk’s BAbip to drop to about .254 this season. Nonetheless, Kirk’s platoon splits are huge – against right-handed pitchers, from 2014, he’s got a .040/.050/.283 split (although he only made 9 at-bats and 10 plate appearances against left-handed pitchers). Though Kirk’s spring splits aren’t readily available, it’s possible that his big spring is a residue of facing mostly right-handers.
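Here’s a quick sketch of that ratio-of-differences projection. It’s my reading of the method rather than a definitive version: it assumes the next year-over-year change bears the same ratio to the last change that the last change bore to the one before it. The function name `project_next` is made up for illustration.

```python
def project_next(values):
    """Extrapolate one step ahead from three yearly values, assuming
    consecutive first differences keep a constant ratio (an assumption)."""
    d1 = values[1] - values[0]   # change from year 1 to year 2
    d2 = values[2] - values[1]   # change from year 2 to year 3
    ratio = d2 / d1              # how the second change relates to the first
    return values[2] + d2 * ratio


babip_by_year = [0.379, 0.246, 0.370]  # 2012, 2013, 2014, from the text
projection = project_next(babip_by_year)
print(f"{projection:.3f}")
```

With the three BAbip figures above, this lands at about .254, the same number quoted in the post.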

In the spring, Kirk’s BAbip denominator (AB – HR – K + SF) is 28 and the numerator (H – HR) is 15. If we take Kirk’s previous-year .377 BAbip, over 28 trials we’d expect 15 or more successes to occur about 2.86% of the time. That’s just barely within the bounds of statistical significance (which would indicate we’d expect Kirk to hit between 6 and 15 times about 95% of the time), and well outside if we assume Kirk has a true mean of .254 (which would put our confidence interval at around 3-11 successes in 28 trials).
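For anyone who wants to poke at the tail probability, here’s a minimal sketch. It assumes each ball in play is an independent trial with a fixed chance of falling for a hit, which is a simplification. The exact percentage depends on where you draw the cutoff (15 or more versus more than 15), so treat the output as a sanity check rather than a replacement for the figures above. The helper names are mine.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


balls_in_play, hits = 28, 15
for true_babip in (0.377, 0.254):
    tail = upper_tail(hits, balls_in_play, true_babip)
    print(f"true BAbip {true_babip:.3f}: P(hits >= {hits}) = {tail:.4f}")
```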

Second, take a look at Kirk’s K/BB ratio. Kirk has typically had a strikeout-to-walk ratio above 1; in 2013, he struck out about 2.67 times for every time he walked, and in 2014 it was about 2.44 strikeouts per walk. Over this small spring sample size, Kirk’s K/BB has actually dipped below 1, at 4/6 (or .667). Assuming Kirk walked 6 times anyway, using a conservative 2:1 K/BB ratio would turn 8 of Kirk’s hits into strikeouts. That would make Kirk’s BAbip tighten up to .350. Still strong, but not the obscene .536 we’ve seen. Even if we convert one walk to a strikeout and maintain a 2 K/BB, that would leave Kirk at .409, a very respectable spring.

Kirk’s numbers have been shocking, and of course he’s out of options, so he’s extremely likely to make the team. As a left-handed bat, he’d be a strong everyday player if the outfield weren’t so crowded, but with Michael Cuddyer and Juan Lagares in the mix already along with lefties Curtis Granderson and Matt den Dekker, it’s going to be tough to find Kirk a clean platoon spot.

## A Pythagorean Exponent for the NHL
*March 17, 2015*

*Posted by tomflesher in Sports.*

Tags: hockey, luck, NHL, parameter identification, Pythagorean expectation, Sabres

A Pythagorean expectation is a statistic used to measure how many wins a team should expect, based on how many points they score and how many they allow. The name ‘Pythagorean’ comes from the Pythagorean theorem, which gives the length of a right triangle’s hypotenuse in terms of its two shorter sides; the name reflects the fact that early baseball-centric versions assumed that Runs^2/(Runs^2 + Runs Allowed^2) should equal the winning percentage, borrowing the exponent of 2 from the familiar Pythagorean theorem (a^2 + b^2 = c^2).

The optimal exponent turned out not to be 2 in just about any sport; in baseball, for example, the optimal exponent is around 1.82. This is found by setting up a function – in the case of the National Hockey League, that formula would be Goals^x/(Goals^x + Goals Allowed^x) – with a variable exponent. This is equivalent to 1/(1 + (Goals Allowed/Goals)^x). Set up an error function – the standard is square error, because squaring is a way of turning all distances positive and penalizing bigger deviations more than smaller deviations – and minimize that function. In our case, that means we want to find the x that minimizes the sum of all teams’ (Winning Percentage – 1/(1 + (Goals Allowed/Goals)^x))^2. Using data from the 2009-2014 seasons, the x that minimizes that sum of squared errors is 2.2266, which is close enough to 2.23 that the sum of squared errors barely changes.
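Here’s a rough sketch of that minimization, not the code actually used for the post. It uses a golden-section search, which assumes the sum of squared errors has a single minimum between the search bounds. The team rows are invented for illustration and are not real NHL data, and the function names are mine.

```python
def pythag(goals_for, goals_against, x):
    """Pythagorean expectation in the form 1 / (1 + (GA / GF)^x)."""
    return 1.0 / (1.0 + (goals_against / goals_for) ** x)


def sse(teams, x):
    """Sum of squared errors between actual and expected winning percentage."""
    return sum((win_pct - pythag(gf, ga, x)) ** 2 for gf, ga, win_pct in teams)


def fit_exponent(teams, lo=0.5, hi=5.0, tol=1e-6):
    """Golden-section search for the exponent that minimizes sse().
    Assumes the error has a single minimum on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(teams, c) < sse(teams, d):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return (a + b) / 2


# Made-up rows of (goals for, goals against, winning percentage).
teams = [
    (250, 200, 0.62),
    (220, 225, 0.49),
    (190, 240, 0.37),
    (230, 210, 0.55),
]
best_x = fit_exponent(teams)
print(round(best_x, 3), round(sse(teams, best_x), 6))
```

A bounded search like this is enough for a one-parameter fit; with real data you would swap in one row per team-season.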

Porting that exponent into the current season, there are a few surprises. First of all, the Anaheim Ducks have been lucky – almost six full wins worth of luck. It would hardly be surprising for them to tank the last few games of the season. Similarly, the Washington Capitals are on the precipice of the playoff race, but they’re over four games below their expected wins. With 11 games to go, there’s a good chance they can overtake the New York Islanders (who are 3.4 wins above expectation), and they’re likely to at least maintain their wild card status.

On the other end, somehow, the Buffalo Sabres are obscenely lucky. The worst team in the NHL today is actually 4 games better than its expectation. Full luck standings as of the end of March 16th are behind the cut.

## What is BAbip?
*March 16, 2015*

*Posted by tomflesher in Baseball.*

Tags: BABIP, evergreen

The first stat we all learned about as kids was the batting average, where you calculate what proportion of at-bats end with getting a hit. Then, of course, we start thinking about why there are weird exceptions – why doesn’t getting hit by a pitch count? Why don’t walks count? Why doesn’t advancing to first on catcher’s interference count? OBP, or on-base percentage, fixes that. (Well, maybe not the catcher’s interference part…)

Batting average has some interesting properties, though. It captures events that have unpredictable outcomes – when you walk, it’s basically impossible to be put out on your way to first. Ditto being hit by a pitch. Of course, BA does have some of those determined outcomes, too – home runs and strikeouts don’t have much dynamic nature to them, although you’ll occasionally see brilliant defense save a sure homer (a la Carl Crawford’s MVP performance in the 2009 All-Star Game) or a sloppy catcher mishandle a third strike and forget to tag the batter. (I’m looking at you, Josh Paul.) Nonetheless, balls in play – balls that the batter makes contact with, forcing the defense to try to make a play – are a major source of variation in the game.

BAbip is measured as (H – HR)/(AB – K – HR + SF), meaning it takes the strikeouts and home runs out of the equation and (like all sane measures should!) includes sacrifice flies.
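As a quick illustration, here’s the standard definition, (H – HR)/(AB – K – HR + SF), as a small function. The stat line in the example is hypothetical, and the function and argument names are mine.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


# Hypothetical season line: 150 hits, 20 homers, 550 at-bats,
# 100 strikeouts, 5 sacrifice flies.
print(f"{babip(h=150, hr=20, ab=550, k=100, sf=5):.3f}")
```

Note that strikeouts and home runs come out of the denominator, while sacrifice flies go back in, since a sac fly is a ball in play that isn’t charged as an at-bat.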

Since the ball is out of the pitcher’s control as soon as it leaves his hand, BAbip measures things that the pitcher isn’t responsible for – that is, it’s handy as a measure of pitching luck, or, teamwide, as a measure of defensive effectiveness. The NL team BAbip average was .299, and AL average BAbip was about .298.

**Use Cases for BAbip:**

- **Evaluating hitting development.** If a batter has had a stable BAbip for a while and his BAbip increases significantly, be suspicious! Particularly if his walk rate hasn’t increased, his home run rate hasn’t increased, and his strikeout rate hasn’t decreased, this might be a function of lucky hitting against bad or inefficient defenses. If the biggest part of an increase in production has been on balls in play, your hitter may not have actually improved. On the other hand, if you can see physical changes, or you have an explanation (e.g., went to AAA to work on his swing), you may see a more balanced improvement in OBP.

- **Evaluating pitching luck.** Most of the time, all the pitchers for the same team pitch in front of the same defense. Even with a personal catcher in the mix, expect most pitchers on a team to have similar batting averages on balls in play. If you have one pitcher whose BAbip is much higher than the rest of the pitchers, he may be pitching against bad luck. With that in mind, you can expect that pitcher to improve going forward.

- **Comparing defenses.** In 2014, Oakland had a .274 BAbip and allowed 572 runs – the best in the American League in BAbip and 18 runs behind Seattle – while Minnesota had a .317 BAbip and allowed 777 runs, the worst in both categories in the league. Defensive efficiency (a measure of 1 – BAbip) tracks closely with runs allowed. BAbip can operate as a quick and dirty check on how well a defense is performing behind a pitcher.

## Spitballing: Pi Day
*March 14, 2015*

*Posted by tomflesher in Baseball.*

Tags: Pi Day

Happy Pi Day! In honor of Pi Day, I’d like to share a few leaderboards.

First, league median BAbip was .297. Here are five pitchers who got a little less lucky than the average, since their BAbip was .314:

Player | BAbip | Year | Tm |
---|---|---|---|
Randall Delgado | .314 | 2014 | ARI |
Erik Bedard | .314 | 2014 | TBR |
Aaron Barrett | .314 | 2014 | WSN |
Chad Qualls | .314 | 2014 | HOU |
Louis Coleman | .314 | 2014 | KCR |

Then, let’s follow up with the other side: the unlucky hitters who would only get on base 3.14 out of every 10 plate appearances:

Player | HR | OBP | Year | Tm | Pos |
---|---|---|---|---|---|
Alejandro De Aza | 8 | .314 | 2014 | TOT | *78/HD9 |
Michael Bourn | 3 | .314 | 2014 | CLE | *8/H |
Elvis Andrus | 2 | .314 | 2014 | TEX | *6/DH |
Jose Tabata | 0 | .314 | 2014 | PIT | *H97/8D |

In addition, let’s take a look at the two hitters from 2014 who got a hit 3.14 out of every 10 at-bats:

Player | BA | Year | Age | Tm |
---|---|---|---|---|
Robinson Cano | .314 | 2013 | 30 | NYY |
Andrew McCutchen | .314 | 2014 | 27 | PIT |
Robinson Cano | .314 | 2014 | 31 | SEA |

Yeah, Robbie Cano managed to hit Pi in 2013 AND 2014. The boy must love his geometry.

Our last mention: This year’s Pi Day mascot is Stephen Strasburg, who had an ERA of 3.14. The league-average ERA for pitchers who started 60% of their games was 3.86, so Stephen was in pretty good shape.

## Spring Training: Still Useless For Predicting Stats
*March 12, 2015*

*Posted by tomflesher in Baseball.*

Tags: Spring training

A few days ago, I watched a Mets-Marlins spring training game that ended in a brutal 13-2 loss. It had all of the usual spring training fun – Zack Wheeler working too far inside and hitting two batters, Michael Cuddyer starting at first with Lucas Duda out, and Don Kelly’s hustle allowing him to draw a walk, steal a base, and score on a single, even while Cliff Floyd was snickering about how Jim Leyland kept him on the roster for no apparent reason in the playoffs.

(Yeah, I know, Kelly’s a Marlin. Shut up.)

During the game, I tweeted out a link to a file-drawer post from last year that indicated that there’s almost no correlation between spring performance and regular-season performance. I thought I’d run a quick update on that, so I dug up the Mets’ individual performance in spring training and analyzed it compared to the regular season.

There were 15 Mets who had 30 plate appearances in Spring Training and 100 plate appearances in the regular season. That’s a really small sample, so accuracywise we’d better keep our fingers crossed, but it’s enough data to spitball a little.

I ran four correlations on this – spring and regular season batting average, OBP, SLG, and OPS – and then created an additional stat to measure whether hitters changed hitting style from spring to the regular season. This was a quick and dirty attempt to measure whether hitters favored OBP or SLG, so I took the ratio (SLG/OPS) and reasoned that a power hitter will have a larger ratio and a singles hitter will have a smaller one. I measured this correlation, too, to determine if there were big changes.
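A minimal sketch of the style stat and the correlation, assuming a plain Pearson correlation (the post doesn’t say which flavor was used). The OBP/SLG pairs below are invented, not the actual Mets numbers, and the function names are mine.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def style(obp, slg):
    """Share of OPS that comes from slugging: SLG / (OBP + SLG)."""
    return slg / (obp + slg)


# Invented (OBP, SLG) pairs, one per hitter, in the same order in both lists.
spring = [(0.400, 0.550), (0.310, 0.380), (0.350, 0.500), (0.290, 0.330)]
season = [(0.330, 0.470), (0.320, 0.360), (0.340, 0.430), (0.300, 0.350)]

spring_style = [style(obp, slg) for obp, slg in spring]
season_style = [style(obp, slg) for obp, slg in season]
print(round(pearson(spring_style, season_style), 3))
```

The same `pearson` call works for batting average, OBP, SLG, or OPS; only the input lists change.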

The results are unsurprising – the correlations are really low. Batting average correlates at around .019, and SLG at .305. OBP actually had a negative correlation of about -.25, indicating that a high spring OBP may be a bad sign for the regular season. This is probably sampling error: with so few observations, it’s due almost entirely to Anthony Recker’s magical .426 spring and average regular season. It also explains why OPS has a -.05 (near-zero) correlation – that big flip in OBP is going to offset the OPS correlation, too.

The strongest correlation was style – at about .619, it’s a pretty good indicator that if a hitter’s SLG is how he scores, he’ll maintain that hitting style throughout the season.

## What is OPS?
*January 12, 2015*

*Posted by tomflesher in Baseball.*

Tags: evergreen, OBP, OPS, SLG, statistics

Sabermetricians (which is what baseball stat-heads call ourselves to feel important) disregard batting average in favor of on-base percentage for a few reasons. The main one is that it really doesn’t matter to us whether a batter gets to first base through a gutsy drag bunt, an excuse-me grounder, a bloop single, a liner into the outfield, or a walk. In fact, we don’t even care if the batter got there through a judicious lean-in to take one for the team by accepting a hit-by-pitch. Batting average counts some of these trips to first, but not a base on balls or a hit batsman. It’s evident that plate discipline is a skill that results in higher returns for the team, and there’s a colorable argument that ability to be hit by a pitch is a skill. OBP is (H + BB + HBP)/(AB + BB + HBP + SF).

We also care a lot about how productive a batter is, and a productive batter is one who can clear the bases or advance without trouble. Sure, a plucky baserunner will swipe second base and score from second, or go first to third on a deep single. In an emergency, a light-hitting pitcher will just bunt him over. However, all of these involve an increased probability of an out, while a guy who can just hit a double, or a speedster who takes that double and turns it into a triple, will save his team a lot of trouble. Obviously, a guy who snags four bases by hitting a home run makes life a lot easier for his teammates. Slugging percentage measures how many bases, on average, a player is worth every time he steps up to the plate and doesn’t walk or get hit by a pitch. Slugging percentage is (1B + 2×2B + 3×3B + 4×HR)/AB. If a player hits a home run in every at-bat, he’ll have an OBP of 1.000 and a SLG of 4.000.

OPS is just On-Base Percentage plus Slugging Percentage. It doesn’t lend itself to a useful interpretation – OPS isn’t, for example, the average number of bases per hit, or anything useful like that. It does, however, provide a quick and dirty way to compare different sorts of hitters. A runner who moves quickly may have a low OBP but a high SLG due to his ability to leg out an extra base and turn a single into a double or a double into a triple. A slow-moving runner who can only move station to station but who walks reliably will have a low SLG (unless he’s a home-run hitter) but a high OBP. An OPS of 1.000 or more is a difficult measure to meet, but it’s a reliable indicator of quality.
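To make the arithmetic concrete, here’s a small sketch using the standard definitions: OBP is (H + BB + HBP)/(AB + BB + HBP + SF) and SLG is total bases over at-bats. The function and argument names are my own, and the sample line is the home-run-every-at-bat hitter from above.

```python
def obp(hits, bb, hbp, ab, sf):
    """On-base percentage: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + bb + hbp) / (ab + bb + hbp + sf)


def slg(singles, doubles, triples, hr, ab):
    """Slugging percentage: total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab


def ops(singles, doubles, triples, hr, bb, hbp, ab, sf):
    """On-base plus slugging."""
    hits = singles + doubles + triples + hr
    return obp(hits, bb, hbp, ab, sf) + slg(singles, doubles, triples, hr, ab)


# A hitter who homers in every one of 10 at-bats: OBP 1.000 + SLG 4.000.
print(ops(singles=0, doubles=0, triples=0, hr=10, bb=0, hbp=0, ab=10, sf=0))  # 5.0
```

Since the two pieces have different denominators, the sum is a convenience number rather than a rate of anything in particular.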

## The Hall of Fame Black Ink Test
*January 11, 2015*

*Posted by tomflesher in Baseball.*

Tags: Black Ink, evergreen, Hall of Fame

The Baseball Hall of Fame’s mission is “Preserving History, Honoring Excellence, Connecting Generations.” An important measure of the excellence honored in Cooperstown is called the Black Ink Test. “Black ink” refers to the boldface type used to show the league’s leader in an important category.

The categories used for the Black Ink Test are, of course, different for pitchers and batters, but they also vary depending on the importance of the stat. A batter who excels in hitting home runs is more valuable to a team than one who takes the most at-bats regardless of outcome. For batters, points are awarded as follows:

- One point for games, at-bats, or triples
- Two points for doubles, walks, or stolen bases
- Three points for runs scored, hits, or slugging percentage
- Four points for home runs, RBIs, or batting average

Pitchers receive:

- One point for appearances, starts, or shutouts
- Two points for complete games, lowest Walks/9, or lowest Hits/9
- Three points for innings pitched, saves, or win-loss percentage
- Four points for wins, ERA, or strikeouts

That means that there are 30 black-ink points per year for batters and 30 for pitchers. (Multiple black-ink points can be awarded; for example, this year, at least 10 pitchers started 34 games in the National League, each of whom earns 1 point.) However, while it’s conceivable that a single batter could monopolize most of the categories, it’s not likely that a pitcher could – appearances and saves will go to a reliever, while most of the categories will go to a starter.
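The point values above translate directly into a lookup table. This is just a sketch of the scoring: the category keys are labels I made up, and the example season is hypothetical.

```python
# Points per league-leading category, as listed above.
BATTER_POINTS = {
    "games": 1, "at_bats": 1, "triples": 1,
    "doubles": 2, "walks": 2, "stolen_bases": 2,
    "runs": 3, "hits": 3, "slugging": 3,
    "home_runs": 4, "rbi": 4, "batting_average": 4,
}
PITCHER_POINTS = {
    "appearances": 1, "starts": 1, "shutouts": 1,
    "complete_games": 2, "walks_per_9": 2, "hits_per_9": 2,
    "innings": 3, "saves": 3, "win_loss_pct": 3,
    "wins": 4, "era": 4, "strikeouts": 4,
}


def black_ink(categories_led, points):
    """Total black ink for one season, given the categories a player led."""
    return sum(points[category] for category in categories_led)


# Both tables add up to the 30 points per year mentioned above.
print(sum(BATTER_POINTS.values()), sum(PITCHER_POINTS.values()))  # 30 30

# Hypothetical season: a batter leads the league in homers, RBIs, and walks.
print(black_ink(["home_runs", "rbi", "walks"], BATTER_POINTS))  # 10
```

A career total is just this sum repeated over every season, which is why ties (like the starts example above) can hand out more than 30 points in a given year.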

Because black ink requires a player lead his league, it’s hard to come by – and when there are more teams in a league, even the best players may not lead the league. One notable example of the bias toward players from earlier eras is Ross Barnes, who was active for nine seasons from 1871 to 1881. (He didn’t play in 1878 or 1880.) Although Ross isn’t eligible for the Hall because he didn’t play ten seasons, he amassed an astonishing 60 points of black ink in the National Association by the age of 31. Since the National Association was only 9 teams, he competed against around 115 other batters for those points. During the 2014 season, the same 30 points of black ink were spread over 672 National League batters. Though Ross was truly an outstanding player, leading the league in nearly every category in 1873 and 1876, it was a lot easier to get those points then.

As of today, the batters with the most black ink not to be elected to the Hall of Fame are Barry Bonds (69), Pete Rose (68), and Alex Rodriguez (64). A-Rod and Rose, of course, aren’t eligible (A-Rod is still active). New Hall of Famer Craig Biggio had 17 and mediocre, forgettable middle-infielder Derek Jeter comes in at a whopping 10.

The pitchers with the most black ink not to be elected are Roger Clemens (100), Roy Halladay (48), Bucky Walters (48), and Justin Verlander (46). Verlander is still active and Halladay retired too recently to be elected, but Walters is truly a baffling case. New Hall of Famers this year were Randy Johnson (99), Pedro Martinez (58), and John Smoltz (34).

## The Spectrum Club: 2014 Edition
*January 1, 2015*

*Posted by tomflesher in Baseball.*

Tags: Spectrum Club

2013 and 2014 were unusually large Spectrum Clubs. The prestigious^{1} Spectrum Club consists of players who played as designated hitter and also pitched for their teams. Though there surely are a couple of people caught in this table who were primarily pitchers and just came in listed as a DH on the batting order, 2013 shows the largest Spectrum Club since the introduction of the designated hitter, with 2014 following closely behind. The list of all Spectrum Club members is here.

This year inducted nine brand-new members. Although Mitch Maier and Darnell McDonald repeated from 2010 to 2011, everyone this year was a first-time pitcher/DH. As usual, though, they were all primarily position players.

This year’s inductees are:

Rk | Player | HR | PA | Year | Age | OPS | Pos |
---|---|---|---|---|---|---|---|
1 | Adam Dunn | 22 | 511 | 2014 | 34 | .752 | *D3H/791 |
2 | Chris Gimenez | 0 | 128 | 2014 | 31 | .640 | *23/H1D5 |
3 | Steven Tolleson | 3 | 189 | 2014 | 30 | .679 | *4H5/9617D |
4 | J.P. Arencibia | 10 | 222 | 2014 | 28 | .608 | 32D/H1 |
5 | Mitch Moreland | 2 | 184 | 2014 | 28 | .644 | D3/H71 |
6 | Andrew Romine | 2 | 273 | 2014 | 28 | .554 | *64/H1D |
7 | Mike Carp | 0 | 149 | 2014 | 28 | .519 | *3H7/91D5 |
8 | Travis Snider | 13 | 359 | 2014 | 26 | .776 | 9H7/1D |
9 | Leury Garcia | 1 | 155 | 2014 | 23 | .399 | H584/6D971 |

Congratulations to this year’s inductees!

—–

^{1} Not a guarantee.

## BABIP as a Defensive Metric
*October 11, 2014*

*Posted by tomflesher in Baseball, Economics.*

Tags: BABIP, BJ Upton, models, statistics

I follow OOTP on Facebook, and this Reddit thread about editing the Braves to go 0-162 popped up the other day.

> I went into commissioner mode and basically ranked everyone’s stats to go 0-550 with 550 Ks (although when I went back, OOTP changed it to give them all a few hits and a couple of walks, etc.) I did not have to edit BJ Upton, as he was already programmed to do so.

One reply asked whether 1-BABIP is a valid defensive metric, and that got the wheels turning. (Note that for statistical purposes, summary statistics for 1-BABIP will be the same magnitude and the opposite sign as statistics for BABIP, so I went ahead and just used BABIP.)

For a quick check, I checked in at Baseball Reference to get the National League’s team-level statistics for the last 5 years, then correlated BABIP to runs allowed by the team. That correlation is .741 – that’s a pretty strong correlation. Similarly, the correlation between BABIP and team wins was about -.549. It’s a weaker and negative correlation, which is expected – negative because an added point of opposing team BABIP would mean more balls in play were falling in as hits, and weaker because it ignores the team’s offensive production entirely.

If BABIP accurately describes a team’s defensive power, then a statistical model that models team runs allowed as a function of fielding-independent pitching and pitching-independent fielding should explain a large proportion, but not all, of the runs allowed by a team, and thereby explain a significant but smaller proportion of the team’s wins.

I crunched two models to test this, each with the same functional form: Dependent Variable = a + b*FIP + c*BABIP. With Runs as the dependent variable, the R^{2} of the model was .8625; with Wins as the dependent variable, the R^{2} was .5246. Since R^{2} roughly describes the percent of variation explained by the model, this makes a lot of sense. In the Runs model, about 14% of runs come due to something other than home runs, walks, or hits, such as baserunning and errors; in the Wins model, about 47% of team wins are explained by something other than defense and pitching. (Say…. offense? That’s crazy.) In both models, the coefficients are statistically significant at the 99% level.
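For the curious, here’s a bare-bones sketch of fitting a model of the form Dependent Variable = a + b*FIP + c*BABIP by ordinary least squares. I’m assuming plain OLS via the normal equations here; this is not the code behind the numbers above, and the team rows are invented rather than real National League data.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit(fip, babip, y):
    """Return ([a, b, c], R^2) for y = a + b*FIP + c*BABIP."""
    rows = [(1.0, f, b) for f, b in zip(fip, babip)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    coefs = solve3(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return coefs, 1 - ss_res / ss_tot


# Invented team-seasons: FIP, BABIP allowed, and runs allowed.
fip = [3.5, 3.9, 4.2, 3.7, 4.0, 3.6]
babip = [0.285, 0.300, 0.310, 0.295, 0.290, 0.305]
runs = [600, 690, 750, 650, 680, 660]

coefs, r_squared = fit(fip, babip, runs)
print([round(c, 2) for c in coefs], round(r_squared, 4))
```

Swapping `runs` for a list of win totals gives the second model; the functional form stays the same.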

BABIP’s coefficient in the Runs model is 3444.44, which means that, holding FIP fixed, a batting average on balls in play of 1.000 would correspond to about 3444 more runs allowed over a season than a BABIP of zero; more realistically, if BABIP increases by .01, that would translate to about 34 more runs allowed per season. Its coefficient in the Wins model is -328.757, meaning that an increase of .01 in BABIP corresponds to about 3.29 extra losses. That’s surprisingly close to the 10 runs-1 win ratio that Bill James uses as a rule of thumb.
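The runs-per-win comparison is just the ratio of the two coefficients. A tiny worked version, using only the numbers quoted above (the variable names are mine):

```python
runs_coef = 3444.44    # BABIP coefficient in the Runs model
wins_coef = -328.757   # BABIP coefficient in the Wins model
delta = 0.01           # a ten-point bump in BABIP allowed

extra_runs = runs_coef * delta    # additional runs allowed per season
extra_wins = wins_coef * delta    # change in wins (negative means losses)
runs_per_win = extra_runs / -extra_wins

print(round(extra_runs, 1), round(extra_wins, 2), round(runs_per_win, 1))
```

That works out to roughly 10.5 runs allowed per lost win, which is where the comparison to the 10-runs-per-win rule of thumb comes from.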

Since the correlations were strong, this bears a closer look at game-level rather than simply team-level data.

## Mets Run Support by Starting Pitcher
*August 1, 2014*

*Posted by tomflesher in Baseball.*

Tags: Jacob deGrom, Mets, pitching, run support, Zack Wheeler

Yesterday’s post discussed distributional wins and losses based on the Mets’ inconsistent bunching of runs together. Since the boys didn’t play last night, I had a pretty stable dataset to work with, and the opportunity to crunch some numbers to see if the hypothesis that we’re working with is true. In addition, I took a look at each of our current starting rotation’s run support numbers and found some surprising things.

First of all, no pitcher’s run support was statistically significantly different from any other’s. Although Dillon Gee’s run support is .77 lower than the average pitcher’s, for example, the p-value is .44, meaning that even if Gee had no real effect on run support, we’d see a gap at least that large about 44% of the time. Jacob deGrom has a similar number – .796 runs below the average, but a .42 p-value. The only pitcher with a positive effect on run support is Bartolo Colon, but his p-value is a whopping .72, so his number is entirely consistent with a statistical artifact.

The runs allowed are a bit more stable – deGrom allows 1.18 runs fewer than average with a .2 p-value – but Gee, Jonathon Niese, Colon, and Zack Wheeler all have statistically 0 effect on runs allowed. Their ps are, respectively, .91, .84, .64, and .79. Basically, this means that an effect would have to be really big to show up in such a small sample size; not even all 108 games are covered in the sample.

Another way of tracking pitcher run support is to track team wins and losses in the games started by those pitchers and compare it to the team’s Pythagorean expectation in those games. This is a bit more revealing; for example, the Mets are 6-8 in starts by deGrom, but would have a Pythagorean expectation of about .568, or about 8-6, in those games. Wheeler also ends up with a Pythagorean expectation better than his record, predicting the Mets would have won 11 rather than 10 of his 22 games. The other pitchers are more or less in line with their expectations, although, like Zack, the pitchers don’t always get credit for the wins they pitched in.
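Here’s a small sketch of turning a Pythagorean expectation into an expected record. The post doesn’t say which exponent was used, so the default of 2 below is an assumption, and the run totals in the second example are hypothetical. The function names are mine.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage; the exponent of 2 is an assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_record(win_pct, games):
    """Round an expected winning percentage to a whole-number record."""
    wins = round(win_pct * games)
    return wins, games - wins


# deGrom's starts, using the .568 expectation over 14 games quoted above.
print(expected_record(0.568, 14))  # (8, 6)

# Hypothetical run totals for a pitcher's 14 starts.
pct = pythagorean_pct(runs_scored=60, runs_allowed=52)
print(round(pct, 3), expected_record(pct, 14))
```

Comparing that expected record with the actual one gives a rough read on how much of a pitcher’s team record is down to the timing of run support.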

Behind the cut is the table of regression results for a linear model with a dummy variable for each pitcher’s starts, plus a totally useless Away game dummy to look for home field advantage. (Surprise: There is none for the Mets, but all pitchers do allow roughly .74 more runs on the road than at home.)