## What does a long game do to teams?

*April 13, 2015*

*Posted by tomflesher in Baseball.*

Tags: extra innings, file drawer, linear model, linear regression, Red Sox, Yankees


Friday, the Red Sox took a 19-inning contest from the Yankees. Both teams had the misfortune of finishing a game around 2:15 AM and having to be back on the field at 1:05 PM. Everyone, including the announcers, discussed how tired the teams would be; in particular, first baseman **Mark Teixeira** spent a long night on the bag to keep backup first baseman and apparent emergency pitcher **Garrett Jones** fresh, leading **Alex Rodriguez** to make his first career appearance at first base on Saturday.

Teixeira wasn’t the only player to sit out the next day – center fielders **Jacoby Ellsbury** and **Mookie Betts**, catchers **Brian McCann** and **Sandy Leon**, and most of the bullpen all sat out, among others. The Yankees called up pitcher **Matt Tracy** for a cup of coffee and sent **Chasen Shreve** down, then swapped Tracy back down to Scranton for **Kyle Davies**. Boston activated starter **Joe Kelly** from the disabled list, sending winning pitcher **Steven Wright** down to make room. Shreve and Wright each had solid outings, with Wright pitching five innings with 2 runs and Shreve pitching 3 1/3 scoreless.

All those moves provide some explanation for a surprising result. Interested in what the effect of these long games is, I dug up all of the games from 2014 that lasted 14 innings or more. In a quick and dirty data set, I recorded each team's score in its next game along with the number of outs pitched and the length in minutes of the long game.

I fitted four models: one linear model and one log model for each of two dependent variables. The dependent variables were the next game’s runs and the difference in runs (next game’s runs – long game’s runs). Each model used the length of the game in minutes, the number of outs, the average runs scored by the team during 2014, and an indicator variable for the presence of a designated hitter. For each dependent variable, I entered all variables in linear form once, and replaced outs and game length with their natural logs once.
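Written out, the specifications look roughly like the following. The symbols are my own shorthand rather than anything from the original fit: $R^{\text{next}}_i$ is the runs a team scored in its next game, $R^{\text{long}}_i$ the runs it scored in the long game, and $\text{DH}_i$ the designated hitter indicator.

$$
\begin{aligned}
y_i &= \beta_0 + \beta_1\,\text{Minutes}_i + \beta_2\,\text{Outs}_i + \beta_3\,\text{AvgRuns}_i + \beta_4\,\text{DH}_i + \varepsilon_i && \text{(linear)} \\
y_i &= \gamma_0 + \gamma_1 \ln(\text{Minutes}_i) + \gamma_2 \ln(\text{Outs}_i) + \gamma_3\,\text{AvgRuns}_i + \gamma_4\,\text{DH}_i + u_i && \text{(semilog)}
\end{aligned}
$$

Each form is fitted twice, once with $y_i = R^{\text{next}}_i$ and once with $y_i = R^{\text{next}}_i - R^{\text{long}}_i$.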

With runs scored as the dependent variable, nothing was significant. That is, no variable correlated strongly with an increase or decrease in the number of runs scored.

With the run difference as the dependent variable, the length of the game in minutes became marginally significant. In the linear model, extending the game by one minute lowers the difference in runs by about .043; that is, normalizing for the number of runs scored the previous day, each extra minute lowers the runs the next day by about .043. In the semilog model, the coefficient on the log of game length was about -14, so extending the game by about 1% lowers the run difference by about .14; this was offset by an extremely high intercept term. This is a very high semielasticity, and both coefficients had p-values between .01 and .015. Nothing else was even close.
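To make the two coefficients concrete, here is a small sketch of the arithmetic. It assumes the figure of about 14 is the coefficient on the natural log of game length, which is my reading and not something the fitted output confirms; the function names are hypothetical.

```python
import math

# Coefficient on minutes in the linear run-difference model (from the post).
BETA_MINUTES_LINEAR = -0.043
# ASSUMPTION: about -14 is the coefficient on ln(minutes) in the semilog model.
BETA_LOG_MINUTES = -14.0


def linear_effect(extra_minutes):
    """Change in run difference implied by the linear model."""
    return BETA_MINUTES_LINEAR * extra_minutes


def semilog_effect(pct_increase):
    """Change in run difference when game length grows by pct_increase percent.

    In a model of the form y = a + b * ln(x), raising x by p percent
    changes y by b * ln(1 + p / 100), roughly b * p / 100 for small p.
    """
    return BETA_LOG_MINUTES * math.log(1.0 + pct_increase / 100.0)


print(round(linear_effect(30), 2))   # 30 extra minutes -> -1.29 runs
print(round(semilog_effect(1), 3))   # 1% longer game -> -0.139 runs
```

Under that reading, both models tell a similar story: a long game that runs an extra half hour costs a team a bit more than one run relative to its long-game output.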

With all of the usual caveats about statistical analysis, this suggests that teams are actually pretty good at bouncing back from long games, either because most of the time they’re playing the same team (so both teams are equally fatigued) or because of smart roster moves. Either way, it’s a surprise.
