
The Giants Are Playing to Win (SF Game 156 Recap) September 29, 2015

Posted by tomflesher in Baseball, Sports.

With the Mets idle on Monday, the focus turned to West Coast baseball and our likely opponents in the Division Series. Entering the evening, the Mets sat at 89-67 and the Dodgers at 87-69. With 6 games remaining, the Dodgers would need to go at least 3-3 with the Mets losing every game in order to keep home field advantage in the Division Series.

Photo: SD Dirk


San Francisco denied the Dodgers an opportunity to clinch the National League West last night. Although the Dodgers’ magic number is 2, the Giants are the second-place team, so any Dodgers win in this series is simultaneously a Giants loss and would clinch the division on its own.

Despite a Greinkish start by Zack Greinke (7 innings, 4 hits, 2 runs, 3 walks and 7 Ks for a game score of 65), the Dodgers couldn’t get the job done. Greinke left with a 2-1 deficit, and Chris Hatcher, Juan Nicasio, Luis Avilan, Pedro Baez, and J.P. Howell combined for four scoreless innings of relief. Andre Ethier grounded out off of Santiago Casilla to bring Corey Seager home and tie the game in the top of the 9th. In the bottom of the 12th, Dodgers reliever Yimi Garcia allowed hits to Marlon Byrd and Kelby Tomlinson, whose single moved Byrd from first to third. Don Mattingly lifted Garcia for left-hander Adam Liberatore to face lefty pinch hitter Alejandro De Aza, who promptly hit a sacrifice fly to bring Byrd home for the win. Garcia took the loss; Hunter Strickland was the pitcher of record for the Giants.

Alejandro De Aza started the year with Baltimore. He was designated for assignment in May after hitting .214 in 112 plate appearances, then traded to Boston for cash and a prospect. After De Aza hit .292 in 178 plate appearances, Boston flipped him to San Francisco for a minor-league pitcher; the Giants needed his left-handed bat off the bench.

Today’s game will pit Madison Bumgarner against Clayton Kershaw. Despite Bumgarner’s vaunted bat, he’s 2 for 12 against Kershaw, although one of those hits is a home run; Kershaw is 3 for 12 against Bumgarner. Current Dodgers hit .199/.242/.294 against Bumgarner, and current Giants hit .191/.229/.244 against Kershaw. Though both pitchers are consistently good, Bumgarner’s numbers tend to be more thinly spread – beyond Scott Van Slyke’s shocking .483 (9 for 24) against Bumgarner, no one else with at least 10 plate appearances has gotten on base at a greater than .273 clip (Justin Turner and A.J. Ellis, both with 33 PAs).

Although a significant amount of Bumgarner’s variance is due to Van Slyke’s surprising success against him, this battle of the pitching titans is difficult to predict. A desire to win the division will likely propel Kershaw to the win, but I’ll be rooting for Madison.


The Giants are nothing if not consistent. May 6, 2014

Posted by tomflesher in Baseball.

May 5, 2014: Giants 11, Pirates 10, in 13 innings. Winning pitcher: Jean Machi. Save: Sergio Romo.

April 23, 2014: Giants 12, Rockies 10, in 11 innings. Winning pitcher: Jean Machi. Game finished: Sergio Romo.

Home Field Advantage Again July 12, 2011

Posted by tomflesher in Baseball, Economics.

In an earlier post, I discussed the San Francisco Giants’ vaunted home field advantage and came to the conclusion that, while a home field advantage exists, it’s not related to the Giants scoring more runs at home than on the road. That was done with about 90 games’ worth of data. In order to come up with a more robust measure of home field advantage, I grabbed game-by-game data for the National League from the first half of the 2011 season and crunched some numbers.

I have three questions:

  • Is there a statistically significant increase in winning probability while playing at home?
  • Is that effect statistically distinct from any effect due to attendance?
  • If it exists, does that effect differ from team to team? (I’ll attack this in a future post.)

Methodology: Using data with, among other things, per-game run totals, win-loss data, and attendance, I’ll run three regressions. The first will be a linear probability model of the form

\hat{p(W)} = \beta_0 + \delta_{H} + \beta_1 Att + \beta_2 Att^2 + \beta_3 AttH + \beta_4 AttH^2

where \delta_{H} is a binary variable for playing at home, Att is announced attendance at the game, and AttH is announced attendance if the team is at home and 0 if the team is on the road. Thus, I expect \beta_1 < 0, \beta_3 > 0, |\beta_3| > |\beta_1| so that a team on the road suffers from a larger crowd but a team at home reaps a larger benefit from a larger crowd. The linear probability model is easy to interpret, but not very rigorous and subject to some problems.
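
To make the variable definitions concrete, here is a minimal Python sketch of how the regressors for one team-game could be built. The function name `design_row` and the 40,000 attendance figure are hypothetical and only illustrate the definitions above; they are not taken from the original data.

```python
def design_row(att, home):
    """Return [1, Home, Att, Att^2, AttH, AttH^2] for one team-game.

    att  -- announced attendance at the game
    home -- True if the team is playing at home
    """
    h = 1 if home else 0
    att_h = att * h  # attendance is counted here only for the home team
    return [1, h, att, att ** 2, att_h, att_h ** 2]


# Hypothetical game with 40,000 announced attendance, seen from each side.
home_row = design_row(40_000, home=True)
road_row = design_row(40_000, home=False)
print(home_row)
print(road_row)
```

The same game therefore contributes two rows: the road team's row has zeros in the AttH columns, so \beta_1 and \beta_2 describe the road team's response to the crowd and \beta_3 and \beta_4 describe the extra response for the home team.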

As such, I’ll also run a Probit model of the same equation to avoid problems caused by the simplicity of the linear probability model.

Finally, just as a sanity check, I’ll run the same regression for runs instead of win probability. Since runs aren’t binary, I’ll use ordinary least squares, and I’ll also control for the possibility that games played in American League parks lead to higher run totals by adding a binary variable \delta_{DH} for games played with the designated hitter:

\hat{R} = \beta_0 + \delta_{H} + \delta_{DH} + \beta_1 Att + \beta_2 Att^2 + \beta_3 AttH + \beta_4 AttH^2

Since runs are a factor in winning, I have the same expectations about the signs of the beta values as above.


Regression 1 (Linear Probability Model):

\begin{tabular}{|l||c|c|c|}  \textbf{Variable}&\textbf{Estimate}&\textbf{SE}&\textbf{t}\\ \hline  Intercept&.3443 &.125&2.754\\  Home&.3549&.1791&1.981\\  Att&1.589e-05&9.014e-06&1.773\\  Att\textsuperscript{2} &-3.509e-10&1.519e-10&-2.31\\  AttH&-3.392e-05&1.285e-05&-2.639\\  AttH\textsuperscript{2}&7.086e-10&2.158e-10&3.284\\  \end{tabular}

So, my prediction about the attendance betas was incorrect, but only because I failed to account for the squared terms. The effect from home attendance increases as we approach full attendance; the effect from road attendance decreases at about the same rate. There’s still a net positive effect.
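
As a rough illustration of how the squared terms change the picture, the following Python sketch plugs the Regression 1 point estimates into the model and compares fitted win probabilities at home and on the road. The function names and the three crowd sizes are assumptions chosen for illustration, and the point estimates carry sizable standard errors, so the shape matters more than any single value.

```python
# Regression 1 point estimates, copied from the table above.
B0, HOME = 0.3443, 0.3549
B_ATT, B_ATT2 = 1.589e-05, -3.509e-10
B_ATTH, B_ATTH2 = -3.392e-05, 7.086e-10


def lpm_win_prob(att, home):
    """Fitted win probability from the linear probability model."""
    h = 1 if home else 0
    return (B0 + HOME * h
            + B_ATT * att + B_ATT2 * att ** 2
            + h * (B_ATTH * att + B_ATTH2 * att ** 2))


def home_edge(att):
    """Fitted home-minus-road gap in win probability at a given attendance."""
    return lpm_win_prob(att, True) - lpm_win_prob(att, False)


for att in (10_000, 25_000, 40_000):  # illustrative crowd sizes
    print(att,
          round(lpm_win_prob(att, True), 3),
          round(lpm_win_prob(att, False), 3),
          round(home_edge(att), 3))
```

Because the AttH coefficient is negative and the AttH² coefficient is positive, the fitted home edge is U-shaped in attendance: it is smallest for mid-sized crowds and grows again as the park fills.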

Regression 2 (Probit Model):

\begin{tabular}{|l||c|c|c|}  \textbf{Variable}&\textbf{Estimate}&\textbf{SE}&\textbf{t}\\ \hline  Intercept&-.4090&.322&-1.27\\  Home&.9239&.4623&1.998\\  Att&4.177e-05&2.335e-05&1.789\\  Att\textsuperscript{2} &-9.141e-10&3.995e-10&-2.312\\  AttH&-8.808e-05&3.332e-05&-2.643\\  AttH\textsuperscript{2}&1.836e-09&5.615e-10&3.271\\  \end{tabular}

Note that in both cases, there’s a statistically significant \delta_{H}, meaning that teams are more likely to win at home, and that for large values of attendance, the Home effect outweighs the attendance effect entirely. That indicates that the attendance effect is probably spurious.
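
Unlike the linear probability model, probit coefficients are not probabilities; they feed an index that is passed through the standard normal CDF. The Python sketch below shows that conversion using the Regression 2 estimates, taking the intercept as -.4090 and the AttH estimate as -8.808e-05 (the readings consistent with the listed standard errors and t statistics). The function names and the 40,000 attendance figure are assumptions for illustration.

```python
from math import erf, sqrt

# Regression 2 point estimates (intercept and AttH read as described above).
B0, HOME = -0.4090, 0.9239
B_ATT, B_ATT2 = 4.177e-05, -9.141e-10
B_ATTH, B_ATTH2 = -8.808e-05, 1.836e-09


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def probit_win_prob(att, home):
    """Fitted win probability: normal CDF of the linear index."""
    h = 1 if home else 0
    index = (B0 + HOME * h
             + B_ATT * att + B_ATT2 * att ** 2
             + h * (B_ATTH * att + B_ATTH2 * att ** 2))
    return norm_cdf(index)


att = 40_000  # illustrative crowd size
print(round(probit_win_prob(att, True), 3),
      round(probit_win_prob(att, False), 3))
```

At this crowd size the fitted probabilities land close to those implied by the linear probability model, which is what one would hope for if the two models are telling the same story.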

Finally, the regression on runs:

Regression 3 (Predicted Runs):

\begin{tabular}{|l||c|c|c|}  \textbf{Variable}&\textbf{Estimate}&\textbf{SE}&\textbf{t}\\ \hline  Intercept&2.486 &.7197&3.454\\  Home&2.026&1.031&1.964\\  DH&.0066&.2781&.024\\  Att&1.412e-04&5.19e-05&2.72\\  Att\textsuperscript{2} &-2.591e-09&8.742e-10&-2.964\\  AttH&-1.7032e-04&7.4e-05&-2.301\\  AttH\textsuperscript{2}&3.035e-09&1.242e-09&2.443\\  \end{tabular}

Again, with runs, there is a statistically significant effect from being at home, and a variety of possible attendance effects. For low attendance values, the Home effect is probably swamped by the negative attendance effect, but for high attendance games, the Home effect probably outweighs the attendance effect or the attendance effect becomes positive.
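
One way to see where the Home and home-attendance terms trade off is to solve for the attendance levels at which the fitted home-minus-road difference in runs is zero. The Python sketch below does this with the quadratic formula, using the Regression 3 point estimates; the names are hypothetical, and since the estimates are noisy the crossing points should be read as rough guides only.

```python
from math import sqrt

# Regression 3 point estimates for the terms that apply only to the home team.
HOME, B_ATTH, B_ATTH2 = 2.026, -1.7032e-04, 3.035e-09


def home_run_effect(att):
    """Fitted home-minus-road difference in runs at a given attendance."""
    return HOME + B_ATTH * att + B_ATTH2 * att ** 2


# Attendance levels where the fitted difference crosses zero.
disc = B_ATTH ** 2 - 4 * B_ATTH2 * HOME
crossings = sorted((-B_ATTH + sign * sqrt(disc)) / (2 * B_ATTH2)
                   for sign in (-1, 1))

print([round(c) for c in crossings])
for att in (10_000, 30_000, 45_000):  # illustrative crowd sizes
    print(att, round(home_run_effect(att), 2))
```

On these point estimates the crossings fall at roughly 17,000 and 39,000, with the fitted difference negative between them and positive outside them.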

Again, the Home effect is statistically significant no matter which model we use, so at least in the National League, there is a noticeable home field advantage.

Home Field Advantage July 9, 2011

Posted by tomflesher in Baseball, Economics.


The Mets unfortunately played a 10 PM game in San Francisco last night, so I’m short on sleep today. I do remember, though, that Gary Cohen mentioned, repeatedly, the Giants’ significant home field advantage. Even after last night’s loss at the hands of Carlos Beltran (coming from a rare blown save by Brian Wilson), the Giants have a .619 winning percentage at home (26-16) versus a .500 winning percentage on the road (24-24). Interestingly, their run differential is much worse at home – they’ve scored 205 and allowed 184 on the road for a total differential of +21, but their run differential at home is actually negative. They’ve scored 120 but allowed 135 for a differential of -15.

Some of that is due to the way walk-offs are scored – they end an inning immediately, so a scoring inning at home is cut short when the same inning on the road would continue and might lead to further scoring – but it’s still quite shocking to see that large a split. So far, the Giants have only scored 11 walk-off RBIs, compared with only 7 RBIs in the 9th inning on the road that came with the Giants ahead. So, even adding in an extra few runs wouldn’t account for the difference.

Last year, there wasn’t much of a home field effect at all. Running a very simple linear regression of runs scored against dummy variables for playing at home and playing with a DH, I estimated that

R_{2010} = 4.17 + .02 Home + 1.47 DH

and only the intercept term, which represents (essentially) the unconditional average number of runs the Giants score, was significant.

For this year, the numbers are quite different.

R_{2011} = 4.24 - 1.38 Home + .26 DH

with both the intercept and Home terms significant at the 95% level. It’s clear that the Giants are winning more at home, but it’s not because they’re scoring more at home.
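
As a quick cross-check on the 2011 equation, a regression on dummy variables is essentially comparing group averages, so the Home coefficient should sit close to the difference between runs per game at home and on the road. The Python sketch below computes that difference from the season totals quoted earlier in this post; it ignores the DH term, so it is only an approximation of the fitted regression, and the variable names are my own.

```python
# Season-to-date totals quoted above: records and runs scored.
road_games, road_runs = 24 + 24, 205   # 24-24 on the road
home_games, home_runs = 26 + 16, 120   # 26-16 at home

road_rpg = road_runs / road_games      # average runs per road game
home_rpg = home_runs / home_games      # average runs per home game
home_gap = home_rpg - road_rpg         # what a Home-only dummy would estimate

print(round(road_rpg, 2), round(home_rpg, 2), round(home_gap, 2))
```

The road average lines up with the 4.24 intercept and the gap with the -1.38 Home coefficient, with the small differences plausibly coming from the DH control.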

Fire Up The Hot Stove November 2, 2010

Posted by tomflesher in Baseball.

Although I’m usually fairly heavy on the statistical content, I can’t help but mention a few impressions from Game 5 of the World Series last night.

  • If I didn’t have Baseball-Reference.com to tell me different, I’d have assumed Aubrey Huff wasn’t an everyday first baseman from the way he played last night. He was competent and made some nice picks, but he didn’t seem to have the ankle-preservation instinct that most everyday 1Bs do. He seemed to have his heels back quite far on the bag most of the time.
  • The rumors about the Yankees pursuing Cliff Lee strike me as cartoonish supervillainy. “If I cannot defeat you, I will simply BUY you!”
  • Game 3 was the Lee vs. Tim Lincecum gem that we all assumed Game 1 would be.
  • Somewhere, Bengie Molina is secretly pouring champagne all over himself.
  • If the postseason came before voting, Buster Posey would be a lock for Rookie of the