Spitballing: Position changes June 3, 2011

Posted by tomflesher in Baseball.

First things first: this entry was prompted by Buster Posey and his horrific ankle injury, but it’s not just about him. The first time I started thinking about it seriously was last year, when the Mets’ Carlos Beltran was about to come off the DL and Angel Pagan’s placement was in doubt. Either Gary, Keith, or Ron tripped my “Stuff Keith Hernandez Says” meter by saying that fans had suggested moving Pagan to second base to fill in for the ailing Luis Castillo, and commented that “You can’t just move a guy to second base.” Very true.

Similarly, it’s very hard to “just move a guy” to catcher, which is why a guy like Buster Posey is so valuable. In the National League, the median OPS+ for players with at least 100 plate appearances and who played more than half their games at catcher was 91. Posey’s OPS+ was 129 – that’s over 40% better. If instead you look at first basemen with at least 100 plate appearances, the median OPS+ is 107. All of a sudden, Posey’s offensive value-added drops to about 20% above average, and that’s before accounting for regression to the mean. Moving him to third base instead mitigates the damage and takes full advantage of his arm, but he’s suddenly a much less special player when he’s on the hot corner instead of behind the plate.
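The percentages above are just ratios of OPS+ figures. As a quick check, here is a minimal Python sketch; the helper name `pct_above` is made up for illustration, and the inputs are the numbers quoted in the paragraph.

```python
def pct_above(player_ops_plus, baseline_ops_plus):
    """Percentage by which a player's OPS+ exceeds a baseline OPS+."""
    return (player_ops_plus / baseline_ops_plus - 1) * 100


# Posey's 129 OPS+ against the median NL catcher (91)
print(round(pct_above(129, 91), 1))   # 41.8
# The same 129 OPS+ against the median NL first baseman (107)
print(round(pct_above(129, 107), 1))  # 20.6
```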

It’s also maddening to hear about efforts to move Derek Jeter to center field. Even though he’s on the downswing, he hit well above average every year from 1996 through 2009. Even last year, his 91 OPS+ was acceptable, especially considering his popularity. Granted, he costs his team runs on defense (he’s rarely had a positive defensive Wins Above Replacement), but his offensive contribution more than makes up for it. He’s 6’3″, making him more than big enough to move to first base, and first base doesn’t require him to have the range that center field would. After Jorge Posada hangs it up, splitting the duties at first base and DH between Jeter and Alex Rodriguez will start to make more sense, and using homegrown prospects to take over at shortstop and third base ensures continuing fan loyalty.

Finally, I’d be remiss if I didn’t mention future Dodgers closer Kenley Jansen. Although his 2.000 OPS last year grossly overstated his batting ability (only two plate appearances, compared with a lifetime .229 batting average in the minors), Jansen is a success story in his move from catcher to fireballing reliever. That was an excellent move by the Dodgers system – they took Jansen’s innate ability (his cannon-like arm) and moved him to a position where his contribution would be optimized. Whether or not Jansen turns out to be a future dominant closer, he’s probably gotten more playing time as a reliever than he ever would have as a catcher, and he’s generated more value for the Dodgers.

Basically, player moves are difficult. It’s important to try to optimize a player’s contribution, and that needs to take into account his defensive talents instead of merely trying to find a place for him to play. I can only hope Buster Posey’s recuperation goes smoothly and there’s a value-maximizing slot for him with the Giants.

Did Run Production Change in 2010? June 2, 2011

Posted by tomflesher in Baseball, Economics.

Part of the narrative of last year’s season was the compelling “Year of the Pitcher” storyline prompted by an unusual number of no-hitters and perfect games. Though it’s too early in the season to say the same thing is happening this year, a few bloggers have suggested that run production is down in 2011 and we might see the same sort of story starting again.

As a quick and dirty check of this, I’d like to compare production in the 2000-2009 sample I used in a previous post to production in 2010. This will introduce a few problems, notably that using one year’s worth of data for run production will lead to possibly spurious results for the 2010 data and that the success of the pitchers may be a result of the strategy used to generate runs. That is, if pitchers get better, and strategy doesn’t change, then we see pitchers taking advantage of inefficiencies in strategy. If batting strategy stays the same and pitchers take advantage of bad batting, then we should see a change in the structure of run production since the areas worked over by hitters – for example, walks and strikeouts – will see shifts in their relative importance in scoring runs.

Hypothesis: A regression model of runs against hits, doubles, triples, home runs, stolen bases, times caught stealing, walks, times hit by pitch, sacrifice bunts, and sacrifice flies using two datasets, one with team-level season-long data for each year from 2000 to 2009 and the other from 2010 only, will yield statistically similar beta coefficients.

Method: Chow test.

Result: There is a difference, significant at the 90% but not 95% level. That might be a result of a change in strategy or of pitchers exploiting strategic inefficiencies.

R code behind the cut.

One-Third of an Inning Pitched, 6 or More Earned Runs June 1, 2011

Posted by tomflesher in Baseball.

Carlos Marmol came in last night to close a fine performance by Carlos Zambrano, who had pitched 8 innings and allowed one earned run on 7 hits, no walks, and 7 strikeouts for a game score of 71. (Zambrano went 0-2, dropping his batting average to a paltry .346.) Marmol had allowed 3 runs in 23 innings pitched prior to last night, with 10 saves, two blown saves, and a record of 1-1.

Then came last night.

On one third of an inning pitched, facing the 6-7-8 part of the Astros’ lineup, Marmol first allowed Brett Wallace to single, followed by Chris Johnson doubling and sending Wallace to third. Matt Downs hit for catcher Robinson Cancel and doubled, sending both Wallace and Johnson home. (Two earned runs.)

At this point, I’d have been willing to let pitcher Fernando Rodriguez hit for himself, but Angel Sanchez came in and sacrifice bunted Downs to third base. Credit Marmol with one-third of an inning pitched. Michael Bourn singled to bring Downs home from third (three earned runs), then stole second to put the winning run in scoring position. Clint Barmes walked, followed by Hunter Pence homering (six earned runs). Mercifully, Sean Marshall came in to finish off the inning, allowing one more single but getting the two outs to end the inning.

It’s surprisingly common to have at least 6 earned runs in one-third or less of an inning pitched. Ryan Dempster even managed to allow seven earned runs in .1 IP to start the game and his team bravely held on for the loss, and Jason Marquis once allowed seven earned runs in NO innings pitched (although in Marquis’ defense he left the bases loaded and Miguel Batista allowed all three inherited runners to score).

So, buck up, Marmol, and buy Mr. Zambrano a steak dinner.

Is scoring different in the AL and the NL? May 31, 2011

Posted by tomflesher in Baseball, Economics.

The American League and the National League have one important difference. Specifically, the AL allows the use of a player known as the Designated Hitter, who does not play a position in the field, hits every time the pitcher would bat, and cannot be moved to a defensive position without forfeiting the right to use the DH. As a result, there are a couple of notable differences between the AL and the NL – in theory, there should be slightly more home runs and slightly fewer sacrifice bunts in the AL, since pitchers have to bat in the NL and they tend to be pretty poor hitters. How much can we quantify that difference? To answer that question, I decided to sample a ten-year period (2000 until 2009) from each league and run a linear regression of the form

\hat{R} = \beta_0 + \beta_1 H + \beta_2 2B + \beta_3 3B + \beta_4 HR + \beta_5 SB + \beta_6 CS + \\    \beta_7 BB + \beta_8 K + \beta_9 HBP + \beta_{10} Bunt + \beta_{11} SF

Here, runs are presumed to be a function of hits, doubles, triples, home runs, stolen bases, times caught stealing, walks, strikeouts, times hit by pitch, sacrifice bunts, and sacrifice flies. My expectations are:

  • The sacrifice bunt coefficient should be smaller in the NL than in the AL – in the American League, bunting is used strategically, whereas NL teams are more likely to bunt whenever a pitcher appears, so in any randomly-chosen string of plate appearances, the chance that a bunt is the optimal strategy given an average hitter is much lower. (That is, pitchers bunt a lot, even when a normal hitter would swing away.) A smaller coefficient means each bunt produces fewer runs, on average.
  • The strategy from league to league should be different, as measured by different coefficients for different factors from league to league. That is, the designated hitter rule causes different strategies to be used. I’ll use a technique called the Chow test to test that. That means I’ll run the linear model on all of MLB, then separately on the AL and the NL, and look at the size of the errors generated.
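The Chow test procedure in the second bullet can be sketched in a few lines of Python. This is not the R code from the post. It is a minimal illustration that assumes a single regressor (so k = 2 parameters, intercept and slope) rather than the eleven used here, and the function names and the toy data are invented. The statistic compares the pooled sum of squared residuals with the sum from fitting each group separately: F = ((S_pooled - (S_1 + S_2)) / k) / ((S_1 + S_2) / (N_1 + N_2 - 2k)).

```python
def ssr_simple(xs, ys):
    """Sum of squared residuals from an OLS fit of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def chow_f(xs1, ys1, xs2, ys2, k=2):
    """Chow F statistic for a structural break between two groups.

    k is the number of estimated parameters per regression
    (2 here: intercept and slope).
    """
    ssr_pooled = ssr_simple(xs1 + xs2, ys1 + ys2)
    ssr_split = ssr_simple(xs1, ys1) + ssr_simple(xs2, ys2)
    dof = len(xs1) + len(xs2) - 2 * k
    return ((ssr_pooled - ssr_split) / k) / (ssr_split / dof)


# Made-up data: group A has a slope near 2, group B a slope near 1.
xs_a = [1, 2, 3, 4, 5, 6]
ys_a = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
xs_b = [1, 2, 3, 4, 5, 6]
ys_b = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8]

# A large F suggests the two groups do not share one set of coefficients.
print(chow_f(xs_a, ys_a, xs_b, ys_b))
```

The resulting F would be compared against an F distribution with k and N_1 + N_2 - 2k degrees of freedom to get a significance level like the ones reported in these posts.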

The results:

  • In the AL, a sac bunt produces about .43 runs, on average, and that number is significant at the 95% level. In the NL, a bunt produces about .02 runs, and the number is not significantly different from saying that a bunt has no effect on run production.
  • The Chow Test tells us at about a 90% confidence level that the process of producing runs in the AL is different from the process of producing runs in the NL. That is, in Major League Baseball, the designated hitter has a statistically significant effect on strategy. There’s a structural break.

R code is behind the cut.

Bartolo Colon pitches shutout, shuffles off mound on walker May 31, 2011

Posted by tomflesher in Baseball.

Noted elderly person Bartolo Colon was handed the ball by Joe Girardi on Memorial Day, no doubt to commemorate Colon’s memories of having a functional back. Colon, at age 38 as of last week, turned around and pitched a brilliant 4-hit shutout against the Oakland Athletics, taking out A’s starter Trevor Cahill on an exceptionally economical 103 pitches. Keep in mind, at age 38, Colon has been playing professionally since 1993, so his career can legally purchase cigarettes this year.

We don’t have to go back very far to steal Colon’s thunder, though. Last May, Phillies starter and AARP representative Jamie Moyer managed a complete game two-hitter at the spry young age of 47 years, 170 days old. He topped Colon’s game score, 88 compared to 85, and just barely used more pitches (105). Moyer unfortunately had to retire after recurring injuries, which will probably be Colon’s fate soon enough.

Complete Game Shutout… PSYCH! May 30, 2011

Posted by tomflesher in Baseball.

Jered Weaver pitched a brilliant game Saturday night for the Angels against the Twins. He’s had a strange opening to the season, starting with six straight wins and then beginning May with four straight losses followed by a no-decision. Saturday, on four days’ rest, he pitched nine scoreless innings with 2 hits, 0 runs, 2 walks, 7 strikeouts, no hit batsmen, a Game Score of 88, and a career-high 128 pitches. It’s a good thing he grabbed another win… wait, no he didn’t. The game went into extra innings, the Angels lost, and Weaver walked off the mound with a no-decision.

Put another way, if anyone had managed to hit a home run, or if Hank Conger had singled instead of popping out to third in the eighth, Weaver would have had a two-hit complete game shutout, and we’d be talking about how he still had it. Instead, he gets a no-decision, and the Angels lost the game.

That doesn’t happen a whole lot, but it does happen enough to take notice. For example, on May 12, a 2-1 win for the Orioles over the Mariners was 0-0 into the 12th. So, both the Mariners’ Jason Vargas (9 IP, 7 H, 0 R, 0 ER, 1 BB, 4 K, 76 GSc) and the Orioles’ Zach Britton (9 IP, 3 H, 0 R, 0 ER, 0 BB, 5 K, 86 GSc) left with complete game shutouts that weren’t.

Similarly, last year, on July 10, Roy Halladay was outpitched by the Reds’ Travis Wood in an 11-inning 1-run loss. Wood managed a game score of 93 on one hit, no walks, and 8 strikeouts, whereas Halladay had a paltry 85 on 5 hits, 1 walk and 9 strikeouts. Neither man got the win, which went to Phillies reliever Jose Contreras.

Zambrano Back on the Horse May 27, 2011

Posted by tomflesher in Baseball.

Last night, Carlos Zambrano pitched on one day’s rest after pinch-hitting against the hapless Mets for two RBIs on Tuesday. We’ve talked about Zambrano’s pinch-hitting prowess before, but last night he was an awesome 3 for 3 from the plate, including a double. In fact, in 26 plate appearances, Zambrano’s got 9 hits for a .375 batting average and, since he has no walks, a .375 on-base percentage. Not only is that impressive, but I hear he can pitch, too.

I figured that was pretty impressive. It can’t be often that a pitcher gets three at-bats and gets a hit in every one, can it? It’s happened 450 times since 1919, including, surprisingly, once already this year. The Mets’ Chris Young managed a 3-for-3 night while notching the win against the Phillies back on April 5.

In recent memory, the most at-bats by a pitcher who hit each time was Dan Haren, who grabbed a 4-for-4 as a Diamondback against the Cardinals last year (also as the winning pitcher). Haren also gave up a whopping 7 runs, so he’s lucky he was hitting.

Micah Owings has had two games in which he pitched and got a hit in every plate appearance (at least 3 each time), including a 4-for-4 from 2007 in which three of his 4 hits were doubles.

Finally, Mel Stottlemyre (in 1964) and two pitchers from the 1920s had 5-for-5 games. Stottlemyre’s two-hit gem included him hitting a double and pitching to a game score of 83.

Complete Game in a Non-Quality Start May 26, 2011

Posted by tomflesher in Baseball.

Dillon Gee of the Mets was credited with a complete game in last night’s win over the Cubs. His line: 6 IP, 4 H, 4 R, 4 ER, 2 BB, 4 K, 0 HR, and 1 HBP, for a game score of 50. He qualified for a quality start under the Game Score definition, but not under the six-inning, three-run criterion. That makes it a form of Cheap Win, where a pitcher is credited with a win even though he didn’t pitch as effectively as expected.

Since the game was shortened by rain, Gee got a complete game, even though that usually involves 8 innings for the visiting pitcher on a losing team or 9 innings for a winning pitcher regardless. That made me wonder how many pitchers from the modern era, when complete games are less common than in previous years, have pitched complete games in non-quality starts.

A non-quality start, under the Game Score definition, is a start with fewer than 50 points. That means the pitcher had negative value for his team. It can’t be especially common, can it?
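For reference, the Game Score figures quoted in these posts are consistent with Bill James’s formula: start at 50, add 1 per out, 2 per inning completed after the fourth, and 1 per strikeout, then subtract 2 per hit, 4 per earned run, 2 per unearned run, and 1 per walk. A minimal Python sketch follows; the function names are made up for illustration, and the 50-point cutoff is taken from the statement above that Gee’s 50 qualified as a quality start.

```python
def game_score(outs, hits, earned_runs, unearned_runs, walks, strikeouts):
    """Bill James's Game Score for a starting pitcher.

    Innings are passed as outs recorded (6 IP = 18 outs) to avoid
    the ambiguity of notations like 5.2 IP.
    """
    innings_completed = outs // 3
    innings_after_fourth = max(0, innings_completed - 4)
    return (50 + outs + 2 * innings_after_fourth + strikeouts
            - 2 * hits - 4 * earned_runs - 2 * unearned_runs - walks)


def is_quality_start_by_game_score(score):
    """Assumed cutoff: a Game Score of 50 or more counts as a quality start."""
    return score >= 50


# Dillon Gee's line from the post: 6 IP, 4 H, 4 R, 4 ER, 2 BB, 4 K
gee = game_score(outs=18, hits=4, earned_runs=4, unearned_runs=0,
                 walks=2, strikeouts=4)
print(gee)                                  # 50
print(is_quality_start_by_game_score(gee))  # True
```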

According to this list I queried from Baseball Reference, a non-quality start complete game hasn’t been pitched since 2006, when Freddy Garcia pitched a rain-shortened 5-inning complete game for the White Sox to defeat the Blue Jays 6-4, with a game score of 42. The last nine-inning complete game non-quality start was Pete Harnisch with the Reds, who won a 10-6 slugfest in August of 2000 on 124 pitches with only one walk and three strikeouts. Aside from the six earned runs (all scored in the first three innings) it wasn’t a bad performance, somewhat reminiscent of Edwin Jackson’s ugly but effective no-hitter last year.

Wilson Valdez, Utility Pitcher Extraordinaire May 26, 2011

Posted by tomflesher in Baseball.

Interested in position players who pitched? Check out The Best Game Ever and a previous post on what I like to call Utility Pitchers.

So, the Phillies and the Reds went into extra innings last night and Wilson Valdez was the winning pitcher. His line: 1.0 IP, 0 H, 0 R, 0 ER, 0 BB, 0 K, 0 HR, 4 BF on 10 pitches. He did have a hit batsman – Scott Rolen – but that’s not surprising, since Valdez has never pitched professionally at any level.

First of all, let me say that I’m thoroughly impressed with the way both managers managed the game. Ordinarily, a 19-inning game is full of spot relievers going a few innings each and at some point the managers seem to lose control of the situation and start panicking. The most common solution is to throw starters in on their throw day, which is how Mike Pelfrey got his save last year. Instead, Reds manager Dusty Baker seemed to know that Carlos Fisher, who has never started a game at the Major League level, had 5 2/3 innings of starter-quality stuff in him. Similarly, the Phillies’ Charlie Manuel relied on Danys Baez, who hadn’t pitched more than four innings since the Bush administration, for five innings that would have made any manager happy. To offer some perspective, if Baez had pitched his five innings at the beginning of the game and been lifted, his game score would have been 67; Fisher’s would have been 58 had he been removed from the game at the moment he gave up his run. That’s not only a quality start for each pitcher, but both of the relievers put together a higher game score than their team’s starter.

Oh, yeah, and the Phillies’ starter was Roy Halladay.

Also, Wilson Valdez had an incredible night. In addition to becoming the first position player to be the winning pitcher since 2000, Valdez started the game at second base and went 3 for 6 with a walk. To compare, when catcher Brent Mayne was the Rockies’ winning pitcher in 2000, he came in off the bench and didn’t bat at all.

Hats off to Charlie Manuel and Dusty Baker for managing a smart game, and bravo to Wilson Valdez for a solid inning pitched and a great night at the plate.

Is ‘luck’ persistent? May 25, 2011

Posted by tomflesher in Baseball, Economics.

I’ve been listening to Scott Patterson’s The Quants in my spare time recently. One of the recurring jokes is Wall Street traders’ use of the word ‘Alpha’ (which usually represents abnormal returns in finance) to refer to a general quality of being skillful or having talent. That led me to think about an old concept I haven’t played with in a while – wins above expectation.

As a quick review, wins above expectation relate a team’s actual wins to its Pythagorean expectation. If the team wins more than expected, it has a positive WAE number, and if it loses more than expected, it has wins below expectation, or, equivalently, a negative WAE. It’s tempting to think of WAE as representing a sort of ‘alpha’ in the traders’ sense – since the Pythagorean Expectation involves groups of runs scored and runs allowed, it generates a probability that a team with a history represented by its runs scored/runs allowed stats will win a given game. If a team has a lot more wins than expected, it seems like that represents efficiency – scoring runs at crucial times, not wasting them on blowing out opponents – or especially skillful management. Alternatively, it could just be luck. Is there any way to test which it is?

It’s difficult. However, let’s break down what the efficiency factor would imply. In general, it would represent some combination of individual player skill (such as the alleged clutch hitting ability) and team chemistry, whether that boils down to on- or off-field factors. Assuming rosters don’t change much over the course of the year, then, efficiency also shouldn’t change much over the course of the year. Similarly, if a manager’s skill was the primary determinant of wins above expectation, then for teams that don’t change managers midyear, we wouldn’t expect much of a change throughout the course of the season. Most managers work up through the minors, so there probably isn’t a major on-the-job training effect to consider.

On the other hand, if wins above expectation are just luck, then we wouldn’t need to place any restrictions on them. Maybe they’d change. Maybe they wouldn’t. Who knows?

In order to test that idea, I pulled some data for the American League off Baseball Reference from last season. I split the season into pre- and post-All-Star Break sets and calculated the Pythagorean expectation (using the 1.81 exponent referred to in Wikipedia) for each team. I found WAE for each team in each session, then found each team’s ‘Alpha’ for that session by dividing WAE by the number of games played. Basically, I assumed that WAE represented extra win probability in some fashion and assumed it existed in every game at about the same level. The results:

\begin{tabular}{ | l | c | c | c | r | }
\hline
Team & WAE1 & Alpha1 & WAE2 & Alpha2 \\ \hline
NYY & 0.823 & 0.009 & -2.474 & -0.033 \\ \hline
TBR & -0.5 & -0.003 & 0.207 & 0.003 \\ \hline
BOS & 0.494 & 0.006 & 0.900 & 0.012 \\ \hline
TEX & -1.041 & -0.012 & 0.291 & 0.004 \\ \hline
CHW & 2.379 & 0.027 & -0.244 & -0.003 \\ \hline
DET & 3.918 & 0.046 & -4.706 & -0.062 \\ \hline
MIN & -1.67 & -0.019 & 3.693 & 0.05 \\ \hline
LAA & 3.83 & 0.042 & -2.860 & -0.040 \\ \hline
TOR & -0.202 & -0.002 & 1.555 & 0.021 \\ \hline
OAK & -1.939 & -0.022 & -2.418 & -0.033 \\ \hline
KCR & 0.023 & 0.000 & 1.976 & 0.027 \\ \hline
SEA & 0.225 & 0.003 & 2.188 & 0.03 \\ \hline
CLE & -2.096 & -0.023 & 0.907 & 0.012 \\ \hline
BAL & -1.028 & -0.012 & 8.900 & 0.120 \\ \hline
\end{tabular}
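For anyone who wants to reproduce columns like these, here is a minimal Python sketch of the calculation as described: Pythagorean expectation with a 1.81 exponent, WAE as actual wins minus expected wins, and Alpha as WAE per game. The function names and the sample half-season (88 games, 50 wins, 420 runs scored, 380 allowed) are invented for illustration and are not taken from the table.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=1.81):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def wins_above_expectation(wins, games, runs_scored, runs_allowed):
    """Actual wins minus the wins predicted by the Pythagorean expectation."""
    expected_wins = games * pythagorean_expectation(runs_scored, runs_allowed)
    return wins - expected_wins


def alpha(wins, games, runs_scored, runs_allowed):
    """WAE spread evenly over every game played."""
    return wins_above_expectation(wins, games, runs_scored, runs_allowed) / games


# Hypothetical first half: 50-38 record, 420 runs scored, 380 allowed.
print(wins_above_expectation(50, 88, 420, 380))
print(alpha(50, 88, 420, 380))
```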

As is evident from the table, a whopping 10 out of the 14 teams see a change in the sign of Alpha from before the All-Star Game to after the All-Star Game. The correlation coefficient of Alpha from pre- to post-All-Star is -.549, which is a pretty noisy correlation. (Note also that this very closely describes regression to the mean.) It’s not 0, but it’s also negative, implying one of two things: Either teams become less efficient and/or more badly managed, on average, after the break, or Alpha represents very little more than a realization of a random process, which might just as well be described as luck.
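Both figures can be checked against the rounded Alpha values in the table. The sketch below is a plain-Python Pearson correlation, not the original calculation. Because the table is rounded to three decimals, the result lands near the reported -.549 rather than exactly on it.

```python
from math import sqrt

# (team, Alpha1, Alpha2) as printed in the table above
ALPHAS = [
    ("NYY", 0.009, -0.033), ("TBR", -0.003, 0.003), ("BOS", 0.006, 0.012),
    ("TEX", -0.012, 0.004), ("CHW", 0.027, -0.003), ("DET", 0.046, -0.062),
    ("MIN", -0.019, 0.05), ("LAA", 0.042, -0.040), ("TOR", -0.002, 0.021),
    ("OAK", -0.022, -0.033), ("KCR", 0.000, 0.027), ("SEA", 0.003, 0.03),
    ("CLE", -0.023, 0.012), ("BAL", -0.012, 0.120),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


first_half = [a1 for _, a1, _ in ALPHAS]
second_half = [a2 for _, _, a2 in ALPHAS]

# A sign change means the product of the two Alphas is negative.
sign_flips = sum(1 for a1, a2 in zip(first_half, second_half) if a1 * a2 < 0)
r = pearson(first_half, second_half)

print(sign_flips)   # 10
print(round(r, 2))  # -0.55
```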