What is OPS? January 12, 2015 Posted by tomflesher in Baseball.
Tags: evergreen, OBP, OPS, SLG, statistics
Sabermetricians (which is what baseball stat-heads call ourselves to feel important) disregard batting average in favor of on-base percentage for a few reasons. The main one is that it really doesn’t matter to us whether a batter gets to first base through a gutsy drag bunt, an excuse-me grounder, a bloop single, a liner into the outfield, or a walk. In fact, we don’t even care if the batter got there through a judicious lean-in to take one for the team by accepting a hit-by-pitch. Batting average counts some of these trips to first, but not a base on balls or a hit batsman. It’s evident that plate discipline is a skill that results in higher returns for the team, and there’s a colorable argument that the ability to be hit by a pitch is a skill. OBP is (H + BB + HBP) / (AB + BB + HBP + SF): hits, walks, and hit-by-pitches divided by at-bats, walks, hit-by-pitches, and sacrifice flies.
We also care a lot about how productive a batter is, and a productive batter is one who can clear the bases or advance without trouble. Sure, a plucky baserunner will swipe second base and score from second, or go first to third on a deep single. In an emergency, a light-hitting pitcher will just bunt him over. However, all of these involve an increased probability of an out, while a guy who can just hit a double, or a speedster who takes that double and turns it into a triple, will save his team a lot of trouble. Obviously, a guy who snags four bases by hitting a home run makes life a lot easier for his teammates. Slugging percentage measures how many bases, on average, a player is worth every time he steps up to the plate and doesn’t walk or get hit by a pitch. Slugging percentage is total bases divided by at-bats: (1B + 2×2B + 3×3B + 4×HR) / AB. If a player hits a home run in every at-bat, he’ll have an OBP of 1.000 and a SLG of 4.000.
OPS is just On-Base Percentage plus Slugging Percentage. It doesn’t lend itself to a useful interpretation – OPS isn’t, for example, the average number of bases per hit, or anything useful like that. It does, however, provide a quick and dirty way to compare different sorts of hitters. A runner who moves quickly may have a low OBP but a high SLG due to his ability to leg out an extra base and turn a single into a double or a double into a triple. A slow-moving runner who can only move station to station but who walks reliably will have a low SLG (unless he’s a home-run hitter) but a high OBP. An OPS of 1.000 or more is a difficult measure to meet, but it’s a reliable indicator of quality.
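For anyone who wants to play with the numbers, here is a minimal Python sketch of the three stats. It assumes the standard definitions (OBP as times on base over at-bats plus walks, hit-by-pitches, and sacrifice flies; SLG as total bases over at-bats); the function and argument names are my own shorthand, not anything official.

```python
def obp(hits, walks, hbp, at_bats, sac_flies):
    """On-base percentage: times on base over the plate appearances that count."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


def slg(singles, doubles, triples, homers, at_bats):
    """Slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


def ops(singles, doubles, triples, homers, walks, hbp, at_bats, sac_flies):
    """OPS is simply OBP plus SLG."""
    hits = singles + doubles + triples + homers
    return obp(hits, walks, hbp, at_bats, sac_flies) + slg(
        singles, doubles, triples, homers, at_bats
    )


# The home-run-every-at-bat hitter from the post: 10 at-bats, 10 homers.
perfect_obp = obp(hits=10, walks=0, hbp=0, at_bats=10, sac_flies=0)
perfect_slg = slg(singles=0, doubles=0, triples=0, homers=10, at_bats=10)
print(perfect_obp, perfect_slg)
```

The all-homers hitter comes out at 1.000 and 4.000, so his OPS would be 5.000, which shows why OPS has no tidy interpretation: the two pieces are on different scales.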
The Hall of Fame Black Ink Test January 11, 2015 Posted by tomflesher in Baseball.
Tags: Black Ink, evergreen, Hall of Fame
The Baseball Hall of Fame’s mission is “Preserving History, Honoring Excellence, Connecting Generations.” An important measure of the excellence honored in Cooperstown is called the Black Ink Test. “Black ink” refers to the boldface type used to show the league’s leader in an important category.
The categories used for the Black Ink Test are, of course, different for pitchers and batters, but they also vary depending on the importance of the stat. A batter who excels in hitting home runs is more valuable to a team than one who takes the most at-bats regardless of outcome. For batters, points are awarded as follows:
- One point for games, at-bats, or triples
- Two points for doubles, walks, or stolen bases
- Three points for runs scored, hits, or slugging percentage
- Four points for home runs, RBIs, or batting average
For pitchers, points are awarded as follows:
- One point for appearances, starts, or shutouts
- Two points for complete games, lowest Walks/9, or lowest Hits/9
- Three points for innings pitched, saves, or win-loss percentage
- Four points for wins, ERA, or strikeouts
That means that there are 30 black-ink points per year for batters and 30 for pitchers. (Multiple black-ink points can be awarded; for example, this year, at least 10 pitchers started 34 games in the National League, each of whom earns 1 point.) However, while it’s conceivable that a single batter could monopolize most of the categories, it’s not likely that a pitcher could – appearances and saves will go to a reliever, while most of the categories will go to a starter.
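The point tables above are easy to tally in code. This is a rough Python sketch; the dictionary keys and the `black_ink` helper are names I made up for illustration, and the sample season at the bottom is hypothetical.

```python
# Points per league-leading category, copied from the lists above.
BATTER_POINTS = {
    "games": 1, "at_bats": 1, "triples": 1,
    "doubles": 2, "walks": 2, "stolen_bases": 2,
    "runs": 3, "hits": 3, "slugging": 3,
    "home_runs": 4, "rbi": 4, "batting_average": 4,
}
PITCHER_POINTS = {
    "appearances": 1, "starts": 1, "shutouts": 1,
    "complete_games": 2, "walks_per_9": 2, "hits_per_9": 2,
    "innings": 3, "saves": 3, "win_loss_pct": 3,
    "wins": 4, "era": 4, "strikeouts": 4,
}


def black_ink(categories_led, table):
    """Add up the black-ink points for the categories a player led."""
    return sum(table[category] for category in categories_led)


# Each table sums to the 30 points per league-season mentioned above.
print(sum(BATTER_POINTS.values()), sum(PITCHER_POINTS.values()))

# Hypothetical batter who led his league in homers, RBIs, and slugging.
print(black_ink(["home_runs", "rbi", "slugging"], BATTER_POINTS))
```

A career total is just this sum repeated over every season, which is why ties (like the pile of 34-start pitchers) inflate the number of points handed out in a given year.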
Because black ink requires that a player lead his league, it’s hard to come by – and when there are more teams in a league, even the best players may not lead the league. One notable example of the bias toward older players is Ross Barnes, who was active for nine seasons from 1871 to 1881. (He didn’t play in 1878 or 1880.) Although Ross isn’t eligible for the Hall because he didn’t play ten seasons, he amassed an astonishing 60 points of black ink in the National Association by the age of 31. Since the National Association had only 9 teams, he competed against around 115 other batters for those points. During the 2014 season, the same 30 points of black ink were spread over 672 National League batters. Though Ross was truly an outstanding player, leading the league in nearly every category in 1873 and 1876, it was a lot easier to get those points then.
As of today, the batters with the most black ink not to be elected to the Hall of Fame are Barry Bonds (69), Pete Rose (68), and Alex Rodriguez (64). A-Rod and Rose, of course, aren’t eligible (A-Rod is still active). New Hall of Famer Craig Biggio had 17 and mediocre, forgettable middle-infielder Derek Jeter comes in at a whopping 10.
The pitchers with the most black ink not to be elected are Roger Clemens (100), Roy Halladay (48), Bucky Walters (48), and Justin Verlander (46). Verlander is still active and Halladay retired too recently to be elected, but Walters is truly a baffling case. New Hall of Famers this year were Randy Johnson (99), Pedro Martinez (58), and John Smoltz (34).
The Spectrum Club: 2014 Edition January 1, 2015 Posted by tomflesher in Baseball.
Tags: Spectrum Club
2013 and 2014 were unusually large Spectrum Clubs. The prestigious[1] Spectrum Club consists of players who played as designated hitter and also pitched for their teams. Though there surely are a couple of people caught in this table who were primarily pitchers and just came in listed as a DH on the batting order, 2013 shows the largest Spectrum Club since the introduction of the designated hitter, with 2014 following closely behind. The list of all Spectrum Club members is here.
This year inducted nine brand-new members. Although Mitch Maier and Darnell McDonald repeated from 2010 to 2011, everyone this year was a first-time pitcher/DH. As usual, though, they were all primarily position players.
This year’s inductees are:
Congratulations to this year’s inductees!
[1] Not a guarantee.
BABIP as a Defensive Metric October 11, 2014 Posted by tomflesher in Baseball, Economics.
Tags: BABIP, BJ Upton, models, statistics
I follow OOTP on Facebook, and this Reddit thread about editing the Braves to go 0-162 popped up the other day.
I went into commissioner mode and basically ranked everyone’s stats to go 0-550 with 550 Ks (although when I went back, OOTP changed it to give them all a few hits and a couple of walks, etc.) I did not have to edit BJ Upton, as he was already programmed to do so.
One reply asked whether 1-BABIP is a valid defensive metric, and that got the wheels turning. (Note that for statistical purposes, summary statistics for 1-BABIP will be the same magnitude and the opposite sign as statistics for BABIP, so I went ahead and just used BABIP.)
For a quick check, I checked in at Baseball Reference to get the National League’s team-level statistics for the last 5 years, then correlated BABIP to runs allowed by the team. That correlation is .741 – that’s a pretty strong correlation. Similarly, the correlation between BABIP and team wins was about -.549. It’s a weaker and negative correlation, which is expected – negative because an added point of opposing team BABIP would mean more balls in play were falling in as hits, and weaker because it ignores the team’s offensive production entirely.
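The correlation check is simple enough to reproduce by hand. Here is a small Python sketch of a Pearson correlation; the `pearson` function is my own helper, and the five team-seasons below are made-up numbers for illustration, not the Baseball Reference data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical team-seasons: opposing BABIP and runs allowed.
babip = [0.285, 0.292, 0.298, 0.305, 0.311]
runs_allowed = [610, 650, 640, 700, 720]
print(round(pearson(babip, runs_allowed), 3))
```

Running the same function on BABIP against team wins should give a negative number, for the reason described above: more balls falling in means more losses.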
If BABIP accurately describes a team’s defensive power, then a statistical model that models team runs allowed as a function of fielding-independent pitching and pitching-independent fielding should explain a large proportion, but not all, of the runs allowed by a team, and thereby explain a significant but smaller proportion of the team’s wins.
I crunched two models to test this, each with the same functional form: Dependent Variable = a + b*FIP + c*BABIP. With Runs as the dependent variable, the R² of the model was .8625; with Wins as the dependent variable, the R² was .5246. Since R² roughly describes the percent of variation explained by the model, this makes a lot of sense. In the Runs model, about 14% of the variation in runs comes from something other than home runs, walks, or hits, such as baserunning and errors; in the Wins model, about 48% of the variation in team wins is explained by something other than defense and pitching. (Say…. offense? That’s crazy.) In both models, the coefficients are statistically significant at the 99% level.
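If you want to fit the same functional form yourself, here is a bare-bones Python sketch of ordinary least squares with an intercept and two regressors. It solves the normal equations directly, which is an assumption on my part about the estimation method (any stats package would do the same job), and the data at the bottom is invented purely to show the mechanics.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(y, fip, babip):
    """Fit y = a + b*FIP + c*BABIP; return ([a, b, c], R squared)."""
    rows = [[1.0, f, b] for f, b in zip(fip, babip)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    coefs = solve(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1 - ss_res / ss_tot


# Made-up team-seasons, not the real data set.
fip = [3.5, 4.0, 4.2, 3.8, 4.5, 3.9]
babip = [0.290, 0.300, 0.285, 0.310, 0.305, 0.295]
runs = [612, 688, 671, 702, 760, 655]
coefs, r_squared = fit(runs, fip, babip)
print([round(c, 2) for c in coefs], round(r_squared, 4))
```

One minus the R² from `fit` is the share of variation the model leaves unexplained, which is the number discussed above for each model.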
BABIP’s coefficient in the Runs model is 3444.44, which means that a batting average on balls in play of 1.000 would lead to about 3444 runs allowed over a season; more realistically, if BABIP increases by .01, that would translate to about 34 runs per season. Its coefficient in the Wins model is -328.757, meaning that an increase of .01 in BABIP corresponds to about 3.29 extra losses. That’s surprisingly close to the 10 runs-1 win ratio that Bill James uses as a rule of thumb.
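The arithmetic behind that comparison is short enough to spell out. This Python sketch uses the two coefficients quoted above; the helper name is mine, and it simply scales each coefficient by the change in BABIP, holding FIP fixed.

```python
RUNS_COEF = 3444.44   # BABIP coefficient in the Runs model
WINS_COEF = -328.757  # BABIP coefficient in the Wins model


def effect_of_babip_change(delta):
    """Return (extra runs allowed, change in wins) for a BABIP change of delta."""
    return RUNS_COEF * delta, WINS_COEF * delta


extra_runs, win_change = effect_of_babip_change(0.01)
# Runs allowed per lost win implied by the two models together.
runs_per_win = extra_runs / -win_change
print(round(extra_runs, 1), round(win_change, 2), round(runs_per_win, 1))
```

The implied exchange rate comes out a little above ten runs per win, which is the sense in which the two models line up with the rule of thumb.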
Since the correlations were strong, this bears a closer look at game-level rather than simply team-level data.
Mets Run Support by Starting Pitcher August 1, 2014 Posted by tomflesher in Baseball.
Tags: Jacob deGrom, Mets, pitching, run support, Zack Wheeler
Yesterday’s post discussed distributional wins and losses based on the Mets’ inconsistent bunching of runs together. Since the boys didn’t play last night, I had a pretty stable dataset to work with, and the opportunity to crunch some numbers to see if the hypothesis that we’re working with is true. In addition, I took a look at each of our current starting rotation’s run support numbers and found some surprising things.
First of all, no pitcher’s run support was statistically significantly different from any other’s. Although Dillon Gee’s run support is .77 lower than the average pitcher’s, for example, the p-value is .44, meaning a gap that large would turn up about 44% of the time by chance alone even if his true effect were 0. Jacob deGrom has a similar number – .796 runs below the average, but a .42 p-value. The only pitcher with a positive effect on run support is Bartolo Colon, but his p-value is a whopping .72, so his number is very likely a statistical artifact.
The runs allowed are a bit more stable – deGrom allows 1.18 runs fewer than average with a .2 p-value – but Gee, Jonathon Niese, Colon, and Zack Wheeler all have statistically 0 effect on runs allowed. Their p-values are, respectively, .91, .84, .64, and .79. Basically, this means that an effect would have to be really big to show up in such a small sample; not even all 108 games are covered in the sample.
Another way of tracking pitcher run support is to track team wins and losses in the games started by those pitchers and compare it to the team’s Pythagorean expectation in those games. This is a bit more revealing; for example, the Mets are 6-8 in starts by deGrom, but would have a Pythagorean expectation of about .568, or about 8-6, in those games. Wheeler also ends up with a Pythagorean expectation better than his record, predicting the Mets would have won 11 rather than 10 of his 22 games. The other pitchers are more or less in line with their expectations, although, like Zack, the pitchers don’t always get credit for the wins they pitched in.
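Here is a quick Python sketch of the Pythagorean comparison. It assumes the classic exponent of 2 (the post doesn’t say which exponent was used), and both function names are my own.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def expected_record(pct, games):
    """Turn an expected winning percentage into a rounded win-loss record."""
    wins = round(pct * games)
    return wins, games - wins


# deGrom's 14 starts at the .568 expectation quoted above.
print(expected_record(0.568, 14))
```

Feeding in the runs scored and allowed for any one pitcher’s starts gives the expectation; comparing it with the actual record in those starts gives the gap discussed above.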
Behind the cut is the table of regression results for a linear model with a dummy variable for each pitcher’s starts, plus a totally useless Away game dummy to look for home field advantage. (Surprise: There is none for the Mets, but all pitchers do allow roughly .74 more runs on the road than at home.)
What If The Mets Spread Their Runs More Evenly? July 31, 2014 Posted by tomflesher in Baseball.
Tags: Mets, Runs, statistics
The Mets have had quite a run lately – they sandwiched a 6-0 shutout loss on Tuesday between a 7-1 rout and an 11-2 dismantling of the Phillies. The whole series is a microcosm of the Mets’ season – the wildly inconsistent run production, the overuse of Josh Edgin, the disappointing start from Dillon Gee, and the totally unnecessary hit by Jeurys Familia. (Familia is 2 for 2 on the year with a 2.000 OPS.) If the Mets had spread out those 18 runs among the 3 games, there would have been a slightly different result – free baseball on Tuesday, but let’s assume the Mets would have lost the game anyway. In fact, the Mets have an average of 3.92 runs over the first 108 games of the season, and they’ve allowed an average of 3.79. If the Mets had spread out all of those runs evenly, then on average, the Mets would have won every game. (Fractional runs mess this up a little.) Of course, the Mets have been pretty wild with the runs they allow, as the graph at right suggests.
Let’s leave a little bit more to the opponents and just examine the Mets’ distribution. Above, the same graph shows the Mets’ distribution of runs. What would happen if they scored exactly 3.92 runs in every game? That would surely have taken a couple of wins off their docket, but it would probably have earned them a couple of wins as well. In fact, there are 15 games where the Mets scored below their average that they could have won if they’d scored over 3 runs. These losses are disproportionately spread over the Mets’ younger starting pitchers. Although Jonathon Niese, Dillon Gee, Jenrry Mejia, Rafael Montero and Daisuke Matsuzaka each started one of these games, and Bartolo Colon started two, Zack Wheeler and Jacob deGrom each started four. Those aren’t all starting pitcher losses, but Wheeler and deGrom have both had several tough losses that could have been taken away through some better run support.
On the other hand, there were 11 games the Mets won that they would have lost by scoring only 3.92 runs. Mejia, Matsuzaka and deGrom each started one of these games, with Wheeler and Colon each starting two, but Niese is clearly the beneficiary of a lot of convenient run support here – he started four of these games that would have been losses.
After 108 games, the Mets have a 52-56 mark. Turning 11 of those wins into losses and 15 of those losses into wins means that number could be reversed – to a 56-52 mark – with more consistent run support for the starting pitchers. They have the capability to score those runs, and have definitely benefited from bunching those runs up at times, but on the whole deGrom and Wheeler would be better off, as would the entire team, with a bit more consistency.
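The re-scoring exercise is easy to sketch in Python. The `respread` and `adjusted_record` helpers are names I invented, and the five-game log is hypothetical; only the 52-56 record and the 15 and 11 game counts come from the post.

```python
def respread(games, fixed_runs):
    """Count results that flip if the team scores fixed_runs in every game.

    games is a list of (runs_scored, runs_allowed) pairs.
    Returns (losses that become wins, wins that become losses).
    """
    gained = sum(1 for scored, allowed in games
                 if scored < allowed and fixed_runs > allowed)
    lost = sum(1 for scored, allowed in games
               if scored > allowed and fixed_runs < allowed)
    return gained, lost


def adjusted_record(wins, losses, gained, lost):
    """Apply the flipped games to an actual win-loss record."""
    return wins - lost + gained, losses - gained + lost


# Hypothetical five-game stretch, re-scored at 3.92 runs per game.
sample = [(0, 3), (7, 1), (11, 2), (5, 4), (2, 3)]
print(respread(sample, 3.92))

# The season-level numbers from the post: 15 losses flip, 11 wins flip.
print(adjusted_record(52, 56, gained=15, lost=11))
```

Because 3.92 is never a whole number, there are no ties to worry about; with a whole-number average you would have to decide how to treat extra-inning coin flips.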
John Baker Gets the W July 30, 2014 Posted by tomflesher in Baseball.
Tags: John Baker, position players pitching, utility pitchers
In more ways than one!
Much like Madison Bumgarner a few weeks ago, John Baker managed to be the winning pitcher and score the game-winning run for Chicago in last night’s game against the Rockies. Baker, a light-hitting backup catcher, came in from the bullpen for his first professional pitching appearance and pitched a clean 16th inning, walking 1 and striking out none on eleven pitches. Immediately after Baker got off the mound, Rockies left-hander Tyler Matzek walked him, and he was then bunted over to second by utilityman Emilio Bonifacio. Arismendy Alcantara added some levity by getting plunked, Anthony Rizzo singled Baker over to third, and Starlin Castro lined a sacrifice fly to right field to bring Baker home for the win.
Welington Castillo deserves an honorable mention for catching all sixteen innings of the game. We can only hope he gets tonight’s game off.
Holy Cow, More On Ruben Tejada’s OBP July 29, 2014 Posted by tomflesher in Baseball.
Tags: OBP, Ruben Tejada
Last night, Ruben Tejada once again hit in the 8th batting order position. In four plate appearances, he walked once, in the bottom of the 8th; there’s been some discussion that Tejada’s OBP is inflated by intentional walks being thrown to get to the pitcher’s spot, though that definitely wasn’t the case here because the next player was lefty specialist Josh Edgin. As expected, Edgin was lifted for pinch hitter Bobby Abreu, who grounded into a double play. (Hmm. Maybe that was the intent. But Abreu only has 3 GIDPs on 140 plate appearances this year.)
Tejada’s stats by batting order position show some patterns. As an eighth-position hitter, Tejada has 198 plate appearances, 34 hits, 2 home runs, 32 walks, and 31 strikeouts, for a .213/.354/.288 line. In other order positions, he has 128 plate appearances, 27 hits, 0 homers, 14 walks, and 30 strikeouts, giving him a .245/.320/.275 line. Let’s assume, for the moment, that that .320 OBP line is Ruben’s true mark. That means his mark in the 8th position should be, with 95% probability, somewhere in the range of .320 +/- .066, or somewhere between .254 and .388. Obviously, .354 is in that range. In fact, the .034 difference is about 1 standard error out, meaning there’s about a 30% chance of a gap at least that large, in either direction, by chance alone.
In other words, it looks like there’s no statistically significant effect for Ruben batting in the 8th position. If we remove Ruben’s 9 intentional walks received in the 8th position and replace them with 2 hits and 7 outs, we’re left with a truly terrible .297 OBP, which is surprisingly even worse than his OBP while batting elsewhere, and one within one standard deviation of his .320 mark. That is, of course, a worst case scenario, assuming he wouldn’t walk at all in those 9 appearances. If he walked 3 out of 9 times, as his other stats would indicate, that would put him at a still not great .313 OBP.
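The range used in this post can be roughly reproduced with the usual normal approximation for a proportion. That approximation is my assumption about how the margin was computed; the function name is mine, and the result lands within rounding of the .066 quoted above.

```python
from math import sqrt


def margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% interval for a proportion p over n trials."""
    return z * sqrt(p * (1 - p) / n)


TRUE_OBP = 0.320     # assumed true mark, from the other lineup spots
EIGHTH_PA = 198      # plate appearances in the 8th position
OBSERVED_OBP = 0.354

margin = margin_of_error(TRUE_OBP, EIGHTH_PA)
standard_error = margin / 1.96
z_score = (OBSERVED_OBP - TRUE_OBP) / standard_error
print(round(margin, 3), round(z_score, 2))
```

A z-score near 1 is nowhere near the usual cutoff of about 2, so the 8th-position OBP is consistent with the .320 mark.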
Tags: extra innings, free baseball, reader questions
Occasionally the World’s Worst Sports Blog likes to answer reader questions, which come in either by email at TheBadEconomist@gmail.com or through search engine queries. Today’s reader question: Which teams do the worst in extra innings? There are three measures we can take to see which teams are really the worst in extra innings.
The first is to look at the bare number of extra-innings losses. The Miami Marlins, with an extra-innings record of 6-9, hold that honor. That gives them an extra-innings win-loss percentage of .400, which isn’t great, but it’s well within the realm of chance. In fact, if extra-innings games really are a statistical crapshoot, then the margin of error for 15 games is about .130.
There are a few teams that do worse in extra innings than Miami, assuming you ignore the number of games played. Both the Texas Rangers and the Toronto Blue Jays are 1-3 in extras for a win-loss of .250, and the Washington Nationals and Los Angeles Dodgers aren’t much better with records of 3-8 and attendant win percentages of .273. Those are still within the margin of error for such a small sample size. In fact, almost no teams are statistically better than chance in extra innings – only the Orioles, with a .786 win-loss mark in 14 games, are statistically outside the margin of error.
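Here is a rough Python sketch of the coin-flip check. The .130 figure for 15 games matches one standard error of a 50/50 process; the two-standard-error cutoff in `outside_margin` is my assumption about where “outside the margin of error” kicks in, and the Orioles’ 11-3 record is inferred from the .786 mark over 14 games.

```python
from math import sqrt


def coin_flip_se(games):
    """Standard error of a winning percentage if every game is a 50/50 flip."""
    return sqrt(0.25 / games)


def outside_margin(wins, losses, z=2.0):
    """True if the record is more than z standard errors away from .500."""
    games = wins + losses
    pct = wins / games
    return abs(pct - 0.5) > z * coin_flip_se(games)


print(round(coin_flip_se(15), 3))  # Marlins: 15 extra-inning games
for team, record in [("Marlins", (6, 9)), ("Rangers", (1, 3)),
                     ("Nationals", (3, 8)), ("Orioles", (11, 3))]:
    print(team, outside_margin(*record))
```

With so few games per team, the standard error is enormous, which is why only a record as lopsided as Baltimore’s clears the bar.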
There are a few teams that are much worse than even their scores would lead us to expect. These are teams with really lousy Pythagorean luck – that is, their runs allowed and runs scored predict they’d have a much better record than they actually do.
The unluckiest team so far has been the Chicago White Sox, with a Pythagorean expectation in extra-innings games of .450 and an actual win percentage of .286, for a mark of -.164. Texas and Toronto come in at -.159 and -.156, respectively, with the Dodgers, the Nationals, the Reds, the Mariners, and the Cubs all coming in at -.100 or worse. The Giants are the luckiest team, with a luck number of .222.
What reader questions would you like me to address? Use the form below to make a request!
Tags: Bartolo Colon, Mets
Bartolo Colon’s previous start gave a solid 6 2/3 innings of perfect baseball before Robinson Cano broke it up with a single. Though Bart had raised some concerns earlier in the year with his inconsistent performance, he’s shown he still has the capability to throw an excellent ballgame and not lose control when it gets broken up.
The Mets have a perfectly cromulent rotation – Jonathon Niese, Dillon Gee, Zack Wheeler, and Jacob deGrom are currently in the rotation, and Daisuke Matsuzaka, Dana Eveland, and Carlos Torres each have the capability to function as a swing starter – and a bullpen that is slowly becoming more reliable. Though the Mets are allowing a below-average 3.8 runs per game, they’re also scoring a below-average 3.9, indicating that the highest marginal benefit is probably to disassemble Colon for a bat or two.
Trading Colon would leave a hole in the starting rotation that could be filled with one of the bullpen arms; Eveland and Josh Edgin are both operating as lefty bullpen arms, so Eveland might be the more reasonable choice. In the alternative, a AAA starter, rather than a bullpen pitcher, might be promoted. In either case, that leaves a net zero change in the balance between bats and arms. With Wilmer Flores up from Vegas, we can avoid the unfortunate situation of Eric Campbell playing shortstop again. Wilmer may also be able to help by keeping Campbell out of defensive-replacement scenarios, allowing him to focus on pinch hitting. Alternatively, grabbing a low-budget DH player to function as a professional pinch hitter would also be an option, and allow Flores to continue to develop in Las Vegas.
Essentially, the team needs to start supporting its pitchers more consistently. Dropping Colon would eliminate some variance in run support and open up the possibility of using the extra budget room to develop more run support.