Spitballing: Jim Thome and Recognition
July 21, 2011

Posted by tomflesher in Baseball.

It’s no secret that I’m a fan of Jim Thome. Although he never played in my hometown, Buffalo was Cleveland’s AAA affiliate when I was a wee lad and so I’ve always had a soft spot in my heart for Indians. I also admire Thome’s small-town, farm-boy image. The PepsiMAX Clubhouse in the Corn ad showing Jim asking for autographs played off that image.

Thome’s pretty popular on the internet, based on the proportion of traffic I’m getting from searches for his name. Kyle Kendrick (no, not that one) of the Winfield (Kansas) Daily Courier noticed, though, that the media has been much quieter about Thome’s achievement than about Alex Rodriguez’s same run last year. Kendrick blames the lack of coverage on Thome’s image:

Honestly, I believe it’s because he is too quiet and too humble for his own good. He isn’t flashy like Bonds, or flamboyant like Sosa or making it look easy like Griffey did. Therefore people, including the media, haven’t latched on to him like they have done with other hitters in the past. Add that to the fact that he’s never played more than one season in a very big media market town like New York or Boston or Chicago, and you may come to understand why he isn’t getting the bigtime coverage.

(Let’s leave aside the dismissal of three seasons in Philadelphia and three and a half in Chicago for a moment.)

It’s pretty clear to me why Derek Jeter’s 3000-hit milestone got more coverage than Thome’s: Jeter is, for better or for worse, much better known than Thome. The average fan probably knows Jeter’s face, but it would take a much more interested fan to recognize Thome’s face. Jim was last an All-Star in 2006 and spent five and a half of the last six seasons in the AL Central, meaning that the largest markets that he was regularly exposed to were Detroit and Chicago. (Granted, he spent half a season with the Dodgers.) He’s not well-known enough to be wildly popular, and he’s not hated enough (like Rodriguez) for people to take pleasure in any failure that might happen. As soon as A-Rod’s production slowed down, people started accusing him of choking. Thome’s been like clockwork throughout his career, but even if he did slow down, it’s no fun to call a likeable guy a choker. Gary Sheffield was a Met at the time he hit his 500th, so there was a bump in coverage from being with a large-market team, but he got a lot of coverage too. Is it any coincidence he was widely regarded as a bit of a tool?

As I said earlier, Thome will likely hit his 600th home run in August, and it’ll probably be only a few weeks before the September callups. Minnesota is five games back, but in third place in the AL Central, and 12 games back from the wild card. Thome probably won’t get his glory this postseason. Hopefully he’ll get his recognition when he hits #600, but whether or not he does, he’ll go down in history as the eighth member of an exclusive club that won’t expand for some time longer.

Spitballing: Position changes
June 3, 2011

Posted by tomflesher in Baseball.

First things first: this entry was prompted by Buster Posey and his horrific ankle injury, but it’s not just about him. The first time I started thinking about it seriously was last year, when the Mets’ Carlos Beltran was about to come off the DL and Angel Pagan’s placement was in doubt. Either Gary, Keith, or Ron tripped my “Stuff Keith Hernandez Says” meter by saying that fans had suggested moving Pagan to second base to fill in for the ailing Luis Castillo, and commented that “You can’t just move a guy to second base.” Very true.

Similarly, it’s very hard to “just move a guy” to catcher, which is why a guy like Buster Posey is so valuable. In the National League, the median OPS+ for players with at least 100 plate appearances and who played more than half their games at catcher was 91. Posey’s OPS+ was 129 – that’s over 40% better. If instead you look at first basemen with at least 100 plate appearances, the median OPS+ is 107. All of a sudden, Posey’s offensive value-added drops to about 20% above average, and that’s before accounting for regression to the mean. Moving him to third base instead mitigates the damage and takes full advantage of his arm, but he’s suddenly a much less special player when he’s on the hot corner instead of behind the plate.
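The percentages above are just ratios of OPS+ figures. Here is a minimal sketch of that arithmetic; the function name `pct_above` is my own invention for illustration, and the only inputs are the three OPS+ values quoted in the paragraph.

```python
def pct_above(player_ops_plus: float, median_ops_plus: float) -> float:
    """Return how far the player sits above the positional median, in percent."""
    return (player_ops_plus / median_ops_plus - 1) * 100


posey = 129                                # Posey's OPS+ as quoted above
catcher_edge = pct_above(posey, 91)        # versus the median NL catcher
first_base_edge = pct_above(posey, 107)    # versus the median NL first baseman

print(f"vs. catchers:      {catcher_edge:.1f}% above the median")
print(f"vs. first basemen: {first_base_edge:.1f}% above the median")
```

The same bat is worth roughly twice as much relative to his peers behind the plate as it would be at first base.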

It’s also maddening to hear about efforts to move Derek Jeter to center field. Even though he’s on the downswing, he’s hit well above average every year from 1996 through 2009. Even last year, his 91 OPS+ was acceptable, especially considering his popularity. Granted, he costs his team runs on defense (he’s rarely had a positive defensive Wins Above Replacement), but his offensive contribution more than makes up for it. He’s 6’3″, making him more than big enough to move to first base, and first base doesn’t require him to have the range that center field would. After Jorge Posada hangs it up, splitting  the duties at first base and DH between Jeter and Alex Rodriguez will start to make more sense, and using homegrown prospects to take over at shortstop and third base ensures continuing fan loyalty.

Finally, I’d be remiss if I didn’t mention future Dodgers closer Kenley Jansen. Although his 2.000 OPS last year grossly overstated his batting ability (only two plate appearances, compared with a lifetime .229 batting average in the minors), Jansen is a success story in his move from catcher to fireballing reliever. That was an excellent move by the Dodgers system – they took Jansen’s innate ability (his cannon-like arm) and moved him to a position where his contribution would be optimized. Whether or not Jansen turns out to be a future dominant closer, he’s probably gotten more playing time as a reliever than he ever would have as a catcher, and he’s generated more value for the Dodgers.

Basically, player moves are difficult. It’s important to try to optimize a player’s contribution, and that needs to take into account his defensive talents instead of merely trying to find a place for him to play. I can only hope Buster Posey’s recuperation goes smoothly and there’s a value-maximizing slot for him with the Giants.

Teixeira’s Ability to Pick Up Slack: Re-Evaluating
April 12, 2011

Posted by tomflesher in Baseball, Economics.

In an earlier post, I discussed Yankees broadcaster Michael Kaye’s belief that Mark Teixeira and Robinson Cano were picking up slack during the time in which Alex Rodriguez was struggling to hit his 600th home run. I noticed that Teixeira had hit 18 home runs in 423 plate appearances during the first 93 games of the season for rates of .194 home runs per game and .0426 home runs per plate appearance. During the time between A-Rod’s #599 and #600, Teixeira’s performance was different in a statistically significant way: his production per game was up to .417 home runs per game and .0926 home runs per plate appearance.

Now, let’s take a look at the home stretch of the season. Teixeira played in 52 games, starting 51 of them, and hit 10 home runs in 230 plate appearances. That works out to .1923 home runs per game or .0435 per plate appearance. Those numbers are exceptionally similar to Teixeira’s production in the first stretch of the season, so it seems reasonable to say that those rates represent his standard rate of production.
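For anyone who wants to check the rates quoted so far, here is a small sketch. The dictionary keys and the helper name `hr_rates` are labels I made up for illustration; the counts are the ones given in the text.

```python
def hr_rates(home_runs: int, games: int, plate_appearances: int) -> tuple[float, float]:
    """Return (home runs per game, home runs per plate appearance)."""
    return home_runs / games, home_runs / plate_appearances


# (home runs, games, plate appearances) for each stretch of Teixeira's season
stretches = {
    "first 93 games": (18, 93, 423),
    "A-Rod's drought": (5, 12, 54),
    "home stretch": (10, 52, 230),
}

rates = {name: hr_rates(*counts) for name, counts in stretches.items()}

for name, (per_game, per_pa) in rates.items():
    print(f"{name:16s} {per_game:.4f} HR/game  {per_pa:.4f} HR/PA")
```

The first and last stretches land within a thousandth of each other per plate appearance, while the drought stretch is more than double either one.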

This is prima facie evidence that Teixeira was working to hit more home runs, consciously or subconsciously, during the time that Rodriguez was struggling. The question then becomes, is there a reason to expect production to increase during the stretch between late July and early August? What if Mark was just operating better following the All-Star Break?

I chose a twelve-game stretch immediately following the All-Star Break to evaluate. This period overlaps with the drought between A-Rod’s 599th and 600th home runs, stretching from July 16 to July 28, so six games overlap and six do not. During that time, Teixeira hit 3 home runs in 56 plate appearances. His rate was therefore .0536 home runs per plate appearance.

If we assume that Teixeira’s true rate of production is about .043 home runs per plate appearance (his average over the season, excluding the drought), then the probability of his hitting exactly 3 home runs in a random 56-plate-appearance stretch is

$p(K = k) = {n \choose k}p^k(1-p)^{n-k} = {56 \choose 3}.043^{3}(.957)^{53} \approx .2146$

He has a 43% chance of hitting 3 or more, compared with the complementary 57% probability of hitting fewer than 3. It’s well within the normal expected range. So, the All-Star Break effect is unlikely to explain Teixeira’s abnormal production last July.
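The binomial figures can be reproduced with nothing but the standard library. This is a sketch under the same assumption as the text (a fixed .043 home runs per plate appearance, each plate appearance independent); the helper names `binom_pmf` and `binom_tail` are mine.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return 1 - sum(binom_pmf(i, n, p) for i in range(k))


p_exactly_three = binom_pmf(3, 56, 0.043)
p_three_or_more = binom_tail(3, 56, 0.043)

print(f"P(exactly 3 HR in 56 PA) = {p_exactly_three:.4f}")
print(f"P(3 or more HR in 56 PA) = {p_three_or_more:.4f}")
```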

Teixeira and Cano: Picking up slack?
August 5, 2010

Posted by tomflesher in Baseball, Economics.

Michael Kaye, the YES broadcaster for the Yankees, often pointed out between July 22 and August 4 that the Yankees were turning up their offense to make up for Alex Rodriguez‘s lack of home run production. That seems like it might be subject to significant confirmation bias – seeing a few guys hit home runs when you wouldn’t expect them to might lead you to believe that the team in general has increased its production. So, did the Yankees produce more home runs during A-Rod’s drought?

During the first 93 games of the season, the Yankees hit 109 home runs in 3660 plate appearances for rates of 1.17 home runs per game and .0298 home runs per plate appearance. From July 23 to August 3, they hit 17 home runs in 451 plate appearances over 12 games for rates of 1.42 home runs per game and .0377 home runs per plate appearance. Obviously those numbers are quite a bit higher than expected, but can it be due simply to chance?

Assume for the moment that the first 93 games represent the team’s true production capabilities. Then, using the binomial distribution, the likelihood of hitting exactly 17 home runs in 451 plate appearances is

$p(K = k) = {n\choose k}p^k(1-p)^{n-k} = {451\choose 17}.0298^{17}(.9702)^{434} \approx .0626$

The cumulative probability is about .868, meaning the probability of hitting 17 or fewer home runs is .868 and the probability of hitting more than that is about .132. The probability of hitting 16 or fewer is .805, which means out of 100 strings of 451 plate appearances about 81 of them should end with 16 or fewer home runs. This is a perfectly reasonable number and not inherently indicative of a special performance by A-Rod’s teammates.
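Here is a sketch of the team-level calculation, assuming (as above) that the first 93 games give the true rate of .0298 home runs per plate appearance. The helper names are my own; the cumulative probability is just the sum of the exact probabilities from 0 up to the cutoff.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k home runs in n plate appearances."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k: int, n: int, p: float) -> float:
    """Probability of k or fewer home runs in n plate appearances."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


TEAM_RATE = 0.0298   # Yankees' HR per PA over the first 93 games
PA = 451             # plate appearances during the drought

p_exactly_17 = binom_pmf(17, PA, TEAM_RATE)
p_17_or_fewer = binom_cdf(17, PA, TEAM_RATE)
p_16_or_fewer = binom_cdf(16, PA, TEAM_RATE)

print(f"P(exactly 17)  = {p_exactly_17:.4f}")
print(f"P(17 or fewer) = {p_17_or_fewer:.3f}")
print(f"P(16 or fewer) = {p_16_or_fewer:.3f}")
print(f"P(17 or more)  = {1 - p_16_or_fewer:.3f}")
```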

Kaye frequently cited Mark Teixeira and Robinson Cano as upping their games. Teixeira hit 18 home runs over the first 93 games and made 423 plate appearances for rates of .194 home runs per game and .0426 home runs per plate appearance. From July 23 to August 3, he had 5 home runs in 12 games and 54 plate appearances for rates of .417 per game and .0926 per plate appearance. Hitting at least that many home runs in 54 plate appearances is only about 8% likely at his earlier rate, meaning that either Teixeira did up his game considerably or he was exceptionally lucky.

Cano played 92 games up to July 21, hitting 18 home runs in 400 plate appearances for rates of .196 home runs per game and .045 per plate appearance. During A-Rod’s drought, he hit 3 home runs in 50 plate appearances over 12 games for rates of .25 and .06. Hitting at least 3 home runs in 50 plate appearances is about 39% likely at his earlier rate, which means we don’t have enough evidence to reject the idea that Cano’s performance (though better than usual) is just a random fluctuation.
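The two individual percentages line up with an upper-tail binomial probability: the chance of hitting at least the observed number of home runs, given the player's rate before the drought. That reading is my assumption about how the figures were produced, and the helper name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of k or more home runs in n plate appearances at rate p."""
    fewer = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1 - fewer


# Baseline rates come from each player's pre-drought totals.
teixeira = prob_at_least(5, 54, 18 / 423)
cano = prob_at_least(3, 50, 18 / 400)

print(f"Teixeira, 5+ HR in 54 PA: {teixeira:.3f}")
print(f"Cano,     3+ HR in 50 PA: {cano:.3f}")
```

Neither number clears the usual .05 cutoff, though Teixeira's comes much closer than Cano's.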

It will be interesting to see if Teixeira slows down as a home-run hitter now that Rodriguez’s drought is over.

Quickie: 600th Home Run for A-Rod
August 4, 2010

Posted by tomflesher in Baseball.

Alex Rodriguez finally hit #600 deep to center field in Yankee Stadium on the third anniversary of his 500th home run. A-Rod hit the home run in his first plate appearance. There were 51 plate appearances since #599. He had a final Choke Index of .944, but luckily he won’t run into another milestone home run for at least a few years.

The ball landed in Monument Park, so the Yankees didn’t need to negotiate with a fan to get it back. (A security guard picked it up.) According to Michael Kaye, if the ball had landed in the stands, the Yankees would have been willing to pay for the person who caught the ball to have lunch with Alex Rodriguez and Cameron Diaz in exchange for getting the ball back, on top of an autographed baseball, hat, and bat. That opens interesting questions of valuation, much like those that came up after Doug Mientkiewicz attempted to keep the ball that he caught to make the final out in the 2004 World Series.

Is A-Rod’s Performance Different?
August 3, 2010

Posted by tomflesher in Baseball, Economics.

In games between milestone home runs, is Alex Rodriguez’ hitting similar to other times? (This is all a very polite way of asking, “Does A-Rod choke?”) It’s difficult to answer, because there’s so little data about those milestone home runs. A-Rod, though, has some statistically improbable results and it would be interesting to look at it a bit more closely.

Over 2008-2009, Alex played in 262 games and had 1129 plate appearances with 281 hits, 65 home runs, a triple:double ratio of 1:50, an OBP of .397, and a SLG of .553. His OBP has a margin of error of .0146, so we can be 95% confident that over those years his baseline production would be somewhere between .368 and .426. Absent any time or age effect, that is the range in which A-Rod should produce for any given period.
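The interval can be rebuilt from the OBP and the number of plate appearances. This sketch assumes the margin of error is the square root of p(1-p)/(n-1) and that the 95% range is 1.96 of those on either side; that formula reproduces the quoted figures, but the function name `proportion_interval` is my own.

```python
from math import sqrt


def proportion_interval(p: float, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (margin of error, lower bound, upper bound) for a proportion."""
    moe = sqrt(p * (1 - p) / (n - 1))
    return moe, p - z * moe, p + z * moe


moe, low, high = proportion_interval(0.397, 1129)

print(f"margin of error: {moe:.4f}")
print(f"95% range:       {low:.3f} to {high:.3f}")
```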

Two recent milestone home runs come to mind as examples of Rodriguez’s reputed choking. First, the stretch between home run #499 and #500 was 8 games and 36 plate appearances. (I’m intentionally ignoring extra plate appearances on the days he hit #499 and #500.) During that time, Alex had an OBP of only .306. That’s a difference of .091 over 36 plate appearances and that performance has a standard error of about .078 when compared with his regular performance, implying a t-value of about 1.16. With 35 degrees of freedom, Texas A&M’s t Calculator gives a p-value of about .127, so this difference is marginally within the realm of chance. (The usual cutoff for significance would be .05.)

A-Rod hit his last home run on July 22. Discounting the plate appearances after his last home run, he’s played in 11 games with a paltry .255 OBP and .238 SLG over 47 plate appearances. His .255 OBP has a difference of about .142 and a standard error of about .064. That implies a t-value of about 2.21, with a p-value of about .016. That is, the probability of this difference occurring by chance is less than 2%. That gives us one result as close to significant and one as probably significant.
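Both t-values come from the same recipe: the gap between the baseline OBP and the OBP during the drought, divided by the standard error of the drought OBP. Here is a sketch of that recipe with hypothetical names (`t_value`, `BASELINE_OBP`). It stops at the t-value, since turning that into a p-value needs a t-distribution table or calculator.

```python
from math import sqrt

BASELINE_OBP = 0.397  # A-Rod's 2008-2009 OBP


def t_value(obp: float, plate_appearances: int) -> tuple[float, float]:
    """Return (t-value, standard error) for a stretch compared with the baseline."""
    se = sqrt(obp * (1 - obp) / (plate_appearances - 1))
    return (BASELINE_OBP - obp) / se, se


t_500, se_500 = t_value(0.306, 36)   # between #499 and #500
t_600, se_600 = t_value(0.255, 47)   # since #599

print(f"#499-#500: SE = {se_500:.3f}, t = {t_500:.2f}, df = {36 - 1}")
print(f"since #599: SE = {se_600:.3f}, t = {t_600:.2f}, df = {47 - 1}")
```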

As a side note, A-Rod’s Choke Index continues to rise. He’s gone 48 plate appearances without a home run, and at a rate of .055 home runs per plate appearance the probability of that occurring by chance is about .066. That leaves his Choke Index at .934.

The Choke Index
August 1, 2010

Posted by tomflesher in Baseball.

It’s been quite a while since Alex Rodriguez hit Home Run #599 – nine days since July 22, but more quantifiably, 42 plate appearances. Just how much of a slump is he in? I’d like to propose a quantifiable answer: the Choke Index.

From 2000 to 2009, A-Rod was hitting approximately .064 home runs per plate appearance. In 2008 he hit .059 and in 2009 he hit .056, so it’s probably much fairer to use a slightly lower rate. I’m going to make the assumption that Rodriguez’s true production is about .055 home runs per plate appearance, since he exhibited a downward trend and his 2010 production has been very low. (It also cuts him some additional slack in the Choke Index.)

Simply, we should assume that A-Rod’s failure to produce is merely the result of chance, and not due to choking or media distraction or even Rodriguez’s discomfort with the special chipped baseballs. (A better man than I would call this the Numbered Ball Effect.) Then, we should see how likely that is.

At .055 home runs per plate appearance, the likelihood of going 42 plate appearances without a home run is $(1-.055)^{42}$ or approximately .093. The Choke Index is simply $1-(likelihood)$ or, in this case, .907. As it becomes progressively less likely that Rodriguez will go another plate appearance without hitting a home run, the Choke Index number rises. A theoretical Choke Index of 1 would indicate that the player’s lack of home run hitting is nearly impossible to describe by chance alone.
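As a quick sketch, the Choke Index is a one-liner. The function name `choke_index` is mine, and it assumes every plate appearance is an independent trial with the same home run rate.

```python
def choke_index(hr_rate: float, pa_without_hr: int) -> float:
    """One minus the chance of going this many plate appearances without a home run."""
    likelihood = (1 - hr_rate) ** pa_without_hr
    return 1 - likelihood


arod_likelihood = (1 - 0.055) ** 42
arod_index = choke_index(0.055, 42)

print(f"likelihood of 42 homerless PA: {arod_likelihood:.3f}")
print(f"Choke Index:                   {arod_index:.3f}")
```

Because the likelihood shrinks with every homerless plate appearance, the index can only climb until the next home run resets the count.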

A-Rod’s Choke Index between #499 and #500 was about .877. This is a man who doesn’t handle milestones well.

Another example was Gary Sheffield in 2009, when he was attempting to hit his 500th home run. In the previous two years, he hit approximately .041 home runs per plate appearance. Much was made of Sheffield’s trouble hitting #500, but since he was hitting almost exclusively as a pinch hitter, he simply didn’t have many opportunities. Between his final plate appearance on September 26 of 2008 and his only plate appearance on April 17 of 2009, Sheffield went 21 plate appearances without hitting a homer. That gives him a choke index of .556.

Barry Bonds, meanwhile, was hitting .065 home runs per plate appearance in the seasons prior to his record-breaking home run #756. #755 was hit in Bonds’ first plate appearance on August 4, 2007. Bonds made 3 more plate appearances, all walks, in that game. He hit #756 in his third plate appearance only three days later on August 7.  He had August 5 off and made 4 plate appearances on August 6, meaning that Bonds went 9 plate appearances between home runs, giving him a choke index of .454.

Rodriguez will hit his 600th home run eventually, but it’s getting painful to watch.

The Best Game Ever
July 30, 2010

Posted by tomflesher in Baseball.

Two of my favorite things about baseball happened during tonight’s game between the Yankees and the Indians.

First of all, in the top of the ninth inning, corner infielder Andy Marte pitched for the Indians. Marte pitched a perfect ninth and coincidentally struck out Nick Swisher, who was brought in to pitch for the Yankees in a similar situation last year and struck out Gabe Kapler of the Tampa Bay Rays. I can’t promise it’s true, but I think that puts Swisher at the top of the list for involvement in position player pitcher strikeouts.

Marte’s presence was necessary because the Indians used seven other pitchers. Starter Mitch Talbot went only two innings, and the Indians got another two out of Rafael Perez. Frank Hermann took the loss for the Indians during his 1 1/3 innings. Tony Sipp pitched another 1 1/3, and Joe Smith managed to give up four earned runs in 1/3 of an inning before being removed for Jess Todd for an inning. In the top of the 9th, Marte was all the Indians had left.

Not to be outdone, Joe Girardi gave up his designated hitter by moving his DH – funnily enough, it was Swisher – into right field as part of a triple switch. Swisher moved to right field; Colin Curtis moved from right field to left field; Marcus Thames moved from left field to third base;  finally, pitcher Chan Ho Park was put into the batting order in place of Alex Rodriguez, who came out of the game.

Finally, A-Rod is up to 33 plate appearances without a home run. Assuming his standard rate of .064 home runs per plate appearance, the likelihood of this happening by chance is $.936^{33} = .113 \approx 11.3 \%$. I stand by my belief that there’s something other than chance (i.e. distraction or other mental factors) causing Rodriguez’s hitting to suffer.

The 600 Home Run Almanac
July 28, 2010

Posted by tomflesher in Baseball, Economics.

People are interested in players who hit 600 home runs, at least judging by the Google searches that point people here. With that in mind, let’s take a look at some quick facts about the 600th home run and the people who have hit it.

Age: There are six players to have hit #600. Sammy Sosa was the oldest at 39 years old in 2007. Ken Griffey, Jr. was 38 in 2008, as were Willie Mays in 1969 and Barry Bonds in 2002. Hank Aaron was 37. Babe Ruth was the youngest at 36 in 1931. Alex Rodriguez, who is 35 as of July 27, will almost certainly be the youngest player to reach 600 home runs. If both Manny Ramirez and Jim Thome hang on to hit #600 over the next two to three seasons, Thome (who was born in August of 1970) will probably be 42 in 2012; Ramirez (born in May of 1972) will be 41 in 2013. (In an earlier post that’s when I estimated each player would hit #600.) If Thome holds on, then, he’ll be the oldest player to hit his 600th home run.

Productivity: Since 2000 (which encompasses Rodriguez, Ramirez, and Thome in their primes), the average league rate of home runs per plate appearances has been about .028. That is, a home run was hit in about 2.8% of plate appearances. Over the same time period, Rodriguez’ rate was .064 – more than double the league average. Ramirez hit .059 – again, over double the league rate. Thome, for his part, hit at a rate of .065 home runs per plate appearance. From 2000 to 2009, Thome was more productive than Rodriguez.

Standing Out: Obviously it’s unusual for them to be that far above the curve. There were 1,877,363 plate appearances (trials) from 2000 to 2009. The margin of error for a proportion like the rate of home runs per plate appearance is

$\sqrt{\frac{p(1-p)}{n-1}} = \sqrt{\frac{.028(.972)}{1,877,362}} = \sqrt{\frac{.027}{1,877,362}} \approx \sqrt{\frac{14}{1,000,000,000}} = .00012$

Ordinarily, we expect a random individual chosen from the population to land within the space of $p \pm 1.96 \times MoE$ 95% of the time. That means our interval is

$.028 \pm .00024$

That means that all three of the players are well outside that confidence interval. (However, it’s likely that home run hitting is highly correlated with other factors that make this test less useful than it is in other situations.)
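Here is a sketch of the margin-of-error calculation, centering the interval on the league rate of .028 and then checking where each slugger falls. The variable names are mine; the rates and the plate appearance count are the ones quoted above.

```python
from math import sqrt

LEAGUE_RATE = 0.028            # league HR per PA, 2000-2009
PLATE_APPEARANCES = 1_877_363  # league plate appearances, 2000-2009

margin_of_error = sqrt(LEAGUE_RATE * (1 - LEAGUE_RATE) / (PLATE_APPEARANCES - 1))
half_width = 1.96 * margin_of_error
low, high = LEAGUE_RATE - half_width, LEAGUE_RATE + half_width

player_rates = {"Rodriguez": 0.064, "Ramirez": 0.059, "Thome": 0.065}
outside = {name: not (low <= rate <= high) for name, rate in player_rates.items()}

print(f"margin of error: {margin_of_error:.5f}")
print(f"interval:        {low:.5f} to {high:.5f}")
for name, is_outside in outside.items():
    print(f"{name}: {'outside' if is_outside else 'inside'} the interval")
```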

Alex’s Drought: Finally, just how likely is it that Alex Rodriguez will go this long without a home run? He hit his last home run in his fourth plate appearance on July 22. He had a fifth plate appearance in which he doubled. Since then, he’s played in five games totalling 22 plate appearances, so he’s gone 23 plate appearances without a home run. Assuming his rate of .064 home runs per plate appearance, how likely is that? We’d expect (.064*23) = about 1.5 home runs in that time, but how unlikely is this drought?

The binomial distribution is used to model strings of successes and failures in tests where we can say clearly whether each trial ended in a “yes” or “no.” We don’t need to break out that tool here, though – if the probability of a home run is .064, the probability of anything else is .936. The likelihood of a string of 23 non-home runs is

$.936^{23} = .218$

It’s only about 22% likely that this drought happened only by chance. The better guess is that, as Rodriguez has said, he’s distracted by the switching to marked baseballs and media pressure to finally hit #600.
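To see why the shortcut works, note that a homerless streak is just the binomial case with zero successes, where the counting term drops out. This sketch checks that the two routes agree; the variable names are mine.

```python
from math import comb

HR_RATE = 0.064   # A-Rod's HR per PA, 2000-2009
STREAK = 23       # plate appearances since his last home run

expected_hr = HR_RATE * STREAK

# Shortcut: every one of the 23 plate appearances ends in something else.
shortcut = (1 - HR_RATE) ** STREAK

# Full binomial with k = 0: comb(n, 0) is 1 and p**0 is 1, so only the
# (1 - p)**n term survives.
binomial_k0 = comb(STREAK, 0) * HR_RATE**0 * (1 - HR_RATE) ** STREAK

print(f"expected home runs in {STREAK} PA: {expected_hr:.2f}")
print(f"probability of zero home runs:   {shortcut:.3f}")
```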

600 Home Runs: Who’s Second?
July 25, 2010

Posted by tomflesher in Baseball, Economics.

Alex Rodriguez is, as I’m writing this, sitting at 599 home runs. Almost certainly, he’ll be the next player to hit the 600 home-run milestone, since the next two active players are Jim Thome at 575 and Manny Ramirez at 554. Today’s Toyota Text Poll (which runs during Yankee games on YES) asked which of those two players would reach #600 sooner.

There are a few levels of abstraction to answering this question. First of all, without looking at the players’ stats, Thome gets the nod at the first order because he’s significantly closer than Ramirez. Hitting 25 home runs is easier than hitting 46, so Thome will probably get there first.

At the second order, we should take a look at the players’ respective rates. Over the past two seasons, Thome has averaged a rate of .053 home runs per plate appearance, while Ramirez has averaged .041 home runs per plate appearance. With fewer home runs to hit and a higher likelihood of hitting one each time he makes it to the plate, Thome stays more likely to hit #600 before Ramirez does… but how much more likely?

Using the binomial distribution, I tested the likelihood that each player would hit his required number of home runs in different numbers of plate appearances to see where that likelihood reached a maximum. For Thome, the probability increases until 471 plate appearances, then starts decreasing, so roughly, I expect Thome to hit his 25th home run within 471 plate appearances. For Manny, that maximum doesn’t occur until 1121 plate appearances. Again, the nod has to go to Thome. He’ll probably reach the milestone in less than half as many plate appearances.
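One way to read that search, sketched here under my own assumptions: for each candidate number of plate appearances, compute the binomial probability of hitting exactly the required number of home runs, then keep the candidate where that probability peaks. The function names and the search cap of 3000 plate appearances are hypothetical choices for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k home runs in n plate appearances."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def most_likely_pa(hr_needed: int, hr_rate: float, max_pa: int = 3000) -> int:
    """Number of plate appearances at which P(exactly hr_needed HR) is largest."""
    candidates = range(hr_needed, max_pa + 1)
    return max(candidates, key=lambda n: binom_pmf(hr_needed, n, hr_rate))


thome_pa = most_likely_pa(25, 0.053)   # 600 - 575 = 25 to go
manny_pa = most_likely_pa(46, 0.041)   # 600 - 554 = 46 to go

print(f"Thome:   {thome_pa} plate appearances")
print(f"Ramirez: {manny_pa} plate appearances")
```

The peak sits just below the home runs needed divided by the rate (25/.053 and 46/.041), which is a handy sanity check on the search.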

But wait. How many plate appearances is that, anyway? Until recently, Manny played 80-90% of the games in a season. Last year, he played 64%. So far the Dodgers have played 99 games and Manny appeared in 61 of them, but of course he’s spent time on the disabled list this year. Let’s make the generous assumption that Manny will play in 75% of the games in each season starting with this one. Then, let’s look at his average plate appearances per game. For most of his career, he averaged between 4.1 and 4.3 plate appearances per game, but this year he’s down to 3.6. Let’s make the (again, generous) assumption that he’ll get 4 plate appearances in each game from now on. At that rate, to get 1121 plate appearances, he needs to play in 280.25 games, which averages to about 1.73 seasons of 162 games or about 2.31 seasons of 75% playing time.
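Converting plate appearances into seasons is a chain of divisions, sketched below under the stated assumptions (4 plate appearances per game, 75% of a 162-game schedule). The function name `seasons_needed` and its parameter names are mine.

```python
def seasons_needed(
    pa_needed: float,
    pa_per_game: float,
    share_of_games: float,
    season_games: int = 162,
) -> tuple[float, float, float]:
    """Return (games needed, full seasons, seasons at the given playing time)."""
    games = pa_needed / pa_per_game
    full_seasons = games / season_games
    partial_seasons = games / (season_games * share_of_games)
    return games, full_seasons, partial_seasons


games, full_seasons, partial_seasons = seasons_needed(1121, 4, 0.75)

print(f"games needed:               {games:.2f}")
print(f"seasons at 162 games:       {full_seasons:.2f}")
print(f"seasons at 75% playing time: {partial_seasons:.2f}")
```

The same function works for any other player: swap in his plate appearances needed, plate appearances per game, and share of games played.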

Thome, on the other hand, has consistently played in 80% or more of his team’s games but suffered last year and this year because he hasn’t been serving as an everyday player. He pinch-hit in the National League last year and has, in Minnesota, played in about 69% of the games averaging only 3 plate appearances in each. Let’s give Jim the benefit of the doubt and assume that from here on out he’ll hit in 70% of the games and get 3.5 plate appearances (fewer games and fewer appearances than Ramirez). To get 471 plate appearances he’d need about 134.6 games, which equates to about 83% of a 162-game season or about 1.19 seasons with 70% playing time. Even if we downgrade Thome to 2.5 PA per game and 66% playing time, that still gives us an expectation that he’ll hit #600 within the next 1.8 real-time seasons.

Since Thome and Ramirez are the same age, there’s probably no good reason to expect one to retire before the other, and they’ll probably both be hitting as designated hitters in the AL next year. As a result, it’s very fair to expect Thome to A) reach 600 home runs and B) do it before Manny Ramirez.