Why isn’t baseball’s free agent market clearing? February 21, 2019

Posted by tomflesher in Baseball, Economics.
There’s been some discussion of the free agent market in baseball and its alleged inefficiency – that players like Manny Machado don’t sign until February and Bryce Harper remains unsigned, for example. Adam Wainwright has even threatened a strike over free agency.

Certainly, there are many factors in play. However, the fact that there are stars who aren’t being picked up doesn’t mean that there’s anything nefarious afoot. Brad Brach, who signed with the Cubs on February 11, has complained on Twitter about teams’ use of algorithms to value players.

Let’s take that at face value and build a model of algorithms and noise. (It seems that Brach is implying collusion by teams, but in a future post I’ll discuss why I don’t think that’s likely.)

First, the simplifying assumptions:

  1. Players have an accurate valuation of their own talent levels. (This is difficult to justify, because players have an incentive to overvalue themselves, but the conclusions would not change qualitatively if this assumption were relaxed.)
  2. Teams have a noisy valuation of players based on the players’ talent levels. (This is essentially the face value of Brach’s claim: that teams use ‘algorithms’ based on player talent.)
  3. There are two teams with similar noise levels. (Modeling different forms of bias, or different preferences by teams, would probably not change the outcome very much, but would affect the distribution of players. Meanwhile, the market for some players is fairly large, but for many it’s very small, especially as prices rise.)
  4. All contracts are for one year. (This avoids the trouble of modeling players’ intertemporal rates of substitution, but a future version of this model may include preferences about both pay and number of years.)
  5. If a player is offered a contract that he thinks accurately reflects or overpays him, he signs with the team that offers him the bigger contract.

R code for a simulated free agent season:

data <- matrix(0, nrow = 1000, ncol = 5)
for (i in 1:1000) {
  data[i, 1] <- runif(1)                                   # true talent level
  data[i, 2] <- data[i, 1] + rnorm(1, mean = 0, sd = .05)  # team 1's valuation
  data[i, 3] <- data[i, 1] + rnorm(1, mean = 0, sd = .1)   # team 2's valuation
  data[i, 4] <- max(data[i, 2], data[i, 3])                # best offer on the table
  data[i, 5] <- if (data[i, 4] >= data[i, 1]) 1 else 0     # 1 if the player signs
}

Basically, generate a vector of random player talent levels; each team’s valuation is the true talent plus zero-mean noise, with standard deviation .05 for team 1 and .1 for team 2. 1000 players go on the market. Outcome:

      V1                   V2                 V3                 V4                V5
 Min.   :0.0008885   Min.   :-0.1324   Min.   :-0.2024   Min.   :-0.1324   Min.   :0.000
 1st Qu.:0.2613380   1st Qu.: 0.2621   1st Qu.: 0.2608   1st Qu.: 0.3012   1st Qu.:1.000
 Median :0.4984726   Median : 0.4968   Median : 0.5133   Median : 0.5511   Median :1.000
 Mean   :0.4997539   Mean   : 0.4987   Mean   : 0.5087   Mean   : 0.5480   Mean   :0.754
 3rd Qu.:0.7425434   3rd Qu.: 0.7430   3rd Qu.: 0.7566   3rd Qu.: 0.7912   3rd Qu.:1.000
 Max.   :0.9995596   Max.   : 1.1115   Max.   : 1.2508   Max.   : 1.2508   Max.   :1.000

That’s right – only 754 of the 1000 players signed. (Across multiple simulations, the signing rate hovers around 75%. This makes sense theoretically: the noise is zero-mean and the valuations are independent, so each team undervalues a given player with probability 1/2, and both teams undervalue him with probability 1/4.)
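That 75% figure is easy to sanity-check with a larger sample. A quick sketch under the same assumptions (talent uniform on [0, 1], zero-mean normal noise with standard deviations .05 and .1) – written in Python rather than R, purely for illustration:

```python
import random

random.seed(1)
n = 100_000
signed = 0
for _ in range(n):
    t = random.random()                 # true talent level
    offer1 = t + random.gauss(0, 0.05)  # team 1's noisy valuation
    offer2 = t + random.gauss(0, 0.1)   # team 2's noisy valuation
    if max(offer1, offer2) >= t:        # signs if the best offer is at least fair
        signed += 1

print(signed / n)  # close to 0.75
```

With 100,000 simulated players, the signing rate settles very near the theoretical 3/4.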

Interestingly, player 973 is unsigned:

[973,] 0.9683805341  0.9472948838  0.874961530  0.9472948838    0

He valued himself at roughly the 97th percentile, but got unlucky: both teams evaluated him below that. Team 1 would offer him about a 95th-percentile contract, and team 2 would rank him even further down.

Meanwhile, player 25 gets lucky:

[25,] 0.0109281745  0.0236191242  0.089982324  0.0899823237    1

Despite being around the 1st percentile, both teams accidentally overvalue him, and his contract ends up suited to a player with nearly nine times his value. (For the phenomenon where competition reliably leads to overpayment, see the “winner’s curse.”)
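To put a rough number on that effect under the same assumptions, a short Python sketch (again, Python only for illustration) estimates the average overpayment – best offer minus true talent – among the players who do sign:

```python
import random

random.seed(1)
n = 100_000
overpays = []
for _ in range(n):
    t = random.random()                    # true talent level
    best = max(t + random.gauss(0, 0.05),  # team 1's offer
               t + random.gauss(0, 0.1))   # team 2's offer
    if best >= t:                          # player signs only when not underpaid
        overpays.append(best - t)

avg_overpay = sum(overpays) / len(overpays)
print(round(avg_overpay, 3))
```

Conditioning on signing guarantees the average gap is positive: the signed population is exactly the set of players whose best offer came in at or above their true talent.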

We’re going to see both of these types of errors in any market where there’s a subjective evaluation of players. And if teams are using algorithmic valuations in particular, much of the information those algorithms rely on is publicly available; even if teams weight it differently, efficient algorithms are likely to produce similar results.