## How does scoring affect NHL standings?
*February 2, 2016*

*Posted by tomflesher in Economics, Hockey, Sports.*

Tags: hockey, modeling points, NHL, parameter estimation


Patrick Kane leads the league in goals and assists (and therefore total points). His Blackhawks are second in the league but lead the Western Conference, and are all but a lock to make the playoffs this season even with 29 games to go. Kane’s 30 goals (and 27.8 goals created) are a touch ahead of the second-place Jamie Benn and Alex Ovechkin, who each have 28 goals. Just how much, though, are those extra goals worth, and would it be better to focus on defense instead? In this entry, I’ll use some econometric modeling techniques to float an answer to that question.

When a team wins a game, it receives 2 points in the standings; a team that loses in regulation receives no points, but a team that loses in overtime or a shootout gets a consolation point for being tied at the end of regulation. Since points, rather than win-loss records, determine who makes the playoffs, it’s possible for a playoff team to have fewer wins than a team that doesn’t make the playoffs. Though that didn’t happen in 2014-15, the Nashville Predators did finish ahead of the eventual Stanley Cup-winning Chicago Blackhawks by having more points (104 to 102) despite having one fewer win (47 to 48). Nashville’s 10 overtime losses to Chicago’s 6 made the difference.
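The Nashville and Chicago totals can be checked with a couple of lines of Python. This is just a sketch of the scoring rule described above; the function name `standings_points` is mine, not anything official.

```python
def standings_points(wins: int, overtime_losses: int) -> int:
    """Two points per win plus one point per overtime or shootout loss."""
    return 2 * wins + overtime_losses


# 2014-15 figures quoted in the paragraph above
nashville = standings_points(wins=47, overtime_losses=10)
chicago = standings_points(wins=48, overtime_losses=6)
print(nashville, chicago)  # 104 102
```

One fewer win costs Nashville two points, but four extra overtime losses more than make up for it.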

In a previous post, I calculated a Pythagorean exponent useful for estimating a hockey team’s win-loss percentage and found a value of 2.11; a similar method to calculate the percentage of available points yields an exponent of 2.09. (That is, win-loss percentage and points percentage are really fairly close, in expectation terms.) Those models make the assumption that a goal against is worth the same as a goal for. Let’s try a couple of things to estimate the usefulness: first, let’s keep the same assumption, and fit a model of *Points = a + b × Goals For – b × Goals Against*. Using 2014-2015 data, that gives us an optimal model of *Points = 92.2 + .357 × Goals For – .357 × Goals Against*, with a sum of squared errors of 557.355.
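Because the same coefficient sits on goals for and goals against, the restricted model is really a one-variable regression of points on goal differential. Here is a minimal sketch of that fit in plain Python using the textbook least-squares formulas. The function name and the goal totals are hypothetical, and the points are generated from the fitted line quoted above purely as a self-check, so this is not the real 2014-15 data set.

```python
def fit_symmetric(goals_for, goals_against, points):
    """Least-squares fit of points = a + b * (GF - GA); returns (a, b, sse)."""
    diffs = [gf - ga for gf, ga in zip(goals_for, goals_against)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    mean_p = sum(points) / n
    sxx = sum((d - mean_d) ** 2 for d in diffs)
    sxy = sum((d - mean_d) * (p - mean_p) for d, p in zip(diffs, points))
    b = sxy / sxx
    a = mean_p - b * mean_d
    sse = sum((p - (a + b * d)) ** 2 for d, p in zip(diffs, points))
    return a, b, sse


# Hypothetical goal totals (NOT real NHL data); points come straight from
# the fitted line in the post, so the fit should hand the line back.
check_gf = [250, 220, 200, 180]
check_ga = [200, 210, 215, 240]
check_pts = [92.2 + 0.357 * (gf - ga) for gf, ga in zip(check_gf, check_ga)]

a, b, sse = fit_symmetric(check_gf, check_ga, check_pts)
print(f"{a:.1f} {b:.3f}")  # 92.2 0.357
```

Fed the actual season totals, the same function would report the intercept, the per-goal value, and the sum of squared errors.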

Relaxing that assumption – allowing a goal for to be worth a different amount than a goal against – is theoretically justified by the fact that a losing team not only scores no points but loses the opportunity to score an overtime point, too. If we allow that flexibility, 2014-2015 gives us an estimated *Points = 88.2 + .366 × Goals For – .347 × Goals Against*, with a sum of squared errors of 555.594.

Since 88.2 points are expected in a season of 0-0 games, the expected points for each team at the beginning of a game are approximately 88.2/82 = 1.076. Every game awards two points to the winner, an average of one per team, but the point given to the eventual loser of a game tied after regulation pulls that average up a smidge.
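To make the two fitted equations concrete, here is a small Python sketch that evaluates them. The coefficients are the ones quoted above; the function names and the 220-goal team are hypothetical, chosen only to show the marginal values.

```python
GAMES = 82


def points_symmetric(gf, ga):
    """Restricted model: a goal for and a goal against are worth the same."""
    return 92.2 + 0.357 * gf - 0.357 * ga


def points_asymmetric(gf, ga):
    """Unrestricted model: separate coefficients on goals for and against."""
    return 88.2 + 0.366 * gf - 0.347 * ga


# Expected points per game for a team that never scores or concedes
baseline_per_game = points_asymmetric(0, 0) / GAMES
print(f"{baseline_per_game:.3f}")  # 1.076

# Hypothetical team with 220 goals for and 220 against
gf, ga = 220, 220
gain_from_scoring = points_asymmetric(gf + 1, ga) - points_asymmetric(gf, ga)
gain_from_defending = points_asymmetric(gf, ga - 1) - points_asymmetric(gf, ga)
print(f"{gain_from_scoring:.3f} vs {gain_from_defending:.3f}")  # 0.366 vs 0.347
```

Since the model is linear, the gap between scoring one more and allowing one fewer is the same at every goal total: about .019 points per goal.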

The upshot of all of this? Defense is important, but since the expected points for a goal are higher than the expected penalty for allowing the other team to score, a team at the margin should consider signing a roving defenseman rather than a strictly defensive player – and it wouldn’t be crazy for a losing team to play four attackers and one defender for the last few minutes of the game, even before pulling the goalie.

## NHL Luck and Playoffs through February 1
*February 1, 2016*

*Posted by tomflesher in Hockey, Sports.*

Tags: Hockey analytics, NHL, Pythagorean luck


Following the technically brilliant performance of John Scott in last night’s All-Star Game,^{1} the NHL is slightly past its midway point in the season. Most teams have played 49 of their 82 games, and while the Buffalo Sabres haven’t been mathematically eliminated yet, winning every game for the remainder of the season would only get them as many points as last year’s third-place finisher in the Eastern Conference.

In other words, it’s spring.

Currently, the Eastern Conference would send Florida, Tampa Bay, and Detroit automatically to the playoffs from the Atlantic division, along with Washington and both New York teams. The wild cards would be Boston and New Jersey, with the latter at 52 points. The Sabres have 44.
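For a sense of the Sabres’ ceiling, here is a quick sketch of the best-case arithmetic. It assumes the 50 games played listed for Buffalo in the table below and two points for every remaining game; the function name is mine.

```python
SEASON_GAMES = 82


def max_points(current_points: int, games_played: int) -> int:
    """Points a team would finish with by winning every remaining game."""
    return current_points + 2 * (SEASON_GAMES - games_played)


print(max_points(44, 50))  # 108
```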

The Western Conference would send Chicago, Dallas, and St. Louis from the Central and Los Angeles, San Jose, and Arizona automatically from the Pacific. Both wild card slots would come from the Central, going to Nashville with 56 and Colorado with 57. Minnesota (55) would be edged out, though they’d have made the playoffs instead of Arizona under last year’s system.

That’s a pisser for the Wild, who are actually playing about 3.3 wins below expectation; in other words, for their goals for and goals against, we’d expect about 26.3 wins, but they only have 23. Arizona, meanwhile, is at 24 wins, but “should” have only about 21.7. Though Minnesota can overtake Nashville or Colorado relatively easily, it looks like the more competitive Central division is going to lead to someone being excluded from the playoffs who would have made it under last year’s system.
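The luck figures are just actual wins minus Pythagorean wins. Below is a minimal Python sketch; it assumes expected wins are games played times the Pythagorean percentage with the 2.11 exponent from the earlier post, which may not be exactly how the table was built. Function names are hypothetical.

```python
def pythagorean_pct(goals_for, goals_against, exponent=2.11):
    """Expected winning percentage from goals for and goals against."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def expected_wins(goals_for, goals_against, games_played, exponent=2.11):
    # Assumption: expectation is spread over all games played
    return games_played * pythagorean_pct(goals_for, goals_against, exponent)


def luck(wins, pyth_wins):
    """Wins above (positive) or below (negative) expectation."""
    return wins - pyth_wins


print(round(luck(23, 26.3), 1))  # Minnesota: -3.3
print(round(luck(24, 21.7), 1))  # Arizona: 2.3
```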

Luck numbers through the All-Star Break follow.

| Team | Pythagorean wins | Luck | GP | W | L | PTS |
|---|---|---|---|---|---|---|
| Arizona Coyotes | 21.7 | 2.3 | 49 | 24 | 20 | 53 |
| Los Angeles Kings | 28.0 | 2.0 | 49 | 30 | 16 | 63 |
| Washington Capitals | 33.2 | 1.8 | 47 | 35 | 8 | 74 |
| St. Louis Blues | 26.2 | 1.8 | 52 | 28 | 16 | 64 |
| Chicago Blackhawks | 31.5 | 1.5 | 53 | 33 | 16 | 70 |
| Colorado Avalanche | 26.0 | 1.0 | 52 | 27 | 22 | 57 |
| Ottawa Senators | 22.1 | 0.9 | 50 | 23 | 21 | 52 |
| Philadelphia Flyers | 20.1 | 0.9 | 47 | 21 | 18 | 50 |
| Anaheim Ducks | 21.1 | 0.9 | 47 | 22 | 18 | 51 |
| Dallas Stars | 30.2 | 0.8 | 50 | 31 | 14 | 67 |
| Calgary Flames | 20.4 | 0.6 | 48 | 21 | 24 | 45 |
| Detroit Red Wings | 24.5 | 0.5 | 49 | 25 | 16 | 58 |
| New Jersey Devils | 24.5 | 0.5 | 50 | 25 | 20 | 55 |
| New York Rangers | 27.0 | 0.0 | 49 | 27 | 17 | 59 |
| Edmonton Oilers | 19.1 | -0.1 | 50 | 19 | 26 | 43 |
| Winnipeg Jets | 22.2 | -0.2 | 49 | 22 | 24 | 47 |
| Carolina Hurricanes | 23.2 | -0.2 | 51 | 23 | 20 | 54 |
| San Jose Sharks | 26.3 | -0.3 | 48 | 26 | 18 | 56 |
| Nashville Predators | 24.4 | -0.4 | 50 | 24 | 18 | 56 |
| Florida Panthers | 29.5 | -0.5 | 49 | 29 | 15 | 63 |
| Tampa Bay Lightning | 27.5 | -0.5 | 49 | 27 | 18 | 58 |
| Pittsburgh Penguins | 24.6 | -0.6 | 48 | 24 | 17 | 55 |
| Buffalo Sabres | 20.7 | -0.7 | 50 | 20 | 26 | 44 |
| New York Islanders | 26.0 | -1.0 | 47 | 25 | 16 | 56 |
| Montreal Canadiens | 25.0 | -1.0 | 50 | 24 | 22 | 52 |
| Vancouver Canucks | 21.1 | -1.1 | 50 | 20 | 19 | 51 |
| Boston Bruins | 27.1 | -1.1 | 49 | 26 | 18 | 57 |
| Toronto Maple Leafs | 20.0 | -3.0 | 48 | 17 | 22 | 43 |
| Minnesota Wild | 26.3 | -3.3 | 49 | 23 | 17 | 55 |

===

^{1}Since I will never be able to type that phrase again, I figured I’d use it while I could.

## NHL Pythagorean Luck through December 10, 2015
*December 11, 2015*

*Posted by tomflesher in Hockey, Sports.*

Tags: hockey, NHL, Pythagorean expectation, wins above expectation


Below is a plot of NHL teams’ Pythagorean luck through games played on December 10. The bubbles are scaled to the number of wins each team has.

Shockingly, the 12-16 Calgary Flames are 2.4 wins above their expectation, meaning that they should really be a 10-18 or 9-19 team right now. Meanwhile, the Canucks are suffering at 3.4 wins below expectation; at 11-19, they could easily be a .500 team if a few pucks had bounced differently.

Lucky wins for each team follow:

| Team | Lucky Wins |
|---|---|
| Dallas Stars | 2.97 |
| Montreal Canadiens | -1.23 |
| Washington Capitals | 1.58 |
| New York Rangers | -1.24 |
| Los Angeles Kings | 1.38 |
| New York Islanders | -0.77 |
| Detroit Red Wings | 1.09 |
| St. Louis Blues | 1.08 |
| Nashville Predators | 0.30 |
| Ottawa Senators | -0.21 |
| Chicago Blackhawks | -0.29 |
| Boston Bruins | -0.60 |
| Minnesota Wild | -0.23 |
| Florida Panthers | -0.93 |
| Pittsburgh Penguins | 1.28 |
| New Jersey Devils | -0.66 |
| Tampa Bay Lightning | -1.65 |
| Philadelphia Flyers | 1.41 |
| Winnipeg Jets | 0.58 |
| Vancouver Canucks | -3.40 |
| San Jose Sharks | 0.39 |
| Anaheim Ducks | 0.13 |
| Arizona Coyotes | 1.50 |
| Edmonton Oilers | -0.19 |
| Calgary Flames | 2.40 |
| Buffalo Sabres | -1.26 |
| Toronto Maple Leafs | -1.58 |
| Colorado Avalanche | -1.38 |
| Carolina Hurricanes | 0.23 |
| Columbus Blue Jackets | -0.70 |
| League Average | 0.00 |

## After 20 Games, How Are The Sabres Distributing Ice Time?
*November 23, 2015*

*Posted by tomflesher in Economics, Hockey, Sports.*

Tags: analytics, Dan Bylsma, Evander Kane, NHL, Rasmus Ristolainen, Sabres, shots


On Saturday, the Sabres managed to turn another close game into a decisive loss, and once again the game turned on the power play. With the Sabres down 1-0 entering the third against Lindy Ruff’s 16-4 Stars, Tyler Ennis took a crucial holding penalty to allow the Stars onto the power play. Though the Sabres had killed a trio of second-period penalties, they couldn’t take advantage of Dallas’s three penalties (including 51 seconds on the two-man advantage to start the second period). Jamie Benn knocked it in on the power play and then followed up less than two minutes later with a second goal; Antoine Roussel had the go-ahead for Dallas in the first period.

Chad Johnson was back in net and saved 29 of 32, but the Sabres only took 22 shots on net. Dan Bylsma is on record as saying the quality, not the quantity, of shots is his aim. Bylsma certainly hasn’t rewarded a high proportion of shots while on ice: Rasmus Ristolainen and captain Brian Gionta are both well below the 50% mark (meaning that the opposition takes more shots than the Sabres while Ristolainen or Gionta is on the ice), yet Ristolainen is averaging almost 23 minutes per game and Gionta is hovering just below 18. In general, more minutes lead to more shots, of course, but that alone isn’t very informative.

Below is a bubble chart with Fenwick For% on the x-axis; Fenwick For% is the percentage of the total shots taken and attempted while the player is on-ice that were taken by the Sabres. On the y-axis is minutes per game. The bubbles are scaled by shots attempted by the player divided by total time on the ice, to show how influential each individual player’s shooting is on his own numbers.

Evander Kane is a clear outlier in terms of shots per minute, with 64 shots while averaging around 20 minutes per game. He’s also at the top in Fenwick For%, at 56.0. As far as shots per minute go, the ice-time decisions seem to be hit and miss; of course Zach Bogosian is going to get some ice time to reacclimate despite not taking tons of shots, and Jack Eichel, Tyler Ennis, and Gionta are clustered around the 15-20-minute mark despite their low Fenwick numbers because of their relative importance to the team. Oddly, Josh Gorges is clustered right in there with them; like Ristolainen, his low Fenwick numbers seem to raise flags about his defense. Still, the shots per minute on ice seem to be scattered fairly randomly throughout the lineup.

Not every player is a scorer. It’s distressing, though, to see a player like defenseman Cody Franson taking 75 shots and averaging 18:27 over 20 games when he doesn’t seem to be preventing all that many shots; his 47.6 Fenwick For% is well below the 50-50 mark we’d hope for. Meanwhile, Chad Johnson’s .901 save percentage ranks him 50th out of 68 goalies in the NHL and he’s sporting a 117 GA%, meaning he’s allowed 17% more goals than average. (Linus Ullmark has an 87.) Ullmark may not maintain this consistency, but the Sabres have to protect Johnson if they’re going to keep these tight games from breaking open.
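For reference, the two chart quantities that aren’t just read off a stat sheet can be sketched in Python as below. Franson’s inputs (75 shots, 18:27 per game, 20 games) come from the paragraph above; the helper names are hypothetical, and `fenwick_for_pct` assumes Fenwick counts every shot attempt except blocked ones.

```python
def minutes(mm_ss: str) -> float:
    """Convert an 'MM:SS' ice-time string into decimal minutes."""
    m, s = mm_ss.split(":")
    return int(m) + int(s) / 60


def shots_per_minute(shots: int, avg_toi: str, games: int) -> float:
    """Bubble size: a player's shots divided by his total time on ice."""
    return shots / (minutes(avg_toi) * games)


def fenwick_for_pct(unblocked_for: int, unblocked_against: int) -> float:
    """Share of unblocked shot attempts taken by the player's own team."""
    return 100 * unblocked_for / (unblocked_for + unblocked_against)


franson = shots_per_minute(75, "18:27", 20)
print(f"{franson:.3f}")  # 0.203
```

That works out to roughly one shot for every five minutes Franson spends on the ice.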

St. Louis visits Buffalo tonight. Ullmark will start; his counterpart will likely be Jake Allen, who has a .930 save percentage to Ullmark’s .926 and an 82 GA%. Allen, 25, is a solid young goalie, but so is Ullmark. This one will depend on solid defense ... and, hopefully, avoiding power plays.

## Evaluating Hockey Analytics (and bonus luck numbers through November 15, 2015)
*November 16, 2015*

*Posted by tomflesher in Economics, Hockey, Sports.*

Tags: Corsi, Fenwick, Hockey analytics, NHL, Pythagorean expectation, Pythagorean luck, Sabres


The Buffalo Sabres have been having a weird season. They’ve been outshot and won, they’ve outshot their opponents and lost, and (aside from starting goalie Chad Johnson) their ice time leader, defenseman Rasmus Ristolainen, is bringing up the rear in relative Corsi and Fenwick stats. Ristolainen has a nasty -9.5 Corsi Rel, while fellow defenders Jake McCabe, Mark Pysyk, and Mike Weber have 8.5, 9.1, and 13.5, respectively. Ristolainen is averaging over 24 minutes a game, with the other three down by six to eight minutes each. What’s more, Ristolainen appears to be pulling his weight – he’s taken 45 shots, second only to center Jack Eichel, and has 4 goals with an 8.9 shooting percentage. Ristolainen has 11 points (second only to Ryan O’Reilly with 14) but is tied with Tyler Ennis for the team’s worst +/- at -6. See? Weird year so far.

A lot of that is small sample size, of course. The Sabres are only 17 games into the 82-game season. They are, however, looking awfully lucky so far. Just how lucky? Let’s find out using the same Pythagorean metric that shows up in baseball.

Since Corsi and Fenwick both measure attempts to shoot, they’re noisier than goals. I was curious how much noisier, so I fired up R using the 2014 data and decided to update my post from earlier this year about the optimal Pythagorean exponent for the NHL. In it, I set up three minimization problems, all of them estimating winning percentage (and counting overtime losses as losses – the exponent changes only slightly if you estimate points-percentage instead of wins). Those three problems each minimized the sum of squared errors, using the Pythagorean formulas. The first used the traditional inputs of goals for and goals against; the second used Corsi For and Corsi Against; the third used Fenwick For and Fenwick Against.

The Goals For/Goals Against form ($\frac{GF^x}{GF^x + GA^x}$) returned an optimal x value of a bit over 2.11, with a residual sum of squares of .0289. That means that if you take each team’s win-loss percentage and compare it to $\frac{GF^{2.11}}{GF^{2.11} + GA^{2.11}}$, then square all of the differences (to keep them positive) and add them up, you get a total of .0289. The number itself doesn’t mean anything, but it’s a useful way to compare one model to another.

The Corsi For/Corsi Against form returns an optimal x of 1.445, but the residual sum of squares ballooned to .24. That means the total squared error is more than eight times as great – you get a pretty good predictor, but with much more “noise.”

Right in the middle, the Fenwick form yields an optimal x of 1.877, with a residual sum of squares of .203. It’s a better predictor of wins and losses than the Corsi version, but it’s still not as good a predictor of wins as the simple Goals For/Goals Against form.
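All three fits are the same one-dimensional minimization with different inputs. The original work was done in R; what follows is only a Python sketch of the idea, with a golden-section search swapped in as the optimizer and the assumption that the error curve has a single minimum between 0.5 and 4. The function names and the (for, against) totals are hypothetical, and the winning percentages in the self-check are generated from a known exponent rather than taken from real standings.

```python
import math


def pyth(f, a, x):
    """Pythagorean expectation with exponent x (goals, Corsi, or Fenwick)."""
    return f ** x / (f ** x + a ** x)


def rss(x, teams):
    """Residual sum of squares; teams is a list of (for, against, win_pct)."""
    return sum((wp - pyth(f, a, x)) ** 2 for f, a, wp in teams)


def best_exponent(teams, lo=0.5, hi=4.0, tol=1e-6):
    """Golden-section search for the exponent that minimizes rss."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if rss(c, teams) < rss(d, teams):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    return (lo + hi) / 2


# Self-check on made-up totals (NOT real NHL data): win percentages are
# generated with exponent 2.11, so the search should recover about 2.11.
totals = [(250, 200), (220, 210), (200, 215), (180, 240)]
teams = [(f, a, pyth(f, a, 2.11)) for f, a in totals]
print(round(best_exponent(teams), 2))  # 2.11
```

Swapping shot attempts in for goals changes nothing in the code, which is what makes the residual sums of squares directly comparable across the three versions.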

Above, I’ve graphed each team’s winning percentage against the Pythagorean (Goals For/Goals Against form), as well as all three trendlines: note that the black Goals line and the red Fenwick line are extremely close, while the blue Corsi line is a bit higher up. Two conclusions can be drawn:

- The Fenwick line is a better predictor than the Corsi line, but the Corsi line appears to bias expected percentage upward. That is, it overestimates the impact of each shot more than goals and Fenwick do.
- Since the Fenwick line is a better predictor, that indicates that Corsi’s inclusion of blocked shots probably does just add noise. Blocked shots are, at least according to this model, of limited predictive value.

Through November 15, Corsi For % had a correlation of .11 with points and .125 with winning percentage; Fenwick For % had correlations of .17 and .19, respectively. Blocked shots had negative correlations in both cases.

Pythagorean luck is defined as the number of wins above expectation. Behind the jump are the numbers, through November 15, demonstrating which teams are lucky and which aren’t.

## What is the Corsi statistic? (And why is there a Fenwick number?)
*November 16, 2015*

*Posted by tomflesher in Hockey, Sports.*

Tags: Corsi, corsi statistic, evergreen, Fenwick, fenwick number, Hockey analytics, NHL

2 comments


Growing up in Buffalo, I was surrounded by hockey, whether it was watching the Sabres or heading to the rink to watch my brother play defense as a bantam or high schooler. During those years, my father, who could barely skate, often served as a volunteer coach for my brother’s teams. Like Malcolm Gladwell’s story of Vivek Ranadivé leading his “little blonde girls” to success using out-of-the-box basketball coaching, my father felt he was bringing an outsider’s perspective to the game by emphasizing a simple philosophy: own the puck.

This is easier said than done, of course, and when a group of squirts, peewees, or bantams head out onto the ice they need to apply some serious skill in order to “own the puck.” Overall, though, the point of owning that puck is to put it into the net. So, logically, the more a team controls the puck, the more likely it is to control the game.

It’s possible, of course, for a team to take many more shots and still lose, but the Corsi stat is meant to measure overall control. To do that, it counts all attempted shots: Corsi is defined as **Shots + Attempted Shots – Shots Against – Attempted Shots Against**. This gives you a simple differential in shot attempts.

You’ll also see the following stats:

- **Corsi For:** Shots + Attempted Shots by the team, making it possible to isolate whether a team is making too few shots or allowing too many
- **Corsi Against:** Shots + Attempted Shots by the opposing team
- **Corsi For Percentage (CF%):** 100 × Corsi For / (Corsi For + Corsi Against), giving a ratio rather than a simple differential. This measures what percentage of shots and shot attempts a team makes compared to its opponents. A CF% above 50% means a team attempts more shots than its opponent.
- **Corsi On:** A team’s Corsi while a particular player is on the ice scaled up to 60 minutes of ice time, effectively measuring whether the player’s Corsi is as good as, better than, or worse than the team’s as a whole. A Corsi ON greater than the team’s means the player contributes proportionally more to the team than ice time would indicate.
- **Corsi Relative (Corsi REL):** Corsi On – Corsi Off, showing whether a team performs better or worse with a player on the ice. If Corsi REL is positive, the team does a better job with the player on the ice.
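The definitions above translate almost directly into code. Here is a rough Python sketch in which "attempts" means shots plus attempted shots, as defined above; the function names and the sample counts are hypothetical, not real game data.

```python
def corsi(attempts_for: int, attempts_against: int) -> int:
    """Simple differential in shot attempts."""
    return attempts_for - attempts_against


def corsi_for_pct(attempts_for: int, attempts_against: int) -> float:
    """Share of all shot attempts taken by the team, as a percentage."""
    return 100 * attempts_for / (attempts_for + attempts_against)


def per_60(differential: float, minutes_on_ice: float) -> float:
    """Scale a differential to 60 minutes of ice time (used for Corsi On)."""
    return differential * 60 / minutes_on_ice


def corsi_rel(corsi_on: float, corsi_off: float) -> float:
    """Positive when the team does better with the player on the ice."""
    return corsi_on - corsi_off


# Hypothetical game: 55 attempts for, 45 against
print(corsi(55, 45))          # 10
print(corsi_for_pct(55, 45))  # 55.0
```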

Corsi was named after a Buffalo Sabres goaltending coach. Bob McKenzie of TSN shared the story of the Corsi number in 2014. Financial analyst Tim Barnes, writing under the pseudonym Vic Ferrari, heard Sabres GM Darcy Regier discussing shot attempts and save percentage as a goalie metric, but Ferrari didn’t care for the name “Regier Number” or “Ruff Number” (for Sabres coach Lindy Ruff). After browsing photos of the Sabres staff, Ferrari settled on Jim Corsi (above) as the eponym for the statistic. Interestingly, Corsi actually did come up with the idea and planted it in Regier’s head.

A similar stat, the **Fenwick**, simply excludes blocked shots, on the theory that blocking shots is a skill.

## A Pythagorean Exponent for the NHL
*March 17, 2015*

*Posted by tomflesher in Sports.*

Tags: hockey, luck, NHL, parameter identification, Pythagorean expectation, Sabres

4 comments


A Pythagorean expectation is a statistic used to estimate how many wins a team should expect, based on how many points they score and how many they allow. The name ‘Pythagorean’ comes from the Pythagorean theorem, which gives the length of a right triangle’s hypotenuse from the lengths of its two short sides; the name reflects the fact that early baseball-centric versions assumed that Runs^2/(Runs^2 + Runs Allowed^2) should equal the winning percentage, borrowing the exponent of 2 from the familiar Pythagorean theorem (a^2 + b^2 = c^2).

The optimal exponent turned out not to be 2 in just about any sport; in baseball, for example, the optimal exponent is around 1.82. This is found by setting up a function with a variable exponent; in the case of the National Hockey League, that formula would be $\frac{GF^x}{GF^x + GA^x}$. This is equivalent to $\frac{1}{1 + (GA/GF)^x}$. Set up an error function (the standard is squared error, because squaring is a way of turning all distances positive and penalizing bigger deviations more than smaller deviations) and minimize that function. In our case, that means we want to find the x that minimizes the sum of all teams’ $\left(W\% - \frac{GF^x}{GF^x + GA^x}\right)^2$. Using data from the 2014 season, the x that minimizes that sum of squared errors is 2.113475, which is close enough to 2.11 that the sum of squared errors barely changes.
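For anyone who wants the algebra spelled out, here is a sketch assuming the usual Pythagorean form, with $GF$ and $GA$ as a team’s goals for and goals against and $W\%_i$ as team $i$’s actual winning percentage (the symbols are my own shorthand). Dividing the top and bottom by $GF^x$ gives the ratio form:

$$
\frac{GF^x}{GF^x + GA^x} = \frac{GF^x / GF^x}{(GF^x + GA^x) / GF^x} = \frac{1}{1 + (GA/GF)^x}
$$

and the exponent is whatever value minimizes the total squared error across teams:

$$
\hat{x} = \arg\min_x \sum_i \left( W\%_i - \frac{GF_i^x}{GF_i^x + GA_i^x} \right)^2
$$

The ratio form makes it clear that only the ratio of goals allowed to goals scored matters, not the raw totals.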

Porting that exponent into the current season, there are a few surprises. First of all, the Anaheim Ducks have been lucky – almost six full wins worth of luck. It would hardly be surprising for them to tank the last few games of the season. Similarly, the Washington Capitals are on the precipice of the playoff race, but they’re over four games below their expected wins. With 11 games to go, there’s a good chance they can overtake the New York Islanders (who are 3.4 wins above expectation), and they’re likely to at least maintain their wild card status.

On the other end, somehow, the Buffalo Sabres are obscenely lucky. The worst team in the NHL today is actually 4 games better than its expectation. Full luck standings as of the end of March 16th are behind the cut.