A Look Back: The First 2 Weeks of the NHL Season

Now that most teams have played 4-5 games, I wanted to take a look back at the Sim's track record so far this season. I want to be completely transparent about how well the Sim is doing, and I finally have an outlet to get that information to our listeners efficiently.

Coming into this season, I was incredibly anxious to start seeing how my new models would work, given the moderate success last year of the Sim and getting my foot in the door of the fantasy hockey prediction world. Most of all, I was curious to see how the three-pronged approach (frequency, payoff, and leverage) to prediction would compare to the traditional "point-projection" models that offer no insight into upside/downside.

As a comparison in this column, I will be referring to a points-based model (PBM) as the benchmark for precision. I built the PBM using simple multivariate regression on the same data I used to build Sim v2. 

Let's look at the first part of the "projection triad" that I'm using: Frequency.


Frequency (Freq) is the percentage of the time Player X scores more than his salary-based expectation. In a PBM, the projection is the most likely number of points he will score, but this doesn't say anything about how often that will happen. To keep scoring consistent, each PBM projection was labeled either a 0 or a 1 based on whether the projection was greater than the player's salary-based expectation (just like in Freq).
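To make the scoring concrete, here is a minimal sketch of how a Freq value and a 0/1 PBM label could be computed. The function names, the simulated scores, and the salary-based expectation of 3.0 are all made up for illustration; this is not the Sim's actual code.

```python
def sim_frequency(simulated_scores, expectation):
    """Share of simulated games in which the player beats his salary-based expectation."""
    hits = sum(1 for score in simulated_scores if score > expectation)
    return hits / len(simulated_scores)


def pbm_label(projection, expectation):
    """Collapse a single point projection to 1 (beats expectation) or 0 (does not)."""
    return 1 if projection > expectation else 0


# Hypothetical numbers, not real Sim output
scores = [0.5, 1.0, 6.5, 2.0, 0.0, 4.5, 1.5, 3.5, 0.5, 2.5]
expectation = 3.0

freq = sim_frequency(scores, expectation)  # 3 of the 10 scores beat 3.0
label = pbm_label(2.2, expectation)        # a 2.2 projection falls short of 3.0
print(freq, label)
```

The point of the sketch: the PBM label can only ever say "yes" or "no," while Freq says how often.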


Accuracy is reported as RMSE, i.e. the lower the value, the closer the prediction came to the actual outcome. The effect of using a parameter that specifically measures floor is evident here: the Sim performed 42% better at predicting floor. It should be noted that the average Freq of the Sim was 0.32 and the average Freq of the PBM was 0.29. This tells me that the PBM is applying a significant handicap to all players to account for the fact that they don't meet expectations every game. In other words: PBM projections are projections of the "average" score, i.e. the payoff. So it stands to reason that the PBM will perform better in the next category.
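For anyone who wants to check the RMSE math on their own numbers, here is a small sketch. I am assuming the actual outcome is coded as 1 when a player beat his salary-based expectation that night and 0 when he didn't; the four-game sample below is invented and is not meant to reproduce the results above.

```python
import math


def rmse(predictions, actuals):
    """Root mean squared error: lower means predictions sat closer to the outcomes."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical four-night sample (assumption: 1 = beat expectation, 0 = did not)
actual = [0, 1, 0, 0]
sim_freq = [0.25, 0.5, 0.25, 0.5]  # Freq-style probabilities
pbm_labels = [0, 0, 0, 1]          # PBM projections collapsed to 0/1

print("Sim RMSE:", rmse(sim_freq, actual))
print("PBM RMSE:", rmse(pbm_labels, actual))
```

A 0/1 label is either dead on or off by a full point, while a probability is never off by more than its distance from 0 or 1, which is one reason a frequency-style number can score well here.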


The payoff in the Sim is the product of magnitude and frequency. In a PBM, the payoff is directly related to the projection. As noted above, the average projection is lower than the average score above expectation because it must account for the ~70% of the time a player doesn't meet his salary-based expectation. Without further ado, here is the comparison of RMSE between projected payoff and actual DK points, night to night.


This is about what I would expect. The Sim and a PBM are well-matched at predicting the overall payoff of playing any given player. It is important to note the distinct edge the PBM had in predicting high-priced players. This is something to watch out for in the future, but it also suggests that the Sim is better at targeting lower-priced guys. Overall, the Sim performed 4% worse at predicting total payoff.


Review: The Sim's Payoff is comparable to a regular model that only projects points on a game-by-game basis. Because of this, it is perfectly acceptable to look solely at the Payoff category and know that it is the equivalent of any model you see in the marketplace right now. The Own% and Frequency stats are what should set the Sim apart, and the numbers back that up. While any model is right "on average," the Sim's inclusion of the likelihood of a player exceeding his salary-based expectation improves the game-by-game accuracy and precision.

Brent Burns:  Freq - 34%  /  Payoff - 3.4 Pts  /  DK ppg = 3.8

Shea Weber:  Freq - 40%  /  Payoff - 3.3 Pts  /  DK ppg = 3.4

Both players would appear on a traditional projection sheet at ~3.4 pts every night, but the projected frequency is 6 percentage points higher for Weber. So Weber will score less, more often, to get to the SAME average ppg over time! This is the perfect example of the shortcomings of traditional models: they're right on average, but not every night.
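Since Payoff is magnitude times frequency, you can back out the magnitude each player's line implies. A quick sketch using the Burns and Weber numbers above; the function name is mine, and reading magnitude as roughly "points on the nights he hits" is my shorthand for this example.

```python
def implied_magnitude(payoff, freq):
    """Back out magnitude from Payoff = magnitude * Freq."""
    if freq <= 0:
        raise ValueError("freq must be positive")
    return payoff / freq


# (Payoff, Freq) taken from the two player lines above
players = {
    "Brent Burns": (3.4, 0.34),
    "Shea Weber": (3.3, 0.40),
}

for name, (payoff, freq) in players.items():
    magnitude = implied_magnitude(payoff, freq)
    print(f"{name}: implied magnitude {magnitude:.2f} pts at Freq {freq:.0%}")
```

That works out to about 10.0 for Burns and about 8.25 for Weber: nearly the same Payoff, reached by a bigger number less often versus a smaller number more often.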


Coming soon: 

Line Stacks - As soon as we find a new way to get tables onto the website from my simulations, I will be adding a page for line stacks along with combined line ownership.

PICL - The mysterious rating system I include in my ramblings on the pod as well as the site. I will be detailing its use in an upcoming post. Because it is more related to fit in cash lineups, I want to have a good dataset before standing behind PICL 100%.

xDKpts - Ever wished there was a stat that reflected when players get unlucky? I have been working on an expected DK pts stat that will capture expected production, not just what actually happened. Finally, a guy with 9 shots, 4 minutes of PP time, and 21 Corsi who didn't score will get the respect he deserves! Also, my hope is that this stat will point out guys who are undervalued.