Monday, February 10, 2014

Fantasy Basketball Manifesto Part IV - why MRiS Rankings are Better


Obviously, since I went through all the work to develop these rankings I probably have a reason. In part I, I explained how most websites (Yahoo, ESPN, and Basketball Monster, among others I am sure) do their rankings (Standard Scoring). In part II, I explained the idea of Rarity Rankings. Finally, in part III, I explained in detail my ranking system, which I am calling Modified Rarity Scoring, or MRiS.

The concepts are the same - try to normalize production across all categories. In Standard Scoring, this is done by using standard deviations away from the average, and in MRiS it is based on the rarity of production above a minimum production threshold. Since both systems attempt to compare production in all different categories, it is easy to compare them. 

Standard Scoring assigns each player a negative starting point by subtracting from their production the average production in each category. This means that if they have a zero score overall, they are producing the average amount overall. In a 100 player league, they should be ranked around 50th. However, since every single player gets the same "average production" subtracted, we can ignore it. Like I explained earlier in the Manifesto, if both teams got 30 extra points to start a game, the winner is still determined in the 48 minutes of actual play.

The normalized scoring unit is the standard deviation. A single "Z score" is assigned for every standard deviation of production. In Modified Rarity Scoring, Equivalent Fantasy Points are the normalized ranking units. So, by setting EFP equal to a Standard Scoring "Z score", we can compare the two systems and see which one makes more sense.
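To make the "we can ignore the average" point concrete, here is a minimal Python sketch. The four players and their stat lines are made up for illustration, and using the population standard deviation is an assumption; this is not how any particular site computes its rankings.

```python
from statistics import mean, pstdev

# Hypothetical per-game stat lines, for illustration only.
stats = {
    "A": {"pts": 20.0, "blk": 0.3},
    "B": {"pts": 12.0, "blk": 2.1},
    "C": {"pts": 16.0, "blk": 1.0},
    "D": {"pts": 9.0, "blk": 0.1},
}
categories = ["pts", "blk"]

# League average and spread for each category.
mu = {c: mean([p[c] for p in stats.values()]) for c in categories}
sigma = {c: pstdev([p[c] for p in stats.values()]) for c in categories}

# Standard Scoring: sum of z-scores across categories.
full_z = {
    name: sum((p[c] - mu[c]) / sigma[c] for c in categories)
    for name, p in stats.items()
}

# Same scoring without subtracting the average: every player's total
# shifts by the same constant, so the order cannot change.
without_mean = {
    name: sum(p[c] / sigma[c] for c in categories)
    for name, p in stats.items()
}

rank_full = sorted(stats, key=full_z.get, reverse=True)
rank_without_mean = sorted(stats, key=without_mean.get, reverse=True)
print(rank_full, rank_without_mean)
```

Dropping the average leaves an effective per-unit value of 1/sigma for each category, which is the kind of number that can be lined up against an MRiS coefficient.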

Before we get to the numbers, I'll make a pitch on purely theoretical grounds. This is the important argument - if we only agreed to accept results that we were hoping to see in science, not much progress would occur. Obviously, this is not a rigorous scientific endeavor (as much as I try!) but the idea is the same. In my mind, it does not matter how the statistics are distributed amongst players, only how many stats you can accumulate overall as a team. I see no logical reason that standard deviations should be added together and used to compare players. Fantasy basketball works by accumulating statistics, not players. Modified Rarity Scoring is based on the simple principle of equality of categories.

And now, on to the numbers...

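| Category | Standard Scoring | MRiS | Percent Difference (MRiS/SS) |
| --- | --- | --- | --- |
| Points | 1.57 | 1.57 | 0.0% |
| 3pt Made | 8.76 | 10.25 | 17.0% |
| Rebounds | 2.9 | 2.98 | 2.8% |
| Assists | 3.45 | 4.01 | 16.2% |
| Steals | 15.92 | 16.56 | 4.0% |
| Blocks | 12.79 | 18.56 | 45.1% |
| FGOP | 12.99 | 12.3 | -5.3% |
| FTOP | 20.28 | 31.3 | 54.3% |
| Turnovers | -9.4 | -10.13 | 7.8% |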

As you can see, I 'anchored' the scoring systems to points. It looks like most of the categories are more valuable in my system, but really this is demonstrating that Standard Scoring does not account for the high True Zero of points. There is actually a wider difference than Standard Scoring allows between players since we expect even the worst player to score a significant number of points. 
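For anyone who wants to check the last column, the percent difference is just the MRiS value divided by the Standard Scoring value, minus one. A short Python sketch using the numbers from the table (the function and variable names are only for illustration):

```python
# (Standard Scoring, MRiS) per-unit values copied from the table above.
TABLE = {
    "Points": (1.57, 1.57),
    "3pt Made": (8.76, 10.25),
    "Rebounds": (2.9, 2.98),
    "Assists": (3.45, 4.01),
    "Steals": (15.92, 16.56),
    "Blocks": (12.79, 18.56),
    "FGOP": (12.99, 12.3),
    "FTOP": (20.28, 31.3),
    "Turnovers": (-9.4, -10.13),
}


def percent_difference(standard, mris):
    """Percent by which the MRiS value differs from the Standard Scoring value."""
    return round((mris / standard - 1.0) * 100.0, 1)


differences = {
    category: percent_difference(standard, mris)
    for category, (standard, mris) in TABLE.items()
}

for category, diff in differences.items():
    print(f"{category:<10} {diff:+.1f}%")
```

Because both columns are anchored to the same points value, the points row comes out at zero and every other row shows how much more (or less) MRiS values one unit of that category.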

The big differences between the systems are 3PTM, Assists, Blocks, and FT%. These are all categories with high standard deviations - some players score a lot of these, other players score few. The Standard Scoring method would have you believe that blocks are less valuable because blocks are more scattered among players. Does it really matter if you win a category with 1 player scoring 15 blocks one week? 

Standard Scoring is flawed because it does not account for the minimum expected scoring of the worst player worthy of being picked up, and it mistakenly assumes that the more tightly-grouped players are in a category, the more valuable that category is. Most seasoned fantasy players know that it only takes a couple good producers to win blocks for you every week, and a lot of times those same players will guarantee that you lose Free Throw Percentage as well! These two effects are discounted in the Standard Scoring method.

Stay Tuned to StatDance.com for our rankings pages, soon to come! I will post the MRiS rankings in standard leagues with their EFP in each category, along with some Free Agent ideas and players to target if you are tanking certain categories. A lot to come!

Sunday, February 2, 2014

How to Accurately Rank Fantasy Basketball Players

In part I, I explained how fantasy players are usually ranked - with Standard Scoring. In part II, I introduced another way, Rarity Scoring. This, part III, is putting the finishing touches on Rarity Scoring by introducing what I call "True Zero" - finally giving us a quality means of ranking fantasy basketball players. In part IV, I will discuss the differences between Standard Scoring and Rarity Scoring (and hopefully show how much better my system is than the standard system).

True Zero


True Zero (t0) is the amount of a statistic that you expect as the baseline from any player who is good enough to be owned in your league. Pure rarity scoring assigns weights to each statistical category so that they are equal in value, since each category is just as valuable as another. Using True Zero values, we will be able to equally value production across all statistics above the minimum expected from owned players.

I know that this isn't a simple concept to grasp - at least with how well I described it - so here is an example to try to make it more obvious. Earlier in the Manifesto, to demonstrate pure Rarity Scoring, I grabbed the stats from the top 12 players and ranked a few of their stats. The weights ("Value" in the pictures) were for pure rarity scoring. Since points are by far the most common statistic, the other coefficients ("Values") are much higher, i.e. points is 1.00 and blocks is 29.23.



This example, and these coefficients, aren't applicable to actual leagues since it only accounts for twelve players and six categories. But the concepts are nearly identical. In this league, the worst player scores almost twenty points. If we assume that these are the only players you can play, in order to out-score your opponent, the worst player you can play will score twenty points. That makes a player who can score 30 points much more valuable!



The relative value ("Coefficient" is the term I will usually use to refer to these values that scale different categories so they are equally weighted) of each category is represented by the blue "Value" row. Since the worst player in this league (like most leagues!) records 0 blocks, the True Zero (t0) is actually zero. The result is that blocks are 6 times (the blue "Value" region) as rare as points scored. In the future, I will refer to the resulting score (Coefficient*Production) as Fantasy Points Equivalent (FPE).

In actual leagues, determining True Zero is much more complicated. We have to figure out what the worst player in each category would produce but still be good enough to be owned. To be clear, a player who scored the t0 value in every category would be a terrible fantasy player. For example, t0 would be the number of blocks we expect a lazy point guard to have or the number of points a pure defensive specialist will score. Using a smooth-line approach gives us what we expect to find on the waiver wires in each category, as a minimum.

True Zero by Category


It turns out that production in fantasy basketball is best represented with exponential decay. Using this knowledge, I smooth the lines and find what the worst player in each category should produce in that category. For categories like Blocks and Threes, these values are basically zero, but for categories like Points and Rebounds, there is significant expected value for everyone in the league.
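One way to do this kind of smoothing is sketched below in Python. This is only a guess at the mechanics, not the spreadsheet method itself: it assumes production follows y = a * exp(-b * rank), fits that curve by least squares on the logarithm of production, and reads t0 off the fitted curve at the last owned rank. The function names and the synthetic data are made up, and a category where the bottom players record exactly zero (blocks, threes) cannot go through a log fit without extra handling.

```python
import math


def fit_exponential_decay(production):
    """Fit y = a * exp(-b * rank) by least squares on log(y).

    `production` must contain only positive values; it is sorted from
    best to worst before ranks 1..n are assigned.
    """
    ordered = sorted(production, reverse=True)
    n = len(ordered)
    ranks = range(1, n + 1)
    logs = [math.log(value) for value in ordered]

    mean_rank = sum(ranks) / n
    mean_log = sum(logs) / n
    slope = (
        sum((r - mean_rank) * (l - mean_log) for r, l in zip(ranks, logs))
        / sum((r - mean_rank) ** 2 for r in ranks)
    )
    intercept = mean_log - slope * mean_rank
    return math.exp(intercept), -slope  # (a, b)


def estimate_t0(production):
    """Value of the smoothed curve at the last rank in the population."""
    a, b = fit_exponential_decay(production)
    return a * math.exp(-b * len(production))


# Synthetic data that follows the assumed curve exactly (a=25, b=0.1).
synthetic = [25.0 * math.exp(-0.1 * rank) for rank in range(1, 11)]
a, b = fit_exponential_decay(synthetic)
t0 = estimate_t0(synthetic)
print(a, b, t0)
```

Reading the value off the smoothed curve rather than taking the actual worst player keeps one unusually bad (or good) player at the bottom of the list from setting the baseline.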

These plots are FPE for the best 156 players ranked using all of the categories except turnovers. Unfortunately, the way my spreadsheet is set up, plotting FPE is much easier than plotting the actual stats, and I'd have to tinker with all of my numbers to get these screenshots to reflect production instead of the equivalent FPE. The important thing to notice is how the smoothed lines match up with production (or don't!) and the general shape of the plots. OK, maybe it's not important, but I thought it was interesting to see the shapes of the stats.

To calculate the weight of each category, I add up all the production from the top players (based on league size; if you have 10 teams of 13 players, your population is 130), then subtract (number of players in the league)*(minimum expected production), or Population*t0. Then I assign coefficients for each category so they are weighted equally above true zero.
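The calculation described above can be sketched in a few lines of Python. The totals and t0 values here are made-up numbers for a two-category toy league, and the function names are hypothetical; this version anchors points at 1.00 to keep the arithmetic easy to follow.

```python
def above_true_zero(totals, t0, population):
    """Total production in each category, minus Population * t0."""
    return {cat: totals[cat] - population * t0[cat] for cat in totals}


def coefficients_anchored_to_points(totals, t0, population, points_key="PTS"):
    """Weights that make every category's above-t0 total equal to points'."""
    above = above_true_zero(totals, t0, population)
    return {cat: above[points_key] / above[cat] for cat in above}


# Hypothetical league: 10 teams of 13 players, summed per-game averages.
population = 130
totals = {"PTS": 1800.0, "BLK": 80.0}
t0 = {"PTS": 8.0, "BLK": 0.1}

above = above_true_zero(totals, t0, population)
coefficients = coefficients_anchored_to_points(totals, t0, population)
print(above)
print(coefficients)
```

In this toy league, 1040 of the 1800 points are "free" (130 players times a t0 of 8), so only the remaining 760 count toward rarity. That is why a high True Zero for points pushes every other coefficient up relative to pure rarity scoring.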

The t0 is only used to develop the coefficients for each category, not in ranking players. This means I don't subtract the t0 value from production individually when I rank players - it would have no effect. If you took that value away from everyone, it would be like giving every NBA team 30 points to start the game. It changes the total score at the end, but the winner is still the person who scored the most during the game. All players start at zero and get the same credit for each point, steal, block, FGOP, or assist as every other player.

I also do not find a t0 for the percentage categories like Field Goal Percentage and Free Throw Percentage, since FGOP and FTOP true zeros are actually 0, which is what an empty spot on your roster scores and the average shooters score.

Finishing the System


When developing these numbers, the coefficients for the traditional counting categories stay pretty constant (points, rebounds, steals, assists, etc) no matter what time period you analyze over, but the percentages vary wildly. After some panic, I realized this is due to it being common for players to get in shooting streaks, so high values of FGOP and FTOP happen in short time periods compared to points scored. However, these values tend to flatten out in longer time periods. This means high and low shooting percentages are more rare if you look at season-long stats, and therefore FGOP and FTOP are "worth more" compared to points scored. For roto leagues that compare statistics over the course of the entire season, we should use the larger coefficients, but for more common weekly leagues, we should use the smaller coefficients.

It would simplify the numbers to use a flat 1.00 for points scored. I have decided, instead, to normalize the numbers to have a total of 10 FPE above the t0 value for points. So in my system, the average player would get 10 FPE in every category above the true zero. This makes total scores more consistent across different leagues, but is wholly unnecessary for analysis - using 1.00 would work identically; to convert my numbers listed below simply divide all the coefficients by the points coefficient. The end result is that the average score, for each category and no matter what your league size or settings are, is 10 FPE above t0.
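As a rough Python sketch of the two scalings: the first function sets each coefficient so the average owned player gets 10 FPE above t0 in that category, and the second divides through by the points coefficient to get back to the "points = 1.00" form. The toy league totals are made up and the function names are hypothetical; the two coefficients in `published` are the Points and Blocks values from the table later in this post.

```python
def ten_fpe_coefficients(totals, t0, population, target=10.0):
    """Coefficient per category so the average above-t0 score equals `target`."""
    return {
        cat: target * population / (totals[cat] - population * t0[cat])
        for cat in totals
    }


def rescale_to_points(coefficients, points_key="PTS"):
    """Divide every coefficient by the points coefficient (points becomes 1.00)."""
    return {cat: value / coefficients[points_key] for cat, value in coefficients.items()}


# Hypothetical 10-player toy league, for illustration only.
population = 10
totals = {"PTS": 150.0, "BLK": 8.0}
t0 = {"PTS": 8.0, "BLK": 0.1}
toy = ten_fpe_coefficients(totals, t0, population)

# Average FPE above t0 per player, which should come out to 10 in every category.
average_above_t0 = {
    cat: toy[cat] * (totals[cat] - population * t0[cat]) / population
    for cat in totals
}

# Two published coefficients converted to the points = 1.00 scale.
published = {"PTS": 1.57, "BLKS": 18.56}
relative = rescale_to_points(published)
print(toy, average_above_t0, relative)
```

Either scaling ranks players identically, since every coefficient is multiplied by the same constant.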

So, How Do I Use This?


Now, for the numbers! The numbers below are for a league size of 156 (12 teams of 13 players) and 8 categories - Points, 3PM, FG%, FT%, Rebounds, Assists, Steals, and Blocks. I then computed the value of a Turnover, but the players were not ranked using turnovers. The process for creating these is the same as I went over in Part II, but using the t0 values.

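| Category | Stat | Coefficient | True Zero |
| --- | --- | --- | --- |
| Field Goal Percentage | FGOP | 12.30 | 0 |
| Free Throw Percentage | FTOP | 31.30 | 0 |
| Three Pointers Made | 3PTM | 10.25 | 0.10 |
| Points Scored | PTS | 1.57 | 8.17 |
| Total Rebounds | REB | 2.98 | 2.17 |
| Assists | ASTS | 4.01 | 0.83 |
| Steals | STLS | 16.56 | 0.44 |
| Blocks | BLKS | 18.56 | 0.08 |
| Turnovers | TO | -10.13 | 0.95 |

Some notes on these numbers: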

1. The t0 values listed above are actually stats, not Fantasy Point Equivalents.

2. FGOP over the season has a coefficient of 23.1, and FTOP has 51.0. The numbers I used here are over the past 7 days, which we are assuming are average values (these are a little low, but not very far off).

3. An FTOP of over 31 does seem really high, but imagine how hard it is to have an entire free throw made over average percentage (78.8% so far in this example league) per game. You would have to average 5/5 from the line, or 9/10, for a full FTOP. It is just as helpful to your fantasy team to have a player score 31.3 FPE in FTOP or in Points - which means going 9/10 from the line or scoring 20 points (31.3/1.57=20). That sounds about right to me.

4. Note that some plays help you in multiple categories - making shots (free throws or regular) helps you in points and percentages. Some leagues have FTA or OREB as categories, which help in multiple categories as well.

5. I use Yahoo's in-game average stats, so .245 blocks is calculated the same as .155 blocks; both show up as .2. These errors will average out almost all of the time, so I haven't tweaked my spreadsheet to fix this.

6. Due to normalizing for 10 FPE over t0, the total amount of FPE in every category will equal (10+t0*coeff)*Players_Owned.

7. Blocks are much more rare than steals, but since they are so much more spread out, they are nearly equal in true rarity. It is approximately the same to average 1 block as it is to average 1.5 steals, for fantasy valuation purposes.
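The arithmetic behind note 3 is easy to check. This Python sketch assumes FTOP means free throws made minus the number an average shooter would make on the same attempts, which is how note 3 describes it; the function name is made up for illustration.

```python
LEAGUE_FT_PCT = 0.788    # average free throw percentage quoted in note 3
FTOP_COEFFICIENT = 31.3  # from the coefficient table
PTS_COEFFICIENT = 1.57   # from the coefficient table


def ftop(made, attempts, league_pct=LEAGUE_FT_PCT):
    """Free throws made above what an average shooter makes on the same attempts."""
    return made - attempts * league_pct


perfect_five = ftop(5, 5)    # 5/5 from the line
nine_of_ten = ftop(9, 10)    # 9/10 from the line

# Points needed to match the FPE of one full FTOP.
points_equivalent = FTOP_COEFFICIENT / PTS_COEFFICIENT

print(f"5/5  -> {perfect_five:.2f} FTOP")
print(f"9/10 -> {nine_of_ten:.2f} FTOP")
print(f"1 FTOP is worth about {points_equivalent:.1f} points")
```

Both shooting lines land just over one full FTOP, and one FTOP works out to roughly 20 points, matching the comparison in note 3.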

Using These Results


To close, I've created a WolframAlpha widget to calculate player values using these coefficients. Disclaimers: leagues of different sizes or with different settings would need different numbers to be perfectly accurate. For leagues not using Turnovers, enter 0. For leagues using categories not listed here, or different sizes - stay tuned, I will be posting results for all the categories I've heard of eventually!
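If you would rather do the calculation yourself, the same idea fits in a few lines of Python. This is a sketch rather than the widget's actual code: the coefficients are the ones from the table above, while the function name and the sample stat line are made up for illustration.

```python
# Coefficients from the table above (12 teams of 13 players, 8 categories).
COEFFICIENTS = {
    "FGOP": 12.30,
    "FTOP": 31.30,
    "3PTM": 10.25,
    "PTS": 1.57,
    "REB": 2.98,
    "ASTS": 4.01,
    "STLS": 16.56,
    "BLKS": 18.56,
    "TO": -10.13,
}


def fantasy_point_equivalents(stat_line, use_turnovers=True):
    """Return the FPE contributed by each category (Coefficient * Production)."""
    breakdown = {}
    for category, coefficient in COEFFICIENTS.items():
        if category == "TO" and not use_turnovers:
            continue  # same effect as entering 0 turnovers
        breakdown[category] = coefficient * stat_line.get(category, 0.0)
    return breakdown


# Hypothetical per-game averages, not a real player.
sample = {
    "PTS": 20.0, "REB": 5.0, "ASTS": 4.0, "STLS": 1.0, "BLKS": 0.5,
    "3PTM": 2.0, "FGOP": 0.5, "FTOP": 0.3, "TO": 2.0,
}

with_turnovers = fantasy_point_equivalents(sample)
without_turnovers = fantasy_point_equivalents(sample, use_turnovers=False)
total_with = sum(with_turnovers.values())
total_without = sum(without_turnovers.values())
print(f"Total FPE with turnovers:    {total_with:.2f}")
print(f"Total FPE without turnovers: {total_without:.2f}")
```

Keeping the per-category breakdown, instead of only the total, makes it easy to see where a player's value comes from, which helps if you are tanking a category.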