Wednesday, August 8, 2012

NBA Draft Analysis IV: Ranking the GMs

This is the fourth installment of the StatDance.com NBA Draft Analysis series. In Part I, I established the basis of how the process works - how to value players and establish an Expected Value based on their draft position. In Part II, I ranked the last ten draft classes by strength. And Part III was ranking the best-drafting teams over the last ten drafts.

Now, we get to the decision-makers and analyze the General Managers themselves. This is actually why I started the NBA draft analysis project in the first place. I looked back at who was the GM for every team at every draft over the last ten years (the 2002-2011 drafts) and assigned each of them credit (or blame) for their team's draft that year. I do realize that every GM has help and/or orders on draft night - the team owner or another executive might force some picks, and the GM has no choice but to take responsibility for them.

For example, Michael Jordan has been widely ridiculed for running the Bobcats poorly, but he has never actually been the GM in Charlotte. Therefore, he is not eligible for the rankings, despite most likely having a lot of influence over who gets drafted. Another example is Pat Riley, who was hired as the head coach and team president of the Heat in 1995, but has only held the General Manager title since the 2009 draft. While I assume that Riley has been calling the shots since he got there, if I make a judgment call for one team, I have to look into every team - and it still wouldn't be fair. Even Pat Riley has a boss, and it could have been the owner's decision to draft Wade in 2003. The point is, there is no fair way to look at this unless we use a black-and-white system based on who held the GM title.

I highly recommend looking back at Part I and Part III for further information about how these numbers were generated if you are interested. Now, on to the rankings!

The Top 10 Drafting Tenures as GM (2002-2011)

The formula I came up with to rank the top GM tenures is overly complicated, but it basically values drafting well over a large sample size; to qualify, a GM must have at least 5 draft picks (a rough sketch of the idea is below). This is a ranking I plan on revisiting in the future with a wider historical window, since ten years doesn't really do it justice, especially when you consider how many of these players can still change their evaluation.
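The actual formula isn't reproduced here, but a minimal sketch of the kind of ranking described (surplus value over Expected Value, rewarded for a larger sample, with a five-pick minimum) might look something like this - the sample-size weighting is purely illustrative:

```python
# Hypothetical sketch only - not the formula actually used for the table below.
def rank_gm_tenures(tenures, min_picks=5):
    """tenures: list of dicts like {'gm': name, 'picks': [(actual_value, expected_value), ...]}."""
    qualified = [t for t in tenures if len(t["picks"]) >= min_picks]

    def score(tenure):
        surplus = sum(actual - expected for actual, expected in tenure["picks"])
        # Reward a larger body of work: scale the surplus by the square root of the pick count.
        return surplus * len(tenure["picks"]) ** 0.5

    return sorted(qualified, key=score, reverse=True)
```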


The Worst 10 Drafting Tenures as GM (2002-2011)

This is a much simpler ranking - total raw value short of Expected Value. I did not include Jerry Krause, since Jay Williams' motorcycle accident is the only reason he would have made the list. This list favors failed GMs from early in the 2002-2011 window; the more recent GMs could argue their picks might still turn out, and I tend to agree.


Ranking the Current GMs from 1 to 30

The simplest rating so far - just ranking by the percentage of Expected Value they have drafted as GM from 2002-2011. Some of these executives have had multiple tenures during this period, but they have all been combined to give an overall drafting performance over the last decade.


Before Pat Riley emails me and complains that he doesn't deserve to be at the bottom of this list, let me just say that it's not really fair to say Riley has drafted worse than Kahn and Pritchard. They have much larger raw deficits; this is just a simple percentage-based ranking.

Some other interesting notes:
  • Kevin Pritchard has a new job for the Pacers after he showed up in the top 10 worst tenures in the last decade ranking. He made the mistake in Portland a lot of people would have made, drafting Oden over Durantula. He was not far below average other than that pick.
  • Michael Jordan was only GM for one year that I looked at in Washington (2002) - he got 70% of his expected value with four picks, two in each round. 
  • 17% of the league's current GMs were not in charge of a team from 2002-2011.
  • Donnie Nelson looks like a poor drafter in this analysis, which might be true - but he has been in charge of the Mavericks for 10 years and has had only 1 pick in the top 24 in that time (Devin Harris at 5th, who has outperformed his EV). During that tenure, Dallas is second in the NBA in winning percentage. That's hard to do even if you can find talent late in the draft - which, by these numbers, he can't.
Thanks, as always, to Basketball-Reference.com for their data (both player stats and executive listings) - this would have been even more time-intensive without that resource.


This was Part IV of the StatDance.com NBA draft analysis.
Part I: Determining the expected value of a draft pick
Part II: Ranking the Strongest NBA Drafts
Part III: The best and worst drafting teams
Part V: Who did they miss? Looking at the undrafted free agents in the NBA - Coming Soon


Monday, August 6, 2012

NBA Draft Analysis Part III: Top Drafting Teams of the Last Decade

This is part III of the StatDance.com NBA Draft Analysis series. In part I, I went over how to fairly evaluate a draft pick. Basically, the contribution of each pick is measured using Player Efficiency Rating and the number of minutes played every year. The first eight years of a career are weighted and then measured against the Expected Value of that draft pick. The Expected Value is a smooth-line historical average, based on draft position and years since being drafted. So, if a player performs better than the average player drafted at his spot, he gets a positive evaluation, and vice versa. In part II, I looked at which of the last ten drafts (2002-2011) were the strongest and the weakest.

While it is true that there are other ways to build your team from year to year, the draft is the only organic way to acquire talent. Free agency is great - you get a known commodity (usually!) - but it is dangerous to count on, since you never know for sure which players you will be able to sign. In order to field a strong team, you need to acquire talent through the draft; even if you trade those players away to land other assets, it still took an astute evaluation of talent to get the players you need.

How are championship teams actually put together? Let's look at the last six NBA champions and see how they built their teams. We will look at the top 3 or 4 players in PER*MP for each team.

2012 - Miami Heat
LeBron James - 71408 - Free Agency
Chris Bosh - 37932 - Free Agency
Dwyane Wade - 42737 - Drafted

2011 - Dallas Mavericks
Dirk Nowitzki - 58594 - Drafted
Tyson Chandler - 37886 - Trade
Shawn Marion - 38301 - Trade
Jason Terry - 40768 - Trade

2010 - Los Angeles Lakers
Andrew Bynum - 39935 - Drafted
Kobe Bryant - 62086 - Drafted
Pau Gasol - 55029 - Trade

2009 - Los Angeles Lakers
Kobe Bryant - 72224 - Drafted
Pau Gasol - 66578 - Trade
Lamar Odom - 38445 - Trade

2008 - Boston Celtics
Kevin Garnett - 58898 - Trade
Paul Pierce - 56330 - Drafted
Ray Allen - 43033 - Trade

2007 - San Antonio Spurs
Tim Duncan - 71149 - Drafted
Manu Ginobili - 49646 - Drafted
Tony Parker - 53479 - Drafted

Only 8 of the 17 players that were the major contributors to a title had been drafted by the team they took to a championship, but only 2 of the 17 were actually signed in free agency. This just shows that (at least over the last six years) it is vitally important to acquire assets in the draft so you can play them, or at least trade them for the players you want.

In today's NBA, free agency greatly favors the stronger teams - great players want to win. And in order to trade for the players you need to complete your team, you need to have assets - that's where drafting wins you championships. The Heat don't land LeBron without drafting Wade. Every championship team is built by acquiring talent, and the biggest part of that happens on draft night.

I've analyzed the last ten drafts, 2002-2011 (2012 would be useless to analyze, since that class hasn't played yet). I compared the results for each team with their winning percentage over the last ten years. For each team, I compared the value they got out of their drafts with the Expected Value of the picks they got "credit" for. To get "credit" for a draft pick, the team must either draft a player with their own pick and keep him, or acquire a player's draft rights near draft time (usually on draft night, but occasionally afterwards).
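As a rough sketch of that bookkeeping (not the actual spreadsheet), once every credited pick has an actual and an expected PER*MP value, the team-level figure is just a ratio of sums:

```python
def team_draft_percentage(credited_picks):
    """credited_picks: list of (actual_value, expected_value) pairs for the picks a team gets credit for."""
    total_actual = sum(actual for actual, _ in credited_picks)
    total_expected = sum(expected for _, expected in credited_picks)
    return 100.0 * total_actual / total_expected

# Illustrative numbers only, not any team's real totals:
print(round(team_draft_percentage([(45000, 30000), (12000, 9000)]), 1))  # 146.2
```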

Over the ten-year span, I have winning percentages for each team, ranging from .358 (Charlotte) to .706 (San Antonio). Then I have the draft successes. Of the 30 teams, 15 have gotten over 100% of their Expected Value, and 15 have gotten less. The most successful drafting team (Boston) has gotten 149% of its Expected Value. The least successful, the Clippers, has gotten only 68.5% of their Expected Value.

Can You Win Without Good Drafting?

While everyone knows that drafting better players correlates with winning basketball games, it's nice to know that the system works, so let's analyze the results. Only 4 teams have a winning record over the last 10 years without also getting over 100% of their Expected Value on draft night: Dallas, Denver, Phoenix, and Houston. (San Antonio was originally on this list because Luis Scola was mistakenly credited to Houston; once I switched him, the Rockets dropped below 100% and the Spurs went over.) The Mavs (Dirk) and the Spurs (Duncan/Parker/Manu) drafted huge stars before the first year of my analysis; if I expanded the window another five years, the Spurs would be one of the strongest drafting teams. The Mavericks are only at 81%, and even if Nowitzki were included they might still be close to that mark. The rest of that team was assembled with pieces they got at a discount in trades.

Denver has only had two bad drafts in the last decade: 2002 (picking Nikoloz Tskitishvili 5th) and 2005 (picking Julius Hodge 20th). Since then, they have consistently had good drafts or gotten rid of all their picks (you can't mess up a draft if you trade the picks for proven commodities). They also turned Carmelo Anthony into a lot of assets. Still, definitely an exception with overall poor draft performance and a .551 winning percentage.

Phoenix is an even better example of poor drafting and yet still winning, having had only two good drafts in the last decade. Their Nash acquisition really propelled them a long way, with a .595 winning percentage and one great pick (Amare).

Houston has drafted just below 100% of their Expected Value and yet has a winning percentage of .562. They have been good enough since they drafted Yao Ming in 2002 to avoid any high draft picks. While they are at 98%, they are only 13966 points away from 100% - the closest of any team to breaking even.

So, to recap - only 4 teams posted winning records without good draft results, and one of them was very close to breaking even.

Can You Lose While Drafting Well?

There are ways to win in the NBA without drafting well - so of course there are other ways to lose, too. It only takes a few really horrible free agent signings to completely tank a franchise. Fortunately for my analysis, it seems the teams that draft well are more likely to run their team well - there are only 5 teams that have a positive draft record and have managed to post a losing record: the Wizards, Knicks, Hornets, Kings, and 76ers.

The Wizards are barely positive with the draft at 102% of Expected Value, but they suffered from Michael Jordan's mismanagement from 2000-2003 and then the Gilbert Arenas era (four productive years for the Wizards and two contracts signed worth a total of $170 Million).

The Knicks are the classic case of great drafting and horrible management. To quote Bill Simmons (pretending to quote Isiah) "If you look at what I've done over the years, I always drafted well: Stoudamire, T-Mac, Camby, Frye, Ariza … you want to stockpile as many assets as possible, only because it gives you more options to do something dumb." What more can I say?

The Hornets (winning percentage: .488) have gotten an impressive 131.8% of their Expected Value over the last ten years. Their winning percentage over the last five years (after they moved back to New Orleans) is .530 - they had some bad years when they were in Oklahoma City.

Sacramento has a winning percentage of only .456 over the last ten years, even though they were still good early on - they drafted in the second round or late in the first round for the first five years of our analysis (2002-2006). Since then, their first picks have been 10, 12, 4, 5, and 10. Their drafting record is stellar: only 1 year of the past 10 came in under 100% of Expected Value (2006, when they picked Quincy Douby 19th with their only pick). Their winning percentage over the last 5 years is an atrocious .320, even worse than the 10-year mark. Either the Kings are about to start winning titles, or they are one of the worst-managed teams in the history of the league.

Philadelphia is the best example of good drafting and a bad record - boasting a ridiculous 147.3% value from drafting while posting a .474 winning percentage. The 76ers are a study of mediocrity - always playing well enough to avoid drafting too high (Iguodala 9th in 2004 being their only top 10 selection), but never having the talent to really start winning.

Ranking The Best Drafting Teams

So, to summarize the last two sections: 4 teams drafted poorly and still posted winning records, and 5 teams drafted well and had losing records. That leaves 21 teams that either posted winning records with positive draft results, or posted losing records with negative draft results (under 100% of Expected Value).

So we are left with the results! Here are the teams that have gotten the best value for their picks from 2002-2011. Of course, these rankings could all change a lot since the majority of the players are still playing, but this is how it stands today.

  1. Boston Celtics 148.9% 
  2. Philadelphia 76ers 147.2% 
  3. Sacramento Kings 138.0% 
  4. New Orleans Hornets 131.8% 
  5. Cleveland Cavaliers 131.5% 
  6. Miami Heat 128.1% 
  7. Los Angeles Lakers 117.8% 
  8. New York Knicks 116.2% 
  9. Indiana Pacers 113.6% 
  10. Utah Jazz 112.8% 
  11. Detroit Pistons 111.6% 
  12. Orlando Magic 105.5% 
  13. San Antonio Spurs 104.4%
  14. Washington Wizards 102.2% 
  15. Chicago Bulls 101.4% 
  16. Houston Rockets 98.32% 
  17. Milwaukee Bucks 97.37% 
  18. Atlanta Hawks 93.61% 
  19. Oklahoma City Thunder 91.45% 
  20. Memphis Grizzlies 90.72% 
  21. Phoenix Suns 89.28% 
  22. Charlotte Bobcats 88.19% 
  23. Denver Nuggets 82.86% 
  24. Toronto Raptors 82.82% 
  25. Brooklyn Nets 80.83% 
  26. Dallas Mavericks 80.55% 
  27. Portland Trail Blazers 79.77% 
  28. Golden State Warriors 77.49% 
  29. Minnesota Timberwolves 72.65% 
  30. Los Angeles Clippers 68.46% 

And here is each team, with every pick they get credit for over the last ten years. 

(note: Houston no longer has credit for Luis Scola)
(note: San Antonio now has credit for Luis Scola)

This was Part III of the StatDance.com NBA draft analysis.
Part I: Determining the expected value of a draft pick
Part II: Ranking the Strongest NBA Drafts
Part IV: We evaluate every NBA GM since 2002 - Coming Soon
Part V: Who did they miss? Looking at the undrafted free agents in the NBA - Coming Soon

Friday, August 3, 2012

NBA Draft Analysis Part II: Ranking the Draft Classes

In Part I of the NBA Draft Analysis series, I went through the methodology for determining a player's worth and listed some of the best value picks of all time. In this second installment (of five) I'll go through each of the last ten drafts (2002-2011), look at some of the best and worst picks from each, and then rank the drafts in order of overall strength.

To briefly recap the value system: for every pick in the last ten drafts, we averaged performance using an eight-year weighted average and determined the Expected Value of each pick. If you haven't read Part I yet, it has a pretty detailed explanation of the system.

You can view a gallery of the drafts (from Part I) directly on imgur from here. The overall rankings are based on the total production of all players drafted divided by the total expected value.

2002
Overall: 83.16%

Best Picks: Yao Ming (1), Amare Stoudemire (9), Carlos Boozer (34)
Worst Picks: Nikoloz Tskitishvili (5), Dajuan Wagner (6)

Only 11 players performed over 125% of their expected value. This draft was pretty weak across the board with the top 10, 11-30, and 31-57 slices under-performing.

2003
Overall: 120.21%

Best Picks: LeBron (1), Carmelo Anthony (3), Chris Bosh (4), Dwyane Wade (5)
Worst Picks: Darko Milicic (2), Mike Sweetney (9)

There were lots of standout picks in this draft, but with the four greats I have listed, it doesn't seem fair to list the others. David West at 18 performed at the Expected Value of a #2 pick, Josh Howard (29) at a #3 pick, and Mo Williams (47) close to #4 value.

Interestingly, this draft was not exceptionally deep. The overall 120% value is mostly due to the players I've already mentioned. Over a third of the picks (23/58) gave less than 25% of their Expected Value, an average number for the last decade.

It should be noted that the Darko pick by the Pistons was made exponentially worse by the other members of the top 5. The Pistons were a contender that would win the NBA Championship the following season - imagine if they had added a Carmelo or a Wade to that team.

2004
Overall: 100.28%

Best Picks: Dwight Howard (1), Andre Iguodala (9), Josh Smith (17), Kevin Martin (26), Al Jefferson (15)
Worst Picks: Shaun Livingston (4), Rafael Araujo (8), Luke Jackson (10)

This draft is exceptional for how top-heavy it was. With the names listed above, there was a lot of talent drafted in the first round. Three players went on to have production consistent with being drafted first overall (Howard, Iguodala, and Josh Smith), and four more produced number-two-overall value - Okafor (2), Ben Gordon (3), Luol Deng (7), and Al Jefferson (15). That's seven players who could have been drafted #1 or #2 and been a worthy selection. But unlike most drafts, almost no players drafted in the second round went on to have significant careers - Trevor Ariza (43) and Chris Duhon (38) were the only two exceptions.

2005
Overall: 114.04%

Best Picks: Chris Paul (4), Danny Granger (17), David Lee (30), Monta Ellis (40)
Worst Picks: Yaroslav Korolev (12), Julius Hodge (20)

The high overall rating of this draft is amazing considering the careers of the first and second picks (Andrew Bogut and Marvin Williams) - both have under-performed their draft position. The draft is bolstered by Deron Williams (3) and Chris Paul (4), and an exceptionally solid 17-40, with players like Monta Ellis and Louis Williams getting drafted in the second round.

2006
Overall: 77.91%

Best Picks: LaMarcus Aldridge (2), Rajon Rondo (21)
Worst Pick: Adam Morrison (3)

This draft was so weak it seems hard to call many of the picks bad - there just wasn't that much talent available. Only 4 players have given the Expected Value of a #4 pick, compared to a similarly weak 2002 draft when 7 players performed at that level. Only 7 players have given the Expected Value of a top 10 player.

2007
Overall: 95.75%

Best Picks: Kevin Durant (2), Marc Gasol (48)
Worst Pick: Greg Oden (1)

The only player to really stand out in this draft is Durant, who was half of the obvious #1/#2 pairing with Oden. Gasol was a great gamble late in the second round even though he wasn't going to play the next year - it obviously paid off, and he has given the third-highest value of his draft class so far, despite missing that season.

2008
Overall: 117.06%

Best Picks: Westbrook (4), Love (5), Brook Lopez (10)
Worst Picks: Joe Alexander (8), Alexis Ajinca (20)

An exceptionally deep draft, with only the two "worst" picks not producing well among the first 29 picks. A very high 117% overall performance, without a headline group like the 2003 draft (LeBron/Bosh/Wade/Carmelo), makes this draft unique among the last ten. An amazing 34 drafted players performed at least at 75% of their Expected Value, the most in the ten years of this survey.

2009
Overall: 108.32%

Best Picks: Brandon Jennings (10), Jrue Holiday (17), Ty Lawson (18), Darren Collison (21), Marcus Thornton (43)
Worst Pick: Hasheem Thabeet (2)

With only three seasons to look at, many of these picks are still works in progress. Ricky Rubio has given almost nothing back to the Timberwolves, but the glimpse we saw of him last year shows he could still justify a #5 pick. This draft, much like 2008, appears to be very deep, with 32 players performing at 75% or better of their Expected Value - the second-most in the last ten drafts. However, no player has yet given performance equal to that of a #1 pick. Blake Griffin is closest, having been drafted in that spot, and he missed a season due to injury - so all expectations are that he will exceed his Expected Value soon.

2010
Overall: 76.73%

Best Picks: Greg Monroe (7), Landry Fields (39)
Worst Picks: Evan Turner (2), Cole Aldrich (11)

These players have only had two seasons to perform, so it's not very fair to be evaluating the draft already. But so far, it is remarkable that Landry Fields has been able to contribute the Expected Value of a #4 pick from the 39th spot. Greg Monroe has put up exceptional value, producing more than John Wall - and both of them are over the EV of a number one overall pick.

2011
Overall: 96.39%

Best Picks: Kyrie Irving (1), Isaiah Thomas (60)
Worst Picks: N/A

While it is way too early to evaluate this draft using the metrics I have designed, it should be noted that what Thomas did as the 60th pick in the draft is pretty remarkable - he had the second-most productive rookie season of the draft class.

Here is a summary of the overall results:


This was Part II of the StatDance.com NBA draft analysis.
Part I: Determining the expected value of a draft pick
Part III: Team-by-team NBA draft performance - Coming Soon
Part IV: We evaluate every NBA GM since 2002 - Coming Soon
Part V: Who did they miss? Looking at the undrafted free agents in the NBA - Coming Soon

Tuesday, July 31, 2012

NBA Draft Analysis Part I: How to value an NBA Draft Pick

A couple of weeks ago, I started to analyze the NBA draft and came up with a general formula for the value of a draft pick. Since then, I've spent many long nights digging into the numbers. This is part one of my five-part NBA Draft Analysis, where I go more in-depth than my previous article to develop a means of analyzing NBA drafts and comparing teams, draft classes, and the GMs making the decisions.

The statistic I decided to use as a general-purpose comparison is PER*MP: (Player Efficiency Rating)*(Minutes Played). This gives heavy consideration to actually being on the floor, and heavy consideration to contributing while playing. John Hollinger, creator of the complicated PER, has a similar statistic called Value Added (VA). PER*MP and VA differ in that Value Added corrects for position; it attempts to measure a player's contributions compared to those of a replacement player. For a variety of reasons VA does not lend itself well to our analysis of the NBA draft, so we will stick with PER*MP.

I calculated the PER*MP performance of every draft pick since 2002 (the last ten drafts) for every season. I then calculated the average performance for every pick, for every year since they were drafted and smoothed the line out in Excel. Basically, I found the PER*MP you should expect from a pick for every year after being drafted. For example, the 8th pick in the draft, in their 4th season since being drafted, is expected to post a PER*MP of 24154. If they average 20 minutes a game and play in 75 games, the PER would have to be 16.1 to reach the "expected" performance. This expected performance accounts for all the busts at the draft position - Rafael Araujo in 2004, for example - as well as the standouts like Rudy Gay in 2006.
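To make the arithmetic in that example concrete, here is a quick sketch (the 24154 figure is the expected PER*MP quoted above; the 20 minutes over 75 games are the example's assumptions):

```python
# What PER does the 8th pick need in his 4th season to hit the expected PER*MP of 24154,
# assuming 20 minutes per game over 75 games?
expected_per_mp = 24154        # expected PER*MP for pick 8, season 4 (from the post)
minutes_played = 20 * 75       # 1500 minutes

required_per = expected_per_mp / minutes_played
print(round(required_per, 1))  # 16.1
```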

The expected performance for each draft position increased until year three, leveled out until year seven, and then dropped off in year eight. This behavior was much more evident early in the draft - the late picks peaked at year three as well, but with much smaller differences, since very little is expected of a late draft pick. It should be noted that few picks reach the expected value late in the second round; most of them post very low numbers, with a few players going on to have good careers. The Expected Value is the smoothed-curve approximation of the average performance. Showing these trends on a plot was pretty hard in two dimensions, so I tried to get a decent 3D chart in Excel.



Next, I had to decide how to weight each year, given the PER*MP for every season. I took the expected values and figured out a formula to weight the seasons for a fair comparison. Obviously, good performance early is more valuable than good performance three years from now. For picks with 8 full seasons to compare, the weighting is as follows:

1st Year: 17.3%
2nd Year: 17.0%
3rd Year: 16.3%
4th Year: 15.3%
5th Year: 13.6%
6th Year: 11.2%
7th Year: 7.5%
8th Year: 1.8%

The picks that have fewer years are weighted in the same pattern, with the first year always the most valuable. For example, a second-year player would have his first season weighted at just over 50%.
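A small sketch of that renormalization, using the eight-year weights listed above - for a player with fewer seasons, the first N weights are simply rescaled to sum to 100%, which is how a second-year player's first season lands just over 50%:

```python
FULL_WEIGHTS = [17.3, 17.0, 16.3, 15.3, 13.6, 11.2, 7.5, 1.8]  # eight-year weights (%) from the list above

def season_weights(seasons_played):
    """Rescale the first `seasons_played` weights so they sum to 100%."""
    weights = FULL_WEIGHTS[:seasons_played]
    return [round(100.0 * w / sum(weights), 1) for w in weights]

print(season_weights(2))  # [50.4, 49.6] - the first season is weighted just over 50%
```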

The 2002-2004 drafts have already had eight seasons to judge by, so these Modified Pick Values (MPVs) won't change going forward. For the more recent drafts, we don't have 8 years of performance to judge the picks on. These expected and actual MPVs will both change. Of course, this means that the expected MPV will be different for every draft after 2004, as you can see in the slideshow below.

To view any image in full resolution, click the gear in the top-right corner and select "view full resolution". Or, you can view the charts at picasa or imgur.



Since ranking things is probably the most fun thing we can do with statistics, let's see what these numbers tell us. A more appropriate ranking system would probably be raw value over Expected Value - but with only a ten-year window, it would heavily favor the 2002-2004 drafts. So instead we will rank picks by percent of Expected Value - the best and worst value picks.

This is a very unfair ranking system; it's just for fun. No one is suggesting that these are the best picks of the last decade - they are just the highest-value picks at their spots. And since we are ranking players who have played only a year or two, some of the results will be a bit ridiculous. So, with those qualifications, on to the rankings!

Here are the 25 top value picks since 2002 by percent of expected value:

  1. 2011 Isaiah Thomas 60 3844.80% 
  2. 2007 Ramon Sessions 56 986.01% 
  3. 2006 Paul Millsap 47 811.82% 
  4. 2007 Marc Gasol 48 798.47% 
  5. 2003 Mo Williams 47 664.00% 
  6. 2005 Ryan Gomes 50 651.34% 
  7. 2003 Kyle Korver 51 638.68% 
  8. 2009 Marcus Thornton 43 607.14% 
  9. 2005 Monta Ellis 40 572.20% 
  10. 2002 Carlos Boozer 34 532.45% 
  11. 2010  Landry Fields 39 525.20% 
  12. 2011 Chandler Parsons 38 520.83% 
  13. 2005 Andray Blatche 49 477.58% 
  14. 2005 Amir Johnson 56 470.19% 
  15. 2005 David Lee 30 457.89% 
  16. 2005 Louis Williams 45 454.08% 
  17. 2002 Rasual Butler 52 446.74% 
  18. 2009 Chase Budinger 44 446.35% 
  19. 2005 Marcin Gortat 57 440.77% 
  20. 2009 DeJuan Blair 37 401.45% 
  21. 2008 Goran Dragic 45 379.82% 
  22. 2004 Trevor Ariza 43 370.50% 
  23. 2003 Josh Howard 29 370.09% 
  24. 2003 Zaza Pachulia 42 361.49% 
  25. 2011 Lavoy Allen 50 345.41% 


All of these top value picks are in the later part of the draft due to the lower expected values. LeBron would have had to be drafted 6th to take the 25th spot overall! Here are the 25 best value picks in the top 10:


  1. 2004 Andre Iguodala 9 237.22% 
  2. 2003 Dwyane Wade 5 229.23% 
  3. 2002 Amare Stoudemire 9 221.43% 
  4. 2005 Chris Paul 4 221.12% 
  5. 2008 Brook Lopez 10 203.72% 
  6. 2002 Caron Butler 10 201.58% 
  7. 2010 Greg Monroe 7 199.31% 
  8. 2003 LeBron James 1 189.96% 
  9. 2009 Brandon Jennings 10 189.94% 
  10. 2003 Chris Bosh 4 188.13% 
  11. 2006 Rudy Gay 8 187.23% 
  12. 2007 Kevin Durant 2 174.80% 
  13. 2008 Russell Westbrook 4 174.44% 
  14. 2008 Kevin Love 5 170.17% 
  15. 2003 Carmelo Anthony 3 164.88% 
  16. 2006 Brandon Roy 6 159.71% 
  17. 2004 Luol Deng 7 159.57% 
  18. 2011 Kemba Walker 9 159.47% 
  19. 2009 Stephen Curry 7 156.05% 
  20. 2003 Kirk Hinrich 7 154.93% 
  21. 2010 DeMarcus Cousins 5 153.99% 
  22. 2005 Deron Williams 3 151.39% 
  23. 2007 Joakim Noah 9 144.75% 
  24. 2004 Dwight Howard 1 143.88% 
  25. 2009 DeMar DeRozan 9 140.63% 


This ranking also shows how important staying healthy is to being considered a valuable draft pick. Blake Griffin, for example, would be over 150% of his expected value if he were judged on only two seasons, but he has to be judged on three since he sat out a season after being drafted by the Clippers. And finally, the worst top 10 picks of the last ten drafts:


  1. 2011 Jonas Valanciunas 5 0.00% 
  2. 2006 Mouhamed Sene 10 3.21% 
  3. 2006 Patrick O'Bryant 9 4.89% 
  4. 2004 Luke Jackson 10 5.36% 
  5. 2002 Nikoloz Tskitishvili 5 6.24% 
  6. 2004 Rafael Araujo 8 7.51% 
  7. 2008 Joe Alexander 8 8.39% 
  8. 2002 Jay Williams 2 11.28% 
  9. 2009 Hasheem Thabeet 2 13.65% 
  10. 2006 Adam Morrison 3 13.85% 
  11. 2002 Dajuan Wagner 6 14.65% 
  12. 2007 Greg Oden 1 16.84% 
  13. 2009 Ricky Rubio 5 24.10% 
  14. 2003 Darko Milicic 2 31.04% 
  15. 2005 Ike Diogu 9 33.04% 
  16. 2007 Brandan Wright 8 36.07% 
  17. 2004 Shaun Livingston 4 36.33% 
  18. 2006 Shelden Williams 5 41.80% 
  19. 2003 Mike Sweetney 9 42.35% 
  20. 2009 Jordan Hill 8 48.57% 
  21. 2011 Enes Kanter 3 48.74% 
  22. 2007 Yi Jianlian 6 54.47% 
  23. 2009 Jonny Flynn 6 55.80% 
  24. 2010 Ekpe Udoh 6 56.76% 
  25. 2005 Martell Webster 6 59.74% 

Of course, #1 in this ranking, Jonas Valanciunas, will actually be playing next year, so it isn't very fair to call him the worst pick ever. But generally top 10 picks are drafted to play right away. He will still easily be able to exceed his Expected Value with a nice career though.

A huge thanks to the fine folks at basketball-reference.com for their wonderful stats database, I got all of the raw data from their site. Also, Grantland.com's Bill Barnwell probably inspired this series (and maybe this website) with his NFL draft analysis. And finally, 82games.com did something similar to this a few years ago but did it the easy way, which didn't do the job justice.



This was Part I of the StatDance.com NBA draft analysis.
Part III: Team-by-team NBA draft performance - Coming Soon
Part IV: We evaluate every NBA GM since 2002 - Coming Soon
Part V: Who did they miss? Looking at the undrafted free agents in the NBA - Coming Soon





Wednesday, July 25, 2012

How Close is Tiger Woods to Actually Passing Jack Nicklaus - We Rank Golf's Greats

Tiger Woods, with his 14 majors, has been chasing the iconic Jack Nicklaus for quite some time. Only four back from Jack's 18 major wins, the feat seems so close to achievement - if only Tiger can get back to his former level. This record is what Tiger strives for, the only thing left for him to conquer in competitive golf.

Did you realize that Jack Nicklaus took second place at majors 19 times? So Jack won 18, placed second 19 times, and took third an amazing 9 times. Tiger Woods has only 6 second-place finishes in majors, and 4 third-place finishes.

I tried to develop a fair way of measuring a golfer's career. This led me to do a lot of research on golf, a sport I have not spent much time following. The major championships started in 1860, with The Open Championship in Scotland. Over the course of professional golf's history, much has changed. Today, one could use career earnings - but I sincerely doubt that the purses have kept pace with inflation, and in older times the majority of the "tour" was exhibition matches. I also found a very real lack of historical data. This might be due to my lack of familiarity with the sport, but I just don't think there is much interest in golf statistics. Don't worry, that didn't stop me from staying up until 3am several nights in a row, playing with golf statistics!

I decided that measuring success in golf's major championships would be the truest measurement of career success. The majors have been the only constant in golf's history (except, of course, for the years the tournaments were cancelled in wartime). I declined to rank the great women golfers: it would be purely a judgment call, and I couldn't find the function in Excel.

I decided on a 1/x function for the Major Points statistic. I toyed with more complicated formulas, but found them wholly dissatisfying. I agree that this might seem weighted towards rewarding the Golden Bear, with his record of top-3 finishes - but it's more than fair. Consider the purses: the 2012 US Open rewards second place with 60% of the winner's take, and third place with 37%. My simple 1/x formula gives 50% and 33%, respectively. So I don't think this too heavily favors Jack's career over Tiger's. (I could have, for example, awarded points based on the percentage of the total prize money available at major tournaments every year, but that system would have been too daunting a task for me to justify.)

First, I painstakingly extracted the data from Wikipedia, where major tournament results are tabulated as timelines for all of golf's greats. I decided the "cutoff" to be considered in this ranking was to have won four majors, for several reasons: first, I started this task trying to compare Tiger Woods to Jack Nicklaus; second, I wanted to deal with a manageable amount of data; and third, I only wanted to compare great with great, and four majors is a pretty exclusive bunch. (Note: Billy Casper was ranked near the top 10 of male golfers by Golf.com, which was the original group I was using, while only having won 3 majors.)

Then, I ranked the golfers from the best (Jack Nicklaus) to the "worst" - but make no mistake, all of these golfers were among the best in their eras. I highly encourage you to read the Wikipedia articles on these players. Some (most!) of their lives were fascinating. It seems golfers today keep their lives so private that we don't get the fascinating stories the older golfers left for us. (I didn't write this as a jab at Tiger, but I'm totally leaving it here as a jab at Tiger.) The early eras were dominated by locals, since it was a new sport, but golf.com still ranked them in their top 20 (which included several women). There were fewer players and fewer majors - so fewer points to go around. I think this gives a natural "curving" to the rankings, allowing us to respect the history of the game (no one thinks Old Tom Morris would be competitive in today's game) while still giving today's amazing athletes their proper dues.

A summary of "Major Points" - earned for finishes at major championships, 1/finish, so...
1st: 1 point = 1.00
2nd: 1/2 points = .5
3rd: 1/3 points = .333
20th: 1/20 points = .05
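A minimal sketch of the Major Points calculation (the example finishes are made up, not any golfer's actual record):

```python
def major_points(finishes):
    """Sum 1/finish over a list of finishing positions at major championships."""
    return sum(1.0 / finish for finish in finishes)

# Hypothetical career: one win, two runner-ups, a third place, and a 20th.
print(round(major_points([1, 2, 2, 3, 20]), 2))  # 2.38
```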






The first interesting result I found, besides the player rankings themselves (nerdy stuff below - see the bottom of the page), came from totaling how many Major Points were won in each year of a player's career.

The chart below is a composite of performance by every golfer in our rankings - the 28 people who have won 4 majors or were ranked by golf.com as a top-20 golfer. The horizontal axis is the year of a player's career (first year playing in a major, second year, and so on). The columns represent the sum of the Major Points that all of the players earned during that year of their careers. Don't worry about matching each individual with their performance in this chart - a full collection of individual performances is below.





Then, I overlaid this "average career" on each of our 28 golf greats' actual careers and saw how they all compared.





The scale on these plots was chosen by Excel, matching the peaks of the bar chart with the peak of the average-career plot. It's amazing how closely the individual careers can follow the average. I did not include Tiger Woods in the average career since he is still so near his prime. Phil Mickelson and Ernie Els are both 4-5 years further along than Tiger, and were included.

So, does Tiger have a chance to catch Jack? Simply put, no. He is already second, ahead of plenty of impressive careers before him, and he will put significant distance between himself and the pack close behind him. I do not doubt his ability to match Jack's significant mark of 18 majors, but while Tiger has 78% of the major titles Jack has, he has only 58% of the Major Points that Jack accrued.

Tiger is my favorite golfer - growing up casually following professional golf, Tiger was golf. I liked him - who didn't? I never idolized him though, so his fall was easy for me to get over. I'll be cheering for him, but I know I won't see the day where he is ever number one.

Here are some timelines of golf, broken down into readable sections by era. It's interesting to see how much competition Jack had, but who knows who will go on to win more majors in Tiger's era.






(1) This curve, as you can see, gives a very nice competing-exponentials model (see: this, this, and this for examples) - the increase in talent with age and experience, and the subsequent decrease from getting older. I just used a fourth-order polynomial to approximate this, since I only need a good curve, not a scientifically rigorous result.
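For what it's worth, the same fourth-order fit is easy to reproduce outside Excel; here is a sketch with numpy, where the year and points arrays are placeholders rather than the actual chart data:

```python
import numpy as np

# Placeholder data: career year vs. total Major Points earned by the group in that year.
years = np.arange(1, 31)                    # years 1-30 of a career
points = np.random.rand(30) * 10            # stand-in for the real yearly totals

coeffs = np.polyfit(years, points, deg=4)   # fourth-order polynomial, as described above
average_career = np.polyval(coeffs, years)  # the smoothed "average career" curve
```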

Monday, July 23, 2012

Why Punish Penn State?

The horrors of the Penn State scandal cannot be overstated, but the punishment handed down to the school a couple of hours ago by the NCAA needs to be put in perspective as well. The NCAA is an athletic association of universities; its business is the regulation of athletic competition between student-athletes.

Of course, this is not about punishing Penn State for the vile acts committed by one person affiliated with the program; this is all about the cover-up. It always is. The power given to the football program at PSU was abused in a vain attempt to put the interests of the football program over the best interests of humanity. I sincerely hope that the justice system finds a way to properly punish those who are actually at fault for the acts that went on under their watch.

I'd like to applaud the NCAA for acting promptly - it has seemed that nothing the NCAA does is done this promptly. This will serve as a basis for judging the timeliness of all future NCAA actions. Now, let's examine what the NCAA actually accomplished today.

From The Big Lead:

-Penn State has been fined $60 million
-4 year bowl ban
-Vacated wins from 1998-2011
-20 total/10 annual scholarship reduction for 4 yrs
-Any entering or returning players can transfer without penalty

The fine, even if it is not paid by the insurance company (as I read on twitter before the announcement, but have not heard since, so may be untrue), is not very steep. As Forbes is reporting, the football program (at its old pace) was printing cash, and any monetary penalty would have to be much larger to significantly impact the viability of the football program.

The only people that care about the vacated wins are the Penn State fans (maybe) and Paterno's die-hard supporters. We all know who won those games. If this had been a competitive violation, the argument could be made that the wins should be vacated. No one (that I have seen) has suggested that Joe Pa had any significant NCAA violations in that regard. Considering the depth of the Freeh report, maybe this is a real testament to his football integrity.

The competitive penalties are real, significant, and devastating. No bowl games, a huge loss of scholarships, and an express lane for current players to leave. This mauling of a program, which is more important than any one person, has left me with a bad taste in my mouth. What is the point of it? Why do this?

As I see it, there are three reasons to levy a penalty (from a philosophical perspective):

-Punishment - to remove the advantage gained from having committed the wrongdoing.
-Safety - to ensure that the wrongdoers are not able to continue their acts
-Discouragement - to convince others not to act wrongly in future.

In what ways do the competitive penalties given by the NCAA accomplish this? The only advantage gained by Penn State was that their program's public image remained untainted. I think it's safe to say this advantage has been eliminated, organically. The financial penalties are probably fair, if not overly lenient - the money going to help victims of child sex abuse.

The individuals that were part of the cover up need to be brought to justice by the court system. The NCAA of course has no part in this, the "safety" aspect. Obviously, the most important individual is already behind bars. Others are going to court for their parts in the cover-up (and lying about it).

Did the NCAA need to destroy the football program to discourage others from doing this? I would hope not. Since we've eliminated the other two aspects, this must be the NCAA's intent. But who is punished by this decision? Mostly, the fans and players. Hundreds of thousands of Penn State fans no longer have a competitive team to cheer for. There is no need to pity them - it's just sports - but this isn't how sports should work. The acts of a single individual, and the subsequent enabling acts of a handful of individuals, have led to an entire program's practical demise.

So, what would be an appropriate discouragement? There was no football advantage - so give the program no football penalties. Education, accountability, oversight, and money. Ensuring the University has to pay for its transgressions. That's how this should have been done.

Thursday, July 19, 2012

NBA Playoff Winning Percentages by Game



In the 1983-84 season, the NBA switched to a playoff system much like the one currently in place: 16 teams make the tournament, there are no byes, and every season has 15 series in total. From 1984 through 2002, the first round was a best-of-five series; since 2003, all playoff series have been best-of-seven contests.

While the number of teams and number of series have remained constant, the seeding has gone through several permutations, giving different seeding advantages to division winners. However, the home-court advantage always goes to the team with the best record.

For example, this year the Boston Celtics finished first in the Atlantic Division with a 39-27 record, and the Atlanta Hawks finished second in the Southeast Division with a slightly better record, 40-26. As the division champion, the Celtics were guaranteed a top-four seed despite having the fifth-best record in the East. This meant that although the Celtics were the four seed, home-court advantage went to the five seed. (Note: this had no effect on the matchup - however, had the Celtics won the division with the eighth-best record, they would still have faced the Hawks instead of the #1 seed, the Bulls.) In my analysis, I used the NBA's home-court advantage to determine the "higher seed."

Every game has its own flavor, from Game 1, with the anticipation of match-ups and rivalries, to tense Game 7s with seasons on the line. I went through the last 27 years of playoff series and found the winning percentages of the home team and the higher-seeded team for each game.
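The tallying itself is simple once every playoff game is recorded with its game number and outcome; a sketch of the idea is below (the data structure is my assumption - the actual records were compiled by hand from the series results):

```python
from collections import defaultdict

def home_win_pct_by_game(games):
    """games: iterable of (game_number, home_team_won) pairs, one per playoff game."""
    wins, played = defaultdict(int), defaultdict(int)
    for game_number, home_team_won in games:
        played[game_number] += 1
        wins[game_number] += int(home_team_won)
    return {g: round(100.0 * wins[g] / played[g], 2) for g in sorted(played)}

# e.g. home_win_pct_by_game([(1, True), (1, True), (1, False), (2, True)]) -> {1: 66.67, 2: 100.0}
```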

Now, on to the games!

Game 1


Game 1 is always a home game for the higher seed. The home team has won 76.05% of these games. This nearly matches the overall home winning percentage of the higher seed (75.95%).

Game 2

Like Game 1, Game 2 is always a home game for the higher seed. Two possible game 2's exist - the 0-1 game (with the lower seed having won the first game) and the 1-0 game (with the favorite winning the first game).

If the underdog had won the first game, the second game is won by the favorite 79.38% of the time. If the favorite won the first game, the favorite goes up 2-0 73.7% of the time.

This means that the favorite wins game 2 more often if they lost game 1. The combination of the underdog's complacency (having already snatched back home-court advantage) and the favorite's desperation at the prospect of falling into a two-game hole before heading into enemy territory leads to a significant increase in home winning percentage.

Overall, after game 2, 56% of series have the favorite up 2-0, 39% are tied 1-1, and in 5% the underdog has cleaned up in the favorite's house and leads two games to none.

Game 3


For game 3, the home team is always the underdog. In 20 of 405 attempts (5%), the underdog is already up two games to none. In 12 of those 20 games, the underdog takes a 3-0 lead on the favorite (winning 60%). This isn't far from the overall winning percentage of the home team - they win 56.3% of game 3s overall. However, with such a small sample size, this isn't very useful information.

When the underdog is down 2-0, they win 58.15% of the time in game 3. If the series is tied, the home team wins 53.16% of the games. One might think that the home team would win more often after having won once on the road, but the opposite is true. The condition of the series (the higher seed not wanting to fall behind in the series) is more indicative of the result of game three than the idea that the teams might be more closely matched.

This could be a general trend, but is more likely an overlap of two different scenarios. The first scenario being that the higher seed is significantly superior to the lower seed, and facing a deficit in the series, really turns it on and dominates game 3. The second scenario being that they are actually closely matched and the home team wins most of the games.

In 5-game series, the favorites swept in 50% of their chances - 43 of 86 attempts. This number is the lowest winning percentage for the home team with at least 50 games played. This is probably a testament to the extremely high numbers of teams that were allowed in the playoffs when the league first switched to a 16-team playoff. In 1984, there were only 23 teams in the league, so 70% of the league made the tournament.

After game 3, of which 405 have been played: 

  • 95 times (23.46%) the favorite is up 3-0 (43 times ending the series) 
  • 206 times (50.86%) the favorite is up 2-1 
  • 92 times (22.72%) the underdog is winning 1-2 
  • 12 times (2.96%) the underdog is up 0-3 (a 3-game sweep 5 times) 

Game 4


In game 4, the home team is again always the underdog, just like in Game 3. Remarkably, the higher seed has won this game significantly more often than game 3. Boasting a nearly-even 49.58% winning percentage over the past 27 seasons, game 4 has the higher seed overcoming the home-court advantage of the lower seed.


In the 7 games played with the underdog threatening a sweep, only once has the favorite bounced back and taken a game (the 2005 Western Conference Finals - the Suns stole game 4 but lost in 5 to the eventual NBA champion Spurs). The other 6 times, the underdog got the brooms out.


For the 83.47% of game 4s that start with the series at 2-1 (either the underdog or the favorite holding a one-game lead), the results are very similar - right around a 50% winning percentage. These are cases where neither team has its back against the wall. The previous performance in the series is indicative of the result of this game, although only to a small degree. If the lower seed is up two games to one, they go on to take a 3-1 lead 53.26% of the time. If the higher seed has the 2-1 lead, the lower seed wins only 50.97% of the time. A small, but interesting, difference.


The favorite has threatened to sweep (being up 3 games to none) in 52 of the 253 best-of-seven series played in the last 27 years. These are obviously series where the favorite is significantly superior, having won both of their home games and their only road game, and this situation produces by far the highest winning percentage of any visiting team: the favorite completes the sweep in 32 of the 52 tries (61.54%). In fact, the next highest away-team winning percentage in a seven-game series is game 3 of a tied series, when the favorite wins to take a series lead 49.49% of the time.


Unfortunately, there is no way to compare the winning percentage of the best-of-five series sweeps to best-of-seven sweeps since the close-out game 3 is the first game played at the underdog’s home-court.

Game 5


While game 5 is usually a home game for the favorite, in the finals game 5 is the third home game in a row for the underdogs.


When the underdog has a 3-1 lead going into game 5, the higher seed wins 72.73% of the games to bring the series to a 2-3 tally. This is a high winning percentage, but still lower than the overall game 5 favorite winning percentage of 74.53%. This slightly lower percentage could be due to some game 5s being away games for the favorite, or to the fact that the lower seed has to be a worthy opponent to have taken a three-games-to-one lead.


When the favorite is ready to clinch in game 5 with a 3-1 lead, they are almost always playing at home and have a remarkable success rate of 76.74%. This winning percentage is likely dominated by the higher seeds winning against an outmatched opponent that got a win at home in game 3 or 4.


With the series tied at two games apiece heading into game 5, the home team wins 74.32% of the games, to take a series lead. The majority of game 5’s that have been played over the last 27 seasons (55.43%) are of this type, with the series lead in the balance.

Game 6

Game 6 is usually played on the underdog's home court (the exception being the Finals, where underdogs have won 6 road game 6s since 1984). Only two records are possible going into game 6: 3-2 in favor of the top seed, or 2-3 in favor of the lower seed. Over the past 27 seasons, 66.43% of game 6s have been 3-2 in favor of the top seed.


When the top seed has a chance to win the series in game 6, they are on the road with two chances to clinch, while the underdog has its season on the line at home. In 46.24% of these games, the underdog pulls it out and takes the series to a game 7. Given the gravity of the situation, and that the underdog has already won two games, one might think this would be more in favor of the lower seed, but in fact it is below the average winning percentage of the lower seed on their home court (54%).


When the lower seed has won three games going into a game 6, they are relatively dominant - winning 72.34% of their chances to close out the series on their home court. This is easily the highest winning percentage for the underdog in any game (except for the 6 sweeps in the 7 chances they have had to close out a series in game 4 at home).


This large gap in winning percentages in game 6 - in series that have already gone six games - is surprising to me. Only one game out of six separates the two teams, yet there is a 26% difference in winning percentage.

Game 7


In the 27 seasons I analyzed since the playoffs switched to a 16-team format, 56 playoff series have gone to a game 7. The top seed has been dominant, winning 82.14% (46 of 56). Considering that the lower seed has already won 3 games against this team, it's a very significant edge for the higher seed.


Playoff basketball is all about attitude and talent. The higher seeds usually have the talent, and when their backs are against the wall, the talent perseveres.


The difference in winning percentages depending on the state of the series is astounding. Whether the difference is a testament to players' will to win when the pressure is on, or an embarrassment that they don't try hard enough in the earlier games, I'll leave up to you to decide.








Google "Jason Kidd" Lately?

I'd love to hear the story about how this photo became the default google "Jason Kidd" photo. Classy.


(Note: as of 19 July, a more appropriate picture has replaced the one shown.)



The Numbers Behind the Jeremy Lin Contract

There's a story out there that no one is talking about in this NBA offseason: a little-known player named Jeremy Lin (a Taiwanese-American who graduated from Harvard) has been offered a contract by the Houston Rockets. Since Lin is a restricted free agent, his team last year (the New York Knicks) can match the offer and keep Lin.
In either an effort to make it impossible for the Knicks to re-sign Lin, or to force them into going way over the luxury tax threshold, the Rockets offered Lin a somewhat ridiculous 3-year deal of approximately 5/5/15 - a total of approximately $25 million, but with a huge number in the third year. Due to the tiered luxury tax system, it is much more expensive for the Knicks to sign him than it is for a team with more flexibility, like the Rockets.
Since the news coverage of this story in this quiet lull between interesting sports (sorry, baseball) has been saturating, I'm sure we have all heard this before - and heard that it could cost the Knicks anywhere from $30 million to $75 million in the third year. Obviously, it's not Lin alone that costs that much; it's the luxury tax on the sum of the contracts.
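For reference, here is a sketch of how an incremental ("tiered") tax bill grows with the amount a payroll sits over the tax line. The bracket rates are the approximate post-2011-CBA figures as I understand them, used purely for illustration:

```python
def luxury_tax(amount_over_tax_line):
    """Approximate incremental luxury tax (rates are assumed, for illustration only)."""
    rates = [1.50, 1.75, 2.50, 3.25]  # assumed rates for the first four $5M brackets
    tax, remaining, bracket = 0.0, amount_over_tax_line, 0
    while remaining > 0:
        rate = rates[bracket] if bracket < len(rates) else 3.25 + 0.50 * (bracket - 3)
        portion = min(remaining, 5_000_000)
        tax += portion * rate
        remaining -= portion
        bracket += 1
    return tax

print(luxury_tax(15_000_000))  # a team $15M over would owe roughly $28.75M under these assumed rates
```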
So I decided to take a look at what every NBA team had in guaranteed salaries for that season.
Guaranteed NBA Salaries 2014-2015
In the season in question, 2014-15, the Knicks would have $75 million guaranteed to Carmelo Anthony, Amare Stoudemire, Tyson Chandler, and Lin. This is tops in the league, but only by 5-6 million over the Nyets and the Heat. Considering the salary-cap options available to them - the stretch clause, or trading someone (not necessarily Lin) - I find it hard to believe they can't afford Lin.
After all, Dolan is not hurting for cash, and Lin could easily turn into a great revenue stream and pay for himself anyway.

Tuesday, July 17, 2012

NBA Draft Analysis

I've wanted to look at the NBA draft for a while now - I had lots of questions. I tried to answer a few of them by looking at the last ten NBA drafts (2002-2011) and looking at how their careers turned out relative to their draft positions.

The first thing I had to figure out was how to compare careers. Simple box-score metrics obviously don't work - looking at points per game would be a very poor single indicator of career success in the NBA. I did some cursory investigating into advanced basketball statistics (APBRmetrics) and found a lot of ideas are out there.

I get the majority of my data from the wonderful Basketball-Reference site, which lists Player Efficiency Rating (PER) and Win Shares. Other sites, like the NBA Geek, list Wins Produced. Another metric is adjusted plus/minus, and while I'm sure it has its merits, I'm not really interested in assessing NBA drafts in a world that has Eric Bledsoe as the best per-minute player in the NBA last year. It's not just that I refuse to accept something that goes against my preconceived notions of what happened last year; it's that public perception and box-score statistics are how players and drafts are evaluated.

I liked Win Shares and Wins Produced, but they both attempt to gauge a player's defensive contributions by looking at how the team did. I don't want to give a player credit for playing with good defenders; it doesn't seem fair to me.

The most important metric is minutes played - if you're good enough to get on the floor and stay on the floor, you're a contributing member of the team. No other stat can replace that. I decided the best statistic for measuring the quality of a player's contribution is the Player Efficiency Rating. Despite its flaws, it gives a great picture of a player's ability to contribute. Most importantly, for this exercise, it is relatively consistent with perception. If a GM drafts someone with a high PER for his draft position, chances are that pick will be viewed as a "success" when the GM is evaluated.

What I calculated is the PER*Minutes Played - PERMP - for each player drafted since 2002. You can view each photo in full resolution by clicking the gear in the top right corner, or just view the album in full here.



These results were very interesting to me, and once I averaged each draft pick's performance per year, it gave a relatively nice curve.




The formula for the percent of the first pick's value that each pick is worth is 318307*e^(-0.06167*Pick)/299270. This gives:

1 100%
2 94.0%
3 88.4%
4 83.1%
5 78.1%
10 57.4%
20 31.0%
30 16.7%
45 6.63%
60 2.63%
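The table above follows directly from that formula; a quick sketch to reproduce it:

```python
import math

def pick_value_pct(pick):
    """Percent of the #1 pick's value, using the fitted curve above."""
    return 100.0 * 318307 * math.exp(-0.06167 * pick) / 299270

for pick in (1, 2, 3, 4, 5, 10, 20, 30, 45, 60):
    print(pick, round(pick_value_pct(pick), 2))
```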

Of course, the numbers at the top are silly - in the 2012 draft, the top pick was probably worth double any other pick, since Anthony Davis is expected to be a superstar and everyone else is a long shot for superstar status (see Dwight Howard in 2004, or maybe LeBron in 2003 - one pick significantly better than the rest, even in a draft with a lot of other stars).

But, in the 2007 draft, the top pick was only marginally better than the second pick - you were still getting Oden or Durant (which, at the time, was a toss-up). But once you leave the top 5, the pick values are a lot more useful and consistent from draft to draft.

Again, much thanks to basketball-reference.com for all their data. If you have time, check out an analysis posted at 82games.com - they analyzed the drafts from 1980-2003 with a significantly different process and got very similar results.

Note: Originally posted on my tumblr blog.