Posts Tagged ‘Variance’

I’m a big fan of On/Off data, which compares a team’s point differential with a player on the court versus when he’s off the court. I’ve referenced it frequently in the past and think it’s one of the more telling reflections of a player’s value to his team in the advanced stat family.

The nice part about On/Off is that it represents what actually happened. The problem with On/Off is it ignores the reasons why it happened. And sometimes, it creates a fuzzy picture because of it.

For example, let’s suppose Kobe Bryant plays the first 40 minutes of a game and injures his ankle with the score tied at 80. LA wins the game 98-90. The Lakers were dead even when he was in the game and +8 with him out of it, so Bryant’s on/off would be -8.

In this case, sample size is an issue. But that becomes less of a problem over the course of an entire season. The real concern is the normal variance involved in everyone else’s game. Practically speaking, it takes little outside the norm for Kobe to have played 40 brilliant minutes while his teammates missed a few open shots, and for the opponent to miss a few open shots down the stretch while Kobe’s teammates start hitting them.

The tendency is to look at a result like that and conclude that Kobe hurt his teammates’ shooting and when he left the game it helped their shooting. He very well may have by not creating good looks for them.

Then again, players hit unguarded 3-pointers about 38% of the time. That means an average shooter who attempts five open 3-pointers will miss all five about 9% of the time (0.62^5 ≈ 0.09), simply because of the probabilistic nature of shooting. That fact has little to do with Kobe or any of the other players on the court.
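The miss-them-all figure is easy to check. Here is a minimal sketch, assuming each open 3-pointer is an independent attempt at the 38% rate quoted above (the function name is mine, for illustration only):

```python
def prob_all_miss(make_rate: float, attempts: int) -> float:
    """Probability that every one of `attempts` independent shots misses."""
    return (1.0 - make_rate) ** attempts


# An average shooter taking five open 3-pointers at a 38% make rate.
p = prob_all_miss(0.38, 5)
print(f"{p:.3f}")  # 0.092
```

So an 0-for-5 stretch shows up roughly once in every eleven such sequences, with nobody on the floor doing anything wrong.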

In our hypothetical situation, all it takes is an 0-5 stretch from the opponent and a 3-5 stretch from LA to produce Kobe’s ugly -8 differential. The great college basketball statistician Ken Pomeroy ran some illuminating experiments on the natural variance in such numbers. His treatise is worth the read, but the gist of it is that his average player — by definition — produced a -57 on/off after 28 games (-5.7 per game) due to standard variance in a basketball game outside of that player’s control. Think about that.

For fun, I just ran the same simulation and my average player posted a +5.6 rating of his college season:

Average Player Simulation

So in two simulations, the average player’s On/Off ranged from -5.7 to +5.6. One guy looks like an All-Star, the other like an NBDL player.
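To get a feel for how much noise is involved, here is a rough sketch of this kind of simulation. It is not Pomeroy’s setup or mine from above. It assumes a toy scoring model (each possession is worth 0, 2 or 3 points with made-up probabilities), 40 possessions per side per game with the player on the floor, and a player who has no effect at all on either team’s scoring:

```python
import random
import statistics

# Assumed per-possession scoring model (illustrative, not from the post):
# 0 points 52% of the time, 2 points 38%, 3 points 10%.
OUTCOMES = [0, 2, 3]
WEIGHTS = [0.52, 0.38, 0.10]


def season_plus_minus(rng, games=28, possessions_on=40):
    """Raw on-court plus-minus for a player with zero effect on scoring."""
    total = 0
    for _ in range(games):
        # Both sides draw from the same distribution: the player is exactly average.
        team = sum(rng.choices(OUTCOMES, WEIGHTS, k=possessions_on))
        opponent = sum(rng.choices(OUTCOMES, WEIGHTS, k=possessions_on))
        total += team - opponent
    return total


def simulate(seasons=2000, seed=1):
    """Simulate many independent 28-game seasons for the same average player."""
    rng = random.Random(seed)
    return [season_plus_minus(rng) for _ in range(seasons)]


results = simulate()
print("mean:", round(statistics.mean(results), 1))
print("stdev:", round(statistics.stdev(results), 1))
print("worst / best:", min(results), max(results))
```

Under this toy model the standard deviation of the season total works out to roughly 54 points in theory, so a season-long plus-minus in the neighborhood of -57 or +57 is an ordinary outcome for a player who changes nothing.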

“The Team Fell Apart When Player X Was Injured”

This is a common argument for MVP candidates: Look at how the team fared when he missed a few games and conclude the difference is the actual value a player provides to his team. Only this line of thinking runs into the same problems we saw above with on/off data.

Let’s take Dirk Nowitzki and this year’s Dallas Mavericks. In 62 games with Dirk, Dallas has a +4.9 differential (7.8 standard deviation). In nine games without Dirk, a -5.9 differential (7.5 standard deviation).

Which means, with a basic calculation, we can say with 95% confidence that without Nowitzki, Dallas is somewhere between a -1.0 and -10.8 differential team. Not exactly definitive, but in all likelihood they are much worse without Dirk. OK…but we can’t definitively say how much worse they are.
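For reference, here is a sketch of one basic calculation that reproduces that interval: the normal-approximation confidence interval for a mean, using z = 1.96. The post doesn’t spell out the exact method, so treat the choice of z as my assumption:

```python
import math


def mean_confidence_interval(mean, stdev, n, z=1.96):
    """Normal-approximation confidence interval for a sample mean."""
    half_width = z * stdev / math.sqrt(n)
    return mean - half_width, mean + half_width


# Dallas without Dirk: nine games, -5.9 differential, 7.5 standard deviation.
low, high = mean_confidence_interval(-5.9, 7.5, 9)
print(f"{low:.1f} to {high:.1f}")  # -10.8 to -1.0
```

With only nine games, a t-based interval (critical value of about 2.31 for 8 degrees of freedom) would be a bit wider still, which only strengthens the point about small samples.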

In a small sample, we just can’t be extremely conclusive. In this case, nine games doesn’t tell us a whole lot. New Orleans started the season 8-0…they aren’t an 82-win team.

We can perform the same thought-experiment with Dirk’s nine games that we did with Kobe’s eight minutes to display how unstable these results are. Let’s say Dallas makes three more open 3’s against Cleveland and the Cavs miss three open 3’s. What would happen to the differential numbers?

  1. That alone would improve the point differential by two points per game (an 18-point swing spread over nine games).
  2. Our 95% confidence interval now becomes -12.1 points to +4.4 points.

That’s from adjusting just six open shots in a nine-game sample.
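The arithmetic behind the shifted average is simple enough to sketch. The helper below is hypothetical and only moves the mean. The wider interval comes from the extra spread that an 18-point change in a single game adds, which would need the game-by-game margins to recompute:

```python
def adjusted_differential(per_game_diff, games, extra_makes, opp_extra_misses, shot_value=3):
    """Per-game differential after swinging a handful of shots in one game."""
    swing = (extra_makes + opp_extra_misses) * shot_value
    return per_game_diff + swing / games


# Three more Dallas 3s plus three Cleveland misses, spread over nine games.
print(round(adjusted_differential(-5.9, 9, 3, 3), 1))  # -3.9
```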

Jason Terry — a player who benefits from playing with Dirk Nowitzki historically — had games of 3-16, 3-15 and 3-14 shooting without Dirk. He shot 39% from the floor in the nine games. By all possible accounts, Terry is better than a 39% shooter without Nowitzki. He shot 26% from 3 in those games. Let’s use his Atlanta averages instead, from when he was younger and probably not as skilled as a shooter: How would that change the way Dallas looks sans-Dirk?

Well, suddenly Terry alone provides an extra 1.7 points per game with his (still) subpar shooting. The team differential is down to -2.2 with a 95% confidence interval of -10.4 to +6.1. Just by gingerly tweaking a variable or two, the picture grows hazier and hazier.

Making Sense of it All

So, what can we say using On/Off data? It’s likely Dallas is a good deal better with Dirk Nowitzki. But, hopefully, we knew that already.

To definitively point to a small sample and say, “well this is how Dallas actually played without Dirk, so that’s his value for this year” ignores normally fluctuating variables (like Jason Terry or an open Cleveland shooter) that have little to do with Dirk Nowitzki’s value. So while such data reinforces how valuable Dirk is, we can’t say that’s how valuable he is.

We can’t ignore randomness and basic variance as part of the story.


In the last post, I looked at nine of the most explosive wing scorers of the past 25 years. In a 40-point game, the ball has to go in the hole frequently, thus, TS% is quite good on average in such games. But what about removing scoring from the equation and simply looking at shooting volume?

High-Volume Shooting

Let’s use field goal attempts to examine what happens when these players shoot a lot, setting the cutoff at 30 or more FGA’s in a game. These are high-volume games, in which efficiency counts for more than it does in lower-volume games.

Returning to variance, here are the standard deviations for the same nine players in 30+ FGA games. “Stdev” is the standard deviation for the statistic to its left:

Again, LeBron James is a beacon of consistency, although he only shoots 30+ shots about once in every 20 games. LeBron also shoots the ball much, much, much better than anyone else when he shoots it this much. Note the ridiculous TS%.

So does that translate to team success? Actually, no. The ONLY player of these nine perimeter scoring-machines to see his team’s win% increase when he shoots the ball so much is…you guessed it, Allen Iverson. (Kudos if you actually guessed it.) Below are the results, along with frequency of 30-shot games and relative true shooting percentage (Rel TS%):*

This, despite Iverson having a break-even relative TS% (only Wilkins was worse relative to the league environment in such games). That hints at the volume-efficiency tradeoff argument, because Iverson seems to be a player who can increase his volume (here, 95 of 561 games, or 17%, with 30 or more attempts) and maintain similar efficiency to his normal standard. That’s not a ringing endorsement for Iverson as a team cog, but it certainly helps to justify his role and value in a system like Philadelphia’s.

On the opposite end of the spectrum is Kobe Bryant, whose teams suffer mightily when he shoots the ball a lot. And, unfortunately, he’s done this about every eight games in his career. Bryant’s relative TS% in such games is almost 3% off his normal average in the same time period, and his scoring varies greatly. (How many players have a 40-point difference between their two highest FGA games?)

This is further evidence that good players can shoot too much. All of these stars, except for Allen Iverson, see a drop in their team win% in high-volume attempt games. Some might cry chicken-and-egg: are the star players suddenly shooting this much because the team is losing, or are they losing because of so much shooting? There is ample evidence that one player going rogue, or worse, forcing shots doesn’t help an offense in the first place. Being behind is no excuse to abandon ship and undertake a flawed strategy.

Coming full circle, as far as I know, there isn’t a single advanced metric that considers variance. Nor is there an advanced metric that takes into account team strength in matters like variance and volume. Means are beneficial, but wins are tallied after 48 minutes. It’s not like overall point differential — while a great predictor — determines playoff seedings. Perhaps we should look beyond averages and weigh consistency and team strength against those averages in individual player analysis.

*Relative TS% and win% difference are weighted by year. For example, if half of one’s 40-point games were in a single season, that one season’s TS% and win% differential accounted for half the weight in both categories.


In the last post, I examined different measures of variance in this generation’s Mt. Rushmore of wing players, LeBron, Kobe, Wade and Michael Jordan, all the while keeping in mind that it’s possible for inconsistent play to result in a few more wins on weak teams and fewer wins on good teams.

Of those four superstars, Kobe Bryant had the most games with “inefficient shooting” (under 50% True Shooting) and the fewest games with “efficient” shooting (over 60% True Shooting). However, we ignored the number of shots he attempted when he was shooting poorly or shooting well. Turns out, all four players shoot more when they’re shooting poorly. And of the group, Dwyane Wade has the biggest increase in FGA’s per 36 minutes in his inefficient shooting games. In order of change in FGA’s per 36 from good games to bad:

  1. Wade +1.2 (17.6 in good shooting games to 18.8 in bad ones)
  2. Kobe +0.9 (19.1 to 20.1)
  3. Jordan +0.4 (21.7 to 22.1)
  4. LeBron +0.2 (18.7 to 18.9)
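For anyone unfamiliar with the per-36 convention used in that list, it just rescales a counting stat to a 36-minute workload. A minimal sketch, with a made-up stat line for illustration:

```python
def per_36(total, minutes):
    """Scale a counting stat to a 36-minute workload."""
    return total / minutes * 36


# Hypothetical game: 22 field goal attempts in 40 minutes.
print(f"{per_36(22, 40):.1f}")  # 19.8
```

Putting everything on the same 36-minute footing keeps a longer night on the floor from looking like a higher-volume shooting night.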

Before we focus on attempts any further, let’s first look at what happens when elite wings score a lot.

High Volume Scoring

There have been just nine wing players with at least 25 40-point games since 1987 (the beginning of Basketball-Reference’s game logs):

  • Michael Jordan
  • Dominique Wilkins
  • Allen Iverson
  • Vince Carter
  • Kobe Bryant
  • Tracy McGrady
  • Gilbert Arenas
  • LeBron James
  • Dwyane Wade

We have our four usual suspects and five more players who collectively amassed 35 All-Star game appearances and 26 All-NBA nods. Not too shabby. Here is the volume and frequency of 40-point games from this group during their prime scoring years:

Not surprisingly, the greatest scorer in NBA history, Michael Jordan, dropped 40 in nearly one in every five games during his prime years. Yikes. Jordan isn’t the most efficient of the bunch in such games, though. That would be Gilbert Arenas, who boasts nearly 70% True Shooting in his 40-pointers:*

As expected, all these players increase their efficiency in 40-point games. Kobe’s shooting numbers are surprisingly low, though, residing next to those of someone labeled as an inefficient “chucker,” Allen Iverson. So Iverson and Kobe must not be helping their teams win those big games as much as their contemporaries. Right?


It turns out that Iverson’s teams actually improved the most when he scored 40 or more!*

In Iverson’s 72 40-point games, Philadelphia’s win% improved by nearly 20 percentage points. That’s a startling contrast: about 16 extra wins over the course of an 82-game season. But why would Iverson’s teams improve so much when he has the lowest relative TS% of the lot?

If we buy the argument that AI’s 76er teams lacked a scorer who could create his own offense — certainly a reasonable stance — then Iverson’s scoring explosions shored up that offensive deficiency and buoyed them to victory more often than his run-of-the-mill 25 or 30-point nights, regardless of the drop in efficiency relative to his peers. (This somewhat echoes Paine’s Monte Carlo run.) Besides, AI’s shooting efficiency in such games is still significantly better than both the league average and his own career average.

There’s also further evidence here supporting the idea that weaker teams are helped more by big performances: Jordan played on the best teams in this time period (win% with MJ in the lineup of .713) and saw the smallest change in team W-L when going for 40. From 1990-1998, once Chicago ascended to elite team status, the Bulls were 68-20 when Michael went for 40 or more, for a .773 win%, slightly worse than his team’s .779 win% (387-110) when he didn’t go for 40.

Tracy McGrady played on the second worst teams of these nine players (Arenas the worst). When McGrady was in Orlando (01-04) the Magic went 19-11 (.633) in his 40-pointers and 121-144 (.457) in his other games. Then he went to a better Houston Rockets team, and went 7-4 (.636) in 40-point games and 119-66 (.643) in other games.

The same reasoning explains why LA has fared so well despite Kobe’s lower efficiency numbers: many of Bryant’s games were in 03, and 05-07 when his team needed volume scoring. LA was 50-24 (.676 win%) in his 40-point games in those years, while going 112-119 (.485) in Kobe’s non-40 games. (In the other seasons, a .724 win% in his 29 40-point games and a .715 win% in all other games.)
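The splits quoted in the last few paragraphs can be lined up side by side with a few lines of code. This sketch only uses the W-L records cited above. The labels and the dictionary layout are mine:

```python
def win_pct(wins, losses):
    """Winning percentage from a W-L record."""
    return wins / (wins + losses)


# (record in 40-point games, record in all other games), as quoted above.
splits = {
    "McGrady, Orlando (01-04)": ((19, 11), (121, 144)),
    "McGrady, Houston": ((7, 4), (119, 66)),
    "Bryant, LA (03, 05-07)": ((50, 24), (112, 119)),
}

for label, (big_games, other_games) in splits.items():
    big = win_pct(*big_games)
    other = win_pct(*other_games)
    print(f"{label}: {big:.3f} vs {other:.3f} (gap {big - other:+.3f})")
```

The gaps are large for the two weaker teams and essentially zero for the better Houston team, which is the pattern the paragraphs above describe.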

So these players are helping bad teams with big scoring nights and not doing much for good teams with the same outbursts. Balance, it seems, is indeed better.

Yet we haven’t completely addressed the issue of what happens when players shoot a lot. That is the topic of Part III.

*Relative TS% and win% difference are weighted by year. For example, if half of one’s 40-point games were in a single season, that one season’s TS% and win% differential accounted for half the weight in both categories.
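As a rough sketch of the weighting described in that footnote, each season’s value counts in proportion to how many of the player’s 40-point games fell in that season. The season numbers below are invented purely for illustration:

```python
def weighted_by_games(seasons):
    """Average per-season values, weighting each season by its game count.

    `seasons` is a list of (games_in_sample, value) pairs.
    """
    total_games = sum(games for games, _ in seasons)
    return sum(games * value for games, value in seasons) / total_games


# Hypothetical player: 20 forty-point games across three seasons,
# with a relative TS% (player minus league) for each season.
seasons = [(10, 0.04), (5, 0.01), (5, -0.02)]
print(f"{weighted_by_games(seasons):+.4f}")  # +0.0175
```

Here the first season holds half of the games, so it carries half of the weight, exactly as in the footnote’s example.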


Last June, Neil Paine over at Basketball-Reference examined consistent vs. inconsistent performances by Kobe Bryant and LeBron James vs. the Boston Celtics. Using one catch-all metric (statistical plus-minus), James and Bryant had similar average performances over the course of their series. But their game-to-game performances varied greatly; James was high-variance — some great games and some awful ones — while Bryant was steadier throughout. If we buy Neil’s simple Monte Carlo simulation, his findings were:

  • Good teams are helped more by a consistent player
  • Average teams are helped more by a consistent player
  • Bad teams are helped more by a high-variance player

This makes sense to a certain degree: big performances by stars can boost bad teams to wins they otherwise wouldn’t have had, and the bad performances still result in losses they probably would have incurred anyway. In theory, the inverse would hold true for good teams and really bad performances by stars.
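The intuition can be sketched with a toy Monte Carlo. To be clear, this is not Neil’s simulation. It assumes the rest of the team’s scoring margin is normally distributed around a fixed edge, and that the star adds a zero-mean swing whose spread is the only thing that differs between a steady star and a volatile one. All of the numbers are made up:

```python
import random


def win_rate(team_edge, star_sd, games=20000, rest_sd=10.0, seed=0):
    """Share of simulated games won, given the team's average edge in points."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(games):
        rest_of_team = rng.gauss(team_edge, rest_sd)  # everyone else's margin
        star_swing = rng.gauss(0.0, star_sd)          # star's game-to-game swing
        if rest_of_team + star_swing > 0:
            wins += 1
    return wins / games


for label, edge in [("good team", 5.0), ("average team", 0.0), ("bad team", -5.0)]:
    steady = win_rate(edge, star_sd=2.0)
    volatile = win_rate(edge, star_sd=10.0)
    print(f"{label}: steady star {steady:.3f}, volatile star {volatile:.3f}")
```

In this toy model the good team wins more often with the steady star and the bad team wins more often with the volatile one. The average team lands near .500 either way, so the sketch reproduces the good-team and bad-team findings but not the average-team one.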

Last year’s NBA Finals aside, Bryant is actually more high-variance than James using measures like points, FG% and GameScore. (GameScore is a rough measure of productivity for a single game.) Below is a comparison of variance between the best wings of my lifetime, Kobe (2001-2010), LeBron (2006-2010), Dwyane Wade (2006-2010) and Michael Jordan (1987-1998):

“Stdev” is the standard deviation of the statistic to its left. If we use a summary statistic like GameScore, LeBron wins the consistency battle handily. Jordan would place second by virtue of his ridiculous 25.3 average GameScore, then Wade and Kobe by the same logic.
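One way to read “by virtue of his average” is to judge the standard deviation relative to the mean, which is what the coefficient of variation does. The post doesn’t name that statistic, so this is my interpretation, and the game logs below are invented for illustration:

```python
import statistics


def consistency_summary(values):
    """Mean, sample standard deviation, and their ratio (coefficient of variation)."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return {"mean": mean, "stdev": stdev, "cv": stdev / mean}


# Two hypothetical GameScore logs with identical spread but different averages.
player_a = [30, 20, 28, 22, 25]
player_b = [23, 13, 21, 15, 18]

for name, log in [("A", player_a), ("B", player_b)]:
    s = consistency_summary(log)
    print(f"Player {name}: mean {s['mean']:.1f}, stdev {s['stdev']:.2f}, cv {s['cv']:.3f}")
```

Both logs have the same standard deviation, but player A’s swings are smaller relative to his average, so he comes out as the more consistent of the two by this measure.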

If we focus on consistency of shooting and scoring, LeBron wins again. (LeBron outpacing the field is becoming a theme on this blog.) Of course, one could argue LeBron played with a weaker team from 2006-2010, so higher variance would be better when compared to Kobe and Jordan. But unlike Neil’s Monte Carlo run, LeBron’s averages are significantly higher than Kobe’s and Wade’s to begin with.

Kobe, not surprisingly, is higher variance with his FG% — easily the lowest of the lot — and in particular with his scoring performances. But only looking at standard deviations overlooks the importance of the averages. A lower average means more poor shooting games.

EDIT: Bryant’s GameScore standard deviation is 9.5 (mean 22.0) from 2005-2007 on his “weak” teams.

Another way to view consistency is by frequency of games, delineated in a specific range. For instance, we can call games over 60% True Shooting (TS) “efficient” shooting games and games under 50% TS “inefficient” shooting games.

Player            Efficient Games (> 60% TS)    Inefficient Games (< 50% TS)
Michael Jordan    41.4%                         20.5%
LeBron James      40.1%                         21.6%
Dwyane Wade       37.8%                         26.7%
Kobe Bryant       35.3%                         29.3%
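For reference, True Shooting is points divided by twice the number of shooting possessions, with free throw attempts counted at 0.44 apiece. Here is a minimal sketch of how a single game would be sorted into the buckets above. The two box-score lines are made up, the “neutral” label for the middle band is my own, and the function assumes at least one attempt:

```python
def true_shooting(points, fga, fta):
    """True Shooting percentage: PTS / (2 * (FGA + 0.44 * FTA))."""
    return points / (2 * (fga + 0.44 * fta))


def classify(ts):
    """Bucket a game using the 60% and 50% cutoffs from the table."""
    if ts > 0.60:
        return "efficient"
    if ts < 0.50:
        return "inefficient"
    return "neutral"


# Hypothetical games: (points, field goal attempts, free throw attempts).
for points, fga, fta in [(30, 20, 10), (20, 22, 4)]:
    ts = true_shooting(points, fga, fta)
    print(f"{points} pts on {fga} FGA / {fta} FTA: TS% {ts:.3f} -> {classify(ts)}")
```

The share of a player’s games landing in each bucket is what the table reports.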

Now Bryant’s shooting inconsistency can be seen more clearly. While James and Jordan have an efficient game twice as often as an inefficient one, Kobe shoots well a little more than 1/3 of the time, and shoots poorly a little less than 1/3 of the time. And, if we come full circle to the original claim about consistency helping good teams, that doesn’t bode well for Kobe Bryant’s impact on wins relative to his averages.

For those visually curious, and for the sake of consistency, here is the distribution of TS% for all games played in the respective time frames:

The frequency of games based on TS% for elite wings. Frequency (y-axis) is the percentage of games a player shot a given TS% (x-axis) for the following years: Jordan (87-98) Kobe (01-10) LeBron (06-10) and Wade (06-10).

Of course, none of this accounts for volume — in theory, players should shoot more when they shoot well, and shoot less when they shoot poorly. And that is the topic of the next post: high-volume scoring games.
