Quote: Originally Posted by **iatre**
Thanks for this. I have a question about the breakdown by champion. One would expect that there would be some natural spread in the punish rates between various champions on a completely random basis. That is, Irelia might be punished just as often as Leona is as the number of cases goes to infinity, but in your sample, you'd expect a random spread, possibly fit to a normal distribution, with most punish rates clustered near 60, but some random champs are outliers in either direction.

You'd need to perform a Bayesian analysis on the data to see if there is any real effect or just statistical noise.

I don't have my Ph.D yet, so Lyte would probably be able to provide a clearer explanation. :-p

I know I haven't been explaining it, but I have been checking my data for statistical significance as I go.

To answer your question, you are correct that I should expect some random variation between the top and bottom conviction rates (or any other table ordered by champion) through chance alone.

There are two ways to check how likely it is that this is what actually happened. (Note: This is going to get technical.)

The first way is to build a generalized linear model of the situation, with the reported players as the dataset, case verdict as the response variable, and champion as the explanatory variable.

Because the response variable is a two-level factor (convicted or not), this model is best built using logistic regression. Doing so produces not only a predicted conviction probability for each champion but also a p-value for each champion: the chance that a result that extreme (for that individual champion) could have occurred by chance.
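I don't know what tools the author actually used (most likely R's `glm`). As a rough, self-contained illustration of the idea of "a p-value per champion", here is a sketch that skips the full logistic regression and instead tests each champion's conviction count against the overall rate with a normal-approximation binomial test. The champion names and counts are made up for the example:

```python
import math

def binom_p_value(k, n, p0):
    """Two-sided p-value that k convictions out of n cases could arise
    by chance if the true conviction rate were p0 (normal approximation)."""
    mean = n * p0
    sd = math.sqrt(n * p0 * (1 - p0))
    z = (k - mean) / sd
    # two-sided tail probability from the standard normal CDF (via erf)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical data: champion -> (cases, convictions)
cases = {"Irelia": (400, 270), "Leona": (350, 190), "Trundle": (60, 40)}
overall = sum(k for _, k in cases.values()) / sum(n for n, _ in cases.values())

for champ, (n, k) in cases.items():
    print(champ, round(k / n, 3), round(binom_p_value(k, n, overall), 4))
```

A champion with many cases needs only a small deviation from the overall rate to reach significance, while a rarely played champion needs a large one, which is exactly the popularity effect described below.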

Not all 100 champion p-values in this model indicate significance; whether one does depends on the champion's relative popularity (sample size) and on how dramatically its rate differs from the mean. At the classic p ≤ 0.05 threshold, 35 do.

Not every champion listed in my top and bottom 5 shows statistical significance this way, but most do (Trundle and Xerath do not, which I noted in the post, and Maokai and Skarner are borderline).

If this were an ordinary linear model, the next step would be an analysis of variance on the champion variable as a whole, to tell us whether using champion to explain verdict is viable in concept. However, that calculation cannot be applied to a logistic regression.

The next best thing, then, is a Monte Carlo simulation: I compare my real results to a block of 1,000 fake results in which verdicts were assigned randomly (but in the same overall proportion) to the reported players. If the fake data show values as extreme as the real data, I can presume there is no real effect.

These results are in logits (log-odds, ln(p/(1-p))), not percentages. In my real data, the gap between the lowest and highest conviction rates is 1.23. In my simulations, the gap averages around 0.88 with a standard deviation of 0.19, which puts my real data just inside the classic 95% confidence interval.
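For concreteness, the arithmetic behind "just inside the 95% confidence interval" works out like this, using the numbers quoted above:

```python
real_gap, sim_mean, sim_sd = 1.23, 0.88, 0.19

# How many simulated standard deviations above the simulated mean is the real gap?
z = (real_gap - sim_mean) / sim_sd
print(round(z, 2))  # 1.84, just under the 1.96 cutoff for the two-sided 95% interval
```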

If I were writing a biology paper, I would be sad and depressed and wouldn't be able to publish. But I'm not, I'm writing about video games in my spare time. So there you go. Some of my data are on the edge of sketchy.

(Note: I did run the first check (build a model, make sure the terms are significant) on pretty much everything I presented, but not the second. Most other questions showed far more significance than this one, though, so I'm not worried.)