I greatly appreciated and enjoyed your thoughtful response! Thank you!

Quote:

**Goumindong**:

Yes and no. In that at any particular point things may not be perfect, but that they are always moving towards it. This isn't a problematic issue and exists in all rating systems which continually add people because adding people adds uncertainty, at the very least, to those people who recently enter. Less so to others if the population is large, which it is.

We can see how it adds less uncertainty by pretending that your ranking wasn't a number, but rather a partial order with each person lined up in order of ranking. If there are three people in line and one person enters and then finds their proper spot, the total order has moved significantly relative to the middle point. The middle person, who was previously ranked in the middle, is now ranked in either the second or third quartile. If three hundred people are in line the total order has moved very little relative to the middle point.

Since ratings are more or less enforced to average to 1200 adding new people doesn't do much to increase uncertainty in your ELO.

That is to say that new entrants are only problematic in that they consistently make it more difficult to predict the games which have the new entrants.

And if the number of "new" entrants is sufficiently large relative to the active base, particularly in certain ranges, it becomes a big problem. Given slow convergence properties (estimated 200 games on average), that's a very big problem.

Quote:

**Goumindong**:

ELO, like TrueSkill, is a heuristic to achieve Bayesian updating. ELO is just a common name for it. There are limits to how fast we can become more certain of a value with Bayesian updating, which are a function of the amount of information contained in the newly added data points. Less information relative to the amount of information currently possessed means slower updating. Adding players to a game reduces the amount of information that we have available on the skill level of players. As such, there is no system which "converges well" for multiplayer games where the players are not static. This is not a function of the ranking system you use, it is a function of the information contained in the data points.

Fair enough, but there are certainly ways to make the problem worse - implementation details matter.

The exact method used to allocate ELO gain or loss to individual players could matter a great deal.

Think about it this way - let's assume that we created a model to predict the impact that players with higher and lower skill scores have on a game (and we need such a model for accurate matchmaking). Will the model simply be the average of the scores? I submit that it is far more likely to have meaningful features - - e.g. an over-performing low score or an under-performing high score may have a much larger impact on the modeled probability. As a result, allocating ELO gain or loss irrespective of this fact introduces more noise into an already noisy system.

Quote:

**Goumindong**:

Many people think that because the original ELO system did not have a system to measure uncertainty that it is not a heuristic Bayesian updater, which is wrong; it simply requires fewer computations and a few more assumptions. It is also likely not as swift in updating as others.

Which is, in fact, a major problem.

Quote:

**Goumindong**:

There is one interesting thing that TrueSkill does, but it's not actually pertinent to how it matches [well, it may be, but probably not]. In that it lists your skill not as the mean, but as the mean minus three standard deviations. I.e. your "True Skill" rating is the system's "I am 99% confident you're better than this" point.

Quote:

**Goumindong**:

Actually in League "new" players to ranked are pretty average. It's one of the advantages of having to play 200+ games and own 16 champions before starting ranked. There are no "newbies".

I will admit that I find this a bit difficult to believe - - although the qualification of "new to ranked" makes it far more plausible. As mentioned elsewhere, I'm not sure that average ELO after provisional is the right statistic.

Look at it this way - - either a) you might as well stop playing after 200 games, because you aren't going to get any better and the quality of games you experience will not change or b) experience DOES matter even after 200 total games and failing to consider this factor in matchmaking will lead to poor matchmaking.

Quote:

**Goumindong**:

If newer players experience an unfair downdraft then they're more skilled than their teammates and the probability of them winning matches increases. ELO hell does not exist. More trolls and AFKs will exist on the other team, creating a net advantage for the player who is legitimately more skilled than his ranking suggests.

Think about flows. Use your own example of the rank ordering. If there is a constant flow of players into the middle of the distribution, and you are below that middle but should be moving up... the constant flow into the middle above you will prevent you from moving above the midpoint.

This wouldn't matter if those flowing in truly were distributed the same as those already in the population... but that would be truly remarkable... a game with no experience curve.

Now, if the total population was relatively static, this might still work out... as the accumulation of players below our hypothetical person in ELO hell would eventually start to "push" them up... but in LoL it is possible and likely that below a certain threshold players simply drop out of the game - which makes the flow of players from the midpoint downward the dominant feature.

Quote:

**Goumindong**:

No. The probability of seeing a 14 game losing streak is not 0.006%. The probability of seeing a 14 game losing streak in *any particular 14 games assuming that the probability of winning the game is stable at 50%* is about 0.006%. But... and this is a big but. The probability of seeing at least one 14 game losing streak in a large number of ranked games, making the same stable win probability assumption, is actually quite high. I don't have numbers since IIRC doing this requires simulation and I am lazy. But I can tell you that the probability of seeing 9 heads or 9 tails in a row if you flip a coin 1000 times is about 80%, 10 in a row is about 60%.

Since large streaks can "feed" on themselves due to poor play from psychological factors, it's not really a surprise for this to be seen in "large" numbers.

It would be interesting to know if large streaks were common or uncommon [I suspect large streaks of losses are more common than large streaks of wins] but I am not sure that knowing that we see large streaks of losses more often than large streaks of wins tells us anything particularly interesting about our matchmaking system.

You are right, of course. I don't get to see the data, and streaks are not really the most important feature... but it's one of the few features I can see that hints at whether or not matchmaking is actually working.
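For what it's worth, the streak number doesn't strictly need simulation. Here is a rough sketch that computes it exactly by stepping through the games one at a time and tracking the length of the current losing run. It assumes every game is an independent coin flip with a fixed loss probability, which is exactly the assumption in question (no tilt, no matchmaking effects), and the function name is just something I made up.

```python
def prob_losing_streak(n_games: int, streak: int, p_loss: float = 0.5) -> float:
    """Probability of at least one run of `streak` consecutive losses
    somewhere in `n_games` independent games."""
    # state[k] = probability that the current losing run has length k
    # and no run of length `streak` has happened yet.
    state = [0.0] * streak
    state[0] = 1.0
    hit = 0.0  # probability that a full streak has already occurred

    for _ in range(n_games):
        new_state = [0.0] * streak
        for run_length, prob in enumerate(state):
            if prob == 0.0:
                continue
            # A win resets the current losing run to zero.
            new_state[0] += prob * (1.0 - p_loss)
            # A loss extends the run, possibly completing the streak.
            if run_length + 1 == streak:
                hit += prob * p_loss
            else:
                new_state[run_length + 1] += prob * p_loss
        state = new_state
    return hit


if __name__ == "__main__":
    print(f"exactly 14 games: {prob_losing_streak(14, 14):.6%}")
    print(f"1000 games:       {prob_losing_streak(1000, 14):.2%}")
```

Under those assumptions a single player should land at roughly 3% over 1000 games, so across a large player base plenty of people will see such a streak even if matchmaking is working perfectly. That is why streaks alone can only hint at a problem.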

In the end, it is encouraging that Riot is looking at these issues seriously. As I have said before, it's really the experience of it that is important to me. I don't have any illusion that some change in matchmaking is going to suddenly vault my ELO.

What I do hope for is something (anything!) that allows for more actually challenging and competitive games, where one player on EITHER team doesn't have such a disproportionate impact on the outcome of the game. (And unfortunately, despite what some from Riot have suggested, my experience is that the impact of Feed >>> Carry, not the reverse)

Ironically, I actually put a beginning-of-the-game AFK well below a severely outclassed player in terms of how decisive they are to a team's outcome.

ELO hell does exist; it is an __experience__ that your players are having of feeling that the quality of their own play doesn't matter in games, and in lower ELOs that particularly blatant and egregious behavior (e.g. afks) is the main factor.

Maybe it's just a psychological artifact of people not being able to perceive how their play increased the probability of winning by 3% (I estimate that even in a best case scenario, a player who for whatever reason is at the "wrong" ELO level can expect at best to experience a win rate of 55% or so - - in other words, in 20 games they will win 11).
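To put a rough number on how small a 55% edge is, here is a back-of-the-envelope sketch. It assumes the textbook ELO expected-score curve (logistic, 400-point scale), which may not match what Riot actually runs, and it uses a simple normal approximation for the "how many games until it's noticeable" part. The function names are mine.

```python
import math


def elo_expected_score(rating_gap: float) -> float:
    """Textbook ELO expected score for a player `rating_gap` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def gap_for_win_rate(win_rate: float) -> float:
    """Inverse of the curve above: the rating gap implied by a given win rate."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def games_to_notice(win_rate: float, z: float = 2.0) -> int:
    """Games needed before `win_rate` sits about `z` standard errors above 50%
    (normal approximation to the binomial; a rough heuristic, not a formal test)."""
    edge = win_rate - 0.5
    return math.ceil(z * z * win_rate * (1.0 - win_rate) / (edge * edge))


if __name__ == "__main__":
    print(f"rating gap implied by a 55% win rate: {gap_for_win_rate(0.55):.1f}")
    print(f"games before a 55% win rate stands out: {games_to_notice(0.55)}")
```

On the textbook curve a 55% win rate corresponds to a gap of only about 35 points, and it takes on the order of 400 games before that edge clearly separates from a coin flip. From the inside, a misplaced player's climb would be very hard to tell apart from noise.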

Or, maybe, there is something about either the structure of the game itself, or the matchmaking and ELO system that causes individual players to have a greatly disproportionate impact on a game's outcome.

Either way, it is a miserable gaming experience... and causes players to be frustrated, leading in turn to raging and trolling and other toxic behaviors.

Sounds like hell to me.