### Statistical Analysis of Tribunal Cases

Ulanopo

Senior Member

Quote:
Originally Posted by onihero
I didn't see this, might have missed it.

What's the % of reports from a loss vs. the % of reports from a win?
He doesn't cull out the exact data you're looking for, but in section 1, he says:

Quote:
Next up, how does total number of reports and total number of games figure into things?

First, how many reports does the typical game have? The median case has 5, the average case has 5.48. The mode (most common value) is 4. Two cases in the sample have no reports (both are pardons); I can only assume that is due to a Tribunal bug. There are very few cases above 15 or so reports, but one enterprising person managed to collect 33.

As reports go from 1 to 7, the chance of punishment increases quickly from around 20% to around 80%, then climbs very slowly thereafter.

We can do a similar analysis of the number of games in a case. The mean is 2.73, the median is 3, and the mode is 2. Pardon chances range from 62% in single-game cases down to 24% in 5-game cases.

Lastly, we can consider reports per game. Again, as reports per game goes up, so does the chance of punishment.
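
For anyone who wants to reproduce this kind of summary, here is a minimal Python sketch. The case records are made up purely for illustration (they are not the Tribunal sample), and the tuple layout and function name are assumptions, not the original author's code.

```python
from collections import defaultdict
from statistics import mean, median, mode

# Hypothetical cases: (number of reports, was the player punished?)
cases = [(1, False), (2, False), (4, True), (4, False), (4, True),
         (5, True), (6, True), (7, True), (9, True), (12, True)]

report_counts = [reports for reports, _ in cases]
summary = {
    "mean": mean(report_counts),
    "median": median(report_counts),
    "mode": mode(report_counts),
}

def punish_rate_by_reports(cases):
    """Return {report count: fraction of cases punished}."""
    totals = defaultdict(int)
    punished = defaultdict(int)
    for reports, was_punished in cases:
        totals[reports] += 1
        punished[reports] += int(was_punished)
    return {n: punished[n] / totals[n] for n in sorted(totals)}

rates = punish_rate_by_reports(cases)
```

The same grouping works for games per case or reports per game: swap the first element of each tuple for the quantity of interest.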

zliplus

Senior Member

Quote:
Originally Posted by gnfnrf
3.) By deciding how many reports and incidents to collect before generating a case, Riot controls the punish rate.

The difference in punish rates between cases with very few reports or incidents and many is stronger than nearly any other describing factor. But these factors aren’t directly within the control of the reported player; they are under Riot’s control. Rather than making a 1 report Tribunal case that has an 80% Pardon rate, Riot could wait for more reports before generating it, or decide not to submit it at all. Sometimes they do, and sometimes they don’t. I don’t know why.
I recall a red post saying that the Tribunal includes some 'fake' or 'trap' cases that should be pardoned in order to catch 'bad' reviewers. This may account for some of the low-punish cases, and it lends weight to the idea that reviewers also have some weight assigned.

Edit: I forgot to say this, but excellent work! Very valuable and interesting read, and hard data and conclusions provide for much discussion.

gnfnrf

Senior Member

Quote:
Originally Posted by onihero
I didn't see this, might have missed it.

What's the % of reports from a loss vs. the % of reports from a win?
Huh. I ran this very early on, but never included it in the report.

The team with the reported player wins 25% of the time, loses 72.5% of the time, and the reported player left before the end of the game 2.5% of the time.

onihero

Senior Member

Quote:
Originally Posted by gnfnrf
Huh. I ran this very early on, but never included it in the report.

The team with the reported player wins 25% of the time, loses 72.5% of the time, and the reported player left before the end of the game 2.5% of the time.
Can you give a breakdown, with percentages, of what winning players were reported for versus losing players?

And what program are you using to break this stuff down?

gnfnrf

Senior Member

Quote:
Originally Posted by onihero
Can you give a breakdown, with percentages, of what winning players were reported for versus losing players?

And what program are you using to break this stuff down?

I only have a single "most common report reason" per case, and win/loss is determined per game. The data won't match up well.

As for what I'm using, I wrote a Python program to download the data, and I've written a series of other programs to process it, modifying them when I think of something new to gather. Once it's processed, I write it to a .csv, which I load into R to do the actual data analysis.
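
The hand-off step described above (processed cases written to a .csv for R) might look roughly like the sketch below. The field names and sample rows are hypothetical, since the thread does not give the real ones, and an in-memory buffer stands in for the actual file.

```python
import csv
import io

# Hypothetical processed cases; the real field names are not given in the thread.
processed = [
    {"case_id": 1, "reports": 5, "games": 3, "verdict": "punish"},
    {"case_id": 2, "reports": 1, "games": 1, "verdict": "pardon"},
]

FIELDS = ["case_id", "reports", "games", "verdict"]

def write_cases_csv(rows, handle):
    """Write a header line plus one row per case to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

# StringIO stands in for open("cases.csv", "w", newline="")
buffer = io.StringIO()
write_cases_csv(processed, buffer)
csv_text = buffer.getvalue()
```

A file written this way can then be loaded on the R side with `read.csv("cases.csv")`.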

onihero

Senior Member

Quote:
Originally Posted by gnfnrf

I only have a single "most common report reason" per case, and win/loss is determined per game. The data won't match up well.

As for what I'm using, I wrote a Python program to download the data, and I've written a series of other programs to process it, modifying them when I think of something new to gather. Once it's processed, I write it to a .csv, which I load into R to do the actual data analysis.

Jeez, that's a lot of work, thanks for doing it!

R itself is a lot of work to manipulate! I never messed with it much myself, though. Minitab is my old standby, plus a bit of dabbling with Maple.

In regards to this section of your analysis:

Quote:
27.7% Intentionally feeding
25.8% Offensive Language
21.3% Verbal Abuse
14.1% Negative Attitude
7.8% Assisting Enemy Team
1.5% Inappropriate Name
1.2% Spamming

A small fraction of cases have no report type specified.

Remember, this is the proportion of most common report types, not individual report types. Refusal to communicate does not appear at all, meaning it is very uncommon.

Most of these types are punished at roughly the same rate as cases as a whole. Only one shows real statistical significance: Inappropriate Name cases, which are punished more often (at a 69% conviction rate).
I am concerned about the high % of intentional feeding reports and how you reflect that they are punished at the same rate as the other cases. Are you able to ascertain if those intentional feeding reports are being punished as another report type or if they are actual intentional feeders?

Oh, and you stated what? 1400 permabans a week? What?!?! That's a huge number!

mackman

Senior Member

Man, this is fantastic stuff. Great work statistically demonstrating that "punish spam" does not, in fact, exist (or at least does not impact the final judgement). Kudos to you for doing all this work: Should you ever send a resume to Riot, you could include this as an example of you performing LIKE A BOSS without any compensation.

Exploding Barrel

Senior Member

Quote:
Originally Posted by onihero
Oh, and you stated what? 1400 permabans a week? What?!?! That's a huge number!
That's 0.013% of the active player base each week (based on the 11 million active players number released some time ago).

Actually seems pretty small compared to how many people appear to deserve it.
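
The arithmetic behind that percentage is easy to check. A quick sketch, using only the two figures quoted in this thread (the variable names are mine):

```python
permabans_per_week = 1400      # figure quoted in the thread
active_players = 11_000_000    # "11 million active players" figure quoted above

# Share of the active player base permabanned each week, as a percentage
share_percent = permabans_per_week / active_players * 100

# Equivalent framing: how many active players there are per weekly permaban
players_per_ban = active_players / permabans_per_week
```

Rounded to three decimal places, `share_percent` comes out to 0.013, which works out to roughly one permaban per 7,857 active players each week.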

Mokkun

Senior Member

The number of incidents in a Tribunal case is a recency thing. Some number of games/reports is necessary to generate a case file, and some critical mass is required to go to Tribunal. The Tribunal then picks a sample of recent incidents to show the judges. This sample has a more recent cutoff for what can show up, so some potentially relevant incidents won't appear because they are too old.

You can argue that that's terrible, because it causes people to be pardoned more often, despite having garnered that many reports.

Or you can argue that it shows improvement since the beginning of the case file, and should in fact act as a positive buffer against punishment for those who get sent to Tribunal.

I suppose it depends on your view. I'd personally prefer to always see at least 3, because I don't feel like 3 is a huge number, but 3 establishes a pattern.

Edit

@Oni I think we're seeing a high rate of people punished in cases with Intentional Feeding, because all he's seeing is "most common report type," as opposed to specific numbers of reports. Since the rules say that people who are guilty of any offense are still guilty, regardless of report type, they are being punished for other behaviors (I'd guess chat related), as opposed to the "intentional feeding" report. I know there have been a lot of cases where I've wanted to flag abusive feeding reports, but still hit punish for other offenses.
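
To make the "most common report type" point concrete, here is a small Python sketch. The case below is invented for illustration, and the helper names are hypothetical; it only shows how collapsing a case to its single most common reason hides the other report types filed in the same case.

```python
from collections import Counter

# Hypothetical case: one report reason per report filed against the player.
case_reports = [
    "Intentionally feeding", "Intentionally feeding",
    "Verbal Abuse", "Offensive Language", "Intentionally feeding",
]

def most_common_reason(reasons):
    """Return the single most frequent report reason in a case."""
    return Counter(reasons).most_common(1)[0][0]

def reason_breakdown(reasons):
    """Return {reason: share of reports} so minority reasons stay visible."""
    counts = Counter(reasons)
    return {reason: n / len(reasons) for reason, n in counts.items()}

top = most_common_reason(case_reports)
breakdown = reason_breakdown(case_reports)
```

Here the case would be filed under "Intentionally feeding", even though 40% of its reports are chat related. Telling the two explanations apart would need the per-report breakdown rather than just the top reason.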

onihero

Senior Member

I suspect the same, Mokkun, but I'm curious whether he can discern it one way or the other with his data. If that is not what's happening, it is certainly a concern.