Alrighty, for those of you who are interested, here's a more in-depth look at how I came to the conclusions above. Beware, math ahead! Also, while I'm pretty good at math/statistics, some of you are probably better at what I'm trying to do here, so please pitch in if you see a way to help!
First off, like I said before, I collected data from around 175 games. The factors I collected were: Win/Loss, Surrender, Level, XP, Base IP, Time IP, Leaver IP, FWotD IP, Total IP, Kills, Deaths, Assists, Minions, Turrets, and Gold. These were the stats that I figured would have the highest correlation to XP.
Then, at specific levels, I ran correlations between XP and a few of the factors. For example, I played 17 games at level 27, and here are the correlation results:
Kills: -24%
Deaths: -38%
Assists: 48%
Minions: -29%
Turrets: 61%
Gold: 5%
Base IP: 97.6%
Base IP + Time IP: 97.6%
Base IP + Time IP + Leaver IP: 99.99%
Looking at these numbers, it was pretty obvious that the real factors involved here were the IP components, excluding FWotD. The high correlation with Assists and Turrets is due to covariance with wins, which have a strong impact on IP, and therefore XP. Simply put: if you win, you get more IP and XP, AND if you won, you probably killed more turrets. So turret kills are important for winning, but not for calculating XP directly.
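For anyone who wants to run the same kind of check on their own games, here is a minimal sketch of a Pearson correlation in plain Python. The per-game numbers below are made up for illustration and are NOT the data set from this post; the function name `pearson` is just a label chosen for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical rows, one entry per game (invented for this sketch).
xp      = [140, 95, 150, 90, 160, 100]
ip      = [85, 57, 91, 55, 97, 61]
turrets = [3, 0, 2, 1, 4, 0]

print(f"IP vs XP:      {pearson(ip, xp):.1%}")
print(f"Turrets vs XP: {pearson(turrets, xp):.1%}")
```

In made-up data like this, turrets still correlate with XP simply because both go up in wins, which is the same covariance effect described above.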
From here, I was hoping for the following result: I wanted the equation to be modeled by XP = IP * ( LVL * m + b), because this is very simple. Simple for me means simple for Riot, so I figured I had a decent shot. But, to be sure this is how they did it, I had to rule out some other possibilities.
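Written out as code, the hoped-for model is tiny. This is only a sketch of the form XP = IP * (LVL * m + b); the values passed for `m` and `b` in the example call are placeholders picked for illustration, not the fitted trend line.

```python
def estimate_xp(ip, level, m, b):
    """Estimate XP for one game under the hypothesised model XP = IP * (LVL * m + b)."""
    return ip * (level * m + b)

# Placeholder slope and intercept (assumptions for this sketch, not fitted values).
PLACEHOLDER_M = 0.03
PLACEHOLDER_B = 0.85

for ip in (60, 90):
    xp = estimate_xp(ip, level=27, m=PLACEHOLDER_M, b=PLACEHOLDER_B)
    print(f"level 27, {ip} IP -> about {xp:.1f} XP (placeholder coefficients)")
```

The useful property of this form is that, at a fixed level, XP is strictly proportional to IP: double the IP and the estimated XP doubles too.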
For example, running a linear regression on the Level 27 games, with IP on the x-axis and XP on the y-axis, I got the first graph.
Wins and losses sort themselves, to the right and left respectively, and you can see how well they fit the trend line. However, the fact that the equation is "y = 1.6487x+2.8688" is a problem. To keep our final equation simple, we must have the form "y=mx."
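To see the difference between the two forms, here is a rough sketch that fits the same points both ways: ordinary least squares with an intercept ("y = mx + b") and least squares forced through the origin ("y = mx"). The data points are invented for illustration, and the function names are just labels for this example; the graph itself came from a spreadsheet trend line, not from this code.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b. Returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x

def fit_through_origin(xs, ys):
    """Least squares for y = m*x, with the intercept forced to zero. Returns m."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Hypothetical (IP, XP) pairs for a single level (invented for this sketch).
ip = [55, 57, 61, 85, 91, 97]
xp = [90, 95, 100, 140, 150, 160]

m, b = fit_line(ip, xp)
m0 = fit_through_origin(ip, xp)
print(f"with intercept:  y = {m:.4f}x + {b:.4f}")
print(f"through origin:  y = {m0:.4f}x")
```

If the intercept from the first fit is small compared to the XP values, the two slopes come out close to each other, which is what you'd want to see before dropping the intercept.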
So, what I had to determine was whether the 2.8688 intercept was nonzero only because of randomness. To cut a long story short, I made graphs like this for each level, then ran a correlation between the intercept value and the level. The intercepts showed no correlation with level, which means they are likely due to randomness. They didn't quite center around zero, but they were close enough to ignore.
The next thing I had to show was that the XP/IP ratio increased with level in a linear fashion. What I had to rule out were other trend lines (like square, cubic, exponential... etc.) and, more importantly, stair-step patterns like the one used for XP per level. I created a scatter plot with level on the x-axis and all the XP/IP proportions on the y-axis, and got the second graph.
Now, it's unfortunate that I have no data at the center of this graph, but I think the trend line pretty obviously works and works well. There is no evidence of stair stepping. There does seem to be a slight curve to the graph, but the second-order polynomial equation is "y = 0.0001x^2 + 0.0272x + 0.8559", which has a laughably small x^2 coefficient, which pretty much shows that, if the curve is there, it's ignorable. So, this is the end of the analysis: we can just use the trend line as our estimation equation.
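A quick way to check how much the x^2 term actually contributes is to plug a few levels into the quoted polynomial and compare the terms. The coefficients below are the ones from the equation above; the choice of levels to print (1 through 30) is an assumption about the level range, and the function name is just a label for this sketch.

```python
# Coefficients quoted in the post: y = 0.0001x^2 + 0.0272x + 0.8559
A, B, C = 0.0001, 0.0272, 0.8559

def quadratic_terms(level):
    """Return the (x^2 term, x term, constant term) of the fitted polynomial."""
    return A * level ** 2, B * level, C

def quadratic_share(level):
    """Fraction of the predicted XP/IP ratio that comes from the x^2 term."""
    quad, lin, const = quadratic_terms(level)
    return quad / (quad + lin + const)

# Levels chosen for illustration (assumes levels run from 1 to 30).
for level in (1, 10, 20, 30):
    quad, lin, const = quadratic_terms(level)
    print(f"level {level:2d}: x^2 term {quad:.4f}, "
          f"ratio {quad + lin + const:.4f}, share {quadratic_share(level):.1%}")
```

The share grows with level, so the top of the level range is where dropping the x^2 term costs the most; that is the place to look when judging whether the straight-line approximation is good enough.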
I already went over how I used the equation to build the table of games needed per level, and it gave results close enough that I felt the approximation was good enough.
I don't really have much to add on the Elo system hypothesis, because it's mostly just a guess right now, and my analysis has pretty much been restricted to what I stated in my previous post. So, I'll leave that to another day, and probably another thread.
This probably isn't all as clear as I want it to be, but please ask questions and I'll answer them. I'll also try to make edits based on questions to make things clearer for the next interested person.
That's all I've got for now. When I have a chance, I'll post my next thread about how much IP you get based on game time.