Tryndamere 08-27-2010 03:45 PM

Service stability - our number one priority.
Greetings League of Legends Community,

We are well aware that over the last couple of weeks the performance of our servers hasn’t been as good as we would have liked. I want to assure you that service stability is currently Riot Games’ absolute number one priority. We have a team of over 20 senior engineers, system administrators, producers, and other personnel who are fully dedicated to improving our service uptime and availability.

Answers to common service stability questions:

Is it really a huge priority? How come it’s taking so long to fix?
We definitely understand how frustrating the instability is, and I promise you – it’s even more frustrating to us. The core mission of Riot Games is to deliver the best gaming experience to the greatest number of people possible, and our hearts sink every time our service has issues. We are working around the clock, have 24-hour on-call schedules, and are spending massive amounts of time and resources to improve this area.

In addition to being a team of people who work on the game, we’re also a team of people who play the game, and we really understand how frustrating the experience is for players. I know that sometimes it doesn’t seem like it, but stability has actually improved over the last six months and will improve much more over the next six months. It’s just a really challenging set of problems to solve given how rapidly we have grown.

Why is it such a challenging problem to solve? It seems like plenty of people have done it before.
We’re often asked why we don’t just buy new hardware or a second set of servers, and the reality is that these suggestions oversimplify how our systems work. PVP.net runs on a very large and complicated infrastructure of hardware and software. We have hundreds of servers spread over multiple continents and many different sets of software, and they all have to communicate with each other harmoniously. Unfortunately, the problems cannot be solved by simply adding or upgrading hardware. I can absolutely assure you that if the solution were that simple, we would have already done so. Additionally, even though the symptoms often appear similar each time we have system problems, each issue over the last several months has been unique and has had different causes. Our platform Technical Director, Barry Livingston, who leads our stability team, has outlined the cause of the most recent set of service issues in this thread.

One of the most challenging aspects of our stability efforts is our explosive growth. League of Legends is growing rapidly, and the rate at which we’re adding new users is increasing every day. Our audience has been more than doubling every 3 months, and that growth has been accelerating, which has made it difficult to keep up. The good news is that we believe we are very close to having some major, long-term improvements that will give us a lot of additional room for growth.

Why do restarts take so long? Can you just restart them at a fixed time more frequently so at least the downtime is predictable?
This really gets back to the complexity of our server infrastructure. Restarting the servers themselves really doesn’t take more than 20 minutes or so, but restarting the service is a lot more complicated than that. In addition to the simple server restarts, we also have to enable the appropriate software in the right order, pre-populate our storage caches, and run through an extensive battery of tests to ensure that everything works correctly when it comes back up.

In addition to improving our overall service stability, our stability team is working on other things that make downtime less painful. These efforts range from improving the way we handle downtime communication and ETAs, to actually reducing the time that individual steps in the restart process take to complete, to improving our automated testing ability. The player experience is and always will be our number one concern, and we appreciate everyone bearing with us as we go through this growth.

I really hope that this information has been helpful. We understand how much service instability hurts the player experience and are doing everything we can to resolve it as quickly as possible.

- Marc "Tryndamere" Merrill

skizbot 08-27-2010 03:45 PM

Snipzar 08-27-2010 03:45 PM


Yuma 08-27-2010 03:46 PM

IdaCr1msonTurtle 08-27-2010 03:47 PM

Holy Malevolence 08-27-2010 03:47 PM

Malzahearty Soup 08-27-2010 03:47 PM

kioik 08-27-2010 03:48 PM

Churchy 08-27-2010 03:50 PM

