The ideal ranking system.

Credit to Katastrophe of the Beyond Entertainment forums.

The following post is not mine.

> In this day and age, a ranking system is a must for the multiplayer experience, if only to ensure that all matchmaking games are fair, balanced, and thus fun for all involved. In non-ranked matchmaking, games are often lopsided and, more often than not, at least one individual spends the game getting destroyed - something that isn’t fun for anyone. Ranking systems, however, pose a problem for Frank O’Connor at 343 Industries because he does not want to see people cheat, boost/sell accounts, or harass people - things that he believes a ranking system promotes and encourages. To this, I say that players will always harass other players unless you remove all forms of stat collecting and display, whether it’s your K/D ratio or your win percentage. For this reason, a ranking system that is designed to prevent all cheating and boosting is only beneficial to the game as a whole. In the following post, I hope to outline a ranking system that I believe is better than any system so far seen in the Halo series because of how well it ranks and matches players, its depth and how far one can narrow down their actual skill level, and the fact that it simply cannot be cheated in any way.
>
>
> Part I - The Service Record
>
>
> Below, you’ll find a (rough) example of how the Halo 5 service record should be presented, and what information this system would display on it. Note that I’m using Halo 4 character models and StarCraft 2 images simply as place holders. I am, by no means, a graphic artist. If someone would like to design better visuals, please contact me. That said, your service record should show a great deal of information:
>
> - Standards such as your Gamertag, Emblem, Spartan Model, Service Tag, etc.
> - Your Global Rating. This three or four digit number (which will be explained later) is your actual rating. Think of this as Halo 2 or 3’s 1-50 rank. Your ‘Global Rating’ (as well as the listed K/D and Win% beneath it) is actually the average of your top three playlists. Which brings us to…
> - Your top three playlists. Here you’ll see your rating for each playlist, as well as the playlist specific K/D and Win% ratios. Also shown is your playlist’s league, but we’ll talk more about that later.
> - Extra information that could be added (either globally or playlist specific) would be ‘Most Played Win’, your KA/D, etc.
>
> Visual mockup.
>
>
> Part II - Rating Explained
>
>
> So what exactly is this three or four digit number I’m talking about? This system is based on an existing system that was originally designed for 1v1 games (specifically Chess) but has since been adopted and used in many games and in many ways, most notably League of Legends. I am, of course, talking about the ELO system. Unlike the typical Halo 2/3 1-50 system which is progressive, having players start at the lowest rank and climb their way up, the ELO system actually starts players off at a middle rating (1200) and allows them to move up and down based on the population’s general skill level. In a progressive 1-50 system, you’d expect there to be the same number of 1s as there are 50s with the majority of people being Rank 25, forming a bell curve, and that’s just not the case. The ELO system, however, does produce that curve. It’s important to note that there is no ‘minimum’ or ‘maximum’ rank. The single best player will have the single highest ELO and not be lumped into just ’50’. While this number is the most precise measurement of your skill, it’s not by any means the only measurement. It’s simply the rubric from which all your other ranks and placements are derived.
>
>
> Part III - How ELO Works
>
>
> The benefits of the ELO system over a standard progressive 1-50 system don’t seem that great at first. It’s really when you look into the math of the system that you see its benefits. Now, I won’t bore you with all those details (although if you’re interested, the Wikipedia page has quite a bit) so let’s just look at some of the key problems with a 1-50 system.
>
> First, it’s restricted to only 50 ranks. In Halo 3, I don’t think anyone who had a 50 would argue that there wasn’t a difference in skill even within the rank of 50. For example, I had several 50s - does that mean that I’m just as good as Roy or the Ogres? The beauty of the ELO system is that with a midrange point of 1200, the average ratings fall between 600 and 2400 - that’s 1800 different ranks and an accuracy of roughly 0.05% between you and the person one rating point above and below you. But it doesn’t end there, because as I previously said, there is no minimum or maximum rank. If that doesn’t appeal to you, I don’t know what will.
>
> Second, the way people are grouped into leagues changes from season to season, based on the current population’s actual ratings. The 1-50 system is designed to become a bell curve - mostly people ranked 25, and an equal number of 1s and 50s. However, that was obviously not the case because no one was actually a ‘1’, but there were people who should have been when you compare their skill level to the pros. In fact, from my experience, most people seemed to float around 40-45 in Halo 3 which, if you’re going for a bell curve, means that the max rank in Halo 3 should have been closer to 80 or 90. So this system is designed to look at each previous season’s minimum and maximum and rescale the boundaries for each league based on those numbers to ensure that ‘Gold League’ is always the middle ground. If the average player is reaching a rank of 1400 instead of floating around 1200, the system will scale up to accommodate this.
>
> Third, the system only looks at two things when a game is over: who won, and how their rating matches up against the other team’s. Unlike Halo 3, the system does not look at your teammates’ past win/loss ratio to determine how much or how little you should rank up - it cares only about who won the game and the difference between your ratings, but I’ll discuss this in painful amounts of detail later. For now, know that if you were given a bad teammate and they dragged you down, your opponents will be of equally low rating and, therefore, your loss penalty will be minimized. However, if you pull off an upset as the underdog, you’ll rank up faster.
>
> And last, the myth of rank locking will be dispelled. In Halo 3, often times the change to your rank was so small that even repeated changes went unnoticed in the scale of 1-50. With this rating system, using a four digit number, you WILL notice a change after EVERY game, no questions asked. You will see immediate change in your rating on a game-to-game basis.

> Part IV - League Play
>
>
> Because a league system was so inappropriately added into Halo via Halo: Reach’s ‘Arena’ system, I understand that some people might be hesitant to include or consider it again - don’t be. League systems are rapidly becoming a popular and easy-to-understand ranking system. The league system is not meant to define your rank, but simply to cluster you together with similar players and make your rank more general. For instance, think back to Halo 3. If someone said they were a Brigadier (ranks 45 to 49), you had a rough idea of their skill level. The same as if they said they were a Captain (ranks 20 to 30). That is how the league system should be viewed - an encompassing system that makes the ELO system easy to understand and look at. Each playlist should be ranked and each playlist should be tracked separately with each using the same measures to define placement:
>
> - Bronze League: 85th Percentile
> - Silver League: 65th Percentile
> - Gold League: 35th Percentile
> - Onyx League: 15th Percentile
> - Masters League: 1st Percentile
> - SPARTAN League: Top X
>
> To clarify, the SPARTAN League is a separate league that removes the top players from the general population and puts them into their own ‘pro’ category. Each playlist’s SPARTAN League would consist of the playlist’s max party size times eight. So for FFA, it would be the Top 8. For 2v2, Top 16. For 4v4, Top 32. You get the picture. In addition, in order to maintain your visible spot on this list, you must complete at least one game per week. Failure to do so will have you removed from the list and marked as ‘INACTIVE’. This is done to prevent boosters and account selling.
>
> Like most league systems, this will be subject to seasonal resets every three months. This was really where the Arena system lost a lot of people, so I want to emphasize that your rank is never FULLY reset. The reset simply weeds out inactive or boosted accounts, and it also resets the boundaries for each league’s placement, but I’ll talk about that later. Just know that you’ll be asked to play ONE game after a reset, and your league placement will be returned to you. But how does the system determine what league you belong in and when you should move up or down?
>
> First, you’ll need to understand what happens at the start of every season, starting with the game’s launch (season one). At the game’s launch, it sets its midpoint at 1200, and creates a false min and max value of 600 (half the starting) and 2400 (twice the starting). Based on these numbers and the range for each league, they’d get broken down like this:
>
> - Bronze League: ≤600 − 870 ELO
> - Silver League: 871 − 1230 ELO
> - Gold League: 1231 − 1770 ELO
> - Onyx League: 1771 − 2130 ELO
> - Masters League: 2131 − ≥2400 ELO
>
> With this distribution, the general population should fall within Silver or Gold League. Now, during your five placement matches, your ELO will move up and down rapidly in an attempt to more accurately place you from the start. After those initial games, the rate at which you rank up and down will dramatically decrease. The higher your ELO, the slower you’ll rank up or down, so that it takes a more consistent performance to make any big jumps. So what happens when Season Two starts? As I said, you’ll be required to play one placement match in order to regain your league placement, but this match will not be another ‘rapid’ rank up or down placement match. As I said previously, these resets only change the league placement boundaries and prevent cheating. When a new season begins, the highest and lowest ELOs are recorded and used to redistribute the league placement requirements. For example, if no one has an ELO lower than 1000 or higher than 2400, the ELO requirements for all leagues will increase to reflect this change, with Bronze League being 1000 − 1210 ELO. This will also reset the ‘default’ ELO for any brand new players to a new value - specifically, the minimum ELO + one third of the total difference. In the aforementioned example, this means new players would start with an ELO of 1467. So is getting into a new league as easy as hitting a requirement? Unfortunately, no.
>
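As a rough sketch of the boundary math above: the league cut points in both the season one list and the season two example fall at 15%, 35%, 65%, 85% and 100% of the distance between the lowest and highest recorded ELO. The function names and the rounding are assumptions made for illustration, not something the post specifies.

```python
# Cumulative share of the rating range at which each league ends.
# These shares are inferred from the season one numbers in the post.
LEAGUE_CUTOFFS = [
    ("Bronze", 0.15),
    ("Silver", 0.35),
    ("Gold", 0.65),
    ("Onyx", 0.85),
    ("Masters", 1.00),
]


def league_bounds(min_elo, max_elo):
    """Return {league: (low, high)} for a season's recorded ELO range."""
    span = max_elo - min_elo
    bounds = {}
    low = min_elo
    for name, share in LEAGUE_CUTOFFS:
        high = round(min_elo + share * span)
        bounds[name] = (low, high)
        low = high + 1  # the next league starts one point higher
    return bounds


def new_player_elo(min_elo, max_elo):
    """Default rating for new accounts: minimum plus one third of the range."""
    return round(min_elo + (max_elo - min_elo) / 3)


season_one = league_bounds(600, 2400)    # Gold comes out as (1231, 1770)
season_two = league_bounds(1000, 2400)   # Bronze comes out as (1000, 1210)
print(season_one, season_two, new_player_elo(1000, 2400))
```

With the launch values of 600 and 2400, the same one-third rule gives 1200, which lines up with the starting rating used in season one.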
> In order to move into a new league during a season, your ELO will need to push 15% of the way into that league’s ELO range. For example, in season one, if you want to move from Silver to Gold, an ELO of 1231 isn’t enough. You’ll actually need to get 1312. Likewise, if you were to drop from Gold to Silver, your ELO wouldn’t have to fall to 1230 or lower, but actually 1176. The system doesn’t want to rank you up or down unless you actually deserve it. So that means if your ELO did actually fall to 1176, you’d need to gain another 136 ELO to rank back up into Gold League. This is why most people who are borderline (between 1231 and 1312 ELO, for example) will only see league changes between seasons, when the borders are changed. For example, if you had ended Season One with an ELO of 1250, you would have been in Silver. However, if the parameters didn’t change, you’d play your placement game and so long as your ELO stayed above 1231 after that game, you’d be in Gold.
>
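The 1312 and 1176 figures above work out if the 15% buffer is taken from the width of the league being entered. A minimal sketch under that assumption (the helper names are hypothetical):

```python
def promotion_threshold(league_low, league_high, buffer=0.15):
    """ELO needed to move up into the league spanning league_low..league_high."""
    return round(league_low + buffer * (league_high - league_low))


def demotion_threshold(league_low, league_high, buffer=0.15):
    """ELO at which a player drops down into the league spanning league_low..league_high."""
    return round(league_high - buffer * (league_high - league_low))


into_gold = promotion_threshold(1231, 1770)    # Silver -> Gold
into_silver = demotion_threshold(871, 1230)    # Gold -> Silver
print(into_gold, into_silver, into_gold - into_silver)
```

The gap between the two thresholds is the 136 ELO a demoted player has to win back before returning to Gold.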
>
> Part V - Skill Based Matchmaking
>
>
> So we’ve talked about what the ELO system does and how League play works… so how does the actual matchmaking system work? This section is rife with math and actual numbers so if that doesn’t interest you, here’s the short version - you get matched against the people also searching that are closest to your rating every time. Makes sense, right?
>
> Much like Halo 3’s ranking system that started searching within a certain range and expanded to a maximum range of +/-10, this system does the same thing. If you were to search with an ELO of 1200, the initial search parameters would attempt to pull everyone with an ELO between 1150 and 1250 into that game with you. Over time, this expands to a maximum range that would depend on the playlist’s population. For example, a highly populated playlist like Team Slayer would have a smaller search range where a less populated playlist like Grifball would have a larger search range.
>
> Once the lobby consists of two or more players, the system averages their ELO ratings and searches from that value. So say Jake, with his ELO of 1200, is searching within 1150 and 1250 and picks up Steve, who has an ELO of 1230. The matchmaking system averages those two out (1215) and searches from there (1165 − 1265). Then, the system pulls in Mike, with an ELO of 1180, and the system begins searching between 1153 and 1253. This ensures that everyone being pulled into that lobby is around the same average skill level. The only thing that trumps this system is party size. Much like in Halo 3, full parties (or close to full parties) will match only similarly sized parties. A team of 4 in Team Slayer will only play other teams of 4.

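The lobby search described above can be sketched in a few lines. The fixed radius of 50 comes from the 1150 to 1250 example; how the radius widens over time is left out because the post gives no numbers for it.

```python
from statistics import mean


def search_window(lobby_elos, radius=50):
    """Return the (low, high) ELO range searched for the current lobby."""
    center = mean(lobby_elos)
    return round(center - radius), round(center + radius)


print(search_window([1200]))              # Jake alone
print(search_window([1200, 1230]))        # Jake and Steve
print(search_window([1200, 1230, 1180]))  # Jake, Steve and Mike
```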
> So how are teams formed? Well, it’s something I call ‘First and Worst Matchmaking’, but I’m sure it has a better and more official name. The highest rated player (and his team) goes Red Team along with the lowest rated player and his team. The next highest rated player and the next lowest rated player (and their respective teams) go Blue. At this point, each team averages its players’ ELOs together and begins balancing. Here’s a complicated example of some players, their ELOs and who they’re matched with. Note that the number of asterisks after their name marks who they’re teamed with. This is for Squad Battle or a 6v6 playlist that treats any party of 3 or fewer as not worth matching against a similarly sized party first. The first thing the system will do is, once again, list the players in order of ELO from highest to lowest, so I’ve already done that.
>
> Jake*** - 1475
> Aaron* - 1450
> Mark*** - 1450
> Mike* - 1400
> Sally** - 1400
> Sarah** - 1350
> Brandon - 1350
> Zack - 1300
> Paul - 1275
> Edward - 1250
> Jim - 1250
> Steve* - 1200
>
> As stated, the first thing to happen is Jake (and his teammate Mark) will go Red Team, along with Steve (and his teammates Aaron and Mike). Already, Red Team has an average ELO of 1395 and 5/6 spots filled. It’s important to note that had Steve’s team (in whatever manner) exceeded the max team size, it would have skipped him and pulled Jim instead. Just an FYI.
>
> The next step is that the next highest remaining player, being Sally and her teammate Sarah (since Aaron got put on Red Team thanks to Steve) will go Blue Team along with Jim. This puts them at 3/6 spots filled with an average ELO of 1333. That’s a pretty big difference, so how does this get balanced? Well, the system looks at the lowest team (currently Blue Team) and looks at the remaining players. If a remaining player has an ELO that is HIGHER than the lowest team’s average, he will go on that team. So, in this instance, Brandon will be put on Blue Team which only raises their average ELO to 1341.
>
> Now, once again, the system will look for a remaining player with a higher ELO than the lowest team. Since the next highest player is Zack and his ELO is lower than Blue Team’s average, it would actually lower their average if he went on their team. So now the system takes the lowest remaining player and throws him or her onto the highest team, which means Edward is going on Red Team. This lowers Red Team’s average from 1395 to 1322.
>
> So now Red Team is the lowest averaged team, but their roster is full so the remaining players have to go Blue Team anyway. In the end, this leaves the teams’ average ELOs at 1322 for Red and 1304 for Blue. That’s only an 18 point difference between the two teams. Red will still go into the game with the advantage, and I’ll talk about what this means in the next section.
>
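For anyone who wants to step through the ‘First and Worst’ walkthrough, here is a minimal Python sketch of it. Several details are assumptions rather than anything the post states: ties between equally rated players are broken by list order (which is what sends Jim to Blue before Edward), a party that does not fit is skipped in favor of the next candidate, and team averages are plain arithmetic means of the listed ratings, so they may not match the rounded figures quoted in the walkthrough.

```python
from statistics import mean


def form_teams(parties, team_size):
    """Split parties (lists of (name, elo) tuples) into Red and Blue rosters."""
    party_of = {name: party for party in parties for name, _ in party}
    # Every player from highest to lowest ELO; ties keep their input order.
    remaining = sorted((p for party in parties for p in party), key=lambda p: -p[1])
    teams = {"Red": [], "Blue": []}

    def average(team):
        return mean(elo for _, elo in teams[team])

    def place(team, candidates):
        """Move the first candidate whose whole party fits onto `team`."""
        for name, _ in candidates:
            party = party_of[name]
            if len(teams[team]) + len(party) <= team_size:
                teams[team].extend(party)
                for member in party:
                    remaining.remove(member)
                return True
        return False

    # Seed each team with the highest, then the lowest, remaining player's party.
    for team in teams:
        place(team, remaining[:])
        place(team, remaining[::-1])

    while remaining:
        low, high = sorted(teams, key=average)
        moved = (
            # A player rated above the weaker team's average joins that team...
            (remaining[0][1] > average(low) and place(low, remaining[:1]))
            # ...otherwise the lowest remaining player joins the stronger team...
            or place(high, remaining[::-1])
            # ...and if that roster is full, the weaker team takes them instead.
            or place(low, remaining[::-1])
        )
        if not moved:
            break  # nothing left fits on either roster
    return teams


PARTIES = [
    [("Jake", 1475), ("Mark", 1450)],
    [("Steve", 1200), ("Aaron", 1450), ("Mike", 1400)],
    [("Sally", 1400), ("Sarah", 1350)],
    [("Brandon", 1350)], [("Zack", 1300)], [("Paul", 1275)],
    [("Edward", 1250)], [("Jim", 1250)],
]
teams = form_teams(PARTIES, team_size=6)
for colour, roster in teams.items():
    print(colour, [name for name, _ in roster])
```

Run on the twelve example players, this puts Jake, Mark, Steve, Aaron, Mike and Edward on Red and everyone else on Blue, the same rosters the walkthrough arrives at.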
>
> Part VI - Ranking Up!
>
>
> So how do you increase your ELO rating and move up in rank you ask… does it factor in your Kill/Death spread? Does it matter if you got carried, or carried your team? Does it count your flag caps? Does it perform some asinine mathematical formula to determine your ‘score’ for the game based on your kills, assists and deaths? I’m happy to say none of those. The game simply wants to know who won. Did you win? You rank up. Did you lose? You rank down. I’ll discuss how this works in both FFA and Team situations.
>
> First, let’s take some of our example players from previously and arrange them in an FFA, say… the top 6? and randomly shuffle them around to show what the scoreboard looked like at the end of their game.
>
> 1: Jake - 1475
> 2: Mark - 1450
> 3: Sarah - 1350
> 4: Sally - 1400
> 5: Mike - 1400
> 6: Aaron - 1450
>
> Alright, so who ranks up and who ranks down? In an FFA game, the top half (in this case, the top three) will rank up while the bottom half will rank down. What’s the incentive for getting first instead of third, you ask? Well, only first place gets the full reward for winning. Come in 2nd, and you get 66% of the reward while 3rd only gets 33%. Now if this was an 8 person FFA, 2nd would get 75%, 3rd would get 50%, and 4th would get 25%. I’m assuming you can figure out how and why that works. Note that losing uses the same scaling, but in reverse so that the player in 6th takes 100% of the loss while the player in 4th will only take 33% of the loss.
>
> So how does the system work if ELO was only designed for 1v1s? Well, it’s going to average the ELOs of everyone that wasn’t you and treat that as the opponent. So for Jake, the opponent’s ELO would be the average of everyone who placed 2nd through 6th, or 1410. For the record, I use this calculator with the weighting set to 25. When we plug it in, we see that Jake would get an ELO reward of +10 and, since he got 1st, he gets the full thing. For Mark, the opponent’s ELO was 1415. His full bonus would be +11, but because of his 2nd place finish, he’ll only get +7. For Sarah, the opponent’s ELO was 1435. Her full bonus would be +15, scaled down to +5 for her 3rd place finish. So why is Sarah’s full bonus bigger than Jake’s? Simply look at their ELOs. Sarah was the lowest rated player in the game, so while 3rd was barely a win, the full reward was still great before scaling. At the end, here are the results:
>
> 1: Jake - 1475 (+10)
> 2: Mark - 1450 (+7)
> 3: Sarah - 1350 (+5)
> 4: Sally - 1400 (-4)
> 5: Mike - 1400 (-8)
> 6: Aaron - 1450 (-14)
>
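The FFA numbers can be reproduced with a short sketch. The post only points at an online calculator with the weighting set to 25, so the standard Elo expected-score formula (logistic curve with a 400-point scale) is an assumption here, as are the function names and the rounding to whole points.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score of `rating` against `opponent` (assumed formula)."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def ffa_changes(results, k=25):
    """results: (name, elo) pairs in finishing order, first place first."""
    n = len(results)
    half = n // 2
    changes = {}
    for place, (name, elo) in enumerate(results):
        # The "opponent" is the average ELO of everyone else in the game.
        others = [r for i, (_, r) in enumerate(results) if i != place]
        exp = expected_score(elo, sum(others) / len(others))
        if place < half:
            # Top half wins: 1st gets the full reward, lower places a smaller share.
            share = (half - place) / half
            delta = k * (1 - exp) * share
        else:
            # Bottom half loses: last place takes the full penalty.
            share = (place - half + 1) / (n - half)
            delta = -k * exp * share
        changes[name] = round(delta)
    return changes


scoreboard = [("Jake", 1475), ("Mark", 1450), ("Sarah", 1350),
              ("Sally", 1400), ("Mike", 1400), ("Aaron", 1450)]
changes = ffa_changes(scoreboard)
print(changes)
```

The shares are exact thirds here rather than the rounded 66% and 33%, which does not change any of the whole-point results for this scoreboard.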
>
> In team games, the system is similar but slightly different. So let’s pull in our two teams from the previous example and see how they did post-game, assuming that Blue Team won in an upset.
>
> Red Team - 1322
> Jake - 1475
> Mark - 1450
> Aaron - 1450
> Mike - 1400
> Edward - 1250
> Steve - 1200
>
> Blue Team - 1304
> Sally - 1400
> Sarah - 1350
> Brandon - 1350
> Zack - 1300
> Paul - 1275
> Jim - 1250
>
> What happens now is that each member of each team compares their ELO against the opposing team’s average. This means that Jake will receive a larger penalty than Steve for losing because his rating is so much higher. So again, as an example, we’ll plug Jake’s ELO of 1475 and Blue Team’s average ELO of 1304 into the equation (or ELO calculator with a weighting of 25) and see that he actually drops to 1457 for that loss, while Steve, at 1200 ELO, only drops to 1191. In games that feature multiple teams, like Multi-Team, the same concept applies: each player compares their ELO to the opposition, except that the other teams’ average ELOs are averaged together. Also, the FFA rules of scaled adjustments based on placement apply.

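The team-game update can be sketched the same way, again assuming the standard Elo expected-score formula with a weighting of 25. It takes the opposing team’s average as a given input, using the 1304 figure quoted in the post.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score of `rating` against `opponent` (assumed formula)."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def team_game_elo(rating, opposing_avg, won, k=25):
    """Player's new ELO after a team game against a team averaging `opposing_avg`."""
    score = 1.0 if won else 0.0
    return rating + round(k * (score - expected_score(rating, opposing_avg)))


def multi_team_opponent(other_team_averages):
    """For three or more teams: average the other teams' averages."""
    return sum(other_team_averages) / len(other_team_averages)


print(team_game_elo(1475, 1304, won=False))  # Jake after the loss
print(team_game_elo(1200, 1304, won=False))  # Steve after the loss
```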
> Part VII - The Credit System
>
>
> Halo 4 features a serious flaw, though a negligible one thanks to how matchmaking works: not everything is available to you from the start. The Credit System is taken from the success of League of Legends’ micro transactions. At the start of Halo 5, all players have access to a single armor set, a handful of colors, and the default weapon skins. Upon launch, multiple DLC packs should be available for purchase that transfer varying amounts of Credits to your Halo 5 account. These Credits can then be used to unlock new armor sets, colors, emblems, weapon skins, etc. If you don’t want to spend money, you don’t have to. While every playlist in Halo 5 will feature this ranking system, only ‘Ranked’ playlists will show it. However, if you win a game in any playlist, you’ll receive Credits equal to the ELO you gained during that game. If you lose, you won’t lose Credits.
>
> Like League of Legends, certain items should go on sale every week and new items should be introduced regularly over the game’s life as free DLC, but they must be unlocked like all other purchases. It’s important to note that these changes should only be cosmetic. This offers players an easy way to unlock the armor and skins they want without having to go through a variety of meaningless commendations (although, like Reach, these could provide Credit bonuses!) and could potentially increase the value of Halo 5, as well as solve the existing ‘licensing’ dilemma because even if someone buys the game used, there’s still a chance they’re spending money on these micro transactions.
>
>
> Part VIII - Punishments
>
>
> Obviously, punishing people for quitting and abusing the system has always been an issue so how do we counteract this? First, if you quit a game, it should automatically count as a loss against you but you should also receive a 2x penalty. However, this penalty scales based on the number of people remaining in the game.
>
> After_Quit_ELO = Before_Quit_ELO + ( ELO_Adjustment * ( 2 * ( Remaining_Teammates / Max_Team_Size ) ) )
>
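A direct transcription of the quit formula into Python, as a sketch. It assumes ELO_Adjustment is the (negative) change the player would have taken for an ordinary loss; whether Remaining_Teammates counts the quitter is not stated in the post, so that is left to the caller.

```python
def elo_after_quit(before_elo, loss_adjustment, remaining_teammates, max_team_size):
    """Apply the scaled quit penalty; loss_adjustment is negative for a loss."""
    multiplier = 2 * remaining_teammates / max_team_size
    return round(before_elo + loss_adjustment * multiplier)


print(elo_after_quit(1200, -10, remaining_teammates=4, max_team_size=4))
print(elo_after_quit(1200, -10, remaining_teammates=2, max_team_size=4))
```

As written, the multiplier reaches the full 2x only when the whole team is still in the game, and it falls below an ordinary loss once fewer than half the team remains.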
> Also, in the post game lobby, when viewing an opponent’s service record, you’ll have the option to ‘Report’ the player. Once you select this option, you’ll be able to select a reason such as “Deranking”, “System Tampering”, “AFK”, etc. If a player receives a number of reports, like ten or more, within a single 24 hour period then the account is flagged for someone at 343 Industries to look at and determine if any punishment is due. A fair warning that if the account is deemed fine then those players who reported the account are given a report for “False Reporting”, in order to prevent people from just reporting people they lose against.
>
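The report threshold could be tracked with a simple sliding window. This is only a sketch: the class name is hypothetical, timestamps are assumed to be seconds arriving in order, and "ten or more within 24 hours" is the example figure from the paragraph above rather than a settled rule.

```python
from collections import deque


class ReportTracker:
    """Flags an account once enough reports land inside a rolling window."""

    def __init__(self, threshold=10, window_seconds=24 * 60 * 60):
        self.threshold = threshold
        self.window = window_seconds
        self.times = deque()

    def add_report(self, timestamp):
        """Record a report; return True if the account should go to manual review."""
        self.times.append(timestamp)
        # Drop reports that are older than the window.
        while timestamp - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) >= self.threshold


tracker = ReportTracker()
flags = [tracker.add_report(minute * 60) for minute in range(10)]
print(flags[-1])  # ten reports within ten minutes
```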
>
> Closing Statements
>
>
> In closing, I hope we can all agree that a system like this - one that utilizes 100% skill based matchmaking, a system that cannot be cheated nor abused, and one that accurately tracks a player’s skill over time - is what Halo 5 matchmaking will need in order to truly succeed. A ranking system is not just a measure to compare yourself to other players, and it should never be used as a tool to belittle someone in an online forum. The purpose of a ranking system such as this one should instead be to ensure that every game played is fair, balanced and fun for all those involved. In addition, the Credit system I mentioned will not only allow for more variety and ease of access in the customization aspect of Halo, but could potentially earn 343 Industries a much larger sum of money than they would have otherwise received.

Yes, it’s a huge wall of text for some of the people that lurk here, but I’m curious as to what Waypoint thinks about this ranking system, and what alterations they’d make. But with the OP being so massive, I don’t expect any replies from this forum.

I’m speechless! It’s everything I’ve ever wanted, and more.

I like the idea of customization through microtransactions, but I don’t think that should be the only way to acquire additional armors, emblems, etc.; unlocks via commendations and achievements should be included as well.

Other than that, I absolutely adore this idea.

I appreciate that the writer at least understands that a robust rating system requires no more than wins and losses and doesn’t try to cram K/D into the equation. But at the same time, I can’t help but feel that they wrote that thread off of a very, very rudimentary understanding about rating systems.

Granted, Elo is a widely popular rating system used in all sorts of places. It’s a robust system for ranking players based on skill over time. It’s an acceptable solution for rating players in any game. And, to be honest, I probably shouldn’t be too harsh towards anyone suggesting it because I always complain about the people who try to come up with their own solutions that they think are clever.

But it isn’t really a very revolutionary idea to suggest Elo for a game. And while it’s good that they seem to have some form of understanding on how Elo works, it would’ve been worth the effort to take a step further. In all its greatness, Elo isn’t perfect. There have been systems that try to fix the issues of Elo and streamline player ratings. One of the most successful is Microsoft’s TrueSkill algorithm, used by Xbox Live.

It’s an infrastructure that is already in place, works much like Elo but presumably fixes some of its issues and makes the process of finding the player’s skill level faster. Either way, it’s at least as good as Elo. In that sense, using Elo to rate players when TrueSkill is already doing it in the background would be pointless. All that needs to be done is to expose the TrueSkill rating.

Finally, none of us in the Halo community are qualified to judge rating systems, at least the ones to be taken seriously. They rely on mathematics that most people in the community most likely don’t understand. Constructing a sensible argument for or against a ranking system not only requires a strong level of mathematical knowledge, but also tons of real world, statistical data to back any claims up.

That’s why, whenever it comes to ranking systems, my final words are: “leave it to the professionals”. Because when it comes to rating players based on skill, I doubt most people in the Halo community (including the writer of that thread) have either the understanding or the data to be qualified to discuss the ranking system. For that, you need a group of mathematicians and access to actual data on how the system works in action.

So, while I commend the writer for explaining how Elo works, he doesn’t get the same level of trust as Herbrich et al. or this guy.

@tsassi The Elo system really only has two problems: players leaving a game because of constant rank fluctuation, and players picking and choosing their matches so they don’t rank down.

The first problem is easily remedied with the introduction of leagues* (as suggested earlier). So, while a player’s rank may change on a game-to-game basis, their league (Silver, Gold, etc) will remain constant…so long as they aren’t stuck in a rut. Even then, the Elo system will work in their favor, demoting them as many times as is necessary to find an even match.

*:You could even give them a separate rank based on the amount of time they’ve spent in-game, like what Bungie did with Reach, and 343i with Halo 4. That seems to have worked out okay.

The second problem with Elo - that players will quit or refuse to play matches they don’t think they can win so they won’t rank down - is unavoidable. I don’t think this problem is exclusive to Elo, but rather a problem with ranking systems in general; players are protective of the ranks they’ve attained (understandably so) and they’ll do whatever they can to maintain them. The most you can do here is dissuade players from quitting (as has already been suggested).

There: I just discussed the problems inherent to a ranking system and how to solve them without using complex mathematical algorithms (most of which would probably make my brain explode).