TrueSkill2 Questions

This is mainly aimed at ZaedynFel, and was fueled by a lot of coffee.

> 2533274839818445;170:
> > 2533274841661584;169:
> > > 2533274839818445;168:
> > > At the end of the day, Slayer is more a KPM race than anything else. Players that carefully maintain their KDA, but don’t pump out kills consistently lose matches against better opponents, whereas players with worse KDAs but higher KPMs also consistently win against better opponents. They may go negative sometimes, but on average it pays off.
> > >
> > > This proves itself repeatedly in the data: KPM > DPM >> KDA over and over again, regardless of whether it’s intuitive or not, it’s just the way it is across millions and millions and millions of matches. It’s not even subtle in the data.

Is this conceptually valid?

Obviously, it’s counter-intuitive that a proxy outcome (kills) is a better predictor of winning than the true metric (kills/deaths), when the definition of winning a Slayer game is for a team (which we can consider the sum of its individuals) to have kills > deaths. I’ve read the paper and am not a total layman. TrueSkill2 is obviously driven by wide swaths of data, which is fantastic, but that also brings the caveat you have when researching with any ‘big data’: the possibility of capturing trends and associations that are not meaningful to what we’re trying to measure, or that are not being interpreted appropriately. Side note: the paper does not appear to discuss the discarding of death rate in the model. Is there additional documentation regarding that decision lying around?

Anyway, if the data demonstrates that KPM is superior to KDA for predicting who wins games (and is therefore used to assess player skill at the individual level), to what extent did the model explore the possibility that KPM drags in team performance more than KDA does? For example, a higher KPM is likely correlated with teammates who assist more (or who control power weapons, powerups, and the map better, all lurking variables that I imagine are hard to capture), and could therefore serve as a proxy for a team that works well together. Conversely, there are far fewer ‘protector’ medals than assists, meaning a good team is marginally more likely to jack up kill numbers for one another than to prevent deaths, so good teams are correlated with KPM more strongly than with KDA. As a result, KPM as a partially team-driven (essentially, confounded) variable could be a stronger win-predictor than KDA as an individually-driven variable. In fact, anyone who’s played Slayer knows that a team of four that plays with teamwork wins more often against a team of solo queues, even if the solo team has higher individual CSR/MMR (I would be very interested to see if that could be confirmed in data).
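
To make the confounding worry concrete, here is a toy simulation in Python. Everything in it is my own assumption, not anything from the TrueSkill2 paper or real match data: the function name `simulate`, the hidden `teamwork` variable, and the weights `kpm_team_weight` and `kd_team_weight` are all made up. It only shows that if teamwork leaks into kill rate more than into K/D, then kill rate can correlate more strongly with winning even when individual skill feeds both stats equally.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_games=20000, kpm_team_weight=1.0, kd_team_weight=0.2, seed=0):
    """Toy model: hidden teamwork feeds kill rate more than it feeds K/D.

    All weights are hypothetical, chosen only to illustrate the argument.
    """
    rng = random.Random(seed)
    kpm, kd, won = [], [], []
    for _ in range(n_games):
        skill = rng.gauss(0, 1)      # individual skill (hidden)
        teamwork = rng.gauss(0, 1)   # team quality (hidden, the confounder)
        # Both stats carry the same amount of individual skill...
        kpm.append(skill + kpm_team_weight * teamwork + rng.gauss(0, 0.5))
        kd.append(skill + kd_team_weight * teamwork + rng.gauss(0, 0.5))
        # ...but the win depends on skill AND teamwork.
        won.append(1.0 if skill + teamwork + rng.gauss(0, 1) > 0 else 0.0)
    return pearson(kpm, won), pearson(kd, won)


kpm_corr, kd_corr = simulate()
print(f"corr(KPM, win) = {kpm_corr:.2f}, corr(K/D, win) = {kd_corr:.2f}")
```

Under these made-up weights the kill-rate stat comes out as the better win predictor purely because it carries more of the team signal. That says nothing about whether the real data behaves this way, only that the pattern is possible.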

You note that “Players that carefully maintain their KDA, but don’t pump out kills consistently lose matches against better opponents, whereas players with worse KDAs but higher KPMs also consistently win against better opponents. They may go negative sometimes, but on average it pays off.” This strikes me as potentially problematic. Regardless of how things converge over a sample size of millions of games, I would contend that players should be ‘rewarded’ with increased MMR based on their actual performance, not a proxy that may approximate skill and win percentage over the long run. The logic is “This person had an objectively bad game (high kills, higher deaths; by definition this is a bad game because the objective of Slayer is to have kills > deaths), but we’re betting over the course of their career they’ll end up doing better.” It’s like if you surveyed thousands of basketball games and determined that frequent shooters, regardless of actually scoring on those shots, typically do well over their career; then the NBA determines who won a game (or the skill of a player) based on number of shots, regardless of who/which team actually got the points in a given match. Proxies might predict well in huge sample sizes, but are they really appropriate to grade individual instances? Especially if doing so is unintuitive to users (players)?

This is all subject to discussion. The architects of TrueSkill2 are a lot smarter than me. Just curious and trying to think through reasons that the system confuses the hell out of everyone who plays the game.

All I know is I can hang with ranks near mine, maybe up to D1, but get destroyed by anything above that. On the other end, high Gold is right there with me, but anything lower generally gets beaten.

In slayer, I hover between P3 & P5 as a solo queue.

To me, that means overall the system is assigning ranks fairly accurately.

> 2533274922320564;1:
> I would contend that players should be ‘rewarded’ with increased MMR based on their actual performance,

That’s already an aspect of the MMR system. Personal performance (measured by KPM) influences the MMR delta after a win or loss. If you’re rated at Gold 6 but perform like a Diamond, then your MMR increase upon a victory will be greater than if you’d played like a Platinum. In fact, in the past, Menke has said that in extreme cases, personal performance can even supersede victory/loss in terms of changes to MMR. That means if, for instance, you win a match but performed super terribly, your MMR could still go down despite having won; on the other hand, if you lose but outperformed everyone else on your team and exceeded the system’s expectations for your performance, then your MMR could still go up, even though you lost. Of course, those are rare scenarios, but they’re an example of personal performance influencing the gains/losses on a player’s skill rating.
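
As a rough illustration of how that can happen, here is a toy Python sketch. To be clear, this is not the actual TrueSkill2 update, which is a Bayesian model and far more involved; the function `toy_mmr_delta`, its weights, and the expected values are all invented by me just to show an outcome term and a performance term pulling in opposite directions.

```python
def toy_mmr_delta(won, expected_win, kpm, expected_kpm,
                  outcome_weight=10.0, performance_weight=20.0):
    """Hypothetical rating change: outcome surprise plus performance surprise.

    NOT the real TrueSkill2 math; every weight here is an assumption.
    """
    # How much the result beat (or missed) the predicted win probability.
    outcome_term = outcome_weight * ((1.0 if won else 0.0) - expected_win)
    # How much the kill rate beat (or missed) what was expected of the player.
    performance_term = performance_weight * (kpm - expected_kpm)
    return outcome_term + performance_term


# Won, but far below the expected kill rate: the toy rating drops.
carried = toy_mmr_delta(won=True, expected_win=0.5, kpm=0.4, expected_kpm=1.0)

# Lost, but well above the expected kill rate: the toy rating rises.
hard_loss = toy_mmr_delta(won=False, expected_win=0.5, kpm=1.5, expected_kpm=1.0)

print(f"won but underperformed: {carried:+.1f}")
print(f"lost but overperformed: {hard_loss:+.1f}")
```

With a large enough performance surprise, the second term outweighs the first, which is the “rare scenario” described above. In a normal game the two terms would usually point the same way.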

The potential for MMR to change in such drastic ways is part of the reason why it’s a hidden value and not used directly to rank players. One uncharacteristically bad or good game may not represent you, even if your MMR changes a lot for a bit due to said game, so it would be bad if players were judged by others based on such temporary changes. This is why CSR is used as the basis for ranks instead: it is a more stable metric (i.e. it does not change as drastically), and changes to CSR follow a more expected pattern (i.e. CSR will always go up on a win, and always go down on a loss).

> 2533274922320564;1:
> The logic is “This person had an objectively bad game (high kills, higher deaths; by definition this is a bad game because the objective of slayer is to have kills>deaths), but we’re betting over the course of their career they’ll end up doing better.” It’s like if you surveyed thousands of basketball games and determined that frequent shooters, regardless of actually scoring on those shots, typically do well over their career; then the NBA determines who won a game (or the skill of a player) based on number of shots, regardless of who/which team actually got the points in a given match.

I don’t think your NBA analogy is accurate, because you’re equating kill rate to shots taken, when kill rate is actually more similar to shots scored. The team that scores points faster is certainly more likely to win. The MMR system isn’t rating players based on shots fired. It’s rating them on their ability to secure the kill, and players who secure kills faster are more likely to achieve victory. Also, your example of an objectively bad game (more deaths than kills) doesn’t mean the system is betting that they’ll do better in the future. Instead, the system looks across the entirety of the player’s game history, and if that objectively bad game is an outlier, then maybe it doesn’t change MMR as much. But if that sort of performance becomes regular, then the MMR adjusts accordingly. So it’s not so much “we bet they’ll do better”; it’s more “this performance is not characteristic of them” (until, of course, it becomes characteristic).

> 2533274817408735;3:
> I don’t think your NBA analogy is accurate, because you’re equating kill rate to shots taken, when kill rate is actually more similar to shots scored. The team that scores points faster is certainly more likely to win. The MMR system isn’t rating players based on shots fired. It’s rating them on their ability to secure the kill, and players who secure kills faster are more likely to achieve victory. Also, your example of an objectively bad game (more deaths than kills) doesn’t mean the system is betting that they’ll do better in the future. Instead, the system looks across the entirety of the player’s game history, and if that objectively bad game is an outlier, then maybe it doesn’t change MMR as much. But if that sort of performance becomes regular, then the MMR adjusts accordingly. So it’s not so much “we bet they’ll do better”; it’s more “this performance is not characteristic of them” (until, of course, it becomes characteristic).

Fair, a better analogy would be that it values how fast a team scores points over whether a team scores more points than the other team, even though the latter defines winning and losing. Would that be a better metaphor? By definition, a kill is ‘scoring’ on the other team, and a death represents the other team ‘scoring’ on you.

I think you’ve misinterpreted the objectively bad game piece. Per the above, it’s possible to have a high KPM and still have an objectively bad game; that’s my point. I think we can agree on the stipulation that having fewer kills than deaths is objectively bad given the goal of Slayer. When I talk about the system ‘betting’ that you’ll do better, I’m saying that in MOST examples, people with high KPM are actually having objectively good games, which is why TS2 uses it as a metric. But in our INDIVIDUAL example game, this was not the case. TS2 is playing the numbers: it’s concerned with being right MOST of the time, even if it gets individual instances wrong, in which case players don’t get the MMR change that they think they deserve. Does that make sense?
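
Here is the kind of game I mean, as a quick Python sketch with two made-up stat lines. The `StatLine` class and all the numbers are hypothetical; I’m using a simple kills-minus-deaths spread rather than whatever KDA formula the game itself reports.

```python
from dataclasses import dataclass


@dataclass
class StatLine:
    """One player's line for a single Slayer game (hypothetical numbers)."""
    kills: int
    deaths: int
    minutes: float

    @property
    def kpm(self) -> float:
        # Kills per minute: the rate stat the quoted posts say predicts wins.
        return self.kills / self.minutes

    @property
    def spread(self) -> int:
        # Kills minus deaths: this player's net contribution to the score.
        return self.kills - self.deaths


aggressive = StatLine(kills=18, deaths=21, minutes=10.0)
careful = StatLine(kills=9, deaths=5, minutes=10.0)

for name, line in [("aggressive", aggressive), ("careful", careful)]:
    print(f"{name}: KPM = {line.kpm:.2f}, spread = {line.spread:+d}")
```

The aggressive player has double the KPM but handed the other team more points than they earned, while the careful player is net positive at half the kill rate. That single game is the case where a KPM-based grade and the scoreboard disagree.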

Really, what makes this academic is that the data probably shows that high-KPM low-KDA games are extremely rare. Those two variables are also probably extremely closely correlated, making this discussion pretty moot. But at least it’s helping wrap my brain around it.

Say someone lags out of the game, you’re on a team of 3 against 4, and you pull off the win. Why do only the top 2 performers rise dramatically while the third person gets awarded 1 CSR?