LocalRatings mod - evaluate players' skills based on previous games


Mentula

On 17/10/2022 at 5:15 PM, Philip the Swaggerless said:

Just a question:  Has a weighting "meta" emerged?

This question deserves attention.

Due to the way LocalRatings is structured, this is slightly similar to asking: what summary scores describe players' skills best? I am not aware of any (recent) related topic on the forum; if anyone knows, please link it.

I personally find the default weights quite reliable, albeit not perfect. Does any of the experienced players have a better idea?

This is a little fork of LocalRatings (<3) from https://gitlab.com/mentula0ad
Thanks very much to mentula0ad for the help; I have asked Mentula many questions.

Use the mouse or only the keyboard to select a player.

If you use only the keyboard, it is always the first player.
If you use the mouse, it is the player you clicked in the list.

It remembers the last visited player and the last chart you used (saved automatically when a new chart is drawn).

 

Edited by seeh
  • 2 months later...

There is no shared database/cloud/space for replays, as far as I know. The idea of sharing replays has been brought up by other users, and I personally like it, as it has other side benefits in terms of sharing knowledge about the game.

For what it is worth, my dream is that one day Wildfire Games will have the financial and development resources to afford its own lobby servers. That would allow collecting replays, along with solving many of the major problems affecting the game today.

  • 4 months later...

What's coming for A27? Here are some of the new features that LocalRatings will include in the upcoming version 0.27.1.

> Rating distribution charts - If you like charts, statistics, Gauss, or you simply like colors... it's time for new charts! See screenshots below.

> Open LocalRatings from the lobby, the game setup or in-game - The LocalRatings page can be toggled as a dialog while playing a game, while in the game lobby or during the creation of a new game. Simply with a hotkey.

> A new mod filter - Filter out replays with mods you want to exclude from the rating calculation.

> Unrestricted weights - Weights can now be set to any value... even negative! That means that certain parameters can be considered as a malus for the rating calculation. Who knows what data will reveal.

> AI players included - Games with Petra can be included for the rating calculation. Because even Publius Cornelius Rufinus deserves his own rating.

> And other little perks - An optional vertical marker to navigate charts more easily, persistent table sorting preferences, improved performance...

 

The new version 0.27.1 is currently in the final stage of development and will be released soon, after the necessary tests. If you are interested in trying it, you can download the development version (branch name: Alpha27) at this page.

Cheers

2.png.8cf1b7ffb86fd1572536fe949d943c3c.png

1.png.a6a9fd467cd90350be5e7109d998c9ee.png

36 minutes ago, Mentula said:

Sorry but I can't see any benefit in limiting the freedom of users in choosing their own parameters. :unknw:

I am talking about the default value. You have default values in all fields; I suggest using a default here too. Do you understand now?

2 hours ago, seeh said:

I am talking about the default value. You have default values in all fields; I suggest using a default here too. Do you understand now?

Still, that change would be inconsequential in practice for everyone.

Edited by alre
On 23/05/2022 at 2:38 PM, real_tabasco_sauce said:

The mod accounts for scores averaged across gametime, so rushes have more of an impact than they do when just looking at end-of-game scores, but it's not great.

Acero and I discussed how to account for rushes. Thoughts on this?

 

effectiveness = military score*(game pop cap/ avg game pop)/ resources spent

[ex. rush at ~3 mins, total game pop is 160/1600 = 20 percent -> military score of involved players (rusher and rush defender) receive 80% boost]

@real_tabasco_sauce

 

I am a little confused. Did you mean "pop is 160/1600 = 10%"? And is the 80% boost an arbitrary number or is it based on the game pop/world pop cap?
 
In a 4v4, a rusher may have better stats than the 1-2 people they rushed for much of the game, but the other 5-6 players likely have better stats for the majority of the gametime.  It is a likely scenario that the longer the game goes, the better the score for the non-rushers (at least on the winning team).  Can the population percentage boost account for this in a way that we think correctly reflects effectiveness of the rusher?
 
Also, as the losing team's population diminishes, the winning team's kills will become increasingly more valuable in the score relative to when most players have full pop.  Such a boost would be beyond what is intended.
 
I apologize if this is all obvious, but I submit that:
 
One of the reasons early kills are effective is that the player who loses a unit loses that unit's economic and military productivity for the rest of the game. So the value of that kill continues to increase the longer the game goes on (or, to be more precise, until the game ends or that player reaches their pop cap, whichever comes first).
I don't think that ratings that don't consider time of kill compared with game duration (or time until the affected player's pop cap is reached) can capture the value of rushes.
 
Since this mod is based on averaging scores of different frames, it may not be well suited for this.  Is that right @Mentula?
 
Actually, if we consider the value of the kill in terms of the productivity it denies the other player, there may even be a rough way to consider military and economic score as different aspects of a single dimension.
 
Admittedly, this method would not account for the value of rushes which do not kill units but which cause a lot of idle time for the target of the rush.
1 hour ago, Philip the Swaggerless said:

"pop is 160/1600 = 10%"

Haha, the math must not have been mathing that day.

But yeah, the idea was to add a multiplier for kills in early games based on global population.

If the current game pop is 100 and the game population cap is 800, then that military score receives an 8x boost. This way it affects everybody equally.

Now, that seemed overkill, so there had to be additional parameters to get a more reasonable multiplier. Then you can make further changes to more heavily affect the low pop scenarios, with higher pop scenarios approaching a multiplier of 1.

Basically, the statistic becomes more complicated than the value it is supposed to represent, so plenty of people agreed that it's better to just keep simple statistics.
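
For readers who want to play with the idea, the population multiplier described above can be sketched in a few lines of Python. This is only an illustration: the function name and the `damping` exponent are assumptions (the post only says that "additional parameters" were needed to tame the raw ratio), and nothing like this is part of LocalRatings.

```python
def rush_multiplier(current_pop, pop_cap, damping=1.0):
    """Boost for military score based on how far the game is from its pop cap.

    damping=1.0 gives the raw ratio pop_cap / current_pop.
    Values between 0 and 1 flatten the boost (hypothetical extra parameter).
    """
    if current_pop <= 0 or pop_cap <= 0:
        raise ValueError("populations must be positive")
    # Clamp at 1 so that kills at full population are never penalised.
    return max(1.0, (pop_cap / current_pop) ** damping)


print(rush_multiplier(100, 800))        # raw ratio from the example above
print(rush_multiplier(160, 1600))       # the 160/1600 case discussed earlier
print(rush_multiplier(100, 800, 0.5))   # damped version, square root of the ratio
print(rush_multiplier(800, 800))        # full population, no boost
```

With `damping=1.0` the 100/800 case gives the 8x boost mentioned in the post; lowering `damping` pulls low-pop multipliers down while still approaching 1 at high pop.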

10 hours ago, Philip the Swaggerless said:

Since this mod is based on averaging scores of different frames, it may not be well suited for this.  Is that right @Mentula?

This is what I believe too, and I agree with all the thoughts you shared @Philip the Swaggerless.

A solution that could mitigate this problem is to change the mechanics of the rating system to adapt to this particular case, but I am not in favor of bending data to fit expectations. You know what they say: “If you torture data long enough, it will confess to anything”.

Following the reasoning above, I really think rushing is a particular aspect of the game: in a 1v1, for example, LocalRatings will be very accurate in rewarding effective rushes. The fewer the players, the more rushes are valued. Also, the fact that we regularly play with 200 pop (and not, for example, 50 or 100) and with 300 res (and not 500 or 1000) is one reason why rushes are not very well taken into consideration.

In other words, different conditions will be evaluated differently and the effectiveness of a rush depends very much on the game design (which might change in future versions), so I'd rather keep the rating system as abstract as possible, removing all possible sources of arbitrariness.

 

As a side note, some combinations of weights are suitable for rushes more than others: for example number of units killed + exploration will be more significant for rushers [reason: rushers have a high relative value on these parameters compared to other players during the largest part of the game]. However, I acknowledge this is far from being satisfactory, as all the other aspects of the game (economy in primis) will be ignored.
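
To make the point about weight combinations concrete, here is a small Python sketch. It is not the formula LocalRatings uses; it only assumes that each player has some per-statistic value relative to the other players, and combines those values with user-chosen weights. The statistic names and the numbers are made up for illustration.

```python
def weighted_score(stats, weights):
    """Weighted average of per-statistic values; weights may be zero or negative."""
    total_weight = sum(abs(w) for w in weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(w * stats.get(name, 0.0) for name, w in weights.items()) / total_weight


# Hypothetical relative values (positive = above the other players).
rusher = {"units_killed": 0.6, "exploration": 0.4, "resources_gathered": -0.3}
boomer = {"units_killed": -0.2, "exploration": -0.1, "resources_gathered": 0.5}

rush_weights = {"units_killed": 1.0, "exploration": 1.0}
eco_weights = {"resources_gathered": 1.0}

print(weighted_score(rusher, rush_weights), weighted_score(boomer, rush_weights))
print(weighted_score(rusher, eco_weights), weighted_score(boomer, eco_weights))
```

With the rush-oriented weights the rusher comes out ahead, and with the economy-only weights the order flips, which is the trade-off described above.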

  • 4 weeks later...

New LocalRatings version!

I have been working hard on the new release, and I'm happy to announce it is ready for download! It comes in two versions: v0.26.2 and v0.27.1, one for Alpha26 and the other for Alpha27, respectively. The two versions are the same in terms of features.

Download for Alpha 26: LocalRatings 0.26.2 (zip) | LocalRatings 0.26.2 (pyromod)

Download for Alpha 27: LocalRatings 0.27.1 (zip) | LocalRatings 0.27.1 (pyromod)

 

What's new (with pictures)

1. Rating distribution charts

A chart with 100 bins (vertical bars). The yellow bin is the one of the player selected from the player list:

ch1.png.720aa8c9d635143327f60a8ba1eacf32.png

A chart with 10 bins and different colors:

ch2.png.8f9a4b0ffe50c42e20585c7d8d6fad4f.png

When hovering on a bin with the mouse, more information on players in that tier shows up:

tierinfo.png.31c9e43044649094c1ad89c81fcf7978.png

Distribution chart options:

chartoptions.png.be09d8438c033d947c57fb4a8bbe5e6b.png

2. Aliases (thanks Acero for suggesting this very useful feature)

See also "Treat players with multiple accounts as one" and "What is the 'primary identity' of a player with aliases?"

The Aliases menu, where groups of multiple identities of the same player can be created, deleted or edited:

alias.png.0892db03a9ffb3862832c63055fc1cab.png

3. Toggle LocalRatings as a dialog from the lobby, the game setup or in-game

The LocalRatings page can be toggled via a hotkey, unassigned by default:

hotkey.png.65f6d9ed06bc9019b9e5d0d5e3ae0da0.png

The LocalRatings page can also be opened from the in-game menu button:

in-game.png.716111ef034c12f47a1267356e23e0b7.png

The menu button can be enabled/disabled from the Options menu:

buttonoption.png.70896df0264fd4fc78193cb19c408ff1.png

4. Mod filter

The Mod Filters options menu. The list of mods is taken from your (the mod user's) replays. If there are many mods, the list is distributed across multiple pages:

modfilter.png.ac53fd5b7b37668ce6ccc241500ade80.png

5. Restyled player profile area

The new player profile area that shows up when selecting a player from the player list:

playerprofile.png.24406c821590d9d18ddf8b85dd234ee3.png

6. Personalized (rating/matches) format

A custom format can be set from the Options menu. For example, the following format:

formatoption.png.a0802e94bbe5c08fe178060f425bbbcc.png

will result in the Lobby page as:

formatlobby.png.74c1f85e319d50a2b272815e285ac7ee.png

and in the Game Setup page as:

formatgamesetup.png.e6400cc267b20b82f3caeb70e0a6727d.png

7. Games with AI players included

An example of an AI player (Ashoka the Great) showing up in the player list:

ashoka.png.c242a34f99ac9f2e099576bccae9cd0c.png

8. Optional vertical marker for the evolution chart

The Evolution chart can be navigated with the help of a vertical marker. Its color can be changed from the Options menu:

verticalmarker.png.e7c6e03af52aa23c7fea69fc2567e9e2.png

9. Negative/unlimited weights

An example of a negative weight and a big weight:

negativeweights.png.0df159e8017543bc6c63a726b1e75e38.png

  • 2 months later...

@Mentula This is awesome. 

Just a question (without looking through your code which would probably provide me with an answer to this question and if so sorry for time-wasting) how do you obtain the stat data?  Do you do a headless replay and collect the stats?  Do you just do a replay and use the stats collected?

Curious and it would help me if you'd share this information, if possible.

55 minutes ago, Dizaka said:

@Mentula This is awesome. 

Just a question (without looking through your code which would probably provide me with an answer to this question and if so sorry for time-wasting) how do you obtain the stat data?  Do you do a headless replay and collect the stats?  Do you just do a replay and use the stats collected?

Curious and it would help me if you'd share this information, if possible.

Do you know that you can see the stats of a replay without starting the replay? I mean from within the game.

Hi @Dizaka,

each replay has an associated folder where data is stored (see this page to locate replays on your system: GameDataPaths). A replay folder contains two files: commands.txt and metadata.json.

The first file, commands.txt, is not used by LocalRatings: it contains the sequence of actions performed by each player in the game and its main purpose is to sequentially execute those actions when a replay is played.

The metadata.json file is the interesting one from the LocalRatings perspective. It contains the stats of the game, taken at certain intervals of time. You may want to look into this file to extract data from a replay.

 

Nicely enough, the engine exposes two methods that facilitate retrieving replay data:

Engine.GetReplays()
Engine.GetReplayMetadata(replaydir)

The first of the two commands above yields all replays along with minimal metadata. To get the full metadata (including all stats) from a given replay, the second call will do the job.
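
Outside the game, the same files can also be read directly from disk. The Python sketch below assumes only what is stated above, namely that each replay folder contains a metadata.json file; the function name is hypothetical, and no assumption is made about the keys stored inside the file. It searches nested folders, so the exact layout of the replays directory does not matter.

```python
import json
from pathlib import Path


def load_replay_metadata(replays_root):
    """Yield (replay_folder_name, parsed_metadata) for every metadata.json found."""
    for meta_file in sorted(Path(replays_root).rglob("metadata.json")):
        try:
            with meta_file.open(encoding="utf-8") as fh:
                yield meta_file.parent.name, json.load(fh)
        except (OSError, json.JSONDecodeError):
            # Skip unreadable or truncated files instead of aborting the scan.
            continue
```

A typical use would be `for name, meta in load_replay_metadata(path_to_replays): print(name, sorted(meta))`, which lists the top-level keys of each replay so you can see what is available before extracting stats.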

 

If you are curious to dig into the LocalRatings code to see how the mod handles data, I can suggest looking into Replay~LocalRatings.js and Sequences~Localratings.js, although some additional files are probably needed to grasp the full picture. These two files contain classes responsible for handling metadata and stats (respectively) of a given replay and storing relevant information.

Cheers

  • 1 year later...

I think this mod is highly ambitious. There is a strong desire for what the mod attempts, but currently it's not accurate.

Adding to this from another thread:

12 hours ago, ffm2 said:

If you have a working algorithm, maybe you can test it by modding the LocalRatings mod. I think this mod currently does not rate well and favors certain styles, and not just because there are too few games. ValihrAnt is rated 8.18 on replay-pallas.wildfiregames.ovh. Players with a higher rating than him: RangerK, FriedrichDerGrosse, Meister_Augustus, breakfastburrito_007, Atrik_III. All have over 10 games.

I checked how well all past games in my database are predicted by the players' current local rating (lr): overall, the team with the higher sum of lr is the winner in 59.3% of cases (so it is better than random guessing). Of the 20 games I have with ValihrAnt, 70% were predicted wrongly by the teams' lr sums (so worse than random guessing). Btw., my lr is -20.64, so maybe I'm biased. If I clean up the statistics I can release more, but I think scrolling through replay-pallas.wildfiregames.ovh with players you know in mind should suffice to get a feeling for the accuracy of LocalRatings.
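
The check described above (does the team with the higher rating sum win?) can be sketched as follows. This is not the attached script, just a minimal Python illustration; the data layout (a list of winner/loser team pairs and a name-to-rating dictionary) and the function name are assumptions, and the sample data is invented.

```python
def prediction_accuracy(games, ratings):
    """Fraction of games in which the team with the higher rating sum won.

    games:   iterable of (winning_team, losing_team), each a list of player names
    ratings: dict mapping player name -> rating
    Games with an unrated player or equal team sums are skipped.
    """
    correct = total = 0
    for winners, losers in games:
        if not all(p in ratings for p in winners + losers):
            continue
        win_sum = sum(ratings[p] for p in winners)
        lose_sum = sum(ratings[p] for p in losers)
        if win_sum == lose_sum:
            continue
        total += 1
        correct += win_sum > lose_sum
    return correct / total if total else float("nan")


# Invented sample data, only to show the call.
sample_ratings = {"a": 5.0, "b": -2.0, "c": 1.0, "d": 0.0}
sample_games = [(["a", "b"], ["c", "d"]), (["c", "d"], ["a", "b"]), (["a"], ["d"])]
print(prediction_accuracy(sample_games, sample_ratings))
```

A value of 0.5 corresponds to random guessing, which is the baseline the 59.3% figure is compared against.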

Here is the script with which I evaluated the results. Maybe it helps to develop a better working algorithm.

The script is to be run in .local/share/0ad/replays

localratings_statistics.xlsx rating_localratings.py

Edited by ffm2
Of course, at the moment I can only express my disagreement with this rating and don't have much else to contribute.

I also played with the score weights a bit and found that using resources used instead of resources gathered brings better results for my database. But one has to split the data into a training set and a test set first for adjustments like that (which I didn't do in a quick attempt).
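
A minimal Python sketch of such a split, assuming the games are available as a simple list (the function name and the 20% default are arbitrary choices for illustration): weights would be tuned on the training part only, and the prediction accuracy reported on the held-out test part.

```python
import random


def split_games(games, test_fraction=0.2, seed=0):
    """Shuffle the games reproducibly and return (train, test) lists."""
    games = list(games)
    random.Random(seed).shuffle(games)
    n_test = int(len(games) * test_fraction)
    return games[n_test:], games[:n_test]


train, test = split_games(range(10), test_fraction=0.2, seed=1)
print(len(train), len(test))
```

A chronological split (older games for training, newer ones for testing) would be a reasonable alternative, since ratings are meant to predict future games.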

For improvements, one might analyze the games of ValihrAnt (or of other players who are obviously wrongly rated) and see where his strength can be seen in the data. And maybe even change the statistics that are logged in 0 A.D.
