
CPU Lag


Acero


One of the biggest sources of frustration for team game (TG) players is the amount of lag they experience in every game, especially in the late game, when populations are bigger and battles involve larger armies.

This post proposes two changes to the game:

  1. Adding an in-game player CPU lag indicator (similar to the in-game player network lag warning we already have) so that players and hosts can diagnose lag in their games when network problems are not the bottleneck (which is the case in most TGs nowadays). Sadly, right now, players are kept in the dark about the cause of lag in their TGs.
  2. Reverting the 0AD simulation to 2 turns per second (as was the case in alpha 23) instead of 5 turns per second (as now) in order to increase the performance of the game by up to 2.5 times compared to what we have now.

As I am not a developer, please tell me if these ideas make sense to you and give me your feedback on the lag issue of 0AD.

Lag complaints

The complaints about 0AD lag fall into three categories:

1. Slow game

The game lasts way longer than it should. It is not uncommon for a 4v4 TG with 25 minutes of game time to last 50 or 60 minutes of real time (just because of in-game lag, not counting pauses), and this overall sluggishness causes a lot of frustration. Players wonder: why should we be forced to play at 0.5x or 0.4x speed when the game was set to 1x?

2. Lag spikes

During key moments of the game the lag spikes to ridiculous levels, especially in late game and in big battles, when the game can freeze for up to a few minutes until enough units have died. During these lag spikes all sense of movement is lost, and the battleground becomes a slideshow.

3. Frozen user interface (for players with slow CPUs)

Players with slow CPUs experience an additional problem. Not only do the units in the game run slower and come to a standstill in big battles, but the whole 0AD user interface freezes during lag spikes. While players with fast PCs can still experience a fluid user interface, for players with slower CPUs the simple act of writing a team chat message, moving the camera or selecting any entity in the game becomes almost impossible during strong lag spikes, and very difficult during moderate ones.

CPU lag vs Network lag

Whenever a player has a ping above 400 ms, the game shows that player's name and ping in milliseconds in the top-right part of the screen, informing everyone that the connection is lagging. This type of lag is easy for everybody to understand and identify.

What confuses most players is why there can be so much lag in a game when no lag warning is shown to them.

This lag is caused by someone's computer being too slow to keep pace with the speed of the game and, as we know, the game will run at the speed of the slowest PC, which becomes the bottleneck in terms of game speed.

In this post I will use "CPU lag" to refer to the lag caused by computer performance bottlenecks as opposed to network bottlenecks. While CPU lag is affected by the performance of both the CPU and the GPU of someone's computer, the CPU aspect is almost always more problematic than the GPU aspect, as will become clear below.

A personal experience

In the days of alpha 23 my PC was perfectly capable of playing regular 4v4 TGs. Sure, we got some late game lag and battle lag spikes back in the day, but my PC felt perfectly adequate for 0AD back then, and the experience was decent. Pathfinding for big armies in forests or at choke points like gates was a big problem, but thankfully it was not that common to encounter (maybe it was really nasty in 1 out of 10 games).

I stopped playing for a while, but when I came back to alpha 25 the lag spikes and overall feeling of the game had become horrendous in almost every late game 4v4 TG. Clearly my PC was now obsolete to play the new 0AD.

I replaced the graphics card of my PC with a modern one, but the frame rate and responsiveness did not improve. A computer perfectly capable of running 'Red Dead Redemption 2' at max settings and 60 fps was barely getting 1 fps in big battles in 0AD.

Then I talked to several technically oriented players, and the consensus among those who knew was that CPU performance was the bottleneck in 0AD TGs. No matter how good your GPU is, if your CPU is old or slow (especially in single-core performance) your experience will be horrible.

Modern games like 'Red Dead Redemption 2' need beefy GPUs but can work well on older CPUs by having more optimized code than 0AD and by making use of multiple CPU cores. 0AD is an old game that was designed before multicore CPUs were common.

The simulation and the problem of turn rate

While a slow GPU can cause a low frame rate, you can always lower the graphics settings to increase performance. But in the case of CPU lag there is nothing you can tweak, as 0AD requires a non-negotiable amount of single-core computing power every turn in order to update what is known as "the simulation".

The simulation involves all the rules of the game and the inputs of the players in order to calculate what the next state of the game should be. Due to the way 0AD is designed, this is a linear process that can only be done by a single core of your computer.

As I started talking to players that are also devs I came to the concept of "turn", a step which involves the calculations that take place in order to move the simulation forward. Back in the days of alpha 23 there would be 2 turns per second, which means that, for every second, the simulation would be updated twice. This involves taking into account the user inputs and applying the rules of the game to each and every unit and entity in the game in order to update them to the new state.

Then, in alpha 24, the original rate of 2 turns per second was increased to 5 turns per second.

Correct me if I am wrong, but this meant that alpha 24 became 2.5 times more demanding on that critical single core of your CPU. And since CPU lag is the most important cause of lag in large TGs, it's pretty easy to deduce that this change might have been the single worst design decision in 0AD history in terms of making large TGs less enjoyable.
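
To make the arithmetic behind that 2.5 figure explicit, here is a small Python sketch. The 500 ms and 200 ms turn lengths are the ones discussed in this thread; the function name is made up for illustration, and the result is only an upper bound that assumes every turn costs the same no matter how long it is.

```python
def turns_per_second(turn_length_ms: int) -> float:
    """Number of simulation turns computed per second of game time."""
    return 1000 / turn_length_ms


a23_rate = turns_per_second(500)      # alpha 23: 500 ms turns
current_rate = turns_per_second(200)  # now: 200 ms turns

# Ratio of turn rates; an upper bound on the extra per-second workload,
# assuming (hypothetically) that each turn costs the same.
ratio = current_rate / a23_rate
print(a23_rate, current_rate, ratio)  # 2.0 5.0 2.5
```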

What is the optimal turn rate

How many turns should be calculated every second? What is the optimal amount?

The first thing to notice is that you can input as many commands per second as you want in your game, irrespective of the number of turns per second. For example, suppose you want to snipe and you click 5 times very quickly while holding the shift key. Those 5 attack orders would all be queued and sent with the next turn. If you want to cheat with an autoclick mouse you could send 30 clicks in one turn. 0AD will not complain. The number of actions you can input in a given amount of time is not limited by the turn rate.

Another thing to keep in mind is that around 1 second will pass between issuing any command and your units starting to execute it. Apparently it works this way so that every computer executes the commands issued by every player at the same time. So keep in mind that an unavoidable latency of about 1 second for every command will always be present, again irrespective of the number of turns per second.

So, how many turns do we need per second?

On the one hand we know that the more turns or calculations per second, the slower the game will run in late game TGs, which nobody wants.

On the other hand, what is the advantage of increasing the turn rate in terms of gameplay, say from 2 to 5?

As I see it, there is very little. Back in the days of a23 we could observe pretty good micro from top players, zigzagging their way out of javelins thrown at them, and let's not forget the nasty hero dances dodging all arrows that were so common back then. And all that was done apparently with 2 turns per second.

Do we need to update units 5 times per second now? Let's remember that one second will pass anyway between any command issued and the unit acting on it. Moreover, lately the units cannot change direction instantly as before, because acceleration was added to their physics, so they became more sluggish in response. And alpha 23 with 500ms turns felt more responsive due to the lack of acceleration physics.

In my opinion, the added lag caused by a higher turn rate defeats the purpose of the increased "snappiness" the developer probably intended, especially in large TGs. And the 1 second latency for every command limits the amount of perceived "snappiness" we will ever have in 0AD, no matter how fast the turns are computed.

Turn rate proposal

The simplest option would be to revert to the tried and true 2 turns per second we had in alpha 23. This would probably cut CPU lag by more than half.

A better solution would be to allow the host of the game to select the turn rate he wants.

Another solution would be to automatically adjust the turn rate depending on the number of players, for example:
     1v1 - 5 turns per second (like current alpha)
     2v2 - 4 turns per second
     3v3 - 3 turns per second
     4v4 - 2 turns per second (like alpha 23)
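
As a rough illustration, the mapping above could be written as a simple lookup. This is only a sketch: the table values are the ones proposed in this post, while the function name and the choice to treat anything larger than 4v4 like 4v4 are my own assumptions.

```python
# Turn rates (turns per second) by players per team, as proposed above.
TURN_RATE_BY_TEAM_SIZE = {1: 5, 2: 4, 3: 3, 4: 2}


def turn_length_ms(players_per_team: int) -> int:
    """Turn length in ms for a given team size (hypothetical helper)."""
    # Clamp to the table's range: anything above 4v4 uses the 4v4 value.
    size = max(1, min(players_per_team, 4))
    return 1000 // TURN_RATE_BY_TEAM_SIZE[size]


for size in (1, 2, 3, 4):
    print(f"{size}v{size}: {turn_length_ms(size)} ms per turn")
```

Note that 3 turns per second does not divide 1000 ms evenly, so the sketch rounds down to 333 ms.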
   
Or set the turn rate according to overall population cap.

Or dynamically adjust the turn rate according to the slowest player's CPU performance during the game. Early in the game it could use a fast turn rate and slowly lower it as the game progresses.

In any case, a hardcoded return to a turn rate of 2 would be a huge improvement in performance and probably a trivial change to implement.

CPU lag warning

Besides this, I want to propose that a CPU lag indicator or warning be added to 0AD. This warning should be similar to the network lag warning we already have.

For example:

zniper14  0.4x

This would indicate that in the last minute the CPU of the player "zniper14" is only capable of running the game at 40% of the intended speed. In big lag spikes we will probably see numbers approaching 0.01x for some players.

This number should be calculated by taking 10 seconds of game time and dividing it by the number of seconds of real time needed to compute all the turns in those 10 seconds.
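
A minimal sketch of how such a number could be computed, in Python. It takes the speed factor as game time divided by real time, so that 10 seconds of game time computed in 25 real seconds gives the 0.4x of the example. The function names are hypothetical, and capping the value at 1.0 is my own assumption (a fast client cannot run ahead of the game).

```python
def speed_factor(game_seconds: float, real_seconds: float) -> float:
    """Fraction of the intended speed: game time simulated per unit of real time."""
    if real_seconds <= 0:
        raise ValueError("real_seconds must be positive")
    # A client that keeps up is capped at 1.0; it cannot run ahead of the game.
    return min(1.0, game_seconds / real_seconds)


def format_warning(player: str, factor: float) -> str:
    """Render the warning in the same style as the example above."""
    return f"{player}  {factor:.1f}x"


# 10 seconds of game time took 25 seconds of real time to compute.
print(format_warning("zniper14", speed_factor(10, 25)))  # zniper14  0.4x
```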

This is obviously an example and other ideas could be thought of.

 

game_lag.png

Edited by Acero

I won't pretend to understand all the code, but from the constant names it seems that multiplayer games already run on 500 ms turn lengths.

static const int DEFAULT_TURN_LENGTH_MP = 500; //Multiplayer turn length is ms.
static const int DEFAULT_TURN_LENGTH_SP = 200; //Singleplayer

https://github.com/0ad/0ad/commit/cdc07b66ea53b769265a86a4fb918b2816e4fc59

It would be nice to have the GUI display which player is throttling the game.


If you open any replay file (commands.txt) you will find that the duration of each turn is stated on each line where a new turn starts, like "turn 0 200".

The 200 stands for 200 ms. In a23 you would see "turn 0 500".

The example below comes from a multiplayer 4v4 TG on this alpha:

turn 0 200
end
hash b6f05acff08e123cde8c1c43ec19627c

turn 1 200
end
hash-quick 2c00bb0c9b7ad23bd7f74ea4a261fa82
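
If anyone wants to check their own replays, here is a small Python sketch that counts the turn lengths found in such a file. It only assumes the "turn <index> <length in ms>" line format shown above; the function name is made up, and the sample is just the excerpt from this post.

```python
from collections import Counter


def turn_lengths(lines):
    """Count turn lengths (ms) from lines of the form 'turn <index> <length_ms>'."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "turn" and parts[2].isdigit():
            counts[int(parts[2])] += 1
    return counts


sample = """turn 0 200
end
hash b6f05acff08e123cde8c1c43ec19627c

turn 1 200
end
hash-quick 2c00bb0c9b7ad23bd7f74ea4a261fa82
""".splitlines()

print(turn_lengths(sample))  # Counter({200: 2})
```

For a real replay, pass the open file object instead of `sample`, since iterating over a file yields its lines.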

 

Edited by Acero

When that change was introduced, it was argued that it wouldn't decrease the performance, and that it would possibly even increase it (maybe even with benchmarks to confirm).

Keep in mind that not everything is recalculated every turn. I think calculations use the elapsed time to decide if they trigger, or something like that. So I believe the more expensive calculations do not trigger more often even with a lower turn time.

Personally I recall the lag issue was still very important even in a23 and below. But if you feel there is a change, there are so many other things that can also explain it:

- Player skill has increased -> fully boomed faster -> huge fights happen sooner and more frequently -> you feel the lag

- Economy upgrades are more effective -> all same as above AND you can afford to leave fewer units at home when fully boomed (so more in the expensive fight)

- Meta: rush less viable -> people don't do it -> instead of having maybe 6 players simultaneously fully boomed, you have all eight which reach the same situation at almost the same timestamp, and every fight will happen simultaneously.

Edited by Feldfeld

1 hour ago, Player of 0AD said:

Alpha27 is supposed to cause less lag because of the new Vulkan stuff. It's a major change in the code of the game.

Yes. OpenGL seems to be heavier on the CPU than Vulkan, so replacing it was a major improvement. I hope we will see A27 this year.


Thanks real_tabasco_sauce for pointing to the reasons for the change back to 200ms.

On the one hand it would be nice to see real data about the effect on performance by the change in turn rate. I haven't seen the benchmarks Feldfeld mentions. If someone can point to them that would be nice. My own experience says it was bad for large TGs in general.

On the other hand, what could be a better benchmark than allowing the community itself to change and test this in real TGs?

To make it possible, the host should be able to tweak the turn rate (maybe a settings option). Also the game should report the real speed the game is running at (as a moving average for example). Lastly the player responsible for slowing down the game should also be identified.

This way, several games that include this "slow CPU player" could be run and conclusions drawn after a few days or weeks of testing.

Testing over time on real TGs is way more solid than a few "laboratory tests" we could come up with. Also, way more brains could be involved in observing the advantages and disadvantages of higher or lower turn rate values. Remember that performance issues can happen for different reasons, as unit placement and terrain will be different in each game. It's not a one-dimensional problem.

For example, currently we know that if cronelius is playing a 4v4 TG, and the game makes it to the late game, the actual lobby time will be between 2 and 2.5 times the game time. In other words, the game will run at between 0.4 and 0.5 times the normal speed. That's more or less a rule. If we could already experiment with the turn rate, it would be possible to find out almost overnight which setting makes more sense in these types of games. On the other hand, in 1v1s it might make more sense to use a different setting.

Collision_size.jpg

Edited by Acero

Regarding the performance improvement from the new Vulkan renderer, we are all eagerly awaiting it, and I hope the new alpha sees the light soon.

But I think this proposal does not get invalidated by this. I am sure we can all agree that 0AD will never run fast enough in the foreseeable future. We are already orders of magnitude away from optimal performance in TGs, especially in big battles where a single turn can freeze for up to 10 seconds, when it should take 200 milliseconds. That's 50 times slower.

And if, by some miracle, surplus performance was available, we can always find new ways to make use of it, for example:

  • We could revert the collision sizes to previous normal values to make units behave in more realistic ways instead of lumping together like they do now.
  • We could play larger TGs, like 5v5, 6v6, etc.

So a simple tweak that could potentially increase performance by 2x or more should not be ignored, in my opinion.

 

Edited by Acero

Keeping in mind I am by no means experienced here, I don't think you can expect a 1:1 effect on performance by changing the turn rate. So a 2.5x slower turn rate does not necessarily mean a 2.5x improvement in performance.

Perhaps some tweaking of the turn rate would be fine (has 250 ever been done?), but I don't recommend going all the way to 2 turns per second.

Also, the situation with collision size and pathing is interesting: in https://code.wildfiregames.com/D4970, we went back and forth with different values for pushing and found that the pushing situation is directly in conflict with pathing: preventing overlap directly makes pathing more awkward and making pathing less awkward requires overlap.

I would totally support something like https://code.wildfiregames.com/D5037 which would allow for less overlap, but it would introduce some extra performance cost.


Just to summarize, after reading the previous discussion, I would be in favor of:

  1. Allow the host to change the turn rate (maybe a settings option).
  2. Make the game report the real speed the game is running at (as a moving average for example).
  3. Make the game identify the player responsible for slowing down the game to its current speed.
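
To give an idea of what points 2 and 3 could look like, here is a rough Python sketch, not how 0AD actually works. It assumes each client could somehow report how many milliseconds of real time it needed per turn; the class, its methods and the player names are all hypothetical.

```python
from collections import deque


class SpeedMonitor:
    """Tracks per-player real time spent per turn over a sliding window (hypothetical)."""

    def __init__(self, turn_length_ms: int, window: int = 50):
        self.turn_length_ms = turn_length_ms
        self.window = window
        self.samples = {}  # player name -> deque of compute times in ms

    def record(self, player: str, compute_ms: float) -> None:
        """Store one reported turn time; old samples fall out of the window."""
        self.samples.setdefault(player, deque(maxlen=self.window)).append(compute_ms)

    def speed(self, player: str) -> float:
        """Moving-average speed factor; 1.0 means the player keeps up."""
        times = self.samples[player]
        average = sum(times) / len(times)
        return min(1.0, self.turn_length_ms / average)

    def slowest(self):
        """Player with the lowest speed factor, i.e. the likely bottleneck."""
        name = min(self.samples, key=self.speed)
        return name, self.speed(name)


monitor = SpeedMonitor(turn_length_ms=200)
for ms in (150, 180):
    monitor.record("fast_player", ms)
for ms in (400, 600):
    monitor.record("slow_player", ms)

print(monitor.slowest())  # ('slow_player', 0.4)
```

A sliding window is just one way to get a moving average; the window size of 50 turns is an arbitrary choice for the example.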

Just with those tools we could study the problem of setting an optimal turn rate in real games played in the lobby instead of speculating. I suspect the optimal turn rate will not be a single number set in stone for every type of game.

There are plenty of smart players and hosts that can give feedback about how turn rate impacts games after a while of testing. I would not disregard that help from the community.

 

Edited by Acero

For the first part: Yes it would be good to implement that. There is a ticket: #5323

For the second part: There is also a ticket #3752 :D

Most computation isn't executed every turn (or at a constant turn interval), like a builder deciding what to do when a field is placed.
Still, some computation is, like the AI deciding what to do. IIRC an AI only executes every 8th turn.
With more frequent turns the constant-turn-interval tasks could be spaced out more.
So the average computation is roughly independent of the turn length.

Engine things (rendering, network...) aren't done per turn but per frame.
In a CPU-intensive game there is a turn every frame. So engine things could be seen as constant-turn-interval tasks. (This entirely refutes my previous argument :sweatdrop:)
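
The two kinds of work can be put into a toy model to see how the turn length would matter. This is only an illustrative Python sketch: the split into "time-based" and "per-turn" work follows the reasoning above, but the function names and all the millisecond figures are made up, not measurements.

```python
def cpu_ms_per_game_second(turn_length_ms, time_based_ms, per_turn_ms):
    """Toy model: work tied to elapsed game time plus work repeated every turn."""
    turns_per_second = 1000 / turn_length_ms
    return time_based_ms + per_turn_ms * turns_per_second


def speedup(time_based_ms, per_turn_ms, old_length_ms=200, new_length_ms=500):
    """How much cheaper a game second gets when turns go from 200 ms to 500 ms."""
    old = cpu_ms_per_game_second(old_length_ms, time_based_ms, per_turn_ms)
    new = cpu_ms_per_game_second(new_length_ms, time_based_ms, per_turn_ms)
    return old / new


print(speedup(time_based_ms=0, per_turn_ms=100))    # 2.5: all work is per turn
print(speedup(time_based_ms=600, per_turn_ms=0))    # 1.0: no work is per turn
print(speedup(time_based_ms=600, per_turn_ms=100))  # 1.375: a mix of both
```

So in this model the gain from longer turns depends entirely on how much of the work is really repeated every turn, which is the part nobody in this thread has measured yet.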

 

There isn't one magical thing which makes the game run fast (in a single thread (multithreading really is magical (that's a joke magic doesn't exist :LOL:)))

If you only care about game speed (your complaint 1), you want to do as few constant-turn-interval tasks as possible. Thus longer turns.
If you only care about a responsive UI (your complaint 3), you want to do as few sim-things per frame as possible. Thus shorter turns.

About your complaint 2: I think in such a situation the fast clients repeatedly have to wait for the slowest client. When a fast client finishes its turn, it has to wait for the slow client to finish the turn. During this time the fast client shows an interpolation (IIRC 5 frames) to the user. When the slowest client is really slow, those 5 frames aren't enough. A solution (which just came to my mind, it isn't elaborate) would be to space those "interpolation-steps" out, like one interpolation-step every second frame. The ratio of interpolation-steps to frames should adapt dynamically to the expected speed of the slowest client. Maybe the interpolation-step count could also be dynamic. (I'm just thinking out loud at this point)

 

We head in the direction of decreasing turn length even more:

 

Hosts can already specify the turn length by typing `Engine.SetTurnLength(100);` in to the js console. (yes the feature is hidden)
I'd like to see the results of your experiments :).


Just a quick note: another big performance improvement in A27 will come from the removal of the Spectre and Meltdown mitigations in SpiderMonkey, which considerably slow down JS execution.

wraitii did a few other improvements IIRC, but I'm not sure which got in and which got reverted. The one about messages could potentially be good if it worked ^^"


Regarding these charts:

Can someone explain what the x-axis stands for in both left and right charts? The screenshot seems to be cropped just where it is written. I can't understand the meaning of these charts without it. If someone can explain it would be great.

 

Quote

Hosts can already specify the turn length by typing `Engine.SetTurnLength(100);` in to the js console. (yes the feature is hidden)
I'd like to see the results of your experiments :).

If the turn rate can be changed in JS, then maybe @Atrik's idea of doing a quick mod to test it could be worthwhile. The only question would be whether there is a way for the host to communicate the selected turn rate every time a new game starts. Otherwise the mod would have a hard-coded turn rate, which would not allow us to test different values in different games.


16 minutes ago, Acero said:
Quote

Hosts can already specify the turn length by typing `Engine.SetTurnLength(100);` in to the js console. (yes the feature is hidden)
I'd like to see the results of your experiments :).

If the turn rate can be changed in JS, then maybe @Atrik's idea of doing a quick mod to test it could be worthwhile. The only question would be whether there is a way for the host to communicate the selected turn rate every time a new game starts. Otherwise the mod would have a hard-coded turn rate, which would not allow us to test different values in different games.

well looks like you wouldn't even need a mod to do this. You could do the following (and I'd be curious on the results): Set up a match with combat demo huge (in scenarios, demo maps) and choose different values with Engine.SetTurnLength(x);

@phosit are the backticks included when typing this in the js console?


Quote

Hosts can already specify the turn length by typing `Engine.SetTurnLength(100);` in to the js console. (yes the feature is hidden)
I'd like to see the results of your experiments :).

Can this command be entered during a 4v4 replay? It would be an interesting test to make for sure.

But, while testing turn length on a single computer might be interesting, I want to test it on actual multiplayer games from the lobby. Testing it just on my own CPU does not give us the whole picture probably.


Quote

well looks like you wouldn't even need a mod to do this. You could do the following (and I'd be curious on the results): Set up a match with combat demo huge (in scenarios, demo maps) and choose different values with Engine.SetTurnLength(x);

real_tabasco_sauce, I tried this command in the console. It fails in single-player mode, so I had to open 2 instances of 0AD, one as host and one as client. As soon as I entered the command on the host machine I got an OOS. And when I tried this command on the client it failed, telling me only hosts can enter this command.

So apparently both host and client should be patched via mod before the game starts.


19 hours ago, Acero said:

Can someone explain what the x-axis stands for in both left and right charts?

The x-axis of the right chart is time. The frames are ordered chronologically.
The right chart shows the frames ordered by time.

19 hours ago, real_tabasco_sauce said:

@phosit are the backticks included when typing this in the js console?

No

I only tested it in multiplayer with a single client, and that worked.
I also get OOS with multiple clients when entering the command during a game.
It worked when entering the command during the loadscreen.

In a replay the turnlength is the same as in the game it came from.


  • 4 weeks later...

I think we should also work on making 0AD visually appealing in parallel.
Otherwise, when you are done optimizing game performance and 0AD looks like an old game,
you will have to go back to the visual details.
Maybe we could attract 3D artists to work on 0AD,
and 0AD could give them certification for their work.

