One of the most important sources of frustration for team game (TG) players is the amount of lag they experience in every game, especially in late game, when populations are bigger and battles involve larger armies.
This post proposes two changes to the game:
1. Adding an in-game player CPU lag indicator (similar to the in-game player network lag warning we already have) so that players and hosts can diagnose lag in their games when network problems are not the bottleneck (which is the case in most TGs nowadays). Sadly, right now, players are kept in the dark about the cause of lag in their TG games.
2. Reverting the 0AD simulation to 2 turns per second (as was the case in alpha 23) instead of 5 turns per second (as now) in order to increase the performance of the game by up to 2.5 times compared to what we have now.
As I am not a developer, please tell me if these ideas make sense to you and give me your feedback on the lag issue of 0AD.
Lag complaints
The complaints about 0AD lag fall into three categories:
1. Slow game
The game lasts way longer than it should. It is not uncommon for a 4v4 TG of 25 minutes gametime to last 50 or 60 minutes of real time (just because of in-game lag and not accounting for pauses) and this overall sluggishness causes a lot of frustration. Players wonder: why should we be forced to play at 0.5x or 0.4x of the speed when the game setting was set at 1x?
2. Lag spikes
During key moments of the game the lag spikes to ridiculous levels, especially in late game and in big battles, when the game can freeze for up to a few minutes until enough units have died. During these lag spikes all sense of movement is lost, and the battleground becomes a slideshow.
3. Frozen user interface (for slow cpu players)
Players with slow CPUs experience an additional problem. Not only do the units in the game run slower and come to a standstill in big battles, but the whole 0AD user interface freezes during lag spikes. While players with fast PCs can still experience a fluid user interface, for players with slower CPUs the simple act of writing a team chat message, moving the camera or selecting any entity in the game becomes almost impossible during strong lag spikes, and very difficult during moderate lag spikes.
CPU lag vs Network lag
Whenever a player has a ping above 400ms the game shows his name and his ping in milliseconds on the top-right part of the screen, informing everyone that his connection is lagging. This type of lag is easy for everybody to understand and identify.
What keeps most players confused is why there can be so much lag in a game when no lag warning is shown to them.
This lag is caused by someone's computer being too slow to keep pace with the speed of the game and, as we know, the game will run at the speed of the slowest PC, which becomes the bottleneck in terms of game speed.
In this post I will refer to "CPU lag" as the lag caused by computer performance bottlenecks as opposed to network bottlenecks. While CPU lag will be affected by the performance of both the CPU and the GPU of someone's computer, the CPU aspect is almost always more problematic than the GPU aspect, as will become clear below.
A personal experience
In the days of alpha 23 my PC was perfectly capable of playing regular 4v4 TGs. Sure, we got some late game lag and battle lag spikes back in the day, but my PC felt perfectly adequate for 0AD back then, and the experience was decent. Pathfinding for big armies in forests or at choke points like gates was a big problem, but thankfully it was not that common to encounter (maybe it was really nasty in 1 out of 10 games).
I stopped playing for a while, but when I came back to alpha 25 the lag spikes and overall feeling of the game had become horrendous in almost every late game 4v4 TG. Clearly my PC was now obsolete to play the new 0AD.
I replaced the graphics card of my PC with a modern one, but the frame rate and responsiveness did not improve. A computer perfectly capable of running 'Red Dead Redemption 2' at max settings and 60 fps was barely getting 1 fps in big battles in 0AD.
Then I talked to several technically oriented players and the consensus among those who knew was that CPU performance was the bottleneck in 0AD TGs. No matter how good your GPU is, if your CPU is old or slow (especially in single core performance) your experience will be horrible.
Modern games like 'Red Dead Redemption 2' need beefy GPUs but can work well on older CPUs by having more optimized code than 0AD and by making use of multiple cores of your CPU. 0AD is an old game that was designed before multicore CPUs were common.
The simulation and the problem of turn rate
While a slow GPU can cause a low frame rate, you can always lower the graphics settings to increase performance. But in the case of CPU lag there is nothing you can tweak, as 0AD requires a non-negotiable amount of single core computing power every turn in order to update what is known as "the simulation".
The simulation involves all the rules of the game and the inputs of the players in order to calculate what the next state of the game should be. Due to the way 0AD is designed, this is a linear process that can only be done by a single core of your computer.
As I started talking to players that are also devs I came to the concept of "turn", a step which involves the calculations that take place in order to move the simulation forward. Back in the days of alpha 23 there would be 2 turns per second, which means that, for every second, the simulation would be updated twice. This involves taking into account the user inputs and applying the rules of the game to each and every unit and entity in the game in order to update them to the new state.
Then, in alpha 24, the original rate of 2 turns per second was increased to 5 turns per second.
Correct me if I am wrong, but this meant that alpha 24 became 2.5 times more demanding on that critical single core of your CPU. And since CPU lag is the most important cause of lag in large TGs, it's pretty easy to deduce that this change might have been the single worst design decision made in 0AD history in terms of making large TG games less enjoyable.
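To put rough numbers on that estimate, here is a small Python sketch. It assumes (as a simplification, not something measured in 0AD) that every turn costs about the same CPU time no matter how long the turn is. The function names are made up for illustration.

```python
def turn_length_ms(turns_per_second):
    """Length of one simulation turn in milliseconds."""
    return 1000 / turns_per_second


def relative_cpu_load(new_rate, old_rate):
    """How many times more turn updates per second the new rate needs.

    Assumption: every turn costs the same CPU time, whatever its length.
    """
    return new_rate / old_rate


print(turn_length_ms(2))        # 500.0 ms per turn (2 turns per second)
print(turn_length_ms(5))        # 200.0 ms per turn (5 turns per second)
print(relative_cpu_load(5, 2))  # 2.5
```

If part of the per-turn work actually scales with the length of the turn, the real factor would be lower than 2.5, so this figure is best read as an upper bound.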
What is the optimal turn rate
How many turns should be calculated every second? What is the optimal amount?
The first thing to notice is that you can input as many commands per second as you want in your game, irrespective of the number of turns per second. For example, suppose you want to snipe and you click 5 times very quickly while holding the shift key. Those 5 attack orders would all be queued and sent to the next turn. If you wanna cheat with an autoclick mouse you could send 30 clicks in one turn. 0AD will not complain. The number of actions you can input in a given amount of time is not limited by the turn rate.
Another thing to keep in mind is that around 1 second will pass between the moment you issue a command and the moment your units start executing it. Apparently it works this way so that every computer executes the commands issued by every player at the same time. So keep in mind that an unavoidable latency of 1 second for every command will always be present, again, irrespective of the number of turns per second.
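The two points above can be illustrated with a toy model in Python. This is not 0AD's actual code: the fixed 1000 ms delay, the `schedule` function and the batching rule are assumptions made only to mirror the behaviour described here.

```python
import math
from collections import defaultdict

COMMAND_DELAY_MS = 1000  # assumption: the "around 1 second" delay described above


def schedule(commands, turns_per_second):
    """Map each execution turn to the list of orders it will run.

    commands: list of (issue_time_ms, order) pairs.
    Every command issued during turn n is batched with the other commands
    from that turn and executed delay_turns turns later.
    """
    turn_ms = 1000 / turns_per_second
    delay_turns = math.ceil(COMMAND_DELAY_MS / turn_ms)
    batches = defaultdict(list)
    for issue_time_ms, order in commands:
        issue_turn = int(issue_time_ms // turn_ms)
        batches[issue_turn + delay_turns].append(order)
    return dict(batches)


# Five quick shift-clicks, 30 ms apart (all issued during turn 0)
clicks = [(i * 30, f"attack {i}") for i in range(5)]

print(schedule(clicks, 2))  # all five orders land in turn 2, which starts at 1000 ms
print(schedule(clicks, 5))  # all five orders land in turn 5, which also starts at 1000 ms
```

In this model the number of orders per turn is unlimited, and the orders start executing about one second after they were issued at either turn rate.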
So, how many turns do we need per second?
On the one hand we know that the more turns or calculations per second, the slower the game will run in late game TGs, which nobody wants.
On the other hand, what is the advantage of increasing the turn rate in terms of gameplay, say from 2 to 5?
As I see it, there is very little. Back in the days of a23 we could observe pretty good micro from top players, zigzagging their way out of javelins thrown at them, and let's not forget the nasty hero dances dodging all arrows that were so common back then. And all that was done apparently with 2 turns per second.
Do we need to update units 5 times per second now? Let's remember that one second will pass anyway between any command issued and the unit acting on it. Moreover, lately the units cannot change direction instantly as before, because acceleration was added to their physics, so they became more sluggish in response. And alpha 23 with 500ms turns felt more responsive due to the lack of acceleration physics.
In my opinion, the added lag caused by a higher turn rate defeats the purpose of the increased "snappiness" the developer probably intended, especially in large TGs. And the 1 second latency for every command limits the amount of perceived "snappiness" we will ever have in 0AD, no matter how fast the turns are calculated.
Turn rate proposal
The simplest option would be to revert to the tried and true 2 turns per second we had in alpha 23. This would probably cut CPU lag by more than half.
A better solution would be to allow the host of the game to select the turn rate he wants.
Another solution would be to automatically adjust the turn rate depending on the number of players, for example:
1v1 - 5 turns per second (like current alpha)
2v2 - 4 turns per second
3v3 - 3 turns per second
4v4 - 2 turns per second (like alpha 23)
Or set the turn rate according to overall population cap.
Or dynamically adjust the turn rate according to the CPU performance of the slowest player during the game. Early game it could have a fast turn rate and slowly turn it down as the game progresses.
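As a rough illustration of the per-team-size and dynamic options, here is a minimal Python sketch. The table is copied from the list above; the function names, the fallback value and the 0.9 threshold are hypothetical choices, not anything that exists in 0AD.

```python
# Players per team -> turns per second, copied from the list above
TURN_RATE_BY_TEAM_SIZE = {1: 5, 2: 4, 3: 3, 4: 2}


def turn_rate_for_team_size(players_per_team):
    """Pick a turn rate from the table; sizes not in the table fall back to 2."""
    return TURN_RATE_BY_TEAM_SIZE.get(players_per_team, 2)


def adjust_turn_rate(current_rate, slowest_speed_factor, min_rate=2):
    """Dynamic variant: step the rate down when the slowest player falls behind.

    slowest_speed_factor is the fraction of the intended speed that the
    slowest CPU is achieving (1.0 means it keeps up). The 0.9 threshold is
    an arbitrary value chosen for this sketch.
    """
    if slowest_speed_factor < 0.9 and current_rate > min_rate:
        return current_rate - 1
    return current_rate


print(turn_rate_for_team_size(4))  # 2
print(adjust_turn_rate(5, 0.5))    # 4
print(adjust_turn_rate(2, 0.1))    # 2 (already at the minimum)
```

The dynamic variant only ever lowers the rate, matching the idea of starting fast in early game and turning it down as the game progresses.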
In any case, a hardcoded return to a turn rate of 2 would be a huge improvement in performance and probably a trivial change to implement.
CPU lag warning
Besides this, I want to propose that a CPU lag indicator or warning be added to 0AD. This warning should be similar to the network lag warning we already have.
For example:
zniper14 0.4x
This would indicate that, over the last 10 seconds of game time, the CPU of the player "zniper14" was only capable of running the game at 40% of the intended speed. In big lag spikes we will probably see numbers approaching 0.01x for some players.
This number should be calculated by taking those 10 seconds of game time and dividing them by the number of seconds (of real time) that the player's CPU needed to compute all the turns in that window. For example, if computing 10 seconds of game time took 25 seconds of real time, the indicator would show 10 / 25 = 0.4x.
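A minimal Python sketch of how such an indicator could be computed is shown below. It divides game time by real time so that the result matches the 0.4x example. The function names, the list of per-turn timings and the cap at 1.0 are assumptions for illustration; none of this is an existing 0AD API.

```python
def cpu_speed_factor(turn_compute_seconds, turns_per_second, window_game_seconds=10):
    """Estimate the fraction of the intended speed a player's CPU can sustain.

    turn_compute_seconds: real time spent computing each recent turn, newest last.
    """
    turns_in_window = int(turns_per_second * window_game_seconds)
    recent = turn_compute_seconds[-turns_in_window:]
    real_seconds = sum(recent)
    game_seconds = len(recent) / turns_per_second
    if real_seconds <= game_seconds:
        return 1.0  # the CPU keeps up, so the game runs at full speed
    return game_seconds / real_seconds


def format_warning(player, factor):
    """Build the text shown on screen, for example 'zniper14 0.4x'."""
    return f"{player} {factor:.1f}x"


# 50 turns (10 s of game time at 5 turns per second), each taking 0.5 s to compute
times = [0.5] * 50
print(format_warning("zniper14", cpu_speed_factor(times, 5)))  # zniper14 0.4x
```

Each player's computer would need to measure its own per-turn compute time and report it to the others, just as ping is reported today.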
This is obviously an example and other ideas could be thought of.