lag investigation thread

badosu · January 13, 2022

Ok, so I got some answers (thanks ivand): BAR indeed does this in C, with Lua/scripted hooks that can be overloaded for customization. It's also single threaded.

If my assumption that range queries are done in JS is right, then I'm pretty sure that's where the bottleneck is in 0ad. BAR's architecture seems better.

If the fundamental range queries in 0ad are not scripted, then I assume they're badly optimized.

Edited January 13, 2022 by badosu

**Stan`** · January 13, 2022

1 minute ago, badosu said:

If my assumption that range queries are done in JS is right, then I'm pretty sure that's where the bottleneck is in 0ad. BAR's architecture seems better.

If the fundamental range queries in 0ad are not scripted, then I assume they're badly optimized.

Range queries are done in C++ through the CCmpRangeManager class. That class component is attached to what we call the SYSTEM_ENTITY.

Those calls can be made from JavaScript and C++ by creating a range query.

The issue is when 1 000 units start a range query at the same time. As far as I know the processing is sequential.

From my recent testing of wraitii's patch to unthread the AI, though, not having to copy the whole simulation over to the AI makes a huge improvement in performance.

https://code.wildfiregames.com/D3769

badosu · January 13, 2022

Weird, range queries afaik are not a bottleneck in BAR and still single threaded. As ivand said the code is convoluted but you can check this entrypoint for some insight: https://github.com/beyond-all-reason/spring/blob/BAR105/rts/Game/GameHelper.cpp#L635

badosu · January 13, 2022

Moreover, BAR also does physics simulations for range queries and projectiles, so it's definitely something that can be heavily optimized in 0ad even single threaded. The problem is whether the architecture of 0ad is a blocker.

badosu · January 13, 2022

Some insight from TarnishedKnight on BAR:

Quote

One factor I suspect is that 0ad must use Fixed Point math whereas we can use floats and SSE. Another is that 0ad must use OpenGL 2.1, whereas we have batch loading of assets in GL4 - the graphics improvements have made a HUGE impact on performance. Prior to those changes, my stress test had an average rendering pass time of 80ms, afterwards I think it dropped to below 20ms. That has a significant impact on Sim.

**vladislavbelov** · January 13, 2022

38 minutes ago, badosu said:

the graphics improvements have made a HUGE impact on performance

No promises, but I'm working on that

badosu · January 13, 2022

Another bit from TarnishedKnight:

Quote

I haven't seen 0ad's code, but one trick used a lot in the Spring engine to keep decent performance is what we call SlowUpdate. Anything that doesn't absolutely have to be done every sim tick can be put in a SlowUpdate. SlowUpdate spreads updates over half a second (in our case 15 frames), so each frame 1/15th of the units get their SlowUpdate run, then the next frame the next 1/15th of units get their SlowUpdate run, and so on.
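As a rough illustration of the staggering described in the quote, here is a minimal Python sketch. It is not Spring's or 0 A.D.'s code: the `Unit` class, `run_frame` and the id-modulo bucketing are hypothetical names and choices made for the example, and only the 15-frame period comes from the quote.

```python
SLOW_UPDATE_PERIOD = 15  # frames per full cycle, as stated in the quote


class Unit:
    """Hypothetical stand-in for a simulated unit."""

    def __init__(self, unit_id):
        self.unit_id = unit_id
        self.slow_updates = 0

    def slow_update(self):
        # Placeholder for work that does not need to run every sim tick.
        self.slow_updates += 1


def run_frame(units, frame):
    """Run slow_update only for the slice of units due on this frame."""
    bucket = frame % SLOW_UPDATE_PERIOD
    updated = []
    for unit in units:
        # Assumption: units are assigned to buckets by id modulo the period.
        if unit.unit_id % SLOW_UPDATE_PERIOD == bucket:
            unit.slow_update()
            updated.append(unit.unit_id)
    return updated


units = [Unit(i) for i in range(45)]
per_frame = [len(run_frame(units, f)) for f in range(SLOW_UPDATE_PERIOD)]
print(per_frame)
```

With 45 units and a period of 15, each frame touches 3 units and every unit is updated exactly once per cycle. The sketch still loops over all units each frame to find the due ones; a real engine would more likely keep a separate list per bucket.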

**Freagarach** · January 14, 2022

7 hours ago, badosu said:

Another bit from TarnishedKnight:

That is how we update the AI.

And for units that sounds somewhat like our timer system. We don't need to check for enemies every turn when attack-moving, so we do it every second.

**wraitii** · January 14, 2022

13 hours ago, badosu said:

Moreover, BAR also does physics simulations for range queries and projectiles, so it's definitely something that can be heavily optimized in 0ad even single threaded. The problem is whether the architecture of 0ad is a blocker.

In fact, doing physics simulation most likely makes them much more efficient than us.

In 0 A.D., arrows are actually purely graphical; the "hit" itself is a timer (this, by the way, has a number of odd consequences, e.g. #5965 or #5964). The consequence is that when the timer fires, we must query the units around the impact point and check collisions in JS manually (sorta). If there are 100 arrows, this will do 100 range queries.

In a physics-system approach, the arrows would move each "physics update" through the world, a very local phenomenon that can be highly optimised. Detecting a hit is a fast operation by itself, and there is no need to do range queries. Thus arrows are not _specifically_ slow, just part of the whole physics engine.

0 A.D. does not have a physics engine at all, and it probably wouldn't work that well for us because of our 200ms turns. I suspect BAR uses a much more fine-grained "physics turn" of e.g. 10 or 20ms (edit: 33ms, based on 500ms being 15 turns), so their physics-related lag is less 'spikey'.

--

We could update 0 A.D. to have more turns and do fewer things on each turn, which would make it more feasible to use a physics engine (though there are floating point / determinism concerns), but that's a lot of work given where we come from.

---

That being said, this doesn't prevent us from changing how projectiles work -> it's probably a semi-good idea to consider moving stuff to C++ and making them behave more realistically, given that the current code ends up being slow-ish anyways.

Seleucids · February 21

A source of frame rate drop found: attack animations.

When we fight large armies, the GUI must render all of the attack animations, which is a huge task. I removed the attack animations of some units, compared the fps against a run with the attack animations intact, and noted the results.

Experimental method:

1. Mod the fps counter so that it prints out the fps value into my terminal. I then save the terminal outputs to a .txt file.

2. For each test, I ran the Combat Demo Huge map on automatic. On game start, I zoom out to my zmin (8.00 in user.cfg) and watch the units fight.

3. Kill 0ad as soon as one side loses all its units.

4. Use a Python script to compute the average frame rate recorded, discarding anomalous values and the gamesetup screen etc.
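The averaging in step 4 could look roughly like the sketch below. This is not the attached comparator.py: the log format (one fps value per line, with other text mixed in) and the `min_fps` / `max_fps` cut-offs for "anomalous" values are assumptions made for illustration.

```python
def parse_fps_lines(lines):
    """Keep only lines that parse as a number; skip any other terminal output."""
    values = []
    for line in lines:
        try:
            values.append(float(line.strip()))
        except ValueError:
            continue
    return values


def average_fps(values, min_fps=1.0, max_fps=1000.0):
    """Average the samples inside [min_fps, max_fps]; the bounds are hypothetical."""
    kept = [v for v in values if min_fps <= v <= max_fps]
    if not kept:
        raise ValueError("no usable fps samples")
    return sum(kept) / len(kept)


# Made-up sample lines, not data from the thread.
sample = ["loading gamesetup", "60", "0", "58", "62", "5000", ""]
avg = average_fps(parse_fps_lines(sample))
print(avg)
```

To run it on a real log, pass the open file object (for example `open("fpslog1.txt")`) to `parse_fps_lines` in place of the sample list. Where to place the cut-offs, and how to skip the gamesetup screen, depends on what the modded fps counter actually prints.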

I took an FPS measurement and saved it to a new file every time I removed an art feature (sorry Stan).

Results

fpslog1.txt  average fps:  76.17021276595744
fpslog2.txt  average fps:  92.876254180602
fpslog3.txt  average fps:  91.12413793103448
fpslog4.txt  average fps:  81.94713656387665
fpslog5.txt  average fps:  119.80232558139535
fpslog6.txt  average fps:  188.7904761904762

file 1: Vanilla 0ad units with just my katemod.

file 2: removed attack animation of just Roman skirmishers

file 3: repeat of experiment 2

file 4: removed Gaul skirmisher attack animation, but kept Roman skirmishers

file 5: removed some Roman and Athenian spearman animations; removed both Gaul and Roman skirmisher animations

file 6: kept all above changes, replaced Roman spearman with a Mauryan spearman, then decapitated the Mauryans.

Subjective observations:

Experiments 1, 2, 3, 4 all had a stutter immediately after game starts, then fps rises over time.
Experiment 5 was not perfectly smooth animation but playable from a first person perspective
Experiment 6 had almost no lag or stutter. Units were selectable in real time without delays.

Conclusions:

This is consistent with my experience of Romans being one of the laggiest civs and Mauryas lagging significantly less.
One big source of lag (at a GUI level) when fighting is the sudden need to render many animations.
Removing spearman animations had the biggest effect on fps.
Mitigation strategies: simplify the skirmisher attack animation (we can't remove it completely because you would no longer know whether they are attacking); remove the spearman / melee attack animation (you will not watch your spearmen in battles, you just assume they are fighting).

Seleucids · February 21

Experiment 6 only removed the attack animations of half of the melee actors present, but the performance improvement was already this huge!

To reduce the rendering load, one can seriously consider replacing all spearman templates with the Maurya spearman. It will look weird but it will really help your fps.

Files used attached below:

fpslog6.txt fpslog5.txt fpslog4.txt fpslog3.txt fpslog2.txt fpslog1.txt comparator.py

Seleucids · February 21

The result, graphed:

This might explain why experiments 3 and 4 yielded lower values than they should have: I may have stopped the experiment prematurely, while the fight was still happening. That cuts off the high fps values recorded at the end and hence lowers the average. However, it is still clear that experiments 5 and 6 (disabling spearman animations) outperformed vanilla 0 A.D.

Seleucids · February 21

fpslog1.txt average fps: 76.17021276595744
fpslog1.txt `10%` low: 30
fpslog1.txt `1%` low: 20
fpslog1.txt lowest: 18
fpslog2.txt average fps: 92.876254180602
fpslog2.txt `10%` low: 30
fpslog2.txt `1%` low: 21
fpslog2.txt lowest: 17
fpslog3.txt average fps: 91.12413793103448
fpslog3.txt `10%` low: 31
fpslog3.txt `1%` low: 21
fpslog3.txt lowest: 17
fpslog4.txt average fps: 81.94713656387665
fpslog4.txt `10%` low: 30
fpslog4.txt `1%` low: 20
fpslog4.txt lowest: 16
fpslog5.txt average fps: 119.80232558139535
fpslog5.txt `10%` low: 42
fpslog5.txt `1%` low: 27
fpslog5.txt lowest: 20
fpslog6.txt average fps: 188.7904761904762
fpslog6.txt `10%` low: 53
fpslog6.txt `1%` low: 30
fpslog6.txt lowest: 29

The 10% low and 1% low have now been calculated. Animation above 30 fps looks smooth to the human eye. Using decapitated, frozen Maurya spearmen keeps the fps smooth even in 300 vs 300 situations (an extreme 2v2 battle). This can at least remove the GUI lag from A27, without the pain of editing the engine.
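The thread does not say how the "10% low" and "1% low" figures were computed. One common reading is a nearest-rank percentile: the fps value that the worst N% of samples fall at or below. The sketch below follows that reading; the function name and the sample data are hypothetical.

```python
def percent_low(values, percent):
    """Nearest-rank percentile: the value at or below which `percent`% of samples lie."""
    if not values:
        raise ValueError("no fps samples")
    ordered = sorted(values)
    # Integer ceiling of percent * n / 100, with a minimum rank of 1.
    rank = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Made-up samples, not data from the thread: fps values 1..100.
samples = list(range(1, 101))
low_10 = percent_low(samples, 10)
low_1 = percent_low(samples, 1)
lowest = min(samples)
print(low_10, low_1, lowest)
```

Another convention averages the worst N% of samples instead of taking a single rank, which gives somewhat lower numbers, so the definition matters when comparing against other benchmarks.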

I will be working on a mod which can do this.

Edited February 21 by Seleucids

**Stan`** · February 21

You could try to replace them with meshes from a23

**Stan`** · February 21

If done properly you could even switch between the two at runtime.

0 calories · February 22

Is there any mod which removes these animations, so that others can test this too?

**Stan`** · February 22

You might just delete the animation folder (it will trigger error spam but it *should* not break).

0 calories · February 22

37 minutes ago, Stan` said:

You might just delete the animation folder (it will trigger error spam but it *should* not break).

This one?

[~/source/0ad-0.27.0/]: find . -name animation
./data/mods/_test.dae/art/animation

Edited February 22 by 0 calories

Atrik · February 22

Do we really need to chop everything off the game to be able to run it? :huh:

0 calories · February 22

11 minutes ago, Atrik said:

Do we really need to chop everything off the game to be able to run it?

Just for testing purposes. Without animations the game looks weird.

Edited February 22 by 0 calories

**Stan`** · February 22

10 minutes ago, Atrik said:

Do we really need to chop everything off the game to be able to run it?

Well obviously rendering nothing or cubes is gonna be faster.

13 minutes ago, 0 calories said:

This one?

[~/source/0ad-0.27.0/]: find . -name animation
./data/mods/_test.dae/art/animation

No it should be in the public mod.

0 calories · February 22

2 minutes ago, Stan` said:

Well obviously rendering nothing or cubes is gonna be faster.

No it should be in the public mod.

Thanks, I resolved it. Well, if the game is faster, that's OK. I have strong hardware but I'm still getting drops to 4 FPS, and that's the bottleneck I would really like to resolve. In other cases the GUI gets extremely slow. Let's see if this really helps.

Edited February 22 by 0 calories

Seleucids · February 22

no-animation.zip

Seleucids · February 24

In a comparison between A26 and A27, I did 300 cavs per side for battle. You have seen the A27 version already.

The A26 version was not worth recording, but there are a few observations:

1. Average frame rate was on par or even lower than A27.
2. Far fewer freezes and stutters in A26. It was a simple matter of low framerate due to too many models being rendered on a weak GPU.

3. My laptop was stretched much harder by A26 than A27. While playing A26, I had a single core at 100% and the temperature rose to 95 degrees Celsius. At this point the frame rate began to drop to 24fps.

On the other hand, in A27, even if you are at 3 fps, the usage of CPU is merely 50% - 60% with a temperature of around 70 degrees. I have never seen A27's CPU usage exceed 60%, not even for a single virtual core. The laptop was barely doing any work. There is something wrong with the way A27 consumes CPU resources that is causing the stutters. Barely any hardware is used by A27. Maybe there is too much waiting for a particular process?

A27's optimisation improved in the sense that its maximum frame rate could reach higher values than A26 and the average over time was better. However, the range of A27 fps values is much bigger, with the lowest values below 1 (totally unplayable). There is no frame rate stability in A27. Therefore, the solution is to make A27 use as much CPU as A26; then the stutters should be much reduced.

@real_tabasco_sauce @wraitii

Edited February 24 by Seleucids

**Stan`** · February 24

For completeness you might compare a single player match (which has no hashes) between the two versions. This will tell us if there is something else at play.

3 hours ago, Seleucids said:

On the other hand, in A27, even if you are at 3 fps, the usage of CPU is merely 50% - 60% with a temperature of around 70 degrees.

Maybe there is too much waiting for a particular process?

When the stuttering happens, is one of your cores at 100%?

You could upload the A27 replays too.

Have you tried profiler 2? It will show you where it's taking time.

Maybe we should merge the two threads.
