
lag investigation thread



Ok, so I got some answers (thanks ivand): BAR indeed does this in C, with Lua/scripted hooks that can be overloaded for customization. It's also single-threaded.

If my assumption that range queries are done in JS is right, then I'm pretty sure that's where the bottleneck is in 0ad. BAR's architecture seems better.

If the fundamental range queries in 0ad are not scripted, then I assume they're badly optimized.

Edited by badosu

1 minute ago, badosu said:


If my assumption that range queries are done in JS is right, then I'm pretty sure that's where the bottleneck is in 0ad. BAR's architecture seems better.

If the fundamental range queries in 0ad are not scripted, then I assume they're badly optimized.

Range queries are done in C++ through the CCmpRangeManager class. That class component is attached to what we call the SYSTEM_ENTITY.

Those calls can be made from JavaScript and C++ by creating a range query.

The issue arises when 1,000 units start a range query at the same time. As far as I know, the processing is sequential.

From my recent testing of wraitii's patch to unthread the AI, though, not having to copy the whole simulation over to the AI makes a huge improvement in performance.

https://code.wildfiregames.com/D3769


Some insight from TarnishedKnight on BAR:
 

Quote

One factor I suspect is that 0ad must use Fixed Point math whereas we can use floats and SSE. Another is that 0ad must use OpenGL 2.1, whereas we have batch loading of assets in GL4 - the graphics improvements have made a HUGE impact on performance. Prior to those changes, my stress test had an average rendering pass time of 80ms, afterwards I think it dropped to below 20ms. That has a significant impact on Sim.

 


Another bit from TarnishedKnight:

Quote

I haven't seen 0ad's code, but one trick used a lot in the Spring engine to keep decent performance is what we call SlowUpdate. Anything that doesn't absolutely have to be done every sim tick can be put in a SlowUpdate. SlowUpdate spreads updates over half a second (in our case 15 frames), so each frame 1/15th of the units get their SlowUpdate run, then the next frame the next 1/15th of units get their SlowUpdate run, and so on.
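A minimal sketch of the staggering idea described in the quote, assuming units are assigned to one of 15 buckets by id. The names `Unit`, `run_frame`, `simulate` and `SLOW_UPDATE_PERIOD` are made up for illustration; this is not Spring's or 0 A.D.'s actual code.

```python
from dataclasses import dataclass

# Frames per full SlowUpdate cycle (the quote mentions 15 frames, about half a second).
SLOW_UPDATE_PERIOD = 15


@dataclass
class Unit:
    uid: int
    slow_updates: int = 0  # counts how often the expensive update ran

    def update(self) -> None:
        """Work that must happen every sim frame (hypothetical placeholder)."""

    def slow_update(self) -> None:
        """Work that can tolerate running once per cycle (hypothetical placeholder)."""
        self.slow_updates += 1


def run_frame(units: list[Unit], frame: int) -> None:
    bucket = frame % SLOW_UPDATE_PERIOD
    for unit in units:
        unit.update()
        # Only the units assigned to this frame's bucket pay the slow cost.
        if unit.uid % SLOW_UPDATE_PERIOD == bucket:
            unit.slow_update()


def simulate(num_units: int, num_frames: int) -> list[Unit]:
    units = [Unit(uid) for uid in range(num_units)]
    for frame in range(num_frames):
        run_frame(units, frame)
    return units
```

With this layout the expensive work per frame stays roughly constant instead of arriving in one spike, at the price of each unit reacting up to one full cycle late.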

 


13 hours ago, badosu said:

Moreover, BAR also does physics simulations for range queries and projectiles so it's definitely something that can be very optimized in 0ad even single threaded, the problem is if the architecture on 0ad is a blocker.

In fact, doing physics simulation most likely makes them much more efficient than us.

In 0 A.D., arrows are actually purely graphical; the "hit" itself is a timer (this, by the way, has a number of odd consequences, e.g. #5965 or #5964). The consequence is that when the timer fires, we must query the units around the impact point and check collisions in JS manually (sorta). If there are 100 arrows, this will do 100 range queries.

In a physics-system approach, the arrows would move each "physics update" through the world, a very local phenomenon that can be highly optimised. Detecting a hit is a fast operation by itself, and there is no need to do range queries. Thus arrows are not _specifically_ slow, just part of the whole physics engine.
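To make the "very local" point concrete, here is a rough sketch of a per-step hit check against a uniform grid, so that each arrow only looks at nearby cells instead of issuing a range query. Everything here (`CELL`, `build_grid`, `step_arrow`, the 2D float coordinates) is an assumption for illustration; 0 A.D. uses fixed-point math and has no such code, and I don't know how BAR actually implements this.

```python
import math
from collections import defaultdict

CELL = 4.0  # grid cell size in world units (hypothetical value)


def cell_of(x: float, y: float) -> tuple[int, int]:
    return (math.floor(x / CELL), math.floor(y / CELL))


def build_grid(units: dict[int, tuple[float, float]]) -> dict[tuple[int, int], list[int]]:
    """Bucket unit ids by the grid cell that contains their position."""
    grid = defaultdict(list)
    for uid, (x, y) in units.items():
        grid[cell_of(x, y)].append(uid)
    return grid


def step_arrow(pos, vel, dt, units, grid, hit_radius=0.5):
    """Advance one arrow by dt; return (new_pos, hit_uid or None)."""
    x, y = pos[0] + vel[0] * dt, pos[1] + vel[1] * dt
    cx, cy = cell_of(x, y)
    # Only the arrow's own cell and its 8 neighbours are inspected.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for uid in grid.get((cx + dx, cy + dy), ()):
                ux, uy = units[uid]
                if math.hypot(ux - x, uy - y) <= hit_radius:
                    return (x, y), uid
    return (x, y), None
```

The neighbour check is only valid while `hit_radius` is no larger than `CELL`, and an arrow that moves further than `hit_radius` in one step can skip over a unit, which is one reason a short physics step matters for this approach.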

0 A.D. does not have a physics engine at all, and it probably wouldn't work that well for us because of our 200ms turns. I suspect BAR uses a much more fine-grained "physics turn" of e.g. 10 or 20ms (edit: based on 500ms being 15 turns, about 33ms), so their physics-related lag is less 'spikey'.

--

We could update 0 A.D. to have more turns and do fewer things on each turn, which would make it more feasible to use a physics engine (though there are floating point / determinism concerns), but that's a lot of work given where we come from.

---

That being said, this doesn't prevent us from changing how projectiles work -> it's probably a semi-good idea to consider moving stuff to C++ and making them behave more realistically, given that the current code ends up being slow-ish anyways.


  • 3 years later...

A source of frame rate drop found: attack animations. 

When we fight large armies, the GUI must render all of the attack animations, which is a huge task. I removed the attack animations of some units, compared the fps against a run with the attack animations intact, and noted the results.

 

Experimental method:

1. Mod the fps counter so that it prints the fps value to my terminal, then save the terminal output to a .txt file.

2. For each test, run the Combat Demo Huge map on automatic. On game start, zoom out to my zmin (8.00 in user.cfg) and watch the units fight.

3. Kill 0ad as soon as one side loses all its units.

4. Use a Python script to compute the average recorded frame rate, discarding anomalous values, the gamesetup screen, etc.

I took an FPS measurement and saved it to a new file every time I removed an art feature (sorry Stan :( ).
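For anyone who wants to repeat step 4, this is a rough sketch of what such an averaging script could look like. It is not the attached comparator.py. It assumes one fps value per line in the log, and the cut-offs `min_fps` and `max_fps` used to discard anomalous values are made-up defaults.

```python
from statistics import mean


def parse_fps(lines, min_fps=1.0, max_fps=1000.0):
    """Return the numeric fps samples found in `lines`, dropping outliers."""
    samples = []
    for line in lines:
        try:
            value = float(line.strip())
        except ValueError:
            continue  # not a bare number, e.g. other terminal output
        # Hypothetical sanity bounds; values outside them are treated as anomalous.
        if min_fps <= value <= max_fps:
            samples.append(value)
    return samples


def average_fps(path):
    """Average the fps samples stored in the log file at `path`."""
    with open(path, encoding="utf-8") as handle:
        return mean(parse_fps(handle))
```

Calling `average_fps("fpslog1.txt")` would then give one number per run. Note that `mean` raises `StatisticsError` if the log contains no usable samples.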

 

Results

fpslog1.txt  average fps:  76.17021276595744
fpslog2.txt  average fps:  92.876254180602
fpslog3.txt  average fps:  91.12413793103448
fpslog4.txt  average fps:  81.94713656387665
fpslog5.txt  average fps:  119.80232558139535
fpslog6.txt  average fps:  188.7904761904762

file 1: Vanilla 0ad units with just my katemod. 

file 2: removed attack animation of just Roman skirmishers

file 3: repeat of experiment 2

file 4: removed Gaul skirmisher attack animation, but kept Roman skirmishers

file 5: removed some Roman and Athenian spearman animations; removed both Gaul and Roman skirmisher animations

file 6: kept all above changes, replaced Roman spearman with a Mauryan spearman, then decapitated the Mauryans. 
 

Subjective observations:

  • Experiments 1, 2, 3 and 4 all stuttered immediately after the game started, then the fps rose over time.
  • Experiment 5 was not perfectly smooth, but it was playable from a first person perspective.
  • Experiment 6 had almost no lag or stutter. Units were selectable in real time without delays.

 

Conclusions:

  • This is consistent with my experience of Romans being one of the laggiest civs and Mauryas lagging significantly less.
  • One big source of lag (at a GUI level) when fighting is the sudden need to render many animations.
  • Removing the spearman animations had the biggest effect on fps.
  • Possible mitigation strategies: simplify the skirmisher attack animation (we can't remove it completely because you wouldn't know whether they are attacking anymore); remove the spearman / melee attack animation (you will not watch your spearmen in battles, you just assume they are fighting).

Experiment 6 only removed the attack animations of half of the melee actors present, but the performance improvement was already this huge!

To reduce the rendering load, one can seriously consider replacing all spearman templates with the Maurya spearman. It will look weird, but it will really help your fps.

Files used attached below:

fpslog6.txt, fpslog5.txt, fpslog4.txt, fpslog3.txt, fpslog2.txt, fpslog1.txt, comparator.py


The result, graphed:

[Image: graph of the results]

 

The graph might explain why experiments 3 and 4 yielded lower values than they should have: I could have stopped the experiment prematurely, while the fight was still happening. That reduces the number of high fps values recorded at the end and hence lowers the average. However, it is still clear that experiments 5 and 6 (disabling spearman animations) consistently outperformed vanilla 0AD.



file         average fps         10% low  1% low  lowest
fpslog1.txt  76.17021276595744   30       20      18
fpslog2.txt  92.876254180602     30       21      17
fpslog3.txt  91.12413793103448   31       21      17
fpslog4.txt  81.94713656387665   30       20      16
fpslog5.txt  119.80232558139535  42       27      20
fpslog6.txt  188.7904761904762   53       30      29

 

The 10% low and 1% low have now been calculated. Animations generally look smooth to the human eye above about 30 fps. Replacing spearmen with decapitated, frozen Maurya spearmen keeps the fps smooth even in 300 vs 300 situations (an extreme 2v2 battle). This can at least remove the GUI lag from A27 without the pain of editing the engine.
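For reference, one common way to get numbers like these is a nearest-rank percentile over the recorded samples. The sketch below assumes that definition; the attached script may define "10% low" and "1% low" differently (for example as the average of the worst 10% or 1% of samples), and the function names are made up.

```python
import math


def percent_low(samples, percent):
    """Nearest-rank percentile: the value at or below which about `percent`% of samples fall."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so tiny logs still return the lowest sample.
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Collect the four statistics reported for each fps log."""
    return {
        "average": sum(samples) / len(samples),
        "10% low": percent_low(samples, 10),
        "1% low": percent_low(samples, 1),
        "lowest": min(samples),
    }
```

The low percentiles say more about stutter than the average does, because a long calm stretch at the end of a fight inflates the average without changing how the worst moments felt.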

I will be working on a mod which can do this. 

Edited by Seleucids

37 minutes ago, Stan` said:

You might just delete the animation folder (it will trigger error spam but it *should* not break).

This one?

[~/source/0ad-0.27.0/]: find . -name animation
./data/mods/_test.dae/art/animation
 

Edited by 0 calories

10 minutes ago, Atrik said:

Do we really need to chop everything off the game to be able to run it? :huh:

Well obviously rendering nothing or cubes is gonna be faster.

13 minutes ago, 0 calories said:

This one?

[~/source/0ad-0.27.0/]: find . -name animation
./data/mods/_test.dae/art/animation
 

No it should be in the public mod.


2 minutes ago, Stan` said:

Well obviously rendering nothing or cubes is gonna be faster.

No it should be in the public mod.

Thanks, I resolved it. Well, if the game is faster, that's OK. I still have strong hardware, but I'm getting drops to 4 FPS, and that's the bottleneck I would really like to resolve. In other cases the GUI gets overly slow. Let's see if this really helps.

Edited by 0 calories
