Jump to content
Crynux

AI, Pathfinding and Performance

Recommended Posts

Hello! I recently started playing 0AD again, and I must say, it's looking good!

One thing I noticed is that during game-play, in the late game, everything slows down (2 humans vs 2ai; 300 pop cap). I haven't looked at any performance metrics yet (I may here in my next game), but I'm interested in looking into and possibly working on some stuff to help speed this up.

From what I've gathered, there are 2 issues (please correct me if I'm wrong):

1. The Renderer is somewhat bound by simulation.

2. In late game specifically, the AI is the main slowdown; partly due to pathfinding, and use of a single thread.

With that in mind, I've looked thru some of the code, specifically the code here : https://trac.wildfiregames.com/browser/ps/trunk/source/simulation2/components/CCmpAIManager.cpp

It looks like it has been partly setup to use threads, but doesn't yet? I've also read that there's some concept of moving some of the AI related JS code to C++.

Anyways (I apologize if this seems a little scattered), I was wondering what the current plan is for this. I noticed there are a few performance related tickets open, and some have been worked on.

Really, I'm just looking for maybe some clarification of where to go from here (specifically, if there are known problem-areas, I'd like to be aware of them before-hand, knowledge sharing on these things is important). I have some time that I could dedicate to this ... and I'm very interested in performance improvements.

On a side note, with conversion of some of the JS ai code to C++, has any design or some other outline of what 'needs' to be in JS vs C++ been outlined? I think it'd be valuable to set some rules or guidelines as to how much should be available in JS. To be honest, I don't have very much experience with AI programming, but at a glance, I believe any of the heavier computations should be strictly implemented in C++. An AI should really only contain decision logic... (I haven't looked thru the Petra code yet so if this is already the case, that's awesome!).

I think that's it for now. I really appreciate any pointers on this, or clarification. If anything above is dumb or obvious, please point it out ... I really only started looking into this in the past few hours ... but figured it may be worth-while to make a post, in case there's some info out there that I'm missing.

Thanks in advance!

Crynux

 

  • Like 3

Share this post


Link to post
Share on other sites

Hello! Thank you for the response! I had seen your post there, I hadn't looked at it closely as I was mainly focused on the AI at the time, however I'll take a look there as well.

Share this post


Link to post
Share on other sites

Hello @Crynux, welcome to the forums.

Indeed there are a bunch of things that can be done to make the game performance better. There is a short, somewhat outdated list here https://trac.wildfiregames.com/wiki/GamePerformance

You can also help by reviewing patches submitted by wraitii on Phabricator https://code.wildfiregames.com/people/revisions/3/

You can also help by identifying why https://code.wildfiregames.com/D14 is causing memory leaks.

There is also a huge drop of performance when selecting units because too much data is passed https://trac.wildfiregames.com/ticket/5422 I think @echotangoecho had a few patches for that.

For the AI there is that https://trac.wildfiregames.com/ticket/2618

2 hours ago, Crynux said:

The Renderer is somewhat bound by simulation.

Kind of though mostly it is old and doesn't benefit for OpenGL 3+  Optimizations

2 hours ago, Crynux said:

2. In late game specifically, the AI is the main slowdown; partly due to pathfinding, and use of a single thread.

The bottleneck would be the AI making a lot of pathfinder calls, hence slowing down the game. The AI is only executed 1/8th of turns. However, I believe it still eats up a lot of memory.

2 hours ago, Crynux said:

Anyways (I apologize if this seems a little scattered), I was wondering what the current plan is for this. I noticed there are a few performance related tickets open, and some have been worked on.

The plan is to improve things, there are no deadlines though we hope to get @wraitii's patches in for A24

2 hours ago, Crynux said:

On a side note, with conversion of some of the JS ai code to C++, has any design or some other outline of what 'needs' to be in JS vs C++ been outlined? I think it'd be valuable to set some rules or guidelines as to how much should be available in JS. To be honest, I don't have very much experience with AI programming, but at a glance, I believe any of the heavier computations should be strictly implemented in C++. An AI should really only contain decision logic... (I haven't looked thru the Petra code yet so if this is already the case, that's awesome!).

I don't think there is such a plan. The AI was made JavaScript for moddability :) 

 

Thanks for your interest, let me know if you have any more questions, and looking forward to seeing your contributions.

  • Like 1
  • Thanks 2

Share this post


Link to post
Share on other sites

@feneur Can you lift his post limit, please, I can't figure out to do it, I'm not sure I even have the rights...

  • Thanks 2

Share this post


Link to post
Share on other sites

5422 is actually unrelated to the GetEntityData thing. It is caused by the GUI renderer issuing lots of OpenGL calls for every GUI element (I think it goes from 3500 calls at the start of the game with nothing selected to 12000 with a single citizen soldier selected,  just because of the buttons) I can't check the exact numbers right now. 

The JS code that keeps the buttons updated and runs once per turn was also a tad too slow, and was causing frame pacing problems.

  • Thanks 1

Share this post


Link to post
Share on other sites

Thanks for the replies!

I just read into 5422, interesting problem there. I haven't looked into the GUI code yet, besides just looking at the session.xml... but, am I correct in the assumption that there isn't any kind of 'caching', or 'GUI State'?

My initial thoughts would be that a GUI would be built using layers. So for example, all the panel backgrounds would be one layer - then other objects (buttons, text, etc) could be rendered on top of that. A jump from 3500 calls to 12000 is quite a few.

On a side note - maybe the GUI is best done on a separate thread? I read elsewhere that the simulation is supposed to be deterministic ... but the GUI can act in response to the simulation. So instead of having it lag everything when there's a selection, it could just have a 'Update' message (for lack of a better term) posted to it, making it update independently of everything else.

Tooltips (in my very non-familiar-limited view), could be on a separate layer, since they seem to be the most 'dynamic' of the GUI renderings. So for example in a unit selection, you draw all the UI bits once, save the result to a buffer somewhere, where on each tick that buffer is drawn to screen, and the tooltop layer could be rendered on top of that, without having to redo the whole thing.

So the result would be:

  1. Background layer; containing any panel backgrounds
  2. Display layer; containing buttons, text, etc
  3. Tooltop layer; containing only the tooltop

Again, I apologize if this is all obvious (or already done), I'm just kind-of thinking out-loud here.

One last thought -- is there a reason for using OpenGL to render the 2d GUI? My thought would be that something geared toward 2d rendering may be a better solution; like using Cairo or something for offscreen GUI rendering, and then OpenGL could just take the result and plop it in-view.

Thanks!

  • Like 1

Share this post


Link to post
Share on other sites
36 minutes ago, Crynux said:

One last thought -- is there a reason for using OpenGL to render the 2d GUI? My thought would be that something geared toward 2d rendering may be a better solution; like using Cairo or something for offscreen GUI rendering, and then OpenGL could just take the result and plop it in-view.

I think we use the SDL to render the 2D UI, but I might be wrong. If we start to use cairo, we could use it to render fonts atlas depending on resolution as well see: 

https://github.com/StanleySweet/sdl2-text-renderer

  • Thanks 1

Share this post


Link to post
Share on other sites
12 hours ago, Stan` said:

 

14 hours ago, Crynux said:

On a side note, with conversion of some of the JS ai code to C++, has any design or some other outline of what 'needs' to be in JS vs C++ been outlined? I think it'd be valuable to set some rules or guidelines as to how much should be available in JS. To be honest, I don't have very much experience with AI programming, but at a glance, I believe any of the heavier computations should be strictly implemented in C++. An AI should really only contain decision logic... (I haven't looked thru the Petra code yet so if this is already the case, that's awesome!).

I don't think there is such a plan. The AI was made JavaScript for moddability :) 

Afaik everything that is too performance heavy in js should be done in c++ if it doesn't hinder moddability too much.

  • Thanks 1

Share this post


Link to post
Share on other sites

Hmmm, thanks to everyone for the posts/pointers!

I started poking the code, seeing what I can observe that 'may' contribute to slowdown. At first I enabled some code to draw paths - these are commented out by default, so I just commented them and enabled the overlay.

From there, I started testing moving units around, and started playing with some 'usual' cases.

What I found is that if you have a good few units tasked to an area (say 20 or so workers), and a building is parallel (with respect to it's bounding polygon thing [the blue lines on the overlay]), in some cases you'll get a "unit bounce", where they'll continually bounce around, looking for paths, since there's never a space within the 4 points available (the right two corners of the stone grid polygon, and the left two corners of the storehouse grid polygon). This can be seen here:

Spoiler

By contrast if you rotate the building 90 degrees (excuse the rubble, had to try a second time since the first angle wasn't great), the points now being outside of the vertical (relatively) min/max of the right points of the stone, helps the units resolve their paths more quickly, since they can now access the bottom and top sides of the stone. As seen here:

Spoiler

This may not cover every case of unit pathing ... but in cases where you get a good few units tasked to an area, and they just happen to line up,  I could see how this could cause some slowdown. (sorry if the video too blurry... I made them small clips, and I'm not sure how to make them use their original size on here).

With that in mind, I'm thinking a solution to this would be to add some "satellite points" of which the pathfinder could use to resolve the paths more easily around corners; maybe something like this (orange areas are where units would be expected to 'bounce' if no other point of path finding was available):

Spoiler

example.thumb.png.04a15e6368099fe525cc7e431a16a3a3.png

Not sure how 'possible' that would be ... well, I guess anything is possible. But I'm not sure how well that would fit into the existing setup. Just an idea anyways.

Thoughts on this?

Thanks,

Crynux

Edited by Crynux
  • Like 1

Share this post


Link to post
Share on other sites

Ah, ok cool!

I started playing around with VertexPathfinder.cpp ... and came up with this:

Spoiler

It's a little modified from the concept above, just extending the corners out, and adding those as extra vertices to be considered in pathfinding. The result, seems to be that units can get to their target a little faster, and actually seem to have access to more sides (that last lady there got a little confused, given she was on the other side of the building to begin with, that's ok I guess... but overall the rest were better I think).

 

Spoiler

example_2png.thumb.png.7b251ab18338137bf6d6747ca6f2c17a.png

I might play around with these changes a little more ... just to see how it affects a long-term game. But from what I can see it looks to be some sort of improvement!

What do you guys/gals think?

EDIT: I changed how the new points are created, basing it off of the existing buffer based on unit clearance, and a constant. Also created a Diff for it (hopefully it's created decently well) https://code.wildfiregames.com/D1911.

Thanks again,

Crynux

Edited by Crynux
Added D1911
  • Like 3

Share this post


Link to post
Share on other sites

@Stan` I think some of your info is a bit outdated.

On 5/20/2019 at 6:04 AM, Crynux said:

From what I've gathered, there are 2 issues (please correct me if I'm wrong):

1. The Renderer is somewhat bound by simulation.

2. In late game specifically, the AI is the main slowdown; partly due to pathfinding, and use of a single thread.

This isn't that accurate to be honest (sadly, how good would that world be)

At a high level, the renderer is draw--call bound (heavily so) because we aren't using any instancing - because we support OGL that doesn't support it. They "why" on this will lead to a flamewar. The GUI rendering is particularly unoptimised, since we don't even render similar items together.

At a high level, the simulation is pathing + JS bound. The AI itself is a slowdown, but it doesn't increase over time that much. I have ideas on how to improve it, but surprisingly perhaps my idea is actually to unthread the AI. The reason for this is that the AI is slow mostly because it needs a copy of the simulation state every turn, which is slow to create, which it wouldn't need if it lived in the same JS container as the simulation itself.

The pathfinding is slow. There are 3 pathfinders. The hierarchical one is high level and fairly fast. The long-range pathfinder is JPS and fairly fast on its own, though it searches very large grids and can still take some times if enough calls are made on a given turn. Then there is the vertex pathfinder, which isn't fast. I think it's also kinda broken (see D1908).
These 3 pathfinders are used from CCmpUnitMotion.cpp, which I am currently in the process of rewriting.

The pathfinding calls should be threaded (D14 - which as far as I know does not leak memory, it might have done in the past but the current revision should work), which will help. We should also implement unit Pushing (D1490) to reduce the usage of the vertex pathfinder. After that, it's just optimising stuff and we'll get into 'difficult' territory.

The rest of the simulation isn't extremely fast either. JS code is slow, and from profiling I would say about as slow as the pathing itself. There's just not much that we can do here except move some of the slow stuff to C++ and hope for the best. Spidermonkey, our engine, needs to be upgraded and hopefully things will get a bit better in the future.

Finally, our Entity-Component-System is also rather unoptimal, using maps in a lot of places and not really being optimised for cache misses.

----

Your extended corner thing is an interesting idea, though one needs to consider that:
- adding more vertices slows the vertex pathfinder further
- there is a bug that we always try to go for the nearest point on goal with the vertex pathfinder, even if that is unreachable, which can lead to stuck-ness. Maybe it's better/faster to fix that instead.

  • Like 1
  • Thanks 2

Share this post


Link to post
Share on other sites

Hello! Thank you for the detailed overview! After digging a bit thru the code today, I did notice the hierarchical pathfinder, and wondered a bit about its use; your post clears that up a bit!

I haven't looked much at the AI stuff yet ... but since I'm already into the pathing, maybe I'll stick with that for a bit.

I did notice some issue where units would continually try to go somewhere when they can't (for example, having a large group of workers go to an area without formation). They'll just continually pace around trying different angles to get there. I have some ideas on how to fix this ... just haven't gotten to it yet, since I've been working on / getting familiar with path-finding in relation to the extended corner idea.

With that, I do recognize those concerns there, more vertices would slow it down, but it maybe a case where having those few extras are better - since if a 'successful' path can be found faster, then overall there 'may' be less computation.

That bug you mention seems like it's the same thing I noticed (where they can't get somewhere, but keep trying). Maybe I'll give that a shot next.

Aside from that, the OpenGL instancing thing mentioned ... is that just with regard to supporting older versions rather than updating?

  • Like 3

Share this post


Link to post
Share on other sites

Unit Motion is rather subtle to be honest because it ties the actual pathfinders, the unitMotion component and UnitAI together to work.

Mostly if we want to keep supporting old versions and new versions we first need to clean up the code to support different renderers, and that's not too easy, and then actually code the new one, which isn't easy either. We don't have a lot of graphics programmer, and it's getting increasingly hard to do graphics correctly.

I've been somewhat convinced, for a few years now, that we need to change our strategy of doing most things in-house and just use a rendering engine, but I haven't really found a very satisfactory one so far.

  • Like 1

Share this post


Link to post
Share on other sites
3 minutes ago, wraitii said:

I've been somewhat convinced, for a few years now, that we need to change our strategy of doing most things in-house and just use a rendering engine, but I haven't really found a very satisfactory one so far.

Well we could use Godot :) Make the game a gdnative plugin and rewrite things as we go.

28 minutes ago, wraitii said:

I think some of your info is a bit outdated.

Yeah might very likely be. I'm using what I can gather from the wiki and a few recent threads :) Thanks for clarifying things. a bit.

  • Like 1

Share this post


Link to post
Share on other sites
21 hours ago, theotherhiveking_ said:

It is caused by the GUI renderer issuing lots of OpenGL calls for every GUI element (I think it goes from 3500 calls at the start of the game with nothing selected to 12000 with a single citizen soldier selected,  just because of the buttons)

Any source on those numbers? (aka profiling logs)

Share this post


Link to post
Share on other sites
6 hours ago, (-_-) said:

Any source on those numbers? (aka profiling logs)

I used a tool called ApiTrace (https://github.com/apitrace/apitrace). If you're using Linux, your distro might have a package for it.

 

What takes about ~3500 calls is the GUI when the game hasn't started (the lobby, main menu, settings screen, etc), so I misremembered.

Acropolis Bay (2) map needs about ~14500 calls (before any user input of any kind, so default camera, units and GUI elements)

After selecting a Citizen Soldier (because they have the most complex GUI)  calls go up to ~21500

If I disable the GUI, Only about 11500 calls are needed.

0ad_apritrace.thumb.png.5cbdf78ac6ad5d6c26399f62128abf7b.png

Frames with a citizen soldier selected in red.

Frames with a disabled GUI (Alt + G) in green.

The .trace file weights 240MB, I can upload it somewhere if you want to have a look at this specific file.

 

There is also this:

https://trac.wildfiregames.com/attachment/ticket/5422/out.svg

It is a interactive flamegraph, but trac's preview isn't interactive at all. Click 'Download in other formats: Original Format' and then open the local file with an internet browser.

After that is done, click elements to zoom in, hover to obtain aditional info.

  • Like 3
  • Thanks 1

Share this post


Link to post
Share on other sites
4 minutes ago, theotherhiveking_ said:

I used a tool called ApiTrace (https://github.com/apitrace/apitrace). If you're using Linux, your distro might have a package for it.

It's a good tool. I'm using it for a while. But it still misses some good features from RenderDoc, which doesn't support our GL unfortunately.

5 minutes ago, theotherhiveking_ said:

What takes about ~3500 calls is the GUI when the game hasn't started (the lobby, main menu, settings screen, etc), so I misremembered.

We mostly use 1 draw call per each GUI element. It'd be good to refactor it and to use only 1-2 draw call per all GUI.

  • Like 2

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


×
×
  • Create New...