Jump to content
Kuba386

Parallel GPU-based A* algorithm in pathfinding

Recommended Posts

Just now, aeonios said:

which really shouldn't be part of the pathfinder at all

Don't you have to know whether you will hit something before choosing a path ?

Just now, aeonios said:

it got stalled by the halted development.

It mostly got stalled because of the lack of reviews. The patches are old now :) Some still needs change on @wraitii's end ;)

Share this post


Link to post
Share on other sites
12 minutes ago, stanislas69 said:

Don't you have to know whether you will hit something before choosing a path ?

No, physics should handle that. The pathfinder becomes extremely stupid when trying to calculate paths around other units. It gets stuck very easily and ends up causing huge traffic jams on some maps. Buildings/terrain/trees/etc are fine for pathfinding since they don't change much, but with units you don't really want it to worry about something that might have moved out of the way by the time it gets there. That just tends to give you really bad paths.

13 minutes ago, stanislas69 said:

It mostly got stalled because of the lack of reviews. The patches are old now :) Some still needs change on @wraitii's end ;)

Well there was that too. I know that it was very WIP last I checked on it, but I dunno how many people have had much motivation to work on anything when nobody even knows when stuff will start getting reviewed or merged again.

Share this post


Link to post
Share on other sites
Just now, aeonios said:

No, physics should handle that. The pathfinder becomes extremely stupid when trying to calculate paths around other units. It gets stuck very easily and ends up causing huge traffic jams on some maps. Buildings/terrain/trees/etc are fine for pathfinding since they don't change much, but with units you don't really want it to worry about something that might have moved out of the way by the time it gets there. That just tends to give you really bad paths.

I thought you meant even for static obstructions which didn't make sense to me. Thanks for the explanation.

Just now, aeonios said:

Well there was that too. I know that it was very WIP last I checked on it, but I dunno how many people have had much motivation to work on anything when nobody even knows when stuff will start getting reviewed or merged again.

A few actually. We are just not good at advertising it :)

Share this post


Link to post
Share on other sites

"Ridiculously complex" is a bit of an overstatement. None of the code is super easy but there's nothing too unexpected - except for unitMotion maybe.
The pathfinder is actually 5 components:

  • The obstruction manager handles collisions (for unitMotion - not the pathfinder proper) and rasterising static obstacles (like buildings and terrain) and keeping track of units.
  • UnitMotion handles the actual movement logic.  It's terribly written at the moment (buggy, unclear and inefficient at actually moving sanely…) and I have a patch (I think it's D13) fixing it. On top of that, we need to change the architecture to a Data-oriented design - I may or may not have a patch for this, can't recall, but I've definitely done it somewhere.
  • The hierarchical pathfinder handles accessibility (and could be used to handle high-level pathfinding, but it's not really much faster). It creates big "regions" and connects them. That's where flood-fill comes in. I have a patch (D53 I think) that improves its performance.
  • The "long-range pathfinder", AKA the A* JPS pathfinder, is pretty well optimised (aka making it much faster would take too much work). It can create a path from navcell A to navcell B using the rasterised map of the obstruction manager - so no unit avoidance here. 
  • The "short-range pathfinder", aka the vertex pathfinder, aka a graph pathfinder, is used to walk around units. It's triggered by unitMotion when getting stuck in some special circumstances. It creates on-demand an exact map of all units and A* on that.

Nothing is threaded right now but two things are 'easily' threaded: the long and short paths computations. These are currently done between turns, and could be parallelised. Doing it _super_ safely would mean rewriting some stuff, doing it kinda dangerously is very easy and I have a patch for it on Phabricator somewhere.

I also have (this one indeed very WIP) patch for unit-pushing, which removes most of the need for the short-range pathfinder (it's still useful to keep around for stuff such as finding your way around a resource).

The short-range pathfinder might be optimised for the common-is case of several units close together since we could reuse the graphs across computations if those are in the same region - but it's not trivial to do correctly.

The other patches are pretty much finished and ready to merge, they just need some review and some more testing.

  • Like 3
  • Thanks 3

Share this post


Link to post
Share on other sites
17 minutes ago, wraitii said:

The other patches are pretty much finished and ready to merge, they just need some review and some more testing.

@temple Would you mind reviewing those ?

@aeonios, @vladislavbelov@Kuba386, @kanetaka if you have some time as well...

 

https://code.wildfiregames.com/people/revisions/3/

Share this post


Link to post
Share on other sites

@wraitii "doing it kinda dangerously is very easy and I have a patch for it on Phabricator somewhere".

Well, I've updated* your patch so it compiles on latest version.

And I must admit I was quite surprised with the result. It really significantly decreased amount of lags for single player game. However when I tried to open replay took with original version, I got out of sync error. (actually not a suprise)

However there were no problems when I've played a multiplayer game over LAN. Even when using different amount of worker threads for each game instance.

Not only that there were no problems, while there were still some lags in single player game, in multiplayer game lags actually disappeared.

3 players, 1500 units fighting in the center of map, 46 FPS and almost no lags (almost means that there still were some lags, but they were almost unnoticeable).

2 instances of the game running at computer with intel i5-7300 and NVIDIA GTX 1050  8GB RAM, and 1 instance running at computer with intel i5-5th gen and NVIDIA GTS 450  8GB RAM.

When I recreated this situation with original version of the game, lags were so huge that game was almost unplayable.

I've added option in settings, so players can change amount of worker threads (I actually used it just for debugging purposes, probably not needed for end users).

I haven't noticed any problems with determinism in this one game.

I've uploaded it back to D14.

Oh, and I only tested it on linux, but I think it should work on other platforms too.

 

I'm now thinking about more safe version, since it would be very nice to save such speedup.

 

* 'updated'... hmm... actually changed a few lines....

Edited by Kuba386
  • Like 2

Share this post


Link to post
Share on other sites

If you need someone to test it on Windows let me know. Also try to fix that tabs to spaces whitespace changes. Makes it harder to read diffs.

Thanks for working on it.

  • Like 1

Share this post


Link to post
Share on other sites
2 minutes ago, stanislas69 said:

If you need someone to test it on Windows let me know.

No, thanks. I can test it on Windows myself. Only reason I haven't already done so is because I haven't compiled it for Windows yet.

2 minutes ago, stanislas69 said:

Also try to fix that tabs to spaces whitespace changes. Makes it harder to read diffs.

No problem :) Sorry for that

3 minutes ago, stanislas69 said:

Thanks for working on it. 

:)  It has been very pleasant for me so far.

  • Like 1

Share this post


Link to post
Share on other sites
3 hours ago, stanislas69 said:

Would you mind reviewing those ?

Yes. Such big changes require a really precise and careful review, especially of threading + C++. I already see few suspicious places.

  • Like 1

Share this post


Link to post
Share on other sites

I've just combined D14 with D53 and result is even better.

I've attached the patch and a quite heavy replay so you can test it yourself. Try running it with multithreading disabled (set number of threads to 0), and enabled (experiment with number of threads,  typically optimal value is 2*(number_of_cores - 1)).

You will get out of sync error if you run replay with original version of the game so you have to run it with this patch.

Notify me if you have any problems with compiling or running the patch.

Also would be nice if anyone would like to play a multiplayer game with me using this patch.

pathfinderupdate.patch

2018-12-19_0007.zip

Edited by Kuba386
  • Like 1

Share this post


Link to post
Share on other sites
34 minutes ago, Kuba386 said:

Also would be nice if anyone would like to play a multiplayer game with me using this patch.

If you find other players i could be interested in trying a big game (6 players or more) with the patch, to see what settings it would free for those

  • Thanks 1

Share this post


Link to post
Share on other sites

A bit off topic but i think optimal is exactly number of cores - 1 so every thread gets its own core

Edited by Angen

Share this post


Link to post
Share on other sites
1 hour ago, Angen said:

A bit off topic but i think optimal is exactly number of cores - 1 so every thread gets its own core

True, but then it depends on whether they are actual cores :)

Share this post


Link to post
Share on other sites

do you mean logical vs physical? 

I have 4 cores but 8 logical, runnig 7 threads + main gives me 100% usage. with 4 total its only about half. 

Share this post


Link to post
Share on other sites
4 hours ago, Angen said:

A bit off topic but i think optimal is exactly number of cores - 1 so every thread gets its own core

This in not off-topic (well, actually second page of this topic isn't about GA* at all... but I mean it isn't off topic in comparison with other posts on second page).

Actually default number of threads that wraitii implemented in his code was 2*(number_of_cores-1).

And experiments shown that it's true. Actually I've achieved best results when using 6 and 7 threads (I'm on 4-core CPU).

2 hours ago, Angen said:

I have 4 cores but 8 logical, runnig 7 threads + main gives me 100% usage.

Well, there are also other threads, not only main and pathfinding.

  • Like 1

Share this post


Link to post
Share on other sites

I've just confirmed that MD5 hashes of final state of replays match when using different number of threads and different platforms (tested windows and linux).

Tested with 10 different replays.

This is already safe enough to play a multiplayer game.

On 12/19/2018 at 9:02 PM, Feldfeld said:

If you find other players i could be interested in trying a big game (6 players or more) with the patch, to see what settings it would free for those

Thank you Feldfeld :)

 

Anyone else would like to play?

EDIT:  I've created a topic for this

Edited by Kuba386
  • Like 1

Share this post


Link to post
Share on other sites
On 12/20/2018 at 11:56 AM, Kuba386 said:

Well, there are also other threads, not only main and pathfinding.

I think the most important thread to have is for networking, at least on multiplayer. Not having networking on a separate thread causes major problems including some notable disconnect issues.

  • Like 1

Share this post


Link to post
Share on other sites

The pathfinder threading is safe because computing paths is a pure operation (by design) and all paths are computed in between turns in both the threaded and the non-threaded version of the game. If you compute paths in the background while also running other Simulation operations, then things might get weird.

The problem is that the Pathfinder Grid isn't actually copied to make it 100%, structurally thread-safe. IMO that's not a huge issue but we need to be careful.

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


×
×
  • Create New...