[DISCUSS] Performance Optimisations


This thread is public but only reply to it if you have the skills needed to contribute something useful. Any posts that don't add anything to the thread will be deleted!

The game is getting to a point where the speed is really killing us. We need to find ways of speeding up the game, and doing so within the next release or two. Lag causes players to give up and leave. We need to start attracting fans to play regularly.

So the point of this thread is to discuss, collect, and act on ways we can actually speed up the game. Found a really slow part of the game? Post where it is and some ideas for improving it. Working on something to speed it up? Post about it and get feedback early.

This is now top priority for all developers. Other work can continue, but you should be working together to get the game faster. Communicate, collaborate, contribute....

Here are some ideas/tickets to get things started.

  • Finish developing the new path finder - http://trac.wildfire...com/ticket/1756
    The new pathfinder is designed to be faster and give closer results between long range and short range path finding.
  • CCmpRangeManager optimisation - http://trac.wildfire...com/ticket/1707
  • Implement an Octree - http://en.wikipedia.org/wiki/Octree
  • Rewrite the AI base system in C++ - The AIs are very slow. From what I understand, several systems are implemented in JS when they would be much faster in C++, e.g. building placement needs to copy data from C++ to JS and then loop over the data. Move as much of the AI system as possible to C++ for speed.
  • If we keep the JavaScript AI system then a fairly simple improvement is to make the entity collection updates use the hierarchy of entity collections to save significant work.
  • boost::unordered_map is potentially faster than std::map in some situations. This is now serializable so should be very easy to change (it won't even break multiplayer sync because it serializes as a std::map). I don't know if there are any situations currently where this would help though. (Except for CCmpRangeManager, but the proposal in the ticket is better for that).
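
To make the container idea above concrete, here is a minimal sketch. It uses std::unordered_map as a stand-in for boost::unordered_map (the two offer the same interface for these operations), and the names EntityId, RangeData, FastEntityMap and SortedSnapshot are made up for illustration; none of them come from the game's code. It shows the trade-off: hash lookups are O(1) on average, but iteration order is unspecified, so anything that must be deterministic (such as serialization) has to visit the entries in key order, which is the order a std::map would give.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using EntityId = std::uint32_t;

// Hypothetical per-entity payload, for illustration only.
struct RangeData { int x; int z; };

// Hash-based container: average O(1) lookup, unspecified iteration order.
using FastEntityMap = std::unordered_map<EntityId, RangeData>;

using Entry = std::pair<EntityId, RangeData>;

// Returns the entries sorted by key, i.e. in the same order a std::map
// would iterate them, so the result is identical on every machine.
std::vector<Entry> SortedSnapshot(const FastEntityMap& m)
{
    std::vector<Entry> out(m.begin(), m.end());
    std::sort(out.begin(), out.end(),
              [](const Entry& a, const Entry& b) { return a.first < b.first; });
    return out;
}
```

The sort costs O(n log n) each time a snapshot is taken, so this only pays off where lookups are far more frequent than ordered traversals.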

Here are things which could be done but hopefully won't need to be. This is just a list we can come back to nearer to release if we still need to improve:

  • Disable the ability to have arrows hit units other than the one they targeted. This would save looking up entities near the arrow's hit location.


I had a look around for open source octree implementations and I thought that this looked promising. http://www.hxa.name/articles/content/octree-general-cpp_hxa7241_2005.html . Though I am not sure if we already use something for graphical clipping, does anyone know?

The place where the octree should help is in the graphical code. Currently the position interpolation, rendering transforms and some other stuff are done globally for all units. This should be restricted to visible entities, and some stuff can be cached to save time for stationary entities (like trees).

Also the code which highlights things under the cursor loops through all entities at the moment. This should also use the octree so that large maps with lots of trees don't cause lag here.
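
For anyone unfamiliar with the structure, here is a minimal point octree sketch with insert and box query. It is not taken from the linked implementation or from the game; Vec3, Box, Octree, kCapacity and kMaxDepth are all assumed names and values. A query only descends into nodes whose bounds intersect the search box, which is what would let cursor picking or visibility checks skip most entities.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

struct Vec3 { float x, y, z; };

// Axis-aligned box with inclusive bounds.
struct Box {
    Vec3 min, max;
    bool Contains(const Vec3& p) const {
        return p.x >= min.x && p.x <= max.x &&
               p.y >= min.y && p.y <= max.y &&
               p.z >= min.z && p.z <= max.z;
    }
    bool Intersects(const Box& o) const {
        return min.x <= o.max.x && max.x >= o.min.x &&
               min.y <= o.max.y && max.y >= o.min.y &&
               min.z <= o.max.z && max.z >= o.min.z;
    }
};

class Octree {
    struct Item { unsigned id; Vec3 pos; };
    static constexpr std::size_t kCapacity = 8; // items per leaf before splitting
    static constexpr int kMaxDepth = 8;         // leaves at this depth never split

    Box m_Bounds;
    int m_Depth;
    std::vector<Item> m_Items;
    std::array<std::unique_ptr<Octree>, 8> m_Children;

    // Creates the eight child octants and moves this node's items into them.
    void Split() {
        const Vec3 c = { (m_Bounds.min.x + m_Bounds.max.x) / 2,
                         (m_Bounds.min.y + m_Bounds.max.y) / 2,
                         (m_Bounds.min.z + m_Bounds.max.z) / 2 };
        for (int i = 0; i < 8; ++i) {
            Box b = m_Bounds;
            if (i & 1) b.min.x = c.x; else b.max.x = c.x;
            if (i & 2) b.min.y = c.y; else b.max.y = c.y;
            if (i & 4) b.min.z = c.z; else b.max.z = c.z;
            m_Children[i].reset(new Octree(b, m_Depth + 1));
        }
        std::vector<Item> old;
        old.swap(m_Items);
        for (const Item& it : old)
            Insert(it.id, it.pos);
    }

public:
    explicit Octree(const Box& bounds, int depth = 0)
        : m_Bounds(bounds), m_Depth(depth) {}

    // Returns false if pos lies outside this node's bounds.
    bool Insert(unsigned id, const Vec3& pos) {
        if (!m_Bounds.Contains(pos)) return false;
        if (!m_Children[0]) {
            if (m_Items.size() < kCapacity || m_Depth >= kMaxDepth) {
                m_Items.push_back({id, pos});
                return true;
            }
            Split();
        }
        for (auto& child : m_Children)
            if (child->Insert(id, pos)) return true;
        return false;
    }

    // Appends the ids of all items inside range, skipping non-overlapping nodes.
    void Query(const Box& range, std::vector<unsigned>& out) const {
        if (!m_Bounds.Intersects(range)) return;
        for (const Item& it : m_Items)
            if (range.Contains(it.pos)) out.push_back(it.id);
        if (m_Children[0])
            for (const auto& child : m_Children)
                child->Query(range, out);
    }
};
```

This sketch stores points only and has no removal or move operation; a real version for moving units would need both, and that update cost is part of what has to be measured.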

The good thing is that there are potentially huge gains to be made at the moment, though with some areas like the pathfinder this is quite tricky.


  • Rewrite the AI base system in C++ - The AIs are very slow. From what I understand, several systems are implemented in JS when they would be much faster in C++, e.g. building placement needs to copy data from C++ to JS and then loop over the data. Move as much of the AI system as possible to C++ for speed.
  • If we keep the JavaScript AI system then a fairly simple improvement is to make the entity collection updates use the hierarchy of entity collections to save significant work.

I think we can get a lot of performance out of JS. I've seen that we have already enabled JIT (Just In Time compiling) for release builds.

Why do we even need to do it "Just in Time" when we (as opposed to Firefox) know all scripts before we run them?

In addition to that, there have been a lot of improvements in Firefox's JavaScript engine (Spidermonkey), which we are using.

The latest versions of Firefox have come with many significant performance improvements and introduced a new JIT compiler called IonMonkey. This compiler should further improve the optimization of compiled JS code.

Here's an interesting blog-post about that topic (also other topics are interesting on that blog):

https://blog.mozilla...-in-firefox-18/

One difficulty about using a newer version of Spidermonkey is that Mozilla doesn't seem to be interested in doing the work required to release a new standalone (not part of Firefox) version. This means also that documentation is lacking.

https://bugzilla.moz...g.cgi?id=735599

Another difficulty is that the API of 1.8.5 isn't compatible with the newer versions. I don't know yet how much work it would be to integrate a new version of Spidermonkey into the game.

Last but not least I assume that garbage collection and memory allocation could be a major bottleneck for certain parts of JS code that update a lot of data very often. I don't have any proof for that yet. Maybe we can optimize something or if that fails we could still fall back to C++ code.

I'm very interested in doing more research about that matter, but I'd like to complete my current work first.


Here are a couple of interesting tickets:


I would like to contribute to improving the performance of 0ad. I've been following the game for about two years now, but I've never had time. However, I may have some time opening up soon. I'm interested in working on the pathfinder, though I expect there is quite a lot to become familiar with before anything meaningful can be contributed. What has been done, and what still needs to be done? I see there is a patch on trac; is all the work in the patch documented in Ykkrosh's forum posts?

Edited by Thanduel

Further information for AI-side optimizations:

My API v3 (still not in the game, afaik) will use a shared component and a player-specific component. The architecture would thus be this:

C++ Side:

AI Manager -> AI Worker (for threading). The AI Worker gets the simulation state from the AI Manager, runs the shared component and sends it to each AI player script, and runs the AIPlayer (C++ side), which calls the JavaScript AI.

Javascript side:

shared component / Ai script.

Basically this architecture means that the shared component could probably, for the most part, be ported to C++. I do not believe it necessarily needs to be easily moddable, since it's only there to do basic but necessary things: entity filtering (possibly entity collections too, but those might be kept on the JS side using a clever hierarchical structure, i.e. à la octree). It also does map filtering and things like that, which could be ported to C++ efficiently (I believe).
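
To illustrate what hierarchical entity collections could buy, here is a rough C++ sketch. Entity, EntityCollection, AddChild and OnEntityCreated are hypothetical names, not the actual AI API. Each child collection holds a subset of its parent, so when an entity fails the parent's filter, none of the child filters need to run.

```cpp
#include <functional>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Hypothetical entity record, for illustration only.
struct Entity { unsigned id; int owner; bool isUnit; };

class EntityCollection {
public:
    using Filter = std::function<bool(const Entity&)>;

    explicit EntityCollection(Filter f) : m_Filter(std::move(f)) {}

    // Adds a sub-collection whose members are always a subset of this one.
    EntityCollection& AddChild(Filter f) {
        m_Children.push_back(
            std::unique_ptr<EntityCollection>(new EntityCollection(std::move(f))));
        return *m_Children.back();
    }

    // Offers a new entity to this collection and, only if it is accepted,
    // to the children. Returns how many filters were evaluated.
    int OnEntityCreated(const Entity& e) {
        int checks = 1;
        if (!m_Filter(e))
            return checks; // rejected here, so every child is skipped
        m_Members.insert(e.id);
        for (auto& child : m_Children)
            checks += child->OnEntityCreated(e);
        return checks;
    }

    bool Contains(unsigned id) const { return m_Members.count(id) != 0; }

private:
    Filter m_Filter;
    std::set<unsigned> m_Members;
    std::vector<std::unique_ptr<EntityCollection>> m_Children;
};
```

With a flat list of N collections every event costs N filter calls; with a tree, an entity that fails near the root costs only a handful. How much this saves in practice depends on how the AI's collections actually nest.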

AI scripts should however probably remain mostly JS only, for easier moddability. There are areas for improvement there too: detecting what is run too often and running it less often, optimizing in probably a lot of places, and porting some logic to the shared component if it happens to be worthwhile. Furthermore, the JS itself could be faster.

About the GC: AIs tend to waste a lot of memory in allocations and deallocations. With a small runtime size (the default 16 MB or the current 24 MB), this means running the GC fairly often. It should be tested whether it's more efficient to run the GC once in a while with a larger runtime (I think we could afford 32 MB), or fairly often with a smaller runtime.


I would like to contribute to improving the performance of 0ad. I've been following the game for about two years now, but I've never had time. However, I may have some time opening up soon. I'm interested in working on the pathfinder, though I expect there is quite a lot to become familiar with before anything meaningful can be contributed. What has been done, and what still needs to be done? I see there is a patch on trac; is all the work in the patch documented in Ykkrosh's forum posts?

There is a PDF here: http://trac.wildfiregames.com/browser/ps/trunk/docs/pathfinder.pdf . Otherwise I think the forums are the best source. I was also thinking about looking at the pathfinder if I have time. I think it is reasonable to have both of us looking at it at once, but obviously we would need to communicate progress.


Thanks, the PDF looks pretty good, though it has a few holes, such as a full explanation of the JPS* algorithm. Still, it gives me a much better picture of what needs to be done. I will keep you posted; I can't promise I'll have time, but it does look like time will be opening up.


If you want to learn more about JPS then this is a good source http://harablog.wordpress.com/2011/09/07/jump-point-search/ . There is a link to a paper on that site as well.


  • 1 month later...

As a reply to the blog post, I am willing to help with any optimizations, but of course I do not want to do any work that has already been done. Since the last post in this thread is more than a month old, I was wondering what tasks are still available for a starting developer. As an alternative, maybe it'd be a good idea for me to find performance critical functions using profiling/callgrind and find something that is relatively easy to fix just to get started.


Another relatively easy but significant item is to make the sim interpolation only do work when an entity changes position. This saves doing calculations every frame for a load of trees and buildings. This will have a more significant impact now, until the octree filtering for the sim interpolation gets implemented (since then the calculation can be avoided for everything which is not in sight), but I still think it is worth having anyway.
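
As a rough illustration of the idea, here is a sketch with a per-entity "moved" flag. Renderable, OnTurnEnd and InterpolateMoved are invented names and the real interpolation code is certainly structured differently. The flag is set at the end of each simulation turn, and the per-frame pass skips everything that did not move.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-entity render state, for illustration only.
struct Renderable {
    float prevX, prevZ; // position at the previous simulation turn
    float currX, currZ; // position at the latest simulation turn
    float drawX, drawZ; // position handed to the renderer
    bool moved;         // true if the latest turn changed the position
};

// Called once per simulation turn with the entity's new position.
void OnTurnEnd(Renderable& e, float newX, float newZ)
{
    e.prevX = e.currX;
    e.prevZ = e.currZ;
    e.currX = newX;
    e.currZ = newZ;
    e.moved = (e.prevX != e.currX) || (e.prevZ != e.currZ);
    if (!e.moved) {
        // Settle at the final position once; later frames skip this entity.
        e.drawX = e.currX;
        e.drawZ = e.currZ;
    }
}

// Called once per rendered frame; offset is in [0, 1] between two turns.
// Returns how many entities actually needed interpolating.
std::size_t InterpolateMoved(std::vector<Renderable>& ents, float offset)
{
    std::size_t work = 0;
    for (Renderable& e : ents) {
        if (!e.moved)
            continue; // trees, buildings, idle units
        e.drawX = e.prevX + (e.currX - e.prevX) * offset;
        e.drawZ = e.prevZ + (e.currZ - e.prevZ) * offset;
        ++work;
    }
    return work;
}
```

This version still loops over every entity to test the flag; keeping a separate list of moved entities would avoid even that, at the cost of more bookkeeping.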


quantumstate: Wouldn't it be enough to implement a quadtree for the map/entities? Or would you need the 3rd dimension somewhere?

And could you elaborate on the idea in your last post a bit? Isn't the sim interpolation also for calculating positions etc. (which would still be needed for units out of sight)? Sorry, but maybe others are also interested.

Edited by scroogie

Our terrain is 3D, so unit positions do vary in z. The camera angle is changeable, so that is properly 3D as well. Also, using an octree makes things more flexible if someone adds flying units. It would be possible to use a quadtree, but I think it is simpler and more robust to use an octree, and performance should be good enough.

The game has the simulation, which is network synchronised and does all of the gameplay. The simulation currently runs 2 or 5 turns per second (multiplayer/single player), so this handles moving units from a gameplay perspective. Obviously 2 turns a second would look terrible, so there is an interpolation layer that gives the graphics system nice smooth movement. Since it is purely graphical, it only needs to be done for stuff on the screen.
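
To show how the two rates fit together, here is a minimal fixed-timestep sketch. TurnClock, Advance and Offset are assumed names, not the engine's real classes. Frame time accumulates until a whole turn is due, and the leftover fraction is the interpolation offset the graphics layer uses to blend between the previous and the current turn.

```cpp
// Hypothetical helper that decouples simulation turns from rendered frames.
class TurnClock {
public:
    // turnLength is in seconds, e.g. 0.5 for 2 turns per second.
    explicit TurnClock(double turnLength)
        : m_TurnLength(turnLength), m_Accumulated(0.0) {}

    // Adds one frame's elapsed time; returns how many turns must run now.
    int Advance(double frameTime) {
        m_Accumulated += frameTime;
        int turns = 0;
        while (m_Accumulated >= m_TurnLength) {
            m_Accumulated -= m_TurnLength;
            ++turns;
        }
        return turns;
    }

    // Fraction of the way from the last turn to the next, in [0, 1).
    double Offset() const { return m_Accumulated / m_TurnLength; }

private:
    double m_TurnLength;
    double m_Accumulated;
};
```

Only the turn count affects gameplay state, so it is the part that must stay in sync over the network; the offset is local to each machine and can differ freely with frame rate.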


I'm not quite sure about the quad tree/octree thing. There might be a slight speed up which would be interesting to have.

As Philip said yesterday on IRC, we must be wary of the cost of updating entities in the octree. This could get high. Perhaps we could rather use an octree/quadtree with some added pointers to their direct neighbors, instead of having to go the long way in some cases. Would be longer at initializing but probably faster later on.

Basically we need to take into account that we have an RTS with fairly small maps, not an FPS (for which octrees seem more fitting).


  • 1 month later...

I did some tests and yes, the sim interpolation change improved performance quite a bit, at least when FPS is already good. With many units the bottleneck is elsewhere (the pathfinder) and the sim interpolation improvement becomes insignificant.

If you read this whole thread you'll find other open tickets. There is also this one, which is not listed here: http://trac.wildfire...com/ticket/1860 .

There should be a performance page on the wiki to hold up-to-date information (I may create it if that's OK).


I'll ask K776 to update the first post. This information ought to get a wiki page, true. Ask someone on IRC?

Anyone can create wiki pages, so if fabio wants to spend some time on it for example that would be excellent :) Documentation is one of the hardest things, not necessarily to do, but to get people to do, so it's definitely encouraged :)
