scroogie

March 21, 2015

Wow, considering how large that view is and how small the model will be in the game, thats really impressive, great work.

January 12, 2015

Netbeans nowadays also has support for C++ and JavaScript, including code completion and debugging support. I'm not using it for C++, but its not bad at all.

November 7, 2014

Hi Yves, once again great work on these investigations. Judging from the results of profiling, how much time is actually spent in Javascript, I'd say that this would be extremely worthy to achieve. To be honest I'd even say that 75MB more memory consumption is neglectable, I'm curious how discussions about that will proceed.

September 16, 2014

There are two kinds of concerns.
The first is that some of our code gets used rarely (like random map code for a specific map or a specific trigger script) and it could cause unnecessary bloat to keep the compiled versions in memory. The other concern is that this setting is meant just for testing and it could have unwanted side-effects like keeping code that isn't accessible anymore. Running one non-visual replay worked quite well though. The memory usage was somewhere between the two graphs of v24 and v31+GGC.
I should probably investigate a bit further and ask the SpiderMonkey developers. My concern is that they design the heuristics for keeping JIT code just for the use case of Firefox and not for a case like ours where most of the code should be kept.

If the memory consumption was between the old and new version, that would be of no concern, would it? Even the v24 was well below 100Megs most of the time, if I don't misread your graph. Or are there many complaints about 0ad being memory hungry?

September 13, 2014

Wow, 13.5% is a really massive change. What do you mean by kind of a memory leak? Wouldn't it make sense to have all Javascript code compiled, as it doesn't change?

July 26, 2014

Well, thinking about it, as it creates collision edges for all obstructions, I'm quiet certain its related to the amount of all obstructions on the map. Maybe Ykkrosh can explain how Trees affect it specifically.

July 25, 2014

Which maps would you like to compare? Do you have specific ideads? To have "realistic" human behaviour in there, I played the game myself and used the sim_log for the profiling. You can also attach commands.txt and I run it through the profiler if you want.

July 24, 2014

I did a new profile on r15552 using a replay of a 1 human, 3 petra bot match, 20713 turns.

Time spent in modules:

55.9% pyrogenesis

32.3% libmozjs (javascript)

6.3% libc

4.1% libstdc++

0.7% libnspr4

The previously found (minor) hotspots are gone, the PathFinder appears more prominently now, but there were also a lot more units in the game. Looking at the subtree analysis, CCmpPathfinder::ProcessShortRequests() takes 46.4% and CCmpPathfinder::ProcessLongRequests() takes 15.2% (of course relative to the 55.9% spent in pyrogenesis). The only other subtree that appears prominently is BroadcastMessages

Looking closer:

- in BroadcastMessages the cost is mainly caused by the RangeManager (GetNear and UpdateVisibilityData) and CCmpPosition::MoveAndTurnTo (itself due to posting messages)
- in ComputeShortPath, the most costly subtrees are the EdgeSort and the CheckVisibility functions (there apparently is a huge amount of edges to iterate over)

- in the long range pathfinder, CalculateHeuristic and DistanceToGoal call isqrt64 awfully lot

So actually, there is nothing new, except that the situation has improved imho. Apparently there are no low hanging fruit anymore. :/

I'll keep the results, in case someone is interested.

July 22, 2014

> certainly connected to pathFinder
When was the last time the pathfinder has been profiled?

Starting here, there are a few profiles: http://www.wildfiregames.com/forum/index.php?showtopic=17274&page=6#entry270911

July 11, 2014

The thing with performance improvements is that they often only change performance by a small margin, 1% say, but if you do that in a hundred commits, it adds up.

And i've often seen it working like that. You benchmark / profile, address one issue giving you 1% gain, then you benchmark/profile again, and address another hotspot, etc.

But anyway, this is actually quite offtopic here.

Just out of curiosity, what *are* the actual disadvantages of doing the move? I think even on LTS distributions like RHEL its no problem to install e.g. gcc 4.7 in case its required.

May 27, 2014

You can read more about issues and possible solutions in countless papers. I'd suggest these as a starter:

HPA* Near-optimal Hierarchical Pathfinding: http://www.cs.ualberta.ca/~mmueller/ps/hpastar.pdf

It explains the idea and implementation of long-range / short-range pathfinding.

Jump-Point Search: https://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3761/4007

The algorithm that was chosen for the (incomplete) rewrite.

Apparently improvements will soon be published: http://www.nicta.com.au/pub?id=7771

As mentioned in the paper, it could potentially be combined with swamps as well: http://leibniz.cs.huji.ac.il/tr/1188.pdf

Another approach that seems to be getting a lot of attention is SUB: http://idm-lab.org/bib/abstracts/papers/icaps13.pdf

April 5, 2014

Hi Yves,

let me say thanks for your tremendous effort on this, its always a pleasure to read about your progress!

KeepJitCode was the one where Jit code is not garbage collected, right? Do the Mozilla Devs plan to make this safe in ESR31? I honestly don't know how this could work with a moving GC?

@FeXoR: To the contrary, all ESR31 versions are actually faster. And quite substantially as well (biggest improvement seems to be around 18% by visual judgement).

January 3, 2014

I've also posted some profiles/traces on another thread if more info is required. Regarding the isqrt, if I remember correctly the new pathfinder patch drastically reduced the number of calls there. Afaik the current one still uses vector lengths which obviously requires a lot of sqrts.

December 21, 2013

If I may say so as an outsider, I think the buildings look fantastic, you Art guys are really doing a splendid job.

Whats described as "cartoonish" here could perhaps really be the colors / contrast / saturation. Maybe some of the textures are too perfect and could use some muckiness, and others could get toned down a bit. But I really think it looks great. You certainly know better than me, just my 2cts.

November 24, 2013

Wow, these are really great. The run animation also seems to be very realistic, comparing to what is found on youtube, e.g.

Although in that fast run for hunting, everything is a bit more emphasised, which makes it look more powerful, e.g. the hind legs seem to jump further to the front, and there is a larger muscle movement. Nevertheless, I think the level of detail is already great. Btw, its amazing to see how this comes together. The art section is really a pleasure to watch. You are all doing a great work.

November 13, 2013

Just out of curiosity. Is 512 the max amount of tiles? How large (in meters) is a unit in comparison?

October 2, 2013

Great to see that you are actually investigating this. Perhaps, as other people have repeatedly tried to look into this, you could preserve somewhere how you investigated / tested solutions. Are you replacing both the short-range and long-range pathfinder, or one of them?

September 26, 2013

Entities are 2D bounding boxes (since they can be buildings, units, flora, fauna...).

CCmpRangeManager deals with points rather than bounding boxes, so it only has to put entities into a single cell (except for the rare cases where an entity is precisely on the edge between cells, which we could avoid by being more careful about the edge conditions).

So I'm a bit confused now. I guess entities have an extent, but the RangeManager ignores it (which is what I meant to ask)? Does the range vary very much or is it quite uniform?

The PFor implementation isn't entirely horrible - though it has quite a few concurrency issues. We don't use OpenMP because we already implement pthreads and there is literally no useful cross-platform library with a PFor. The patch already implements thread pools if you didn't notice, so I wouldn't worry about the implementation efficiency.

Ah, yes, I didn't see the static initialization. I'm still not sure its nicer than using OpenMP, though. The latter would also allow tasks (if 3.0 would be an option), which might even be better suited sometimes. But anyhow, I would still tend to the component-wise solution.

If we push the code into a worker thread - something that works in background to the whole engine, we will definitely reduce the time taken, even on single core CPU's. However, we have to account for situations where calculating ranges takes more than 1 turn and how that will affect the simulation state. We will probably have to wait out the results from previous turn regardless. All in all, it might be worth the shot and it would have less concurrency issues than PFor.

Yes. Regarding the issue of waiting for the results, its virtually a decision between blocking the whole game to wait for a result or live with an old result for some time? Also, you gain some time in a way, which is currently unused because of the sequential processing. If we take the AI again as an example, it could already crunch away while the engine is busy with other stuff. Its in no way trivial, but I guess maintaining a bunch of parallel sections isn't trivial in the long run either?

Anyway, I digress. I hope you don't mind if "externals" chime in, its an interesting topic.

September 25, 2013

Hi RedFox,

sorry I don't remember, does the RangeManager treat entities as points or as featuring some radius?

*edit*

I just had a look at the parallel for patch. Don't you think that is a quite painful way of doing multithreading in games? First, I'd assume it would be easier to use some existing implementation with threadpools etc., like even OpenMP. Secondly, I'd think the easiest (and most common afaik) way of doing it is completely putting one component in a seperate thread, like the AI. That way you don't need to make sure that every called method is threadsafe etc., which will be horrible to maintain over time, at least in my experience.

September 25, 2013

Hi RedFox,

this is what I meant with the cost of getting items in exactly the state you need for SP, but so be it. I'm quite sure your work was still valuable! For example your testbed implementations. Could that code be used by others that want to experiment with the RangeManager? E.g. did you isolate parts so that debugging or performance measurement is easier? I guess that would already help very much.

Cheers

September 24, 2013

Just to add to the good explanations of sanderd, if you look at the short range pathfinder, it is basically a points-of-visibility algorithm with high precision obstruction edges connecting corners. I think there was also an article in a GPG issue that explained PoV pathfinding. Maybe it helps to understand the idea behind it. Apart from the mismatches of the two systems, the problem seems to be that it is simply too expensive.

I don't know how good a NavMesh approach will fare, especially because of the modifications of the obstructions, online replanning, and unit clearance (i.e. point 3), but its definitely worth a try. I assume you'd need some additional subdivision to floodfill (or however you build the navmesh) only parts of the map.

September 17, 2013

Yes, I know, but what does this change? A sweep is basically the most trivial algorithm one can come up with. Its just traversing all items in the way you keep them. The more interesting question is how to get them to this state, and thats where other algorithms get into the equation. Do you always sort all items, or do you keep them in groups, etc. Im afraid thats what all algorithmic geometry is about.

September 16, 2013

With that argument, you don't need any algorithm at all. It all depends on the situation. Sometimes its about the need to sort or not.

September 16, 2013

I don't think that using a simple Sweep is a real answer to everything. It can be faster, but so can KD-Tree, RB-trees, QuadTrees, etc. pp. I liked the idea of kuranes, to first fix an abstract API of spatial partitioning, and then implement different techniques to see how they fare against each other in different situations.

September 16, 2013

jcantero: I think this was discussed several times. I would have chosen Lua as well, but apparently there were reasons against it. Just saying that people were actually aware of Lua.

Sign In

scroogie

Posts

Joined

Last visited

Days Won

Content Type

Profiles

Forums

Posts posted by scroogie

===[COMMITTED]=== Mastiff

Any instruction of compile using Eclipse on Linux?

SpiderMonkey ESR31 upgrade

SpiderMonkey ESR31 upgrade

SpiderMonkey ESR31 upgrade

[DISCUSS] Performance Improvements

[DISCUSS] Performance Improvements

[DISCUSS] Performance Improvements

new Ai bot: Petra

C++11 Migration: Discussion

Pathfinding yet again: NavMeshes

SpiderMonkey ESR31 upgrade

Alpha 16 Wishlist/General discussion

===[TASK]=== Ptolemaic Structures

===[COMMITTED]=== Tiger

Atlas - enhance the heigtmap importer and texturing workflow

Pathfinding yet again: NavMeshes

Paid Development: September 2013 Activity Status

Paid Development: September 2013 Activity Status

Paid Development: September 2013 Activity Status

Pathfinding yet again: NavMeshes

Paid Development: September 2013 Activity Status

Paid Development: September 2013 Activity Status

Paid Development: September 2013 Activity Status

Further AI development

Forums