tuan kuranes

[Discuss] Memory aka game slowing over time.

tuan kuranes replied to tuan kuranes's topic in Game Development & Technical Discussion

@Thanduel: Using the memory analyzer from http://blog.makingartstudios.com/?page_id=72, run 0ad with it in debug mode and switch to the "fragmentation" tab; you'll get nice memory-map block drawings showing just that. (The other tabs are interesting too: you should be able to spot all the recurring allocation hot spots in the code, and all the other memory problems.)

[DISCUSS] Performance Improvements

tuan kuranes replied to RedFox's topic in Game Development & Technical Discussion

@scroogie: nothing interesting/visible yet; I've just separated the pathfinding code into a lib, behind a facade interface. This weekend I'm working on the visualization (just a grid and colored squares in opengl, not very different from the minimap, only with enough flexibility to handle multiple grid levels like tile/vertex), all of it in a wxwidgets gui (built with wxformbuilder). Then, probably next weekend, I'll have to make all the changes to the pathfinding lib interface needed to support the "classic" tool set of a pathfinder app (load/save/edit, algo change, step by step/play/pause/stop/start/end, benchmarks, etc.). Then the tests will begin.

[Discussion] Spidermonkey upgrade

tuan kuranes replied to Yves's topic in Game Development & Technical Discussion

That adds other good points for upgrading:
- the latest sm version will support source maps (allowing easier use of js "flavors" or any other language that can be transpiled to js)
- asm.js support in a future sm release (which means much faster typed variables)

Both points, plus the "c++ like" orientation of the current js code, could lead to easier js code production, either by using transpilers like lljs (I would recommend that, because it can transpile to asm.js, handles memory C-style in a very efficient way, and is very strict), or by writing asm.js directly (but that has to wait for the sm upgrade to a version that supports asm.js).

On a side note, the current js code could benefit from js code quality tools (jsvalidate, jshint, qunit, plato, benchmarkjs, etc.). Some nice things can be automated using gruntjs (like setting up a "grunt watch" task which runs in the background and lints every change against those tools while you edit js files). From reading a bit of the js code, there's a lot of room for js optimisation.

AoM-like pathfinding/movement

tuan kuranes replied to Ykkrosh's topic in Game Development & Technical Discussion

Looking at the actual long pathfinding rewrite patch, that would help even there, easing the transition and the collaboration. (The patch is a lot of new files mixed with other components, lots of defines in order to keep the old methods, a "b" folder, etc.) That's why I'm advocating a test app with run-time selection of the algo: experimenting made easy, discussion based on facts. @wraitii: the crowd techniques I'm referencing, loa/rvo, do that kind of thing. And secondly, French indeed...

AoM-like pathfinding/movement

tuan kuranes replied to Ykkrosh's topic in Game Development & Technical Discussion

@historic_bruno: s/should rather target/could consider/... pardon my French.

Short-range pathfinding is the perf killer as soon as lots of entities are in the game (end of game, or a 6-player game), and A*/jps is perf limited: too many computations, too many memory moves, etc. What I pointed at in the crowd techniques is just the "local obstacle avoidance" (loa) part. What's interesting is that it doesn't "compute" a path, but rather "reacts" and "adjusts" the direction while moving to the next waypoint (those being computed once by the long pathfinder), on a very short range, and only if there's an obstacle (it does nothing otherwise). It handles all other moving units (including "reciprocal velocity obstacles" (rvo), which is very helpful for formation handling, as seen in the nice video in my post, which uses loa/rvo techniques), and lets each unit have its own avoidance technique (unit size, speed and even tactical style). It's very low computation, which is why crowd techniques use it, and it's therefore very scalable (~15,000 units moving in real time on a current cpu). Here's a "sort of" loa/rvo demo with code and source, running 65000 units at 20ms/frame using threads or 200ms without: http://software.inte.../samples/colony

Another (related) point is that it's currently very hard and slow to code/test/bench the pathfinding code. Right now, making changes to the pathfinder is an all-or-nothing thing... I would propose to consider moving all pathfinding code into a static external lib, with a clear and nice interface, but with much more flexibility under the hood, as in the ability to select/tweak pathfinder algos at runtime. (See the reference pathfinding library "abstraction" https://code.google.com/p/hog2/ and how researchers use it to test code https://code.google....n%2Ftrunk%2Fsrc)

Then we could add a 2D testing/dev wxwidgets app that uses it. It would be a top-down 2D grid view, with real-time pathfinding technique selection and tests, scenario loading, map loading, benchmarks, etc. (Here are some examples of what I'm thinking about for the gui and capabilities: http://www.cgf-ai.co...acastarexplorer )

Once we have the pathfinding code as a more flexible lib, plus a testing app, it's a matter of adding algos to the lib, then experimenting and discussing algos that everyone can test. (We could also experiment with threading/scheduling of pathfinding there.) Once everyone agrees, we just make the algo the "default" in 0ad, minimizing breaking changes but still allowing them. So adding/testing any other technique (or formation, or tactical reasoning) would just mean adding it to the pathfinding lib, testing/benching it against all the scenarios we consider mandatory for the pathfinder to solve (we have to build a nice repository of maps/scenarios/cases demonstrating the needs), adding it to svn, and then launching a discuss thread.
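To make the "lib behind a clear interface, algo selectable at runtime" proposal concrete, here is a minimal sketch. Every name in it (`IPathfinder`, `PathfinderRegistry`, `BreadthFirstPathfinder`, `Grid`, `Waypoint`) is made up for illustration and is not taken from the 0ad code; the breadth-first search is only a stand-in for whatever algorithms (A*, jps, ...) would really be registered.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; none are taken from the 0 A.D. code base.
struct Waypoint { int x, y; };
using Path = std::vector<Waypoint>;

struct Grid {
    int width, height;
    std::vector<bool> blocked;  // row-major, true = impassable
    bool Passable(int x, int y) const {
        return x >= 0 && y >= 0 && x < width && y < height && !blocked[y * width + x];
    }
};

// The facade: the game and the test app only ever see this interface.
class IPathfinder {
public:
    virtual ~IPathfinder() = default;
    virtual Path FindPath(const Grid& grid, Waypoint from, Waypoint to) const = 0;
};

// One interchangeable algorithm: plain breadth-first search on a 4-connected grid.
class BreadthFirstPathfinder : public IPathfinder {
public:
    Path FindPath(const Grid& g, Waypoint from, Waypoint to) const override {
        if (!g.Passable(from.x, from.y) || !g.Passable(to.x, to.y)) return {};
        const int start = from.y * g.width + from.x;
        const int goal = to.y * g.width + to.x;
        std::vector<int> parent(g.width * g.height, -1);
        std::queue<int> open;
        parent[start] = start;
        open.push(start);
        const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
        while (!open.empty() && parent[goal] == -1) {
            const int cur = open.front();
            open.pop();
            for (int i = 0; i < 4; ++i) {
                const int nx = cur % g.width + dx[i], ny = cur / g.width + dy[i];
                if (!g.Passable(nx, ny) || parent[ny * g.width + nx] != -1) continue;
                parent[ny * g.width + nx] = cur;
                open.push(ny * g.width + nx);
            }
        }
        if (parent[goal] == -1) return {};  // goal is unreachable
        Path path;
        for (int c = goal; c != start; c = parent[c]) path.push_back({c % g.width, c / g.width});
        path.push_back(from);
        std::reverse(path.begin(), path.end());
        return path;
    }
};

// Runtime selection: algorithms are registered by name and created on demand.
class PathfinderRegistry {
public:
    using Factory = std::function<std::unique_ptr<IPathfinder>()>;
    void Add(const std::string& name, Factory factory) {
        assert(factory);
        m_factories[name] = std::move(factory);
    }
    std::unique_ptr<IPathfinder> Create(const std::string& name) const {
        const auto it = m_factories.find(name);
        if (it == m_factories.end()) return nullptr;  // unknown algorithm name
        return it->second();
    }
private:
    std::map<std::string, Factory> m_factories;
};
```

A test app would register each algorithm once (for example under "bfs"), let the user pick a name from a drop-down, and call `Create(name)->FindPath(...)` on the loaded map; the game would do the same with whichever name is agreed on as the default.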

AoM-like pathfinding/movement

tuan kuranes replied to Ykkrosh's topic in Game Development & Technical Discussion

For short "pathfinding", where performance is most lacking, especially with lots of moving units, we should rather target "crowd" techniques. Those techniques revolve around "collision/obstacle avoidance". Here are the most relevant papers: http://gamma.cs.unc....esearch/crowds/ http://grail.cs.wash...ts/crowd-flows/ Here's a nice implementation of local obstacle avoidance, with lots of posts and links about it: http://digestingduck...-search-results On formations, a nice idea/feature is "formation sketching", which would give nice "commander/strategic" capabilities: http://graphics.cs.u...on-preprint.pdf
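As a rough illustration of the "react only when something is in the way" idea behind local obstacle avoidance, here is a deliberately simplified steering step. It is not rvo and not the method of any of the linked papers; `Unit`, `SteerWithAvoidance` and the `lookAhead` parameter are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical minimal 2D types for this sketch.
struct Vec2 { float x, y; };
inline Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
inline Vec2 operator-(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
inline Vec2 operator*(Vec2 a, float s) { return {a.x * s, a.y * s}; }
inline float Dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }
inline float Length(Vec2 a) { return std::sqrt(Dot(a, a)); }

struct Unit { Vec2 pos; float radius; };

// Returns the velocity for this step. With nothing in the way this is simply
// "straight to the waypoint"; a neighbour is only considered if it is ahead of
// us, closer than lookAhead, and overlapping our travel corridor.
Vec2 SteerWithAvoidance(const Unit& self, Vec2 waypoint, float speed, float lookAhead,
                        const std::vector<Unit>& others)
{
    assert(speed >= 0.f && lookAhead >= 0.f);
    const Vec2 toGoal = waypoint - self.pos;
    const float dist = Length(toGoal);
    if (dist < 1e-6f) return {0.f, 0.f};  // already at the waypoint
    const Vec2 dir = toGoal * (1.f / dist);

    Vec2 push{0.f, 0.f};
    for (const Unit& o : others) {
        const Vec2 rel = o.pos - self.pos;
        const float ahead = Dot(rel, dir);                // distance along our travel direction
        if (ahead <= 0.f || ahead > lookAhead) continue;  // behind us or too far away
        const Vec2 lateral = rel - dir * ahead;           // neighbour's offset from our travel line
        const float gap = Length(lateral);
        const float clearance = self.radius + o.radius;
        if (gap >= clearance) continue;                   // we pass beside it anyway
        // Push sideways, away from the neighbour; if it is dead ahead, pick the left side.
        const Vec2 away = gap > 1e-6f ? lateral * (-1.f / gap) : Vec2{-dir.y, dir.x};
        push = push + away * ((clearance - gap) / clearance);
    }

    const Vec2 desired = dir + push;
    const float len = Length(desired);
    if (len < 1e-6f) return dir * speed;  // pushes cancelled the heading: keep going straight
    return desired * (speed / len);
}
```

The cost per unit is one pass over its near neighbours and no search at all, which is where the scalability argument comes from. A real implementation would also take neighbour velocities into account (that is the "reciprocal" part of rvo) and would get the neighbour list from a spatial query instead of a plain vector.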

[Discussion] Spidermonkey upgrade

tuan kuranes replied to Yves's topic in Game Development & Technical Discussion

Ok, got the point. It's now documented in the forum why mozilla-js rather than v8. @alpha123: agreed. I just want to make sure alternatives are known/considered. Docs, support & community are a big part of a reliable 3rd-party library; better to change sooner than later and have more work to do when having to upgrade... I didn't know that js code could be SpiderMonkey-specific. Perhaps "polyfills" (like es5/es6 polyfills) could help there? Not sure about getting the same perf, as to my understanding mozilla-js optimisations were mostly "tracing" (spotting and optimising hot paths) while v8's were more "compilation" (type inference) oriented. @Yves: I know I'm late. I just wanted to be sure that alternatives are considered (especially when hitting a wall). Yeah, the GDB thing is just stack trace naming; still useful though, notably in cpu profilers that get lost in the js stack... My secret point is that mozilla-js is really a pain to get running, especially under win64... I still haven't succeeded here (I would like to be able to release a 64-bit exe, which is much faster, notably on the fixed sqrt calls...)

Important: Please read if you have an older computer or graphics card

tuan kuranes replied to historic_bruno's topic in Game Development & Technical Discussion

"parsing using regex" is a bad way to describe it, sorry. It's more of a "preprocessor": you find #define, #pragma, #include inside the glsl and replace them with values, the content of other glsl files, etc. Here's an example of glsl include support using boost of what I meant. I do agree that real parsing and compiling is not really useful at runtime, only for offline tools like aras_p's glsl optimizer.
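The "preprocessor" idea could look roughly like the sketch below. It uses `std::regex` instead of boost, and every name in it (`ExpandIncludes`, `InjectDefines`, `SourceLoader`) is hypothetical; file loading is left to a caller-supplied callback so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <regex>
#include <set>
#include <sstream>
#include <string>

// Hypothetical callback: returns the text of a shader file given its name.
using SourceLoader = std::function<std::string(const std::string&)>;

// Replaces every line of the form   #include "file.glsl"   with that file's
// (recursively expanded) content. Each file is pasted at most once, which
// acts as an include guard and also stops include cycles.
std::string ExpandIncludes(const std::string& source, const SourceLoader& load,
                           std::set<std::string>& seen)
{
    assert(load);
    static const std::regex includeLine(R"re(^\s*#\s*include\s+"([^"]+)"\s*$)re");
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        std::smatch match;
        if (std::regex_match(line, match, includeLine)) {
            const std::string name = match[1].str();
            if (seen.insert(name).second)  // first time we meet this file
                out << ExpandIncludes(load(name), load, seen);
        } else {
            out << line << '\n';
        }
    }
    return out.str();
}

// Adds "#define NAME VALUE" lines at runtime. A "#version" directive has to
// stay first in a GLSL source, so the defines go right after it when present.
std::string InjectDefines(const std::string& source,
                          const std::map<std::string, std::string>& defines)
{
    std::string block;
    for (const auto& d : defines)
        block += "#define " + d.first + " " + d.second + "\n";
    if (source.compare(0, 8, "#version") == 0) {
        const std::size_t eol = source.find('\n');
        if (eol == std::string::npos) return source + "\n" + block;
        return source.substr(0, eol + 1) + block + source.substr(eol + 1);
    }
    return block + source;
}
```

The output is still plain glsl text handed to the driver as usual; nothing here parses or understands the shader, it only pastes files together and prepends defines.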

[Discussion] Spidermonkey upgrade

tuan kuranes replied to Yves's topic in Game Development & Technical Discussion

I guess that's been covered a lot (couldn't find it in the forum search, though), and I'm very late here, but why not the V8 js engine ( https://code.google.com/p/v8/ )? There's a bigger community around it than around mozilla js afaik, lots of apps & docs & tutorials (http://athile.net/library/wiki/index.php?title=Library/V8/Tutorial), nicer docs ( https://developers.google.com/v8/embed ), and it also has built-in profiling/debugging capabilities (even gdb support https://code.google.com/p/v8/wiki/GDBJITInterface or, with some work, webkit/chrome dev tools like nodejs did here https://github.com/dannycoates/node-inspector ). Some nice working examples of c++ interfacing already exist: https://code.google.com/p/nasiu-scripting/ (that one got me with the "std::vector covered"), and the persistent case is indeed covered ( https://developers.google.com/v8/embed#dynamic ).

Important: Please read if you have an older computer or graphics card

tuan kuranes replied to historic_bruno's topic in Game Development & Technical Discussion

If everyone agrees, I would propose creating small, simple steps/tickets, so that people know where things stand and can start contributing without colliding. Here's a modest proposal of tickets that could be created, in order and with dependencies:

1: Wipe all non-shader and arb code.
2: These could be done somewhat in parallel with each other ("somewhat" because using svn instead of git is a pain... for any merging):
- Get rid of all deprecated opengl immediate-mode calls (removing them is mandatory for openglES support), turning them into vbo calls (yes, even for drawing a texture using a quad; it should lead to faster rendering).
- Remove the current SDL "os window" handling and handle it directly. (Makes 0ad able to select different opengl profiles.)
- Get rid of the fixed-function glmatrix calls (deprecated; removing them is mandatory for openglES and opengl 4.0 support). We already compute most matrices anyway (in selection/pathfind/etc.), so it's just a matter of using uniforms for those matrices (worldmatrix, worldviewmatrix, modelmatrix, etc.; note that discussing/defining a uniform naming scheme so that all shaders share the same names would ease things there, see next point).
- Add GLSL high-level handling code: parsing glsl using regex to get 0ad #defines, #pragma, etc. (handle debug lines, #import/include pragmas to concatenate files and make glsl code much more DRY, change the #precision and #version pragmas at runtime, add #defines at runtime, etc.), and add the ability to reload/validate shaders (faster shader debugging/coding). The idea is to be able to have a shared, reusable glsl code library (easier to maintain, smaller codebase).

A very good tool for those steps is the gremedy opengl profiler/debugger, as its analyzer gives a nice and precise list of deprecated calls per frame or per run (and lots of other nice opengl helpers).

Once 1 and 2 are done, a much easier next move would be:

3: Total simulation2/graphics separation using command buffers. In 0ad's case, it could be done at a higher level than opengl commands: that would be something like taking advantage of "renderSubmit" and the list of submitted render entities, which would end up being the "command buffer" given to a graphics/render worker thread. (Faster rendering as soon as 2 cores are available, which is pretty standard nowadays.)
4: Add new renderers: different opengl profiles, openglES, a debug/test mock renderer, deferred, forward, etc. (The hidden cost here is defining a scheme to handle per-renderer *materials/shaders* in a nice way; deferred and forward don't use the same shaders.)
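A minimal sketch of the high-level "command buffer" handoff described above, assuming one simulation thread that builds a list of draw requests and one render thread that consumes the latest finished list. `RenderCommand` and `RenderQueue` are hypothetical names, not existing 0ad classes, and the command payload is reduced to a model id plus a world matrix.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical high-level command: "draw this model with this transform".
struct RenderCommand {
    unsigned modelId;
    float worldMatrix[16];
};

// Double-buffered handoff between a simulation thread and a render thread.
class RenderQueue {
public:
    // Simulation thread: record a command for the frame currently being built.
    // No lock needed, m_building is only ever touched by this thread.
    void Submit(const RenderCommand& cmd) { m_building.push_back(cmd); }

    // Simulation thread: hand the finished frame over and start a new one.
    // If the renderer has not picked up the previous frame, it is dropped.
    void Publish()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_ready.swap(m_building);
        m_building.clear();  // keeps its capacity, so later frames reuse the memory
        m_hasFrame = true;
    }

    // Render thread: take the newest published frame, if there is one.
    // 'out' is swapped, not copied, so its old storage is recycled as well.
    bool Acquire(std::vector<RenderCommand>& out)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_hasFrame) return false;
        out.swap(m_ready);
        m_hasFrame = false;
        return true;
    }

private:
    std::mutex m_mutex;
    std::vector<RenderCommand> m_building;  // owned by the simulation thread
    std::vector<RenderCommand> m_ready;     // shared, protected by m_mutex
    bool m_hasFrame = false;
};
```

The lock is only held for a swap, so neither thread waits on the other's real work. Whether dropping an unconsumed frame is acceptable, or the simulation should block instead, is a design decision this sketch does not settle.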

[Discuss] Memory aka game slowing over time.

tuan kuranes replied to tuan kuranes's topic in Game Development & Technical Discussion

Just stopping by to list a nice article on stack allocation: http://geidav.wordpress.com/2013/03/21/anatomy-of-dynamic-stack-allocations/

[DISCUSS] Performance Improvements

tuan kuranes replied to RedFox's topic in Game Development & Technical Discussion

I think that if you carefully project all frustum corner points (including the near-plane corner points) onto the 2D "terrain" plane, you then get the bounding rectangle containing all projected frustum points, thus preventing any possible popping in front. In fact, it's rather on the conservative side, drawing a bit more than needed. The worst case would be non-visible terrain tiles and small objects being submitted for rendering when the camera angle is near an FPS view, but that's a "rare" use case anyway.
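In code, the projection amounts to dropping the height of each of the eight frustum corners and taking the min/max of what is left. The sketch below assumes Y is the up axis and that the corners are already available in world space; `Vec3`, `Rect2D`, `FrustumFootprint` and `Overlaps` are illustrative names only.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical minimal types; Y is assumed to be the up axis.
struct Vec3 { float x, y, z; };
struct Rect2D { float minX, minZ, maxX, maxZ; };

// Projects the 8 world-space frustum corners (4 near, 4 far) straight down
// onto the ground plane and returns their axis-aligned bounding rectangle.
Rect2D FrustumFootprint(const std::array<Vec3, 8>& corners)
{
    Rect2D r{corners[0].x, corners[0].z, corners[0].x, corners[0].z};
    for (const Vec3& c : corners) {
        r.minX = std::min(r.minX, c.x);
        r.maxX = std::max(r.maxX, c.x);
        r.minZ = std::min(r.minZ, c.z);
        r.maxZ = std::max(r.maxZ, c.z);
    }
    return r;
}

// Conservative 2D visibility test for an object's ground-plane bounds.
bool Overlaps(const Rect2D& a, const Rect2D& b)
{
    assert(a.minX <= a.maxX && a.minZ <= a.maxZ);
    assert(b.minX <= b.maxX && b.minZ <= b.maxZ);
    return a.minX <= b.maxX && b.minX <= a.maxX && a.minZ <= b.maxZ && b.minZ <= a.maxZ;
}
```

Because the frustum is convex, every visible point projects inside this rectangle, so nothing visible is culled. The rectangle also covers ground that is outside the real frustum, which is the "drawing a bit more than needed" trade-off mentioned above.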

[Discussion] Spidermonkey upgrade

tuan kuranes replied to Yves's topic in Game Development & Technical Discussion

Thanks for the graph link. It's definitely needed on the wiki. Can I at least copy/paste it there, even if it's not exhaustive? That would help when searching for it. In CCmpRangeManager: ExecuteQuery, ResetActiveQuery, GetEntitiesByPlayer, etc. All those methods pass std::vector to and from js, and are called very frequently. It shows here with the "very sleepy" profiler: js-related string, malloc and free calls are among the top calls. Not that it beats the huge "EconomyManager.prototype.buildDropsites" perf gap in the aegis bot, but memory fragmentation is the reason for the overall slowdown of 0ad over time.

[Discuss] Memory aka game slowing over time.

tuan kuranes replied to tuan kuranes's topic in Game Development & Technical Discussion

Definitely needs a discuss thread? Here's another nice three-part c++11 list. Rewriting/copying web pages and Scott Meyers' books might make it tedious to read, and in the end it wouldn't be read; that's why I went for just listing strict "do that or do not commit"-like guidelines. Perhaps we can do the listing plus other wiki pages explaining each item. I would go for guidelines + example + link to an in-depth wiki page?

[DISCUSS] Performance Improvements

tuan kuranes replied to RedFox's topic in Game Development & Technical Discussion

A first step would be to use the current spatial code for culling, just projecting the 3D camera frustum onto the 2D terrain and calling getRange on that. (That would be much faster than doing 3D culling against all frustum planes, and lets us reuse the same current 2D code.)

Imho, perf-wise, the current spatial algo's problems are:
1: duplicates: better to have each entity in one and only one tile, thus removing the very costly sort/uniq in getRange
2: contiguous memory: better to have a single vector for the whole structure, rather than vectors of vectors
3: getRange allocating a std::vector on the heap each time

All of those can be solved in the current spatial code, with some simplifications, but they must be addressed one way or another in a new partitioning scheme:
1: inserting based on the entity center or point
2: an algo to rearrange subparts of the huge single vector when adding/removing, keeping tiles in contiguous ranges (sparse vector)
3: the range vector passed as a parameter, being a static/member of the class that makes the call

The clear advantage of a tree is that it would nicely solve the rect-to-point simplification, which could make huge structures not be taken into account when they sit on the border of a range query (depending on its aabox size, an entity stays in the higher quadtree nodes instead of ending up in the leaves).

An octree is certainly overkill: lots of memory and lots of branching per node for next to no benefit in a 2.5D game, and it rules out using the same code for range & pathfinder (using CFVector2D) and culling (an octree would need to use 3D vectors). A quadtree would perhaps give some perf improvements.

Note that those algos are easy to implement and test if you agree on keeping the same interface as the current spatial code, and once the spatial code is also shared with the frustum culling, it's just a matter of subclassing. I would even add a kd-tree, geohash and hilbert curve to the tests, and even test them separately (a kd-tree would be very fast for static obstacles (costly add/remove but very fast queries), and a loose quadtree would be very fast for moving units).
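The three fixes (one tile per entity, one contiguous vector, caller-owned result vector) fit in a small sketch. Note the swapped-in technique: instead of incrementally rearranging subparts of the big vector on add/remove, this version rebuilds the whole index with a counting sort, which is simpler to show. `FlatGrid`, `Entity` and the method names are hypothetical and do not mirror the current spatial code's interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using EntityId = std::uint32_t;
struct Entity { EntityId id; float x, z; };  // position = entity center

class FlatGrid {
public:
    FlatGrid(int cellsPerSide, float cellSize) : m_n(cellsPerSide), m_cellSize(cellSize)
    {
        assert(cellsPerSide > 0 && cellSize > 0.f);
    }

    // Rebuilds the index with a counting sort, O(entities + cells). Each entity
    // lands in exactly one cell (the one holding its center), and all cells
    // live back to back in a single vector.
    void Rebuild(const std::vector<Entity>& entities)
    {
        m_start.assign(static_cast<std::size_t>(m_n) * m_n + 1, 0u);
        for (const Entity& e : entities) ++m_start[Cell(e.x, e.z) + 1];
        for (std::size_t i = 1; i < m_start.size(); ++i) m_start[i] += m_start[i - 1];
        m_items.resize(entities.size());
        std::vector<std::uint32_t> cursor(m_start.begin(), m_start.end() - 1);
        for (const Entity& e : entities) m_items[cursor[Cell(e.x, e.z)]++] = e;
    }

    // Fills 'out' with the ids of entities whose center lies inside the box.
    // 'out' is cleared, not reallocated, so a caller that reuses the same
    // vector stops allocating once its capacity has grown.
    void GetInRange(float x0, float z0, float x1, float z1, std::vector<EntityId>& out) const
    {
        out.clear();
        const int cx0 = Coord(x0), cx1 = Coord(x1), cz0 = Coord(z0), cz1 = Coord(z1);
        for (int cz = cz0; cz <= cz1; ++cz) {
            // Cells cx0..cx1 of one row are adjacent in memory: one linear scan per row.
            for (std::uint32_t i = m_start[cz * m_n + cx0]; i < m_start[cz * m_n + cx1 + 1]; ++i) {
                const Entity& e = m_items[i];
                if (e.x >= x0 && e.x <= x1 && e.z >= z0 && e.z <= z1) out.push_back(e.id);
            }
        }
    }

private:
    // World coordinate -> cell coordinate, clamped to the grid.
    int Coord(float v) const
    {
        const float c = std::min(static_cast<float>(m_n - 1), std::max(0.f, v / m_cellSize));
        return static_cast<int>(c);
    }
    int Cell(float x, float z) const { return Coord(z) * m_n + Coord(x); }

    int m_n;
    float m_cellSize;
    std::vector<std::uint32_t> m_start;  // m_start[c]..m_start[c + 1] = items of cell c
    std::vector<Entity> m_items;         // all entities, grouped by cell
};
```

No sort/uniq is needed on the result because an entity can only ever be found in one cell. The price is the rect-to-point simplification mentioned above: a caller would have to widen the query box by the largest entity radius to avoid missing big structures at the border.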
Btw, on another performance improvement topic: following the current code, does anyone know what requirement makes it re-compute the local AaBox for animated+prop objects? Couldn't we use the object's static aabox, not taking animation and/or props into account? It's not as if we need that much precision. (It's very cpu intensive to do all the vertex transformations, especially if you don't need them at all CPU-side when gpu skinning is enabled.) If it's really needed in some cases (a mechanical crane?), could it be made optional for those cases?
