Jump to content

RedFox

WFG Retired
  • Posts

    245
  • Joined

  • Last visited

  • Days Won

    7

Everything posted by RedFox

  1. Just thought it would be nice to clear this up. Due to some disagreements with the Dev team, I decided to withdraw from the position. It's quite unfortunate, but since teamwork didn't seem to be flowing in the desired direction and we couldn't come to common terms, I feel this is the best option. The long delay from WFG led to me finding a new job and I would much rather stick to that. I have no idea what WFG will do next regarding the fundraiser money, but I'm sure they'll find some way to put it into good use.
  2. If we manage to get NAT hole punching to work, then it doesn't matter what port number the client uses. As long as the listener server has its listening port open, we can create numerous connections from client to server and use the automatically opened ports on the client.
  3. I agree Ykkrosh, it's tough working on large scale changes on such a large project - it often looks like operating cancer, really. I'd say my focus would be on reviewing and bugfixing megapatch so that all the problems get fixed. And in the future, we should definitely avoid megapatch situations, since they become increasingly difficult to maintain the longer they remain off the main svn trunk.
  4. There is no October update thread. The last work I did was the first week of October where I fixed memory leaks in the new lobby, added VS2012 support for premake4 and some bug-fixes on megapatch (no new features). And the end of September I fell quite ill and was out of action for 2 weeks; actually I've yet to fully recover - I should probably give myself more rest. Since there has yet to be any discussion on resuming paid development, I'm remaining in stand-by.
  5. Is there any further progress on this test? What kind of issues are you facing currently? Updating the navmesh? Performance issues? For grid distances you could just use the standard "manhattan distance" method. It's a simple and fast method and also gives decent results.
  6. Looking good. To avoid the worst case of A*, you'll have to do some simple heuristics like described before. An iterative flood-fill method should be pretty fast and do the trick. Running some tests with my own A* setup on reasonably complex map layouts, a 48x48 (2304 tiles) can take around ~1500 iterations to cover the map, while a 512x512 (262144) can have over 88700 iterations per A* search query. If you can reduce the number of cells tenfold from a 256x256 map (~65k -> ~6k), you'd already cut down the number iterations by a huge amount, to somewhere around ~5000 for the whole map. Looks like a huge algorithmic victory to be gained regardless
  7. We are actually considering moving the AI to C++ because of how badly it performs in JS. It would save us a lot of time in the long run if we started working on a new AI in pure C++ instead of expanding the JS AI. It would also solve any limitations of JS and the duplication of game state (it's a major slowdown having full gamestate duplication from C++ to JS).
  8. I think the problematic cases are when drivers have bugs and/or incorrectly implement features. Though cross-checking features would be a good way to eliminate hopeless ancient drivers.
  9. When it comes to optimization, I would rather focus on a well thought out algorithmic approach at first. Micro-optimizations as some call them shouldn't be the focus, though they shouldn't be disregarded either - a good cache locality optimization can make many algorithms several times faster. I guess the main benefit of a grid-based system is its relative sequential nature, which is better for CPU cache than other pointer based methods. Notoriously, linked lists are ridiculously slow on modern hardware, even though they are algorithmically faster for insert/remove, any simple array/vector based system will beat it hands down because sequential memory is naturally attuned for CPU cache. When you are working on your new solution, I hope you will keep this in mind. Although, if we the algorithm significantly reduces overall complexity, it will definitely be worth it. I'll be looking forward to what you can conjure in the upcoming few weeks.
  10. Exactly. If we plan to support 10-year old hardware, then any hardware since 2005 will have OpenGL 2.0 support. The only issue can be with some non-standard compliant drivers that fail to optimize or properly compile the GLSL shaders - so far we had an issue with pre-processor not working on OSX GLSL compiler. It might be simply because people are using outdated graphics drivers. Perhaps we should compile a list of suggested drivers for every platform? It would make our life easier if we just ensured proper driver suite is installed with the game, right? As for ARB shaders, the main problem is that the ARB shader code is already outdated compared to our GLSL code. It's becoming more and more of a maintenance issue than anything else. On performance I'd rather look at our current GLSL implementation and how well the code is optimized - the fact is that we haven't actually optimized the shader code at all kind of puts it in an unfair position.
  11. Regarding fixed-function pipeline, we have decided already months ago that it will be dropped in future releases. So all in all, it shouldn't really be a surprise. You can get a cheap modern laptop with just 200$. I don't see any reason to support fixed function pipeline as it is, since it's a relic Look at this from another perspective - should we had incorporated MS-DOS support in 2003, just because some people somewhere still used it? Even though it was a decade-old relic already?
  12. We won't need any special treatment of enemies in the pathfinder. The hypothetical new "CollisionManager" would detect collisions and decides whether to start fighting (collide with enemy), pushing (collide with friendly), or try to find a path around the friend/foe (disengage from fights).
  13. Thank you for explaining your point of view in detail.
  14. Entities are 2D bounding boxes (since they can be buildings, units, flora, fauna...). The PFor implementation isn't entirely horrible - though it has quite a few concurrency issues. We don't use OpenMP because we already implement pthreads and there is literally no useful cross-platform library with a PFor. The patch already implements thread pools if you didn't notice, so I wouldn't worry about the implementation efficiency. Since we use C++03, the code isn't that nice. In C++11 we could get away with cool little lambdas: ps::parallel_for(0, size, [&](int atIndex) => { printf("%d\n", atIndex); }); As I measured speedup from multi-threading, my results were: 1 thread: 24ms/turn 2 threads: 14ms/turn 4 threads: 8ms/turn If we push the code into a worker thread - something that works in background to the whole engine, we will definitely reduce the time taken, even on single core CPU's. However, we have to account for situations where calculating ranges takes more than 1 turn and how that will affect the simulation state. We will probably have to wait out the results from previous turn regardless. All in all, it might be worth the shot and it would have less concurrency issues than PFor. It would definitely make it easier to do range checks between formations. However, once a formation is in range, we will still need to target individual soldiers which is again one of the slow cases. I wonder how Total War handles this problem?
  15. The testbed used was my own private OpenGL engine and doesn't have anything special other than an interactive demo for debugging / testing. I'm quite sure that my SAP implementation isn't completely optimized in the algorithmic sense. The most expensive parts are moving elements in the axis, which forces us to update the random-access indices. I thought I could make it work, but SAP isn't meant for random access. Even the GPU + SAP demo sweeps the entire set - which is very efficient, but totally different from what we need. Now, I'm not saying our implementation is perfect - far from it actually. Because of existing code, we are forced into this random-access range detection. Furthermore, SAP is good for collision detection, but doesn't look like very good for range detection. I mean I somehow made it work for ranges - but it was a freak attempt IMHO. Perhaps we could use regular grid subdivision and implement SAP in every subdivision grid? That would eliminate a lot of X-axis alignments for sure. If we use subdivision, we have to ensure that entities only exist in one grid at a time, yet ensure that they will be correctly detected from neighboring grids. This simplified drawing illustrates the issue: Entity A, inside grid tile 1, queries for entities in range. Obviously entity B is in range, but it exists in grid tile 2. So in order to get accurate results, we must also include all entities in tile 2. The current implementation in 0AD SVN would place entity B into both grid tile 1 and grid tile 2. This means we have to sort the results and call std::unique to remove duplicate entities. I think we can improve the current implementation of Spatial Subdivision by improving on these issues. We should also use floating point numbers internally for faster Squared Distance calculation - meaning we could return accurate results and avoid any further calculations.
  16. I only have my personal prototype build (since I ditched the pyrogenesis port after it failed to give any desirable results). The code doesn't really explain much of the problems, but trying to implement random-access search with SAP was pretty much the reason it failed (since SAP is supposed to be sweeping, not random accessing). However, there is no way around this with the current design of the simulation2 engine in 0AD - we need random access. SAPDivision.h: http://pastebin.com/5Ld0QUA6 SAPDivision.cpp: http://pastebin.com/mL211s2k
  17. So the previous week was actually really tough - I spent pretty much all the time working on implementing Sweep and Prune approach and testing all sorts of configurations. There were some obvious upsides to the algorithm at first, but unfortunately it has a worst case scenario that almost always plays out in our game engine. So my final through opinion is that we can't use it for the sort of queries we're doing. So would I vote this week of 53 hours as wasted? Well, I could have used all this time for something else that didn't end up with no results. Week #7 16.09 - 22.09 16.09 08-24 (16) - SAP subdivision implementation draft. 17.09 08-13 (05) - SAP subdivision, small Megapatch fixes. 18.09 11-23 (12) - SAP subdivision. 19.09 22-05 (07) - SAP testing and debugging. 20.09 16-21 (05) - SAP performance testing and results. 22.09 21-07 (10) - SAP implementation in Pyrogenesis and results. Parallel For. 1. Sweep and Prune implementation. So, I spent pretty much the whole week working on creating a testbed for Sweep and Prune. The first results came in the middle of the week when I could measure its performance. It looked pretty good at first - slightly outperforming the current range implementation. I think the best part of this implementation was its bounding-box precision that always gave near-perfect results. I think it was also nice how we could rely on the results being in sorted order and also got the squared distance in the result. However, since our current simulation2 design requires random-access requests (get all entities in range of Entity123, etc..), it made moving along the axis a really expensive task. The worst case scenario for SAP is when a large bunch of entities is stacked in X axis - and believe me, this happens a lot in 0AD. This causes a lot of iteration on the axis and discarding mismatched entries. Furthermore, it's the reason moving entities in the axis is really expensive. In a sparse setup, when entities were not obviously aligned in X axis, the results were perfect 2000 entities in 0.97ms. However, when I arranged the entities closely in a large square formation, it became 19ms - which was unacceptable. 2. Sweep and Prune in Pyrogenesis The last thing I needed to do was to implement it in our engine, which was a pretty trivial task. Initially, even with large formations SAP performed much better than our old subdivision - 3ms vs 12ms. However, something I could not have anticipated was moving entities becoming a really expensive task. I kindly neglected the fact that when entities are stacked in X axis, the desired small movement of only 1 position in the axis, becomes an expensive case of shifting 100 entity handles instead. So when entities started moving around - performance hit rock bottom. Clearly a failure. We can't implement an efficient random-access RangeManager with Sweep and Prune. I will now kindly bin Sweep and Prune into the bucket of algorithms that are useful in many cases, yet useless in our case. 3. Parallel For What is parallel for?: For many of you non-programmers, it's a very robust multi-threading approach of using several threads to iterate over some range of items. Usually you would do a fair bit of work in every loop iteration. In our case, Parallel For is something we've been wanting for quite some time. There is barely any multithreading going on in 0AD, so modern 8-thread laptops (like mine), suffer from horrible single-core performance. After implementing a very robust solution with posix pthreads, the speedup with 4 threads was ironically ~3-4x, which is the same amount of gain SAP would have given - however in this case it's consistent. You can take a look at the patch in the RangeManager optimization ticket, I think Parallel For is a really great step for the codebase to really squeeze out multi-threading performance of modern systems. http://trac.wildfire...com/ticket/1707 ------- This is it for now. I am now convinced that we do need a really, really complicated solution for RangeManager that only gives us perfect results. However, I won't be jumping into RangeManager spatial issues any longer, since A15 is nearing up and we need to finish up megapatch optimizations.
  18. Hi MattFox, nice to see another coder throwing himself into the fray on this one. I think the best place you can start from is creating a list of features that the 0AD pathfinder needs. For example, because the game design foresees variable-cost terrain movement (slower movement in mud and bog than on grass), the amount of possible options is drastically reduced. Another big issue is unit sizes - we need a pathfinder that works from small wardogs to large elephants. When we fail to find a path, we need to provide a new path that moves as close to the target as possible. I think the main issue however, is not the pathfinder itself - it's the entire system of ObstructionManager. Basically, every individual unit has an obstruction square (the kind of obstruction you pathfind around). I think this is the biggest flaw of the system, because if you have 200 or more units in close order, the pathfinding becomes incredibly complex. A better solution would be to introduce a new CollisionManager module that manages individual unit-to-unit collisions only. You can see games like StarCraft 2 that do this: And it works great - units that collide simply push each-other apart and out of the way. There is actually a grid-based pathfinding method that uses a hierarchical passability mesh/grid. Incidentally, it also solves unit-sizes due to its clearance-oriented structure. This method is a combination of Clearance-based pathfinding grid with HPA* (Hierarchical Annotated A*): http://aigamedev.com...ed-pathfinding/ It's the only solution that really solves all the pathfinding problems we're facing (besides unit collison that is). Traditional A* grid approach would use floodfilled planes to detect unreachable areas (something like this - notice how unreachable planes are colored different). This can be used with any grid-based method, you just check if (source.planeID == destination.planeID). After that check fails (there is no path available), we just pick the closest tile that has the source.planeID and pathfind to that. Alright, enough ranting from me. I'm really glad someone decided to just jump into the fray - but there are a lot of issues that must be confronted. One thing is clear though - simply implementing a new pathfinder won't fix anything if we have 200 tightly packed obstructions that ruin the algorithm.
  19. I'm currently working on optimizing the RangeManager, which is used for detecting objects within a certain range. The error: "SpatialSubdivision Query too large. Results truncated." is simply a safeguard I put into place to detect if we're doing ridiculously big range queries. This is one of the reasons we're having such bad performance. Right now there's no way around the issue with the old code. I'm working on a completely new solution that lowers the amount of results noticeably.
  20. Hello John. I understand that projects written in C++ seem a bit over the top for hobbyist programmers or CS undergraduates at first. However, C++ is a perfectly productive language if you know it through and through. It's true that it's harder to master than any other language out there, but the performance it offers is something you can't really compare against with scripted or GC based languages. I've worked on several game projects in the past few years and in every case we wished we would have used C++, because GC based languages such as Java or C# would cause GC cleanup spikes that really killed the performance. Using script-based languages such as AS3 was even worse, because it's anywhere around 100 times slower than C++, meaning we put insane amounts of optimizations into the 3D engine. In the end we spent months of optimizations on something you can get straight out of the box with C++ and OpenGL. If you want to make all the same mistakes senior developers have already done, go right ahead and try to write a huge game like this in Python. There's a reason no AAA game is programmed in Java, C# or god forbid Python. P.S. - Torque is Windows only and written in C++.
  21. I committed a temporary "fix" that increases the capacity to 2048. Hopefully with the new Search and Prune approach for the Range Manager, we won't stumble into this issue. You'll have to wait for the next autobuild before the issue corrects itself.
  22. Hi lars, it seems you stumbled upon a new hardcoded limit. It's something I've put into our old SpatialSubdivision class for mostly testing and ensuring that our code isn't doing anything wrong. In your case, it seems like a range query returned more than 1024 entities - a really bad case. debug_warn("SpatialSubdivision Query too large. Results truncated."); Can you give more information on what map you were playing in, how late in the game were you and what /seemed/ to trigger the error? Were there a lot of units on the screen?
  23. A little preview for non-believers, I reckon. The idea is to drop the use of simple entity_id_t handles that we used in the past in the SpatialSubdivision class, where we had to do some searching for the entities own grid every time we did a lookup, which was kind of silly. Also, a huge downside of using Spatial Subdivision grid, was with buildings - a building can take several tiles, which complicated the range query algorithm. We had to sort all the results and then call std::unique on the results to remove the duplicate building id's. Worst of all - the Spatial Subdivision grid class provided no actual distance data for us, which is bad. With Sweep And Prune subdivision, the entities are sorted by their position in the world - perfect right? Sorting itself is very fast because entities never move too much and usually you just have to swap entities in the list to keep the axises sorted. Furthermore, the biggest benefit is that SAP will only give us "perfect" results, so we won't have to do any additional filtering like we do now. That filtering is really expensive with our current Entity-Component System. Since we already have the position data and items are returned sorted closest-to-furthest, we won't have to do any additional work. For standard collision schemes, SAP would perhaps not be the best choice, but for this concrete problem with Object size + Object range, SAP is pretty much perfect. Here's the current prototype I'm working on: /** * A simulated Entity object for this demo alone. The pyrogenesis Entity is completely different, * but its semantics remain the same. Entities have position, size and range. */ struct Entity { Vector2 pos; Vector2 size; float range; inline Entity(const Vector2& pos) : pos(pos), size(10.0f), range(10.0f) { } }; /** * SAPDivision entity handle * Contains approximate X/Z axis index for fast random access and * the object's bounds data */ struct SAPEntity { int approxX; // approximate index in XAxis array (autocorrected) int approxZ; // approximate index in ZAxis array (autocorrected) Entity* entity; // pointer to the entity object with actual data // @note Objects are sorted by "xMin" and "zMin" in the Axis arrays float xMin, xMax; // ------| object |------> axis : the object BOUNDS float zMin, zMax; }; /** * A range query structure for fast queries */ struct SAPRangeQuery { enum { MAX_COUNT = 1024 }; int count; SAPEntity* items[MAX_COUNT]; inline SAPRangeQuery() : count(0) {} inline const SAPEntity* operator[](int index) const { return items[index]; } inline int size() const { return count; } inline void clear() { count = 0; } }; /** * Sweep and Prune spatial subdivision - Great for rangefinding. */ class SAPDivision { std::vector<SAPEntity*> m_XAxis; std::vector<SAPEntity*> m_YAxis; std::vector<SAPEntity*> m_FreeHandles; // reuse handles if possible public: SAPDivision(); /** * @brief Inserts a new entity into the subdivision map * @param entity Entity to insert */ SAPEntity* Insert(Entity* entity); /** * @brief Removes an entity from the subdivision map. * @param handle SAPEntity handle to erase. */ void Remove(SAPEntity* handle); /** * @brief Moves an entity by the given amount. Very cheap - simple shift in the Axis. * @note For small moves the entity is shifted in the SAP axises. Large moves perform Remove+Insert. * @param handle SAPEntity handle to move. * @param pos The new position where to move the entity. */ void Move(SAPEntity* handle, Vector2 pos); /** * @brief Get entities in range of a BOUNDING BOX. * @param inrange [out] The resulting array of entities in range. Sorted from nearest to furthest. * @param min Bounding box minimum. * @param max Bounding box maximum. */ void GetInRangeBB(SAPRangeQuery& inrange, Vector2 min, Vector2 max); /** * @brief Get entities in range of a RADII. * @param inrange [out] The resulting array of entities in range. Sorted from nearest to furthest. * @param pos Center of the range query. * @param range RADII of the range query. */ void GetInRange(SAPRangeQuery& inrange, Vector2 pos, float range); };
  24. Hi again jcantero. All of your statements regarding what /should/ be done is correct, however the current implementation pretty much forces these silly constraints. Basically they're bugs and issues with the approach/algorithm itself. And to fix the issues, the original programmer just threw CPU cycles at the issue to solve it. Hence we have std::sort + std::unique. Why do we need that? Because large buildings can exist in several tiles at the same time.... I know, sounds very silly, right? Well that's the only reason we need std::sort and std::unique. I explore this issue in detail in my latest development report here: http://www.wildfiregames.com/forum/index.php?showtopic=17580entry274324 I would like to point out that, I really like the Sweep & Prune approach. I have a technique in mind that should solve the current issue and would automatically return distance sorted entities from the Sweep & Prune structure. Thanks for pointing out that method.
×
×
  • Create New...