
RedFox

WFG Retired
  • Posts

    245
  • Joined

  • Last visited

  • Days Won

    7

Everything posted by RedFox

  1. At least cg is 'portable' (between DirectX and OpenGL that is). Furthermore, a lot of cool shader programs are out there and already written for Ogre3D. Want to implement parallax mapping? Sure, grab parallax mapping shader from the Ogre wiki and attach it to your material ... Well, in that regard, it's true that the biggest drawback of the engine right now is the ECS and there's no 'easy' way of changing it. Right now it would be prudent to discuss where I could focus my time and coding resource. Making this change to the ECS will be a huuuge change and it will most certainly break script support for a while.
  2. With a proper implementation it can be both maintainable and extendable. The trick is to encapsulate logic into action/decision sequences, much like the Entity-Component system, but actually very different. If a scripting module is implemented, it can just expose current 'decisions' or 'actions' of an entity to the script. The script can implement whatever logic it wants.
  3. That would help me out so much, actually. I have a lot of modified code and nowhere to commit it. The fact of the matter is that implementing all of these graphics engine changes (to make it run decently) would require a huge change. The same effort could go into using Ogre3D instead, which already implements, well, everything graphics related. Myconid's work was mostly with shader programs. Shaders are 'graphics card programs' that are compiled at runtime and uploaded to the graphics card. The shader is then usually run in a massively parallel, asynchronous way, which makes rendering the data very fast. You throw [Vertices, Textures] into a [shader] and it spits out an [image] on the screen. So yeah, myconid's shaders will work on Ogre3D.
  4. Actually, I think just using Ogre3D with MyGUI would be faster to implement, since I'm already familiar with Ogre3D and not so much with pyrogenesis (graphics module). So in that regard, it's an unfair comparison...
  5. Exactly! Luckily, serializing / deserializing in binary is like 3 lines of code when only POD types (simple types like float, int... struct{float,int}) are concerned. However, for more complex types like std::string, a custom implementation would be required... And if pointers enter the picture, things get messy. Furthermore, the items being serialized are usually entity templates, which means a very complex binary format that soon begins to look like text data. In the end it might be easier to just use a new serialization method on top of more condensed data files (like units_athens.txt). Let's leave it at that for now - it can be reimplemented later. Right now, just getting good performance out of the parsing would be the main goal.

     For example, a current component in XML:

         <Identity>
           <Civ>gaia</Civ>
           <GenericName>Gaia</GenericName>
         </Identity>

     could instead be represented by:

         identity gaia

     The textual representation for gaia would be looked up from translation as:

         ; english.txt
         gaia_descr Gaia

     Introducing translation tables would make things so much easier. Furthermore, the component can be easily parsed with:

         file >> cmpIdStr;
         IComponent* c = CreateComponent(cmpIdStr); // this will throw an error on invalid component
         c->Deserialize(file); // will parse till end of line for specific data and fail gracefully
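     The line-based format described above could be parsed roughly as follows. This is only a minimal sketch: IComponent, IdentityComponent, CreateComponent and ParseComponents are hypothetical names used for illustration, not the actual pyrogenesis API, and treating lines that start with ';' as comments is an assumption based on the "; english.txt" example.

```cpp
#include <istream>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical component interface; not the actual pyrogenesis API.
struct IComponent {
    virtual ~IComponent() = default;
    // Reads this component's data from the remainder of one line.
    virtual void Deserialize(std::istream& in) = 0;
};

// Hypothetical component for a line such as "identity gaia".
struct IdentityComponent : IComponent {
    std::string civ;
    void Deserialize(std::istream& in) override { in >> civ; }
};

// Hypothetical factory: maps a component id string to a new instance
// and throws on an unknown id.
std::unique_ptr<IComponent> CreateComponent(const std::string& id) {
    if (id == "identity") return std::make_unique<IdentityComponent>();
    throw std::runtime_error("unknown component: " + id);
}

// Parses one component per line. Blank lines and lines whose first
// token starts with ';' are skipped (assumed comment syntax).
std::vector<std::unique_ptr<IComponent>> ParseComponents(std::istream& file) {
    std::vector<std::unique_ptr<IComponent>> components;
    std::string line;
    while (std::getline(file, line)) {
        std::istringstream fields(line);
        std::string cmpIdStr;
        if (!(fields >> cmpIdStr) || cmpIdStr[0] == ';') continue;
        std::unique_ptr<IComponent> c = CreateComponent(cmpIdStr);
        c->Deserialize(fields); // only sees the rest of this line
        components.push_back(std::move(c));
    }
    return components;
}
```

     Handing each component a stream that contains only its own line keeps a malformed entry from consuming the data of the next component.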
  6. Hmm, that is another thing that most (if not all) MMOs do. Of course, their 'loose' files arrive over broadband... Since the user-provided 'mod' files are in 'plaintext', they will have to be deserialized before being cached as raw binary... However, if the 'plaintext' files are kept in the development folder, then the cache loses its point. To the point: we have to provide a tool (built into pyrogenesis.exe maybe?) that converts 'plaintext' loose files into straight binary. This is something that could be done during release deployment. Otherwise the game would be caching stuff and wasting time.
  7. That is a really interesting idea and it can be easily done, too. But for rapid development, the system would have to be 'doubled'. This is what I mean for such an implementation (also used in Rome - Total War, mind you): 1) Search 'packs' path for *.pack 2) Search 'data' path for 'loose' (unpacked) input files *.txt, *.dds, *.etc If a 'loose' file is found, it always has higher priority, so 'data/test.txt' would be used instead of 'packs/test.pack::/test.txt'. The packs can be compressed, though that wouldn't affect much, since the data is already very dense.
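     The lookup order described above (loose files win over packed ones) might be sketched like this. AssetIndex and ResolveAsset are hypothetical names, and the sketch assumes the 'data' and 'packs' directories have already been scanned into in-memory tables; it does not touch the real filesystem.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical lookup tables, assumed to be built by scanning the
// 'data' and 'packs' paths once at startup.
struct AssetIndex {
    std::set<std::string> looseFiles;               // e.g. "test.txt" found under data/
    std::map<std::string, std::string> packedFiles; // file name -> pack that contains it
};

// Returns the location a file should be loaded from, preferring a loose
// file over a packed one. Returns an empty string if the file is unknown.
std::string ResolveAsset(const AssetIndex& index, const std::string& name) {
    if (index.looseFiles.count(name))
        return "data/" + name;
    auto it = index.packedFiles.find(name);
    if (it != index.packedFiles.end())
        return "packs/" + it->second + "::/" + name;
    return std::string();
}
```

     With this order, dropping an edited file into 'data' overrides the packed copy without rebuilding the pack, which is the point of the 'doubled' system during development.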
  8. Thanks! Migrating from XML would be a simple task, and since all the data is still kept in human-readable form, it can be easily modded and new units can be added in new unit files.

     Well, in our case, writing and reading raw binary would be quite trivial, even in C:

         #include <stdio.h>

         struct Vector3 { float x, y, z; };
         struct Vector3 randExample[10] = { { 0.0f, 0.0f, 0.0f } };
         // .... let's write the array to a file:
         FILE* f = fopen("myfile.bin", "wb");
         fwrite(randExample, sizeof(randExample), 1, f);
         fclose(f);
         // And the whole array of Vectors was written to the file in binary...

     It would streamline the system by allowing explicit knowledge of the types of Entities and their statically defined components. This is not to say that an entity can't contain a 'ScriptComponent'. For now I'm thinking of implementing all of the core components straight in C++ and leaving the 'moddable' implementation to the ScriptComponent. The script itself can choose the number of components it defines. As far as the API is concerned, there is only a 'ScriptComponent' as the moddable implementation.

     Yes, the actual position/rotation/scale should be in a Transformation component (not in separate components), implemented as a 4x4 Matrix. This way, the matrix can be sent directly to the shader without any hot-potatoing going around (a situation where data is seemingly passed from module to module and copied numerous times before it reaches its destination).

     The AI can be turned into a component-like system, where one of the components is ScriptAIComponent (for example..). This would allow us to implement the AI in C++ while still making it possible to add additional AI functionality. The core AI code shouldn't be in JS, because that leaves out a plethora of optimization opportunities that are definitely needed for an efficient and maintainable system. The current UnitAI.js is unfortunately neither of those two.

     The CFixed<> was replaced by float and the serialization method remains the same, thus the hashes are deterministic. It will work if client and server both use this new version, which can probably be taken for granted.
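     For the raw binary example earlier in this post, reading the data back is symmetrical to writing it. Below is a minimal sketch, assuming a trivially copyable struct and that the file is read on a machine with the same endianness and struct layout as the one that wrote it. Vector3, WriteVectors and ReadVectors are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <type_traits>
#include <vector>

struct Vector3 { float x, y, z; };

// Raw fwrite/fread is only safe for types without pointers or
// non-trivial members, so check that at compile time.
static_assert(std::is_trivially_copyable<Vector3>::value,
              "Vector3 must be trivially copyable for raw binary I/O");

// Writes all vectors to 'path' as raw bytes; returns false on failure.
bool WriteVectors(const char* path, const std::vector<Vector3>& v) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    std::size_t written = std::fwrite(v.data(), sizeof(Vector3), v.size(), f);
    std::fclose(f);
    return written == v.size();
}

// Reads exactly 'count' vectors from 'path'; returns false on failure
// or if the file holds fewer than 'count' vectors.
bool ReadVectors(const char* path, std::size_t count, std::vector<Vector3>& out) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    out.resize(count);
    std::size_t read = std::fread(out.data(), sizeof(Vector3), count, f);
    std::fclose(f);
    return read == count;
}
```

     A real file format would also need to store the element count (and probably a version number) in a header, since the reader here has to know 'count' in advance.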
  9. I have written around 4-5 GUIs from scratch by now and I've seen a plethora of different design choices. I even have a WIP GUI system for DirectX hanging in a repo somewhere. Though I should note that there is MyGUI for Ogre3D, which is much more functional than the current pyrogenesis GUI. That is a huge list of changes to the graphics engine. It basically means an almost complete rewrite. Exactly my point! It would remove the need to debug all that graphics code and would leave a lot more time for other, more pressing matters. For now there are so many things in the engine that require the attention of a full-time programmer. I don't think any of these changes would be possible if someone tweaked a bit of code every few days or so. I just worked 2 days straight on pyrogenesis, totaling around 20 hours, to change 250 files, remove a mad scientist's crazy experiment from the engine and win a 20% performance improvement for the game. It wouldn't have been possible if I had worked on and off. The same applies to most of the required changes to the engine.
  10. Hey guys, I haven't been active for a few years, but I've been tweaking the pyrogenesis source lately and made some performance improvements:

      Removed CFixed<> and replaced it with float. All fixed::IsZero checks were replaced with ::IsEpsilon(float). This accounts for float imprecision in some cases and removes thousands of conversions from CFixed -> float and float -> CFixed. Improves overall performance by ~20%.

      Replaced boost::unordered_map<key, value> with std::unordered_map<key, value> and implemented std::hash<> for the required keys. This gave only a small performance increase of ~5%, but it was worth a shot, since unordered_map is part of the C++ standard. Most notably, the bad performance was due to unimplemented hash functors (!). Right now performance can be further improved if all remaining std::map<> instances are given a proper hash method (that is, moved to hashed containers). As it turns out, however, most of these maps can be removed altogether.

      Result: before 37fps, after 48fps on my i7-720QM and Radeon HD5650 laptop.

      It took modifying some 250 files and a minor redesign of the math library, but the benefit was worth it: a faster game and more maintainable code. Right now the engine has a lot of inefficiencies that are mostly caused by naive implementations:

      1. Inefficient Renderer:
      Problem: Currently the rendering system constructs a sorted map of models and their shaders Every Frame. Not only is it bad that such a large map and vector are created every frame, the system could be redesigned to achieve the same effect with no maps or vectors recreated at all.
      Solution: The models can be sorted beforehand, during context initialization (!), and grouped under specified shaders and materials.
      Result: This should give a huge (I'm not kidding) performance boost, so this will be the next thing I'll implement.

      2. Entity/Actor/Prop XMLs:
      Problem: Not only is XML inefficient to parse, it's also much harder to actually read and edit. To make it worse, the filesystem overhead for these hundreds of files is huge (!). Loading times on weaker filesystems can be very long.
      Solution: Use a simplified text parser for entity/actor/prop parsing to increase speed. Group common actors/entities into single text files (e.g. units_athenian.txt) while still retaining the 'modability' of the game. All current data can be easily converted from XML to the desired (custom) format.
      Result: Loading times and memory usage will decrease dramatically (which is a pretty awesome thing, I might add).

      3. String Translation:
      Problem: It would seem as if this had no effect on performance or memory, but you should think again. Creating a translation (multilingual) system offers us a cheap way to group all the strings together into one huge memory block.
      Solution: Load game/unit strings into a large memory block and simply share out const pointers to the strings. Something like this: "unit_javelinist\0Generic Peltast\0unit_javelinist_descr\0Peltasts are light infantry...\0".
      Result: Every heap allocation costs some memory for the system heap bookkeeping, and on the Win32 platform it's around ~24 bytes. If you have 1000 separately allocated strings of size 24, that is 24KB of string data plus another ~24KB of bookkeeping, totaling ~48KB used. With the translation system this would remain ~24KB. Of course, there are many more strings in the game, so the result would be noticeable.

      4. Entity-Component System is inefficient:
      Problem: While some would say that having a map<int, map<int, *>> shouldn't have any real effect on performance (since the profiler(?) says so), I would recommend reconsidering and rethinking the algorithm of the entire system. Maps are very memory intensive and too many vectors are created/destroyed constantly, while the look-up of Components remains slow.
      Solution: Give the good old ECS pattern a slight Object-Oriented overhaul. Create an IEntity interface and give concrete implementations like UnitEntity : IEntity a strong reference to each Component in the class definition. This will remove the need for component look-up. The message system can also be redesigned to be more direct. And finally, Managers can be spread over frame slots across frames, since a lot of the data can be updated asynchronously.
      Result: This will overhaul and streamline the simulation engine, making it more maintainable, easier to program, and much, much faster and more memory efficient.

      5. Naive Pathfinder:
      Problem: The current implementation runs a very time-consuming algorithm over the entire span of the region, making it very inefficient.
      Solution: Redesign the pathfinder to include a long-distance inaccurate path and a short-distance accurate path. The accurate path is only updated incrementally and for very short distances, making this method suitable for an RTS.
      Result: A huge performance improvement for moving units.

      6. Naive Collision/Range/Obstruction detection:
      Problem: Currently a very naive subdivision scheme is used for collision, range and obstruction. Even worse, all of these modules are decoupled, duplicating a lot of data and wasting time on multiple updates of the same kind.
      Solution: A proper Quadtree embedded into the Entity Component System would help keep track of all the entities in a single place (!), and the Quadtree structure itself would make ANY(!) range testing trivial and very fast.
      Result: Performance would increase by at least 10 times for the Collision/Range/Obstruction detection components.

      7. AI is a script:
      Problem: I can't begin to describe my bafflement when I saw UnitAI.js. AI is something that should be streamlined into the engine as a decoupled yet maintainable module - a component of ECS, yet in a little world of its own.
      Solution: Translate UnitAI to C++ and redesign it. In an FSM case (such as UnitAI.js), the controller pattern would be used to decouple actions from the FSM logic and make it maintainable.
      Result: AI performance would increase drastically and an overhauled system would be far more maintainable in the future, leaving room for improvements.

      -----

      This is not exactly a duplicate of the old Performance Optimisations thread - I've made these observations and conclusions based on the current source and my past experience as a C++ developer. Furthermore, the reason why I'm bringing up all these problems is that I intend to solve all of them, and I also have the required time and skills to do it. Michael, as the Project Leader, has invited me to join as a full-time developer for 0AD after my mandatory service in the Defense Forces finishes in May, so right now I'll be focusing on small changes and getting fully acquainted with the source.

      As for me, my name is Jorma Rebane, I'm a 22 year old Software Developer of real-time systems. My weapon of choice is C++, though often C and assembler are required. I've worked with several proprietary and open-source 3D engines in past projects as a graphics and GUI programmer, so I'm very comfortable around DirectX and OpenGL.

      -----

      Hopefully these points will bring out a constructive discussion regarding 0AD performance and improvements. Cheers!
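      The string-block idea from point 3 above could look roughly like this. It is a minimal sketch under the assumption that the block alternates keys and values, each terminated by '\0', as in the example string in the post. StringTable and its members are hypothetical names, not existing engine code.

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Owns one contiguous block of "key\0value\0key\0value\0..." and hands
// out const pointers into it, so no string is allocated separately.
class StringTable {
    // (key, value); both pointers point into m_block.
    using Entry = std::pair<const char*, const char*>;

public:
    explicit StringTable(std::string block) : m_block(std::move(block)) {
        const char* p = m_block.data();
        const char* end = p + m_block.size();
        while (p < end) {
            const char* key = p;
            p += std::strlen(p) + 1;
            if (p >= end) break; // trailing key without a value: ignore it
            m_entries.emplace_back(key, p);
            p += std::strlen(p) + 1;
        }
        // Sort by key once so lookups can use binary search.
        std::sort(m_entries.begin(), m_entries.end(),
                  [](const Entry& a, const Entry& b) {
                      return std::strcmp(a.first, b.first) < 0;
                  });
    }

    // Copying or moving would leave the entries pointing at the wrong block.
    StringTable(const StringTable&) = delete;
    StringTable& operator=(const StringTable&) = delete;

    // Returns the value for 'key', or nullptr if the key is not present.
    const char* Get(const char* key) const {
        auto it = std::lower_bound(m_entries.begin(), m_entries.end(), key,
                                   [](const Entry& e, const char* k) {
                                       return std::strcmp(e.first, k) < 0;
                                   });
        if (it != m_entries.end() && std::strcmp(it->first, key) == 0)
            return it->second;
        return nullptr;
    }

private:
    std::string m_block;
    std::vector<Entry> m_entries;
};
```

      The whole table needs two allocations (the block and the entry vector) regardless of how many strings it holds, which is where the saving on per-allocation bookkeeping would come from.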
  11. Yes, I really agree about the massive amount of time needed to migrate to Ogre3D. Fortunately, pyrogenesis is written very well in that regard - graphics and game engine are well separated. I also agree that this is something more for part 2, if anything. Though I disagree about unnecessary bloat. Ogre3D is a very extendable library and, in that sense, it does require some tweaking to get your desired fix out of it. It's open source, so the build can be customized to include/exclude features and modules that are considered bloat. Think of it as a customized Ogre distro. What I was trying to bring to discussion, though, was: which one of the following is easier? 1) Redesign the current pyrogenesis engine to bring out the needed performance. 2) Migrate to Ogre3D - a painful task, but it might take less time than redesigning/fixing pyrogenesis. The bonus of Ogre3D is that it works on Windows, Linux, OSX, Android... And best of all? It has great performance. On a side note: Michael has invited me to take on 0AD development full-time. I'm a C/C++ real-time systems programmer and I've worked on 4 separate graphics engines in my past (1 software rasterizer, a project on Ogre3D, a robotics simulator on OpenGL and a high-end game graphics engine on DirectX). So I would have the required time and skillset to make these changes. The reason I'm available is that my mandatory service in the Estonian Defense Forces is coming to an end and Michael made me quite an interesting offer regarding 0AD development.
  12. Hi! I haven't been on this forum for some time, but I think I can provide some additional insight regarding this. Right now the engine constructs material buckets (for batch rendering) every frame. This is something that should stick to the context and simply be updated when the context changes (i.e. a new map is loaded) and when new materials are submitted. As far as sorting techniques by distance is concerned - this is something that has been implemented a bit incorrectly. A better solution would be to have an IShaderTechniqueMgr that keeps an associated list of materials and dependent models. This would allow for far easier rendering:

         for (IShaderTechniqueMgr* tech : context->GetTechniques()) {
             for (IMaterialPass* pass : tech->GetMaterialPasses()) {
                 renderer->SetTechnique(tech->Technique());
                 renderer->SetMaterial(pass->Material());
                 for (IModel* model : pass->GetModelsCulled(frustum)) {
                     renderer->Render(model);
                 }
             }
         }

      Why exactly is this good, you ask? Well, you wouldn't be wasting time creating vectors and sorting them all the time. To top it all off, the technique manager or the material pass for the alpha tested objects can update its list of models as needed. I can't begin to describe how much of a performance boost this would give to the engine.

      Right now it is true that pyrogenesis isn't really the best graphics engine out there and would require quite a few changes before it could be used properly. Another solution would be to just use an available, tried-and-tested open-source graphics engine and stop trying to implement everything by ourselves (which is obviously a bit too much to handle due to the lack of programmers). Ogre3D has been mentioned many times before and is a very, very good engine. It is supported on the majority of platforms and its implementations are well debugged and maintainable. Furthermore, its API is extremely well documented and the engine itself has all the tools needed to build any game engine... (notably including shader managers, material passes and friends). To top it all off, the Windows implementation can use DirectxRenderer (which is a virtual device, mind you), making it even faster for Windows nuts. This is just food for thought.
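      The idea of keeping material buckets alive between frames, instead of rebuilding them every frame, might be sketched as below. Model and MaterialBuckets are hypothetical stand-ins, not the interfaces named in the post, and the draw callback takes the place of a real renderer.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: an id plus the name of the material it uses.
struct Model {
    int id;
    std::string material;
};

// Keeps models grouped by material across frames. The buckets only
// change when a model is added or removed, never during rendering.
class MaterialBuckets {
public:
    void Add(const Model* m) { m_buckets[m->material].push_back(m); }

    void Remove(const Model* m) {
        auto it = m_buckets.find(m->material);
        if (it == m_buckets.end()) return;
        std::vector<const Model*>& models = it->second;
        models.erase(std::remove(models.begin(), models.end(), m), models.end());
        if (models.empty()) m_buckets.erase(it);
    }

    // Calls 'draw' for every model, one material bucket at a time, and
    // returns how many material switches were needed for the frame.
    template <class Draw>
    int Render(Draw draw) const {
        int switches = 0;
        for (const auto& bucket : m_buckets) {
            ++switches; // a real renderer would bind the material here
            for (const Model* m : bucket.second) draw(*m);
        }
        return switches;
    }

private:
    std::map<std::string, std::vector<const Model*>> m_buckets;
};
```

      Rendering a frame is then a plain walk over existing containers. The map here is only modified when the scene changes, so its cost is paid on load and on entity creation rather than on every frame.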