phosit
WFG Programming Team
Everything posted by phosit
-
Don't take the title seriously: what I am writing is my view, not that of the WFG team. I try to speed up the engine and make it less stuttery. First I was looking at multi-threading, since "if one core is too slow, multiple will be faster", right... In the most recently uploaded CppCon presentations about hardware efficiency and general optimizations, these statements were made: "Computationally-bound algorithms can be sped up by using multiple threads" and "On modern hardware, applications are mostly memory-bound". So it seems impossible, or at least hard, to speed up such an application using multiple threads. But step by step:

An application is computationally bound when its speed is limited by the speed of computation. Faster computation does not always mean a faster application. A web browser is likely network-bound: a faster CPU does not improve the application speed by much, but improving the network speed does (this can be split further into latency-bound and throughput-bound). An application can be bound by many other things: GPU-bound, storage-bound, memory-bound, user-bound. An application is fast when no single bound outweighs the others.

In the last decades CPUs got smaller (smaller is important since it reduces latency) and faster -> computationally bound applications became faster. Memory also got faster, but it is spatially still "far" away from the CPU -> memory-bound applications did not become much faster.

There is some logic that can help us: we have to bring the data (normally residing in memory) spatially near to the CPU, or inside it. That's a cache. If the CPU requests data from memory, the data will also be stored inside the cache. The next time it is requested, we do not have to request it from memory again. Exploit: use data from the cache as much as possible. More practically: visit the data only once per <something>. Let's say you develop a game where every entity has to update its position and health every turn.
You should visit every entity once: "for each entity {update position and update health}". When you update the health, the entity's data is already in cache. If you instead do "for each entity {update position} for each entity {update health}", at the start of the second for-loop the first entity will not be in cache anymore (if the cache is not big enough to store all entities), and another (slow) fetch from memory is required.

When the CPU fetches data from memory, most of the time it needs an "object" which is bigger than a byte. So the memory sends not only one byte to the cache but also the data around it. That's called a cache line. A cache line is typically 64 bytes. Exploit: place data which is accessed together in the same cache line. If multiple entities are packed inside the same cache line, only one fetch is needed to get all of them there. Optimally the entities are stored consecutively. If that is not possible, an allocator can be used. The job of an allocator is to place data (entities) in some smart spot (the same cache line). An allocator can yield a great speed improvement: D4718. Adding "consecutiveness" to an already allocator-aware container also yields a small speed improvement: D4843.

Now and then data has to be fetched from memory. It would be ideal if the fetch could be started asynchronously: the CPU would invoke the "fetcher", run some other stuff, and then access the prefetched data. Exploit: the CPU should know which data it has to fetch "long" before it is accessed. In a "std::list", each element stores the pointer to the next element. The first element can be prefetched, but the second element cannot: which data to load cannot be determined before the first fetch completes. This creates a fetch chain, where each fetch depends on a previous fetch. Thus iterating a "std::list" is slower than iterating a "std::vector". Exploit 2: don't use virtual functions; they involve a fetch chain of three fetches.
A normal function call involves only one fetch, which can be prefetched and might even be inlined.

Back to threading: in a single-threaded application you might get away with being memory-bound, since big parts of the application fit into the cache. With multiple threads, each accessing a different part, they still have to share the cache and the fetcher logic (I didn't verify that with measurements). The pathfinder is not memory-bound (that I did measure), so threading it yields an improvement. Sometimes I read that pyrogenesis is not fast because it is mostly single-threaded. I don't think that is true. Yes, threading more parts of the engine will yield a speedup, but in the long run reducing memory-boundness should have a higher priority than using multiple threads. There also might be some synergy between them.
-
Message System / Simulation2
phosit replied to phosit's topic in Game Development & Technical Discussion
That's not what I meant. The messages are still sent via C++ to some JS entity class; that class then sends the message to its components. -
Message System / Simulation2
phosit replied to phosit's topic in Game Development & Technical Discussion
How about sending a message to an entity instead of a component, and the JS entity dispatches it to its components? Then there would be only one C++ -> JS communication per entity. -
Message System / Simulation2
phosit replied to phosit's topic in Game Development & Technical Discussion
I saw that page already and ignored it; I thought it was outdated. I'll see what I can update. I don't know the unit pushing code. It would be great if we could generalize it to optimize range queries. -
Message System / Simulation2
phosit replied to phosit's topic in Game Development & Technical Discussion
Regarding the function call cost I can only guess too. The messages (arguments) are typically small / easy to convert; that should not be a problem. Those use the Reactor / PubSub / Command pattern (which is it?) for async stuff. We need it for customization. More indication that we use the wrong pattern. -
Message System / Simulation2
phosit replied to phosit's topic in Game Development & Technical Discussion
Oh, yes I have an idea. Responding to my own call ... and laughing about my own joke. Currently the components are (among others) accessed through
std::map<ComponentTypeId, std::map<entity_id_t, IComponent*>> m_ComponentsByTypeId;
I would make a container for each ComponentType:
std::map<entity_id_t, CCmpAIInterfaceScripted> AIScripted;
std::map<entity_id_t, CCmpAIManager> AIManager;
// ... all components known to C++
// just as before
std::map<ComponentTypeId, std::map<entity_id_t, CCmpScript>> ScriptedComponents;
There would be no interface classes and no virtual functions; the components could be stored inside the container by value. In short: far fewer indirections and probably more inlining. -
The simulation requires much time in the late game. While this is expected, we still should optimize it. For reference: https://trac.wildfiregames.com/wiki/SimulationArchitecture
There are many ideas:
@Freagarach tried to send messages only to some entities. https://code.wildfiregames.com/D2704 [[Abandoned]]
@Mercury and @wraitii think about multithreading, i.a.: https://wildfiregames.com/forum/topic/90006-mutex/
Mercury also made an allocator. https://code.wildfiregames.com/D4718
There is a new data structure from wraitii. https://code.wildfiregames.com/D1739
Meta: Isn't that (many improvement ideas) a sign of a deeper problem? Should we change the message system or move away from it altogether? Do you have other ideas? (Did we change the system already? Why is there a 2 after Simulation?)
-
In the lower left there is "Alpha XXIV", but there should be "Alpha XXVI: Zhuangzi". For me, community/0ad a26-1 and a26-2 work.
-
Changing C++ Coding Conventions
phosit replied to phosit's topic in Game Development & Technical Discussion
The changes are reverted: not all of them were discussed in this thread, and I made some bad examples. The diff is attached. To explain all of them:
Where to include third-party libraries was not defined in the CC. I formulated what Vladislav said on IRC. The example with boost/optional is bad, since we try to have few dependencies on boost.
I replaced "Use STL when appropriate." with a link to a C++ Core Guidelines rule.
At "Use nullptr" I added "This convention should be implicit but is stated here because there are still many places where `NULL` is used."
I removed "Don't do `if(p) delete p;`". I haven't seen this pattern in the code, and somebody knowing C++ would not write that -> it will be pointed out at code review.
I replaced the paragraph about auto with a link to the C++ Core Guidelines and the sentence "Be aware that overuse of `auto` can also cause problems in the long run."
I moved the Code And Memory Performance guidelines to the bottom and added a link to the C++ Core Guidelines.
Algorithms: I didn't want to write a harsh rule ("No raw loops") or a vague one ("Use algorithms when appropriate"). So what I did was write a reason why not to use raw loops. Here again the example was bad, since I wrote std::ranges::iota instead of std::views::iota, and we don't require support for either.
CC.diff -
Scenario: two components of different types lock their mutexes, but both could access a shared/global resource. Oh... now I get the question of this whole thread: you are asking about those shared resources. I don't know, but I suspect there are many. And much code to check, also JS. We have information about the CPUs used: https://feedback.wildfiregames.com/cpu/
-
Changing C++ Coding Conventions
phosit replied to phosit's topic in Game Development & Technical Discussion
I made changes to the CC: http://trac.wildfiregames.com/wiki/Coding_Conventions?version=56 The part about the C++ Core Guidelines and the one about algorithms are not as prominent as I wish them to be, but I didn't know what else to write. -
Every component could do anything. Because of that it is hard to parallelize the message system, and it is hard for the compiler to make optimizations. Did I understand you correctly: you want one mutex per ComponentType? That would be useless, because a resource might be mutated by multiple ComponentTypes. The function taking long is PostMessage. This function sends a message to all components which have subscribed to the message type. Some subscribed components do nothing: RangeManager subscribes to PositionChanged but has to check whether it "tracks" the entity (CCmpRangeManager.cpp L570). This could be changed to first ask all subscribed components whether they really are interested in the message (this could be done in parallel), and then send the message to the interested components (in sequence). Another idea: at the moment most components have a switch in HandleMessage. If messages were static types, this switch would not be necessary. This approach might interfere with JS.
-
Changing C++ Coding Conventions
phosit replied to phosit's topic in Game Development & Technical Discussion
A link to the CppCoreGuidelines would indeed be useful. Yes, but why? Would you go for str::FromDouble? The misc part will not go into the CC. "Use std::string where possible" would be too harsh. -
Ref: https://trac.wildfiregames.com/wiki/Coding_Conventions#C
auto
This paragraph should be removed/shortened, because sometimes auto makes the code more verbose while adding very little value, sometimes it is slower because of an implicit conversion, and sometimes avoiding it is impossible: lambdas, structured bindings and trailing return types. The CC should state instead: use const auto& or auto&& in range-based for loops and in algorithm-lambda parameters, or write a comment why not.
const int i = std::reduce(vec.begin(), vec.end(), 0, [](const auto& l, const auto& r){ return l + r; });
l and r are what I mean by algorithm-lambda parameters.
algorithm
There is already: But "STL" is an ambiguous term, and this paragraph does not appreciate the std:: algorithms enough. This general rule is not good, because sometimes I have iterators, or the body is an existing function. In both situations it is better to use std::for_each. I would also write a paragraph about avoiding old-style for loops, because iterations can be bypassed from within the body:
// avoid
for (size_t i = 0; i != 9; ++i)
{
    // The index can be mutated. This loop never ends.
    ++i;
}
// safer
for (const int i : std::views::iota(0, 9))
{
    ++i; // compile error
}
Misc
Could we try to integrate the code from source/lib? License; brace style; string type. Probably rewrite it in source/ps. An own string type is a bad solution for adding more functionality: free functions do it as well.
-
I remember a ticket where they talked about distinguishing phases by background color.
-
I don't know what this "man throwing a discus" is. I'm a noob, I know. I start a new game: if it is ordered by function, I know it produces units and is enabled in phase 2 or 3 (since it is disabled in phase 1). If it is ordered by phase, I know only that it is enabled in phase 2 or 3 (since it is between a building of phase 2 and one of phase 3).
