
Leaderboard

Popular Content

Showing content with the highest reputation on 2022-12-18 in all areas

  1. Don't take the title seriously. What I am writing is my view and not that of the WFG team. I try to speed up the engine and make it less stuttery. First I was looking at multi-threading, since "if one core is too slow, multiple will be faster", right... In the most recently uploaded CppCon presentations about Hardware Efficiency and General Optimizations, these statements were made: "Computationally-bound algorithms can be sped up by using multiple threads" and "On modern hardware, applications are mostly memory-bound". So it seems impossible, or at least hard, to speed up such an application using multiple threads. But step by step:

Computationally bound means the speed of an application is limited by the speed of computation. Faster computation does not always mean a faster application. A web browser is likely network-bound: a faster CPU does not improve the application speed by much, but improving the network speed does (network-bound can further be split into latency-bound and throughput-bound). An application can be bound by many other things: GPU-bound, storage-bound, memory-bound, user-bound. An application is fast if no bound outweighs the others.

In the last decades CPUs got smaller (smaller is important since that reduces latency) and faster -> computationally bound applications became faster. Memory also got faster, but it is spatially still "far" away from the CPU -> memory-bound applications did not become that much faster.

There is some logic that can help us: we have to bring the data (normally in memory) spatially near to the CPU, or inside it. That's a cache. If the CPU requests data from memory, the data will also be stored inside the cache. The next time it is requested, we do not have to request it from memory again. Exploit: use data from the cache as much as possible. More practically: visit the data only once per <something>. Let's say you develop a game where every entity has to update its position and health every turn.
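A minimal C++ sketch of the one-pass-per-entity idea (the Entity fields and update bodies are made up for illustration, not engine code):

```cpp
#include <cassert>
#include <vector>

// Hypothetical entity; the fields are placeholders for illustration.
struct Entity {
    float x = 0.f, y = 0.f;
    int health = 100;
};

// Cache-friendly: each entity is visited once per turn, so the health
// update works on data the position update just pulled into cache.
void UpdateAllOnePass(std::vector<Entity>& entities) {
    for (Entity& e : entities) {
        e.x += 1.f;    // update position
        e.health -= 1; // update health while e is still in cache
    }
}

// Cache-unfriendly: by the time the second loop starts, the first
// entities may already have been evicted and must be fetched again.
void UpdateAllTwoPass(std::vector<Entity>& entities) {
    for (Entity& e : entities) e.x += 1.f;
    for (Entity& e : entities) e.health -= 1;
}
```

Both functions compute the same result; only the memory-access pattern differs, which is exactly why this kind of problem does not show up in correctness tests, only in profiles.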
You should visit every entity once: "for each entity {update position and update health}". When you update health, the data of the entity is already in cache. If you instead do "for each entity {update position} for each entity {update health}", at the start of the second for-loop the first entity will not be in cache anymore (if the cache is not big enough to store all entities) and another (slow) load/fetch from memory is required.

When the CPU fetches data from memory, it usually needs an "object" which is bigger than a byte. So the memory sends not only that byte to the cache but also the data around it. That's called a cache line. A cache line is typically 64 bytes. Exploit: place data which is accessed together in the same cache line. If multiple entities are packed inside the same cache line, only one fetch is needed to get all of them. Optimally the entities are stored consecutively. If that is not possible, an allocator can be used. The job of an allocator is to place data (entities) in some smart spot (the same cache line). An allocator can yield a great speed improvement: D4718. Adding "consecutiveness" to an already allocator-aware container also yields a small speed improvement: D4843.

Now and then data has to be fetched from memory anyway. It would be ideal if the fetch could be started asynchronously: the CPU would invoke the "fetcher", run some other stuff, and then access the prefetched data. Exploit: the CPU should know which data it has to fetch "long" before it is accessed. In a "std::list", each element stores the pointer to the next element. The first element can be prefetched, but the second element cannot be: which data to load cannot be determined before the first fetch completes. This creates a fetch-chain where each fetch depends on a previous fetch. Thus iterating a "std::list" is slower than iterating a "std::vector". Exploit 2: don't use virtual functions; they involve a fetch-chain of three fetches.
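The cache-line packing point can be sketched like this (a compile-time illustration with a made-up 16-byte entity, assuming the common 64-byte line size mentioned above):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical compact entity: exactly 16 bytes, so four of them share
// one 64-byte cache line and a single fetch brings in all four.
struct CompactEntity {
    float   x, y;   // 8 bytes
    int32_t health; // 4 bytes
    int32_t owner;  // 4 bytes
};

// Align the storage to a cache-line boundary so a block of four
// entities never straddles two lines.
struct alignas(64) EntityBlock {
    CompactEntity entities[4];
};

static_assert(sizeof(CompactEntity) == 16, "keep the entity cache-line friendly");
static_assert(sizeof(EntityBlock) == 64, "one block should fill exactly one cache line");
```

The static_asserts act as a guard: if someone later adds a field and silently pushes the struct across a line boundary, the build breaks instead of the frame rate.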
A normal function call only involves one fetch, which can be prefetched, and it might even be inlined.

Back to threading: in a single-threaded application you might get away with the memory-boundness, since big parts of the application fit into the cache. With multiple threads each accessing a different part, they still have to share the cache and the fetcher logic (I didn't verify that with measurements). The pathfinder is not memory-bound (that I did measure), so threading it yields an improvement. Sometimes I read that pyrogenesis is not fast because it is mostly single-threaded. I don't think that is true. Yes, threading more parts of the engine will yield a speedup, but in the long run reducing memory-boundness should have a bigger priority than using multiple threads. There also might be some synergy between them.
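The virtual-call fetch-chain above can be made concrete with a small sketch (the Unit/Archer types are invented for illustration): a call through a base reference must fetch the object to read its vtable pointer, fetch the vtable to read the function pointer, and only then fetch the function's code, while a non-virtual call knows its target at compile time.

```cpp
#include <cassert>

struct Unit {
    virtual ~Unit() = default;
    // Virtual: fetch object -> fetch vtable -> fetch function code.
    virtual int Damage() const { return 10; }
    // Non-virtual: target address known at compile time; only the
    // function code needs fetching, and the call may inline entirely.
    int DamageDirect() const { return 10; }
};

struct Archer : Unit {
    int Damage() const override { return 7; }
};

int TotalDamage(const Unit& u) {
    // The compiler cannot know which Damage() runs here without
    // following the vtable at run time, so it cannot inline it
    // (barring devirtualization).
    return u.Damage() + u.DamageDirect();
}
```

Calling TotalDamage(Archer{}) dispatches Damage() through the vtable to Archer's override while DamageDirect() resolves statically, which is the difference the post is pointing at.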
    3 points
  2. This was one of his existing auras, not a new one. Having done some tests, it is slightly more effective against CCs than the Roman hero (Scipio), who gives +2 capture attack to all units. I agree it is obscure in that the regeneration rate stat is not even visible. This never bothered me much though; I just knew it meant he was good at capturing things. Honestly, I like the uniqueness of it. The idea, I assume, is that since Alexander captured so much territory in such a short amount of time, he should be really good at capturing CCs. I agree with this. The advantage of the Roman hero is that he also captures non-CC buildings very quickly, whereas Alexander doesn't have any special effect on them. In adding the new Alexander auras I was trying not to go too crazy and make him too strong. Depending on how strong he is in practice, I could be open to supporting one of the below adjustments: remove the current CC aura and add a capture attack bonus equal to or greater than the Roman hero's (+2), which could be added to his new Rapid Conquest aura. Or, change the current capture regeneration rate debuff aura to affect all buildings, not just CCs.
    2 points
  3. I think this does not work for ships, as they don't have rally points. And even if it did work, it would be nicer if it happened automatically. The ship should delay the spawn only if the ship is full AND it's not at a coast or close to another ship.
    2 points
  4. Well, you still have to compute the state. I suppose you could do it non-visually, then offer the checkpoints you are talking about.
    1 point
  5. Athenians and Persians can train units in ships. This would be even better if they could also do it while away from the coast. The units could simply be garrisoned inside the ship (currently the training doesn't happen unless the ship is at a coast).
    1 point
  6. So ironically, the upcoming mod being worked on by students at the Activ'Design school has Mechas.
    1 point
  7. No, this is a community mod. The community has spoken on 5 of the MRs, and these are the ones that should be added. I will revise the upgrades and see if they are of interest later.
    1 point
  8. The game has very few structure/building/construction/defensive techs and bonuses (but mainly techs). Consider those.
    1 point
  9. 1 point