Leaderboard


Ludo38

WFG Retired
- Points
  
  2
- Posts
  
  721
RedFox

WFG Retired
- Points
  
  2
- Posts
  
  245
wrod

Community Members
- Points
  
  1
- Posts
  
  86
Mythos_Ruler

WFG Retired
- Points
  
  1
- Posts
  
  14.941

Popular Content

Showing content with the highest reputation on 2013-05-01 in all areas

Reputation Points

Today I just decided to look at my 0 A.D. profile and noticed there was this reputation point thingy, and I got 1, WOO!!! I'm just wondering... what are reputation points and what are they for?
- May 2, 2013
1 point
Mythos_Ruler's Playlist

In love with this song! Makes me smile every time I listen to it! It's as if Michael Jackson was still alive and touring with Daft Punk.
- May 1, 2013
1 point
[DISCUSS] Performance Improvements

Great post, Yves!

RedFox: I know the GUI example was just for demonstration, but like Philip I get about 800-900 fps on the main menu with a 2-year-old GPU, so I can't believe it's worth doing anything about the UI for performance reasons (there may be other good reasons, but then they need to be discussed in a different topic/context). Even a significantly slower system should have acceptable performance there. On an ancient c. 2004 single-core laptop with Intel GM915 graphics, I'm still getting 70+ fps in the UI, even though that GPU is unusable with the game, but I don't think we should fret too much about that right now (especially since there was discussion recently and more or less everyone agreed dropping the fixed function pipeline would be no loss).

My advice for anyone wanting to optimize the game would be to actually play it. Play it in single player with AIs, but know they are very stupid and inefficient currently, so also play it in multiplayer with no AIs. By playing I mean finishing a game. The huge combat demo is not really playing the game, and loading one of the few maps with excessive trees isn't really playing the game either. You'll notice real world issues playing real world maps in a real world way. The last staff match I played was 8 players: 4 humans and 4 AIs. We all commented on how smooth the game was. Though the map was chosen to be as lag-free as possible, I think it illustrates the point. The game runs remarkably smoothly for me in multiplayer games. I'm not quite on a 2007 laptop, but many of the issues affecting my experience should be the ones affecting your experience, and others', in an even bigger way. It would be nice to focus on those first, rather than considering rewriting the renderer, or major simulation architecture rewrites. It might take more time before you understand what's going on and what needs to be done, but there are others around (like Philip) who have looked at these problems before, so you can use that knowledge and not be completely in the dark.

No one is working on the short range pathfinder as far as I know, and that's a major concern in battles, of which the huge combat demo is an extreme example. I would be interested if someone tried that map and then claimed the GUI engine, renderer, or even fixed point math is the most serious concern or even a logical place to begin. Formations need redesigning, but they bring out the worst sorts of pathfinding issues, especially with large formations moving around static obstacles (the current long range pathfinder scans the entire map to find that a tile is unreachable, and it might do this dozens of times per turn - a problem mostly solved with Philip's WIP patch). I ask what the benefit of a 20% gain really is in code that may take several seconds per turn because it doesn't scale?

AIs aren't careful about memory usage, so GC becomes a problem (significant intermittent delays), and the AI API doesn't expose all the functionality they need, so they have to do things like terrain analysis in JS, doing MBs of allocations - instead we should have a better C++ interface for AIs and move performance critical logic there. AIs should be multithreaded, pathfinding should be multithreaded - in both cases I think there are blocking issues before we get to that point, like upgrading JS or completing the pathfinder design. We should be more careful about how we schedule GC.

Having a thorough benchmark mode would be great, so we could more reliably measure and report performance data. Having a better way to collect and analyze performance data from our users would be a boon - currently it does something like report data at the start of a match, if they enable user reports, and it goes into a massive database that is never actually used.

There are a lot of tasks for an able C++ programmer that don't involve squeezing out small gains here or there because a profiler might indicate it's an issue. We need to keep the big picture in mind as well.
- May 1, 2013
1 point
[DISCUSS] Performance Improvements

I agree that it's extremely inefficient. It won't take much effort to modify the current code. I'll take a look into it and submit a patch sometime if it's okay. Before I do that, I'm going to build a small test with FreeType 2 beautiful anti-aliased fonts...
- April 30, 2013
1 point
[DISCUSS] Performance Improvements

I think you're right, it doesn't serve any purpose at the moment.

Perhaps you're correct on that matter. Since a lot of these problems are disputatious, it's better to handle smaller tasks right now.

Since Philip's pathfinder is still a WIP, I haven't given it much attention. Perhaps Philip could comment on whether he can finish the pathfinder, or whether we can start over?

Right now simulation 'turns' are taken a few times per second. Given that we can calculate on average how many frames are rendered between turns (let it be N), we can divide the time into 'frame slots'. Specifically, you can assign slot[0] = Task0, slot[1] = Task1, slot[N] = TaskN, where of course any task that has a slot greater than N just overflows. In this sense, if the tasks are asynchronous they can be assigned to any slot. If they require a previous task to run first, they have to be assigned to a higher slot. It's actually pretty simple once you put it to code. It just helps you divide all the tasks over a series of frames instead of calling all the tasks every 250 ms. It should help with the framerate spikes and it won't introduce data-race conditions.

I thought Philip said that JavaScript has no noticeable effect on performance?

You are correct, that is always a huge issue - getting the same results on different platforms.

So a lot of functions would need to have profile macros? Or do you mean a kind of profiler you can run inside JS scripts?

So basically, it's rather useless since the amount of data is overwhelming. Now Profiler2 sounds like something much better.

I think all of this profiling and implementing a better profiler is really really good stuff actually... BUT - There's always a but somewhere. Visual Studio 2012 has a really really good profiler too... The main benefit of the VC++ integrated profiler is that it accurately measures time spent inside a function and the timers are compiled into the assembly output if you run your program with the profiler on.
Of course its downside is that it has a similar issue to Profiler1: it generates huge (~512 MB per minute for pyrogenesis) datasets that take quite some time (a few minutes) to analyze. Luckily the analyzed report is quite small.

So for example, taking a performance sample of just running the Main Menu and opening the few tooltips present, of course only filtering and showing the last minute of crazy menu action:

First thing we can see is that the 0 AD Main Menu doesn't use a lot of CPU power: out of 8 cores it only needs one active thread while the others are mostly in a sleeping state. Even though 0 AD didn't consume that much CPU, it still didn't run fast. What gives? The only noticeable peaks in CPU activity are probably from menus and dialogs opening up.

Checking the hot path for the generated report, we see something interesting:

Around 25% of time is spent suspending the thread (which is very good - we don't need to use more power than necessary), and another 50% is used for WaitForSingleObject, a Windows API function that blocks until a synchronization object such as a mutex is signaled. Are we having concurrency issues here? Is it a worker thread waiting for mutex wakeup? Interesting to note is that std::char_traits<char>::compare is being called a lot and is thus pointed out by the profiler. There were 5 threads running in total, so it makes sense that they were being suspended most of the running time, otherwise CPU usage would have skyrocketed.

Taking a closer look at the function statistics that we really care about: the active application hot path. Now we can start really profiling the main menu. What is going on here exactly? Aside from lots of threads sleeping, we can see that GuiManager is the one doing the real work. Around 20% of the total time (out of the remaining 25%) was spent drawing the GUI; if we do some quick math (20/25), that makes 80% of the active run time. So what, big news? Rendering takes time. We all knew that. Besides, everyone uses recursion to create easy-to-use GUI APIs.
What we're missing here, however, is why recursion is taking so much time. Image drawing doesn't take any time at all, however CTextRenderer::Render does:

So most of the time is spent rendering text? Let's have a look:

Hah! I get it. It's constructing the glyph runs every frame again and again and again. It should do this when the Text object is initialized with text. The Direct2D API on Windows has special glyph run objects for handling this and creating static text vs dynamic text. Clearly we've found why the GUI was struggling slightly on the profiler run.

Summary: Even though the GUI isn't the priority, it was the easiest thing to profile (just to showcase the VS2012 profiler and how useful it can be) in such a short amount of time. And it also helped discover that the 10-year-old GUI code actually has a pretty big weakness: text rendering wastes a huge amount of memory and resources every frame, since the glyph runs are temporary objects.

I've been looking at the new FreeType 2 library, which is an open-source library used for TrueType and OpenType text rendering. It has a very slim and streamlined API. Perhaps this is something I could start with?
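
For anyone who wants to see the 'frame slots' idea from this post in code, here is a minimal sketch. It is only an illustration under stated assumptions: `FrameSlotScheduler`, `Assign` and `RunFrame` are hypothetical names that do not exist in the engine, and "overflow" is interpreted here as clamping to the last slot, which the post does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the 0 A.D. code base.
// Spreads per-turn tasks over the N frames rendered between two simulation turns.
class FrameSlotScheduler
{
public:
    using Task = std::function<void()>;

    explicit FrameSlotScheduler(std::size_t slotCount) : m_Slots(slotCount)
    {
        assert(slotCount > 0);
    }

    // Assumption: a task asking for a slot past the end is clamped to the last slot.
    // A task that depends on another one should be given a higher slot.
    void Assign(std::size_t slot, Task task)
    {
        if (slot >= m_Slots.size())
            slot = m_Slots.size() - 1;
        m_Slots[slot].push_back(std::move(task));
    }

    // Call once per rendered frame: runs the tasks of the current slot, then advances.
    // Returns true when the last slot of the turn has just run.
    bool RunFrame()
    {
        for (const Task& task : m_Slots[m_Current])
            task();
        m_Current = (m_Current + 1) % m_Slots.size();
        return m_Current == 0;
    }

private:
    std::vector<std::vector<Task>> m_Slots;
    std::size_t m_Current = 0;
};
```

Everything still runs on one thread, so the sketch only spreads work over frames and does not add any locking. Whether the real per-turn work can be cut into independent tasks like this is a separate question.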
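
The fix suggested above for text rendering (build the glyph run when the text is set, not on every frame) could look roughly like the sketch below. This is not the engine's CTextRenderer and it does not use FreeType or Direct2D. `CachedText`, `GlyphQuad` and the fixed `advance` width are made-up stand-ins for real font metrics, chosen only to show the caching pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; they are not the engine's real classes.
struct GlyphQuad
{
    float x;        // horizontal pen position of the glyph
    char32_t code;  // code point to draw
};

class CachedText
{
public:
    // advance: fixed glyph width, standing in for real font metrics.
    explicit CachedText(float advance) : m_Advance(advance)
    {
        assert(advance > 0.0f);
    }

    // Setting the same text again is a no-op; new text marks the cached run as stale.
    void SetText(std::u32string text)
    {
        if (text == m_Text)
            return;
        m_Text = std::move(text);
        m_Dirty = true;
    }

    // Called every frame; rebuilds the glyph run only if the text changed.
    const std::vector<GlyphQuad>& GetRun()
    {
        if (m_Dirty)
        {
            m_Run.clear();
            float pen = 0.0f;
            for (char32_t c : m_Text)
            {
                m_Run.push_back({pen, c});
                pen += m_Advance;
            }
            m_Dirty = false;
            ++m_Rebuilds;
        }
        return m_Run;
    }

    // Number of times the run was actually rebuilt (for measuring the effect).
    std::size_t RebuildCount() const { return m_Rebuilds; }

private:
    float m_Advance;
    std::u32string m_Text;
    std::vector<GlyphQuad> m_Run;
    bool m_Dirty = true;
    std::size_t m_Rebuilds = 0;
};
```

Calling `GetRun()` a hundred times on unchanged text rebuilds the run once, which is the behaviour the post asks for with static text.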
- April 30, 2013
1 point
[DISCUSS] Performance Improvements

Maybe we should just skip the fixed point vs float discussion for a moment. 20% sounds like a lot, but I think you measured that at the beginning of the game, and the bigger problem is later when the simulation starts using up most of the resources. Also it could make such a significant difference simply because other parts of the engine aren't optimized yet and have to do too many calculations. Philip indicated something like that related to the short-range pathfinder. Last but not least, there seem to be different opinions that are well argued but still no real consensus.

If you will eventually become a fulltime programmer, you should start with a less contentious issue. Don't get me wrong, but no matter if you're right or wrong in that matter, it would be good to add something everybody is happy with. There are enough tasks for that, it shouldn't be a problem.

In my opinion the main problems currently are:

1. Pathfinder: Meaning the pathfinder itself that is used for unit movement, but also the pathing functionality that the AI needs and currently implements separately because there's no interface to the engine's pathfinder implementation that fulfills the requirements.
2. Multithreading support: Separating AI and Simulation from the rendering would about double the average framerate and would get rid of lag spikes happening when, for example, the AI calculates the map for dropsite placement (multiple seconds on large maps!). I've tried doing that for the AI, but my conclusion was that the SpiderMonkey upgrade needs to be done first (3).
3. JavaScript engine improvements and SpiderMonkey upgrade: I'm currently working on that in ticket #1886. It's difficult to predict how much that will improve performance. For Firefox the SpiderMonkey performance improved a lot, but our situation is a bit different. Anyway, the upgrade is needed for various reasons - one being the multithreading support.

An issue we have with all kinds of performance improvements is how to measure them.
You are talking about a 20% improvement, which is inherently quite a random number that you measure in one specific situation on a very specific platform (hardware, driver, OS, compiler options) and with unspecified settings. It would be very helpful both for making good judgements and for motivation to have better ways of measuring performance.

What's missing is some kind of benchmark mode that tests different aspects of performance in scripted ingame scenes. It could be some camera movements over dense forests, over water, a lot of units fighting, an army moving through a forest and pathing through a narrow gap in formation, a maximum of 8 AI players doing their normal AI calculations, etc. It should store all the profiler information in the background to draw nice graphs later, like how the ms/frame performance changed over time and what parts of the engine used most of the calculation time in these different situations. We already have most of what we need for that. I think that would be a very good task for you to start with, if you like. The pathfinder would be the most important task to continue, but that's probably a bit tough to start with.

We already have two integrated profilers in the engine (simply called Profiler1 and Profiler2).

Profiler1: "source/tools/replayprofile/*" contains a Perl script to extract profiling data from the text files generated by Profiler1. There's also an HTML file that can display nice graphs like that:

Issues with Profiler1:
- There's a bug that messes up the profiling data after some time. I haven't analyzed it any further, but you should get what I mean if you let it run with a game for a few minutes.
- It has some issues with multithreading.
- In some situations it created huge files of profile data. I once got a 32 GB file, and it has such an impact on performance that the data loses its significance.

Profiler2: Philip created this one, which is very helpful for measuring accurately what happens in a single frame.
Check the forum thread for more information. It even supports displaying multiple threads and calculations on the GPU. Profiler2 keeps the data in a fixed size memory buffer and overwrites the oldest data when the buffer is full. The format is a binary format with much less overhead than Profiler1's output. It makes the data accessible via a web server. Maybe it would be better to extend this profiler to save the data in the memory buffer to the disk. What do you think?
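
To make the "fixed size buffer that overwrites the oldest data" behaviour described above concrete, here is a small ring buffer sketch. It is an assumption-laden illustration, not Profiler2's actual code: `ProfileEvent` and `ProfileRingBuffer` are hypothetical names, and the real profiler stores a binary format that is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record type; Profiler2's real binary format is not shown in the post.
struct ProfileEvent
{
    double timeSec;  // timestamp of the event
    int id;          // identifier of the profiled region
};

class ProfileRingBuffer
{
public:
    explicit ProfileRingBuffer(std::size_t capacity) : m_Data(capacity)
    {
        assert(capacity > 0);
    }

    // Stores an event; once the buffer is full, the oldest entry is overwritten.
    void Push(const ProfileEvent& e)
    {
        m_Data[m_Head] = e;
        m_Head = (m_Head + 1) % m_Data.size();
        if (m_Size < m_Data.size())
            ++m_Size;
    }

    // Copies the stored events out, oldest first, e.g. before writing them to disk.
    std::vector<ProfileEvent> Snapshot() const
    {
        std::vector<ProfileEvent> out;
        out.reserve(m_Size);
        const std::size_t start = (m_Head + m_Data.size() - m_Size) % m_Data.size();
        for (std::size_t i = 0; i < m_Size; ++i)
            out.push_back(m_Data[(start + i) % m_Data.size()]);
        return out;
    }

private:
    std::vector<ProfileEvent> m_Data;
    std::size_t m_Head = 0;  // index where the next event will be written
    std::size_t m_Size = 0;  // number of valid events currently stored
};
```

Saving to disk, as suggested at the end of the post, would then amount to taking a `Snapshot()` at regular intervals and appending it to a file. How often that can happen before it distorts the measurements is exactly the kind of thing a benchmark mode would have to show.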
- April 30, 2013
1 point
AoEII HD

I don't have any originals that still work... (are RTS CDs bad quality? or do I play them too much? or do I treat them so badly...)

For AoE:RoR I'm using the copy of a copy of my friend's game... (funny fact: my friend realized he was addicted to AoE and threw the CD behind a heavy desk so he could no longer play the game... he asked me to come over to move the desk not long after xD he later destroyed the CD... and then bought a new one)
For AoK I'm using a copy of my original CD.
For AoM:TGE I'm using a crack because my original CD got an error in it before I made a copy of it.
I never bought AoE III because I've got 0 A.D. ... So why would I want it? (I played with my friend's game for a while, but it just isn't as good as AoK) (btw: the natives expansion CD of my friend doesn't install anymore)
I never installed AoEO because it looks like crap.

My conclusion: CDs are crap, and 0 A.D. won't have any problem with them xD

I see the thing about Steam, no CD trouble anymore, but I don't like it.
- March 9, 2013
1 point
