
Yves

WFG Retired
  • Posts

    1.135
  • Joined

  • Last visited

  • Days Won

    25

Everything posted by Yves

  1. Awesome portraits and welcome back!
  2. Execute items spends nearly all the time in this loop in simulation/ai/qbot-wc/map-module.js:39. There are 40 baseline bailouts recorded in this loop, all of the bailout kind "Bailout_Normal", except one which is Bailout_ArgumentCheck. bailout_log_map-module.txt.zip
  3. The screenshots look nice indeed! These narrow paths and the trees could be a bit difficult with the current pathfinder and formations.
  4. What difference did you measure? Your fix for MCallGeneric? EDIT: If you have any patches, could you post them please? I'd like to avoid finding problems you already fixed.
  5. H4writer and I have discussed the problems on IRC, and I think we haven't really reached a conclusion yet. He brought up some interesting questions and observations. A strange observation is that the average performance (not counting the peaks) is currently better with all the optimizations (IonMonkey, Type Inference, Baseline) disabled. Is the performance really lost during JS code execution, or is it some other overhead that got added? Is something wrong with how we pass data to IonMonkey that could keep it from making some important optimizations?
     I spent the whole of Thursday experimenting with wrappers and changing the interface to avoid some structured clones. I assumed that copying the entire gamestate with a structured clone is bad for IonMonkey because it doesn't recognise that objects and values are the same as in the last turn, and it can't optimize because the data's memory address changes each turn (or something like that). At the end of the day there was no improvement, but I was still not sure whether these concerns are valid. I will probably do some more experiments in a sandbox environment.
     I've decided to look a bit closer to see if I can spot specific JS code that runs slower. I noticed two things:
       • There is JS code that runs slower, but also code that is faster with v24.
       • About half of the time is spent in currently "unlogged" sections (i.e. sections not wrapped in profile macros). I'll add some more profile macros and try to figure out where that time is spent.
     In the meantime I have some graphs comparing different profile sections of our JS code. Watch the scales, they are different!
     [Graphs: parts where v24 is faster; parts where v1.8.5 is faster; "Execute items", which is part of "AI script".]
  6. H4writer offered a "0 A.D. day", so I'm uploading my current, heavily WIP patch here with some instructions on how to use it. Brave people are encouraged to test it (but you won't be able to really play a game with it yet). Bug reports and feedback are only needed if you find important performance issues or fundamental design issues, because there are too many known issues at the moment.
     Build instructions (also check http://trac.wildfire...ildInstructions):
       • Check out revision 13408.
       • Apply the patch (I always had some rejected files... it's a bit cumbersome).
       • Fix some absolute paths I used (sorry!).
       • "ln -s" the aurora-source directory to 0ad/libraries/source/spidermonkey/mozjs24
       • cd 0ad/libraries/source/spidermonkey/mozjs24/js/src && autoconf2.13
       • Run "./update-workspaces.sh --with-system-enet --without-audio --with-system-nvtt --disable-atlas -j5" from 0ad/build/workspaces/
       • cd gcc && make -j5 config=debug pyrogenesis
     Testing in non-visual replay mode (simulation):
       • I've attached the commands.txt file I used for the performance measurements in this thread.
       • Run ./pyrogenesis -replay=/path/to/commands.txt or ./pyrogenesis_dbg -replay=/path/to/commands.txt
       • How to create the performance graphs.
     What works/doesn't work: There's still a lot that doesn't work. Basically the only really tested things are:
       • Building on my Ubuntu system
       • Generating the random map alpine_lakes
       • Running the non-visual replay with the commands.txt provided
       • Starting a normal game on Acropolis1. Playing isn't possible yet because of a missing change in the config-db code, and there are probably a lot of other issues too. I also haven't checked the GUI code for missing requests, so it could simply crash after a while.
     EDIT: There were some files missing in the patch. I've attached a new version. commands.txt patch_24_v0.2.diff
  7. Thanks, that's indeed the fastest solution and it also avoids the baseline bailouts! Have you seen my "Edit2" in the last post? It doesn't matter anymore for this case but it could be a bug. Performance (execution time for 2000 iterations in microseconds): original: 1715559 modified_yves: 1976366 modified_h4writer: 1548373 1551303
  8. A test for the setrval issue.
     [Scripts] Analyzing script tests/setrval.js:6 (0x7fb6436511f0) (usecount=1004)
     [Abort] Unsupported opcode: setrval (line 19)
     [Abort] aborted @ tests/setrval.js:19
     [Abort] Builder failed to build.
     [Abort] Disabling Ion mode 0 compilation of script tests/setrval.js:6
     [Scripts] Analyzing script tests/setrval.js:1 (0x7fb643651128) (usecount=1100)
     It can't simply be that return values are not supported. Testing a small function with return values seems to work.
     EDIT: I've found a way to fix the "Abort" message. Unfortunately it makes the script about 10% slower, so it probably causes some other issues. The strange thing is that it also prints the abort message if I pass an object as reference and use that as the return value (as in the modified version) but use a return statement without a value instead of the hack with the quit variable.
     EDIT2: Looks like a similar issue to the last one...
     ./js --ion-parallel-compile=on tests/setrval_modified.js 2>&1 | grep "Took bailout" | wc -l
     1655
     1655 Baseline Bailouts in 2000 executions of the function... setrval.js.txt setrval_modified.js.txt
  9. The 1.8.5 benchmark is still the original one. I don't recreate it, to save time. I could do a test with all the improvements later, but I don't think it's worth testing every single improvement in both v24 and v1.8.5. I set it to false and checked for that. I have to add a lot of these checks because there are a lot of loops. It improves performance quite a bit, as we can see (assuming I didn't add a bug that causes significant changes in simulation behaviour).
  10. I've tested replacing delete and it seems to make a big difference. Unfortunately I'm not sure whether I changed the simulation behaviour, and replacing delete requires some ugly hacks, which isn't good either. Apparently "for(var i in entities_)" also steps into the loop for entities that are set to undefined.
      Edit: We aren't using "arguments" anywhere else. This is one piece of code it points to:
      Resources.prototype.add = function(that) {
          for (var tKey in this.types) {
              var t = this.types[tKey];
              this[t] += that[t];
          }
          this.population += that.population;
      };
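The for...in behaviour described in post 10 can be reproduced in isolation. The following is a minimal sketch, not code from 0 A.D.: the `makeEntities` and `visitedKeys` helpers and the entity data are hypothetical. It shows that a property set to undefined is still enumerated, so a loop needs an explicit check once delete is replaced by an assignment.

```javascript
// Hypothetical stand-in for an entity collection keyed by id.
function makeEntities() {
  return { 1: { hp: 10 }, 2: { hp: 20 }, 3: { hp: 30 } };
}

// Variant A: remove the property with delete.
var deleted = makeEntities();
delete deleted[2];

// Variant B: keep the property but overwrite its value with undefined.
var cleared = makeEntities();
cleared[2] = undefined;

// Collect the keys a for...in loop visits, optionally skipping undefined values.
function visitedKeys(obj, skipUndefined) {
  var keys = [];
  for (var key in obj) {
    if (skipUndefined && obj[key] === undefined)
      continue;
    keys.push(key);
  }
  return keys;
}

var keysAfterDelete = visitedKeys(deleted, false);
var keysAfterClear = visitedKeys(cleared, false);
var keysAfterClearChecked = visitedKeys(cleared, true);
```

Here `keysAfterClear` still contains the key "2", while `keysAfterDelete` and `keysAfterClearChecked` do not. This is why every loop over such a collection needs the extra check, which is the "ugly hack" cost mentioned above.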
  11. I've made my working copy ready for the threadsafe build. I also found that there were some structured clones left over from my tests that could be replaced by wrappers. Removing these structured clones improved performance quite a bit, but the threadsafe build has nearly no impact at all in this scenario. It should be able to do Ion compiling and garbage collection in another thread now, so it has nothing to do with moving the AI or the Simulation code to another thread. Since I'm now only testing the Simulation and the AI context is constantly in use, I expect some more improvements for the threadsafe build in normal games. From the documentation: This means to me that if we move the AI to a separate thread and into a separate runtime, it can do all the garbage collection in another thread between simulation turns when no active requests are needed. That should make quite a difference because a lot of time is spent doing garbage collection. There's an issue with the threadsafe build when running the HWDetect script at startup. I've filed a bug for that here. Here are some other abort messages from Ion, in case anyone can provide some helpful information (I haven't analyzed them much yet). These are especially annoying because they don't point to a line of code and because there are so many of them: [Abort] OSR script has argsobj
  12. I've created a standalone test script. It calculates a specified number of random paths (random start point and random end point) using hardcoded JSON map data (I copied that from our acropolis scenario map). It's probably a bit buggy and definitely ugly, but it can reliably reproduce the BoundCheck messages, which is most important. Btw. the JSON map data is quite large. If you want to edit the script you need a text editor that doesn't have problems with 800'000+ characters per line.
      export IONFLAGS=aborts,bailouts,bl-bails
      ./js --ion-parallel-compile=on tests/pathfind.js
      pathfind.js.zip
  13. I'm sorry to hear that. I also hope someone can take it over.
  14. A nice video and very positive.
  15. I tried running a threadsafe build today. First it segfaulted instantly when loading the library in debug mode. I figured out that it's our override of the free function that causes it. Uncommenting the whole #IF block here works around the issue. Now it triggers a Spidermonkey assertion in Debug mode when starting a match (the main-menu loads fine): Apparently I'm not the only one with this problem. It seems that having multiple runtimes in one thread isn't supported anymore. Is that true?
  16. Nice work, I'm looking forward to seeing how much it improves 0 A.D.'s performance.
  17. I also wondered what this setting in the configure script does: --enable-more-deterministic "Enable changes that make the shell more deterministic"
  18. Thanks for your input and your offer to help with troubleshooting! I'll try your suggestion about the threadsafe build and will also try to provide a standalone test-script for the bound-issues. That could be a bit tricky but I'll see what I can do. I will have some time to work on it this Thursday and then on the weekend again.
  19. Very interesting video, thanks for sharing! I think we should go for it. C++11 offers some new features to make code easier to understand and more maintainable and maybe even brings some performance improvements. I don't see any disadvantages at the moment.
  20. I enabled some more logging. There are hundreds of these messages (same file, same line). What bailouts are is described here.
      EDIT: The access to this.widthMap[index] here is an out-of-bounds access. It's even printed as a warning, but only once, so I didn't fix it before. This out-of-bounds access happens many times per turn. I'll try to fix it tomorrow.
      EDIT2: h4writer from the #jsapi channel suggested another solution for the random map performance issue. According to him it's only the access to arguments[x] that causes problems, not arguments.length.
      function randInt(arg0, arg1) {
          if (arguments.length == 1) {
              var maxVal = arg0;
              return Math.floor(Math.random() * maxVal);
          } else if (arguments.length == 2) {
              var minVal = arg0;
              var maxVal = arg1;
              return minVal + randInt(maxVal - minVal + 1);
          } else {
              error("randInt: invalid number of arguments: "+arguments.length);
              return undefined;
          }
      }
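To illustrate the out-of-bounds access mentioned in the first EDIT of post 20, here is a minimal sketch. It assumes the map is stored as a flat typed array, which the post does not state; the sizes and the `cellAt` helper are hypothetical and not part of the AI code. The point is that reading past the end does not throw but yields undefined instead of a number, and that an explicit index check avoids it.

```javascript
// Assumption: the map is a flat typed array of width * height cells.
var width = 4, height = 3;
var widthMap = new Uint8Array(width * height);
widthMap[5] = 7;

// Reading one past the last valid index does not throw; it yields undefined.
var outOfBounds = widthMap[width * height];

// Hypothetical guarded accessor: returns a fallback for invalid indices.
function cellAt(map, index, fallback) {
  if (index < 0 || index >= map.length)
    return fallback;
  return map[index];
}

var inside = cellAt(widthMap, 5, 0);
var past = cellAt(widthMap, width * height, 0);
var negative = cellAt(widthMap, -1, 0);
```

Whether a guard like this or a fix to the index computation is the right change depends on why the index goes out of range, which the post leaves open.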
  21. Thanks for the more detailed explanation. What I would like to see is a performance comparison of 0 A.D., but I guess that's only possible after the changes are implemented. I finally wrote a few words about creating graphs of the simulation performance. It does not measure the improvements in the rendering performance or GUI performance but it should be useful for the simulation performance.
  22. Actually I've already tested replacing the randInt function in Spidermonkey 17, but not randFloat because that was called less often. It didn't make a big difference. There's also a bug for this in Bugzilla.
  23. Yay, finally something is faster instead of slower!
      TIMER| Load RMS: 13.2699 s
      ... which is quite an improvement compared to 253 seconds, and it's also much faster than the unpatched v1.8.5 or v17 with 35-37 seconds! This was the important hint:
      [Abort] NYI inlined get argument element
      [Abort] aborted @ maps/random/rmgen/random.js:39
      [Abort] Builder failed to build.
      I had to replace the dynamic argument number checking in the functions randInt and randFloat. rm_random_perf_fix_v1.0.diff
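The attached patch is not reproduced on this page, so the following is only a sketch of what avoiding indexed `arguments[x]` access could look like for randFloat. The supported call shapes (no arguments, or a minimum and a maximum) are an assumption, and the sketch throws an Error where the engine code would presumably report the problem through its own error function.

```javascript
// Sketch only: reads named parameters instead of arguments[x].
// Assumed call shapes: randFloat() and randFloat(minVal, maxVal).
function randFloat(arg0, arg1) {
  if (arguments.length == 0) {
    // Uniform value in [0, 1).
    return Math.random();
  } else if (arguments.length == 2) {
    var minVal = arg0;
    var maxVal = arg1;
    // Uniform value in [minVal, maxVal).
    return minVal + Math.random() * (maxVal - minVal);
  } else {
    throw new Error("randFloat: invalid number of arguments: " + arguments.length);
  }
}
```

As in the randInt version suggested by h4writer, `arguments.length` is still used to pick the branch; only the indexed element access is gone.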
  24. I also tested alpine lakes random map generation.
      ./pyrogenesis -autostart=alpine_lakes -autostart-random=123 -autostart-size=256
      TIMER| Load RMS: 253.25 s
      It was between 35 and 37 seconds for 1.8.5 and 17. Someone on the #jsapi channel suggested setting the IONFLAGS environment variable and running a debug build. I think that was quite a good suggestion: Now I just have to solve these problems somehow...
  25. The good news is that my working copy is now ready for Spidermonkey 24! The bad news is that it performs even worse than Firefox 17 and much worse than 1.8.5. On the other hand, the good news is that it seems very unlikely that Spidermonkey 24, with all the new improvements like IonMonkey and the new Baseline compiler, can really be slower than the old 1.8.5 from Firefox 4. I still hope there's a "--make-everything-fast" switch or something. I've already tested the jemalloc switch because of all the memory performance discussions. I don't know what it actually does or if it works, but there's no visible difference. ... some hard work for this weekend.