Stan` Posted May 7, 2019
@feneur Could you post it on Twitter and Facebook? Something along the lines of: "We are glad to announce that 0 A.D. was recently chosen as an experimentation ground for Intel® VTune™ Amplifier, with the goal of finding potential bottlenecks in the game's code. You can find the detailed analysis here." Maybe on Twitter mention @intel and add a few hashtags: #freesoftware #opensource #SoftwareTesting
feneur Posted May 7, 2019
@Itms Could you please post on Twitter? I don't have the login info easily accessible.
Itms Posted May 7, 2019
Done! Thanks @Alex from Intel for the work and the write-up! We'll let you know about improvements to these parts of the code.
Stan` Posted August 10, 2019
@Itms @Alex from Intel Now that SpiderMonkey 45.2 has been committed, maybe it would be nice to use the tool again?
ffffffff Posted November 16, 2019 (edited)
So I have now walked through this thread, and I think this finding is the most relevant to our 0 A.D. performance:
On 3/4/2019 at 8:14 PM, Alex from Intel said: "So what I believe is going on here, is that these gaps between frames are coming from the game having to wait for the JavaScript. A couple possibilities I can think of are that you might be interfacing the two languages too often, or you might be doing computations in the JavaScript that really belong in the C++."
Is someone currently looking into that performance bottleneck, the JavaScript computations that belong in C++? I would dive into that if someone is on it and would like to split the work. So let me know.
Stan` Posted November 16, 2019
41 minutes ago, ffffffff said: "Is someone currently looking into that performance bottleneck, the JavaScript computations that belong in C++? I would dive into that if someone is on it and would like to split the work. So let me know."
I am, sort of. I used GetMicroseconds to profile some functions and look for easy optimizations. However, it's not easy to reduce the coupling, and one cannot cache QueryInterface pointers as they keep changing. What could be nice is implementing workers so that each component could run in threads? Also, running the same profiling with SM45 would be interesting. One might also try to use the tracelogger to find potential bottlenecks.
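Editor's sketch: the "time each function call and accumulate totals" approach described in the post above can be illustrated roughly as follows. This is not the code used in the thread. It is a minimal Python illustration in which `time.perf_counter` stands in for the engine's timer, the millisecond unit is an assumption, and `profiled`, `report`, and `timer_handler` are hypothetical names.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated timing data, keyed by (component, function name).
_stats = defaultdict(lambda: {"TotalTime": 0.0, "Count": 0})


def profiled(component):
    """Wrap a function so that every call adds its elapsed time to _stats."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = _stats[(component, func.__name__)]
                # Assumption: totals are kept in milliseconds.
                entry["TotalTime"] += (time.perf_counter() - start) * 1000.0
                entry["Count"] += 1
        return wrapper
    return decorator


def report():
    """Return one row per profiled function, cheapest first."""
    rows = [
        {
            "Component": component,
            "FunctionName": name,
            "TotalTime": entry["TotalTime"],
            "Count": entry["Count"],
            "Average": entry["TotalTime"] / entry["Count"],
        }
        for (component, name), entry in _stats.items()
    ]
    return sorted(rows, key=lambda row: row["TotalTime"])


@profiled("Example")
def timer_handler(value):
    # Hypothetical placeholder for a component function worth measuring.
    return value + 1


for _ in range(3):
    timer_handler(1)

rows = report()
print(rows)
```

Wrapping only the suspected hot functions keeps the measurement overhead small, which matters when a function is called many times per turn.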
ffffffff Posted November 16, 2019 (edited)
11 minutes ago, Stan` said: "each component could run in threads"
I thought the same. I looked at some worker threads in the current code to learn how they work and how to implement one. As a first try, I wanted to thread the translations happening in the current archive builder code, which would give a gain when building archives for mods and such. Then I would try to apply it to the transitions of the functions called most often when building a frame in the game. That's my current idea. So I need to isolate, by measuring performance, the functions that do the most work, then try to thread them and lock the data being worked on so that the work runs in parallel.
Stan` Posted November 16, 2019
Honestly, threading the archive builder won't help much because it's a one-off operation. I guess the sprintf thing could be sped up, but I haven't profiled it yet with https://code.wildfiregames.com/P187
IMHO there are some optimizations to be done with Auras, and maybe the FSM (I wonder if we could move it to C++).
On 11/8/2019 at 4:25 PM, Stan` said:
{"Component":"Health","FunctionName":"ExecuteRegeneration","TotalTime":1.8259999998263083,"Count":29,"Average":0.06297,"TurnAverageCount":0.00332378223495702}
{"Component":"Barter","FunctionName":"ProgressTimeout","TotalTime":3.0849999999336433,"Count":67,"Average":0.04604,"TurnAverageCount":0.007679083094555874}
{"Component":"Trigger","FunctionName":"DoAction","TotalTime":14.219999999999345,"Count":1,"Average":14.22,"TurnAverageCount":0.00011461318051575932}
{"Component":"AttackDetection","FunctionName":"HandleTimeout","TotalTime":22.20600000010745,"Count":6216,"Average":0.00357,"TurnAverageCount":0.71243553008596}
{"Component":"GarrisonHolder","FunctionName":"HealTimeout","TotalTime":34.932000000058906,"Count":503,"Average":0.06945,"TurnAverageCount":0.05765042979942694}
{"Component":"Capturable","FunctionName":"TimerTick","TotalTime":37.416999999777545,"Count":292,"Average":0.12814,"TurnAverageCount":0.03346704871060172}
{"Component":"StatisticsTracker","FunctionName":"UpdateSequences","TotalTime":49.37099999987913,"Count":118,"Average":0.4184,"TurnAverageCount":0.013524355300859598}
{"Component":"ResourceTrickle","FunctionName":"Trickle","TotalTime":52.41299999988587,"Count":3490,"Average":0.01502,"TurnAverageCount":0.4}
{"Component":"BattleDetection","FunctionName":"TimerHandler","TotalTime":64.0520000008255,"Count":3773,"Average":0.01698,"TurnAverageCount":0.43243553008595986}
{"Component":"BuildingAI","FunctionName":"FireArrows","TotalTime":96.48299999882875,"Count":929,"Average":0.10386,"TurnAverageCount":0.10647564469914039}
{"Component":"Pack","FunctionName":"PackProgress","TotalTime":811.1559999997262,"Count":1095,"Average":0.74078,"TurnAverageCount":0.12550143266475644}
{"Component":"DelayedDamage","FunctionName":"MissileHit","TotalTime":4515.770000000877,"Count":6767,"Average":0.66732,"TurnAverageCount":0.7755873925501432}
{"Component":"ProductionQueue","FunctionName":"ProgressTimeout","TotalTime":7443.771000001156,"Count":6729,"Average":1.10622,"TurnAverageCount":0.7712320916905444}
{"Component":"UnitAI","FunctionName":"TimerHandler","TotalTime":34888.44400000893,"Count":155744,"Average":0.22401,"TurnAverageCount":17.85031518624642}
With GetMicroseconds, without any optimizations, same replayed match (since they were AIs it's different, but I believe the data is still good).
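Editor's sketch: records in this shape are easy to rank and sanity-check with a few lines of script. The Python below copies the four most expensive records from the post verbatim, sorts them by `TotalTime`, and computes each one's share of the total of these four records only (not of the whole match). It also assumes, as an interpretation rather than something stated in the thread, that `TurnAverageCount` is `Count` divided by the number of turns, so `Count / TurnAverageCount` recovers the turn count. The helper names `load` and `rank` are hypothetical.

```python
import json

# Four records copied verbatim from the post above, one JSON object per line.
RECORDS = """\
{"Component":"Pack","FunctionName":"PackProgress","TotalTime":811.1559999997262,"Count":1095,"Average":0.74078,"TurnAverageCount":0.12550143266475644}
{"Component":"DelayedDamage","FunctionName":"MissileHit","TotalTime":4515.770000000877,"Count":6767,"Average":0.66732,"TurnAverageCount":0.7755873925501432}
{"Component":"ProductionQueue","FunctionName":"ProgressTimeout","TotalTime":7443.771000001156,"Count":6729,"Average":1.10622,"TurnAverageCount":0.7712320916905444}
{"Component":"UnitAI","FunctionName":"TimerHandler","TotalTime":34888.44400000893,"Count":155744,"Average":0.22401,"TurnAverageCount":17.85031518624642}
"""


def load(text):
    """Parse one JSON record per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def rank(records):
    """Sort by TotalTime (largest first) and add derived columns."""
    total = sum(r["TotalTime"] for r in records)
    ranked = sorted(records, key=lambda r: r["TotalTime"], reverse=True)
    return [
        {
            "name": r["Component"] + "." + r["FunctionName"],
            "share": r["TotalTime"] / total,  # share of the listed records only
            # Assumption: TurnAverageCount = Count / number of turns.
            "turns": r["Count"] / r["TurnAverageCount"],
            "avg_check": r["TotalTime"] / r["Count"],  # should match "Average"
            "average": r["Average"],
        }
        for r in ranked
    ]


ranked = rank(load(RECORDS))
for row in ranked:
    print(f'{row["name"]:35s} share={row["share"]:.3f} turns={row["turns"]:.0f}')
```

Under that assumption every record yields the same turn count, which is a quick way to confirm the rows come from the same replay before comparing them.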
ffffffff Posted November 16, 2019
3 minutes ago, Stan` said: "Honestly threading the archive builder"
At least threading the translation per file must be possible, no? Read the core count and fill a matching number of worker threads with translations, no?
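Editor's sketch: the "read the core count and hand one file to each worker" idea from the post above has roughly the following shape. This is a Python illustration only, not the archive builder's code. `translate_file` is a hypothetical stand-in for the per-file conversion, and the file names are made up.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def translate_file(path):
    # Hypothetical stand-in for converting one source file to its cached form.
    stem, _, _ = path.rpartition(".")
    return (stem or path) + ".cached"


def translate_all(paths, workers=None):
    """Translate every file using a pool sized to the number of cores."""
    # os.cpu_count() can return None, so fall back to a single worker.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps results in the same order as the input paths.
        return list(pool.map(translate_file, paths))


results = translate_all(["art/unit.xml", "art/tree.png", "maps/demo.xml"])
print(results)
```

This only works cleanly when each file can be processed without touching shared state. Note also that in standard CPython, threads do not run CPU-bound work in parallel, so the sketch shows the structure of the approach rather than a speedup.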
Stan` Posted November 16, 2019
As I said, first profile then optimize (saves you a lot of time).
ffffffff Posted November 16, 2019
Just now, Stan` said: "As I said, first profile then optimize (saves you a lot of time)."
Right, I'm just about to get my hands into it. Will let you know.