Alex from Intel

Help needed: Optimizing 0 A.D. with Intel VTune Amplifier


@feneur Could you post it on Twitter and Facebook? :) Something along the lines of:

"We are glad to announce that 0 A.D. was recently chosen as an experimentation ground for Intel® VTune™ Amplifier, with the goal of finding potential bottlenecks in the game's code. You can find the detailed analysis here."

Maybe tag @intel on Twitter and add a few hashtags: #freesoftware #opensource #SoftwareTesting


@Itms Could you please post on Twitter? I don't have the login info easily accessible.


I have now read through this thread, and I think this finding is the one most relevant to 0 A.D.'s performance:

On 3/4/2019 at 8:14 PM, Alex from Intel said:

So what I believe is going on here, is that these gaps between frames are coming from the game having to wait for the JavaScript. A couple possibilities I can think of are that you might be interfacing the two languages too often, or you might be doing computations in the JavaScript that really belong in the C++.

Is anyone currently looking at that bottleneck, i.e. JavaScript computations that belong in C++? I would like to dive into it, and if someone is already on it I'm happy to split the work. Let me know.

Edited by ffffffff
41 minutes ago, ffffffff said:

Is anyone currently looking at that bottleneck, i.e. JavaScript computations that belong in C++? I would like to dive into it, and if someone is already on it I'm happy to split the work. Let me know.

I am, sort of. I used GetMicroseconds to profile some functions and look for easy optimizations. However, it's not easy to reduce the coupling, and one cannot cache QueryInterface pointers as they keep changing. What could be nice is implementing workers so that each component could run in threads? Also, running the same profiling with SM45 would be interesting. One might also try the tracelogger to find potential bottlenecks.
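For anyone who wants to try the same kind of per-function timing on the C++ side, here is a minimal sketch of the idea. It is not the patch used for the numbers in this thread: `ProfileRegistry`, `ScopedTimer` and `FunctionStats` are hypothetical names, and it uses `std::chrono::steady_clock` rather than GetMicroseconds.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical per-function statistics (total time, call count, average).
struct FunctionStats
{
    double totalMs = 0.0;
    std::uint64_t count = 0;

    double AverageMs() const { return count ? totalMs / count : 0.0; }
};

// Hypothetical registry keyed by (component, function name).
class ProfileRegistry
{
public:
    void Record(const std::string& component, const std::string& function, double elapsedMs)
    {
        FunctionStats& stats = m_Stats[std::make_pair(component, function)];
        stats.totalMs += elapsedMs;
        ++stats.count;
    }

    // Returns empty statistics for functions that were never recorded.
    const FunctionStats& Get(const std::string& component, const std::string& function) const
    {
        static const FunctionStats empty;
        const auto it = m_Stats.find(std::make_pair(component, function));
        return it == m_Stats.end() ? empty : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, FunctionStats> m_Stats;
};

// RAII timer: measures from construction to destruction, then records the
// elapsed time in the registry.
class ScopedTimer
{
public:
    ScopedTimer(ProfileRegistry& registry, std::string component, std::string function)
        : m_Registry(registry),
          m_Component(std::move(component)),
          m_Function(std::move(function)),
          m_Start(std::chrono::steady_clock::now())
    {
    }

    ~ScopedTimer()
    {
        const std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - m_Start;
        m_Registry.Record(m_Component, m_Function, elapsed.count());
    }

private:
    ProfileRegistry& m_Registry;
    std::string m_Component;
    std::string m_Function;
    std::chrono::steady_clock::time_point m_Start;
};
```

Putting a `ScopedTimer` at the top of a function body is enough to collect one sample per call. The timer itself adds overhead, so very cheap functions will look more expensive than they are.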

11 minutes ago, Stan` said:

each component could run in threads

I thought the same. I looked at some of the worker threads in the current code to see how they work and how to implement one. As a first try, I wanted to thread the translations that happen in the archive builder, which would be a gain when building archives for mods and the like. After that I would try to apply the same approach to the functions that are called most often when building a frame in the game. So my current plan is: use perf measurements to isolate the functions that do the most work, then thread them and lock the data being worked on so that the work can run in parallel.
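To make the "lock the data being worked on" part concrete, a minimal sketch with `std::mutex` could look like the following. `SharedTotals` and `AddFromThreads` are hypothetical names for illustration only and are not taken from the 0 A.D. code base.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state that several worker threads update.
class SharedTotals
{
public:
    void Add(double value)
    {
        // The lock is released automatically when `lock` goes out of scope.
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Sum += value;
        ++m_Count;
    }

    double Sum() const
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Sum;
    }

    std::size_t Count() const
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Count;
    }

private:
    mutable std::mutex m_Mutex; // mutable so const getters can lock too
    double m_Sum = 0.0;
    std::size_t m_Count = 0;
};

// Starts threadCount threads that each call Add(1.0) addsPerThread times,
// then waits for all of them to finish.
inline void AddFromThreads(SharedTotals& totals, std::size_t threadCount, std::size_t addsPerThread)
{
    std::vector<std::thread> threads;
    threads.reserve(threadCount);
    for (std::size_t t = 0; t < threadCount; ++t)
    {
        threads.emplace_back([&totals, addsPerThread]() {
            for (std::size_t i = 0; i < addsPerThread; ++i)
                totals.Add(1.0);
        });
    }
    for (std::thread& thread : threads)
        thread.join();
}
```

If the threads spend most of their time waiting on the same lock, the parallel version can end up slower than the serial one, so this is worth measuring before and after.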

Edited by ffffffff

Honestly threading the archive builder won't help much because it's a one-off operation. I guess the sprintf thing could be sped up, but I haven't profiled it yet with https://code.wildfiregames.com/P187

IMHO there are some optimizations to be done in Auras, and maybe in the FSM (I wonder if we could move it to C++).

On 11/8/2019 at 4:25 PM, Stan` said:

{"Component":"Health","FunctionName":"ExecuteRegeneration","TotalTime":1.8259999998263083,"Count":29,"Average":0.06297,"TurnAverageCount":0.00332378223495702}
{"Component":"Barter","FunctionName":"ProgressTimeout","TotalTime":3.0849999999336433,"Count":67,"Average":0.04604,"TurnAverageCount":0.007679083094555874}
{"Component":"Trigger","FunctionName":"DoAction","TotalTime":14.219999999999345,"Count":1,"Average":14.22,"TurnAverageCount":0.00011461318051575932}
{"Component":"AttackDetection","FunctionName":"HandleTimeout","TotalTime":22.20600000010745,"Count":6216,"Average":0.00357,"TurnAverageCount":0.71243553008596}
{"Component":"GarrisonHolder","FunctionName":"HealTimeout","TotalTime":34.932000000058906,"Count":503,"Average":0.06945,"TurnAverageCount":0.05765042979942694}
{"Component":"Capturable","FunctionName":"TimerTick","TotalTime":37.416999999777545,"Count":292,"Average":0.12814,"TurnAverageCount":0.03346704871060172}
{"Component":"StatisticsTracker","FunctionName":"UpdateSequences","TotalTime":49.37099999987913,"Count":118,"Average":0.4184,"TurnAverageCount":0.013524355300859598}
{"Component":"ResourceTrickle","FunctionName":"Trickle","TotalTime":52.41299999988587,"Count":3490,"Average":0.01502,"TurnAverageCount":0.4}
{"Component":"BattleDetection","FunctionName":"TimerHandler","TotalTime":64.0520000008255,"Count":3773,"Average":0.01698,"TurnAverageCount":0.43243553008595986}
{"Component":"BuildingAI","FunctionName":"FireArrows","TotalTime":96.48299999882875,"Count":929,"Average":0.10386,"TurnAverageCount":0.10647564469914039}
{"Component":"Pack","FunctionName":"PackProgress","TotalTime":811.1559999997262,"Count":1095,"Average":0.74078,"TurnAverageCount":0.12550143266475644}
{"Component":"DelayedDamage","FunctionName":"MissileHit","TotalTime":4515.770000000877,"Count":6767,"Average":0.66732,"TurnAverageCount":0.7755873925501432}
{"Component":"ProductionQueue","FunctionName":"ProgressTimeout","TotalTime":7443.771000001156,"Count":6729,"Average":1.10622,"TurnAverageCount":0.7712320916905444}
{"Component":"UnitAI","FunctionName":"TimerHandler","TotalTime":34888.44400000893,"Count":155744,"Average":0.22401,"TurnAverageCount":17.85031518624642}

Measured with GetMicroseconds, without any optimizations, on the same replayed match (since the players were AIs the replay differs, but I believe the data is still good).
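To see where the time goes, the TotalTime column of the profile quoted above can simply be summed. A small sketch follows; the struct and function names are made up for illustration, and the values are copied from the quoted output, rounded to three decimals. With these numbers, UnitAI's TimerHandler accounts for roughly 73% of the listed total, and UnitAI, ProductionQueue and DelayedDamage together for roughly 97%. This is a share of the listed functions only, not of the whole frame time.

```cpp
#include <string>
#include <vector>

// One row of the quoted profile; only the fields needed here.
struct ProfileRow
{
    std::string component;
    std::string function;
    double totalTime;
};

// TotalTime values copied from the quoted output, rounded to three decimals.
inline std::vector<ProfileRow> QuotedProfile()
{
    return {
        {"Health", "ExecuteRegeneration", 1.826},
        {"Barter", "ProgressTimeout", 3.085},
        {"Trigger", "DoAction", 14.220},
        {"AttackDetection", "HandleTimeout", 22.206},
        {"GarrisonHolder", "HealTimeout", 34.932},
        {"Capturable", "TimerTick", 37.417},
        {"StatisticsTracker", "UpdateSequences", 49.371},
        {"ResourceTrickle", "Trickle", 52.413},
        {"BattleDetection", "TimerHandler", 64.052},
        {"BuildingAI", "FireArrows", 96.483},
        {"Pack", "PackProgress", 811.156},
        {"DelayedDamage", "MissileHit", 4515.770},
        {"ProductionQueue", "ProgressTimeout", 7443.771},
        {"UnitAI", "TimerHandler", 34888.444},
    };
}

// Sum of TotalTime over all rows.
inline double SumTotalTime(const std::vector<ProfileRow>& rows)
{
    double total = 0.0;
    for (const ProfileRow& row : rows)
        total += row.totalTime;
    return total;
}

// Fraction (0..1) of the summed TotalTime spent in one function.
// Returns 0 if the function is not listed or the total is zero.
inline double ShareOf(const std::vector<ProfileRow>& rows,
                      const std::string& component,
                      const std::string& function)
{
    const double total = SumTotalTime(rows);
    if (total <= 0.0)
        return 0.0;
    for (const ProfileRow& row : rows)
        if (row.component == component && row.function == function)
            return row.totalTime / total;
    return 0.0;
}
```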

3 minutes ago, Stan` said:

Honestly threading the archive builder

At least per-file translation threading must be possible, no? Read the core count and fill a matching number of worker threads with translations?
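A rough sketch of that idea using only the C++ standard library: read the core count, start that many workers, and let each worker pull the next file index from a shared atomic counter. `ProcessFilesInParallel` and the `job` callback are hypothetical stand-ins; the real archive builder is not shown here and may need a different structure.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Runs job(files[i]) for every file, spread over a fixed set of worker threads.
// Each result goes into its own slot, so the only shared mutable state is the
// atomic index. The job must be thread-safe and must not throw.
inline std::vector<std::string> ProcessFilesInParallel(
    const std::vector<std::string>& files,
    const std::function<std::string(const std::string&)>& job)
{
    std::vector<std::string> results(files.size());
    std::atomic<std::size_t> next{0};

    // hardware_concurrency() may return 0 when the count is unknown.
    const std::size_t cores =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());
    // Never start more workers than there are files (but at least one).
    const std::size_t workerCount =
        std::min(cores, std::max<std::size_t>(1, files.size()));

    auto worker = [&]() {
        // fetch_add hands each index to exactly one worker.
        for (std::size_t i = next.fetch_add(1); i < files.size(); i = next.fetch_add(1))
            results[i] = job(files[i]);
    };

    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    for (std::size_t w = 0; w < workerCount; ++w)
        workers.emplace_back(worker);
    for (std::thread& thread : workers)
        thread.join();

    return results;
}
```

Whether this pays off depends on how much of the per-file work is really independent. If the translation step touches shared state or is dominated by disk I/O, the gain may be small.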

Just now, Stan` said:

As I said, first profile then optimize (Saves you a lot of time :) )

Right, I'm just about to get my hands on it. Will let you know.

