Leaderboard

Popular Content

Showing content with the highest reputation on 2013-07-10 in all areas

  1. Carthaginian Blacksmith concept: What do you guys think, like it?
    1 point
  2. I have been looking into the issues surrounding the JS engine and how to improve performance by having better code (i.e. making more use of IonMonkey). We have bailouts (temporarily going back to baseline) and invalidations (deleting the IM code) in this code. Ranked from high to low, the causes are:

     1) MCallGeneric
     2) GetPropertyPolymorphicV
     3) BoundsChecks
     4) ValueToInt32

     1) MCallGeneric is the generic code used in IonMonkey to call a function. Normally, in the browser, the JSObject being called is always a JSFunction. Here that isn't the case, since I assume you added your own JSObjects and overrode the ->call function. In that case we were bailing, so we were stuck in baseline until we could enter IM again. I created a small fix for this.

     2) GetPropertyPolymorphicV looks at the baseline caches to know which objects show up there and creates better code to look up a property. If we encounter a new type, we bail out and baseline sees the new type, which results in IM rebuilding the code. Now it seems we are bailing here the whole time. I assume again that one of your own JSObjects is passed in and for some reason we aren't recording this type! I temporarily disabled looking at the baseline caches. In that case we use GetPropertyCache, which is slightly slower but can't bail.

     So now I'm back to looking at the BoundsChecks issue, also raised above. But I felt it was time to measure how much difference this would make. I created a graph like the one you made and it was a bit disappointing, but it points to another issue. The difference is only visible on the spikes; there it is clear that fixing the issues does improve performance. But the base line of the graph didn't go down. The base line also didn't go up when only allowing the interpreter (our slowest engine).

     => I think the difference between v24/v1.8.5 is not because of the engine, but because of the overhead difference between the two.
     I ran oprofile on the whole execution:

         h4writer@h4writer-ThinkPad-W530:~/Build/0ad/binaries/system$ opreport ./pyrogenesis -d | grep "^[0-9a-z]" | head -n30
         Using /home/h4writer/Build/0ad/binaries/system/oprofile_data/samples/ for samples directory.
         warning: /no-vmlinux could not be found.
         warning: [vdso] (tgid:25236 range:0xb778e000-0xb778efff) could not be found.
         vma      samples  %        image name               symbol name
         08138bd0 152029   4.1760  pyrogenesis              std::_Rb_tree<unsigned int, std::pair<unsigned int const, EntityData>, std::_Select1st<std::pair<unsigned int const, EntityData> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, EntityData> > >::find(unsigned int const&)
         08136f00 144718   3.9752  pyrogenesis              CCmpRangeManager::GetPercentMapExplored(int)
         002026e0 123392   3.3894  libmozjs24-ps-release.so JSCompartment::wrap(JSContext*, JS::MutableHandle<JS::Value>, JS::Handle<JSObject*>)
         001922b0 93085    2.5569  libmozjs24-ps-release.so js::ShapeTable::search(int, bool)
         00000000 92117    2.5303  libstdc++.so.6.0.17      /usr/lib/i386-linux-gnu/libstdc++.so.6.0.17
         00000000 86632    2.3797  no-vmlinux               /no-vmlinux
         000770a0 74491    2.0462  libc-2.15.so             _int_malloc
         00000000 66707    1.8324  anon (tgid:25236 range:0xb2f09000-0xb2f48fff) anon (tgid:25236 range:0xb2f09000-0xb2f48fff)
         00000000 65781    1.8069  libnspr4.so              /usr/lib/i386-linux-gnu/libnspr4.so
         ...

     So about 8% is spent in pyrogenesis itself (the top two entries). I think the roughly 3.4% in JSCompartment::wrap is also overhead. But I'm mostly guessing. I think it would be good to have an oprofile log from turn 80(*20) to turn 90(*20). That would show where the time is going too...
    1 point
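
To make the "about 8% in pyrogenesis" reading of the profile easier to check, here is a minimal Python sketch that adds up the percentage column of opreport-style rows per image. The helper name `percent_by_image` is hypothetical, the rows are copied from the post with the long symbol names abbreviated, and the parsing assumes the first four whitespace-separated fields are vma, samples, percent and image name (so an image name containing spaces, such as the `anon (...)` row, is grouped under its first word).

```python
from collections import defaultdict

# Rows taken from the opreport output in the post; symbol names are
# abbreviated here for readability (the "<...>" and "(...)" are not literal).
REPORT = """\
vma samples % image name symbol name
08138bd0 152029 4.1760 pyrogenesis std::_Rb_tree<...>::find(unsigned int const&)
08136f00 144718 3.9752 pyrogenesis CCmpRangeManager::GetPercentMapExplored(int)
002026e0 123392 3.3894 libmozjs24-ps-release.so JSCompartment::wrap(...)
001922b0 93085 2.5569 libmozjs24-ps-release.so js::ShapeTable::search(int, bool)
00000000 92117 2.5303 libstdc++.so.6.0.17 /usr/lib/i386-linux-gnu/libstdc++.so.6.0.17
00000000 86632 2.3797 no-vmlinux /no-vmlinux
000770a0 74491 2.0462 libc-2.15.so _int_malloc
00000000 65781 1.8069 libnspr4.so /usr/lib/i386-linux-gnu/libnspr4.so
"""


def percent_by_image(report):
    """Sum the '%' column per image name (hypothetical helper)."""
    totals = defaultdict(float)
    for line in report.splitlines():
        fields = line.split(None, 4)  # vma, samples, %, image, rest of line
        if len(fields) < 4:
            continue  # blank or truncated line
        try:
            pct = float(fields[2])
        except ValueError:
            continue  # header row or warning text, not a sample row
        totals[fields[3]] += pct
    return dict(totals)


totals = percent_by_image(REPORT)
# Print images from largest to smallest share of samples.
for image, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{image:28s} {pct:7.4f}")
```

With these rows the two pyrogenesis entries add up to 4.1760 + 3.9752 = 8.1512, which is where the 8% figure comes from. This only covers the rows quoted in the post, so it says nothing about the entries cut off by the trailing "...".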