Jump to content


Community Members
  • Content Count

  • Joined

  • Last visited

  • Days Won


Posts posted by h4writer

  1. That's one of the reasons I try not to use for ... in. The order isn't specified.

    If it is really important to get the same order, I would keep a property that contains the properties by name in the right order.

    You can keep that "order" variable on the object and adjust whenever a property get's deleted/added ...

    var obj = {
    "prop2": 1
    "prop3": 2
    "prop4": 3

    obj.order = ["prop2", "prop3", "prop4"];

    for (var i=0; i < obj.order.length; i++) {
    var key = obj.order[i];
    callback(key, obj[key]);

  2. Implement JSOP_SETRVAL and JSOP_RETRVAL (should apply to v24)


    Fix bailing of LCallGeneric:


    Quick and dirty fix for GetPopertyPolymorphicV. This doesn't fix the real issue and could actually decrease performance. I'm not sure yet what the right fix is,

    but this removes the constant bailing from that bytecode:

    diff --git a/js/src/ion/BaselineInspector.cpp b/js/src/ion/BaselineInspector.cpp
    --- a/js/src/ion/BaselineInspector.cpp
    +++ b/js/src/ion/BaselineInspector.cpp
    @@ -100,18 +100,18 @@ BaselineInspector::maybeShapesForPropert
    if (stub->toGetProp_Fallback()->hadUnoptimizableAccess())
    } else {
    if (stub->toSetProp_Fallback()->hadUnoptimizableAccess())

    // Don't inline if there are more than 5 shapes.
    - if (shapes.length() > 5)
    - shapes.clear();
    + //if (shapes.length() > 5)
    + shapes.clear();

    return true;

    ICStub *
    BaselineInspector::monomorphicStub(jsbytecode *pc)
    if (!hasBaselineScript())

    Your new graphs are actually very nice. It's nice to see that some parts indeed get the improvement. It should be cool if you find out which where the worst offenders. Would narrow the search.

    I hope I can create some free time to look further myself. I'm going to be in the US next week, so I'll be online in different hours and I have no idea how much free time I will have. But my next objective is to fix the boundchecks.

  3. I have been looking into the issues surrounding the JS engine and how to improve performance by having better code (i.e taking more use of IonMonkey).

    So we have bailouts (temporary go to baseline again) and invalidation (delete the IM code) in this code. From high to low this is:

    1) MCallGeneric

    2) GetPopertyPolymorphicV

    3) BoundsChecks

    4) ValueToInt32

    1) MCallGeneric is the generic code used in IonMonkey to call to a function. Normally in the browser the JSObject to where the call takes place is always a JSFunction. Here that isn't the case, since I assume you added your own JSObject's and override ->call function. In that case we were bailing, so stuck in baseline until we can enter IM again. So I created a small fix for this.

    2) GetPopertyPolymorphicV looks at the baseline caches to know which objects go there and create better code to lookup a property. If we encounter a new type we bailout and see a new type, resulting in IM rebuilding. Now it seems we are bailing here the whole time. I assume again that an own JSObject is given and for some reason we aren't recording this type! I temporary disabled looking at baseline caches. In that case we use GetPropertyCache that is slightly slower, but can't bail.

    So now I'm back at looking to BoundsChecks issue, also raised above. But I found it time to measure how much difference this would make.

    I created such a graph you created and it was a bit disappointing, but it points to another issue.

    The difference is only visible on the spikes, there it is clearly visible fixing the issues indeed improves performance.

    But the base line didn't go down. Also the base line didn't go up when only allowing the interpreter (our slowest engine).

    => I think the difference between v24/v1.8.5 is not because of the engine, but because of the overhead difference between the two.

    I ran oprofile on the whole execution:

    h4writer@h4writer-ThinkPad-W530:~/Build/0ad/binaries/system$ opreport ./pyrogenesis -d | grep "^[0-9a-z]" | head -n30
    Using /home/h4writer/Build/0ad/binaries/system/oprofile_data/samples/ for samples directory.
    warning: /no-vmlinux could not be found.
    warning: [vdso] (tgid:25236 range:0xb778e000-0xb778efff) could not be found.
    vma samples % image name symbol name
    08138bd0 152029 4.1760 pyrogenesis std::_Rb_tree<unsigned int, std::pair<unsigned int const, EntityData>, std::_Select1st<std::pair<unsigned int const, EntityData> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, EntityData> > >::find(unsigned int const&)
    08136f00 144718 3.9752 pyrogenesis CCmpRangeManager::GetPercentMapExplored(int)
    002026e0 123392 3.3894 libmozjs24-ps-release.so JSCompartment::wrap(JSContext*, JS::MutableHandle<JS::Value>, JS::Handle<JSObject*>)
    001922b0 93085 2.5569 libmozjs24-ps-release.so js::ShapeTable::search(int, bool)
    00000000 92117 2.5303 libstdc++.so.6.0.17 /usr/lib/i386-linux-gnu/libstdc++.so.6.0.17
    00000000 86632 2.3797 no-vmlinux /no-vmlinux
    000770a0 74491 2.0462 libc-2.15.so _int_malloc
    00000000 66707 1.8324 anon (tgid:25236 range:0xb2f09000-0xb2f48fff) anon (tgid:25236 range:0xb2f09000-0xb2f48fff)
    00000000 65781 1.8069 libnspr4.so /usr/lib/i386-linux-gnu/libnspr4.so

    So 8% is spent in pyrogenesis itself. I think the 3.3% JSCompartment::wrap is also overhead. But I'm mostly guessing.

    I think it would be good to have a oprofile log from turn 80(*20) to turn 90(*20). That would show where the time is going too...

    • Like 1
  4. So in most cases we will use JSOP_RETURN to indicate a return. But for some returns we need to do more e.g. close open try blocks. In that case JSOP_SETRVAL is used to set the value and the extra code is executed followed by JSOP_RETRVAL to return the value. So that is happening here. Apparently we don't compile those bytecodes yet. (Happens most often with try blocks what isn't supported by IM). So the problem here is the "return" in the for-in loop.

    Opened https://bugzilla.mozilla.org/show_bug.cgi?id=890722 to fix this.

    1) I think the patch to support it will be fairly small. So if you want I can backport the patch to FF24 and post the resulting patch here... Let me know if you are interested in that

    2) To fix it in JS code, is to hoist the return out of the for each loop.

    var refpath = node.split(".");
    var refd = spec;
    for each (var p in refpath)
    refd = refd[p];
    if (!refd)
    if (!refd)
    error("FSM node "+path.join(".")+" referred to non-defined node "+node);
    return {};
    node = refd;

    I didn't test it though...

    • Like 1
  5. Apparently "for(var i in entities_)" also steps into the loop for entities that are set to undefined.

    I know that's unfortunate, but it will create much better type information and be much faster.

    There are different solutions for that problem though,

    1) if entities_ is small, you can recreate that object when wanting to delete a property.

    2) you can have an array with all "available keys" and recreate the array when you remove a poperty.

    3) If the value in entities[x] is never undefined, I would just check it in the loop and just "continue;"

    This is one code where it points to:

     Resources.prototype.add = function(that) {
    for ( var tKey in this.types) {
    var t = this.types[tKey];
    this[t] += that[t];
    this.population += that.population;

    That seems very strange! We don't need an arguments object there...

  6. Sorry about the delay. Next week I'll be more responsive. I've got all the important stuff done now.

    I looked to the BoundsCheck issue and it definitely looks like an issue of IonMonkey. It should remember an element was set out of bound and when recompiling using a alternate version that doesn't bail in this case. I still have to look why and create a patch. I hope I can create a patch this weekend. Else I will provide one in the beginning of next week. Thanks for reporting!

    About "OSR script has argsobj". You can see which script it is, by providing IONFLAGS=scripts,aborts. The line before the abort is the script where it is failing on. I would definitely suggest to not use "arguments" as much as possible. It is known to decrease performance on all engines a lot...

    Another issue that causes JS engines to deoptimize a lot is using "delete". Never use that! It pollutes the type information. It is much better to set it to undefined or null.

    http://jsperf.com/delete-vs-undefined-vs-null/16 (I'm not a fan of jsperf and microbenchmarks, but here it is quite accurate)

    Also it seems like we haven't added JSOP_DELPROP to ionmonkey yet. So that means the whole script will run in baseline compiler and will be much slower. So I would suggest to adjust that!

  7. I do not know the details (yet) about that part. So I could ask it on irc myself. But I suggest you do it yourself? Since that way you will get the most information about what is possible and not and how to work around it. I think Luke (:luke) would be the best person to ask it. He is stationed in Mountain View, but is online from 18h CEST. If getting online on those times is hard, you can always mail him. (luke AT mozilla . com). I introduced him yesterday to this project, since he didn't know it yet ;).


  8. Thanks for responding and it's good to hear that's still important :) Actually even very small differences across platforms matter to us, because we do a hash of our binary simulation state to detect if a client goes out of sync. Even one bit difference is an OOS (as I've discovered recently where NaN can have different internal representations possibly due to JIT behavior/bugs? - which is fine according to the spec, but a headache for us).

    This is why we are implementing our own Math functions for trig, pow, exp, etc. Assuming the last comment from this bug report is still valid, and like Philip I can't blame you all if it is.

    About the rounding errors. Yes I think the last comment is still valid and I don't think that will change soon.

    I'm sure that we use one NaN representation in the JIT. Since that makes our live easier. I actually assumed that is also the case in the interpreter? If NaN is the same across platform is another question.

    I can ask around if you want to know.

    I also wondered what this setting in the configure script does:

    Enable changes that make the shell more deterministic"

    That is a flag mostly for fuzzers. It contains little hacks to make life easier for them. To make scripts more deterministic in output/behaviour.

    E.g. last bug I can remember (not implemented yet) is don't show the compilation time of asm.js in the warning. (Since that is not deterministic and a nightmare for fuzzers)

    But it will mostly make sure that the error messages are always the same. We try to always give as much as possible information in the error message. So we try to give the value that is fault. But in the JIT we often cannot give that. So we get different error messages, depending on which engine is running. Very annoying for the fuzzer.

    This is definitely not for production ;)

  9. Something worth investigating if we upgrade Spidermonkey, we've been having a discussion about Spidermonkey floating point math (spurred by #1633, #1990 and #433), does the new Spidermonkey still try for consistent floating point rounding on different platforms? In the past, some people cared enough to find and fix any discrepancies that popped up, but that bug report is 4 years old and surely much has changed.

    Running code in JIT or the interpreter shouldn't make a difference. So we definitely care for these issues! That was an issue reported by Gary and is about an problem in JM/TM vs the interpreter. Both are removed now in favor of IM. I'm not gonna pretend we don't have these issues in IM, but we try to fix them. Gkw, decoder and Jesse are running special bots to find these issues. Now and than they pop up during differential testing. So reporting bugs about that is highly welcome!

    Just reading the bugs, I don't think we can do much about floating point issues on different platforms, though that wouldn't be a total performance loss. Also I don't know how widespread the issues are and what the error range is. I assume they are very very small and in that case I don't think it will make such a difference?

  10. About the constant hitting bound check. I'm not sure this is a JS error, it could potentially be an IonMonkey issue. I'm willing to look into it and fix it if you can provide me a shell version of it.

    About the messages: [Abort] Script too large (2149 bytes)

    This is telling me you are running a non-threadsafe build. Therefore you don't have parallel compilation and we cap the max script size to 2000 bytes.

    You should really try a threadsafe build. (That's the default in the browser). The maximum script is 20000bytes and compilation happens on background thread.

    Should give a nice performance boost.

    To compile a threadsafe shell:

    $ cd mozilla-inbound

    // this is needed for threadsafe build

    $ cd nsprpub

    $ mkdir build-64

    $ cd build-64

    $ ../configure --disable-debug --enable-optimize

    $ make

    // building js shell

    $ cd mozilla-inbound/js/src

    $ autoconf2.13

    $ mkdir build-64

    $ cd build-64

    $ ../configure --disable-debug --enable-optimize --enable-threadsafe --with-nspr-cflags="-I/PATH_TO/mozilla-inbound/nsprpub/build-64/dist/include/nspr" --with-nspr-libs="/PATH_TO/mozilla-inbound/nsprpub/build-64/dist/lib/libnspr4.a /PATH_TO/mozilla-inbound/nsprpub/build-64/dist/lib/libplc4.a /PATH_TO/mozilla-inbound/nsprpub/build-64/dist/lib/libplds4.a"

    $ make

    Execute test.js

    $ ./js --ion-parallel-compile=on test.js

    • Like 2
  • Create New...