Jump to content

[Discussion] Spidermonkey upgrade


Recommended Posts

What does this mean? I thought all JavaScript objects could be considered maps?

Well, objects in JS are like maps, but they get handled pretty differently in the background. When you create an object, behind the door, there is made a class for it. So when you change the properties of an object, the class gets invalid, and the object gets transformed to a lot slower "dictionary mode" (accessing a value is some 40% slower after a single delete). While Maps will be native objects where deleting and adding keys is a lot more optimized.

Link to comment
Share on other sites

Well, objects in JS are like maps, but they get handled pretty differently in the background. When you create an object, behind the door, there is made a class for it. So when you change the properties of an object, the class gets invalid, and the object gets transformed to a lot slower "dictionary mode" (accessing a value is some 40% slower after a single delete). While Maps will be native objects where deleting and adding keys is a lot more optimized.

Is there any data that suggests this is a bottleneck in our scripts? I think a strong case needs to be made before we use more experimental non-standard JS features. I would much rather use C++ data structures where we can change the implementation and have finer control over e.g. serialization and memory usage, as well as being able to implement good algorithms natively, rather than using these maps. And there should really only be a limited number of cases where that is necessary, I can't see us needing to go through every script and change every "associative array" object to a map.
Link to comment
Share on other sites

Is there any data that suggests this is a bottleneck in our scripts? I think a strong case needs to be made before we use more experimental non-standard JS features. I would much rather use C++ data structures where we can change the implementation and have finer control over e.g. serialization and memory usage, as well as being able to implement good algorithms natively, rather than using these maps. And there should really only be a limited number of cases where that is necessary, I can't see us needing to go through every script and change every "associative array" object to a map.

Some of these questions are answered in the link I've posted to #2370.

In my opinion we have to try some different approaches and compare the results. Maps are one thing that could help, but we should only use them if they prove to be the best approach or at least a sufficiently good approach (which the current implementation of entity collections probably isn't).

Link to comment
Share on other sites

Some of these questions are answered in the link I've posted to #2370.

In my opinion we have to try some different approaches and compare the results. Maps are one thing that could help, but we should only use them if they prove to be the best approach or at least a sufficiently good approach (which the current implementation of entity collections probably isn't).

The jsapi IRC log? It's pretty interesting. I think one of the stronger points in favor of the Maps is ordered iteration, then again that's something we could get with our own implementation, even if it was more work. It would certainly be interesting to profile the entity collections more closely and compare their performance with objects vs. Maps. To me it feels like one of those things that should be part of the C++ API for AIs, that brings up JS<->C++ overhead as the ticket points out, it would be good to know exactly what we're dealing with there too.
Link to comment
Share on other sites

I've had a look at incremental GC. Incremental GC allows splitting up the garbage collection into slices and JS code can run between these slices.

It doesn't make garbage collection faster in general, but it should avoid big lag spikes when too much garbage collection happens at once.

SpiderMonkey uses a Mark-and-Sweep garbage collector. First, it marks garbage in slices (that's the incremental part), then it sweeps everything in another thread.

It was quite a challenge and I wouldn't have a working implementation yet if terrence didn't help me on the JSAPI channel.

He also opened a bug to add some more documentation. It looks like it's not yet figured out how API users are meant to configure incremental GC because Firefox accesses it quite directly.

Anyway, I implemented a quite simple logic.

  • If no GC cycle (the whole GC with multiple slices) is running and if the JS heap size has grown more than 20 MB since the last GC, we start a new incremental GC.
  • We run the incremental GC in slices of 10ms each turn until it's done.

The numbers can be tweaked of course.

The part of measuring the heap size since the last GC isn't quite accurate though.

What it actually does is setting the last heap size to the current heap size if the current heap size is smaller.

That's because I didn't find a way to detect when the sweeping in the background thread has finished. So after the marking has completed, the heap is shrinking because the sweeping is running in the background.

Here is a graph showing GC performance. I had to change the profiler to print the exact values each turn instead of printing an average over several turns.

I've also changed v1.8.5 to run GC every 50 turns because I had to use only one runtime for measuring and the current settings wouldn't have worked.

What we are measuring here are the peaks. It's quite a good result IMO, but it's hardly visible on the graph with total performance and with average values (because GC only runs from time to time).

post-7202-0-35842000-1390768266.png

Btw. It's not a fault, this time v24 is actually better ;)

The next graphs show memory usage. V24 seems to use a bit more , but most of that is probably related to the GC behaviour.

The peaks in the end could be because of the way I'm calculation the "last heap size" and because it takes several turns to mark the garbage before sweeping kicks in and actually frees it.

Memory overview:

post-7202-0-20354800-1390768199_thumb.pn

Memory zoomed:

post-7202-0-29788800-1390768212_thumb.pn

post-7202-0-35842000-1390768266_thumb.pn

  • Like 2
Link to comment
Share on other sites

Can someone show me an history of spidermonkey versions ? I'm wondering because your going from v1.8.5 to v24 which is 23versions more and i find it a bit weird to have so many different versions. I know that C language has 99.

Regards Stan

It's a bit related to FF versioning. Until FF4, they had lots of sub versions, just as the JS engine had. From FF4 onwards, they released a new FF version every few weeks, this included a new spidermonkey version too. Every now and then, there's a long-term release. v24 is such a long term release, and the next one will be v31. We will probably try to keep using the latest long-term release after v24 is in.

Link to comment
Share on other sites

Very nice. How does it scale with multiple players? If more players cause more gc time in a linear way, we could give 5ms per player per turn maybe?

I just noticed that I haven't answered to this yet.

Spending too much time fine-tuning the GC doesn't make sense at the moment in my opinion.

Maybe there's some more API functionality in the next SpiderMonkey versions or we change our code to produce less or (hopefully not) more garbage.

Btw. I've attached the current WIP patch with incremental GC to ticket #1886.

  • Like 1
Link to comment
Share on other sites

I find it a bit weird to have so many different versions. I know that C language has 99.

It doesn't - C99 is the 1999 revision of the language standard. C11 was published a couple of years back (in 2011), although some compilers (looking at MSVC) don't even support C99 fully yet.

[/offtopic]

Link to comment
Share on other sites

  • 3 weeks later...

The upgrade is quite close to completion now! :)

I've fixed serialization issues, compiler warnings, tests, replaced the SpiderMonkey usage in Atlas (#2434), cleaned up the patch and fixes other smaller issues.

Now I need someone with a Mac to set up the build script for Mac OS X and test the build process there (#2442).

In the meantime I'll test the build on Windows, have another close look through the whole patch and then call it v1.0 and consider it ready for the final review.

I'd like to be ready for review in about a week and done with the review as soon as possible after that to have enough testing before the next alpha.

  • Like 4
Link to comment
Share on other sites

I have a very strange problem on Windows. Maybe someone has an idea what it could be.

I have a static lib project with the two ScriptInterface files and a main .exe project with main.cpp.

Now it segfaults on the line "int x= val" when ScriptInterface is in another translation unit (the lib) AND the function uses jsval as return type AND the function is static AND the function uses a template (also check the comments in main.cpp to see which variations work and which don't).

Any ideas?

Main.cpp

#include <iostream>#include <fstream>#include <streambuf>#include <sstream>#include <stdio.h>#include "../staticlib1/ScriptInterface.h"#define __STDC_LIMIT_MACROS#include <stdint.h>using namespace std;/* The class of the global object. */static JSClass global_class = { "global", JSCLASS_GLOBAL_FLAGS, JS_PropertyStub, JS_DeletePropertyStub, JS_PropertyStub, JS_StrictPropertyStub, JS_EnumerateStub, JS_ResolveStub, JS_ConvertStub, NULL, JSCLASS_NO_OPTIONAL_MEMBERS };/* The error reporter callback. */void reportError(JSContext *cx, const char *message, JSErrorReport *report) {     fprintf(stderr, "%s:%u:%s\n",             report->filename ? report->filename : "<no filename="">",             (unsigned int) report->lineno,             message);}void initContext(JSContext* cx){	uint32_t options = 0;	options |= JSOPTION_EXTRA_WARNINGS; // "warn on dubious practice"	options |= JSOPTION_VAROBJFIX; // "recommended" (fixes variable scoping)	options |= JSOPTION_BASELINE;	options |= JSOPTION_ION;	options |= JSOPTION_TYPE_INFERENCE;	options |= JSOPTION_COMPILE_N_GO;	options |= JSOPTION_STRICT_MODE;	JS_SetOptions(cx, options);	JS_SetErrorReporter(cx, reportError);}int main(int argc, const char *argv[]) {	JSRuntime* rt = JS_NewRuntime(16 * 1024 * 1024, JS_NO_HELPER_THREADS);	JS_SetNativeStackQuota(rt, 128 * sizeof(size_t) * 1024);		JSContext* cx = JS_NewContext(rt, 8192);	JSAutoRequest rq(cx);	JSObject* global = JS_NewGlobalObject(cx, &global_class, NULL);	JSCompartment* comp = JS_EnterCompartment(cx, global);	JS_SetGlobalObject(cx, global);	JS_InitStandardClasses(cx, global);	initContext(cx);	ScriptInterface scriptInterface;	jsval value = scriptInterface.myToJSVal(188); // works	value = ScriptInterface::myToJSValStatic(188); // works	ScriptInterface::myVoidStaticTemplate(188); // works	value = ScriptInterface::myToJSValStaticTemplate(188); // segfault}

ScriptInterface.h:

#ifdef _WIN32# define XP_WIN# ifndef WIN32#  define WIN32 // SpiderMonkey expects this# endif#endif#include "jsapi.h"class ScriptInterface{public:	ScriptInterface() {};	template<typename T> jsval myToJSVal(const T& val);	template<typename T> static jsval myToJSValStaticTemplate(const T& val);	static jsval myToJSValStatic(const int& val);	template<typename T> static void myVoidStaticTemplate(const T& val);};

ScriptInterface.cpp:

#include "ScriptInterface.h"template<> jsval ScriptInterface::myToJSVal<int>(const int& val){	int x = val;	printf("value: %d", x);	return JS::Int32Value(x);}jsval ScriptInterface::myToJSValStatic(const int& val){	int x = val;	printf("value: %d", x);	return JS::Int32Value(x);}template<> jsval ScriptInterface::myToJSValStaticTemplate<int>(const int& val){	int x = val;	printf("value: %d", x);	return JS::Int32Value(x);}template<> void ScriptInterface::myVoidStaticTemplate<int>(const int& val){	int x = val;	printf("value: %d", x);}

Link to comment
Share on other sites

Using static and template induces implicite instantiation. There the compiler doesn't use a specific order and it's completely random order of instantiation and even differs from compiler to compiler (so Visual Studio compiler or GCC or whatever you use ...).

A solution could be to explicitely declare all participants in any source-file similar to:

template<>jsval ScriptInterface::myToJSValStaticTemplate<int>;
  • Like 1
Link to comment
Share on other sites

Using static and template induces implicite instantiation. There the compiler doesn't use a specific order and it's completely random order of instantiation and even differs from compiler to compiler (so Visual Studio compiler or GCC or whatever you use ...).

A solution could be to explicitely declare all participants in any source-file similar to:

template<>jsval ScriptInterface::myToJSValStaticTemplate<int>;

Indeed, it finally works!

Adding this line to main.cpp (in the example) solves the problem:

template jsval ScriptInterface::myToJSValStaticTemplate<int>(const int& val);

Thanks very much, you just made my day! I was staring at Visual Studio's memory view and comparing values and addresses on the stack for way too long now!

I still don't quite understand why this causes a segfault and not a compiler error. I also don't understand why it worked with SpiderMonkey 1.8.5 but not with v24.

I guess I have to read a bit more about how templates work.

Link to comment
Share on other sites

You are the master not me. It was just a guess ... lucky it turned out useful. Can still be random. Sometimes working, sometimes not. Could have also been some magic null pointer assignment.

All not directly your fault for sure. Sometimes it's simply crazy!

The problem of static and template mixed is related to the compiler not implicitely "seeing" if it's used or not according to C++ specification. So it's instantiated at runtime at a random order and bang ... if it [the variable/function...] is used somewhere prior to being initialized, then it breaks with segmentation error.

Thank your for all your time going into our spidermonkey issues! (interesting words those mozilla folks think of ..)

A colleague from stackoverflow, Johannes wrote something about it: Static Member Init Template Fun Inside.

Edited by Hephaestion
Link to comment
Share on other sites

I'm still trying to understand this problem.

I don't see any static data members, so the explanation probably doesn't apply completely.

It must be releated to static function templates though because placing that code in main.cpp solved it.

Also JS::Value must be related in some way because I couldn't find any other types that cause the same problem when used as return value.

Link to comment
Share on other sites

Definitely there must be more implicitely to be initialized static templates. One alone will make it easy for the compiler to get it in the correct order (as there is only one). Even though there are more than one still the compiler could be lucky and initialize all references using the correct order.

Agreed. To exactly understand the problem in the context of JS::Value would be helpful. Unfortunately for bigger programs to find the real issue could turn out more complicated than simply explicitely declaring used static template functions and attributes (in the correct order). If you do further research, please let us know how your ventures turn out. And take your time - SpiderMonkey by itself is dangerous enough as it's highly optimized probably at all and every corner. (so I would not use it for spaceflight, as there could be crazy odds one never thinks of until the spacecraft crashes)

Link to comment
Share on other sites

I'm analyzing the generated assembler code in the .exe now.

I'm new to assembler but what I see so far is that it calls the same function differently which must be the reason why it fails.

I hope my interpretation (comments) is correct.

This is how it calls the function in the working version:

  Working   004117E9: 8D 45 A0           lea         eax,[ebp-60h]       ; storing a reference to the memory location of the integer argument in eax  004117EC: 50                 push        eax                 ; pushing the content of eax on the stack  004117ED: 8D 8D D0 FE FF FF  lea         ecx,[ebp-130h]      ; storing a reference to the memory location of the JS::Value in ecx  004117F3: 51                 push        ecx                 ; pushing the content of ecx on the stack  004117F4: E8 7A F8 FF FF     call        @ILT+110(??$myToJSValStaticTemplate@H@ScriptInterface@@SA?AVValue@JS@@ABH@Z) ; call the function  004117F9: 83 C4 08           add         esp,8               ; clean up the space required for the arguments (the int ptr and the JS::Value pointer)  004117FC: 8B 10              mov         edx,dword ptr [eax]        ; expect the pointer to the JS::Value data in eax. Copy the first four Bytes to edx  004117FE: 8B 40 04           mov         eax,dword ptr [eax+4]      ; expect the pointer to the JS::Value data in each. Copy the second four Bytes to eax  00411801: 89 55 AC           mov         dword ptr [ebp-54h],edx    ; copy the first part of the value from the register to the memory location of the JS::Value  00411804: 89 45 B0           mov         dword ptr [ebp-50h],eax    ; copy the second part of the value from the register to the memory location of the JS::Value  00411807: C7 45 FC FF FF FF  mov         dword ptr [ebp-4],0FFFFFFFFh

And here's the segfault version:

Segfault  004117E9: 8D 45 A0           lea         eax,[ebp-60h]            ; storing a reference to the memory location of the integer argument in eax  004117EC: 50                 push        eax                      ; pushing the content of eax on the stack  004117ED: E8 81 F8 FF FF     call        @ILT+110(??$myToJSValStaticTemplate@H@ScriptInterface@@SA?AVValue@JS@@ABH@Z)  ; Call the function  004117F2: 83 C4 04           add         esp,4                      ; clean up the space required for the argument (the int ptr in eax)  004117F5: 89 85 D0 FE FF FF  mov         dword ptr [ebp-130h],eax   ; JS::Value structure is 8 Bytes. Expect to find the first four bytes of the JS::Value data in eax and copy it to memory (12Ch is 4 bytes before/after 130h)  004117FB: 89 95 D4 FE FF FF  mov         dword ptr [ebp-12Ch],edx   ; expect the second four bytes of JS::Value data in edx and copy it to memory (12Ch is 4 bytes before/after 130h)  00411801: 8B 8D D0 FE FF FF  mov         ecx,dword ptr [ebp-130h]  00411807: 89 4D AC           mov         dword ptr [ebp-54h],ecx  0041180A: 8B 95 D4 FE FF FF  mov         edx,dword ptr [ebp-12Ch]  00411810: 89 55 B0           mov         dword ptr [ebp-50h],edx  00411813: C7 45 FC FF FF FF  mov         dword ptr [ebp-4],0FFFFFFFFh

Link to comment
Share on other sites

I'm analyzing the generated assembler code in the .exe now.

I'm new to assembler but what I see so far is that it calls the same function differently which must be the reason why it fails.

I hope my interpretation (comments) is correct.

This is how it calls the function in the working version:

  Working   004117E9: 8D 45 A0           lea         eax,[ebp-60h]       ; storing a reference to the memory location of the integer argument in eax  004117EC: 50                 push        eax                 ; pushing the content of eax on the stack  004117ED: 8D 8D D0 FE FF FF  lea         ecx,[ebp-130h]      ; storing a reference to the memory location of the JS::Value in ecx  004117F3: 51                 push        ecx                 ; pushing the content of ecx on the stack  004117F4: E8 7A F8 FF FF     call        @ILT+110(??$myToJSValStaticTemplate@H@ScriptInterface@@SA?AVValue@JS@@ABH@Z) ; call the function  004117F9: 83 C4 08           add         esp,8               ; clean up the space required for the arguments (the int ptr and the JS::Value pointer)  004117FC: 8B 10              mov         edx,dword ptr [eax]        ; expect the pointer to the JS::Value data in eax. Copy the first four Bytes to edx  004117FE: 8B 40 04           mov         eax,dword ptr [eax+4]      ; expect the pointer to the JS::Value data in each. Copy the second four Bytes to eax  00411801: 89 55 AC           mov         dword ptr [ebp-54h],edx    ; copy the first part of the value from the register to the memory location of the JS::Value  00411804: 89 45 B0           mov         dword ptr [ebp-50h],eax    ; copy the second part of the value from the register to the memory location of the JS::Value  00411807: C7 45 FC FF FF FF  mov         dword ptr [ebp-4],0FFFFFFFFh

And here's the segfault version:

Segfault  004117E9: 8D 45 A0           lea         eax,[ebp-60h]            ; storing a reference to the memory location of the integer argument in eax  004117EC: 50                 push        eax                      ; pushing the content of eax on the stack  004117ED: E8 81 F8 FF FF     call        @ILT+110(??$myToJSValStaticTemplate@H@ScriptInterface@@SA?AVValue@JS@@ABH@Z)  ; Call the function  004117F2: 83 C4 04           add         esp,4                      ; clean up the space required for the argument (the int ptr in eax)  004117F5: 89 85 D0 FE FF FF  mov         dword ptr [ebp-130h],eax   ; JS::Value structure is 8 Bytes. Expect to find the first four bytes of the JS::Value data in eax and copy it to memory (12Ch is 4 bytes before/after 130h)  004117FB: 89 95 D4 FE FF FF  mov         dword ptr [ebp-12Ch],edx   ; expect the second four bytes of JS::Value data in edx and copy it to memory (12Ch is 4 bytes before/after 130h)  00411801: 8B 8D D0 FE FF FF  mov         ecx,dword ptr [ebp-130h]  00411807: 89 4D AC           mov         dword ptr [ebp-54h],ecx  0041180A: 8B 95 D4 FE FF FF  mov         edx,dword ptr [ebp-12Ch]  00411810: 89 55 B0           mov         dword ptr [ebp-50h],edx  00411813: C7 45 FC FF FF FF  mov         dword ptr [ebp-4],0FFFFFFFFh

Have you tried ida pro ? I uses a very convenient algorigram view

(here with a retro engineering project I'm doing)

post-12287-0-41989800-1393705115_thumb.p

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

 Share

×
×
  • Create New...