Yggdrasil's Achievements

  1. You don't have to create your own tool. There's a W3C standard named ITS[1] and a nice program, itstool[2], which generates POT files for us. For the entity XML files you could create ITS rules like this one: https://github.com/t.../entity.its.xml

     itstool then generates this (from just one XML file): https://github.com/t.../po/hoplite.pot

     The proposed patch[3] instead marks translatable strings explicitly with special tags. I'm not sure which approach is preferred. Other formats can be handled with existing tools too, see here: http://www.wildfireg...=60#entry258851 (I think the proposed patch handles JavaScript in a different way, but it uses format strings too).

     @fragmentation: I don't think you deallocate unit descriptions that often, so fragmentation should be minimal. You only allocate them on load and free them when the game shuts down (this is a guess; I'm not familiar with the code).

     [1] http://www.w3.org/TR...C-its-20070403/ [2] http://itstool.org/ [3] http://trac.wildfire...s.com/ticket/67
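     A minimal ITS rules file might look like the sketch below. The element names (SpecificName, History) are assumptions for illustration; the actual linked rules file is truncated above. Per ITS 1.0, later rules take precedence, so the blanket "no" rule comes first:

     ```xml
     <?xml version="1.0"?>
     <its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
       <!-- by default nothing is translatable (ids, paths, numbers) -->
       <its:translateRule selector="/" translate="no"/>
       <!-- mark the player-visible text of each entity as translatable -->
       <its:translateRule selector="//Entity/SpecificName" translate="yes"/>
       <its:translateRule selector="//Entity/History" translate="yes"/>
     </its:rules>
     ```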
  2. Or use public ones like Transifex and Launchpad: https://www.transifex.com/ https://translations.launchpad.net/ Here's the documentation for the git integration of Weblate: http://docs.weblate....tomatic-worflow (other systems have similar ways)

     Or, if all this is too fancy for you, you can just use a translation mailing list to which all interested translators subscribe. A developer then posts an updated POT file every now and then (mostly when a new release is near) and makes a "call for translation", so that the subscribed translators merge the string changes from the POT file into their PO files, update the translations and send back the new PO files. The developer then commits the new PO files. This was more or less the standard way some years ago. It's nice and simple, but it has the disadvantage that there's no easy way for multiple translators of the same language to collaborate. They only get a mailing list; it's up to them to organize themselves. That's why more and more projects use online translation systems, which make participation and collaboration very easy. Which is important to attract translators.

     @RedFox: If it's set in stone that you want to use your "dictionary" approach, you can still use all the gettext utilities. You just have to create a POT file by collecting all translatable strings (however you want), let the translators create PO files, and commit these PO files to svn as well. Then use your txt format as a backend, the same way GNU gettext uses MO files, by converting the PO files at compile time (just like msgfmt does, which has backends for Java, C#, Tcl and Qt too). This way you only need a PO->txt tool and don't need to be feature-compatible with PO files, as you can drop stuff. I still doubt you get any benefit from your format, as I also think string handling isn't a bottleneck.
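     The PO->txt conversion step could be sketched like this. This is a toy parser, not a real PO reader: it only handles single-line msgid/msgstr pairs and ignores plurals, multi-line strings, comments and fuzzy flags, which a real converter built on gettext tooling would have to respect. The "msgid=msgstr" output format is made up for illustration:

     ```python
     import re

     def po_to_txt(po_text):
         """Convert simple msgid/msgstr pairs from a PO file into a
         'source=translation' line format. Deliberately minimal: no
         plurals, no multi-line strings, no fuzzy handling."""
         pairs = re.findall(r'msgid "(.*)"\s*\nmsgstr "(.*)"', po_text)
         lines = []
         for msgid, msgstr in pairs:
             # skip the PO header (empty msgid) and untranslated entries
             if msgid and msgstr:
                 lines.append(f"{msgid}={msgstr}")
         return "\n".join(lines)
     ```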
  3. You forget about your id strings/keywords, which you have to load from the XML files. So basically the number of allocations is only a third, because you merged generic, specific and history into one keyword. The only real difference with your approach is that you do this string pooling for the English messages too.

     The lookup and the message catalog are highly optimized in GNU gettext. It hashes the string and looks it up in the message catalog, which is by default an MO file[1]. It also caches results[2]. I wouldn't be surprised if the whole catalog is memory-mapped, but I'm not sure about it (looking at strace, it sure looks like it). Btw, tinygettext, which is used in the proposed patch[3], uses the PO files directly, no MO files. I'm not sure how the lookup is done there (probably by building a hashmap on load).

     [1] http://www.gnu.org/s...t.html#MO-Files [2] http://www.gnu.org/s...timized-gettext [3] http://trac.wildfire...s.com/ticket/67
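     The hash-and-cache behaviour described above can be mimicked in a few lines. This is a toy model, not gettext's actual implementation (which works on a binary MO file); the lookup counter exists only to make the caching visible:

     ```python
     class Catalog:
         """Toy message catalog: a hash map from source string to
         translation, plus a cache of previous lookups, loosely
         mimicking gettext's optimized lookup path."""

         def __init__(self, entries):
             self._catalog = dict(entries)
             self._cache = {}
             self.lookups = 0   # counts real catalog probes, for illustration

         def gettext(self, msgid):
             if msgid in self._cache:
                 return self._cache[msgid]
             self.lookups += 1
             # like gettext, fall back to the source string if untranslated
             result = self._catalog.get(msgid, msgid)
             self._cache[msgid] = result
             return result
     ```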
  4. But then again they have to know the intended meaning of the keyword and where it is shown ingame. Isn't your data laid out badly if the information is that fragmented? Better to merge some data files than to collect the pieces later.

     I don't really get what you want with your dictionary (a string pool?). It doesn't make sense for English if the English text is already in the source or data files. You already parsed the XML and hold the information in some object, so why do another lookup in a dictionary? For other languages, where you have to translate, gettext of course uses a hashmap. So there you have your dictionary.
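     Python's standard library gettext module illustrates the point that English needs no dictionary at all: the untranslated fallback simply returns the source string unchanged (NullTranslations is the stdlib's no-catalog fallback):

     ```python
     import gettext

     # With English phrases in the source, the no-translation fallback
     # is already readable: NullTranslations returns the msgid as-is.
     _ = gettext.NullTranslations().gettext

     print(_("Hoplites are heavy infantry."))
     ```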
  5. You'd face multiple problems by implementing your own system. You are using a keyword-based system, which is bad for the translators. Just look at the examples posted by historic_bruno: you have to translate some arbitrary keyword, which means you first have to look up the English "translation" to get its meaning and only then translate. It's also bad for the developers, as they now have to look at two places: the source code where the keyword is used, and the English translation. Why not just use the English phrases in the source code and extract them from there (as is the case now anyway)? The developers then only have to maintain the source code, and the translators get an English phrase or a real word to work with. That's what gettext does.

     How do you do updates? How do you inform other translators that the English translation of a keyword changed? What do you do with old, obsolete or unmaintained translations? How do you inform translators about new keywords? You'd have to write tools for all these situations and probably more (like context and plurals). Gettext and PO editors have all these features already.

     And I just have to second zoot: PO files are text files. If you don't want to use a PO editor with nice comfort functions or an online translation system like Transifex, then just use a plain text editor.
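     For readers who haven't seen one: a PO file is just stanzas of plain text like the following (the file path and strings here are invented for illustration; the "#, fuzzy" flag marks an entry the translator still needs to review):

     ```
     #: simulation/templates/units/hele_hoplite.xml:12
     msgid "Hoplites are heavy infantry armed with spear and shield."
     msgstr "Hopliten sind schwere Infanterie mit Speer und Schild."

     #, fuzzy
     msgid "Trains citizen soldiers."
     msgstr "Bildet Buergersoldaten aus."
     ```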
  6. Nice post, but I have some concerns: your Unicode support is very limited when it comes to complex scripts [1][2]. I don't think texture-mapping of glyphs will work here; there are just too many combinations. That's why Ykkrosh proposed to use Pango, which uses HarfBuzz internally to support complex rendering [3]. It also uses fontconfig to search for glyphs in multiple fonts, so you don't need to know in which font a specific glyph is stored. With Pango you render (through cairo) your whole text snippet into a texture which you could cache for later use. Of course this is slower than texture-mapping, but I doubt it'll be a bottleneck.

     [1] https://en.wikipedia.org/wiki/Complex_text_layout [2] http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=CmplxRndExamples&_sc=1 [3] http://www.wildfiregames.com/forum/index.php?showtopic=13790&st=40#entry236362
  7. You might want to check out streflop, which is used by Spring and MegaGlest: http://nicolas.brodu...ation/streflop/

     Better just use gettext: http://www.gnu.org/software/gettext/ A keyword-based translation system is also hard to maintain. Related thread: http://www.wildfireg...showtopic=13790
  8. Guess why most commercial games use big package files to store data (besides obscuring it)? It's faster to seek in one big file than to open and close many tiny files. It uses fewer syscalls, and locality will be better too (only if you mostly read sequentially, and if the file isn't fragmented). You can also memory-map one big file more easily to get even more performance: https://en.wikipedia.org/wiki/Memory-mapped_file

     But I don't think you need all that stuff. Loading time is still acceptable.
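     The memory-mapping idea can be sketched with Python's stdlib mmap module (a real engine would do this in C/C++ with mmap(2), but the mechanics are the same): map the package file once, then slice assets out of it; the kernel pages the data in on demand, with no explicit read() per asset. The function and its offset/length interface are invented for illustration:

     ```python
     import mmap

     def read_asset(pack_path, offset, length):
         """Memory-map a package file and return one asset's bytes.
         No read() call per asset: slicing the map lets the kernel
         page in exactly the region that is touched."""
         with open(pack_path, "rb") as f:
             with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                 return mm[offset:offset + length]
     ```

     In a real loader you would keep the mapping open for the lifetime of the game instead of remapping per asset.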
  9. Most likely the disk cache of the kernel, which caches recently used files in RAM. Besides, 0ad caches stuff too (DDS textures, ...).

     I've made a very simple texture atlas working with compressed DDS files (without decompression). It's just a proof of concept; I could clean it up a bit if there's interest. It's probably only really useful once you start to batch things in the renderer. Something to keep in mind for the new renderer.
  10. There was some minor technical "discussion" here: http://www.wildfiregames.com/forum/index.php?showtopic=13790&st=60 I tried to summarize it on the last page. As there was no interest and no comments from the developers, I stopped looking into it.
  11. As you're planning for 1.0, may I bump this thread. Translation is an important feature you shouldn't forget about. Let me summarize what was "discussed" in this thread.

     Ykkrosh suggested using Pango[1] to get bidi and font-shaping support (it uses HarfBuzz internally). Some other nice features of Pango: it searches for the right font based on the needed glyphs and the attributes you specify (using fontconfig). Pango markup can be used to get some basic text effects. It also provides basic layout features like word wrapping. You basically just throw text at it and you're done; you don't have to worry about the details. The negative side: it pulls in many dependencies, and as it replaces all the text rendering and font handling, it probably takes quite some time to integrate.

     I suggested using gettext[2] as the i18n framework and pointed out some ways to extract strings from source and data files (see some posts above). Some people mentioned different online translation systems, which make it very easy for translators to get started and to collaborate. To make use of these you have to provide your strings in a compatible format, as gettext does. If you want to run one on your own server, Pootle[3] is a good project. Other public translation systems are Transifex[4] and Rosetta[5] (part of Launchpad).

     [1] http://www.pango.org/ [2] http://www.gnu.org/software/gettext/ [3] http://docs.translat...otle/en/latest/ [4] https://www.transifex.com/ [5] https://translations.launchpad.net/
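     As an example of the Pango markup mentioned above, inline span attributes give simple text effects without touching the renderer (the labels and values here are invented; weight, foreground and size are standard Pango markup attributes):

     ```
     Attack: <span weight="bold" foreground="#cc0000">12</span> <span size="small">(hack)</span>
     ```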
  12. Let's keep pushing the ball. I had another look at your data files and noticed some other formats you use: JSON, and JavaScript embedded in XML.

     I already noticed your JSON files some days ago, and json2po, a converter from translate-toolkit[1], looks like it suits the job. Use the filter option to pick the leaves you want to extract, e.g.:

         json2po --filter=History,Description -o hele.pot hele.json

     This only works with the current git master[2] of translate-toolkit, as I had to patch it; the patch has already been merged upstream. You can only filter by key names, not by their position in the tree, but I guess that's not a problem for you.

     JavaScript seems to be a bit trickier. The easiest solution I found: extract the JavaScript parts with XSLT and let xgettext do the rest. xgettext doesn't support JavaScript (yet; there's an unmerged patch for it[3]), but you can get away with Python or Perl set as the language. You have to mark strings with some kind of function like _(), otherwise they aren't extractable (just as you would do in C++). Best to bind the gettext function to it as well (I don't know how you call C++ from JS).

         xml sel -T -t -v '//action' -n mainmenu.xml |\
         xgettext --language=perl -k_ -o js.pot -

     This will extract all _("strings") from JavaScript in an action tag. You need xmlstarlet[4] for the XSLT part. This has some limitations, e.g. you can't use string concatenations like this one:

         _("Open "+url+"\n in default web browser.")

     The string needs to be constant. Format strings are the alternative, and they can be added easily[5]. The translator can then move the special characters of the format string to where they need to be in the translated string (that's why it's a bad idea to break a sentence up into multiple strings). As you can see, it gets complex quite easily. Still, it shouldn't be too hard to add this to your build system (I'm not familiar with premake, so I might be wrong).
[1] http://docs.translat...lkit/en/latest/ [2] https://github.com/translate/translate [3] http://lists.gnu.org...2/msg00012.html [4] http://xmlstar.sourceforge.net/ [5] http://stackoverflow...f-string-format
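     The _("...") extraction that xgettext performs in the pipeline above can be approximated in a few lines of Python, with the same limitation: only single constant string literals are matched, concatenations are deliberately skipped. This is a simplified stand-in for illustration, not a replacement for xgettext:

     ```python
     import re

     # Matches _("...") or _('...') holding one constant string literal.
     # Concatenations like _("a" + url) are intentionally not matched,
     # mirroring the xgettext limitation described above.
     _CALL = re.compile(r'''_\(\s*("((?:[^"\\]|\\.)*)"|'((?:[^'\\]|\\.)*)')\s*\)''')

     def extract_strings(source):
         """Return the translatable string literals found in script source."""
         return [m.group(2) if m.group(2) is not None else m.group(3)
                 for m in _CALL.finditer(source)]
     ```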
  13. I don't think that's a big problem. Any translator should be well aware that it's a moving target and that there are string changes every now and then. gettext can help here a bit: the English strings get extracted from the source files, be it data or code, and merged with the existing translations. It even does fuzzy matching: if a string change is small enough, it still uses the old translation but marks it as fuzzy, so the translator should have a look at it. Old and obsolete translations get removed or commented out. All of that should be done by the build system and committed by whoever makes the string change. There's probably more, depending on the translation editor used.

     Your English texts probably evolve the same way as everything else: step by step. So why not let translators do the same, continuously and in parallel? Of course some translations will become obsolete, but that's just normal. Many open source projects integrate their translation system with their version control system, so translators get every change directly after it was made. Other projects update their language files only when they are near a release and make a "call for translation" to notify the translators. If you're that concerned about "wasted" translator effort, choose the latter approach. I think 0ad is stable enough to at least get started.
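     The fuzzy-matching step that msgmerge performs can be sketched with stdlib difflib. This is a toy model of the idea, not msgmerge's actual algorithm; the similarity cutoff is an arbitrary assumption:

     ```python
     import difflib

     def merge(old_translations, new_msgids, cutoff=0.6):
         """Mimic msgmerge: reuse exact matches, mark close matches as
         fuzzy, leave genuinely new strings untranslated. Returns
         {msgid: (translation, is_fuzzy)}."""
         merged = {}
         for msgid in new_msgids:
             if msgid in old_translations:
                 merged[msgid] = (old_translations[msgid], False)   # exact reuse
             else:
                 close = difflib.get_close_matches(
                     msgid, old_translations, n=1, cutoff=cutoff)
                 if close:
                     merged[msgid] = (old_translations[close[0]], True)  # fuzzy
                 else:
                     merged[msgid] = ("", False)                         # new string
         return merged
     ```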
  14. Let's see if we can get this rolling. I was investigating how to improve translation for GlestAE when I stumbled across this thread. There's one big similarity between both projects: most data is stored in XML files, including strings which have to be extracted for translation. So I thought I might share my findings with you here.

     First of all, as I'm working on Linux, using GNU gettext as the i18n framework is more or less a given. There's good tool support, and many translators are familiar with it. You don't have to use the GNU implementation; e.g. boost::locale is a bit more flexible when searching for the message catalog and can support virtual filesystems like PhysFS. When using gettext, the source and data files are written in English. The strings are then extracted from these files (preferably by the build system). For source files this is very easily done with xgettext; see this quick tutorial[1].

     Obviously xgettext can't support arbitrary XML structures. It turns out there's a W3C standard named ITS[2]. With it, one can define rules specifying which text in an XML file should be translated. I recently found itstool[3], which executes these rules and gives us a nice POT file, just like xgettext. The only minor problem here: what you feed through gettext and what you mark as translatable might get out of sync.

     So, as I wrecked my previous test program while trying to get the OpenGL backend of cairo working, I made a simpler new one[4], this time using an XML file from 0 A.D. (which doesn't sound ridiculous in German, btw). It shows how to extract strings from XML, translate them and render them with Pango. It includes a very rough and incomplete German translation (LANG=de_DE.utf8). Look at the Makefile first to see what you need (sorry, no nice build system and Linux-only, though it should be easy to port). It would be nice if someone could provide a translation in a more complex script, like Arabic, just for testing. Use po/hoplite.pot as the template. Cheers.
[1] http://www.tuxamito.com/joomla/index.php/en/component/content/article/60-gettext-tutorial [2] http://www.w3.org/TR/2007/REC-its-20070403/ [3] http://itstool.org/ [4] https://github.com/tetzank/its_gettext_pango_opengl
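     The extraction step that itstool performs can be boiled down to a tiny sketch: walk the XML, collect the text of the elements your ITS rules mark as translatable, and emit POT stanzas. The element names and strings here are invented for illustration, and real itstool handles far more (attributes, inline markup, contexts):

     ```python
     import xml.etree.ElementTree as ET

     def xml_to_pot(xml_text, translatable_tags):
         """Extract the text of the given element tags from an entity
         XML file and emit minimal POT entries (a tiny stand-in for
         what itstool does from ITS rules)."""
         root = ET.fromstring(xml_text)
         entries = []
         for tag in translatable_tags:
             for elem in root.iter(tag):
                 if elem.text and elem.text.strip():
                     entries.append('msgid "%s"\nmsgstr ""' % elem.text.strip())
         return "\n\n".join(entries)
     ```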