
[Discussion] Localization (again)



Haha :D Yeah. I'm afraid you can't avoid having to copy some files if you're working on a project... ;)

Hundreds and hundreds of translators will disagree.

We could have a "translation manager" with svn access. This person will oversee the translation process and provide instructions and notices as necessary. The translators will email this person the .po file they are working on, and the translation manager will then compile the .po files into .mo files and commit them to the source.
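To make the "compile .po files into .mo ones" step concrete, here is a minimal sketch of what that compilation produces. In practice the manager would simply run gettext's msgfmt; the function name `compile_catalog` and the sample strings are made up for illustration, and the sketch skips plurals and contexts.

```python
import gettext
import io
import struct


def compile_catalog(messages):
    """Serialize {msgid: msgstr} into the bytes of a little-endian GNU .mo file."""
    # The entry with the empty msgid carries the catalog metadata.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    entries.update(messages)
    encoded = sorted(
        (key.encode("utf-8"), value.encode("utf-8")) for key, value in entries.items()
    )

    ids = b""
    strs = b""
    index = []  # (id length, id offset, str length, str offset), relative offsets
    for raw_id, raw_str in encoded:
        index.append((len(raw_id), len(ids), len(raw_str), len(strs)))
        ids += raw_id + b"\0"
        strs += raw_str + b"\0"

    header_size = 7 * 4
    table_size = len(encoded) * 8
    ids_start = header_size + 2 * table_size
    strs_start = ids_start + len(ids)

    # magic, revision, string count, msgid table offset, msgstr table offset,
    # hash table size (0 = none), hash table offset
    out = struct.pack(
        "<7I", 0x950412DE, 0, len(encoded), header_size, header_size + table_size, 0, 0
    )
    for id_len, id_off, _, _ in index:
        out += struct.pack("<2I", id_len, ids_start + id_off)
    for _, _, str_len, str_off in index:
        out += struct.pack("<2I", str_len, strs_start + str_off)
    return out + ids + strs


def load_catalog(mo_bytes):
    """Read the compiled bytes back with Python's own gettext reader."""
    return gettext.GNUTranslations(io.BytesIO(mo_bytes))
```

Reading the result back with `load_catalog(compile_catalog({"Spearman": "Lancier"})).gettext("Spearman")` is a quick way to check that the bytes form a valid catalog.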


We could have a "translation manager" with svn access. This person will oversee the translation process and provide instructions and notices as necessary. The translators will email this person the .po file they are working on, and the translation manager will then compile the .po files into .mo files and commit them to the source.

I won't rule out that someone has the level of dedication to be that person. But the process seems very clunky and error prone to me, compared to the streamlined workflow of services like Transifex and Launchpad Translations.


Most people won't end up using fancy translation software, but in case they want to, we'll have the PO Template for them and the tool that converts it back to .txt.

I've never worked on commercial games, only on a few open source projects, and my experience is that the biggest advantage of gettext is actually that the tools are so dead easy to use. Lokalize (http://userbase.kde.org/Lokalize), Poedit (http://www.poedit.net/screenshots.php) etc. are very accessible and easily learned. I wouldn't call them fancy. Then there are the ready-to-use web interfaces for hosted translation systems, e.g. Pootle and Weblate. Just to give an impression to people who don't know them (random links into demo projects):

http://pootle.locamotion.org/de/filezilla/filezilla.po/translate/#filter=suggestions

http://demo.weblate.org/projects/hello/master/de/translate/?type=all

These interfaces can be integrated with all popular VCS (git, svn, cvs at least) so you can work on trunk, branches, etc. and directly commit from the interface.

In my opinion the emphasis is quite naturally a bit different for open source projects than for commercial projects, as for open source projects the translators need to be attracted and motivated first, which is probably the most important aspect then.

Please note this is not an argument against using an additional intermediate layer for more efficient in-game usage. I just wanted to point out that the tools can make a massive difference.

Cheers


Or use public ones like Transifex and Launchpad.

https://www.transifex.com/

https://translations.launchpad.net/

Here's the documentation for git integration of weblate:

http://docs.weblate....tomatic-worflow

(other systems have similar ways)

Or if all this is too fancy for you, you can just use a translation mailing list where all interested translators subscribe. A developer then posts an updated POT file every now and then (mostly when a new release is near) and makes a "call for translation", so that the subscribed translators merge the string changes in the POT file into their PO file, update the translation and send back the new PO file. The developer then commits the new PO files. This was more or less the standard way some years ago.
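The "merge the string changes in the POT file into their PO file" step is what gettext's msgmerge does. A rough sketch of the core idea, with catalogs reduced to plain dictionaries (the function name and data shapes are assumptions for illustration, and fuzzy matching is left out):

```python
def merge_template(template_ids, old_translations):
    """Carry existing translations over to the msgids of a new template.

    template_ids: iterable of msgids found in the updated POT file.
    old_translations: {msgid: msgstr} from the translator's current PO file.
    Returns (merged, obsolete): merged holds every template msgid, with an
    empty string for new entries; obsolete holds translations whose msgid
    is no longer in the template.
    """
    merged = {msgid: old_translations.get(msgid, "") for msgid in template_ids}
    obsolete = {
        msgid: msgstr
        for msgid, msgstr in old_translations.items()
        if msgid not in merged
    }
    return merged, obsolete
```

The empty entries in `merged` are what the translator still has to fill in before sending the PO file back.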

This is nice and simple but has the disadvantage that there's no ready way for multiple translators of the same language to collaborate. They only get a mailing list; it's up to them to organize themselves. That's why more and more projects use online translation systems, which make participation and collaboration very easy. That is important for attracting translators.

@RedFox: If it's set in stone that you want to use your "dictionary" approach you can still use all the gettext utilities. You just have to create a POT file by collecting all translatable strings (however you want), let the translators create PO files and also commit these PO files to svn. Then just use your txt format as a backend, the way GNU gettext uses MO files, by converting the PO files at compile time (the same way msgfmt does it, which also has backends for Java, C#, Tcl and Qt). This way you only need a PO->txt tool and don't need to be feature-compatible with PO files, as you can drop stuff.
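A PO->txt tool of that kind can be quite small. The sketch below is only an illustration: the tab-separated output is a made-up stand-in for the dictionary format, the function names are hypothetical, and plural entries and msgctxt are deliberately dropped.

```python
import ast


def parse_po(text):
    """Return {msgid: msgstr} for the translated entries of a simple PO file."""
    catalog = {}
    fields = {}
    current = None

    def flush():
        # Skip the header (empty msgid) and untranslated entries (empty msgstr).
        if fields.get("msgid") and fields.get("msgstr"):
            catalog[fields["msgid"]] = fields["msgstr"]
        fields.clear()

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("msgid "):
            flush()
            current = "msgid"
            fields[current] = ast.literal_eval(line[len("msgid "):])
        elif line.startswith("msgstr "):
            current = "msgstr"
            fields[current] = ast.literal_eval(line[len("msgstr "):])
        elif line.startswith('"') and current:
            # Continuation line: PO strings may be split across several lines.
            fields[current] += ast.literal_eval(line)
        else:
            current = None  # unsupported keyword (plural forms, context)
    flush()
    return catalog


def to_dictionary_txt(catalog):
    """Write one 'source<TAB>translation' line per entry (hypothetical format)."""
    return "".join("%s\t%s\n" % (key, catalog[key]) for key in sorted(catalog))
```

Using `ast.literal_eval` to unquote relies on PO escapes such as `\n` and `\"` matching Python's, which holds for the common cases but is a shortcut, not a full PO parser.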

I still doubt you get any benefit from your format, as I also think string handling isn't a bottleneck.


I won't rule out that someone has the level of dedication to be that person. But the process seems very clunky and error prone to me, compared to the streamlined workflow of services like Transifex and Launchpad Translations.

I forgot to mention that there are "maintainers" for each language too. This maintainer is not only a translator but also a leader and representative for the translators of a particular language. He collects po files from translators in the team and sends them to the translation manager on a regular basis.

IMO the translation manager doesn't have to be very dedicated; he just has to check his email every day and have time to commit po files. At least that's what The Battle for Wesnoth is doing now.

Having said that, I agree that an online translation platform like transifex or launchpad is very handy, especially the ability to collaborate with other translators without the need for a language group. It's just that offline editors offer many advanced features (Lokalize's features are awesome!) that I'm sure I will miss upon moving to an online solution.

Edited by snwl

Having said that, I agree that an online translation platform like transifex or launchpad is very handy, especially the ability to collaborate with other translators without the need for a language group. It's just that offline editors offer many advanced features (Lokalize's features are awesome!) that I'm sure I will miss upon moving to an online solution.

I'm not advocating that online solutions are the only solution. Just that with PO files you have the option, along with many other tools that support the format. But at least on Launchpad, you have the option to upload a PO you made (and I believe Transifex offers the same). So it is not really a choice between online and offline editors - they work together. One thing they certainly don't do, though, is support some custom format we come up with.


Thanks for the awesome discussion everyone, a lot of important points have already been mentioned. :)

First of all, Launchpad looks really awesome. I wonder how difficult it would be to integrate it into the current development cycle?

@RedFox: If it's set in stone that you want to use your "dictionary" approach you can still use all the gettext utilities. You just have to create a POT file by collecting all translatable strings (however you want),

Since we can use .MO files, the plain text is starting to lose its point. Binary is certainly the preferable format. As for creating a POT file, we can use the wfgpotext program to just cycle through all the current XML files and generate an appropriate POT file from the texts in those files.

I guess I had a slightly different idea of collating the source text files into text files (like en_gui.txt, en_units.txt, etc.). And mods that add new units could just create an "en_mymod.txt" file in their mod folder. It would also be trivial to update these entries with an 'Entity editor' of some sort.

I still doubt you get any benefit from your format, as I also think string handling isn't a bottleneck.

I guess the plaintext format was never really the main point - the neat side effect of only having to do 1 malloc for the entire thing was the point. Also avoiding any temporary std::string objects.
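The single-allocation idea boils down to a layout: one contiguous block holding a count, an offset table and all string bytes, so a loader can read the whole thing in one go and hand out slices. Python cannot show the malloc itself, so the sketch below only illustrates such a layout; the exact field order and the function names are assumptions, not the format proposed in this thread.

```python
import struct


def pack_strings(strings):
    """Pack strings into one buffer: count, offset table, then UTF-8 data."""
    encoded = [s.encode("utf-8") for s in strings]
    offsets = []
    position = 0
    for chunk in encoded:
        offsets.append(position)
        position += len(chunk)
    offsets.append(position)  # sentinel: end of the last string
    header = struct.pack("<I", len(encoded))
    table = struct.pack("<%dI" % len(offsets), *offsets)
    return header + table + b"".join(encoded)


def get_string(blob, index):
    """Look up string number `index` using only the offset table."""
    (count,) = struct.unpack_from("<I", blob, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    start, end = struct.unpack_from("<2I", blob, 4 + 4 * index)
    data_start = 4 + 4 * (count + 1)
    return blob[data_start + start:data_start + end].decode("utf-8")
```

A C++ loader could read such a file into one buffer and return pointers into it, which is where the "1 malloc" property would come from.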

Even though a small part of the game such as this doesn't seem like it has a big effect (or any effect), we should accept that all of these different modules across the 0 A.D. source add up to a butterfly effect.

Creating a huge amount of string objects early on will definitely fragment memory - it's an accepted fact and has made clever developers even use temporary heaps when they have to do a large amount of string operations in programs with a long life-cycle. If you trash the heap with strings you'll make further allocations slower and some memory will be lost due to fragmentation.

Now if you add all this up to a large scale project like 0 A.D. with all its different modules that "don't care about memory", you'll get your desired butterfly effect and you'll see the game eat up more and more resources. I'm just saying game development has always been about clever tricks to squeeze out the best from the CPU. This is not just some business app where you don't care about performance - you'd use Java, C# or even Ruby for that.

Even though we know there are the 'big obvious bottlenecks', we should also not underestimate tiny objects that fragment the heap. Those not familiar with memory fragmentation should read up: http://blogs.msdn.co...-and-wow64.aspx

Even though your program seems to not take that much memory, the allocations can get slower and slower over time, because it becomes increasingly more difficult to find a reasonably sized free block.

I'm sure though that a similar low-fragmentation approach can be used for .MO files, so that would be a preferred solution. For all intents and purposes, the strings can remain in entity description files as they are now. And we can generate .POT file based on those XML files. Then the translators can jump in and do their magic.
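Generating the POT file from the entity XML files could look roughly like the sketch below. The tag names in `TRANSLATABLE_TAGS` and the function names are guesses for illustration only; which elements are actually translatable would have to be decided against the real entity files.

```python
import xml.etree.ElementTree as ET

# Hypothetical: elements whose text should end up in the template.
TRANSLATABLE_TAGS = {"GenericName", "Tooltip"}


def po_quote(text):
    """Quote a string the way PO files expect."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"%s"' % escaped


def extract_pot(xml_sources):
    """Build POT text from {filename: xml_text}, merging duplicate strings."""
    references = {}
    for filename in sorted(xml_sources):
        root = ET.fromstring(xml_sources[filename])
        for element in root.iter():
            text = (element.text or "").strip()
            if element.tag in TRANSLATABLE_TAGS and text:
                references.setdefault(text, []).append(filename)

    lines = ['msgid ""', 'msgstr ""', '"Content-Type: text/plain; charset=UTF-8\\n"', ""]
    for msgid, files in references.items():
        lines.append("#: " + " ".join(files))  # where the string came from
        lines.append("msgid " + po_quote(msgid))
        lines.append('msgstr ""')
        lines.append("")
    return "\n".join(lines)
```

Each string appears once in the template, with a `#:` reference comment listing the files it was found in.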


You don't have to create your own tool. There's a W3C standard named ITS[1] and a nice program itstool[2] which generates POT files for us.

So for entity xml files you could create ITS rules like this one:

https://github.com/t.../entity.its.xml

itstool then generates this one (from only one XML file):

https://github.com/t.../po/hoplite.pot

The proposed patch[3] marks translatable strings explicitly with special tags. Not sure which is preferred.

Other formats can be handled with existing tools too, see here:

http://www.wildfireg...=60#entry258851

(I think the proposed patch handles JavaScript in a different way but uses format strings too)

@fragmentation: I don't think you deallocate unit descriptions that often, so fragmentation should be minimal. You only allocate it on load and free it when the game shuts down (this is a guess, not familiar with the code).

[1] http://www.w3.org/TR...C-its-20070403/

[2] http://itstool.org/

[3] http://trac.wildfire...s.com/ticket/67


Now if you add all this up to a large scale project like 0 A.D. with all its different modules that "don't care about memory", you'll get your desired butterfly effect and you'll see the game eat up more and more resources.

If that is something we want to combat, we should do it on the basis of factual measurements, not simple conjecture. Since gettext is a tried and true solution, with more than two decades of optimization in the bag, that is by far the most obvious choice, and what we ought to go with for an initial implementation. If anyone can then subsequently improve upon the performance of that implementation, and present factual evidence in support of that, then making those changes may be worth considering.


  • 2 months later...

Hi all,

Sorry for barging in a bit late on this discussion, I did not see this until now.

First a few words about where I'm coming from: I have been working as a localizer for 2 years now, both professionally for proprietary software and as a hobby for Open Source, and I can't wait to get started translating this game into Scottish Gaelic :)

If I may make an observation from past experience and hair-pulling: Whatever you do internally, I strongly recommend presenting translators with PO files. This is why:

Access to translation memories and easy cooperation

PO files allow you to use translation memories and put them up on sites like transifex (yes, Transifex allows you to download and reupload the files for offline editing, just like Launchpad).

Plural handling

Gettext implements proper plural handling. How do you intend to implement plural handling? Just to give you an example for why it's needed - if I would use Gaelic grammar on an English example, I would get:

0 fighter

1 fhighter

2 fhighter

3 - 10 fighters

11 fhighter

12 fhighter

13-19 fighters

40 ... fighter

as opposed to English

0 fighters

1 fighter

2 ... fighters
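In gettext the rule behind such a table lives in the catalog's Plural-Forms header, and the program just asks for the form matching a number. As a rough sketch, the Gaelic grouping from the example can be written as a function returning a form index. The counts 20 to 39 are not listed in the example; putting them in the last group is an assumption here, and the word forms are the English stand-ins from above, not real Gaelic.

```python
def gaelic_plural_index(n):
    """Return which of the four plural forms the count n selects."""
    if n in (1, 11):
        return 0
    if n in (2, 12):
        return 1
    if 3 <= n <= 10 or 13 <= n <= 19:
        return 2
    return 3  # 0 and everything not covered above


def english_plural_index(n):
    """English only distinguishes 'exactly one' from everything else."""
    return 0 if n == 1 else 1


# Stand-in word forms copied from the example above.
GAELIC_STYLE_FORMS = ("fhighter", "fhighter", "fighters", "fighter")
ENGLISH_FORMS = ("fighter", "fighters")


def count_fighters(n, gaelic_style=True):
    """Format a count with the word form its plural rule selects."""
    if gaelic_style:
        return "%d %s" % (n, GAELIC_STYLE_FORMS[gaelic_plural_index(n)])
    return "%d %s" % (n, ENGLISH_FORMS[english_plural_index(n)])
```

A format with a single singular/plural switch cannot express the four-way grouping, which is the reason the question about plural handling matters.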

Spellcheckers

Also, PO files can be used with tools that support hunspell, like Firefox or Virtaal. Personally, I will be translating into a minority language and will be grateful not to have to copy/paste everything into LibreOffice in order to gain access to a spellchecker.

Easy reference to the English source text

Seeing the source and translation string side by side greatly speeds up the process. There are only 2 people active in the Open Source community who are qualified to localize into my language. So, I will have to translate the complete game on my own and saving time will be essential.

With some translation projects, I found it helpful that I could easily see which line in the source the string had come from. Also, as has already been mentioned, programmers can add comments that will be picked up by gettext.

Instant testing of your translation in-game

There is one problem with the text2po conversion approach: How does the average translator who does not run a Linux machine test their translations in-game? If you'd just use gettext, easy-peasy, save as MO and drop your translation into LC_MESSAGES and Bob's your uncle.
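For anyone unfamiliar with the LC_MESSAGES convention, the sketch below shows the directory layout gettext-style lookups expect, using Python's gettext module to confirm that a dropped-in file is found. The locale root and the domain name passed in are placeholders; where the game would actually look is not decided in this thread.

```python
import gettext
import os


def install_catalog(locale_root, language, domain, mo_bytes):
    """Write a compiled catalog to <root>/<language>/LC_MESSAGES/<domain>.mo."""
    target_dir = os.path.join(locale_root, language, "LC_MESSAGES")
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, domain + ".mo")
    with open(path, "wb") as handle:
        handle.write(mo_bytes)
    return path


def locate_catalog(locale_root, language, domain):
    """Return the path gettext would load for this language, or None."""
    return gettext.find(domain, localedir=locale_root, languages=[language])
```

With this layout, testing a translation is a matter of saving the MO file from the editor into the right folder and restarting.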

Some notes concerning programming issues

If you're worried about strings being spread out too much, e.g. for a campaign, that would depend on the design of your campaign config files. Maybe you could chat to the devs at Wesnoth, who have a very workable solution, also for mods, using multiple gettext domains. If you're using XML, check out Intltool.

4) Simplification: In this case the dictionary .txt files are very simple, so it's difficult to make any errors. At any rate, C-style format strings should not be used inside translation strings - that's just bad design.

If you mean strings like "foo %s bar", they actually are the easiest way to adjust the word order in the target language, if a placeholder can't be avoided.
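A small sketch of why placeholders help with word order. It uses Python's named `%(name)s` placeholders; C printf offers numbered ones like `%1$s` for the same purpose. The message, the catalog dictionary and the function names are made up for illustration.

```python
# Hypothetical catalog: the German sentence needs the two names in reverse order.
CATALOG_DE = {
    "%(attacker)s attacks %(target)s": "%(target)s wird von %(attacker)s angegriffen",
}


def translate(msgid, catalog):
    """Return the translation, falling back to the source string."""
    return catalog.get(msgid, msgid)


def describe_attack(attacker, target, catalog):
    """Fill the placeholders after translation, so the translator sets the order."""
    template = translate("%(attacker)s attacks %(target)s", catalog)
    return template % {"attacker": attacker, "target": target}
```

Building the sentence by concatenating fragments in code would fix the English word order, while a whole-sentence format string leaves it to the translator.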

If you absolutely have to have your own format, talk to the OpenTTD people. They have a working format that does plural, gender and case handling, and they have a web translator as well.

Please let the user select their language in the GUI

And one final request, please do implement a language option in the menu. Detect Locale is fine for language pre-selection only, trust me on this.

Thanks for listening to my long ramble :)

Edited by GunChleoc

@GunChleoc: Don’t you worry, I think I’m covering all your points in my on-going implementation: https://github.com/Gallaecio/0ad (language selection is not there yet, but it will be, hopefully without the need to restart the game as well).

Once that is finished, if RedFox detects that this internationalization implementation is a bottleneck somewhere, affecting performance noticeably, anything we do to increase performance with translations will hopefully be without removing features for translators provided by the working implementation.


You are right - we do need something pinned. So, no need to apologise!

It would be good to have something written up though:

- How to add a language

- How to translate, including links to tools

- How to proofread

- How to test

- How to submit translations (Can translations be grabbed automatically from Transifex?)

I vote for restricting translations to team members on Transifex, so we can have some quality control. So, team coordinators can decide who has the necessary experience/skills to proofread.

We could keep this thread to discuss how we want to do things, and the new thread to help with the actual translation work.

Edited by GunChleoc

All that would be great once the implementation is in place. That is, it does not make much sense to document the internationalization and localization process while it does not apply to master but only to an external Git repository. For instance, the Transifex project will be different once/if using Transifex is approved by the development team. Also, I don't think the forums are the right place to write documentation. The right place for that would be the wiki, and I'm not sure if it's OK to write documentation about this just yet (although I should probably ask).

Translation is a different story. Anything you translate now in the current Transifex project you can keep later.

Now some answers:

- How to add a language: Request a translation team in the Transifex 0 A.D. (unofficial) project.

- How to translate, including links to tools: Transifex documentation.

- How to proofread: General translation documentation.

- How to test: Check out my github repository, i18n-gallaecio branch, and follow the 0 A.D. documentation to build and run the game.

- How to submit translations (Can translations be grabbed automatically from Transifex?) - They can, using Transifex's command-line client: http://support.transifex.com/customer/portal/topics/440187-transifex-client/articles (however, this is more interesting for developers once everything is in place; translators can just place the files for their language in the right folders manually).
