Multilanguage game using GNU gettext in C#

I’m currently creating a video game in C#/MonoGame. It started off in Czech, but I wanted it to be available in both Czech and English, so I set out to add multilanguage support to the game.

In this article, I’ll explain why I chose gettext and how I added it to my C# game.

Motivation

I knew I wanted this:

  • Translate UI elements such as buttons
  • Translate names and descriptions of in-game items
  • Translate dialogue lines

There are few translation systems that handle all three points well.

The first two lend themselves to systems where the source code contains an identifier and you have configuration files (Windows .ini, XML, JSON, whatever) for each language that assign translations to these identifiers. For example, the source code could have “DrawString(warnInsufficientGold)” and the configuration file would have a line with ‘warnInsufficientGold : “You don’t have enough gold.”’.

But this is a problem for dialogue lines. A game can have thousands of lines of dialogue and it’s a lot of useless effort to create identifiers for each line. They’d be confusing, too. This doesn’t read well:

AddDialogue(namePaladin, paladinHello);
AddDialogue(nameCivilian, civilianAsksForHelp);
AddDialogue(namePaladin, paladinOffersHelp);

But this is better:

AddDialogue("Arthur", "Greetings, citizen. What troubles you?");
AddDialogue("Civilian", "There’s… ogres! They’ve entered the village!");
AddDialogue("Arthur", "Say no more, and point the way! My steed and I will come to your aid.");

The other approach would be to use some domain-specific language for the dialogue, some kind of script files. The script file would contain dialogue, or potentially even dialogue trees, as pure text, with no identifiers. Such a file would be easy to read, because the dialogue is right there, and quick to write.

But then either each language would have to have a separate script file, in which case synchronization might be difficult, or all languages would be in the same script file, and it wouldn’t be easy to read and edit (without specialized tools).

So I chose the gettext approach. One translation (the canonical translation) lives in the source code, and for each other language you maintain a translation file that maps every canonical string to its translation in that language. In the source code, all translatable strings are passed to a function, traditionally named gettext or _ (underscore), though any name can be used. The gold warning example would then be:

DrawString(_("You don’t have enough gold."));

and the dialogue example would be:

AddDialogue(_("Arthur"), _("Greetings, citizen. What troubles you?"));
AddDialogue(_("Civilian"), _("There’s… ogres! They’ve entered the village!"));
AddDialogue(_("Arthur"), _("Say no more, and point the way! My steed and I will come to your aid."));
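
To make the idea concrete, here is a minimal sketch of what a _ function does at runtime. It is not how any particular gettext library is implemented: the class name MiniGettext and the hard-coded Czech entries are made up for illustration, and a real library would load the translations from a translation file instead.

```csharp
using System.Collections.Generic;

public static class MiniGettext
{
    // Hypothetical in-memory translations, keyed by the string found in the source code.
    public static readonly Dictionary<string, string> Translations = new Dictionary<string, string>
    {
        { "Arthur", "Artuš" },
        { "Greetings, citizen. What troubles you?", "Buď zdráv, občane. Co tě trápí?" },
    };

    // Returns the translation if one exists, otherwise the source string itself.
    public static string _(string source)
    {
        return Translations.TryGetValue(source, out string translated) ? translated : source;
    }
}
```

With this sketch, MiniGettext._("Arthur") returns the translated name, while a string that has no entry yet, such as "Civilian", simply comes back unchanged. That fallback is what lets you write dialogue first and translate it later.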

How to use gettext with C#

Here’s how gettext works:

1. Translatable strings. First, you choose the method names that count as gettext or _. For my game, I chose the name “T” because it’s quick to type and doesn’t require the use of the Shift key as an underscore does. You can choose several method names.

In C#, methods need to be in classes, so I created a class named “G” (so that it would be as short as possible and reasonably close to T on a keyboard). Then wherever I need a translatable string, I use G.T("English here…"). It stands for GetText :).

2. Create the G class and its T method. In the G class, you must load the translation file based on some logic (perhaps based on what the user chose in the Options menu), and the G.T method must return the string from that translation. Gettext keeps machine-readable translations in .mo files, which you can load into C# with some NuGet packages. I used GetText.NET, and you can find information on how to write your G.T method with GetText.NET in their readme.
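
A rough sketch of such a G class could look like the following. To keep it independent of any library, the lookup is passed in as a delegate. The UseTranslator method is a name I made up for this sketch, and the actual GetText.NET calls you would wrap in the delegate are described in that library's readme rather than shown here.

```csharp
using System;

public static class G
{
    // The active lookup function. By default, strings pass through untranslated,
    // so the text in the source code acts as the fallback.
    private static Func<string, string> translate = source => source;

    // Hypothetical hook: call this when the player picks a language, for example
    // from the Options menu. The lookup would typically forward to a catalog
    // that was loaded from a .mo file.
    public static void UseTranslator(Func<string, string> lookup)
    {
        translate = lookup ?? (source => source);
    }

    // Marks a string as translatable and returns its translation at runtime.
    public static string T(string source)
    {
        return translate(source);
    }
}
```

Keeping the lookup behind a delegate means the rest of the game only ever sees G.T, so switching languages at runtime, or swapping the .mo reader for a different one, does not touch any call site.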

Now, you can change the strings in your code into G.T translatable strings.

3. Extract strings from source code. Each time you change or add translatable strings in your source code and want to update the translations, you run the xgettext utility. I got it by downloading the NuGet package GetText.Tools, going into my local NuGet cache and copying the binaries from there to a folder on my %PATH%, but you can also download the binaries online.

As arguments to xgettext.exe, you pass the method names, the list of source files that you want translatable strings to be extracted from, and the name of the resulting .pot file which will contain the canonical translation.

Here’s my command line:

xgettext.exe -kT --from-code=utf-8 -o SeekAWayOut.pot --files-from=~allCSharpFiles.txt

where ~allCSharpFiles.txt is a generated list of all C# files in my repository and T is the method name I chose. xgettext then scans all of those files and finds all calls to methods and constructors with the name T (regardless of what class they are in) and considers the first argument of those calls to be the translatable string. It only reads the source text and never compiles or runs the program, so it can only pick up arguments that are written as string literals.
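
The following sketch shows what this means for the way you write calls. The G class here is only a stand-in so that the example compiles on its own, and the method names in DialogueExamples are invented for illustration.

```csharp
public static class G
{
    // Stand-in for the real lookup so that this example compiles on its own.
    public static string T(string source) => source;
}

public static class DialogueExamples
{
    public static string Greeting()
    {
        // Extracted: the first argument is a string literal that xgettext can see.
        return G.T("Greetings, citizen. What troubles you?");
    }

    public static string FromVariable(string line)
    {
        // Not extracted: xgettext cannot know which strings this variable will hold.
        // The lookup still works at runtime, but only if the same text is also
        // marked with a literal somewhere else in the code.
        return G.T(line);
    }

    public static string NeedGold(int missing)
    {
        // Keep the whole sentence in one literal with a placeholder, rather than
        // gluing translated fragments together, so translators can reorder words.
        return string.Format(G.T("You need {0} more gold."), missing);
    }
}
```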

The source code is thus your authoritative canonical translation and the fallback if there are no translations.
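
The list file passed to xgettext can be generated in many ways. As one possible sketch, here is a small C# helper that writes one path per line. The class name SourceFileLister and the decision to skip the bin and obj folders are my own assumptions, so adjust them to your project layout.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SourceFileLister
{
    // Writes the path of every .cs file under repositoryRoot to listPath,
    // one per line, and returns how many files were listed.
    public static int WriteList(string repositoryRoot, string listPath)
    {
        List<string> files = Directory
            .EnumerateFiles(repositoryRoot, "*.cs", SearchOption.AllDirectories)
            // Guard against patterns matching longer extensions on some runtimes.
            .Where(path => Path.GetExtension(path).Equals(".cs", StringComparison.OrdinalIgnoreCase))
            // Skip build output folders (an assumption about the project layout).
            .Where(path => !IsBuildOutput(path.Substring(repositoryRoot.Length)))
            .OrderBy(path => path, StringComparer.Ordinal)
            .ToList();

        File.WriteAllLines(listPath, files);
        return files.Count;
    }

    private static bool IsBuildOutput(string relativePath)
    {
        string[] parts = relativePath.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        return parts.Contains("bin", StringComparer.OrdinalIgnoreCase)
            || parts.Contains("obj", StringComparer.OrdinalIgnoreCase);
    }
}
```

Calling SourceFileLister.WriteList(".", "~allCSharpFiles.txt") from the repository root would produce a list suitable for the command line above.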

4. Get the language file to work on. The first time you add a new language, you copy the template .pot file into a new file named, say, slovak.po for the Slovak translation, and you can then edit that file in an editor such as Poedit.

If you already have a slovak.po file, perhaps because you already translated the game but have now added/changed translatable strings, you will want to merge the new changes into your existing slovak.po file. You can do that from Poedit as well (Catalogue – Update from POT file) or with the GNU command-line utility msgmerge, for example: msgmerge --update slovak.po SeekAWayOut.pot

5. Translate the text. I’m using Poedit as my translation tool. I find it pretty, fast to work with, stable and intuitive.

Poedit’s free edition is very reasonable. It’s not to be confused with POEditor, which comes from a different author and has a very limited free edition.

6. Create the .mo file. From your translated slovak.po file, you can create a binary “machine object” file called slovak.mo, again using Poedit, or with the command-line utility msgfmt (for example, msgfmt -o slovak.mo slovak.po). Poedit creates the file automatically each time you save. This file contains the same translations as the .po file, but without the comments and in a layout that is faster for a computer to read.

7. Have the .mo file read by your game. Coming back to the beginning, this .mo file can now be read from disk or assembly resources or anywhere you want by your runtime .mo file reader, such as GetText.NET.
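
You don’t need to write a reader yourself, but seeing one can demystify the file. Below is a minimal sketch of a .mo reader based on the file layout described in the GNU gettext manual. It is not how GetText.NET is implemented, and it deliberately assumes a little-endian, UTF-8 encoded file on a seekable stream and ignores plural forms and message contexts. The class name MoFileSketch is made up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class MoFileSketch
{
    private const uint Magic = 0x950412de;

    // Reads all source/translation pairs from a little-endian .mo stream.
    public static Dictionary<string, string> Read(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            if (reader.ReadUInt32() != Magic)
                throw new InvalidDataException("Not a little-endian .mo file.");
            reader.ReadUInt32();                          // format revision, ignored here
            int count = (int)reader.ReadUInt32();         // number of strings
            uint originalsTable = reader.ReadUInt32();    // offset of the source string table
            uint translationsTable = reader.ReadUInt32(); // offset of the translation table

            var result = new Dictionary<string, string>(count);
            for (int i = 0; i < count; i++)
            {
                string source = ReadEntry(reader, originalsTable + (uint)i * 8);
                string translation = ReadEntry(reader, translationsTable + (uint)i * 8);
                result[source] = translation;
            }
            return result;
        }
    }

    // Each table entry is a (length, offset) pair pointing at the string bytes.
    private static string ReadEntry(BinaryReader reader, uint entryOffset)
    {
        reader.BaseStream.Seek(entryOffset, SeekOrigin.Begin);
        int length = (int)reader.ReadUInt32();
        uint offset = reader.ReadUInt32();
        reader.BaseStream.Seek(offset, SeekOrigin.Begin);
        return Encoding.UTF8.GetString(reader.ReadBytes(length));
    }
}
```

In a file produced by msgfmt, the entry with the empty source string usually holds the header metadata from the .po file, so a lookup function built on this dictionary would skip it.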

And you’re done!

Conclusion

I think that gettext is a good way to write translatable multi-language programs for many use cases, especially in games or wherever there are a lot of strings (as opposed to, for example, software libraries where you only really need to translate error messages).

I found it difficult at first to wrap my head around some concepts, especially where the _ method comes from and how gettext extracts the translatable strings from C# files (using text scanning and the xgettext utility), so I wrote this blog post to make this easier for others.

My experience with gettext is positive and I can recommend it to you as well.

Author

Petr Hudeček

Hi, I’m the webmaster of this site.
