Created June 22, 2012 00:02
Translation, Internationalization and Localization in OpenStack

OpenStack is committed to broad international support, and as such there must be an ongoing concern with making OpenStack usable for all audiences. This includes proper use of internationalization and localization tools by developers, and high-quality translations for both user-facing messages and documentation.

Let's start with a working definition: translation is the act of taking the written materials in one language and converting them into another language in the most meaningful way possible. In terms of OpenStack, translation happens on both the written documentation and on strings marked for translation in the projects' codebases.


For information on how to prepare your code or documentation for translation, see the section on internationalization below.

OpenStack has adopted Transifex as its translation management platform of choice. Transifex's overall selection of management tools, features, community support, solid GitHub integration and proven track record made it the winning candidate in the end.

The following sections should introduce you to the core concepts for contributing translations to OpenStack.

The OpenStack Project Hub is the starting point for OpenStack on Transifex. It tracks all the registered OpenStack projects and allows for coordinated management and shared translation capabilities (such as shared language teams and shared translation memory).

At present the OpenStack project is a "free for all" where anyone can contribute any translation without officially joining a translation team, so feel free to jump right in and start contributing.

Within the OpenStack project hub, each individual project can manage its own translation resources. Setting up new projects is fairly simple, and the goal is to have all core OpenStack projects fully integrated into the internationalization process.

Translation is most efficiently done right on Transifex's site. You don't need to download any files or applications to get started.

If your language already exists, simply click on it in the list of available languages for OpenStack, and then click on the name of a project resource which you'd like to start working on. In the modal dialog that appears, click the "Translate Now" button to begin working.

If the language you wish to contribute to doesn't already exist, you'll need to navigate to a specific project's resource files to click the "add language" button. Clicking the "Explore" link in the navigation bar, switching to the "Projects" tab, and filtering on the tag "openstack" will get you a list of all the available OpenStack projects. From the project you wish to work on, click the "Resources" tab on the project page, select a resource (if there's more than one), and then click the "add language" button on that page.

Once you're satisfied with your translations, click the "Save all" or "Save and exit" button to finalize your contributions.

If you wish to download the translation files (.po files) you can do so by selecting the language you're interested in, then clicking on the name of the project resource you wish to download. In the modal dialog which appears, you can use any of the "download" options depending on your use case.

One of the most challenging aspects of managing translations in an Open Source project is handling the interplay between translators and developers during the release cycle. The key piece of this equation is the "string freeze".


OpenStack's string freeze happens at the close of the final milestone in the development cycle, giving translators the entire RC period to update translations.

At a predefined time during the release cycle there will be a "string freeze", which means that after this point strings marked for translation in the codebase can no longer be changed except in the case of critical-priority bugs.

Once the string freeze is in effect, the translation files in Transifex can be assumed to be static, and translation efforts should proceed in full force. Translation can still happen at any other time, but during the development process strings may change and translation effort may end up being wasted.

Any changes during the RC period should be carefully vetted to ensure they do not alter or add translation strings, or else coordinated with translators to ensure that changes are handled appropriately.
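One way to vet a change during the freeze is to compare the message IDs in the frozen translation template against those extracted from the candidate branch. The following is a minimal sketch, not part of any OpenStack tooling: the function names are hypothetical, and the regular expression only handles single-line msgid entries (multi-line entries and msgid_plural are ignored), so a real check would be better served by a proper PO parser.

```python
import re

# Matches single-line entries such as: msgid "Hello"
# (hypothetical helper; multi-line msgids are not handled)
MSGID_RE = re.compile(r'^msgid "(.*)"$', re.MULTILINE)


def extract_msgids(pot_text):
    """Return the set of non-empty single-line msgids in a .pot file's text."""
    # The empty msgid is the catalog header, so it is skipped.
    return {msgid for msgid in MSGID_RE.findall(pot_text) if msgid}


def freeze_violations(frozen_pot, candidate_pot):
    """Report msgids added or removed relative to the frozen template."""
    frozen = extract_msgids(frozen_pot)
    candidate = extract_msgids(candidate_pot)
    return {
        "added": sorted(candidate - frozen),
        "removed": sorted(frozen - candidate),
    }
```

A changed string shows up as one removal plus one addition, so an empty result in both lists suggests the change is safe for translators.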

The OpenStack Infrastructure team has set up automatic generation of reviews for translations so that they can be re-incorporated with minimal effort at any time.

Most importantly though, immediately prior to the release of each Release Candidate, and before cutting the Final Release for each version, the translation files should be merged back into their respective projects to make sure they are properly distributed with the release.

At present it is the responsibility of each project's PTL or appointed translation manager to make sure this happens, though OpenStack's release managers, translation team coordinators, etc. are also encouraged to help ensure that this happens smoothly.

At present, changes to translations will not be backported to stable release branches. Doing so would require maintaining wholly separate copies of each set of translations and would massively increase the burden on translators.

The term internationalization is used to broadly describe coding practices that allow software to be adapted to the linguistic and technical differences of various regions. This includes practices such as marking strings for translation, supporting non-ASCII character sets, etc.

For most of the OpenStack core projects (and any that use Python), the preferred tools for internationalization are gettext and babel. Getting started is pretty easy:

  1. Enable the gettext module everywhere in your application by installing it in the root file like so:

    import gettext
    # The unicode flag is Python 2 only; Python 3 removed it, so omit it there.
    gettext.install('<project name>', unicode=1)

    This makes the _() function available everywhere in your codebase as a shortcut to mark strings for translation.

  2. Configure your project to use Babel to easily create your translation files. First, add Babel to your pip-requires file (or wherever you track dependencies). Second, create a babel.cfg file in the root of your project; at its simplest it can just contain this line:

    [python: **.py]

    Finally, add the following to your setup.cfg file:

    [extract_messages]
    keywords = _ gettext ngettext l_ lazy_gettext
    mapping_file = babel.cfg
    output_file = <project name>/locale/<project name>.pot

    That will allow you to run python setup.py extract_messages and have it automatically generate the base translation resource file for your project.

  3. In your codebase, mark user-facing strings (API messages, etc.) for translation by wrapping them with the underscore function like so:

    my_internationalized_string = _("I'm internationalized!")

    You can use internationalized strings anywhere that unicode is safe.

  4. Once you've marked all your strings for translation, you can use the aforementioned extract_messages command to generate the base translation resource file. That file can then be uploaded to Transifex to serve as the basis for your project's translation efforts.
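As a rough illustration of the marking step, here is a minimal sketch that uses an explicit translation object instead of gettext.install, which makes it easier to try in isolation. The domain name "myproject", the "locale" directory, and both function names are hypothetical placeholders. With fallback=True the original English strings pass through unchanged when no compiled catalog is found.

```python
import gettext

# "myproject" and the "locale" directory are hypothetical placeholders.
# fallback=True returns NullTranslations when no compiled catalog (.mo) is
# found, so the original English strings pass through unchanged.
catalog = gettext.translation(
    "myproject", localedir="locale", languages=["en"], fallback=True
)
_ = catalog.gettext
ngettext = catalog.ngettext


def quota_message(used, limit):
    # Named placeholders let translators reorder values for their language.
    return _("Using %(used)d of %(limit)d instances") % {
        "used": used,
        "limit": limit,
    }


def deleted_message(count):
    # ngettext selects the singular or plural form based on count.
    return ngettext(
        "Deleted %(count)d volume", "Deleted %(count)d volumes", count
    ) % {"count": count}
```

Interpolating after the call to _() matters: the extractor sees the literal template string, and translators receive the placeholders rather than already-formatted values.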

Once the project is set up with Transifex, the OpenStack Infrastructure team can make sure that any changes to translation files are automatically pushed to Transifex in real-time.

Django has built-in internationalization tools that go well beyond the basics of gettext to ensure proper unicode support throughout the entire codebase and to make advanced features more accessible. As such, Horizon uses Django's family of ugettext functions from django.utils.translation. It is preferable to explicitly import the translation function you wish to use:

from django.utils.translation import ugettext, ugettext_lazy  # ..., etc.

For more information on the internationalization tools Django makes available, see the Django i18n Docs.

While developer documentation for projects can generally be maintained solely in English, user-oriented documentation such as that produced and maintained by OpenStack's Docs team is also a high priority for translation. This includes installation and administration manuals.


Typically these manuals are sourced in the openstack-manuals project. For the first release this does not include API documentation.

For specifics on translation of OpenStack Documentation, please refer to the OpenStack Document Translation Guide.

At present the convention is to translate all user-facing strings. This means API messages, CLI responses, documentation, help text, etc.

There has been a lack of consensus about the translation of log messages; the current ruling is that while it is not against policy to mark log messages for translation if your project feels strongly about it, translating log messages is not actively encouraged.

Exception text should not be marked for translation, because if an exception occurs there is no guarantee that the translation machinery will be functional.

The term localization is used more specifically than internationalization to cover coding practices that allow a software's input and output characteristics to adjust to variances in style from region to region. This especially includes number and date formatting.

Going beyond what is accomplished by internationalization, the most important aspect to consider is regional differences in formatting for dates and numbers. For example:

    04/01/2012 == April 1st, 2012 (US)
    04/01/2012 == January 4th, 2012 (UK)

    1,000.42 == One thousand and 42 hundredths (US)
    1.000,42 == One thousand and 42 hundredths (Germany)

Accepting any format and naively passing it into our code would horribly break things. Accepting only one format leaves out large chunks of the world. Therefore, we use localization tools to accept these formats and normalize them into data structures Python can handle universally on input, and to convert them back to the user's expected format for display.
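The normalize-on-input, format-on-output idea can be sketched in plain Python as follows. This is an illustration only: the FORMATS table and the function names are hypothetical and hand-written for three locales, whereas a real project would rely on a maintained locale database such as the one shipped with Babel or Django.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical, hand-written conventions for illustration only.
FORMATS = {
    "en_US": {"date": "%m/%d/%Y", "thousands": ",", "decimal": "."},
    "en_GB": {"date": "%d/%m/%Y", "thousands": ",", "decimal": "."},
    "de_DE": {"date": "%d.%m.%Y", "thousands": ".", "decimal": ","},
}


def parse_date(text, locale_name):
    """Normalize a localized date string into a datetime.date."""
    return datetime.strptime(text, FORMATS[locale_name]["date"]).date()


def parse_number(text, locale_name):
    """Normalize a localized number string into a Decimal."""
    fmt = FORMATS[locale_name]
    plain = text.replace(fmt["thousands"], "").replace(fmt["decimal"], ".")
    return Decimal(plain)


def format_number(value, locale_name):
    """Render a Decimal with the locale's separators and two decimals."""
    fmt = FORMATS[locale_name]
    us_style = f"{value:,.2f}"  # e.g. 1,000.42
    # Swap separators through a temporary marker so they do not collide.
    return (
        us_style.replace(",", "\0")
        .replace(".", fmt["decimal"])
        .replace("\0", fmt["thousands"])
    )
```

The same input string, 04/01/2012, parses to two different dates depending on the locale passed in, which is why the locale has to travel with the input rather than being guessed from it.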

Another less common (for OpenStack) issue related to localization revolves around name formats, which vary culturally. The western style of "first name" and "last name" doesn't fit for many cultural naming conventions. This isn't something a software tool can account for, so for problems such as these the best solution is to simply accept the broadest range of inputs (e.g. a single "name" field).

Horizon has excellent localization tools available since it is built on top of the Django web framework. Most conversions happen automatically when the localization framework is active. Full support for a localized user dashboard experience is a high-priority feature.

Python's locale and gettext modules offer most of the tools necessary to localize a Python project with some effort. More information on this will be added in the future.
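As a small taste of what the locale module provides, the sketch below parses a number using a named locale's conventions. The function name is hypothetical, and the code assumes the requested locale may not be installed on the host, in which case it falls back to the portable "C" locale. Note that locale.setlocale changes process-wide state and is not thread-safe, which is part of the "some effort" mentioned above.

```python
import locale


def parse_localized_float(text, locale_name="C"):
    """Parse text as a float using the numeric rules of locale_name."""
    # Remember the current setting so it can be restored afterwards.
    previous = locale.setlocale(locale.LC_NUMERIC)
    try:
        try:
            locale.setlocale(locale.LC_NUMERIC, locale_name)
        except locale.Error:
            # The requested locale is not installed; use the portable default.
            locale.setlocale(locale.LC_NUMERIC, "C")
        return locale.atof(text)
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)
```

On a host with German locale data installed, passing a name such as "de_DE.UTF-8" would let the same function accept 1.000,42, but the available locale names vary by system.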
