
@Danack
Last active November 2, 2015 16:43
translating stuff.

Translation in code - How to not fuck it up.

Translation is one of those things that sounds trivial to do, but can be fucked up if not done properly. The approach below is the right one for small- to medium-sized projects. It might not be appropriate for enterprise-level projects, where the same text needs to be used across multiple projects.

What the correct approach looks like

function translate($text, array $values, $languageID) {}

where:

$text is the string identifier of the text to be translated. Any parameters in it are wrapped in % signs, or marked with some similarly simple named scheme. $values is an associative array of the parameters to be used in the translation. $languageID is the desired output language.

If the $text is unknown, any values are missing, or there is no translation available for the requested language, an exception should be thrown.
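A minimal sketch of what a function with that contract could look like. The in-memory catalogue, the `Language` constant values and the `TranslationException` class are assumptions made up for illustration. The Bulgarian string is the one quoted elsewhere in this document, and a real catalogue would live in files or a database.

```php
<?php

// Hypothetical language identifiers; the values are an assumption.
final class Language
{
    const English = 'en';
    const Bulgarian = 'bg';
}

// Hypothetical exception type for every translation failure.
class TranslationException extends RuntimeException {}

function translate($text, array $values, $languageID)
{
    // Hypothetical catalogue: text identifier => language => template.
    $catalogue = [
        'You have %CREDITS_LEFT% left.' => [
            Language::English => 'You have %CREDITS_LEFT% left.',
            Language::Bulgarian => 'Имате %CREDITS_LEFT% кредита останалите.',
        ],
    ];

    if (!isset($catalogue[$text])) {
        throw new TranslationException("Unknown text: '$text'");
    }
    if (!isset($catalogue[$text][$languageID])) {
        throw new TranslationException("No '$languageID' translation for: '$text'");
    }

    // Replace each %NAME% placeholder, failing loudly if its value is missing.
    return preg_replace_callback(
        '/%([A-Z_]+)%/',
        function (array $match) use ($values, $text) {
            if (!array_key_exists($match[1], $values)) {
                throw new TranslationException("Missing value '{$match[1]}' for: '$text'");
            }
            return (string) $values[$match[1]];
        },
        $catalogue[$text][$languageID]
    );
}

// prints "You have 25 left."
echo translate('You have %CREDITS_LEFT% left.', ['CREDITS_LEFT' => 25], Language::English), "\n";
```

All three failure cases throw rather than returning the untranslated text, so a missing translation shows up in tests instead of in front of a user.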

Example usage

   $params = ['CREDITS_LEFT' => 25];
   translate("You have %CREDITS_LEFT% left.", $params, Language::English);

For ease of use in templates, this can be encapsulated into

$tn = function ($text, $params = [], $language = null) use ($user) {
    // Default to the current user's language unless the caller overrides it.
    if ($language === null) {
        $language = $user->getLanguage();
    }

    return translate($text, $params, $language);
};

When the translators do the translation, the placeholders should be left in place, e.g. "Имате %CREDITS_LEFT% кредита останалите." This keeps the workflow for translators sane, as there is no ambiguity about what parameter order is going to be used.

Translators also need to be able to specify different pieces of text to be used for different values of the parameters.

e.g. when %CREDITS_LEFT% == 0, "You have no credits left"; otherwise, "You have %CREDITS_LEFT% credits left".
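One way this could be stored and resolved, as a rough sketch. The entry shape (`selector`, `exact`, `default`) and the `renderVariant()` name are made up for illustration and are not taken from any library.

```php
<?php

// Hypothetical shape for one translated entry: the translator names the
// parameter that drives selection, lists texts for exact values, and
// gives a default text for everything else.
$entry = [
    'selector' => 'CREDITS_LEFT',
    'exact'    => ['0' => 'You have no credits left'],
    'default'  => 'You have %CREDITS_LEFT% credits left',
];

function renderVariant(array $entry, array $values)
{
    $selector = $entry['selector'];
    if (!array_key_exists($selector, $values)) {
        throw new InvalidArgumentException("Missing value '$selector'");
    }

    // An exact match on the value wins; otherwise use the default text.
    $key = (string) $values[$selector];
    $template = isset($entry['exact'][$key]) ? $entry['exact'][$key] : $entry['default'];

    // Substitute %NAME% placeholders from the associative array.
    $replacements = [];
    foreach ($values as $name => $value) {
        $replacements['%' . $name . '%'] = (string) $value;
    }
    return strtr($template, $replacements);
}

echo renderVariant($entry, ['CREDITS_LEFT' => 0]), "\n";  // You have no credits left
echo renderVariant($entry, ['CREDITS_LEFT' => 25]), "\n"; // You have 25 credits left
```

Exact-value matching only covers the simple case. Languages with several plural forms need range or category rules on top of this.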

Annoyingly, some languages need more advanced text selection than that. I suggest looking at the Symfony/Twig docs for the rules they have implemented.

Why is it right to use strings as the identifiers rather than consts or TextIDs?

Using strings as the identifiers allows the programmer who is using the translation library to see how the string will be used more clearly than a const like LANGUAGE_CREDITS_REMAINING would.

Additionally it allows for strings to be added without a PHP file being edited. It should be possible for a frontend HTML monkey to be able to add a new piece of text to a webpage without getting a backend developer involved in the process.

For the record, I believe that templates should be unit-tested completely, and my templating library (www.github.com/danack/jig) easily supports that. Once you have the templates being unit-tested in every language your app supports, there is no longer any benefit in using const IDs, only the downsides remain.

Why put the parameters in associative arrays? aka Why not extra parameters or stuff like "You have $1 left"

Putting the required parameters in an associative array removes a whole class of bugs where a programmer puts the parameters in the wrong order.

It also makes life waaaaaaaaaaaaaaaaaaay easier for people who are actually doing the translations. There are multiple languages where the order of parameters changes in a translation. It also allows smooth migration from one piece of text to another. For example, say we have a bit of text:

"You have used %PERCENTAGE_USED% of your credits"

and want to change it to:

"You have %PERCENTAGE_REMAINING% of your credits remaining"

You'll almost certainly get the English version done straight away (or whatever your native language is). For the other languages it can take days or weeks for the translation to be done. You don't want to have your release process dependent on getting those translations done.
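A rough sketch of both points. The `fillPlaceholders()` helper and the sender/recipient texts are made up for illustration. Passing both percentages during the changeover is one reading of how the migration could work, not something spelled out above, and the helper skips the missing-value check for brevity.

```php
<?php

// Minimal named-placeholder substitution. Values the template does not
// mention are simply ignored. (No missing-value check, for brevity.)
function fillPlaceholders($template, array $values)
{
    $replacements = [];
    foreach ($values as $name => $value) {
        $replacements['%' . $name . '%'] = (string) $value;
    }
    return strtr($template, $replacements);
}

// Word order: a translation can move the placeholders around freely.
// The second text stands in for a language with a different word order.
$people = ['SENDER' => 'Alice', 'RECIPIENT' => 'Bob'];
echo fillPlaceholders('%SENDER% sent %RECIPIENT% a request', $people), "\n";
echo fillPlaceholders('%RECIPIENT% got a request from %SENDER%', $people), "\n";

// Migration: the caller supplies both values while translations catch up.
$values = ['PERCENTAGE_USED' => '40%', 'PERCENTAGE_REMAINING' => '60%'];
$oldText = 'You have used %PERCENTAGE_USED% of your credits';
$newText = 'You have %PERCENTAGE_REMAINING% of your credits remaining';

echo fillPlaceholders($oldText, $values), "\n"; // a language still on the old wording
echo fillPlaceholders($newText, $values), "\n"; // a language already updated
```

With positional parameters like $1, neither case works without changing the calling code at the same moment as the text.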

Having a separate value object for each bit of text that needs translating would work just as well, e.g. new CreditsLeftParams(25); - but creating one object per bit of translated text would be a massive pain in the arse. I probably would recommend doing this for an enterprise-level solution.
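For comparison, a sketch of what such a value object might look like. Only the class name comes from the text above. The validation rule and the `toArray()` method are assumptions.

```php
<?php

// Hypothetical value object: one class per translatable text, so a
// missing or badly typed parameter is caught at construction time.
final class CreditsLeftParams
{
    private $creditsLeft;

    public function __construct($creditsLeft)
    {
        // Assumed rule: credits are a non-negative integer.
        if (!is_int($creditsLeft) || $creditsLeft < 0) {
            throw new InvalidArgumentException('creditsLeft must be a non-negative integer');
        }
        $this->creditsLeft = $creditsLeft;
    }

    // The same associative array the array-based approach passes by hand.
    public function toArray()
    {
        return ['CREDITS_LEFT' => $this->creditsLeft];
    }
}

$params = new CreditsLeftParams(25);
print_r($params->toArray());
```

The trade-off is visible even in this tiny case: about twenty lines of class for what the array approach does in one.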

Why always require the language needed?

Although the vast majority of the text in an application will be for the current user, you will have cases where the language will be different. For example, on a social site where someone sends a 'friend' request to another user, any text generated should be in the recipient's language, not the sender's.
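A sketch of the friend-request case. The `User` class, the language codes, the notification text and the stub `translate()` with its two-entry catalogue are all assumptions made up for illustration.

```php
<?php

// Hypothetical user with a display name and a preferred language.
final class User
{
    private $name;
    private $language;

    public function __construct($name, $language)
    {
        $this->name = $name;
        $this->language = $language;
    }

    public function getName() { return $this->name; }
    public function getLanguage() { return $this->language; }
}

// Stub translate() with a tiny hard-coded catalogue.
function translate($text, array $values, $languageID)
{
    $catalogue = [
        '%SENDER% sent you a friend request.' => [
            'en' => '%SENDER% sent you a friend request.',
            'fr' => "%SENDER% vous a envoyé une demande d'ami.",
        ],
    ];
    if (!isset($catalogue[$text][$languageID])) {
        throw new RuntimeException("No '$languageID' translation for: '$text'");
    }
    $replacements = [];
    foreach ($values as $name => $value) {
        $replacements['%' . $name . '%'] = (string) $value;
    }
    return strtr($catalogue[$text][$languageID], $replacements);
}

function friendRequestNotification(User $sender, User $recipient)
{
    // The recipient reads this text, so their language is the one to pass.
    return translate(
        '%SENDER% sent you a friend request.',
        ['SENDER' => $sender->getName()],
        $recipient->getLanguage()
    );
}

$alice = new User('Alice', 'en');
$benoit = new User('Benoit', 'fr');

// prints "Alice vous a envoyé une demande d'ami."
echo friendRequestNotification($alice, $benoit), "\n";
```

If translate() silently picked up the current user's language, this notification would go out in Alice's English and nobody would notice until a French speaker complained.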
