@coldfusionPaul
Created September 25, 2012 01:16
coldfusion G11N

The Basics of Building a Globalized ColdFusion Application

What is Globalization?

The process of making an application ready for global use is globalization, or G11N (for the 11 letters between the G and the N in globalization). Globalization consists of two steps: internationalization, or I18N (for the 18 letters between the I and the N in internationalization), and localization, or L10N (for the 10 letters between the L and the N in localization). If you're sensing a pattern here, yes, there is one: people working in this field are particularly fond of numeronyms. The atomic unit of globalization is the locale, and locales are the most important piece of G11N.

Locales

A locale is the set of conventions specific to a geographic region: language and calendar; date, number, and currency formatting; spelling; writing system direction; and so on. For instance, the English (color) and date formats (month/day/year) used in Brooklyn are not exactly the same as the English (colour) and date formats (day/month/year) used in Perth.

Since version 7, ColdFusion locales have been based on core Java locales, which means that the locale information concerning formatting, calendars, etc. is an integral part of the underlying Java platform and vetted by the Unicode Consortium, so you do not have to research it yourself. Locale resources are accessed via locale identifiers in the form language_COUNTRY, such as en_US for English as used in the United States or fr_CA for French as used in Canada. Take note of the case: as in all things related to core Java, it matters.

Internationalization (I18N)

The process of I18N is the first step in making your application globalized. It consists of stripping away any hard-coded text and any date, time, and number formatting that tie the application to a specific locale, and replacing these with variables or functions that can dynamically return locale-appropriate content or formatting. This allows the application to be adapted to various locales without major engineering changes. Considering the amount of work this can involve, it is best done from the outset rather than after the fact (though it's quite common for applications to undergo G11N well after development has completed).

Localization (L10N)

Once an application has undergone I18N, the next step is making it ready for use in a given locale. This is done by translating all of the application's text into the languages that your application will support and by using the appropriate locale-based functions to format dates, numbers, currency, etc. These functions are usually prefixed with "LS", such as LSdateFormat. You can find a reference for all ColdFusion functions and tags related to globalization here: http://adobe.ly/UrOesD

Note that locales need to be considered during the translation process as well; doing so helps fine-tune the content for a specific region.

Simple I18N example

Let's say you have an existing web application dealing with appointments, with something like the following block of code:

<cfoutput>You have the following appointments for #dateFormat(now())#:</cfoutput>

<ul>
<cfoutput query="todaysAppointments">
<li>#appointmentWith#: #timeFormat(appointment)#</li>
</cfoutput>
</ul>

After I18N that would look like the following:

<cfprocessingdirective pageencoding="UTF-8"/>
<cfset setLocale(session.locale) />
<cfoutput>
#appointmentsRB[session.locale].appointmentsText#
#LSdateFormat(now(), "FULL")#:
</cfoutput>

<ul>
<cfoutput query="todaysAppointments">
    <li>#appointmentWith#: #LStimeFormat(appointment,"MEDIUM")#</li>
</cfoutput>
</ul>

Where:

  • <cfprocessingdirective pageencoding="UTF-8" /> identifies the character encoding used on this page, UTF-8.

  • session.locale is a session variable holding the user's preferred locale, usually set at login or application initialization.

  • setLocale(session.locale) tells ColdFusion which locale this page will use.

  • appointmentsRB is a resource bundle (see below), stored as a ColdFusion structure keyed by locale, that holds the application's translated text.

  • appointmentsText is the structure value holding the translated text for "You have the following appointments for".

Some things to consider when building a globalized application

Locales

Since your entire globalized application flows from a user's locale, it is critical that the locale be captured and stored for use throughout the application. The easiest way of capturing a user's locale is simply to ask for it. For example, if your application requires registration, ask for it then. As a tip, it's usually a good idea to display each locale choice in its own language: French in French, Thai in Thai, etc.

If this isn't possible, a stealthier approach is to examine the user's browser-provided data, such as http_accept_language, which is accessible via the CGI scope. You can also supplement this by examining the user's IP address and looking up their country with one of the many online geolocation services. Even so, it's good practice to also give users a way to change their locale manually.
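As a rough sketch of the browser-based approach, the helper below turns an Accept-Language header into one of the locales an application supports. The function name pickLocale, its arguments, and the list of supported locales are all assumptions made for illustration; it also ignores the q-value weights in the header and simply trusts the order the browser sent.

```cfml
<cfscript>
// Hypothetical helper (not a built-in): returns the first supported locale
// that matches an Accept-Language header such as "fr-CA,fr;q=0.8,en-US;q=0.5".
// supportedLocales is a comma-separated list of Java-style locale IDs.
function pickLocale(required string acceptLanguage, required string supportedLocales, required string defaultLocale) {
    var entries = listToArray(arguments.acceptLanguage);
    var supported = listToArray(arguments.supportedLocales);
    var tag = "";
    var i = 0;
    var j = 0;
    for (i = 1; i <= arrayLen(entries); i++) {
        // drop any ";q=..." weight, then turn "fr-CA" into "fr_CA"
        tag = replace(trim(listFirst(entries[i], ";")), "-", "_", "all");
        // first try an exact match such as fr_CA
        for (j = 1; j <= arrayLen(supported); j++) {
            if (compareNoCase(supported[j], tag) == 0) {
                return supported[j];
            }
        }
        // then fall back to the first supported locale with the same language
        for (j = 1; j <= arrayLen(supported); j++) {
            if (compareNoCase(listFirst(supported[j], "_"), listFirst(tag, "_")) == 0) {
                return supported[j];
            }
        }
    }
    // nothing matched: use the application's default
    return arguments.defaultLocale;
}
</cfscript>
```

A typical call would be session.locale = pickLocale(cgi.http_accept_language, "en_US,fr_FR,th_TH", "en_US"), run once when the session starts and overridden whenever the user picks a locale by hand.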

It's important that the user's locale be persisted and used throughout the application. A session-scoped variable is often the best choice for this. The relevant ColdFusion functions for handling locales are:

  • setLocale, which tells ColdFusion to use the supplied locale identifier for the current page

  • getLocale, which returns the locale ColdFusion is currently using
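A minimal sketch of how these two functions might be used together, assuming session management is enabled and that en_US is an acceptable default (both are assumptions, not requirements):

```cfml
<cfscript>
// make sure the session has a locale; en_US is an assumed default
if (!structKeyExists(session, "locale")) {
    session.locale = "en_US";
}

// setLocale returns the locale that was in effect before the call
previousLocale = setLocale(session.locale);

// everything formatted from here on uses the user's locale
writeOutput("Now using: " & getLocale() & "<br>");
writeOutput(lsDateFormat(now(), "LONG"));

// put the old locale back if other code on the page depends on it
setLocale(previousLocale);
</cfscript>
```

Calling setLocale near the top of every request, for example from Application.cfc, keeps the LS functions in step with the session value.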

Character encoding/Writing systems

A character encoding maps each character in a language to a numeric code that can be represented in a computer. For a variety of reasons there is often more than one encoding for a language; for example, Japanese can be represented by the Shift-JIS, EUC-JP, and ISO-2022-JP encodings. Since each encoding reused basically the same set of numbers for different characters, data and application content couldn't be mixed dynamically. For instance, in a web application, each web page had to be encoded in that language's code page, so developers had to develop and maintain one page for each encoding the application needed to support. Data for each encoded web page had to be stored in its own column in a table or its own table in a database.

Before Unicode (work on which began in 1987, with version 1.0 published in 1991), developers always had to climb this encoding Tower of Babel, which made globalized applications extremely costly and very limited in scope. Unicode provides a single, standard encoding scheme for most of the world's languages. Since all modern databases now support it, it's the logical choice of character encoding for globalized applications. In short, just use Unicode.

A language's writing system dictates the way a language is written. There are basically three flavors:

  • left-to-right (LTR) used in Western languages such as English, French, German, etc.

  • right-to-left (RTL) used in Middle Eastern languages such as Hebrew and Arabic.

  • vertical, used in some Asian languages (traditional Japanese uses tategaki, a top-to-bottom, right-to-left format, as does traditional Chinese, from which it was derived; modern forms of Japanese and Chinese follow an LTR format, though some print formats still use a vertical layout)

Writing systems have an impact on an application's layout and display. It's especially important for RTL languages. Not only is the text RTL but the whole application layout needs to be RTL—visual attention is no longer in the upper-left corner but instead in the upper-right.
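One small piece of this can be sketched in code: choosing the page direction from the user's locale. The helper name textDirection and its short list of RTL language codes are assumptions for illustration and are not exhaustive ("iw" is included because older Java versions report Hebrew that way).

```cfml
<cfscript>
// Hypothetical helper: returns "rtl" or "ltr" for a locale ID such as ar_SA.
// The language list is illustrative only; extend it for the locales you support.
function textDirection(required string locale) {
    var rtlLanguages = "ar,he,iw,fa,ur";
    if (listFindNoCase(rtlLanguages, listFirst(arguments.locale, "_")) > 0) {
        return "rtl";
    }
    return "ltr";
}

// emit the direction on the root element so the whole layout can follow it
writeOutput('<html dir="' & textDirection("ar_SA") & '">');
</cfscript>
```

Setting dir on the root element only flips the text flow; stylesheets usually still need their own RTL adjustments for floats, margins, and icons.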

Resource bundles

As mentioned above, one common way to handle localized text at run time is via resource bundles. All static text in the application is replaced by variables that hold that specific text translated into the different languages that need to be supported. A resource bundle can be a simple ColdFusion structure:

<cfscript>
    appointments={};
    appointments.en_US.greeting="Hello"; //(American English)
    appointments.fr_FR.greeting="Bonjour"; //(French)
    appointments.de_DE.greeting="Hallo"; //(German)
    appointments.sv_SE.greeting="Hallå"; //(Swedish)
    appointments.th_TH.greeting="สวัสดี"; //(Thai)
    appointments.ja_JP.greeting="もしもし"; //(Japanese)
</cfscript>

where the text for each locale in this structure can be accessed at run time by supplying a locale key (en_US, fr_FR, etc.). This approach, however, is very cumbersome to maintain, especially as the application grows in complexity or in the number of locales it supports. Another, perhaps better, approach is to use files or a database to hold the translated text, but this requires advanced methodologies beyond the scope of this series and will be addressed in the advanced one.
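Looking text up in a structure-based bundle can be sketched as below. The helper name lookupText and its fallback rules (requested locale, then a default locale, then the key itself) are assumptions chosen for illustration, not part of ColdFusion.

```cfml
<cfscript>
// Hypothetical helper: returns the text for a key in the requested locale,
// falling back to a default locale and finally to the key itself
function lookupText(required struct bundle, required string locale, required string key, string defaultLocale = "en_US") {
    if (structKeyExists(arguments.bundle, arguments.locale)
            && structKeyExists(arguments.bundle[arguments.locale], arguments.key)) {
        return arguments.bundle[arguments.locale][arguments.key];
    }
    if (structKeyExists(arguments.bundle, arguments.defaultLocale)
            && structKeyExists(arguments.bundle[arguments.defaultLocale], arguments.key)) {
        return arguments.bundle[arguments.defaultLocale][arguments.key];
    }
    // returning the key makes missing translations easy to spot on the page
    return arguments.key;
}

appointments = {};
appointments.en_US.greeting = "Hello";
appointments.fr_FR.greeting = "Bonjour";

writeOutput(lookupText(appointments, "fr_FR", "greeting")); // Bonjour
writeOutput(lookupText(appointments, "sv_SE", "greeting")); // Hello (no Swedish entry, so the default is used)
</cfscript>
```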

Dates, times, calendars and timezones

Date formats are particularly vexing to many developers. For instance, a date string of 1/2/2012 in en_US locale is January 2, 2012, while in en_AU it's 1 February 2012, quite a difference if you're making an appointment. To ensure that date and time formats are correct for a user's locale use the following functions to format dates and times:

  • LStimeFormat
  • LSdateFormat

Note that it's usually a good idea to format dates and times using one of the standard masks (FULL, LONG, MEDIUM, or SHORT), both to ensure your application isn't breaking some cultural rule and to make sure the dates and times the application displays to the user can be parsed back into a ColdFusion date-time object via the LSParseDateTime function.
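A short sketch of the standard masks in use. The appointment date and the three locales are made up for illustration, and the exact strings produced depend on the ColdFusion and Java versions in use, so no output is shown. The lsIsDate guard is there because parsing is only attempted when the engine recognizes the string as a date in the current locale.

```cfml
<cfscript>
// an assumed sample appointment: 1 February 2012 at 2:30 PM
appointment = createDateTime(2012, 2, 1, 14, 30, 0);

// the same date-time rendered for several locales with standard masks
locales = ["en_US", "en_AU", "fr_FR"];
for (i = 1; i <= arrayLen(locales); i++) {
    writeOutput(
        locales[i] & ": "
        & lsDateFormat(appointment, "LONG", locales[i]) & " "
        & lsTimeFormat(appointment, "SHORT", locales[i]) & "<br>"
    );
}

// round trip in one locale: format for display, then parse what was displayed
setLocale("en_AU");
shown = lsDateFormat(appointment, "MEDIUM");
if (lsIsDate(shown)) {
    parsed = lsParseDateTime(shown);
    writeOutput("Parsed back: " & dateFormat(parsed, "yyyy-mm-dd"));
}
</cfscript>
```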

ColdFusion dates are based on the Gregorian calendar, which will suffice for many locales. While this is an advanced topic, it's important to note that there are several other calendars in common use globally:

  • Islamic calendar
  • Buddhist calendar
  • Chinese calendar
  • Indian calendar
  • Persian calendar

Handling non-Gregorian calendars will be covered in the advanced series.

Timezones are another bane for many developers, especially if users are in one timezone while the ColdFusion server is in another. Again, this is an advanced topic, but developers should take note of the following points:

  • ColdFusion considers all date-times to be in the server's timezone. It doesn't care about your intentions, just the server's timezone.

  • If ColdFusion handles any date-times, these will be mercilessly converted to the server's timezone.

  • The simplest option for handling timezone issues is not to store date-time objects but instead to store epoch offsets such as Java's (milliseconds since 1 January 1970 UTC).

  • Basic timezone handling is done via the getTimeZoneInfo function, which returns a structure holding the server's timezone information.
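The epoch-offset idea can be sketched as follows. It relies on ColdFusion date-time objects exposing the underlying java.util.Date method getTime(), which is an assumption about the engine rather than documented CFML, and the variable names are made up for illustration.

```cfml
<cfscript>
// getTime() gives milliseconds since 1 January 1970 UTC,
// a plain number that does not depend on the server's timezone
rightNow = now();
epochMs = rightNow.getTime();

// later, e.g. after reading the number back from storage,
// rebuild a date-time object from it
restored = createObject("java", "java.util.Date").init(javaCast("long", epochMs));

// the server's own timezone details come from getTimeZoneInfo()
tz = getTimeZoneInfo();
writeOutput("Server offset from UTC in seconds: " & tz.utcTotalOffset & "<br>");
writeOutput("Daylight saving time in effect: " & tz.isDSTOn);
</cfscript>
```

Storing the number in a BIGINT-style column sidesteps any conversion by the database driver; the value is only turned into a displayable date-time at the last moment.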

Numbers and Currencies

As with date and time formats, it's important to display numbers and currencies in the format that is correct for the locale. For example, 123456789 in the en_US locale would be displayed as 123,456,789, which Americans readily understand. The same number would be displayed as 123 456 789 in the fr_FR locale, a form unfamiliar to most Americans. The relevant ColdFusion functions for formatting numbers and currencies are:

  • LSnumberFormat
  • LScurrencyFormat

Note that the masks used in these functions will map the dollar sign ($), decimal point (.), and comma (,) to the appropriate locale symbols. Also note that the LScurrencyFormat function will not convert between currencies; it's not a function for handling exchange rates.
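A quick sketch of both functions across a few locales. The amount and the locale list are made up for illustration, and since the separators and currency symbols come from the underlying Java locale data, the exact output varies by platform and is not shown here.

```cfml
<cfscript>
amount = 1234567.891; // an assumed sample value
locales = ["en_US", "fr_FR", "de_DE"];

for (i = 1; i <= arrayLen(locales); i++) {
    writeOutput(
        locales[i] & ": "
        // the comma and period in the mask are placeholders that are
        // swapped for each locale's grouping and decimal symbols
        & lsNumberFormat(amount, ",.00", locales[i]) & " | "
        // "local" asks for the locale's own currency symbol and layout;
        // the number itself is not converted between currencies
        & lsCurrencyFormat(amount, "local", locales[i]) & "<br>"
    );
}
</cfscript>
```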

Databases

As mentioned above, modern databases are all Unicode-capable, so Unicode support shouldn't be an issue when deciding which one to deploy. What matters is that any database-specific setup (datatypes, collations, etc.) be followed so that your application's database can fully use Unicode and sort in a locale-specific way.

@atuttle

atuttle commented Sep 26, 2012

I hope you don't mind, I took the liberty of improving your Markdown here: https://gist.github.com/812187c97df322f35a71

I also hope you don't mind that I also made some very minor edits. :)

@coldfusionPaul
Author

adam, actually thanks. i was trying to find the time to learn markdown for the code examples & any editing is welcome. and if you don't mind, i'll replace what's here w/your edited & fixed markdown.

@coldfusionPaul
Author

interesting, downloading your gist turned all the unicode to mojibake.
