@SoniEx2
Created November 11, 2017 20:47

UTF-16LLE

There are 4 forms of UTF-16:

  1. UTF-16BE
  2. UTF-16LE
  3. UTF-16LBE
  4. UTF-16LLE

The first two are defined by the Unicode Consortium. The latter two are described in this document.

Differences between UTF-16LE and UTF-16LLE

The easiest way to explain UTF-16LLE is to describe how it differs from UTF-16LE.

For documents that only use code points in the Basic Multilingual Plane (BMP), UTF-16LE and UTF-16LLE are exactly the same.

For documents that use code points in the supplementary planes, the only difference is that the two code units of each surrogate pair are swapped: the low surrogate (U+DC00 to U+DFFF) comes first, and the high surrogate (U+D800 to U+DBFF) comes second.
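The rule above can be sketched as a small encoder. This is only an illustration of the surrogate swap, not a reference implementation: the function name `encode_utf16lle` is hypothetical, and the sketch assumes that "lower surrogate" means the code unit in the U+DC00 to U+DFFF range.

```python
import struct


def encode_utf16lle(text: str) -> bytes:
    """Encode text as UTF-16LLE (hypothetical helper, illustration only)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # BMP code point: one little-endian code unit, same as UTF-16LE.
            out += struct.pack("<H", cp)
        else:
            # Supplementary code point: compute the usual surrogate pair...
            v = cp - 0x10000
            high = 0xD800 | (v >> 10)    # U+D800..U+DBFF
            low = 0xDC00 | (v & 0x3FF)   # U+DC00..U+DFFF
            # ...but write the low surrogate first (assumed reading of the rule).
            out += struct.pack("<HH", low, high)
    return bytes(out)
```

For U+1F600, UTF-16LE produces the bytes `3D D8 00 DE`, while this sketch produces `00 DE 3D D8`. Text that stays within the BMP comes out byte-for-byte identical to Python's built-in `utf-16-le` codec.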

Differences between UTF-16BE and UTF-16LBE

The same principle applies to UTF-16LBE: it differs from UTF-16BE only in the order in which the two code units of a surrogate pair appear.
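Decoding can be sketched in the other direction: swap each reversed pair back into standard order, then hand the result to an ordinary UTF-16BE decoder. The name `decode_utf16lbe` is hypothetical, and the sketch again assumes the low surrogate (U+DC00 to U+DFFF) is the one that comes first. It does not reject input whose pairs are already in standard order, which a strict decoder might want to do.

```python
import struct


def decode_utf16lbe(data: bytes) -> str:
    """Decode UTF-16LBE bytes (hypothetical helper, illustration only)."""
    if len(data) % 2:
        raise ValueError("UTF-16LBE data must have an even number of bytes")
    # Read the input as big-endian 16-bit code units.
    units = list(struct.unpack(f">{len(data) // 2}H", data))
    i = 0
    while i + 1 < len(units):
        is_low = 0xDC00 <= units[i] <= 0xDFFF
        next_is_high = 0xD800 <= units[i + 1] <= 0xDBFF
        if is_low and next_is_high:
            # Reversed pair: restore the standard high-then-low order.
            units[i], units[i + 1] = units[i + 1], units[i]
            i += 2
        else:
            i += 1
    # The units are now plain UTF-16BE, so the standard codec can finish the job.
    return struct.pack(f">{len(units)}H", *units).decode("utf-16-be")
```

With this sketch, the bytes `DE 00 D8 3D` decode to U+1F600, and BMP-only input decodes exactly as it would under UTF-16BE.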

License

Copyright (C) 2017 Soni L.
All rights reserved.

This document may be freely shared, copied, and distributed.
