
@samcv
Created May 11, 2017 01:11
NAME
Unicode GraphemeBreakTest
DESCRIPTION
The Unicode data files in 3rdparty/Unicode/ and the snippet of commented code
below are licensed under SPDX-License-Identifier: Unicode-DFS-2016. See
3rdparty/Unicode/LICENSE for the full text of the license. The snippet is from
GraphemeBreakTest.txt, Unicode 9.0:
# Default Grapheme Break Test
#
# Format:
# <string> (# <comment>)?
#  <string> contains hex Unicode code points, with
#      ÷ wherever there is a break opportunity, and
#      × wherever there is not.
#  <comment> the format can change, but currently it shows:
#      - the sample character name
#      - (x) the Grapheme_Cluster_Break property value for the sample character
#      - [x] the rule that determines whether there is a break or not
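The format described above can be read mechanically. The following is a minimal sketch in Python, not the code the test itself uses: the function name parse_break_test_line and the sample string are made up for illustration. It drops the optional comment, then groups the hex code points into clusters, closing a cluster at every ÷ and continuing it across every ×.

```python
def parse_break_test_line(line):
    """Split one test line into grapheme clusters (lists of code points).

    Hypothetical helper for illustration; it only follows the format
    described in the header of GraphemeBreakTest.txt.
    """
    # Everything after the first '#' is the optional comment.
    data = line.split("#", 1)[0].strip()
    clusters = []
    current = []
    for token in data.split():
        if token == "÷":
            # Break opportunity: close the cluster collected so far.
            if current:
                clusters.append(current)
                current = []
        elif token == "×":
            # No break: the next code point stays in the current cluster.
            continue
        else:
            # Any other token is a hexadecimal code point.
            current.append(int(token, 16))
    if current:
        clusters.append(current)
    return clusters


# Sample written in the documented format (the comment text is illustrative).
sample = "÷ 0020 × 0308 ÷ 0020 ÷  # illustrative comment"
for index, cluster in enumerate(parse_break_test_line(sample)):
    print(index, [f"{cp:04X}" for cp in cluster])
```

The index printed for each cluster is zero-based, which appears to match the bracketed grapheme number in the test descriptions (for example "grapheme [1]"), though that correspondence is an assumption here.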
HOW TO FUDGE
The keys of the hash below are line numbers in the Unicode test document. Each value is set either to ALL or to one or more of C, 0, 1, 2, 3, 4, and so on.
Example:
* not ok 2384 - Line 835: grapheme [1] has correct codepoints
To fudge this failure, add 835 => ['1'] to the hash. The key is the line number from the message, and '1' is the grapheme index shown in brackets.
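The lookup that such a hash implies can be sketched as follows. This is a Python illustration under stated assumptions, not the test file's own code: the names fudged_lines and is_fudged are hypothetical, ALL is assumed to cover every test generated for a line, and C is treated as an opaque tag because its meaning is not spelled out here.

```python
# Hypothetical fudge table: test-document line number -> tags to fudge.
fudged_lines = {
    835: ["1"],  # fudge only the test for grapheme [1] on line 835
}


def is_fudged(fudged, line_no, test_id):
    """Return True if the test identified by test_id on line_no is fudged.

    Assumption: a value of ALL fudges every test for that line.
    """
    tags = fudged.get(line_no, [])
    if isinstance(tags, str):
        # Accept a bare string such as "ALL" as well as a list of tags.
        tags = [tags]
    return "ALL" in tags or str(test_id) in tags


print(is_fudged(fudged_lines, 835, 1))  # the failing test from the example
print(is_fudged(fudged_lines, 835, 0))  # a different grapheme on the same line
```

Keeping the table keyed by line number means a fudge entry can be copied straight from the "Line 835" part of a failure message.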