@mcdurdin
Last active May 26, 2020 06:53
{
"$schema": "http://json-schema.org/schema#",
"$ref": "#/definitions/langtags",
"definitions": {
"langtags": {
"type": "array",
"items": { "oneOf": [
{"$ref": "#/definitions/langtag"},
{"$ref": "#/definitions/_globalvar"},
{"$ref": "#/definitions/_phonvar"},
{"$ref": "#/definitions/_version"}
] },
"additionalItems": false
},
"langtag": {
"type": "object",
"properties": {
"tag": {
"$ref": "#/definitions/bcp47"
},
"full": {
"type": "string"
},
"tags": {
"type": "array",
"items": { "$ref": "#/definitions/bcp47" },
"additionalItems": false
},
"variants": {
"type": "array",
"items": { "$ref": "#/definitions/bcp47_variant" },
"additionalItems": false
},
"iso639_3": {
"$ref": "#/definitions/iso639_3"
},
"region": {
"$ref": "#/definitions/iso3166_1"
},
"regions": {
"type": "array",
"items": { "$ref": "#/definitions/iso3166_1" },
"additionalItems": false
},
"regionname": {
"type": "string"
},
"iana": {
"oneOf": [
{ "type": "string" },
{ "type": "array", "items": {"type":"string"} }
]
},
"name": {
"type": "string"
},
"names": {
"type": "array",
"items": { "type": "string" }
},
"localname": {
"type": "string"
},
"sldr": {
"type": "boolean"
},
"nophonvars": {
"type": "boolean"
},
"script": {
"$ref": "#/definitions/iso15924"
},
"localnames": {
"type": "array",
"items": { "type": "string" }
},
"latnnames": {
"type": "array",
"items": { "type": "string" }
},
"suppress": {
"type": "boolean"
},
"windows": {
"$ref": "#/definitions/bcp47"
},
"rod": { "type": "string"}
},
"required": ["full"],
"additionalProperties": false
},
"_globalvar": {
"type": "object",
"properties": {
"tag": {
"type": "string",
"const": "_globalvar"
},
"variants": {
"type": "array",
"items": { "type": "string" },
"additionalItems": false
}
},
"required": ["tag", "variants"],
"additionalProperties": false
},
"_phonvar": {
"type": "object",
"properties": {
"tag": {
"type": "string",
"const": "_phonvar"
},
"variants": {
"type": "array",
"items": { "type": "string" },
"additionalItems": false
}
},
"required": ["tag", "variants"],
"additionalProperties": false
},
"_version": {
"type": "object",
"properties": {
"tag": {
"type": "string",
"const": "_version"
},
"api": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$"
},
"date": {
"type": "string",
"pattern": "^\\d+-\\d+-\\d+$"
}
},
"required": ["tag", "api", "date"],
"additionalProperties": false
},
"bcp47": {
"type": "string",
"pattern": "^(((en-GB-oed|i-ami|i-bnn|i-default|i-enochian|i-hak|i-klingon|i-lux|i-mingo|i-navajo|i-pwn|i-tao|i-tay|i-tsu|sgn-BE-FR|sgn-BE-NL|sgn-CH-DE)|(art-lojban|cel-gaulish|no-bok|no-nyn|zh-guoyu|zh-hakka|zh-min|zh-min-nan|zh-xiang)|(brv-(Thai|TH)-x-(dongluang|khongchiem|sakonnakon)|cek-(Latn-)?(MM-)?x-asangkhongso|cek-(Latn-)?(MM-)?x-khawngtuu|dao-(Latn-)?(MM-)?x-khengdaai|1901|1996|dgl-(Copt-)?(SD-)?x-oldnubian|ers-(Zzzz-)?(CN-)?x-ersushaba|fia-(Copt-)?(SD-)?x-oldnubian|mnc-(Mong-)?(CN-)?x-oldmanchu|nst-(Latn-)?(MM-)?x-moshanghawa|onw-(Copt-)?(SD-)?x-oldnubian|sgn-(Zxxx-)?MY-(Zxxx-)?MM|sgn-MY-Zxxx|sgn-Zxxx-MY-mm|tew-(Latn-)?(US-)?x-santaclara|tzo-(Latn-)?(MX-)?x-sanandres|tzo-(Latn-)?(MX-)?x-zinacantan|xnz-(Copt-)?(EG-)?x-oldnubian))|((([A-Za-z]{2,3}(-([A-Za-z]{3}(-[A-Za-z]{3}){0,2}))?)|[A-Za-z]{4}|[A-Za-z]{5,8})(-([A-Za-z]{4}))?(-([A-Za-z]{2}|[0-9]{3}))?(-([A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*(-([0-9A-WY-Za-wy-z](-[A-Za-z0-9]{2,8})+))*(-(x(-[A-Za-z0-9]{1,8})+))?)|(x(-[A-Za-z0-9]{1,8})+))$"
},
"bcp47_variant": {
"type": "string",
"pattern": "^(([0-9][a-zA-Z0-9]{3,8})|([a-zA-Z][a-zA-Z0-9]{4,8}))$"
},
"iso639_3": {
"type": "string",
"pattern": "^[a-z]{3}$"
},
"iso3166_1": {
"type": "string",
"pattern": "^(([A-Z]{2})|(\\d\\d\\d))$"
},
"iso15924": {
"type": "string",
"pattern": "^[A-Z]([a-z]{3})$"
}
}
}
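
A note on the `pattern` keyword used throughout the schema: JSON Schema treats it as an unanchored search, so `^` and `$` only constrain the whole string when the alternation sits inside one group. The sketch below uses Python's `re.search` as a stand-in for a schema validator (an assumption, since validators normally use the ECMA 262 dialect, which behaves the same for these simple patterns). `UNGROUPED`, `GROUPED` and `accepts` are hypothetical names used only for illustration.

```python
import re

# Region-code pattern with the anchors bound to one branch each:
# "^" applies only to the letter branch, "$" only to the digit branch.
UNGROUPED = r"^([A-Z]{2})|(\d\d\d)$"

# Same alternatives wrapped in one group, so both anchors apply to the whole value.
GROUPED = r"^(([A-Z]{2})|(\d\d\d))$"


def accepts(pattern: str, value: str) -> bool:
    """Mimic JSON Schema `pattern`: true if the regex matches anywhere in value."""
    return re.search(pattern, value) is not None


if __name__ == "__main__":
    for value in ["US", "419", "USA", "x419"]:
        print(value, accepts(UNGROUPED, value), accepts(GROUPED, value))
```

Both forms accept `US` and `419`, but only the grouped form rejects `USA` and `x419`.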
@mcdurdin
Author

Please note: the bcp47 pattern:

"pattern": "^(((en-GB-oed|i-ami|i-bnn|i-default|i-enochian|i-hak|i-klingon|i-lux|i-mingo|i-navajo|i-pwn|i-tao|i-tay|i-tsu|sgn-BE-FR|sgn-BE-NL|sgn-CH-DE)|(art-lojban|cel-gaulish|no-bok|no-nyn|zh-guoyu|zh-hakka|zh-min|zh-min-nan|zh-xiang))|((([A-Za-z]{2,3}(-([A-Za-z]{3}(-[A-Za-z]{3}){0,2}))?)|[A-Za-z]{4}|[A-Za-z]{5,8})(-([A-Za-z]{4}))?(-([A-Za-z]{2}|[0-9]{3}))?(-([A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*(-([0-9A-WY-Za-wy-z](-[A-Za-z0-9]{2,8})+))*(-(x(-[A-Za-z0-9]{1,8})+))?)|(x(-[A-Za-z0-9]{1,8})+))$"

should not include the following:

|(brv-(Thai|TH)-x-(dongluang|khongchiem|sakonnakon)|cek-(Latn-)?(MM-)?x-asangkhongso|cek-(Latn-)?(MM-)?x-khawngtuu|dao-(Latn-)?(MM-)?x-khengdaai|1901|1996|dgl-(Copt-)?(SD-)?x-oldnubian|ers-(Zzzz-)?(CN-)?x-ersushaba|fia-(Copt-)?(SD-)?x-oldnubian|mnc-(Mong-)?(CN-)?x-oldmanchu|nst-(Latn-)?(MM-)?x-moshanghawa|onw-(Copt-)?(SD-)?x-oldnubian|sgn-(Zxxx-)?MY-(Zxxx-)?MM|sgn-MY-Zxxx|sgn-Zxxx-MY-mm|tew-(Latn-)?(US-)?x-santaclara|tzo-(Latn-)?(MX-)?x-sanandres|tzo-(Latn-)?(MX-)?x-zinacantan|xnz-(Copt-)?(EG-)?x-oldnubian)

These are special cases added only so that the existing codes pass validation. I felt it was important to validate the structure of the BCP 47 codes as closely as possible, given that this is effectively a master list. The regex comes from https://stackoverflow.com/a/7036171/1836776, and I have previously compared it with the standard and tested it fairly extensively against it.
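
As a rough illustration of the point above, here is a minimal sketch that rebuilds the strict pattern (without the special cases) in Python and tries a few tags against it. The constant names and the `is_valid_bcp47` helper are hypothetical, and Python's `re` module stands in for whatever regex engine a JSON Schema validator would use.

```python
import re

# Grandfathered tags, copied from the strict pattern quoted above.
IRREGULAR = ("en-GB-oed|i-ami|i-bnn|i-default|i-enochian|i-hak|i-klingon|i-lux|"
             "i-mingo|i-navajo|i-pwn|i-tao|i-tay|i-tsu|sgn-BE-FR|sgn-BE-NL|sgn-CH-DE")
REGULAR = ("art-lojban|cel-gaulish|no-bok|no-nyn|zh-guoyu|zh-hakka|zh-min|"
           "zh-min-nan|zh-xiang")

# language, then optional script, region, variants, extensions, private use.
LANGTAG = (
    r"(([A-Za-z]{2,3}(-([A-Za-z]{3}(-[A-Za-z]{3}){0,2}))?)|[A-Za-z]{4}|[A-Za-z]{5,8})"
    r"(-([A-Za-z]{4}))?"
    r"(-([A-Za-z]{2}|[0-9]{3}))?"
    r"(-([A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*"
    r"(-([0-9A-WY-Za-wy-z](-[A-Za-z0-9]{2,8})+))*"
    r"(-(x(-[A-Za-z0-9]{1,8})+))?"
)
PRIVATE_USE = r"x(-[A-Za-z0-9]{1,8})+"

BCP47 = re.compile(f"^((({IRREGULAR})|({REGULAR}))|({LANGTAG})|({PRIVATE_USE}))$")


def is_valid_bcp47(tag: str) -> bool:
    """Return True if the tag is structurally valid under the strict pattern."""
    return BCP47.search(tag) is not None


if __name__ == "__main__":
    for tag in ["en-US", "zh-Hant-TW", "de-1996", "i-klingon",
                "brv-Thai-x-dongluang", "en_US"]:
        print(tag, is_valid_bcp47(tag))
```

In this sketch `brv-Thai-x-dongluang` is rejected because the private-use subtag `dongluang` has nine characters and the pattern allows at most eight. That appears to be the kind of existing code the special cases were added to accommodate.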
