@cristianoc
Last active May 6, 2021 23:43

A theory of unicode

Given

  • CP: a set of codepoints
  • GLP: a subset of CP+ (non-empty sequences)

With the property:

(1) if x in GLP then x+y not in GLP (for all x,y in CP+)
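Property (1) says that no element of GLP is a proper prefix of another element. As a minimal sketch, assuming GLP is a finite set and each codepoint sequence is represented as a non-empty Python `str` (both are assumptions made for illustration, as is the helper name `is_prefix_free`), the property can be checked directly:

```python
from itertools import permutations


def is_prefix_free(glp):
    """Return True if no element of glp is a proper prefix of another.

    glp is assumed to be a finite set of non-empty strings, so any two
    elements taken from it are distinct and a prefix match is a proper one.
    """
    return not any(b.startswith(a) for a, b in permutations(glp, 2))
```

For example, `{"a", "b", "e\u0301"}` passes the check, while `{"e", "e\u0301"}` does not, because `"e"` is a proper prefix of `"e\u0301"`.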

Define:

STR: smallest subset of CP* such that

  • empty in STR
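  • if s in STR and x in GLP then x+s in STR

Theorem: If s in STR then there exist a unique n and unique x1, ..., xn in GLP such that s = x1 + ... + xn. We call n the length of s.

Uniqueness follows from (1): if two decompositions of s first differ at position i, then one of xi and xi' is a proper prefix of the other, which (1) rules out.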
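The decomposition in the theorem can be computed greedily from the left. The sketch below assumes a finite GLP given as a set of non-empty Python strings that satisfies (1); the function name `decompose` is hypothetical and not part of the original note.

```python
def decompose(s, glp):
    """Split s into elements of glp, or return None if s is not in STR.

    Assumes glp is a finite set of non-empty strings satisfying property (1).
    """
    parts = []
    i = 0
    while i < len(s):
        # By property (1), at most one element of glp can match at offset i:
        # two different matches would make one a proper prefix of the other.
        matches = [x for x in glp if s.startswith(x, i)]
        if not matches:
            return None  # no element of glp starts here, so s is not in STR
        parts.append(matches[0])
        i += len(matches[0])
    return parts
```

With `glp = {"a", "b", "e\u0301"}`, the string `"ae\u0301b"` decomposes into three elements, so its length in the sense of the theorem is 3 even though it contains 4 codepoints. The empty string decomposes into the empty list, with n = 0.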

In-place Update:

If s1 + x1 + ... + xn + s2 in STR and |y1 + ... + ym| = |x1 + ... + xn|, then s1 + y1 + ... + ym + s2 in STR (for s1, s2 in STR and x1, ..., xn, y1, ..., ym in GLP).
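One possible reading of the update rule, sketched below, takes |...| to be the number of codepoints, so that the replacement fits exactly into the slot occupied by x1 + ... + xn. That reading is an assumption, as are the name `update_in_place` and the choice of a Python list of single-codepoint strings as the mutable buffer.

```python
def update_in_place(buf, start, old, new):
    """Overwrite the elements `old` found at offset `start` in buf with `new`.

    buf   -- list of single-codepoint strings (the mutable string buffer)
    start -- codepoint offset where x1 + ... + xn begins
    old   -- list of GLP elements x1, ..., xn currently stored at `start`
    new   -- list of GLP elements y1, ..., ym to write in their place

    Assumption: |...| counts codepoints, so buf is never resized.
    """
    old_cps = "".join(old)
    new_cps = "".join(new)
    if len(new_cps) != len(old_cps):
        raise ValueError("replacement must have the same number of codepoints")
    end = start + len(old_cps)
    if "".join(buf[start:end]) != old_cps:
        raise ValueError("buf does not contain `old` at offset `start`")
    # Same-length slice assignment: the buffer keeps its size.
    buf[start:end] = new_cps
```

For instance, with GLP containing `"a"`, `"b"` and `"e\u0301"`, replacing the single element `"e\u0301"` in `"ae\u0301b"` by the two elements `"a"`, `"b"` keeps the buffer at four codepoints and yields another concatenation of GLP elements.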
