Unicode-aware JavaScript regex (Unicode property escapes /\p{..}\P{..}/u, see MDN) cheat sheet
Browser support: ✅ Chrome 64 & Edge 79 ✅ Safari 11.1 ✅ Firefox 78 ✅ nodejs: 10.0 ✅ babel
#...
function gitzip() {
    # Quote the argument so names containing spaces work; use $1, since the archive takes a single name
    git archive -o "$1.zip" HEAD
}
#... gitzip ZIPPED_FILE_NAME
# generated by Git for Windows
test -f ~/.profile && . ~/.profile
test -f ~/.bashrc && . ~/.bashrc
_gitzip(){
    CURRDATE=`date +%Y%m%d`
    NAME=${PWD##*/}
    ARG=$1
    LAST_COMMIT=$2
#!/bin/bash
##
## I usually put this in my .bashrc
##
## When in a git project dir, run 'gitzip foo' to get a 'foo.zip' in the parent
## directory.
gitzip() { git archive HEAD --format=zip --prefix="$*/" > ../"$*.zip"; }
import copy
import datetime
import pickle
# Each quarter corresponds to the following month and day combinations:
_q1 = (3, 31)
_q2 = (6, 30)
_q3 = (9, 30)
_q4 = (12, 31)
import icu
thkey = icu.Collator.createInstance(icu.Locale('th_TH')).getSortKey
words = 'ไก่ ไข่ ก ฮา'.split()
print(sorted(words, key=thkey))  # ['ก', 'ไก่', 'ไข่', 'ฮา']
from icu import Locale, UnicodeString
# loc = Locale.createCanonical("haw_US")
loc = Locale("haw_US")
s1 = "ʻōlelo hawaiʻi"
s2 = "oude ijssel "
print(UnicodeString(s1).toTitle(loc))
print(UnicodeString(s2).toTitle(Locale("nl_NL")).trim())
# -*- coding: utf-8 -*-
"""Example Google style docstrings.
This module demonstrates documentation as specified by the `Google Python
Style Guide`_. Docstrings may extend over multiple lines. Sections are created
with a section header and a colon followed by a block of indented text.
Example:
    Examples can be given using either the ``Example`` or ``Examples``
    sections. Sections support any reStructuredText formatting, including
$ ls -al /usr/share/locale/*/LC_COLLATE
lrwxr-xr-x 1 root wheel 29 11 Jan 18:03 /usr/share/locale/af_ZA.ISO8859-1/LC_COLLATE -> ../la_LN.ISO8859-1/LC_COLLATE
lrwxr-xr-x 1 root wheel 30 11 Jan 18:03 /usr/share/locale/af_ZA.ISO8859-15/LC_COLLATE -> ../la_LN.ISO8859-15/LC_COLLATE
lrwxr-xr-x 1 root wheel 29 11 Jan 18:03 /usr/share/locale/af_ZA.UTF-8/LC_COLLATE -> ../la_LN.ISO8859-1/LC_COLLATE
lrwxr-xr-x 1 root wheel 29 11 Jan 18:03 /usr/share/locale/af_ZA/LC_COLLATE -> ../la_LN.ISO8859-1/LC_COLLATE
lrwxr-xr-x 1 root wheel 28 11 Jan 18:03 /usr/share/locale/am_ET.UTF-8/LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
lrwxr-xr-x 1 root wheel 28 11 Jan 18:03 /usr/share/locale/am_ET/LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
-r--r--r-- 1 root wheel 2086 11 Jan 18:03 /usr/share/locale/be_BY.CP1131/LC_COLLATE
-r--r--r-- 1 root wheel 2086 11 Jan 18:03 /usr/share/locale/be_BY.CP1251/LC_COLLATE
Snippet at https://github.com/enabling-languages/python-i18n/blob/main/snippets/sort_key_normalise.py
If we take two strings that differ only in the Unicode normalisation form they use, would Python sort them the same? The strings éa (U+00E9 U+0061) and éa (U+0065 U+0301 U+0061) are canonically equivalent, but when we sort two lists that differ only in the normalisation form of this string, we find that the sort order is different. With the precomposed form, Python's default code point comparison places éa after za:
>>> lc = ["za", "éa", "eb", "ba"]
>>> sorted(lc)
['ba', 'eb', 'za', 'éa']
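
The difference can be reproduced with the standard library alone. The following is a minimal sketch, not the contents of the linked snippet: the names nfc, nfd and nfd_key are illustrative assumptions, and normalising to NFD inside the sort key is one possible way to make the two lists order consistently. It still compares by code point, so it does not give a language-aware collation.

```python
import unicodedata

# The same four strings, first with a precomposed é (U+00E9) ...
nfc = ["za", "\u00e9a", "eb", "ba"]
# ... then with é decomposed into e (U+0065) + combining acute (U+0301).
nfd = [unicodedata.normalize("NFD", s) for s in nfc]

# Default sorting compares code points, so the two forms land in
# different positions: U+00E9 sorts after "z", while "e" + U+0301
# sorts after "eb" but before "za".
print(sorted(nfc))
print(sorted(nfd))


def nfd_key(s):
    """Hypothetical sort key: compare strings in one normalisation form."""
    return unicodedata.normalize("NFD", s)


# With a normalising key, both lists come out in the same relative order.
sorted_nfc = sorted(nfc, key=nfd_key)
sorted_nfd = sorted(nfd, key=nfd_key)
print(sorted_nfc)
print(sorted_nfd)
```

Normalising only inside the key leaves the stored strings untouched, which matters when the original form of the data has to be preserved.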