
Dave Ostroske eksortso

eksortso / codepoint-equiv.py
Created December 13, 2022 22:49
Script illustrating that Python 3.8 (through 3.11 at least) compares strings code point by code point, so canonically equivalent strings in different forms compare unequal. Shows that normalizing both strings to NFC or to NFD makes them compare equal under Unicode equivalence.
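As a compact sketch of the behavior described above, separate from the script itself: the snippet below builds "Françoise" in precomposed and decomposed form, compares them raw, then compares them after normalization. The helper name `equivalent` and the variable names are hypothetical, chosen for illustration only; `unicodedata.normalize` is the standard-library call.

```python
from unicodedata import normalize

# Two spellings of "Françoise": precomposed U+00E7 versus
# "c" followed by the combining cedilla U+0327.
composed = "Fran\u00e7oise"
decomposed = "Franc\u0327oise"


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two strings after normalizing both to `form`."""
    return normalize(form, a) == normalize(form, b)


print(composed == decomposed)            # raw code point comparison: False
print(len(composed), len(decomposed))    # 9 10
print(equivalent(composed, decomposed, "NFC"))  # True
print(equivalent(composed, decomposed, "NFD"))  # True
```

Either form works for the comparison as long as both operands are normalized to the same one; NFC yields the shorter, precomposed string.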
#!/usr/bin/env python
"""Illustration of Unicode equivalence in Python 3.8+"""
from unicodedata import normalize
# Both variables represent "Françoise" but use different forms.
print("# Code")
# print(r'n1 = "pr\u00e9nom"') # prénom
# print(r'n2 = "pr\u0065\u0301nom"') # prénom
print(r'n1 = "Fran\u00e7oise"') # Françoise