Quick example of encoding and decoding an internationalized domain name in Python (from Unicode to the Punycode or IDNA codecs and back). Pay attention to Unicode strings versus byte strings.
# INCORRECT! DON'T DO THIS!
>>> x = "www.Alliancefrançaise.nu" # This is the problematic line. Forgot to make this a Unicode string.
>>> print x
www.Alliancefrançaise.nu
>>> x.encode('punycode')
'www.Alliancefranaise.nu-h1a31e'
>>> x.encode('punycode').decode('punycode')
u'www.Alliancefran\xc3\xa7aise.nu'
>>> print x.encode('punycode').decode('punycode')
www.Alliancefrançaise.nu
>>> print x
www.Alliancefrançaise.nu
>>> x == x.encode('punycode').decode('punycode')
/usr/bin/ipython:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
#!/usr/bin/env python
False
# CORRECT FOR PUNYCODE (ALMOST THE BEST):
>>> x = u"www.Alliancefrançaise.nu" # The difference! The (decoded) text must be a Unicode string, hence the u prefix.
>>> print x
www.Alliancefrançaise.nu
>>> x.encode('punycode')
'www.Alliancefranaise.nu-dbc'
>>> x.encode('punycode').decode('punycode')
u'www.Alliancefran\xe7aise.nu'
>>> print x.encode('punycode').decode('punycode')
www.Alliancefrançaise.nu
>>> x == x.encode('punycode').decode('punycode')
True
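One reason the raw 'punycode' codec is only almost right for hostnames can be seen by comparing it with 'idna' on the same input. The sketch below assumes Python 3, where every `str` is already Unicode and `.encode()` returns `bytes`; the variable names are illustrative only.

```python
# Python 3 sketch (assumption: the transcript above is Python 2).
hostname = "www.Alliancefrançaise.nu"

# 'punycode' treats the whole string as a single label, dots included,
# and adds no "xn--" prefix, so the result is not a usable hostname.
whole = hostname.encode("punycode").decode("ascii")

# 'idna' splits on dots, encodes each non-ASCII label separately,
# lowercases that label, and prefixes it with "xn--".
per_label = hostname.encode("idna").decode("ascii")

print(whole)      # www.Alliancefranaise.nu-dbc
print(per_label)  # www.xn--alliancefranaise-npb.nu
```

Only the second form is what a resolver would expect to see for this name.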
# BEST ('idna' is preferable to 'punycode', see http://en.wikipedia.org/wiki/Punycode and https://docs.python.org/2/library/codecs.html#module-encodings.idna ) :
>>> x = u"www.Alliancefrançaise.nu"
>>> print x
www.Alliancefrançaise.nu
>>> x.encode('idna')
'www.xn--alliancefranaise-npb.nu'
>>> x.encode('idna').decode('idna')
u'www.alliancefran\xe7aise.nu'
>>> print x.encode('idna').decode('idna')
www.alliancefrançaise.nu
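>>> x == x.encode('idna').decode('idna') # The idna codec lowercases the non-ASCII label ("Alliancefrançaise" comes back as "alliancefrançaise"), so the round trip does not equal the original.
False
>>> x.lower() == x.encode('idna').decode('idna')
True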
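For readers on Python 3, the same IDNA round trip might look like the sketch below. It assumes the built-in 'idna' codec (which implements IDNA 2003); the helper names `to_ascii_hostname` and `to_unicode_hostname` are hypothetical and not part of any library.

```python
# Python 3 sketch: str is already Unicode, so no u"" prefix is needed.

def to_ascii_hostname(hostname):
    """Encode a Unicode hostname to its ASCII (xn--) form, returned as str."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(ascii_hostname):
    """Decode an ASCII (xn--) hostname back to its Unicode form."""
    return ascii_hostname.encode("ascii").decode("idna")


name = "www.Alliancefrançaise.nu"
encoded = to_ascii_hostname(name)
decoded = to_unicode_hostname(encoded)

print(encoded)  # www.xn--alliancefranaise-npb.nu
print(decoded)  # www.alliancefrançaise.nu

# The non-ASCII label is lowercased during encoding, so in this example
# the round trip matches the lowercased original rather than the original.
print(decoded == name.lower())  # True
```

Keeping the encoded form as `str` rather than `bytes` is a design choice made here for convenience when passing the hostname to other text-based APIs.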
@gene1wood

Thanks! Works great.