@mahmoud
Created January 1, 2018 22:04
some notes on the idna package: https://github.com/kjd/idna/

idna package notes:

  • If a segment of a host (i.e., something in url.host.split('.')) is
    already ASCII, idna doesn't perform its usual checks. For instance,
    capital letters are not valid in IDNA 2008. The package automatically lowercases.

For a non-ASCII segment that contains a capital letter, you'll get something like:

idna.core.InvalidCodepoint: Codepoint U+004B at position 1 ... not allowed

This check and some other functionality can be bypassed by passing
uts46=True to encode/decode. This allows a more permissive and
convenient interface. So far it seems like the balanced approach.

However, all of this is bypassed if the string segment contains only
ASCII characters.

Example output:

>>> idna.encode(u'mahmöud.io')
'xn--mahmud-zxa.io'
>>> idna.encode(u'Mahmöud.io')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mahmoud/virtualenvs/hyperlink/local/lib/python2.7/site-packages/idna/core.py", line 355, in encode
    result.append(alabel(label))
  File "/home/mahmoud/virtualenvs/hyperlink/local/lib/python2.7/site-packages/idna/core.py", line 276, in alabel
    check_label(label)
  File "/home/mahmoud/virtualenvs/hyperlink/local/lib/python2.7/site-packages/idna/core.py", line 253, in check_label
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
idna.core.InvalidCodepoint: Codepoint U+004D at position 1 of u'Mahm\xf6ud' not allowed
>>> idna.encode(u'Mahmoud.io')
'Mahmoud.io'
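
One possible way to get consistent results despite the ASCII bypass shown above is to lowercase ASCII-only segments yourself and hand only the non-ASCII segments to idna.encode with uts46=True. The sketch below is an assumption about how a caller might wrap the package, not something idna provides: encode_host and _idna_encode_label are hypothetical names, the code targets Python 3 (where idna.encode returns bytes), and it splits on '.' only, as in url.host.split('.').

```python
def _idna_encode_label(label):
    # Hypothetical default encoder. idna is the third-party package from
    # https://github.com/kjd/idna/; it is imported lazily so ASCII-only
    # hosts never need it.
    import idna
    return idna.encode(label, uts46=True)


def encode_host(host, encode_label=_idna_encode_label):
    """Return host as ASCII bytes, lowercasing ASCII-only segments ourselves.

    Hypothetical helper: ASCII-only segments are lowercased here instead of
    relying on the package, and non-ASCII segments go to encode_label.
    """
    encoded = []
    for label in host.split('.'):
        try:
            ascii_label = label.encode('ascii')
        except UnicodeEncodeError:
            # Non-ASCII segment: let the encoder map and validate it.
            encoded.append(encode_label(label))
        else:
            # ASCII-only segment: apply the lowercasing step explicitly.
            encoded.append(ascii_label.lower())
    return b'.'.join(encoded)
```

With this wrapper, encode_host(u'Mahmoud.io') gives b'mahmoud.io' without touching idna at all, while a host such as u'Mahmöud.io' has only its first segment passed through idna.encode(..., uts46=True).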