
@Spindel
Last active August 29, 2015 14:23
Python3 locale inconsistencies
#!/usr/bin/env python
# vim: ts=4 sts=4 sw=4 ft=python expandtab fileencoding=utf8 :
""" Python2 and Python3 differ in how they handle input and output
files depending on the current locale.
Python2: behaves consistently (but badly), reading and writing UTF-8
as if it were ASCII. It successfully gives back the same byte sequence
that was put in (a.k.a. the same string).
Python3: tests whether "utf8" is in your LANG environment and whether
`setlocale` works for the selected setting;
otherwise it falls back to "ascii".
Python3 will also behave differently on Windows and OS X:
OS X is hard-coded to UTF-8, and Windows does something else entirely.
That only makes this harder to detect until it bites you in the face."""
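try:
    import ConfigParser as configparser  # Python 2
except ImportError:
    import configparser  # Python 3
import locale

BASE = "räksmörgås"
FNAME = "testfile.ini"

# Write a section whose name contains non-ASCII characters.
# SafeConfigParser and readfp() are the names that exist on both Python 2
# and older Python 3; Python 3.12 removed them in favour of ConfigParser
# and read_file().
config = configparser.SafeConfigParser()
config.add_section(BASE)
config.set(BASE, "testing", "With some unicode")
with open(FNAME, "w") as f:  # no explicit encoding: Python 3 takes it from the locale
    config.write(f)

# Read the file back with the same locale-dependent default encoding.
config = configparser.SafeConfigParser()
with open(FNAME, "r") as f:
    config.readfp(f)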
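
One way to take the locale out of the picture, sketched here for Python 3 only and assuming the file should always be UTF-8, is to pass an explicit `encoding` to `open()`. The helper name `roundtrip` and the temporary path are made up for illustration; `ConfigParser` and `read_file()` are the current Python 3 spellings of the calls used in the script above.

```python
import configparser
import os
import tempfile

BASE = "räksmörgås"


def roundtrip(path):
    """Write a section with a non-ASCII name, then read it back as UTF-8."""
    config = configparser.ConfigParser()
    config.add_section(BASE)
    config.set(BASE, "testing", "With some unicode")
    # Explicit encoding: LANG / LC_ALL no longer decide how the file is written.
    with open(path, "w", encoding="utf-8") as f:
        config.write(f)

    config = configparser.ConfigParser()
    with open(path, "r", encoding="utf-8") as f:
        config.read_file(f)
    return config.sections()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "testfile.ini")
    # Compare instead of printing the name, so stdout encoding does not matter.
    print(roundtrip(path) == [BASE])
```

This does not run on Python 2, where the built-in `open()` has no `encoding` argument, so it is a workaround for Python 3 code rather than a drop-in replacement for the test script.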

Spindel commented Jun 23, 2015

To show the different behaviours, run it with

LANG=en_GB.UTF-8 LC_ALL=en_GB.UTF-8 python2 test.py
LANG=C LC_ALL=C python2 test.py

LANG=en_GB.UTF-8 LC_ALL=en_GB.UTF-8 python3 test.py
LANG=C LC_ALL=C python3 test.py 
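
To see which encoding the interpreter would pick for `open()` under each of those settings, something like the following sketch may help. It starts a child Python with `LANG` and `LC_ALL` overridden and asks it for `locale.getpreferredencoding(False)`. The function name `preferred_encoding` is made up for illustration. The reported values depend on the Python version and on which locales are installed (newer Python 3 releases may report UTF-8 even under `C`), so no expected output is shown.

```python
import os
import subprocess
import sys


def preferred_encoding(lang):
    """Return the encoding a child Python reports with LANG and LC_ALL set to `lang`."""
    env = dict(os.environ, LANG=lang, LC_ALL=lang)
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding(False))"],
        env=env,
    )
    # Encoding names are plain ASCII, so decoding the child's output is safe.
    return out.decode("ascii").strip()


if __name__ == "__main__":
    for lang in ("en_GB.UTF-8", "C"):
        print(lang, "->", preferred_encoding(lang))
```

Replacing `sys.executable` with an explicit `python2` or `python3` path would compare interpreters as well as locales.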
