@znz
Last active December 27, 2015 01:29
Does uniq treat circled numbers as identical?
$ cat n.txt
$ uniq n.txt
$ sort n.txt
$ tac n.txt | sort
$
$ cat n.txt
$ LANG=ja_JP.utf8 uniq -c n.txt
2 ①
$ LANG=C uniq -c n.txt
1 ①
1 ②
$ uniq --version
uniq (GNU coreutils) 8.20
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Richard M. Stallman and David MacKenzie.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 13.04
Release: 13.04
Codename: raring
$
znz commented Oct 31, 2013

I fetched the source with apt-get source locales. In locales/ja_JP, code points that are not listed between LC_COLLATE and END LC_COLLATE appear to be treated as identical.
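
The observation above can be checked with a small script. This is only a sketch: it assumes the input file held the two lines ① and ② (as the LANG=C output in the transcript suggests), it uses a temporary file rather than n.txt, and it sets LC_ALL instead of LANG because LC_ALL overrides any other locale variable that may already be set. The ja_JP branch runs only if such a locale is installed, and its result may differ on newer glibc versions whose collation tables have changed.

```shell
#!/bin/sh
# Recreate the assumed two-line input: circled 1 and circled 2.
tmp=$(mktemp)
printf '①\n②\n' > "$tmp"

# C locale: lines are compared byte by byte, so the two lines stay distinct.
c_count=$(LC_ALL=C uniq "$tmp" | wc -l | tr -d ' ')
echo "C locale: $c_count distinct line(s)"

# ja_JP locale: comparison follows the locale's LC_COLLATE rules.
# Skipped when no ja_JP UTF-8 locale is installed on this machine.
if locale -a 2>/dev/null | grep -qi '^ja_JP\.utf-\{0,1\}8$'; then
  ja_count=$(LC_ALL=ja_JP.UTF-8 uniq "$tmp" | wc -l | tr -d ' ')
  echo "ja_JP locale: $ja_count distinct line(s)"
fi

rm -f "$tmp"
```

If the ja_JP branch reports 1 line, the two characters collate as equal in that locale, which matches the transcript. Running uniq (and sort) under LC_ALL=C is the usual way to get byte-wise comparison regardless of the collation tables.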

znz commented Oct 31, 2013
