@nozma
Last active December 6, 2018 10:36
Sorting and locales in R ref: https://qiita.com/nozma/items/4aea36022ce18a6aa5ca
## Prepare the data
d <- data.frame(
kana = c("お", "エ", "う", "イ", "あ"),
letter = c("A", "e", "C", "b", "Z"),
stringsAsFactors = FALSE
)
## sort() uses locale-aware collation, so the kana come out in the order a reader expects
sort(d$kana)
#> [1] "あ" "イ" "う" "エ" "お"
## Setting colCaseFirst=upper puts uppercase letters first
str <- c("a", "A", "a", "A")
stringi::stri_sort(str, locale = "en@colCaseFirst=upper")
#> [1] "A" "A" "a" "a"
## The default sort is lexicographic, so digits are compared character by character
str2 <- c("A1", "A12", "A112")
stringi::stri_sort(str2, locale = "en")
#> [1] "A1" "A112" "A12"
## colNumeric=yes compares digit runs by numeric value (natural sort order)
stringi::stri_sort(str2, locale = "en@colNumeric=yes")
#> [1] "A1" "A12" "A112"
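## The collator keywords above can also be given as arguments rather than
## packed into the locale string. A minimal sketch, assuming that the
## `numeric` and `uppercase_first` options of stringi::stri_opts_collator()
## correspond to colNumeric=yes and colCaseFirst=upper; the variable names
## (`versions`, `letters_mixed`, `natural`, `upper_first`) are illustrative only.

```r
library(stringi)

versions <- c("A1", "A12", "A112")
letters_mixed <- c("a", "A", "a", "A")

# Natural sort: digit runs are compared by numeric value.
natural <- stri_sort(
  versions,
  opts_collator = stri_opts_collator(locale = "en", numeric = TRUE)
)
print(natural)

# Uppercase letters are placed before their lowercase counterparts.
upper_first <- stri_sort(
  letters_mixed,
  opts_collator = stri_opts_collator(locale = "en", uppercase_first = TRUE)
)
print(upper_first)
```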
## In the C locale, sort() compares raw bytes and non-ASCII strings print as escapes
Sys.setlocale(locale = "C")
sort(d$kana)
#> [1] "\343\201\202" "\343\201\206" "\343\201\212" "\343\202\244"
#> [5] "\343\202\250"
## Sort under the C locale, then switch back to ja_JP to display the result readably
sorted <- sort(d$kana)
Sys.setlocale(locale = "ja_JP")
sorted
#> [1] "あ" "う" "お" "イ" "エ"
## withr can switch the collation locale temporarily, for a single expression
withr::with_locale(c("LC_COLLATE" = "C"), sort(d$kana))
#> [1] "あ" "う" "お" "イ" "エ"
withr::with_collate("C", sort(d$kana))
#> [1] "あ" "う" "お" "イ" "エ"
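## Base R can also produce the C-locale order without changing the locale at all:
## ?sort documents that method = "radix" collates character vectors in the C locale
## regardless of the session locale. A minimal sketch, assuming the strings are
## UTF-8 encoded; `kana` and `radix_sorted` are illustrative names.

```r
kana <- c("お", "エ", "う", "イ", "あ")

# Radix sort ignores LC_COLLATE, so hiragana sort before katakana (byte order).
radix_sorted <- sort(kana, method = "radix")
print(radix_sorted)
```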
## stringi collates with ICU, so even locale = "C" keeps the expected kana order
stringi::stri_sort(d$kana, locale = "C")
#> [1] "あ" "イ" "う" "エ" "お"
## The usual locale-aware sort ignores case at the first level
stringi::stri_sort(d$letter, locale = "en")
#> [1] "A" "b" "C" "e" "Z"
## With locale = "C", the result resembles a C-locale sort: uppercase before lowercase
stringi::stri_sort(d$letter, locale = "C")
#> [1] "A" "C" "Z" "b" "e"
## With locale = "en", the default sort puts lowercase letters first
str <- c("a", "A", "a", "A")
stringi::stri_sort(str, locale = "en")
#> [1] "a" "a" "A" "A"