@kohske
Created March 5, 2014 15:19
# ↓ CP932 (the native encoding of this locale)
> txt <- "あa"
> txt
[1] "あa"
> Encoding(txt)
[1] "unknown"
> charToRaw(txt)
[1] 82 a0 61
# When sub() finds a match
> x <- sub("a", "z", txt)
> x
[1] "あz"
# ↓ the result is marked as UTF-8
> Encoding(x)
[1] "UTF-8"
# and the bytes themselves have been converted to UTF-8.
> charToRaw(x)
[1] e3 81 82 7a
# Doing the same thing when there is no match
> y <- sub("b", "z", txt)
> y
[1] "あa"
# ↓ no encoding mark
> Encoding(y)
[1] "unknown"
# ↓ still CP932
> charToRaw(y)
[1] 82 a0 61
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=Japanese_Japan.932 LC_CTYPE=Japanese_Japan.932
[3] LC_MONETARY=Japanese_Japan.932 LC_NUMERIC=C
[5] LC_TIME=Japanese_Japan.932
attached base packages:
[1] stats graphics grDevices utils datasets methods base
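
The transcript above shows sub() returning UTF-8 bytes on a match and the untouched CP932 bytes otherwise. One possible way to get the same bytes in both cases is to convert the input to UTF-8 before calling sub(). The following is only a sketch and not part of the original session: it assumes the platform's iconv() recognises the name "CP932", and it builds the input from raw bytes so that it does not depend on the session locale. The variable name `u` is introduced here for illustration.

```r
# Build the CP932 bytes for "あa" explicitly (82 a0 61, as in the transcript).
txt <- rawToChar(as.raw(c(0x82, 0xa0, 0x61)))

# Convert to UTF-8 up front; iconv() marks the result as UTF-8 by default (mark = TRUE).
# Assumption: "CP932" is a name known to this platform's iconv.
u <- iconv(txt, from = "CP932", to = "UTF-8")

x <- sub("a", "z", u)  # pattern matches
y <- sub("b", "z", u)  # pattern does not match

# Inspect the bytes of both results; both should now be UTF-8 sequences.
print(charToRaw(x))
print(charToRaw(y))
```

With the input already in UTF-8, the no-match case returns the UTF-8 string unchanged, so downstream code no longer has to handle two different byte representations.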