@tripodsan
Created April 24, 2019 07:40
wrong github getcontents

First I create a file with Unicode characters. My terminal is set to UTF-8, so the file is stored in that encoding:

$ echo "日本" > utf8-test.txt
$ cat utf8-test.txt
日本
$ file utf8-test.txt
utf8-test.txt: UTF-8 Unicode text

Then I convert it to UTF-32:

$ iconv -f utf8 -t utf32 utf8-test.txt -o utf32-test.txt
$ cat utf32-test.txt
���e,g
$ iconv -f utf32 utf32-test.txt -t utf8
日本
$ xxd utf32-test.txt
00000000: fffe 0000 e565 0000 2c67 0000 0a00 0000  .....e..,g......
$ ls -al utf8-test.txt utf32-test.txt
-rw-rw-r-- 1 user user 16 Apr 19 10:25 utf32-test.txt
-rw-rw-r-- 1 user user  7 Apr 19 10:25 utf8-test.txt
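
To make the byte layout in the xxd dump easier to read, here is a minimal sketch that recreates the same 16 bytes without iconv. The function name `make_utf32_sample` is made up for illustration; the byte values are copied from the dump above.

```shell
# Hypothetical helper: emit the 16 bytes shown in the xxd dump, using
# octal escapes so that any POSIX printf can produce them.
make_utf32_sample() {
  # ff fe 00 00 = byte order mark (little-endian UTF-32)
  # e5 65 00 00 = U+65E5 (日)
  # 2c 67 00 00 = U+672C (本)
  # 0a 00 00 00 = U+000A (the newline added by echo)
  printf '\377\376\000\000\345\145\000\000\054\147\000\000\012\000\000\000'
}

# Hex dump, one byte per column, without offsets
make_utf32_sample | od -An -tx1

# Byte count: 4 code units of 4 bytes each
make_utf32_sample | wc -c
```

The count is 16, which matches the size `ls` reports for utf32-test.txt. The bytes ff, fe and e5 are the ones to watch later on, because none of them is valid UTF-8 in this position.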

Then I add both files to git and push them to GitHub:

$ git add utf8-test.txt utf32-test.txt
$ git commit -m "test"
$ git push

When I fetch the raw content, both files behave as expected:

$ curl -s https://raw.githubusercontent.com/tripodsan/hlxtest/master/.github/utf8-test.txt
日本
$ curl -s  --output - https://raw.githubusercontent.com/tripodsan/hlxtest/master/.github/utf32-test.txt
���e,g
$ curl -s  --output - https://raw.githubusercontent.com/tripodsan/hlxtest/master/.github/utf32-test.txt | iconv -f utf32 -t utf8
日本

But when I fetch the files via the API, the UTF-32 one causes trouble:

$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf8-test.txt | jq -r .content
5pel5pysCg==
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf8-test.txt | jq -r .content  | openssl base64 -d
日本
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf32-test.txt | jq -r .content | openssl base64 -d
���e,g
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf32-test.txt | jq -r .content | openssl base64 -d | iconv -f utf32 -t utf8
iconv: illegal input sequence at position 0
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf32-test.txt | jq -r .content | openssl base64 -d | xxd
00000000: efbf bdef bfbd 0000 efbf bd65 0000 2c67  ...........e..,g
00000010: 0000 0a00 0000                           ......
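
Comparing the two hex dumps suggests what went wrong. The bytes ff, fe and e5 each came back as ef bf bd, which is the UTF-8 encoding of U+FFFD, the replacement character. That is consistent with the content being decoded as UTF-8 text somewhere before it is base64-encoded, but this is an inference from the bytes and not something confirmed here. The sketch below is a toy model of that substitution: the list of "invalid" bytes is hand-picked for this sample and is not a real UTF-8 validator, and the name `mark_invalid_utf8` is made up for illustration.

```shell
# Bytes of utf32-test.txt as stored in git (from the first xxd dump)
original_hex='ff fe 00 00 e5 65 00 00 2c 67 00 00 0a 00 00 00'
# Bytes returned by the contents API after base64 decoding (last xxd dump)
api_hex='ef bf bd ef bf bd 00 00 ef bf bd 65 00 00 2c 67 00 00 0a 00 00 00'

# Toy model (assumption): replace each byte that cannot be valid UTF-8 in
# this sample with ef bf bd. ff and fe never occur in UTF-8; e5 starts a
# three-byte sequence but is followed by 65, which is not a continuation byte.
mark_invalid_utf8() {
  out=''
  for b in $1; do
    case "$b" in
      ff|fe|e5) out="$out ef bf bd" ;;
      *)        out="$out $b" ;;
    esac
  done
  printf '%s\n' "${out# }"
}

modelled_hex=$(mark_invalid_utf8 "$original_hex")
if [ "$modelled_hex" = "$api_hex" ]; then
  echo "model matches the API bytes"
else
  echo "model does not match the API bytes"
fi
```

Each of the three replaced bytes grows from one byte to three, so the file goes from 16 to 22 bytes. The replacement is lossy: once ff, fe and e5 have all become ef bf bd, the original bytes cannot be recovered from the API response, which is why iconv rejects it at position 0.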