@t-sin
Last active May 22, 2017 14:28
~/code/inquisitor$ git log --oneline -n 1
322b055 BOM check (but not used yet)
~/code/inquisitor$ run-prove inquisitor-test.asd
To load "inquisitor-test":
Load 1 ASDF system:
inquisitor-test
; Loading "inquisitor-test"
1..16
✓ (:UTF-8 :UCS-2LE :UCS-2BE :UTF-16 :ISO-2022-JP :EUC-JP :CP932 :BIG5 :ISO-2022-TW :GB2312 :GB18030 :ISO-2022-CN :EUC-KR :JOHAB :ISO-2022-KR :ISO-8859-6 :CP1256 :ISO-8859-7 :CP1253 :ISO-8859-8 :CP1255 :ISO-8859-9 :CP1254 :ISO-8859-5 :KOI8-R :KOI8-U :CP866 :CP1251 :ISO-8859-2 :CP1250 :ISO-8859-13 :CP1257) is expected to be (:UTF-8 :UCS-2LE :UCS-2BE :UTF-16 :ISO-2022-JP :EUC-JP :CP932 :BIG5 :ISO-2022-TW :GB2312 :GB18030 :ISO-2022-CN :EUC-KR :JOHAB :ISO-2022-KR :ISO-8859-6 :CP1256 :ISO-8859-7 :CP1253 :ISO-8859-8 :CP1255 :ISO-8859-9 :CP1254 :ISO-8859-5 :KOI8-R :KOI8-U :CP866 :CP1251 :ISO-8859-2 :CP1250 :ISO-8859-13 :CP1257)
✓ (:LF :CR :CRLF) is expected to be (:LF :CR :CRLF)
unicode
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-16LE is expected to be :UTF-16LE
✓ :UTF-16BE is expected to be :UTF-16BE
✓ :UTF-16 is expected to be :UTF-16
japanese
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :EUC-JP is expected to be :EUC-JP
✓ :WINDOWS-31J is expected to be :WINDOWS-31J
tiwanese
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
chinese
✓ :CP936 is expected to be :CP936
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
korean
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
arabic
✓ :ISO-8859-6 is expected to be :ISO-8859-6
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
greek
✓ :ISO-8859-7 is expected to be :ISO-8859-7
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
hebrew
✓ :ISO-8859-8 is expected to be :ISO-8859-8
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
turkish
✓ :ISO-8859-9 is expected to be :ISO-8859-9
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
russian
✓ :ISO-8859-5 is expected to be :ISO-8859-5
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
polish
✓ :ISO-8859-2 is expected to be :ISO-8859-2
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
baltic
✓ :ISO-8859-13 is expected to be :ISO-8859-13
✓ :CANNOT-TREAT is expected to be :CANNOT-TREAT
end of line
✓ :UNIX is expected to be :UNIX
✓ :MACOS is expected to be :MACOS
✓ :DOS is expected to be :DOS
if specified encodings is unicode?
only unicode returns t
✓ (:UTF-8 :UCS-2LE :UCS-2BE :UTF-16) is expected to be T
✓ (:UCS-2LE :UCS-2BE :UTF-16) is expected to be T
✓ (:UCS-2BE :UTF-16) is expected to be T
✓ (:UTF-16) is expected to be T
other encodings return nil
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
✓ 16 tests completed (122ms)
Running a test file 'code/inquisitor/t/util.lisp'
1..4
with-byte-array
expansion
✓ (WITH-BYTE-ARRAY (VEC 10) (PRINT VEC) (SVREF VEC 0)) is expected to be expanded to (LET ((VEC (MAKE-ARRAY 10 :ELEMENT-TYPE '(UNSIGNED-BYTE 8)))) (PRINT VEC) (SVREF VEC 0)) (got (LET ((VEC (MAKE-ARRAY 10 :ELEMENT-TYPE '(UNSIGNED-BYTE 8)))) (PRINT VEC) (SVREF VEC 0)))
✓ 10 is expected to be 10
✓ T is expected to be T
byte-array-p
return nil
✓ T is expected to be T
✓ T is expected to be T
✓ T is expected to be T
return t
✓ T is expected to be T
✓ T is expected to be T
byte-input-stream-p
return nil
✓ T is expected to be T
✓ T is expected to be T
✓ T is expected to be T
✓ T is expected to be T
return t
✓ T is expected to be T
file-position-changable-p
return nil
✓ T is expected to be T
✓ T is expected to be T
✓ T is expected to be T
return t
✓ 0 is expected to be T
✓ 0 is expected to be T
check file-position not be changed
✓ 0 is expected to be 0
✓ 4 tests completed (4ms)
Running a test file 'code/inquisitor/t/eol.lisp'
1..5
✓ :CR is expected to be :CR
✓ :CRLF is expected to be :CRLF
✓ :LF is expected to be :LF
✓ :LF is expected to be :LF
If file has no newline then return NIL
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
✓ 5 tests completed (8ms)
Running a test file 'code/inquisitor/t/encoding.lisp'
1..14
Byte order mark treatment
For details, see Unicode Standard, 2.13. Special Characters, Byte Order Mark (BOM)
Specified vector is too short
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
return nil for normal string
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
partially matched for BOM
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
✓ NIL is expected to be NIL
BOM exactly
✓ :BIG-ENDIAN is expected to be :BIG-ENDIAN
✓ :LITTLE-ENDIAN is expected to be :LITTLE-ENDIAN
string following BOM
✓ :BIG-ENDIAN is expected to be :BIG-ENDIAN
✓ :LITTLE-ENDIAN is expected to be :LITTLE-ENDIAN
list available schemes
✓ (:JP :TW :CN :KR :RU :AR :TR :GR :HW :PL :BL) is expected to be (:JP :TW :CN :KR :RU :AR :TR :GR :HW :PL :BL)
encoding -- not supported scheme
✓ (TEST-ENC "data/empty.txt" :NOT-SUPPORTED :UTF-8) is expected to raise a condition ERROR (got #<CCL::SIMPLE-FILE-ERROR #x30200201A0CD>) (68ms)
encoding -- jp
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :CP932 is expected to be :EUC-JP
✓ :ISO-2022-JP is expected to be :ISO-2022-JP
✓ :CP932 is expected to be :CP932
✓ :UTF-8 is expected to be :UTF-8
× :EUC-JP is expected to be :UCS-2BE
× :EUC-JP is expected to be :UCS-2LE
× :UCS-2BE is expected to be :UTF-16
encoding -- tw
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
✓ :BIG5 is expected to be :BIG5
iso2022 is equivalent to euc-tw, probably... (https://en.wikipedia.org/wiki/CNS_11643)
× :BIG5 is expected to be :ISO-2022-TW
✓ :UTF-8 is expected to be :UTF-8
encoding -- cn
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :GB18030 is expected to be :GB2312
✓ :GB18030 is expected to be :GB18030
✓ :ISO-2022-CN is expected to be :ISO-2022-CN
✓ :UTF-8 is expected to be :UTF-8
encoding -- kr
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
✓ :EUC-KR is expected to be :EUC-KR
✓ :JOHAB is expected to be :JOHAB
✓ :ISO-2022-KR is expected to be :ISO-2022-KR
✓ :UTF-8 is expected to be :UTF-8
encoding -- ar
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :CP1256 is expected to be :ISO-8859-6
✓ :CP1256 is expected to be :CP1256
× :CP1256 is expected to be :UTF-8
encoding -- gr
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :CP1253 is expected to be :ISO-8859-7
in range of greek character, cp1253 is subset of iso8859-7, probably
✓ :CP1253 is expected to be :CP1253
✓ :UTF-8 is expected to be :UTF-8
encoding -- hw
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :CP1255 is expected to be :ISO-8859-8
✓ :CP1255 is expected to be :CP1255
*TODO!* iso8859-8 does not has vowels (called 'nikud'), thus that case must be added.
✓ :UTF-8 is expected to be :UTF-8
encoding -- tr
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :CP1254 is expected to be :ISO-8859-9
✓ :CP1254 is expected to be :CP1254
✓ :UTF-8 is expected to be :UTF-8
encoding -- ru
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :CP866 is expected to be :ISO-8859-5
× :CP866 is expected to be :KOI8-R
× :CP866 is expected to be :KOI8-U
✓ :CP866 is expected to be :CP866
× :CP866 is expected to be :CP1251
✓ :UTF-8 is expected to be :UTF-8
encoding -- pl
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :UTF-8 is expected to be :ISO-8859-2
× :UTF-8 is expected to be :CP1250
✓ :UTF-8 is expected to be :UTF-8
encoding -- bl
✓ :UTF-8 is expected to be :UTF-8
✓ :UTF-8 is expected to be :UTF-8
× :UTF-8 is expected to be :ISO-8859-13
× :UTF-8 is expected to be :CP1257
✓ :UTF-8 is expected to be :UTF-8
× 10 of 14 tests failed (86ms)
Running a test file 'code/inquisitor/t/external-format.lisp'
1..1
make-external-format
✓ #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD> is expected to be #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD>
✓ 1 test completed (6ms)
Running a test file 'code/inquisitor/t/inquisitor.lisp'
1..5
detect-encoding
for stream
✓ (DETECT-ENCODING IN :JP) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x302001F12FDD>)
✓ (DETECT-ENCODING S :JP) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x302001F5C36D>) (61ms)
✓ :UTF-8 is expected to be :UTF-8 (487ms)
✓ 0 is expected to be 0
for pathname
✓ :UTF-8 is expected to be :UTF-8 (516ms)
detect-end-of-line
for stream
✓ (DETECT-END-OF-LINE IN) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x302001FF909D>)
✓ (DETECT-END-OF-LINE S) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x302001FF39BD>) (54ms)
✓ :LF is expected to be :LF
✓ 0 is expected to be 0
for pathname
✓ :LF is expected to be :LF (105ms)
detect external-format --- from vector
when not byte-array
✓ (DETECT-EXTERNAL-FORMAT "string" :JP) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x30200203382D>)
when cannot treat the encodings (how do I cause it...?)
✓ (DETECT-EXTERNAL-FORMAT "" :JP) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x30200203212D>)
✓ #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD> is expected to be #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD>
detect external-format --- from stream
✓ (DETECT-EXTERNAL-FORMAT OUT :JP) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x30200208E5DD>)
✓ (DETECT-EXTERNAL-FORMAT IN :JP) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x30200208D05D>)
✓ #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD> is expected to be #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD> (597ms)
✓ 0 is expected to be 0
detect external-format --- from pathname
✓ (DETECT-EXTERNAL-FORMAT "data/ascii/ascii-lf.txt" :JP) is expected to raise a condition ERROR (got #<SIMPLE-ERROR #x3020020CFD1D>)
✓ #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD> is expected to be #<EXTERNAL-FORMAT :UTF-8/:UNIX #x302000466DCD> (670ms)
✓ 5 tests completed (2689ms)
Summary:
1 file failed.
- code/inquisitor/t/encoding.lisp