@metaskills
Created February 25, 2013 12:47
Misc Notes On CharlockHolmes
# Charlock Holmes Install (failed)
$ gem uninstall charlock_holmes
$ gem install charlock_holmes -- --with-icu-dir=/opt/local/lib
/opt/local/share/icu
/opt/local/lib/icu
$ irb
> CharlockHolmes::EncodingDetector.detect "€20 – “Woohoo”".encode('CP1252')
dyld: lazy symbol binding failed: Symbol not found: _magic_open
# Charlock Holmes Install (good)
# gem 'charlock_holmes'
$ unset DYLD_LIBRARY_PATH
$ export VERSION="1.9.3-p194"
$ export CONFIGURE_OPTS="--enable-shared --with-arch=i686"
$ export CC="/opt/local/bin/gcc-apple-4.2"
$ export CXX="/opt/local/bin/g++-apple-4.2"
$ export CFLAGS="-O3 -arch i686"
$ export CXXFLAGS="-O3 -arch i686"
$ export LDFLAGS="-L/opt/local/lib -arch i686"
$ export CPPFLAGS="-I/opt/local/include -arch i686"
$ gem uninstall charlock_holmes
# Charlock Holmes Play (good)
$ irb
> require 'charlock_holmes'
> CharlockHolmes::EncodingDetector.detect "€20 – “Woohoo”".encode('CP1252')
=> {:type=>:binary, :confidence=>100}
# Charlock Holmes Play (bad)
$ rc
> content = "€20 – “Woohoo”".encode('CP1252')
> detection = CharlockHolmes::EncodingDetector.detect(content)
> utf8_encoded_content = CharlockHolmes::Converter.convert content, detection[:encoding], 'UTF-8'
(irb):3: [BUG] Bus Error
ruby 1.9.3p194 (2012-04-20 revision 35410) [i686-darwin12.2.0]
-- Control frame information -----------------------------------------------
c:0029 p:---- s:0107 b:0107 l:000106 d:000106 CFUNC :convert
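
A plausible reading of the crash above: the detector returned {:type=>:binary, :confidence=>100}, which has no :encoding key, so detection[:encoding] is nil when it reaches the converter. The sketch below guards against that case. The helper name convert_to_utf8 and the STDLIB_CONVERTER stand-in are hypothetical and not part of charlock_holmes; the stand-in uses Ruby's built-in String#encode so the example runs without the gem.

```ruby
# Hypothetical helper (not part of charlock_holmes): convert only when
# detection actually reported an encoding.
def convert_to_utf8(content, detection, converter)
  encoding = detection && detection[:encoding]
  # A :binary detection such as {:type=>:binary, :confidence=>100}
  # carries no :encoding key, so there is nothing to convert from.
  return nil if encoding.nil?
  converter.call(content, encoding, 'UTF-8')
end

# Stand-in converter built on Ruby's own String#encode(to, from).
# Assumption: it takes the same three arguments, in the same order,
# as the CharlockHolmes::Converter.convert call shown in these notes.
STDLIB_CONVERTER = lambda { |text, from, to| text.encode(to, from) }

content = "\u20AC20".encode('CP1252')

# Binary detection: no conversion is attempted, nil comes back.
convert_to_utf8(content, { :type => :binary, :confidence => 100 }, STDLIB_CONVERTER)

# Text detection with an encoding: conversion goes ahead.
convert_to_utf8(content, { :type => :text, :encoding => 'windows-1252' }, STDLIB_CONVERTER)
```

With the gem loaded, passing CharlockHolmes::Converter.method(:convert) as the converter should give the same shape of call as in the notes, but only after the nil check has passed.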