@xissy
Last active December 19, 2015 00:59
How to get a UTF-8 string from another charset string using node.js, node-icu-charset-detector and node-iconv.
# dependencies.
charsetDetector = require 'node-icu-charset-detector'
Buffer = require('buffer').Buffer
Iconv = require('iconv').Iconv
# this is it. data is a Buffer holding the raw bytes in an unknown charset.
iconv = new Iconv charsetDetector.detectCharset(data).toString(), 'UTF-8'
buffer = iconv.convert data
text = buffer.toString()
# do you want this with a file?
fs = require 'fs'
fs.readFile './text.txt', (err, data) ->
  # data is undefined when the read fails, so stop here.
  return console.error err if err
  iconv = new Iconv charsetDetector.detectCharset(data).toString(), 'UTF-8'
  buffer = iconv.convert data
  text = buffer.toString()
# do you want this with http?
request = require 'request'
request
  url: 'http://www.google.com'
  encoding: null # with encoding set to null, the body is returned as a Buffer.
,
  (err, response, data) ->
    # data is undefined when the request fails, so stop here.
    return console.error err if err
    iconv = new Iconv charsetDetector.detectCharset(data).toString(), 'UTF-8'
    buffer = iconv.convert data
    body = buffer.toString()
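# do you want to guard against failures?
# a minimal sketch, not part of the original snippet. toUtf8 is a hypothetical
# helper name. it assumes data is a Buffer and that detection or conversion
# may throw (iconv throws on byte sequences it cannot convert).
toUtf8 = (data) ->
  try
    charset = charsetDetector.detectCharset(data).toString()
    new Iconv(charset, 'UTF-8').convert(data).toString()
  catch err
    # assumption: decoding the raw bytes as UTF-8 is an acceptable fallback.
    data.toString 'utf8'
# usage: text = toUtf8 data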