@andresgutgon
Forked from norman/character_reference.rb
Created May 2, 2012 16:18
HTML entities? We don't need no stinkin' HTML entities.
# coding: utf-8
#
# Encode any codepoint outside the ASCII printable range to an HTML character
# reference (http://bit.ly/KNupLT).
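def encode(string)
  string.each_codepoint.inject("") do |buffer, cp|
    # Replace anything outside printable ASCII (0x20..0x7E) with a hex reference.
    cp = "&#x#{cp.to_s(16)};" unless cp >= 0x20 && cp <= 0x7E
    # String#<< accepts an Integer codepoint as well as a String.
    buffer << cp
  end
end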
puts encode "Japan"
# => "Japan"
puts encode "日本"
# => "&#x65e5;&#x672c;"
puts encode "Japón"
# => "Jap&#xf3;n"
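
To check the output of encode, it can help to reverse it. The block below is a minimal sketch of that inverse, not part of the original gist: the name decode is hypothetical, and it assumes the input only uses the hexadecimal form (&#x...;) that encode emits, so decimal and named references are left untouched.

```ruby
# coding: utf-8
#
# Hypothetical inverse of encode (not in the original gist): turn hexadecimal
# character references such as "&#x65e5;" back into the characters they name.
def decode(string)
  # $1 holds the hex digits; pack("U") builds the UTF-8 character for that codepoint.
  string.gsub(/&#x([0-9a-fA-F]+);/) { [$1.hex].pack("U") }
end

puts decode("&#x65e5;&#x672c;")
# prints 日本
puts decode("Jap&#xf3;n")
# prints Japón
```

Plain ASCII text passes through unchanged, so decode(encode(s)) should return s for input that does not already contain text shaped like a hexadecimal reference.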