@frangarcia
Last active August 29, 2015 14:20
Replacing surrogate characters with HTML entities
@Grapes(
@Grab(group='commons-lang', module='commons-lang', version='2.6')
)
/*@Grapes(
@Grab(group='org.apache.commons', module='commons-lang3', version='3.3')
)*/
import org.apache.commons.lang.StringEscapeUtils
//import org.apache.commons.lang3.StringEscapeUtils
String sentence = "This is my < first caption πŸ˜„πŸ˜ƒπŸ˜€πŸ˜Šβ˜ΊοΈπŸ˜‰πŸ˜πŸ˜˜πŸ˜šπŸ˜—πŸ˜™πŸ˜œπŸ˜πŸ˜©πŸ˜§πŸ‘³πŸ˜ΉπŸ‘ΏπŸ˜¬πŸ™ŠπŸ‘„πŸ‘ƒπŸ‘€πŸ™ŒπŸ‘ˆπŸ‘ˆπŸ’’πŸ‘‹πŸ™†πŸ‘”πŸ’šπŸ’„πŸ’„πŸ’“πŸ’–πŸ’ͺπŸ™ŠπŸ’€πŸ‘½πŸƒπŸ™‹πŸ’πŸ‘†πŸ‘―πŸ’‡πŸ‘–πŸ‘–πŸšπŸ—ΌπŸŒ…πŸŽ‘β›²οΈπŸš”πŸš¨βš οΈπŸ‡¬πŸ‡§πŸŽͺπŸš²πŸš²πŸš‚πŸ—ΌπŸ­πŸš—πŸš†πŸš…β›΅οΈπŸπŸ”πŸ©πŸ•πŸ–πŸπŸŒΊπŸŒΊπŸŒΌπŸ‡πŸπŸπŸŒΈ"
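String sentenceOut = ""
// Walk the string one UTF-16 code unit at a time.
(0..<sentence.size()).each {
    // The bounds check avoids reading past the end when the string ends in a lone high surrogate.
    if (Character.isHighSurrogate(sentence.charAt(it)) && it + 1 < sentence.size()) {
        // A high surrogate starts a pair, so escape both halves together.
        String pair = sentence[it] + sentence[it + 1]
        sentenceOut += StringEscapeUtils.escapeHtml(pair)
        //sentenceOut += StringEscapeUtils.escapeHtml4(pair)
        println "escaping ${pair}"
    } else if (Character.isLowSurrogate(sentence.charAt(it))) {
        // Low surrogates are handled together with the preceding high surrogate, so skip them here.
    } else {
        sentenceOut += sentence[it]
    }
}
return sentenceOut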
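
For comparison, the same replacement can be sketched without the Commons Lang dependency by iterating over code points instead of individual chars. This is not part of the original gist: the helper name `escapeSupplementary` is hypothetical, and it assumes a decimal numeric character reference (`&#NNNN;`) is an acceptable entity form for every character stored as a surrogate pair.

```groovy
// Hypothetical helper (not from the original gist): replaces each supplementary
// code point (one stored as a surrogate pair) with a decimal numeric character
// reference and copies everything else through unchanged.
String escapeSupplementary(String input) {
    StringBuilder out = new StringBuilder()
    int i = 0
    while (i < input.length()) {
        // codePointAt combines a valid high/low surrogate pair into one code point.
        int codePoint = input.codePointAt(i)
        if (Character.isSupplementaryCodePoint(codePoint)) {
            out.append('&#' + codePoint + ';')
        } else {
            // BMP characters (and unpaired surrogates) are copied as-is.
            out.appendCodePoint(codePoint)
        }
        // Advance by 2 chars for a surrogate pair, otherwise by 1.
        i += Character.charCount(codePoint)
    }
    return out.toString()
}

// U+1F604 is written here as its surrogate pair \uD83D\uDE04.
assert escapeSupplementary('Hi < \uD83D\uDE04') == 'Hi < &#128516;'
```

Like the script above, this sketch leaves `<` and other BMP characters unescaped; only characters outside the BMP are rewritten.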