@bkazez
Created July 23, 2023 22:01
Accent Removal Benchmark 1
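require 'benchmark'
require 'i18n'

# Declare :en as available so I18n's locale enforcement accepts the default locale.
I18n.config.available_locales = :en

# Combining marks to strip: Combining Diacritical Marks (U+0300..U+036F),
# the Supplement (U+1DC0..U+1DFF), and Combining Half Marks (U+FE20..U+FE2F).
COMBINING_DIACRITICS = [*0x1DC0..0x1DFF, *0x0300..0x036F, *0xFE20..0xFE2F].pack('U*')

def removeaccents(str)
  str
    .unicode_normalize(:nfd) # Decompose characters
    .tr(COMBINING_DIACRITICS, '') # An empty replacement deletes the matched characters
    .unicode_normalize(:nfc) # Recompose characters
end

# The path is resolved relative to the current working directory.
str = File.read(File.expand_path("benchmark_input.txt"))

# bmbm runs a rehearsal pass before the measured pass.
Benchmark.bmbm do |x|
  x.report("I18n") { I18n.transliterate(str) }
  x.report("removeaccents") { removeaccents(str) }
end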
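
The benchmark only measures speed, so a quick check of what the normalize-and-strip approach actually returns may be useful. The sketch below repeats the gist's definitions so it runs on its own with only the standard library; the sample strings are illustrative assumptions and are not taken from benchmark_input.txt.

```ruby
# Same combining-mark ranges as the benchmark script, repeated so this runs standalone.
COMBINING_DIACRITICS = [*0x1DC0..0x1DFF, *0x0300..0x036F, *0xFE20..0xFE2F].pack('U*')

def removeaccents(str)
  str
    .unicode_normalize(:nfd)      # Split base letters from their combining marks
    .tr(COMBINING_DIACRITICS, '') # Delete the combining marks
    .unicode_normalize(:nfc)      # Recompose whatever is left
end

# Illustrative inputs (assumptions, not the benchmark data).
puts removeaccents("Crème brûlée") # => Creme brulee
puts removeaccents("Ångström")     # => Angstrom

# Letters with no canonical decomposition pass through unchanged.
puts removeaccents("Søren")        # => Søren
puts removeaccents("Straße")       # => Straße
```

Characters such as "ø" and "ß" are single code points with no combining mark to strip, so this approach leaves them as they are. Its output can therefore differ from what a transliteration table produces, which is worth keeping in mind when comparing the two timings.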