@rummelonp
Last active December 16, 2015 10:39
MeCab + Ruby: a quick benchmark
# -*- coding: utf-8 -*-
require 'benchmark'
require 'MeCab'
require 'natto'
require 'ffi'

# DaizuNatto: Natto::MeCab with the bottleneck in Natto#parse removed.
# It calls mecab_sparse_tostr directly and only fixes up the encoding.
class DaizuNatto < Natto::MeCab
  def parse(str)
    raise ArgumentError.new 'String to parse cannot be nil' if str.nil?
    mecab_sparse_tostr(@tagger, str)
      .force_encoding(Encoding.default_external)
  end
end

# Negitoro: a minimal FFI binding I wrote myself.
class Negitoro
  extend FFI::Library
  ffi_lib 'mecab'

  attach_function :mecab_new2, [:string], :pointer
  attach_function :mecab_sparse_tostr, [:pointer, :string], :string
  attach_function :mecab_destroy, [:pointer], :void

  # Build the finalizer in a class method so the proc does not capture
  # the instance it is supposed to clean up after.
  def self.clean_proc(tagger)
    Proc.new { mecab_destroy tagger }
  end

  def initialize(option = "")
    @tagger = mecab_new2 option
    ObjectSpace.define_finalizer self, self.class.clean_proc(@tagger)
  end

  def parse(str)
    raise ArgumentError.new 'String to parse cannot be nil' if str.nil?
    mecab_sparse_tostr(@tagger, str)
      .force_encoding(Encoding.default_external)
  end
end
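
# Benchmark: parse the same sentence 10,000 times with each tagger.
def do_parse(tagger)
  10000.times { tagger.parse("太郎はこの本を二郎を見た女性に渡した。") }
end

# Labels (kept in Japanese so they match the output):
#   和布蕪 (mekabu)         MeCab::Tagger from require 'MeCab'
#   納豆 (natto)            Natto::MeCab
#   大豆納豆 (daizu natto)  DaizuNatto
#   ネギトロ (negitoro)     Negitoro
Benchmark.bmbm(10) do |x|
  x.report("和布蕪") { do_parse(MeCab::Tagger.new) }
  x.report("納豆") { do_parse(Natto::MeCab.new) }
  x.report("大豆納豆") { do_parse(DaizuNatto.new) }
  x.report("ネギトロ") { do_parse(Negitoro.new) }
end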
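
The `clean_proc` class method in `Negitoro` builds its finalizer outside of `initialize`, so the proc closes over the raw tagger pointer rather than the instance itself. A finalizer that captures `self` would keep the object reachable, and it would never be collected. Below is a minimal sketch of the same pattern without FFI. `NativeHandle`, `release`, and `finalizer_for` are hypothetical names made up for illustration; they are not part of MeCab, natto, or ffi.

```ruby
# Hypothetical stand-in for a native resource; not part of MeCab or FFI.
class NativeHandle
  # Records which ids have been released, so the effect is observable.
  RELEASED = []

  # Stand-in for a C-level destroy call such as mecab_destroy.
  def self.release(id)
    RELEASED << id
  end

  # Built in a class method: the proc captures only `id` (and the class),
  # never the instance that is being finalized.
  def self.finalizer_for(id)
    proc { release(id) }
  end

  attr_reader :id

  def initialize(id)
    @id = id
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(id))
  end
end

handle = NativeHandle.new(1)
```

Ruby passes the object id to the finalizer when it runs; a non-lambda proc simply ignores the extra argument, which is why the block takes no parameters here.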
@rummelonp (author) commented:

Rehearsal ----------------------------------------------
和布蕪          0.090000   0.000000   0.090000 (  0.100041)
納豆           0.130000   0.000000   0.130000 (  0.127894)
大豆納豆         0.090000   0.000000   0.090000 (  0.093463)
ネギトロ         0.100000   0.010000   0.110000 (  0.101709)
------------------------------------- total: 0.420000sec

                 user     system      total        real
和布蕪          0.090000   0.000000   0.090000 (  0.090971)
納豆           0.120000   0.000000   0.120000 (  0.124085)
大豆納豆         0.090000   0.000000   0.090000 (  0.090563)
ネギトロ         0.100000   0.000000   0.100000 (  0.094437)
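
To make the table easier to compare, the real (wall-clock) times from the second pass can be converted to a per-call cost and a ratio against 和布蕪. This is only a sketch of the arithmetic on the numbers reported above; `summarize` and the constant names are made up for illustration, and 10,000 is the iteration count from `do_parse`.

```ruby
# Real (wall-clock) seconds copied from the non-rehearsal pass above.
REAL_SECONDS = {
  "和布蕪"   => 0.090971,
  "納豆"     => 0.124085,
  "大豆納豆" => 0.090563,
  "ネギトロ" => 0.094437
}.freeze

ITERATIONS = 10_000                      # from do_parse
BASELINE   = REAL_SECONDS.fetch("和布蕪") # baseline for the ratios

# Returns one hash per label with the per-call cost in microseconds
# and the run time relative to the baseline.
def summarize(real_seconds, baseline, iterations)
  real_seconds.map do |label, seconds|
    {
      label: label,
      microseconds_per_call: (seconds / iterations * 1_000_000).round(2),
      relative_to_baseline: (seconds / baseline).round(3)
    }
  end
end

SUMMARY = summarize(REAL_SECONDS, BASELINE, ITERATIONS)

SUMMARY.each do |row|
  puts format("%s: %.2f us/call, %.3fx baseline",
              row[:label], row[:microseconds_per_call], row[:relative_to_baseline])
end
```

On these figures every variant costs roughly 9 to 12 microseconds per `parse` call. 納豆 takes about a third longer than 和布蕪, while 大豆納豆 and ネギトロ land within a few percent of it. This is a single run, so the small differences may well be noise.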
