@madogiwa0124
Last active May 27, 2019 08:11
mecab with rails on Docker

information

name version
ruby 2.5.1
rails latest
mecab latest

mecab wrapper class

mecab_client.rb

puts MecabClient.new("すもももももももものうち").parse
=>
すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
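
Each line of the output above is a surface form, a tab, and then a comma-separated feature list. With the IPA dictionary the features are, in order: part of speech, three part-of-speech subcategories, conjugation type, conjugation form, base form, reading, and pronunciation. The sketch below splits one such line into named fields. It works on a literal string, so it runs without MeCab installed; `FEATURE_KEYS` and `parse_mecab_line` are hypothetical names chosen for this illustration and are not part of the gist or of MeCab.

```ruby
# Labels for the nine IPADIC feature columns (names are this sketch's own).
FEATURE_KEYS = %i[
  pos pos_detail1 pos_detail2 pos_detail3
  conjugation_type conjugation_form
  base_form reading pronunciation
].freeze

# Split one MeCab output line into the surface form plus a hash of features.
def parse_mecab_line(line)
  surface, features = line.chomp.split("\t", 2)
  { surface: surface }.merge(FEATURE_KEYS.zip(features.split(",")).to_h)
end

line = "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ"
parsed = parse_mecab_line(line)

puts parsed[:surface]
puts parsed[:pos]
puts parsed[:reading]
```

MecabClient keeps only the surface form and the first three feature columns; the remaining columns (base form, reading, pronunciation) are available in the same way if they are needed.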

pp MecabClient.new("すもももももももものうち").words
=>
[#<MecabClient::Word:0x00005576e2965948
  @category1="名詞",
  @category2="一般",
  @category3="*",
  @text="すもも">,
 #<MecabClient::Word:0x00005576e29655b0
  @category1="助詞",
  @category2="係助詞",
  @category3="*",
  @text="も">,
 #<MecabClient::Word:0x00005576e2965218
  @category1="名詞",
  @category2="一般",
  @category3="*",
  @text="もも">,
 #<MecabClient::Word:0x00005576e2964e80
  @category1="助詞",
  @category2="係助詞",
  @category3="*",
  @text="も">,
 #<MecabClient::Word:0x00005576e2964ae8
  @category1="名詞",
  @category2="一般",
  @category3="*",
  @text="もも">,
 #<MecabClient::Word:0x00005576e2964750
  @category1="助詞",
  @category2="連体化",
  @category3="*",
  @text="の">,
 #<MecabClient::Word:0x00005576e29643b8
  @category1="名詞",
  @category2="非自立",
  @category3="副詞可能",
  @text="うち">]
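
A typical use of the `words` list is to keep only the nouns and count them. The sketch below is a minimal illustration of that idea. To stay runnable without MeCab, it uses a `Struct` named `Word` as a stand-in for `MecabClient::Word` (an assumption of this example) and fills it with the values shown in the output above.

```ruby
# Stand-in for MecabClient::Word so the sketch runs without MeCab installed.
Word = Struct.new(:text, :category1, :category2, :category3) do
  def noun?
    category1 == "名詞"
  end
end

# Same tokens as the `words` output above.
words = [
  Word.new("すもも", "名詞", "一般", "*"),
  Word.new("も", "助詞", "係助詞", "*"),
  Word.new("もも", "名詞", "一般", "*"),
  Word.new("も", "助詞", "係助詞", "*"),
  Word.new("もも", "名詞", "一般", "*"),
  Word.new("の", "助詞", "連体化", "*"),
  Word.new("うち", "名詞", "非自立", "副詞可能")
]

# Keep only the nouns and collect their surface forms.
nouns = words.select(&:noun?).map(&:text)

# Count how often each noun appears (each_with_object also works on Ruby 2.5).
counts = nouns.each_with_object(Hash.new(0)) { |text, tally| tally[text] += 1 }

puts nouns.join(", ")
counts.each { |text, count| puts "#{text}: #{count}" }
```

With the real class, the same filter would be `MecabClient.new(text).words.select(&:noun?)`.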

build for mac

$ docker build -t mecab-with-rails .
$ docker run -v mount_from_path:mount_to_path -p 3000:3000 -d -it --name mecab-with-rails mecab-with-rails
$ docker exec -it mecab-with-rails bash

remove

 $ docker stop mecab-with-rails
 $ docker rm mecab-with-rails
 $ docker rmi mecab-with-rails

Dockerfile

FROM ruby:2.5.1
ENV LANG=C.UTF-8
ENV WORK_SPACE=/work
RUN mkdir ${WORK_SPACE}
WORKDIR ${WORK_SPACE}
# Refresh the package index and install in the same layer so the index is never stale.
# -y is required: docker build cannot answer apt's confirmation prompt.
RUN apt-get update && \
    apt-get install -y libmecab2 libmecab-dev mecab mecab-ipadic mecab-ipadic-utf8 mecab-utils && \
    apt-get install -y nodejs build-essential libpq-dev mysql-server
ADD Gemfile ${WORK_SPACE}/Gemfile
RUN gem install bundler
RUN bundle install
EXPOSE 3000

Gemfile

source "https://rubygems.org"
gem "rails"
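
mecab_client.rb

# Assumption: the MeCab Ruby binding that defines MeCab::Tagger is installed.
require "mecab"

class MecabClient
  attr_reader :text, :parsed_text, :words

  def initialize(text)
    @text = text
    parse
  end

  # Runs MeCab over @text, stores the raw output and the Word list,
  # and returns the raw output.
  def parse
    @parsed_text = MeCab::Tagger.new.parse(@text)
    @words = parsed_text_rows.map { |row| build_word(row) }
    @parsed_text
  end

  private

  # MeCab ends its output with an "EOS" line; drop it so only token rows remain.
  def parsed_text_rows
    @parsed_text.split("\n")[0...-1]
  end

  # A row is "<surface>\t<comma-separated features>".
  def build_word(row)
    text, features = row.split("\t", 2)
    properties = features.split(",")
    Word.new(
      text: text,
      category1: properties[0],
      category2: properties[1],
      category3: properties[2]
    )
  end

  class Word
    attr_reader :text, :category1, :category2, :category3

    def initialize(text: nil, category1: nil, category2: nil, category3: nil)
      @text = text
      @category1 = category1
      @category2 = category2
      @category3 = category3
    end

    # category1 is the part of speech; "名詞" means noun.
    def noun?
      @category1 == "名詞"
    end

    # category2 is the first subcategory; "数" means number.
    def number?
      @category2 == "数"
    end
  end
end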