@kaisugi
Created January 11, 2023 01:13

If you want to put the dictionary under the path /anywhere/you/like:

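# Download the MANBYO user dictionary and move it to the target directory.
!wget https://sociocom.jp/~data/2018-manbyo/data/MANBYO_201907_Dic-utf8.dic
# Create the directory first; otherwise mv would rename the file to "like".
!mkdir -p /anywhere/you/like
!mv MANBYO_201907_Dic-utf8.dic /anywhere/you/like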
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model = AutoModelForMaskedLM.from_pretrained("alabnii/jmedroberta-base-manbyo-wordpiece")
# Point MeCab at the user dictionary with its -u option.
tokenizer = AutoTokenizer.from_pretrained(
    "alabnii/jmedroberta-base-manbyo-wordpiece",
    mecab_kwargs={
        "mecab_option": "-u /anywhere/you/like/MANBYO_201907_Dic-utf8.dic"
    },
)

# Fill-mask check using the model and the tokenizer configured above.
fmp = pipeline("fill-mask", model=model, tokenizer=tokenizer)
fmp("夜の底が[MASK]なった。")
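Since the dictionary location is passed to MeCab as a plain option string, it can help to check that the file is really there before creating the tokenizer. The following is a minimal sketch, not part of transformers: `build_mecab_kwargs` is a hypothetical helper name, and it assumes the same `-u <path>` option format used above.

```python
from pathlib import Path


def build_mecab_kwargs(user_dic_path):
    """Return tokenizer kwargs that point MeCab at a user dictionary.

    Hypothetical helper (not a transformers API): it only verifies that
    the file exists and formats the `-u` option shown in the snippet above.
    """
    dic = Path(user_dic_path)
    if not dic.is_file():
        # Fail early with a clear message instead of a later MeCab error.
        raise FileNotFoundError(f"MeCab user dictionary not found: {dic}")
    return {"mecab_kwargs": {"mecab_option": f"-u {dic}"}}
```

The result can be unpacked into the tokenizer call, e.g. `AutoTokenizer.from_pretrained("alabnii/jmedroberta-base-manbyo-wordpiece", **build_mecab_kwargs("/anywhere/you/like/MANBYO_201907_Dic-utf8.dic"))`.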