ZhangHao ywzhang909

ywzhang909 / build_corpus.py
Last active April 26, 2022 05:59
common string utils #string
# !pip install jieba tqdm
from tqdm import tqdm
from typing import List, Dict
import jieba
import numpy as np


def load_char_vocab_and_corpus(train_data: List[str], min_count: int = 2):
    chars = dict()