Infinity Future infinityfuture
@infinityfuture
infinityfuture / fasttext_gensim.py
Created February 6, 2019 14:50
Train FastText Chinese character vectors with Gensim
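import pickle

from gensim.models.fasttext import FastText

# Each sentence is a list of single-character tokens, e.g. [['你', '好', '吗']]
with open('./x.pkl', 'rb') as f:
    sentences = pickle.load(f)

# Gensim 3.x API; Gensim 4.0+ renames `size` to `vector_size`
# and the `sentences` keyword below to `corpus_iterable`.
model = FastText(size=64, window=10, min_count=1)
model.build_vocab(sentences=sentences)
model.train(sentences=sentences, total_examples=len(sentences), epochs=10)
# Write the vectors in plain-text word2vec format
model.wv.save_word2vec_format('./fasttext64.txt')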
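The script above ends by writing `./fasttext64.txt` in the plain-text word2vec format: a header line with the vocabulary size and dimension, then one `token v1 ... vN` line per entry. As a rough sketch of how such a file could be read back without Gensim, the helper below parses that layout. The name `load_word2vec_text` and the tiny 2-dimensional sample are made up for illustration and are not part of the gist.

```python
import os
import tempfile


def load_word2vec_text(path):
    """Parse plain-text word2vec output: a 'count dim' header, then 'token v1 ... vN' lines."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        _count, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors, dim


# Hypothetical stand-in for './fasttext64.txt' (the values are invented).
sample = "3 2\n你 1.0 0.0\n好 0.8 0.6\n吗 0.0 1.0\n"
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)

vectors, dim = load_word2vec_text(tmp.name)
os.remove(tmp.name)
print(dim, sorted(vectors))
```

In practice Gensim's own `KeyedVectors.load_word2vec_format` is the usual way to load the file; the manual parser only makes the on-disk layout explicit.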
@infinityfuture
infinityfuture / textrank_summarization_word2vec.py
Created October 4, 2018 08:11
Generate a summary with the TextRank algorithm, using word2vec for sentence similarity
"""
Reference:
http://www.hankcs.com/nlp/textrank-algorithm-to-extract-the-keywords-java-implementation.html
http://www.hankcs.com/nlp/textrank-algorithm-java-implementation-of-automatic-abstract.html
Chinese embeddings from:
https://github.com/Embedding/Chinese-Word-Vectors
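The preview above is cut off before any code, so the following is only a guess at the general idea, not the gist's implementation. One common way to get a sentence similarity from word vectors is to average the vectors of each sentence's tokens and compare the averages by cosine similarity. The function names and the 2-dimensional toy embeddings are assumptions for illustration.

```python
import math


def sentence_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_matrix(sentences, embeddings, dim):
    """Pairwise sentence similarities, clipped at zero, with an empty diagonal."""
    vecs = [sentence_vector(s, embeddings, dim) for s in sentences]
    n = len(vecs)
    return [
        [0.0 if i == j else max(0.0, cosine(vecs[i], vecs[j])) for j in range(n)]
        for i in range(n)
    ]


# Invented 2-d embeddings; real Chinese-Word-Vectors files are much larger.
embeddings = {
    "猫": [1.0, 0.1],
    "狗": [0.9, 0.2],
    "股票": [0.1, 1.0],
    "市场": [0.2, 0.9],
}
sentences = [["猫", "狗"], ["狗", "猫", "未知"], ["股票", "市场"]]
matrix = similarity_matrix(sentences, embeddings, dim=2)
for row in matrix:
    print([round(v, 3) for v in row])
```

The resulting matrix can serve as the edge weights of the sentence graph that TextRank ranks. Out-of-vocabulary tokens such as `未知` are simply ignored in this sketch.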
@infinityfuture
infinityfuture / textrank_keywords_word2vec.py
Created October 4, 2018 08:08
Extract keywords with TextRank, using word2vec as the similarity measure
"""
Reference:
http://www.hankcs.com/nlp/textrank-algorithm-to-extract-the-keywords-java-implementation.html
http://www.hankcs.com/nlp/textrank-algorithm-java-implementation-of-automatic-abstract.html
Chinese embeddings from:
https://github.com/Embedding/Chinese-Word-Vectors
@infinityfuture
infinityfuture / textrank_summarization.py
Last active October 4, 2018 08:11
Use the TextRank algorithm to generate a summary
"""
Reference:
http://www.hankcs.com/nlp/textrank-algorithm-to-extract-the-keywords-java-implementation.html
http://www.hankcs.com/nlp/textrank-algorithm-java-implementation-of-automatic-abstract.html
"""
# Requires Gensim < 4.0; the `gensim.summarization` module was removed in 4.0.
from gensim.summarization.bm25 import get_bm25_weights
import numpy as np
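The preview stops at the imports, so the ranking step is not visible. As a minimal sketch of what TextRank does with a sentence-to-sentence weight matrix (for instance BM25 scores), the function below runs a damped power iteration with NumPy. The function name, the damping factor of 0.85, and the hand-written weight matrix are assumptions for illustration, not values taken from the gist.

```python
import numpy as np


def textrank_scores(weights, damping=0.85, max_iter=100, tol=1e-6):
    """Rank graph nodes by damped power iteration over a non-negative weight matrix."""
    w = np.asarray(weights, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # ignore a sentence's similarity to itself
    row_sums = w.sum(axis=1, keepdims=True)
    # Row-normalise so each sentence spreads its score over its neighbours.
    transition = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    n = w.shape[0]
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = (1.0 - damping) / n + damping * (transition.T @ scores)
        converged = np.abs(updated - scores).sum() < tol
        scores = updated
        if converged:
            break
    return scores


# Hand-written stand-in for a BM25 weight matrix over four sentences.
weights = [
    [0.0, 2.1, 0.3, 0.0],
    [1.8, 0.0, 1.2, 0.4],
    [0.2, 1.5, 0.0, 0.1],
    [0.0, 0.6, 0.2, 0.0],
]
scores = textrank_scores(weights)
top_two = sorted(np.argsort(scores)[::-1][:2].tolist())  # keep document order
print(scores.round(3), top_two)
```

A summary would then be the top-scoring sentences emitted in their original order, which is why the selected indices are sorted at the end.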
@infinityfuture
infinityfuture / textrank_keywords.py
Last active February 15, 2019 16:01
Extract keywords with TextRank
"""
Reference:
http://www.hankcs.com/nlp/textrank-algorithm-to-extract-the-keywords-java-implementation.html
http://www.hankcs.com/nlp/textrank-algorithm-java-implementation-of-automatic-abstract.html
"""
import numpy as np
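Only the imports of this file are shown. For orientation, here is a hedged pure-Python sketch of keyword TextRank as it is usually described: link words that co-occur inside a sliding window, then score each word from the scores of its neighbours. The function names, the window size, and the toy token list are assumptions, and the gist's own code may differ.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=5):
    """Link tokens that fall inside the same sliding window of `window` tokens."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def rank_keywords(tokens, window=5, damping=0.85, iterations=30):
    """Return the tokens sorted by TextRank score, highest first."""
    graph = cooccurrence_graph(tokens, window)
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbour's score.
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in neighbours)
            for word, neighbours in graph.items()
        }
    return sorted(scores, key=scores.get, reverse=True)


# Toy pre-segmented input; real text would first pass through a word segmenter.
tokens = ["程序员", "编写", "程序", "程序员", "调试", "程序", "程序员", "维护", "系统"]
keywords = rank_keywords(tokens, window=2)
print(keywords[:3])
```

With `window=2` only adjacent tokens are linked, which keeps the toy graph sparse enough for the ranking to be meaningful. Stop-word and part-of-speech filtering are left out of this sketch.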
@infinityfuture
infinityfuture / pytorch_global_max_pooling.py
Last active October 3, 2018 16:47
PyTorch global max pooling
import torch


def global_max_pooling(tensor, dim, topk):
    """Global max pooling: keep the `topk` largest values along `dim`."""
    ret, _ = torch.topk(tensor, topk, dim)
    return ret
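A short usage sketch may help show what the function returns. The input tensor and its shape are made up for illustration. With `topk=1` the pooled dimension shrinks to size 1, and larger values of `topk` keep the k largest entries in descending order.

```python
import torch


def global_max_pooling(tensor, dim, topk):
    """Global max pooling: keep the `topk` largest values along `dim`."""
    ret, _ = torch.topk(tensor, topk, dim)
    return ret


# Illustrative input of shape (batch=1, channels=2, length=3).
x = torch.tensor([[[1.0, 5.0, 3.0],
                   [4.0, 2.0, 6.0]]])

pooled = global_max_pooling(x, dim=2, topk=1)  # shape (1, 2, 1)
top2 = global_max_pooling(x, dim=2, topk=2)    # shape (1, 2, 2)
print(pooled.shape, pooled.squeeze(-1).tolist())
print(top2.shape, top2.tolist())
```

Calling `.squeeze(dim)` on the `topk=1` result gives the shape that a conventional global max pooling layer would produce.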