@arosh
Last active November 4, 2015 13:12
Document-Term Matrix
LabelBinarizer : labels → one-hot representation
LabelEncoder : labels → 0, 1, ..., N-1
OneHotEncoder : 0, 1, ..., N-1 → one-hot representation
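
A minimal sketch of how the three encoders above relate, assuming scikit-learn and NumPy are installed. The sample labels and variable names are made up for illustration and are not from the original note.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer, LabelEncoder, OneHotEncoder

# Hypothetical sample labels, chosen only for illustration.
labels = ["cat", "dog", "bird", "dog"]

# LabelEncoder: labels -> integer codes 0..N-1 (classes are sorted).
le = LabelEncoder()
codes = le.fit_transform(labels)

# LabelBinarizer: labels -> one-hot rows in a single step.
lb = LabelBinarizer()
onehot_direct = lb.fit_transform(labels)

# OneHotEncoder: integer codes -> one-hot rows.
# It expects a 2-D array and returns a sparse matrix by default.
ohe = OneHotEncoder()
onehot_from_codes = ohe.fit_transform(codes.reshape(-1, 1)).toarray()

print(codes)             # integer code per label
print(onehot_direct)     # one row per label, one column per class
print(np.array_equal(onehot_direct, onehot_from_codes))
```

With three or more classes, `LabelBinarizer` should give the same matrix as `LabelEncoder` followed by `OneHotEncoder`. With exactly two classes, `LabelBinarizer` returns a single 0/1 column instead of two columns.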
https://github.com/recruit-tech/summpy/blob/master/summpy/lexrank.py#L34
CountVectorizer : text → document-term matrix
TfidfTransformer : document-term matrix → IDF weighting, L2 normalization
TfidfVectorizer : text → IDF weighting, L2 normalization
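
A small sketch of the pipeline above, assuming scikit-learn and NumPy are installed and default parameters are used throughout. The three documents are invented for illustration. It checks that `CountVectorizer` followed by `TfidfTransformer` gives the same result as `TfidfVectorizer` alone.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Hypothetical toy corpus, chosen only for illustration.
docs = ["the cat sat", "the dog sat", "the cat saw the dog"]

# CountVectorizer: text -> sparse document-term matrix of raw counts.
cv = CountVectorizer()
counts = cv.fit_transform(docs)
vocab = sorted(cv.vocabulary_)  # terms in column order

# TfidfTransformer: counts -> IDF-weighted, L2-normalized rows.
tfidf_two_step = TfidfTransformer().fit_transform(counts)

# TfidfVectorizer: text -> IDF-weighted, L2-normalized rows in one step.
tfidf_one_step = TfidfVectorizer().fit_transform(docs)

# With default settings the two routes should agree.
same = np.allclose(tfidf_two_step.toarray(), tfidf_one_step.toarray())

# Each row has unit L2 norm after normalization.
row_norms = np.linalg.norm(tfidf_one_step.toarray(), axis=1)

print(vocab)
print(counts.toarray())
print(same, row_norms)
```

The two-step route is convenient when the raw counts are also needed, or when the counts come from somewhere other than `CountVectorizer`.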