@pandanote-info
Created January 7, 2022 07:47
Sample code snippet for computing TF-IDF
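import math
import numpy as np

# Assumes bow is a SciPy sparse bag-of-words matrix (documents x terms)
# and dim = bow.shape; both are defined outside this fragment.

# TF: divide each row (document) by its total term count
aa = bow.astype(np.float64)  # float copy, so fractional weights are not truncated
np.set_printoptions(threshold=np.inf, formatter={'float': '{:.8f}'.format})
for i in range(dim[0]):
    ar = bow.getrow(i)
    rowsum = ar.sum()  # total term count in document i (must be non-zero)
    aa[i] = ar / rowsum
# IDF (ln): scale each column (term) by ln(N / document frequency)
for j in range(dim[1]):
    ac = aa.getcol(j)
    idf = math.log(dim[0] / ac.getnnz())  # getnnz() = number of documents containing term j
    aa[0:dim[0], j] = ac * idf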
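
The two loops above compute, for document i and term j, the weight (count_ij / total count in document i) * ln(N / df_j), where N is the number of documents and df_j is the number of documents containing term j. The following is a minimal vectorized sketch of the same formula, not the author's code. The function name `tfidf` and the toy matrix are hypothetical, and the zero guards for empty documents and unused terms are an added assumption (the loop version would divide by zero in those cases).

```python
import numpy as np
from scipy.sparse import csr_matrix, diags


def tfidf(bow):
    """TF-IDF with TF = count / document length and IDF = ln(N / df)."""
    bow = csr_matrix(bow, dtype=np.float64)
    n_docs = bow.shape[0]

    # Total term count per document (row sums), as a flat array
    row_sums = np.asarray(bow.sum(axis=1)).ravel()
    # Number of documents containing each term (stored entries per column)
    doc_freq = bow.getnnz(axis=0)

    # Assumption: empty documents get TF 0 instead of raising a division error
    inv_row = np.divide(1.0, row_sums,
                        out=np.zeros_like(row_sums), where=row_sums > 0)
    # Assumption: terms that occur in no document get IDF 0
    idf = np.zeros(bow.shape[1])
    seen = doc_freq > 0
    idf[seen] = np.log(n_docs / doc_freq[seen])

    # Row scaling applies TF, column scaling applies IDF
    return diags(inv_row) @ bow @ diags(idf)


# Hypothetical toy corpus: 3 documents, 4 terms
bow = csr_matrix(np.array([[2, 1, 0, 0],
                           [0, 1, 1, 0],
                           [1, 1, 0, 2]]))
weights = tfidf(bow).toarray()
print(weights)
```

With this unsmoothed IDF, a term that occurs in every document (the second column of the toy matrix) receives weight 0, because ln(N / N) = 0.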