The challenge begins! Don't overthink it. A cube can be made in only a few shapes.
import glob
import math

line = ''
s = set()
# get all the .txt files matching the pattern
flist = glob.glob(r'E:\PROGRAMMING\PYTHON\programs\corpus2\*.txt')
# open each file >> tokenize the content >> and store it in a set
for fname in flist:
    with open(fname, "r") as tfile:  # the file is closed when the block exits
        line = tfile.read()  # read the content of the file and store it in "line"
    s = s.union(set(line.split(' ')))  # add this file's words to the set of all words
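The loop above splits on a single space, so newlines and tabs stay inside tokens and repeated spaces produce empty strings. Below is a minimal sketch of the same vocabulary-building step, assuming whitespace tokenization, lowercase folding, and UTF-8 input are acceptable. `build_vocabulary` is a hypothetical helper name, not part of the original script.

```python
def build_vocabulary(paths):
    """Return the set of distinct lowercase tokens found across the given text files."""
    vocab = set()
    for path in paths:
        # Assumption: the corpus files are UTF-8 encoded
        with open(path, "r", encoding="utf-8") as handle:
            # str.split() with no argument splits on any run of whitespace
            # and never yields empty strings
            vocab.update(handle.read().lower().split())
    return vocab
```

With the file list from the script, this would be called as `build_vocabulary(flist)`.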
import math
from textblob import TextBlob as tb

def tf(word, blob):
    # term frequency: share of the blob's words that equal `word`
    return blob.words.count(word) / len(blob.words)

def n_containing(word, bloblist):
    # number of blobs whose word list contains `word`
    return sum(1 for blob in bloblist if word in blob.words)

def idf(word, bloblist):
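A minimal plain-Python sketch of how term frequency and inverse document frequency combine into a tf-idf score, working on token lists rather than TextBlob objects. The smoothed form log(N / (1 + df)) and the `tfidf` helper are assumptions. This is one common variant, not necessarily the one the original file uses, and the three sample documents are made up for illustration.

```python
import math

def tf(word, tokens):
    # Share of the document's tokens that equal `word`
    return tokens.count(word) / len(tokens)

def n_containing(word, docs):
    # Number of documents whose token list contains `word`
    return sum(1 for tokens in docs if word in tokens)

def idf(word, docs):
    # Assumed variant: add 1 to the document count to avoid division by zero
    return math.log(len(docs) / (1 + n_containing(word, docs)))

def tfidf(word, tokens, docs):
    return tf(word, tokens) * idf(word, docs)

# Made-up sample corpus, already tokenized
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "birds sing at dawn".split(),
]

# Rank the words of the third document by tf-idf, highest first
scores = {word: tfidf(word, docs[2], docs) for word in set(docs[2])}
for word, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(word, round(score, 4))
```

With this smoothing, a word that appears in every document gets a slightly negative idf, so such words rank below everything else.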
# Given a list of words, remove any that are
# in a list of stop words.
def removeStopwords(wordlist, stopwords):
    return [w for w in wordlist if w not in stopwords]
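A short usage sketch for `removeStopwords`, with a made-up stop word list that is only an assumption for illustration. Passing the stop words as a set keeps each membership test constant-time on average. A list also works, but it is scanned linearly for every word.

```python
def removeStopwords(wordlist, stopwords):
    return [w for w in wordlist if w not in stopwords]

# Hypothetical stop word list, for illustration only
stopwords = {"the", "a", "of", "and"}

words = "the history of a small town".split()
filtered = removeStopwords(words, stopwords)
print(filtered)  # ['history', 'small', 'town']
```

The comparison is case-sensitive, so lowercase the words first if the stop word list is lowercase.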
date | value
---|---
2013-01 | 53
2013-02 | 165
2013-03 | 269
2013-04 | 344
2013-05 | 376
2013-06 | 410
2013-07 | 421
2013-08 | 405
2013-09 | 376