@abhishek-shrm
Last active August 3, 2020 16:44
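def create_corpus(result):
    # Collect the distinct document IDs referenced in the result set
    unique_docid = result['docid'].unique()
    # Select the rows of the global DataFrame `df` whose docid is in that set
    condition = df['docid'].isin(unique_docid)
    corpus = df[condition].reset_index(drop=True)
    # Drop the url column from the corpus
    corpus = corpus.drop(columns='url')
    print('Number of Rows=>', len(corpus))
    return corpus

# `df` and `training_result` must already be defined before this call
training_corpus = create_corpus(training_result)
training_corpus.head()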
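
The snippet above depends on a global `df` and on `training_result`, neither of which is defined in this gist. The following is a minimal, self-contained sketch of the same filtering step on toy data. The column names `qid`, `title`, and `body`, the sample values, and the extra `documents` parameter (used here in place of the global) are assumptions made for illustration, not part of the original code.

```python
import pandas as pd


def create_corpus_from(result: pd.DataFrame, documents: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `documents` whose docid occurs in `result`, without the url column."""
    unique_docid = result['docid'].unique()
    condition = documents['docid'].isin(unique_docid)
    corpus = documents[condition].reset_index(drop=True)
    return corpus.drop(columns='url')


# Hypothetical document collection (stands in for the global `df`)
toy_documents = pd.DataFrame({
    'docid': ['D1', 'D2', 'D3'],
    'url': ['http://example.com/1', 'http://example.com/2', 'http://example.com/3'],
    'title': ['first', 'second', 'third'],
    'body': ['text one', 'text two', 'text three'],
})

# Hypothetical query results (stands in for `training_result`); D3 appears twice
toy_result = pd.DataFrame({
    'qid': [1, 1, 2],
    'docid': ['D1', 'D3', 'D3'],
})

toy_corpus = create_corpus_from(toy_result, toy_documents)
print('Number of Rows=>', len(toy_corpus))
```

Passing the document DataFrame as an argument rather than reading a global makes the function easier to test in isolation. Repeated docids in the result do not duplicate corpus rows, because `isin` only filters the rows of the document DataFrame.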