@lastlegion
Last active November 7, 2018 15:20
LexRank summarization in Python using sumy
# Import library essentials
from sumy.parsers.plaintext import PlaintextParser  # Plain-text parser; other parsers are available for HTML etc.
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer  # LexRank; other algorithms are also built in

file_path = "plain_text.txt"  # Name of the plain-text file
parser = PlaintextParser.from_file(file_path, Tokenizer("english"))
summarizer = LexRankSummarizer()
summary = summarizer(parser.document, 5)  # Summarize the document in 5 sentences
for sentence in summary:
    print(sentence)
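
For readers curious what LexRank does with the parsed sentences, here is a minimal pure-Python sketch of the idea: link sentences whose word overlap is high enough, then rank them by centrality in that graph with a damped power iteration. This is not sumy's implementation. The function names, the plain word-count cosine similarity (no IDF weighting), and the default `threshold`, `damping`, and `iterations` values are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank_scores(sentences, threshold=0.1, damping=0.85, iterations=50):
    """Score each sentence by its centrality in a thresholded similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(tokenize(s)) for s in sentences]
    # Link two sentences when their similarity exceeds the (assumed) threshold.
    links = [[1.0 if cosine(u, v) > threshold else 0.0 for v in vectors] for u in vectors]
    # Turn each row into a probability distribution; a row with no links
    # (a sentence with no tokens) falls back to a uniform distribution.
    transition = []
    for row in links:
        total = sum(row)
        transition.append([x / total for x in row] if total else [1.0 / n] * n)
    # Damped power iteration, starting from a uniform score vector.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1.0 - damping) / n + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


def summarize(sentences, count):
    """Return the `count` highest-scoring sentences in their original order."""
    scores = lexrank_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:count]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(list_of_sentences, 5)` mirrors the `summarizer(parser.document, 5)` call above, except that this sketch expects the text to be split into sentences already, a step sumy handles through its `Tokenizer`.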
@jeffquach

@Jbrown214 You can use this method:
parser = PlaintextParser.from_string(string, Tokenizer("english"))

The documentation for sumy isn't the greatest, so try looking at the source code.

@josiahdavis

To use LSA instead of LexRank, import its summarizer: from sumy.summarizers.lsa import LsaSummarizer
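
Putting the two suggestions in this thread together, here is a rough sketch of a helper that summarizes a string and lets you pick between LexRank and LSA. The helper name `summarize_string`, the `SUMMARIZERS` table, and its short labels are made up for this example; only the sumy imports and calls come from the gist and the comments above. The sumy imports happen inside the function, so the algorithm name is validated before sumy is loaded.

```python
import importlib

# Summarizer classes mentioned in this thread, keyed by a short label.
# The labels themselves are hypothetical, chosen for this example.
SUMMARIZERS = {
    "lexrank": ("sumy.summarizers.lex_rank", "LexRankSummarizer"),
    "lsa": ("sumy.summarizers.lsa", "LsaSummarizer"),
}


def summarize_string(text, algorithm="lexrank", sentence_count=5, language="english"):
    """Summarize `text` with the chosen sumy summarizer and return a list of strings."""
    if algorithm not in SUMMARIZERS:
        raise ValueError("unknown algorithm: %r (choose from %s)" % (algorithm, sorted(SUMMARIZERS)))

    # Imported lazily so the algorithm name is checked before sumy is required.
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.nlp.tokenizers import Tokenizer

    module_name, class_name = SUMMARIZERS[algorithm]
    summarizer_class = getattr(importlib.import_module(module_name), class_name)

    # from_string takes the text directly instead of a file path.
    parser = PlaintextParser.from_string(text, Tokenizer(language))
    summarizer = summarizer_class()
    return [str(sentence) for sentence in summarizer(parser.document, sentence_count)]
```

For example, `summarize_string(article_text, algorithm="lsa", sentence_count=3)` should return up to three sentences chosen by LSA. Running it needs sumy installed, and the tokenizer typically also needs NLTK's sentence tokenizer data downloaded.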

@Redwa

Redwa commented Nov 7, 2018

Thank you @Jbrown214 and @jeffquach for asking and answering. I had the same question.
