- Enhances the document encoder with an additional graph-structured encoder to maintain global context and local characteristics.
- Both document encoder and graph encoder are used for abstract generation
- Outperforms pretrained language models (e.g. BERT)
- Generates better summaries
- Automatic evaluation might not catch all the important errors
- Use Stanford CoreNLP
- OpenIE model
- Performs coreference resolution
- Does not do global entity linking
- Builds the graph from the extracted OpenIE triples
- Subjects and objects are nodes connected by edges
- Collapse coreferential mentions
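The graph-construction steps above can be sketched as follows. This is a minimal illustration, not the paper's code: `coref_map` stands in for the output of CoreNLP's coreference resolution, mapping each mention to a canonical mention, and predicates are treated as intermediate nodes (subject -> predicate -> object).

```python
def build_graph(triples, coref_map):
    """triples: list of (subject, predicate, object) strings.
    Returns entity nodes (canonical mention -> set of mentions)
    and directed edges through predicate nodes."""
    nodes, edges = {}, []

    def node_for(mention):
        canonical = coref_map.get(mention, mention)  # collapse coref mentions
        nodes.setdefault(canonical, set()).add(mention)
        return canonical

    for subj, pred, obj in triples:
        s, o = node_for(subj), node_for(obj)
        edges.append((s, pred))   # subject -> predicate
        edges.append((pred, o))   # predicate -> object
    return nodes, edges

triples = [("Marie Curie", "won", "the Nobel Prize"),
           ("She", "discovered", "polonium")]
coref_map = {"She": "Marie Curie"}  # assumed coref output
nodes, edges = build_graph(triples, coref_map)
# "Marie Curie" and "She" collapse into one node with two mentions
```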
Model inputs:
- Document (sequence of tokens)
- Knowledge graph (nodes and edges built from the OpenIE triples)
The document and the knowledge graph are encoded separately
Document encoder: RoBERTa -> bi-LSTM
Graph encoder:
- Nodes for subjects, predicates, and objects
- Subject -> predicate
- Predicate -> object
- Each node can cover multiple mentions of the same entity; its embedding is the mean of the mention embeddings.
- The graph goes through a GAT (graph attention network)
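A single-head graph attention layer can be sketched in plain NumPy (a simplified version of Velickovic et al.'s GAT, not the paper's implementation; shapes and the LeakyReLU slope are illustrative):

```python
import numpy as np

def gat_layer(H, A, W, a):
    """Single-head graph attention layer.
    H: (N, F) node features; A: (N, N) adjacency with self-loops;
    W: (F, Fp) projection; a: (2*Fp,) attention parameters."""
    Z = H @ W                                      # project node features
    N = Z.shape[0]
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            s = a @ np.concatenate([Z[i], Z[j]])   # a^T [z_i || z_j]
            e[i, j] = s if s > 0 else 0.2 * s      # LeakyReLU
    e = np.where(A > 0, e, -1e9)                   # attend only to neighbors
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return np.tanh(alpha @ Z)                      # updated node states

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                        # 3 nodes, 4 features
A = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])    # chain graph + self-loops
out = gat_layer(H, A, rng.normal(size=(4, 4)), rng.normal(size=8))
# out.shape == (3, 4)
```

Each node's new state is an attention-weighted mix of its neighbors' projected features, which is how local graph structure enters the encoding.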
Capturing topic shift:
- Encode each paragraph as a subgraph (same encoder)
- Connect all the subgraphs with a bi-LSTM (how?)
- Apply max-pooling over all nodes in the subgraph from the GAT output
- Use max-pooled results as input to the LSTM
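The pooling step above is simple to sketch: max-pool each paragraph's GAT node outputs into one vector, producing the sequence that the bi-LSTM over paragraphs consumes (dimensions here are made up for illustration):

```python
import numpy as np

def pool_subgraphs(subgraph_outputs):
    """subgraph_outputs: list of (n_nodes_i, d) arrays, one per paragraph.
    Returns (num_paragraphs, d): one max-pooled vector per subgraph."""
    return np.stack([g.max(axis=0) for g in subgraph_outputs])

rng = np.random.default_rng(0)
graphs = [rng.normal(size=(4, 8)), rng.normal(size=(6, 8))]  # 2 paragraphs
seq = pool_subgraphs(graphs)
# seq.shape == (2, 8); this sequence is the bi-LSTM input
```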
Single-layer LSTM decoder generates summary tokens while attending to both the graph and the document
Apply some attention mechanism to the graph?
SegGraph conveys a sense of topic shift between paragraphs
- Nodes are therefore weighted so that relevant nodes receive more attention
- (i.e. nodes in the same or relevant paragraphs)
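One way this node weighting could work at decoding time (an assumed formulation for illustration, not the paper's exact equations): score nodes against the decoder state, then boost the scores of nodes in the currently relevant paragraph before the softmax.

```python
import numpy as np

def graph_attention(dec_state, node_states, node_para, current_para, boost=2.0):
    """dec_state: (d,) decoder hidden state; node_states: (N, d);
    node_para: (N,) paragraph id of each node."""
    scores = node_states @ dec_state                       # dot-product scores
    scores = scores + boost * (node_para == current_para)  # favor current paragraph
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()                            # attention weights
    return alpha @ node_states, alpha                      # context vector, weights

rng = np.random.default_rng(1)
nodes = rng.normal(size=(5, 6))
dec = rng.normal(size=6)
ctx, alpha = graph_attention(dec, nodes,
                             node_para=np.array([0, 0, 1, 1, 1]),
                             current_para=1)
```

The `boost` term shifts attention mass toward nodes in the relevant paragraph, which matches the topic-shift intuition above.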
Maximum likelihood loss function
Node salience labeling:
- train network to prioritize nodes (subjects) that appear in summaries
- gold-standard: mask is 1 for a node if it is in the reference summary, zero otherwise
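Building the gold-standard salience mask can be sketched directly (naive substring matching here is a simplifying assumption; the actual matching criterion isn't specified in these notes):

```python
def salience_labels(node_mentions, reference_summary):
    """node_mentions: list of mention-string sets, one per node.
    Label is 1 if any mention of the node appears in the reference
    summary, 0 otherwise."""
    ref = reference_summary.lower()
    return [int(any(m.lower() in ref for m in mentions))
            for mentions in node_mentions]

labels = salience_labels(
    [{"Marie Curie", "She"}, {"polonium"}, {"the lab"}],
    "Marie Curie discovered polonium.")
# labels == [1, 1, 0]
```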
I'm still pretty new to reinforcement learning :(
ROUGE?
- Automatically generate questions from a human reference summary
- Train a QA model (RoBERTa) to answer questions by reading context
- concatenate context, question, and four candidate answers
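The multiple-choice input format might look like the following (an assumed layout: one concatenated string per candidate answer, each scored separately by the QA model; the `</s>` separator is RoBERTa's convention):

```python
def format_mcq(context, question, candidates, sep="</s>"):
    """Returns one model input string per candidate answer."""
    return [f"{context} {sep} {question} {sep} {cand}"
            for cand in candidates]

inputs = format_mcq("The model uses a graph encoder.",
                    "What does the model use?",
                    ["a graph encoder", "a CNN", "a parser", "a cache"])
# len(inputs) == 4; the QA model scores each and the argmax is its answer
```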
Acknowledge 3 kinds of errors:
- hallucination (made-up information)
- out of context (including information without useful context)
- deletion error (mistakenly deleting important subjects or clauses)
Could build on top of pre-trained encoder-decoder (like BART)
Existing metrics don't adequately capture the presence or absence of key error types; we need better metrics to identify them.