@srush
Created January 11, 2017 16:06
============================================================================
REVIEWER #1
============================================================================
---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
Appropriateness: 5
Clarity: 5
Originality: 3
Soundness / Correctness: 4
Impact of Ideas / Results: 4
Meaningful Comparison: 4
Substance: 4
Replicability: 4
Recommendation: 4
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
This paper conducts sentence-level abstractive summarization/rewriting with a
neural network approach. In principle the method can handle not just deletion
but also rewriting (using words not in the original sentences), while relying
only minimally on linguistic analysis. The paper is easy to follow and clearly
written. The advantages of the proposed models are supported by the
experimental results. I recommend it for publication.
I have a couple of suggestions. First (perhaps I missed something), would a
good neural-network-based deletion model be enough to capture the benefit seen
in this paper? The proposed approach can generate summaries with words not
seen in the original sentences, but how much benefit does that bring? (The
COMPRESS model discussed in Section 7.2 uses a very different approach, and
some gap remains.) From a summarization-evaluation viewpoint, ROUGE may not
reward "unseen" words as much as other metrics such as Pyramid do, so the
benefit of the proposed model may not be fully shown. While Section 5 tries to
compromise toward ROUGE, some discussion may help readers think about the
issue in another way.
The paper uses a trick (Section 5) to tune the model toward ROUGE. A little
more discussion here may help readers understand why directly tuning toward an
objective such as ROUGE is not feasible. Is it because ROUGE may not be
reliable on a small data set (like BLEU on individual sentences), because of
computational concerns, or for other reasons?
============================================================================
REVIEWER #2
============================================================================
---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
Appropriateness: 5
Clarity: 4
Originality: 3
Soundness / Correctness: 4
Impact of Ideas / Results: 4
Meaningful Comparison: 4
Substance: 4
Replicability: 3
Recommendation: 4
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
This paper addresses abstractive sentence summarization, specifically headline
generation using a neural language model. The work was evaluated on the DUC
2004 dataset. The paper is well written, and the work is interesting and has
been carefully evaluated. However, while the quantitative evaluation seems
reasonable, the actual summaries (from the examples) seem to have major
grammatical and repetition issues and do not look quite as good as the true
headlines. Having said that, the idea is promising, but it needs more work on
the soundness of the generated sentences.
A few questions/comments:
Why was the model trained using only the first line of the text? What is the
intuition for this? Could the last line, which summarizes the text, be used as
well? It would be nice if this were discussed in the paper.
In Section 7.2 the authors mention a capped ROUGE score, but they do not
explain how it is computed. Was this used in the DUC 2004 task? If so, please
state that; if not, please provide the exact formula for reproducibility.
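For concreteness, one plausible reading of "capped" (my guess, assuming the
75-byte headline limit used in DUC 2004; the paper does not confirm this) is
to truncate the candidate to the byte budget before computing recall:

    from collections import Counter

    def capped_rouge1_recall(candidate, reference, byte_cap=75):
        # Truncate the candidate to a fixed byte budget (the assumed
        # DUC 2004 cap of 75 bytes) before computing ROUGE-1 recall.
        truncated = candidate.encode("utf-8")[:byte_cap].decode("utf-8",
                                                                "ignore")
        cand = Counter(truncated.lower().split())
        ref = Counter(reference.lower().split())
        overlap = sum(min(cand[w], n) for w, n in ref.items())
        return overlap / max(sum(ref.values()), 1)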
It seems the authors tried to fit more content than the page limit allows, as
the bottom margin is completely off. Please fix this and make the writing more
concise.
============================================================================
REVIEWER #3
============================================================================
---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
Appropriateness: 4
Clarity: 3
Originality: 4
Soundness / Correctness: 4
Impact of Ideas / Results: 4
Meaningful Comparison: 3
Substance: 4
Replicability: 3
Recommendation: 4
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
This paper uses neural language models to generate sentence summaries word by
word, going beyond previous sentence-based extractive methods and phrase-based
abstractive approaches to sentence summarization. More specifically, their
Attention-Based Summarization (ABS) approach couples an attention-based
encoder with a beam-search decoder augmented with extractive features, which
can be seen as a tradeoff between abstractive and extractive methods.
For the encoder, they present four models step by step, of which two consider
only the input word information, while the other two also incorporate embedded
information about the current context. The latter two encoders, which jointly
learn embeddings for the input and a distribution conditioned on the current
context, are thus able to show an interpretable alignment between the summary
and the input sentence.
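As I understand it, the context-dependent encoding works roughly as in the
following sketch (my own simplification in numpy; the shapes, the single
context vector, and the smoothing window are assumptions, not the authors'
exact formulation):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def attention_encoder(x_emb, ctx_emb, P, Q=2):
        # x_emb: (M, d) input word embeddings; ctx_emb: (d,) embedding of
        # the current summary context; P: (d, d) learned interaction matrix.
        # p is the interpretable alignment over input positions noted above.
        p = softmax(x_emb @ P @ ctx_emb)                    # (M,)
        M = x_emb.shape[0]
        # Smooth each input embedding over a local window of half-width Q.
        x_bar = np.stack([x_emb[max(0, i - Q):i + Q + 1].mean(axis=0)
                          for i in range(M)])               # (M, d)
        return p @ x_bar  # context-weighted input representation, (d,)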
The authors conducted extensive experiments against several strong and
well-known baselines, achieving promising results. In particular, their tuned
model ABS+, which leverages extractive features for fluency, performs
significantly best on the tasks. While they describe how the weight vector
alpha is tuned, they do not report the actual values of alpha in the final
best-performing model. Those values would be useful for examining the
importance of the extractive features. I therefore have some mild reservations
about the analysis of how much the attention-based neural models themselves
contribute.
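If the tuned alpha were reported, the weight on each extractive feature could
be read off directly, e.g. as in this sketch (the feature names and all
numbers are illustrative placeholders, not the paper's actual values):

    import numpy as np

    # Log-linear score alpha . f(x, y) over hypothetical features.
    features = np.array([-12.3,   # neural model log-probability
                           5.0,   # unigram overlap with the input
                           2.0,   # bigram overlap
                           1.0])  # trigram overlap
    alpha = np.array([1.0, 0.5, 0.5, 0.5])  # placeholder weights
    print(alpha @ features)   # candidate score
    print(alpha * features)   # per-feature contributions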
I am just wondering how grammaticality can be ensured with the proposed
approach.