Dondon Jie dondon2475848

## 自動生成摘要-Automatic Text Summarization.md

      
Last active: February 14, 2018 01:50

Dataset

English


CNN/Daily Mail

2015 - Hermann et al. - Teaching machines to read and comprehend
2016 - Nallapati et al. - Abstractive text summarization using sequence-to-sequence RNNs and beyond
Nallapati et al. define the evaluation procedure; anyone using this dataset later can follow their work.
The dataset contains 287,113 training examples, 13,368 validation examples, and 11,490 test examples. After limiting the input length to 800 tokens and the output length to 100 tokens, the average input and output lengths are 632 and 53 tokens, respectively.
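
The length limits described above can be sketched as follows. This is only an illustration: whitespace tokenization and the names `truncate_pair` and `average_length` are assumptions, not the preprocessing used in the cited papers.

```python
def truncate_pair(article, summary, max_input_tokens=800, max_output_tokens=100):
    """Clip an (article, summary) pair to the given token limits.

    Assumption: tokens are obtained by splitting on whitespace, which
    may differ from the tokenizer used in the cited work.
    """
    article_tokens = article.split()[:max_input_tokens]
    summary_tokens = summary.split()[:max_output_tokens]
    return article_tokens, summary_tokens


def average_length(token_lists):
    """Mean number of tokens per example (0.0 for an empty collection)."""
    if not token_lists:
        return 0.0
    return sum(len(tokens) for tokens in token_lists) / len(token_lists)
```

Applying `truncate_pair` to every example and then calling `average_length` on the resulting inputs and outputs gives statistics of the kind quoted above.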


The New York Times dataset (NYT)


2008 - Evan Sandhaus - The New York Times Annotated Corpus.


## information_retrieval_evaluation_ndcg.py
"""
Code adapted from:
https://gist.github.com/bwhite/3726239
https://gist.github.com/gumption/b54278ec9bab2c0e0472816d1d7663be
Difference: adds a version that uses "sum (2^rel_i - 1) / log2(i + 1)"
Author: Jie Dondon
Version: ndcg_dondon_20180201_v2
"""

import numpy as np
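
The docstring mentions a DCG variant of the form sum (2^rel_i - 1) / log2(i + 1). A minimal sketch of that formula is given here; it is not the author's implementation. The names `dcg_at_k` and `ndcg_at_k` and the `exponential` flag are assumptions chosen for illustration, and the sketch uses only the standard `math` module so that it stands on its own.

```python
import math


def dcg_at_k(relevances, k, exponential=True):
    """Discounted cumulative gain over the first k relevance scores.

    With exponential=True the gain at rank i is 2^rel_i - 1;
    otherwise the raw relevance rel_i is used. Ranks are 1-based,
    so the discount at rank i is log2(i + 1).
    """
    total = 0.0
    for i, rel in enumerate(relevances[:k], start=1):
        gain = (2.0 ** rel - 1.0) if exponential else float(rel)
        total += gain / math.log2(i + 1)
    return total


def ndcg_at_k(relevances, k, exponential=True):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k, exponential)
    if ideal == 0.0:
        # No relevant items at all: define NDCG as 0 to avoid dividing by zero.
        return 0.0
    return dcg_at_k(relevances, k, exponential) / ideal
```

The exponential gain rewards highly relevant items more strongly than the linear gain, which matters when relevance is graded rather than binary.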