2020 Volume 27 Issue 2 Pages 169-188
In this paper, we derive Integer Linear Programming (ILP) formulations for obtaining extractive oracle summaries, which reveal the upper bound on the automatic scores achievable within the extractive summarization paradigm. We then manually evaluate the oracle summaries with the pyramid method and Quality Questions to assess their validity. We evaluate three kinds of extractive oracle summaries, based on sentence extraction, Elementary Discourse Unit (EDU) extraction, and subtree extraction, against ROUGE and Basic Elements (BE) on the Text Analysis Conference (TAC) 2009/2011 data sets. The results demonstrate that the pyramid scores and automatic scores of the oracle summaries are quite high, but that their linguistic quality is not as good. The results imply that informative summaries can be generated by extraction, but that the linguistic quality of such summaries still needs to be improved.
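To make the notion of an extractive oracle summary concrete, the following is a minimal sketch and not the ILP formulation derived in the paper. It swaps in exhaustive search over sentence subsets, which is only feasible for a handful of sentences, and it scores candidates with a simplified clipped unigram recall standing in for ROUGE. The function names `rouge1_recall` and `oracle_summary`, the whitespace tokenization, the single reference, and the word budget `max_words` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def rouge1_recall(summary_tokens, reference_tokens):
    """Clipped unigram recall of a summary against one reference (simplified)."""
    ref = Counter(reference_tokens)
    summ = Counter(summary_tokens)
    # Each reference unigram is credited at most as often as it occurs in the summary.
    overlap = sum(min(count, summ[tok]) for tok, count in ref.items())
    return overlap / max(sum(ref.values()), 1)


def oracle_summary(sentences, reference, max_words):
    """Exhaustively search sentence subsets that fit within a word budget.

    Illustrative stand-in for an ILP solver; returns (best_score, chosen_indices).
    """
    tokenized = [s.lower().split() for s in sentences]
    ref_tokens = reference.lower().split()
    best_score, best_subset = 0.0, ()
    for k in range(1, len(sentences) + 1):
        for subset in combinations(range(len(sentences)), k):
            tokens = [t for i in subset for t in tokenized[i]]
            if len(tokens) > max_words:
                continue  # violates the length constraint
            score = rouge1_recall(tokens, ref_tokens)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


if __name__ == "__main__":
    # Toy input, invented for illustration only.
    sents = ["the cat sat on the mat", "dogs bark loudly", "the cat ate fish"]
    ref = "the cat sat on the mat and ate fish"
    print(oracle_summary(sents, ref, max_words=10))
```

The score returned for the best subset is the upper bound that any extractive system could reach on that input under the same budget and metric. An ILP formulation reaches the same optimum without enumerating every subset.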