| Dataset | Domain | Task-setting | Paper |
| --- | --- | --- | --- |
| LCSTS | weibo | task-single | LCSTS: A Large Scale Chinese Short Text Summarization Dataset |
| XSum | news | task-single | Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization |
| RottenTomatoes | movie | task-multi | Neural Network-Based Abstract Generation for Opinions and Arguments |
| Reddit TIFU | social-media | task-single | Abstractive Summarization of Reddit Posts with Multi-level Memory Networks |
| BIGPATENT | patent | task-longtext, task-single | BIGPATENT: A Large-Scale Dataset for Abstractive and Coherent Summarization |
| CNNDM | news | task-single | Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond |
| Arxiv | scientific-paper | task-longtext, task-single | A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents |
| Pubmed | scientific-paper | task-longtext, task-single | A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents |
| Newsroom | news | task-single | Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies |
| BillSum | legislation | task-single | BillSum: A Corpus for Automatic Summarization of US Legislation |
| AESLC | email | task-single | This Email Could Save Your Life: Introducing the Task of Email Subject Line Generation |
| SAMSum | dialogue | task-single | SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization |
| Global Voices | multilingual | task-single, multilingual | Global Voices: Crossing Borders in Automatic News Summarization |
| WikiSum | wiki | task-multi | Generating Wikipedia by Summarizing Long Sequences |
| ScisummNet | scientific-paper | task-single | ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks |
| Multi-News | news | task-multi | Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model |
| Auto-hMDS | multilingual | task-multi | Auto-hMDS: Automatic Construction of a Large Heterogeneous Multilingual Multi-Document Summarization Corpus |
| WikiHow | wiki | task-single | WikiHow: A Large Scale Text Summarization Dataset |
| DisputeDiscussions | debate | task-single | Understanding and Detecting Supporting Arguments of Diverse Types |
| Debatepedia | debate | task-question | Diversity driven Attention Model for Query-based Abstractive Summarization |
| Funcom | code | task-code | Recommendations for Datasets for Source Code Summarization |
| TalkSumm | scientific-paper | task-multi | TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks |
| Multi-Aspect CNN/DM | news | task-aspect | Inducing Document Structure for Aspect-based Summarization |
| proto-summ | court-judgment | task-single | How to Write Summaries with Patterns? Learning towards Abstractive Summarization through Prototype Editing |
| PeerRead | peer-review | task-single | A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications |
| AMI | meeting | task-multimodal | The AMI meeting corpus: a pre-announcement |
| BookSum | book | task-longtext | Explorations in Automatic Book Summarization |
| MScript | movie-script | task-single | Movie Script Summarization as Graph-based Scene Extraction |
| Summarizing Opinions | product-review | task-opinion | Summarizing Opinions: Aspect Extraction Meets Sentiment Prediction and They Are Both Weakly Supervised |
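
Some entries in the catalog above carry more than one task tag (for example `task-longtext, task-single`), so looking up datasets by task means splitting that field first. The snippet below is a minimal sketch of one way to do that in Python. The `ROWS` list holds a handful of rows copied from the catalog, and the helper name `index_by_task` is hypothetical, introduced here only for illustration.

```python
from collections import defaultdict

# A few rows copied from the catalog above: (dataset, domain, task-setting).
# This is an illustrative subset, not the full list.
ROWS = [
    ("LCSTS", "weibo", "task-single"),
    ("BIGPATENT", "patent", "task-longtext, task-single"),
    ("Arxiv", "scientific-paper", "task-longtext, task-single"),
    ("Multi-News", "news", "task-multi"),
    ("Debatepedia", "debate", "task-question"),
    ("BookSum", "book", "task-longtext"),
]


def index_by_task(rows):
    """Map each task tag to the datasets that carry it (hypothetical helper)."""
    index = defaultdict(list)
    for dataset, _domain, task_setting in rows:
        # The task-setting field may hold several comma-separated tags.
        for tag in task_setting.split(","):
            index[tag.strip()].append(dataset)
    return dict(index)


by_task = index_by_task(ROWS)
print(by_task["task-longtext"])  # ['BIGPATENT', 'Arxiv', 'BookSum']
```

Indexing by tag rather than by the raw field keeps a dataset such as BIGPATENT visible under both `task-longtext` and `task-single`.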