1. The Changing Popularity of Summarization Research (Updating...)
Fig. 1 shows the influence of deep neural networks on the development of summarization. Below we briefly describe some key turning points:
Neural networks were brought into NLP in 2011 through a unified framework (not the first one, but the most complete). After that, from 2011 to 2013, related research mainly focused on the word representation learning problem and on how to speed up the training of language models. Deep neural networks were formally applied to mainstream NLP tasks around 2014. The tasks investigated early on were relatively simple, mainly text representation learning and classification, using models such as CNNs [2,3] and RNNs [4,5].
2014: Turning to text summarization, we found that few of the papers accepted at EMNLP/ACL 2014 (see below “Knowledge base of Summarization Papers”) were about neural networks. Why? At that time, it was still unclear which network structures could be used for sentence encoding. For example, researchers did not yet know how to use CNNs to model text well, let alone how to handle a more complex task such as text summarization.
2014->2015: in 2014, the NLP community saw some major events that still seem worth remembering:
a) The proposal of the seq2seq learning framework based on LSTM (at that time, no one believed that seq2seq or LSTM could work)
b) The attention mechanism
These ground-breaking networks were first applied to machine translation, and what happened next was easy to guess: with a hammer in hand, people went looking for nails. Once the models were ready, there was no reason not to try them on other tasks, so several papers at EMNLP 2015 used neural networks for summarization. Among them, Rush's work is the most widely remembered, since it was the first to wield the "attention" hammer (though, of course, it was not that easy). The same year saw another excellent work, which proposed a new dataset for neural abstractive summarization.
2015->2016: during this year, the development of the summarization field seemed to plateau. A possible reason is that there was no new technical support in terms of models, data, or training methods. Here, Chen et al.'s work played an important role in the development of the field by formulating extractive summarization as a neural sequence labeling problem.
2016->2017: this year can be seen as the eve of the boom in summarization research, which showed in three ways:
a) Training methods: reinforcement learning and adversarial learning began to take hold in NLP, making it possible to consider more loss functions for summarization tasks;
b) Datasets: the introduction of the CNN/DM dataset (with baselines that were not too strong) significantly promoted the development of this field;
c) Model structures: new mechanisms such as copy and coverage were proposed.
2017->2018: thanks to the groundwork laid in 2017 (model structures, datasets, and training methods), the summarization field entered a golden period this year. In particular, nearly 20 summarization papers were accepted at EMNLP 2018.
2019: Based on the current acceptances at NAACL and ACL, we can conjecture that this is a year of summarization: ACL has more than 20 related papers, and EMNLP can be expected to have no fewer. One important reason is the development of unsupervised pre-trained models (ELMo, BERT).
The dataset was organized by Rush for abstractive summarization and contains many spurious headline-article pairs. It has 3.8M training, 189k development, and 1,951 test samples.
A large-scale Chinese short-text summarization dataset constructed from the Chinese microblogging website Sina Weibo.
The dataset was reorganized by Nallapati and covers 286,817 training pairs, 13,368 validation pairs, and 11,487 test pairs.
It’s a summarization dataset of 1.3 million articles and summaries written by authors and editors in newsrooms of 38 major news publications.
The original Newsroom dataset is pre-processed and repurposed for cross-domain evaluation.
A real-world, large-scale dataset collected by harvesting online articles from the British Broadcasting Corporation. It does not favor extractive strategies and calls for an abstractive modeling approach.
Scientific papers, which are an example of long-form structured documents.
PubMed: scientific papers, which are an example of long-form structured documents.
Scientific papers, which are an example of long-form structured documents.
It is a movie review website that aggregates both professional critics' reviews and user-generated reviews.
The dataset suffers less from two common biases: that key sentences usually appear at the beginning of the text, and that favorable summary candidates already appear inside the text in similar forms.
It consists of 1.3 million records of U.S. patent documents along with human-written abstractive summaries.
NIST launched a new text summarization evaluation effort, called DUC.
The input is comprised of a Wikipedia topic (title of article) and a collection of non-Wikipedia reference documents, and the target is the Wikipedia article text.
ScisummNet (Yale University; AAAI19):[paper][data]
It contains 1,000 examples of papers, citation information, and human summaries, and is orders of magnitude larger than prior datasets.
Multi-News (Yale University; ACL2019):[paper][data]
A multi-document summarization dataset in the news domain.
The dataset provides an asynchronous (i.e., there is no given description for images and no subtitle for videos) collection of multi-modal information about a specific news topic, including multiple documents, images, and videos, from which a fixed-length textual summary is to be generated.
The dataset provides a testbed for Multimodal Summarization with Multimodal Output, constructed from the Daily Mail website.
The corpus consists of around 80,000 instructional videos (about 2,000 hours) with associated English subtitles and summaries.
New tasks for Summarization
It is a Wikipedia-style website for gathering pro and con arguments on controversial issues.
The dataset is created from Debatepedia, an encyclopedia of pro and con arguments and quotes on critical debate topics.
It is a collection of ~2.1 million Java methods and their associated Javadoc comments.
It contains 1,716 summaries of papers from several computer science conferences, based on videos of the talks.
Multi-Aspect CNN/DM (ACL2019):[paper][data]
An aspect-based summarization dataset.
3. Typical Research Problems
We have summarized and classified the research problems addressed in more than 100 papers from the last six years, and share them here in the hope that beginners can grasp the core of this field more quickly.
a) The choices of encoder
b) The choices of decoder
c) How to model the semantic relationship between sentences
b) How to improve the abstractiveness of existing systems
a) How to define a valid scoring function and filter different semantic units based on it
The number of source documents
New Datasets and Evaluations
Definition of new tasks
Summarization of programming languages
4. Active Research Groups
The Language Computing and Web Mining Group led by Xiaojun Wan