This repository contains a list of papers, code, and datasets for biomedical text summarisation based on PLMs. If you find any errors, please don't hesitate to open an issue or a pull request.
Resource contributed by Qianqian Xie, Zheheng Luo, Benyou Wang, and Sophia Ananiadou.
Biomedical text summarization has long been a fundamental task in biomedical natural language processing (BioNLP), aiming to generate concise summaries that distil key information from one or more biomedical documents. In recent years, pre-trained language models (PLMs) have become the de facto standard for various natural language processing tasks in the general domain. Most recently, PLMs have been further investigated in the biomedical domain, bringing new insights to the biomedical text summarization task.
To help researchers quickly grasp developments in this task and inspire further research, we organize the available datasets, recent approaches, and evaluation methods in this project.
The project is fully open source and includes:
- BioTS dataset table: we list the datasets in the BioTS field; for each one, the table gives its category, size, content, and access link.
- PLM-based BioTS methods: we classify and arrange papers by the type of output summary and the number and type of input documents, covering the current mainstream frontiers. Each row of the table gives the category, the strategy for applying the PLM, the backbone model, the training type, and the datasets used.
- BioTS evaluation: we list metrics that cover three essential aspects of evaluating biomedical text summarization: 1) relevancy, 2) fluency, and 3) factuality.
The organization of our survey and the detailed background of biomedical text summarization are illustrated in the pictures below.
- Text summarization in the biomedical domain: A systematic review of recent research. J. Biomedical Informatics, 2014. [html]
- Summarization from medical documents: a survey. Artif. Intelligence Medicine, 2005. [html]
- Automated methods for the summarization of electronic health records. J Am Med Inform Assoc, 2015. [html]
- A systematic review of automatic text summarization for biomedical literature and EHRs. J Am Med Inform Assoc, 2021. [html]
Name | Category | Size | Content | Multi/Single Sum (M/S) | Access |
---|---|---|---|---|---|
PubMed | Biomedical literature | 133,215 | Full contents of articles | Single | https://github.com/armancohan/long-summarization |
RCT | Biomedical literature | 4,528 | Titles and abstracts of articles | Multiple | https://github.com/bwallace/RCT-summarization-data |
MSˆ2 | Biomedical literature | 470,402 | Abstracts of articles | Multiple | https://github.com/allenai/ms2/ |
CDSR | Biomedical literature | 7,805 | Abstracts of articles | Single | https://github.com/qiuweipku/Plain_language_summarization |
SumPubMed | Biomedical literature | 33,772 | Full contents of articles | Single | https://github.com/vgupta123/sumpubmed |
S2ORC | Biomedical literature | 63,709 | Full contents of articles | Single | https://github.com/jbshp/GenCompareSum |
CORD-19 | Biomedical literature | - (constantly increasing) | Full contents of articles | Single | https://github.com/allenai/cord19 |
MIMIC-CXR | EHR | 227,835 | Full contents of reports | Single | https://physionet.org/content/mimic-cxr/2.0.0/ |
OpenI | EHR | 3,599 | Full contents of reports | Single | https://openi.nlm.nih.gov/faq#collection |
MeQSum | Medical question summarization | 1,000 | Full contents of questions | Single | https://github.com/abachaa/MeQSum/ |
CHQ-Summ | Medical question summarization | 1,507 | Full contents of questions | Single | https://github.com/shwetanlp/Yahoo-CHQ-Summ |
Paper | Category | Strategy | Model | Training | Dataset |
---|---|---|---|---|---|
ContinualBERT [code] | extractive | fine-tuning | BERT | supervised | PubMed, CORD-19 |
BioBERTSum | extractive | fine-tuning | BioBERT | supervised | PubMed |
KeBioSum [code] | extractive | adaptation+fine-tuning | PubMedBERT | supervised | PubMed, CORD-19, S2ORC |
N. Kanwal and G. Rizzo [code] | extractive | fine-tuning | BERT | unsupervised | MIMIC-III |
M. Moradi et al. [code] | extractive | feature-based | BERT | unsupervised | PubMed |
M. Moradi et al. [code] | extractive | feature-based | BioBERT | unsupervised | PubMed |
GenCompareSum [code] | extractive | feature-based | T5 | unsupervised | PubMed, CORD-19, S2ORC |
RadBERT | extractive | feature-based | RadBERT | unsupervised | - |
B. Tan et al. [code] | hybrid | adaptation+fine-tuning | BERT, GPT-2 | supervised | CORD-19 |
S. S. Gharebagh et al. | abstractive | feature-based | BERT | supervised | MIMIC-CXR |
Y. Guo et al. [code] | hybrid | adaptation+fine-tuning | BERT, BART | supervised | CDSR |
L. Xu et al. | abstractive, question | adaptation+fine-tuning | BART, PEGASUS | supervised | MIMIC-CXR, OpenI, MeQSum |
W. Zhu et al. | abstractive | fine-tuning | BART, T5, PEGASUS | supervised | MIMIC-CXR, OpenI |
R. Kondadadi et al. | abstractive | fine-tuning | BART, T5, PEGASUS | supervised | MIMIC-CXR, OpenI |
S. Dai et al. | abstractive | adaptation+fine-tuning | PEGASUS | supervised | MIMIC-CXR, OpenI |
D. Mahajan et al. | abstractive | adaptation+fine-tuning | BioRoBERTa | supervised | MIMIC-CXR, OpenI |
H. Jingpeng et al. [code] | abstractive | fine-tuning | BioBERT | supervised | MIMIC-CXR, OpenI |
X. Cai et al. | abstractive | fine-tuning | SciBERT | supervised | CORD-19 |
A. Yalunin et al. | abstractive | adaptation+fine-tuning | BERT, Longformer | supervised | - |
B. C. Wallace et al. [code] | abstractive, multi-doc | adaptation+fine-tuning | BART | supervised | RCT |
J. DeYoung et al. [code] | abstractive, multi-doc | fine-tuning | BART, Longformer | supervised | MSˆ2 |
A. Esteva et al. | abstractive, multi-doc | fine-tuning | BERT, GPT-2 | supervised | CORD-19 |
CAiRE-COVID [code] | hybrid, multi-doc | fine-tuning, feature-based | ALBERT, BART | un-/supervised | CORD-19 |
HET [code] | extractive, dialogue | fine-tuning | BERT | supervised | HET-MC |
CLUSTER2SENT [code] | abstractive, dialogue | fine-tuning | BERT, T5 | supervised | - |
L. Zhang et al. [code] | abstractive, dialogue | fine-tuning | BART | supervised | - |
B. Chintagunt et al. | abstractive, dialogue | fine-tuning | GPT-3 | supervised | - |
D. F. Navarro et al. | abstractive, dialogue | fine-tuning | BART, T5, PEGASUS | supervised | - |
BioBART [code] | abstractive, dialogue | fine-tuning | BioBART | supervised | - |
Y. He et al. | abstractive, question | fine-tuning | BART, T5, PEGASUS | supervised | MeQSum, MIMIC-CXR, OpenI |
S. Yadav et al. | abstractive, question | fine-tuning | BERT, ProphetNet | supervised | MeQSum |
S. Yadav et al. | abstractive, question | adaptation+fine-tuning | MiniLM | supervised | MeQSum |
K. Mrini et al. [code] | abstractive, question | adaptation+fine-tuning | BART, BioBERT | supervised | MeQSum |
- ROUGE-N: n-gram overlap between system-generated summaries and gold summaries (measures relevancy)
- ROUGE-L: longest common subsequence between system-generated summaries and gold summaries (measures fluency)
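In practice these scores are computed with packages such as `rouge-score`, but the definitions are simple enough to sketch directly. The recall-oriented helpers below are illustrative (the function names are ours), not the official scorer:

```python
from collections import Counter


def rouge_n_recall(candidate: str, reference: str, n: int = 2) -> float:
    """ROUGE-N recall: fraction of reference n-grams that appear in the candidate."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate.split()), ngrams(reference.split())
    return sum((cand & ref).values()) / max(sum(ref.values()), 1)


def rouge_l_recall(candidate: str, reference: str) -> float:
    """ROUGE-L recall: length of the longest common token subsequence,
    divided by the reference length."""
    c, r = candidate.split(), reference.split()
    # Classic dynamic-programming LCS table.
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i, ct in enumerate(c):
        for j, rt in enumerate(r):
            dp[i + 1][j + 1] = dp[i][j] + 1 if ct == rt else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1] / max(len(r), 1)
```

`rouge_n_recall(c, r, n=1)` corresponds to ROUGE-1 recall; note that published results usually report the F1 variant with stemming, so numbers from this sketch will not match leaderboard scores exactly.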
Automatic:
- CheXbert: checks the binary presence values of disease variables
- Jensen-Shannon Distance: checks the directions of change (increase, decrease, no change)
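For illustration, the Jensen-Shannon distance itself (the square root of the Jensen-Shannon divergence, with base-2 logs so it lies in [0, 1]) can be sketched as below; comparing the change-direction label distributions of the source and the summary is an assumed usage, and the helper is ours rather than from any cited paper:

```python
import math


def js_distance(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon distance between two discrete distributions
    over the same support (base-2 logs, so the result is in [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute nothing.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return math.sqrt((kl(p, m) + kl(q, m)) / 2)
```

A lower distance between the two label distributions indicates a summary that is more factually consistent with its source.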
Human-involved:
Model | ROUGE-1 | ROUGE-2 | ROUGE-L | Paper | Code Link | Source |
---|---|---|---|---|---|---|
Top-Down Transformer (AdaPool) | 51.05 | 23.26 | 46.47 | Long Document Summarization with Top-down and Bottom-up Inference (https://arxiv.org/pdf/2203.07586v1.pdf) | https://github.com/kangbrilliant/DCA-Net | arXiv |
LongT5 | 50.23 | 24.76 | 46.67 | LongT5: Efficient Text-To-Text Transformer for Long Sequences (https://arxiv.org/pdf/2112.07916v2.pdf) | https://github.com/google-research/longt5 | NAACL |
MemSum (extractive) | 49.25 | 22.94 | 44.42 | MemSum: Extractive Summarization of Long Documents Using Multi-Step Episodic Markov Decision Processes (https://arxiv.org/pdf/2107.08929v2.pdf) | https://github.com/nianlonggu/memsum | ACL |
HAT-BART | 48.25 | 21.35 | 36.69 | Hierarchical Learning for Generation with Long Source Sequences (https://arxiv.org/pdf/2104.07545v2.pdf) | - | arXiv |
DeepPyramidion | 47.81 | 21.14 | 46.47 | Sparsifying Transformer Models with Trainable Representation Pooling (https://aclanthology.org/2022.acl-long.590.pdf) | https://github.com/applicaai/pyramidions | ACL |
HiStruct+ | 46.59 | 20.39 | 42.11 | HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information (https://aclanthology.org/2022.findings-acl.102.pdf) | - | ACL |
DANCER PEGASUS | 46.34 | 19.97 | 42.42 | A Divide-and-Conquer Approach to the Summarization of Long Documents (https://arxiv.org/pdf/2004.06190v3.pdf) | https://github.com/AlexGidiotis/DANCER-summ | IEEE/ACM Trans. Audio, Speech and Language Processing |
BigBird-Pegasus | 46.32 | 20.65 | 42.33 | Big Bird: Transformers for Longer Sequences (https://arxiv.org/pdf/2007.14062v2.pdf) | https://github.com/google-research/bigbird | NeurIPS |
ExtSum-LG+MMR-Select+ | 45.39 | 20.37 | 40.99 | Systematically Exploring Redundancy Reduction in Summarizing Long Documents (https://arxiv.org/pdf/2012.00052v1.pdf) | https://github.com/Wendy-Xiao/redundancy_reduction_longdoc | AACL |
ExtSum-LG+RdLoss | 45.30 | 20.42 | 40.95 | Systematically Exploring Redundancy Reduction in Summarizing Long Documents (https://arxiv.org/pdf/2012.00052v1.pdf) | https://github.com/Wendy-Xiao/redundancy_reduction_longdoc | AACL |
Model | ROUGE-1 | ROUGE-2 | ROUGE-L | Paper | Code Link | Source |
---|---|---|---|---|---|---|
DAMEN | 28.95 | 9.72 | 21.83 | Discriminative Marginalized Probabilistic Neural Method for Multi-Document Summarization of Medical Literature (https://aclanthology.org/2022.acl-long.15.pdf) | https://disi-unibo-nlp.github.io/projects/damen/ | ACL |
BART Hierarchical | 27.56 | 9.40 | 20.80 | MSˆ2: A Dataset for Multi-Document Summarization of Medical Studies (https://aclanthology.org/2021.emnlp-main.594.pdf) | https://github.com/allenai/ms2/ | EMNLP |
LED Flat | 26.89 | 8.91 | 20.32 | MSˆ2: A Dataset for Multi-Document Summarization of Medical Studies (https://aclanthology.org/2021.emnlp-main.594.pdf) | https://github.com/allenai/ms2/ | EMNLP |
Model | ROUGE-1 | ROUGE-2 | ROUGE-L | Paper | Code Link | Source |
---|---|---|---|---|---|---|
ClinicalBioBERTSumAbs | 58.97 | 47.06 | 57.37 | Predicting Doctor's Impression For Radiology Reports with Abstractive Text Summarization (https://web.stanford.edu/class/cs224n/reports/final_reports/report005.pdf) | - | Stanford CS224N |
Attend to Medical Ontologies | 53.57 | 40.78 | 51.81 | Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization (https://aclanthology.org/2020.acl-main.172.pdf) | - | ACL |
Model | ROUGE-1 | ROUGE-2 | ROUGE-L | Paper | Code Link | Source |
---|---|---|---|---|---|---|
Gradually Soft MTL + Data Aug | 54.5 | 37.9 | 50.2 | A Gradually Soft Multi-Task and Data-Augmented Approach to Medical Question Understanding (https://aclanthology.org/2021.acl-long.119.pdf) | https://github.com/KhalilMrini/Medical-Question-Understanding | ACL |
Explicit QTA Knowledge-Infusion | 45.20 | 28.38 | 48.76 | Question-aware Transformer Models for Consumer Health Question Summarization (https://arxiv.org/pdf/2106.00219.pdf) | - | J. Biomed. Informatics |
ProphetNet + QTR + QFR | 45.52 | 27.54 | 48.19 | Reinforcement Learning for Abstractive Question Summarization with Question-aware Semantic Rewards (https://aclanthology.org/2021.acl-short.33.pdf) | https://github.com/shwetanlp/CHQ-Summ | ACL |
ProphetNet + QFR | 45.36 | 27.33 | 47.96 | Reinforcement Learning for Abstractive Question Summarization with Question-aware Semantic Rewards (https://aclanthology.org/2021.acl-short.33.pdf) | https://github.com/shwetanlp/CHQ-Summ | ACL |
Multi-Cloze Fusion | 44.58 | 27.02 | 47.81 | Question-aware Transformer Models for Consumer Health Question Summarization (https://arxiv.org/pdf/2106.00219.pdf) | - | J. Biomed. Informatics |
ProphetNet + QTR | 44.60 | 26.69 | 47.38 | Reinforcement Learning for Abstractive Question Summarization with Question-aware Semantic Rewards (https://aclanthology.org/2021.acl-short.33.pdf) | https://github.com/shwetanlp/CHQ-Summ | ACL |
Implicit QTA Knowledge-Infusion | 44.44 | 26.98 | 47.66 | Question-aware Transformer Models for Consumer Health Question Summarization (https://arxiv.org/pdf/2106.00219.pdf) | - | J. Biomed. Informatics |
MiniLM | 43.13 | 26.03 | 46.39 | MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers (https://arxiv.org/pdf/2002.10957.pdf) | https://github.com/microsoft/unilm/tree/master/minilm | NeurIPS |