amazon-science/adasum

Few-shot Fine-tuning for Opinion Summarization

This repository contains the main codebase for the corresponding NAACL Findings paper. In this work, we explored storing in-domain information in adapters by pre-training them on customer reviews with the leave-one-out objective. The pre-trained adapters are then fine-tuned on a handful of summaries. This method yields state-of-the-art ROUGE scores and reduces semantic mistakes in generated summaries.
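As an illustration only (not code from this repository), the data construction behind a leave-one-out objective can be sketched as follows: each review of a product is, in turn, held out as a pseudo-summary target that the model learns to generate from the remaining reviews. The function name `leave_one_out_pairs` is hypothetical.

```python
def leave_one_out_pairs(reviews):
    """Yield (source_reviews, target_review) training pairs for one product.

    Each review is held out once as the pseudo-summary target, with the
    remaining reviews serving as the source the model conditions on.
    """
    pairs = []
    for i, target in enumerate(reviews):
        sources = reviews[:i] + reviews[i + 1:]
        pairs.append((sources, target))
    return pairs
```

A product with N reviews thus yields N pseudo-summarization pairs, which is what lets the adapters be pre-trained without any human-written summaries.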

1. Conda environment

In this project, we use conda to manage the environment. To re-create it, run the command below.

conda env create --file environment.yml

Then, activate it:

conda activate adasum

2. FAIRSEQ

The codebase relies on FAIRSEQ, which can be downloaded and installed in a parent folder as follows.

git clone https://github.com/pytorch/fairseq.git
mv fairseq fairseq_lib
cd fairseq_lib

git reset --hard 81046fc
pip install --editable ./

Please make sure you use the correct commit to avoid incompatibility issues. Also, set the following environment variable.

export MKL_THREADING_LAYER=GNU
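If you prefer to set this from Python rather than the shell, a minimal equivalent (an assumption, not part of the repository's scripts) is to export the variable before importing any MKL-backed library such as NumPy or PyTorch:

```python
import os

# Set MKL's threading layer before NumPy/PyTorch are imported;
# setdefault keeps any value the user has already exported.
os.environ.setdefault("MKL_THREADING_LAYER", "GNU")
```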

3. Folder structure

The main codebase is stored in adasum.

  • artifacts: checkpoints and model-generated summaries (checkpoints need to be downloaded separately);
  • data: contains pre-training and fine-tuning datasets (see the preprocessing folder for instructions on how to obtain the data);
  • adasum: FAIRSEQ files for the adasum and adaqsum models;
  • preprocessing: scripts for data pre-processing;
  • shared: files shared between adasum and the preprocessing scripts.

4. Citation

@inproceedings{brazinskas-etal-2022-efficient,
    title = "Efficient Few-Shot Fine-Tuning for Opinion Summarization",
    author = "Brazinskas, Arthur  and
      Nallapati, Ramesh  and
      Bansal, Mohit  and
      Dreyer, Markus",
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-naacl.113",
    pages = "1509--1523"
}

5. Security

See CONTRIBUTING for more information.

6. License

This project is licensed under the CC-BY-NC-4.0 License.
