This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

This repository contains the code to reconstruct the training dataset from NLP/ML Papers in PDF format together with their corresponding slides.

IBM/document2slides

document2slides

This code accompanies the NAACL 2021 paper "D2S: Document-to-Slide Generation Via Query-Based Text Summarization".

This repository contains:

  1. sciduet-build: code to reconstruct the training dataset from NLP/ML papers in PDF format together with their corresponding slides
  2. SciDuet-ACL: the preprocessed ACL training data
  3. Derivability annotations together with the trained classifier
  4. d2s-model: code to train and evaluate the automatic slide generation system

Edward Sun, Yufang Hou, Dakuo Wang, Yunfeng Zhang, Nancy X.R. Wang. D2S: Document-to-Slide Generation Via Query-Based Text Summarization. In Proceedings of the 18th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), Online, 6-11 June 2021.
