This project is a reference implementation of Grid Beam Search (GBS) as presented in Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search.
We provide two sample implementations of translation models -- one using our framework for Neural Interactive Machine Translation, and another for models trained with Nematus.
NMT models trained with Nematus work out of the box. This project can also be used as a general-purpose ensembled decoder for Nematus models, with or without constraints.
```
git clone https://github.com/chrishokamp/constrained_decoding.git
cd constrained_decoding
pip install -e .
```
If you use code or ideas from this project, please cite:
```
@misc{1704.07138,
  Author = {Chris Hokamp and Qun Liu},
  Title = {Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search},
  Year = {2017},
  Eprint = {arXiv:1704.07138},
}
```
For the PRIMT and Domain Adaptation experiments, the lexical constraints are stored in `*.json` files. The format is `[[[c1_t1, c1_t2], [c2_t1, c2_t2, c2_t3], ...], ...]`: each constraint is a list of tokens, and each segment has a list of constraints. The length of the outer list in the `*.json` file should be the same as the number of segments in the source data. If a segment has no constraints, its entry should be an empty list.
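For instance, the following sketch writes a constraints file for a hypothetical three-segment source file (the tokens here are made up for illustration):

```python
import json

# Hypothetical constraints for a source file with 3 segments.
constraints = [
    [["New", "York"], ["beautiful", "city"]],  # segment 1: two constraints
    [["Grid", "Beam", "Search"]],              # segment 2: one constraint
    [],                                        # segment 3: no constraints
]

with open("constraints.json", "w") as f:
    json.dump(constraints, f)

# The outer list must have one (possibly empty) entry per source segment.
num_source_segments = 3
assert len(constraints) == num_source_segments
```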
The current implementation is pretty slow, and it gets slower as you add more constraints 😞. The GBS algorithm can be parallelized easily, because each cell in a column is independent of the others (see the paper). However, implementing this would require us to make some assumptions about the underlying model, and would thus limit the generality of the code base. If you have ideas about how to make things faster, please create an issue.
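To illustrate the parallelism mentioned above, here is a minimal, schematic sketch (not the code in this repo) of expanding one column of the GBS grid concurrently. `expand_cell` is a hypothetical, model-dependent hook; the point is only that every cell in column `t` reads from column `t - 1` and nothing else, so the cells can be computed in parallel:

```python
from concurrent.futures import ThreadPoolExecutor

def expand_cell(prev_open, prev_closed, c, t):
    """Hypothetical model-dependent hook: produce the beam for grid
    cell (t, c) by continuing hypotheses from cell (t-1, c) and
    starting/continuing constraints from cell (t-1, c-1)."""
    # ... score continuations with the model, apply constraints ...
    return []  # placeholder beam

def expand_column_parallel(grid, t, num_constraint_tokens, executor):
    # Every cell in column t depends only on column t-1, so all
    # cells can be submitted to the executor at once.
    futures = {
        c: executor.submit(
            expand_cell,
            grid.get((t - 1, c), []),      # continue unconstrained
            grid.get((t - 1, c - 1), []),  # extend a constraint
            c, t,
        )
        for c in range(num_constraint_tokens + 1)
    }
    for c, fut in futures.items():
        grid[(t, c)] = fut.result()
```

In practice the per-cell work is a batched forward pass through the model, which is why an efficient version has to make assumptions about the model's batching interface.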
Ensembling and weighted decoding for Nematus models