Skip to content

Commit

Permalink
add outline
Browse files Browse the repository at this point in the history
  • Loading branch information
jiesutd committed Jan 8, 2019
1 parent 720bb6b commit 55bb550
Showing 1 changed file with 13 additions and 12 deletions.
25 changes: 13 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@
* [Update](#Update)

## Introduction
-----------

Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. State-of-the-art sequence labeling models mostly utilize the CRF structure with input word features. LSTM (or bidirectional LSTM) is a popular deep learning based feature extractor in sequence labeling task. And CNN can also be used due to faster computation. Besides, features within word are also useful to represent word, which can be captured by character LSTM or character CNN structure or human-defined neural features.

NCRF++ is a PyTorch based framework with flexiable choices of input features and output structures. The design of neural sequence labeling models with NCRF++ is fully configurable through a configuration file, which does not require any code work. NCRF++ is a neural version of [CRF++](http://taku910.github.io/crfpp/), which is a famous statistical CRF framework.
Expand All @@ -35,15 +35,15 @@ NCRF++ supports diffent structure combinations of on three levels: character seq
Welcome to star this repository!

## Requirement
-----------

Python: 2 or 3
PyTorch: 1.0

[PyTorch 0.3 compatible version is here.](https://github.com/jiesutd/NCRFpp/tree/PyTorch0.3)


## Advantages
-----------

* Fully configurable: all the neural model structures can be set with a configuration file.
* State-of-the-art system performance: models build on NCRF++ can give comparable or better results compared with state-of-the-art models.
* Flexible with features: user can define their own features and pretrained feature embeddings.
Expand All @@ -52,7 +52,7 @@ Welcome to star this repository!


## Usage
-----------

NCRF++ supports designing the neural network structure through a configuration file. The program can run in two status; ***training*** and ***decoding***. (sample configuration and data have been included in this repository)

In ***training*** status:
Expand All @@ -71,7 +71,7 @@ NCRF++ is designed in three layers (shown below): character sequence layer; word


## Data Format
------------

* You can refer the data format in [sample_data](sample_data).
* NCRF++ supports both BIO and BIOES(BMES) tag scheme.
* Notice that IOB format (***different*** from BIO) is currently not supported, because this tag scheme is old and works worse than other schemes [Reimers and Gurevych, 2017](https://arxiv.org/pdf/1707.06799.pdf).
Expand All @@ -80,7 +80,7 @@ NCRF++ is designed in three layers (shown below): character sequence layer; word


## Performance
-------------

Results on CONLL 2003 English NER task are better or comparable with SOTA results with the same structures.

CharLSTM+WordLSTM+CRF: 91.20 vs 90.94 of [Lample .etc, NAACL16](http://www.aclweb.org/anthology/N/N16/N16-1030.pdf);
Expand All @@ -100,7 +100,7 @@ We have compared twelve neural sequence labeling models (`{charLSTM, charCNN, No


## Add Handcrafted Features
--------------

NCRF++ has integrated several SOTA neural characrter sequence feature extractors: CNN ([Ma .etc, ACL16](http://www.aclweb.org/anthology/P/P16/P16-1101.pdf)), LSTM ([Lample .etc, NAACL16](http://www.aclweb.org/anthology/N/N16/N16-1030.pdf)) and GRU ([Yang .etc, ICLR17](https://arxiv.org/pdf/1703.06345.pdf)). In addition, handcrafted features have been proven important in sequence labeling tasks. NCRF++ allows users designing their own features such as Capitalization, POS tag or any other features (grey circles in above figure). Users can configure the self-defined features through configuration file (feature embedding size, pretrained feature embeddings .etc). The sample input data format is given at [train.cappos.bmes](sample_data/train.cappos.bmes), which includes two human-defined features `[POS]` and `[Cap]`. (`[POS]` and `[Cap]` are two examples, you can give your feature any name you want, just follow the format `[xx]` and configure the feature with the same name in configuration file.)
User can configure each feature in configuration file by using

Expand All @@ -113,14 +113,14 @@ Feature without pretrained embedding will be randomly initialized.


## Speed
--------------

NCRF++ is implemented using fully batched calculation, making it quite effcient on both model training and decoding. With the help of GPU (Nvidia GTX 1080) and large batch size, LSTMCRF model built with NCRF++ can reach 1000 sents/s and 2000sents/s on training and decoding status, respectively.

![alt text](readme/speed.png "System speed on NER data")


## N best Decoding
----------------

Traditional CRF structure decodes only one label sequence with largest probabolities (i.e. 1-best output). While NCRF++ can give a large choice, it can decode `n` label sequences with the top `n` probabilities (i.e. n-best output). The nbest decodeing has been supported by several popular **statistical** CRF framework. However to the best of our knowledge, NCRF++ is the only and the first toolkit which support nbest decoding in **neural** CRF models.

In our implementation, when the nbest=10, CharCNN+WordLSTM+CRF model built in NCRF++ can give 97.47% oracle F1-value (F1 = 91.35% when nbest=1) on CoNLL 2003 NER task.
Expand All @@ -129,22 +129,22 @@ In our implementation, when the nbest=10, CharCNN+WordLSTM+CRF model built in NC


## Reproduce Paper Results and Hyperparameter Tuning
-------------------

To reproduce the results in our COLING 2018 paper, you only need to set the `iteration=1` as `iteration=100` in configuration file `demo.train.config` and configure your file directory in this configuration file. The default configuration file describes the `Char CNN + Word LSTM + CRF` model, you can build your own model by modifing the configuration accordingly. The parameters in this demo configuration file are the same in our paper. (Notice the `Word CNN` related models need slightly different parameters, details can be found in our COLING paper.)

If you want to use this framework in new tasks or datasets, here are some tuning [tips](readme/hyperparameter_tuning.md) by @Victor0118.


## Report Issue or Problem
------------------

If you want to report an issue or ask a problem, please attach the following materials if necessary. With these information, I can give fast and accurate discussion and suggestion.
* `log file`
* `config file`
* `sample data`


## Cite
-----------------

If you use NCRF++ in your paper, please cite our [ACL demo paper](https://arxiv.org/abs/1806.05626):

@inproceedings{yang2018ncrf,
Expand All @@ -168,6 +168,7 @@ If you use experiments results and analysis of NCRF++, please cite our [COLING p


## Update

* 2018-Dec-17, NCRF++ v0.2, support PyTorch 1.0
* 2018-Mar-30, NCRF++ v0.1, initial version
* 2018-Jan-06, add result comparison.
Expand Down

0 comments on commit 55bb550

Please sign in to comment.