# Converted remaining YAML files to Markdown tables #131

Merged on Oct 25, 2018 (1 commit)
`README.md`: 74 additions, 92 deletions

### English

- [Automatic speech recognition](english/automatic_speech_recognition.md)
- [CCG supertagging](english/ccg_supertagging.md)
- [Chunking](english/chunking.md)
- [Constituency parsing](english/constituency_parsing.md)
- [Coreference resolution](english/coreference_resolution.md)
- [Dependency parsing](english/dependency_parsing.md)
- [Dialogue](english/dialogue.md)
- [Domain adaptation](english/domain_adaptation.md)
- [Entity linking](english/entity_linking.md)
- [Grammatical error correction](english/grammatical_error_correction.md)
- [Information extraction](english/information_extraction.md)
- [Language modeling](english/language_modeling.md)
- [Lexical normalization](english/lexical_normalization.md)
- [Machine translation](english/machine_translation.md)
- [Multi-task learning](english/multi-task_learning.md)
- [Multi-modal](english/multimodal.md)
- [Named entity recognition](english/named_entity_recognition.md)
- [Natural language inference](english/natural_language_inference.md)
- [Part-of-speech tagging](english/part-of-speech_tagging.md)
- [Question answering](english/question_answering.md)
- [Relation prediction](english/relation_prediction.md)
- [Relationship extraction](english/relationship_extraction.md)
- [Semantic textual similarity](english/semantic_textual_similarity.md)
- [Semantic parsing](english/semantic_parsing.md)
- [Semantic role labeling](english/semantic_role_labeling.md)
- [Sentiment analysis](english/sentiment_analysis.md)
- [Stance detection](english/stance_detection.md)
- [Summarization](english/summarization.md)
- [Taxonomy learning](english/taxonomy_learning.md)
- [Temporal processing](english/temporal_processing.md)
- [Text classification](english/text_classification.md)
- [Word sense disambiguation](english/word_sense_disambiguation.md)

### Korean

- [Chunking](korean/korean.md)
- [Part-of-speech tagging](korean/korean.md)

### Chinese

- [Entity linking](chinese/chinese.md#entity-linking)

### Hindi

- [Chunking](hindi/hindi.md#chunking)
- [Part-of-speech tagging](hindi/hindi.md#part-of-speech-tagging)
- [Machine translation](hindi/hindi.md#machine-translation)

### Vietnamese

- [Dependency parsing](vietnamese/vietnamese.md#dependency-parsing)
- [Machine translation](vietnamese/vietnamese.md#machine-translation)
- [Named entity recognition](vietnamese/vietnamese.md#named-entity-recognition)
- [Part-of-speech tagging](vietnamese/vietnamese.md#part-of-speech-tagging)
- [Word segmentation](vietnamese/vietnamese.md#word-segmentation)

This document aims to track the progress in Natural Language Processing (NLP) and give an overview
of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.
[…] the reader will be pointed there.
If you want to find this document again in the future, just go to [`nlpprogress.com`](https://nlpprogress.com/)
or [`nlpsota.com`](http://nlpsota.com/) in your browser.

### Contributing

If you would like to add a new result, you can do so with a pull request (PR). To minimize noise and keep maintenance manageable, please follow the guidelines below and indicate the venue of publication in your PR.
#### Guidelines

If your pull request contains a new result, please make sure that "new result" appears
somewhere in the title of the PR. This way, we can track which tasks are the most
active and receive the most attention.
**Results**   Results reported in published papers are preferred; an exception may be made for influential preprints.

**Datasets**   Datasets should have been used for evaluation in at least one published paper besides
the one that introduced the dataset.

**Code**   To make reproduction easier, we recommend adding a link to an implementation
for each method if one is available. You can add a `Code` column (see below) to the table if it does not exist.
In the `Code` column, indicate an official implementation with [Official](http://link_to_implementation).
If an unofficial implementation is available, use [Link](http://link_to_implementation) (see below).
If no implementation is available, you can leave the cell empty.

| Model | Score | Paper / Source | Code |
| ------------- | :-----:| --- | --- |
| | | | [Official](http://link_to_implementation) |
| | | | [Link](http://link_to_implementation) |
#### Adding a new result

If you would like to add a new result, you can just click on the small edit button in the top-right
corner of the file for the respective task (see below).

![Click on the edit button to add a file](img/edit_file.png)

This allows you to edit the file in Markdown. Simply add a row to the corresponding table in the
same format and make sure that the table stays sorted (with the best result on top); an example row is sketched at the end of this section.
After you've made your change, check that the table still renders correctly by clicking on the
"Preview changes" tab at the top of the page. If everything looks good, go to the bottom of the page,
where you will see the form below.

![Fill out the file change information](img/propose_file_change.png)

Add a name for your proposed change, an optional description, indicate that you would like to
"Create a new branch for this commit and start a pull request", and click on "Propose file change".

#### Adding a new dataset or task

To add a new dataset or task, you can either use the edit workflow described above or fork the repository.
In both cases, follow the steps below:

1. If your task is completely new, create a new file and link to it in the table of contents above.
1. If not, add your task or dataset to the respective section of the corresponding file (in alphabetical order).
1. Briefly describe the dataset/task and include relevant references.
1. Describe the evaluation setting and evaluation metric.
1. Show what an annotated example of the dataset/task looks like.
1. Add a download link if available.
1. Copy the table below and fill in at least two results (including the state-of-the-art)
for your dataset/task (change Score to the metric of your dataset). If your dataset/task
has multiple metrics, add them to the right of `Score`. A filled-in sketch of a complete task file is shown after the template.
1. Submit your change as a pull request.

| Model | Score | Paper / Source | Code |
| ------------- | :-----:| --- | --- |
| | | | |
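
As a sketch of steps 2–7, a minimal new task file might look like the following; the task name, dataset, download link, and results are hypothetical placeholders, and a real file should follow the structure of the existing task pages.

```markdown
# Example task

Example task is the task of ... (brief description and relevant references).

### Example dataset

The Example dataset (Author et al., 2017) contains ... Models are evaluated on the
held-out test set and scored with accuracy. An annotated example looks as follows:

> An example input sentence. -> LABEL

The data can be downloaded [here](http://link_to_dataset).

| Model | Accuracy | Paper / Source | Code |
| ------------- | :-----:| --- | --- |
| Example Model (Author et al., 2018) | 90.1 | [Paper title](http://link_to_paper) | [Official](http://link_to_implementation) |
| Baseline (Author et al., 2017) | 85.4 | [Paper title](http://link_to_paper) | |
```

The new file would also need to be linked from the table of contents above, e.g. `- [Example task](english/example_task.md)`.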


### Things to do

- Add a column for code (see above) to each table and a link to the source code to each method.
- Add pointers on how to retrieve data.
- Provide more details regarding the evaluation setup of each task.
- Add an example to every task/dataset.
- Add statistics to every dataset.
- Provide a description and details for every task / dataset.
- Add a table of contents to every file (particularly the large ones).
- We could potentially use [readthedocs](https://github.com/rtfd/readthedocs.org) to provide a clearer structure.
- Most datasets in this list are for English; the sections for other languages above could be expanded with more datasets and tasks.

### Wish list

These are tasks and datasets that are still missing:

- Bilingual dictionary induction
- Discourse parsing
- Keyphrase extraction
- Knowledge base population (KBP)
- More dialogue tasks
- Semi-supervised learning

### Instructions for building the site locally

Instructions for building the website locally using Jekyll can be found [here](jekyll_instructions.md).
Deleted files:

- `_data/ccg_supertagging.yaml` (24 lines removed)
- `_data/chunking.yaml` (12 lines removed)
- `_data/constituency_parsing.yaml` (63 lines removed)
