Commit

add doc
Victor0118 committed Jun 26, 2018
1 parent 084bae9 commit 5ca9595
Showing 2 changed files with 13 additions and 0 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -107,6 +107,11 @@ In our implementation, when the nbest=10, CharCNN+WordLSTM+CRF model built in NC
![alt text](readme/nbest.png "N best decoding oracle result")


7. Hyperparameter tuning:
========================
You can find some tips [here](readme/hyperparameter_tuning.md). Share your experience with your configuration and your dataset now!


Cite:
========

8 changes: 8 additions & 0 deletions readme/hyperparameter_tuning.md
@@ -0,0 +1,8 @@
## Hyperparameter tuning

1. If you use a large batch size, you'd better set `avg_batch_loss` to `True`.
2. You can gain 2~5 points of improvement by using 300-dimensional word embeddings instead of 50-dimensional ones.
3. If you want to write a script to tune hyperparameters, use `main_parse.py`, which accepts hyperparameters as command-line arguments.
4. `lr` needs to be tuned carefully for different structures:
    * If you run LSTM-LSTM-CRF on the CoNLL-2003 dataset, a good `lr` is 0.015
    * If you run LSTM-CNN-CRF on the CoNLL-2003 dataset, a good `lr` is 0.005
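A tuning script in the spirit of tip 3 could be sketched as follows. This is a minimal illustration only: the flag names `--lr` and `--batch_size` are assumptions, not the verified arguments of `main_parse.py` — check `python main_parse.py --help` for the real argument names before using it.

```python
from itertools import product

def build_commands(lrs, batch_sizes, script="main_parse.py"):
    """Build one training command per (lr, batch_size) combination.

    Flag names below are hypothetical placeholders; substitute the
    actual argparse options defined in main_parse.py.
    """
    commands = []
    for lr, bs in product(lrs, batch_sizes):
        commands.append([
            "python", script,
            "--lr", str(lr),          # hypothetical flag
            "--batch_size", str(bs),  # hypothetical flag
        ])
    return commands

# A small grid around the learning rates mentioned above.
cmds = build_commands(lrs=[0.005, 0.015], batch_sizes=[10, 50])
print(len(cmds))  # → 4
```

Each generated command list could then be launched with `subprocess.run(cmd)` and the dev-set score parsed from the log to pick the best setting.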
