Finetune flag #540
Conversation
This looks reasonable to me, but I'd prefer that @pruksmhc or @W4ngatang be the main reviewers on this. It might help to add a comment to defaults clarifying what counts as a pretrained model—this is for GPT/BERT/ELMo, not for smaller models that we train here.
Yep, this looks good, thanks for tackling this, Haokun! Do some runs to test that this works.
Does it look better now?
Tested, looks reasonable.
Generally looks good and it's a good change, but it needs some docs changes.
config/defaults.conf
@@ -70,8 +70,11 @@ do_full_eval = 1 // Evaluate the model on the tasks on target_tasks.

 // Related configuration
 load_model = 1 // If true, restore from checkpoint when starting do_pretrain. No impact on do_target_task_training.
-transfer_paradigm = "frozen" // How to do learning on top of the pretrained model
+transfer_paradigm = "frozen" // How to do learning on top of the task-independently pretrained model
+// Note: this option only works for models like ELMO/GPT/BERT (models that was pretrained on large amount of data)
Why wouldn't these apply if we're using LSTMs or some non-pretrained models?
You are right, I don't know what I was thinking.
Looks good to me once you attend to Alex's comments!
Now there are merge conflicts—dang. Mind fixing those, then we can merge?
* remove bert_fine_tune and openai_transformer_fine_tune
* update config and readme
* add explanation to transfer_paradigm
* fix the comment
Removed the redundant bert_fine_tune and openai_transformer_fine_tune flags. transfer_paradigm now controls whether the pretrained parameters in BERT and GPT are trainable.
#505
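
For context on the pattern the single transfer_paradigm flag describes, here is a minimal sketch of the usual freeze-vs-finetune switch for a pretrained encoder in PyTorch. This is illustrative only, not jiant's actual implementation; the names build_encoder and apply_transfer_paradigm are hypothetical, and the LSTM is just a stand-in for a large pretrained model like BERT/GPT/ELMo.

```python
# Illustrative sketch only: NOT the jiant implementation. Shows how a
# transfer_paradigm value of "frozen" vs. "finetune" typically toggles
# whether the pretrained encoder's weights receive gradient updates.
import torch.nn as nn


def build_encoder() -> nn.Module:
    """Stand-in for a large pretrained sentence encoder (e.g. BERT/GPT/ELMo)."""
    return nn.LSTM(input_size=300, hidden_size=512, num_layers=2)


def apply_transfer_paradigm(encoder: nn.Module, transfer_paradigm: str) -> None:
    """Freeze or unfreeze the pretrained encoder according to the config value."""
    if transfer_paradigm == "frozen":
        # Keep the pretrained weights fixed; only task-specific heads get gradients.
        for param in encoder.parameters():
            param.requires_grad = False
    elif transfer_paradigm == "finetune":
        # Allow gradients to flow into the pretrained weights as well.
        for param in encoder.parameters():
            param.requires_grad = True
    else:
        raise ValueError(f"Unknown transfer_paradigm: {transfer_paradigm!r}")


if __name__ == "__main__":
    encoder = build_encoder()
    apply_transfer_paradigm(encoder, "frozen")
    trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
    print(f"Trainable encoder parameters: {trainable}")  # 0 under "frozen"
```

Under "frozen" the encoder contributes no trainable parameters, while "finetune" makes all of them trainable, which is the distinction the removed bert_fine_tune and openai_transformer_fine_tune flags used to express separately.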