Text generation improvement (UI client, data parallel support) #5437

Merged on Dec 9, 2022 (98 commits)
Commits
2dec00d
Squashed commit of the following:
yidong72 Oct 13, 2022
e2dd840
Merge branch 'main' into universal_prompt_fix
yidong72 Oct 13, 2022
3d4f8d4
fix LGTM
yidong72 Oct 13, 2022
6308f97
fix validation
yidong72 Oct 13, 2022
fa7a720
change for the lm eval
yidong72 Oct 13, 2022
301a8b7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2022
5101e06
make text generation work in data parallel environment
yidong72 Oct 14, 2022
349cdfe
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 14, 2022
50d9970
implement the service with rest service
yidong72 Oct 15, 2022
951f520
Merge branch 'universal_prompt_fix' of github.com:NVIDIA/NeMo into un…
yidong72 Oct 15, 2022
3231a48
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 15, 2022
b64f1ba
suppress log
yidong72 Oct 15, 2022
da54820
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 15, 2022
d4970d3
Fix
MaximumEntropy Oct 18, 2022
9676a69
Fix
MaximumEntropy Oct 19, 2022
e5aef83
Merge branch 'main' of github.com:NVIDIA/NeMo into t0_dataset_fixes
MaximumEntropy Oct 19, 2022
d4d51f6
Fixes
MaximumEntropy Oct 19, 2022
bb7b44c
Update config
MaximumEntropy Oct 19, 2022
ec8df6a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 19, 2022
2e16243
Restore function needed for NMT
MaximumEntropy Oct 19, 2022
0a60990
Merge branch 't0_dataset_fixes' of github.com:NVIDIA/NeMo into t0_dat…
MaximumEntropy Oct 19, 2022
7395dd7
Merge branch 'main' into universal_prompt_fix
yidong72 Oct 20, 2022
2f348ba
handles no answer only
yidong72 Oct 20, 2022
1387925
Fix config
MaximumEntropy Oct 21, 2022
f7f844d
added knn to web
yidong72 Oct 21, 2022
86798a3
fix lgtm.com comments
yidong72 Oct 21, 2022
97b8dcc
output the retrieved context
yidong72 Oct 22, 2022
1cd4ac0
allow no neighbor query
yidong72 Oct 25, 2022
3718fd6
remove the imports
yidong72 Oct 25, 2022
ba1e50b
warn only once
yidong72 Oct 25, 2022
011e6a9
Change output file format from JSON to JSONL
MaximumEntropy Oct 27, 2022
f17545d
Merge branch 't0_dataset_fixes' into universal_prompt_newdata
yidong72 Oct 28, 2022
c062103
new t0 dataset
yidong72 Oct 31, 2022
92485bb
Add T0 data preproc scripts
MaximumEntropy Nov 1, 2022
4600377
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 1, 2022
177a81f
Merge and multiprocessing
MaximumEntropy Nov 1, 2022
257548d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 1, 2022
9a9b735
Fix for is_correct
MaximumEntropy Nov 1, 2022
b44b4e1
Merge branch 't0_dataset_fixes' of github.com:NVIDIA/NeMo into t0_dat…
MaximumEntropy Nov 1, 2022
aab8679
fix epoch > 2
yidong72 Nov 1, 2022
fd54348
handles multiple dataloader
yidong72 Nov 1, 2022
76658f9
remove template
yidong72 Nov 1, 2022
8ebff3d
Refactor T0 dataset
MaximumEntropy Nov 2, 2022
ea663bd
Add script to merge train folder into individual training files to mi…
MaximumEntropy Nov 2, 2022
3e266b1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
77e6917
Merge branch 'main' into t0_dataset_fixes
MaximumEntropy Nov 2, 2022
98a75be
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
1709ddf
added on the fly service
yidong72 Nov 2, 2022
d9c169c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
a4eba25
add combo instance
yidong72 Nov 2, 2022
83eccf4
Merge branch 'universal_prompt_fix' of github.com:NVIDIA/NeMo into un…
yidong72 Nov 2, 2022
87c17e6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
5df10dc
added combo service
yidong72 Nov 2, 2022
3682322
Merge branch 'universal_prompt_fix' of github.com:NVIDIA/NeMo into un…
yidong72 Nov 2, 2022
0b33b49
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
83bf269
send weights back to server
yidong72 Nov 2, 2022
816a4f3
Merge branch 'universal_prompt_fix' of github.com:NVIDIA/NeMo into un…
yidong72 Nov 2, 2022
c20df14
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
06e25a8
fix index store
yidong72 Nov 2, 2022
ea69455
Merge branch 'universal_prompt_fix' of github.com:NVIDIA/NeMo into un…
yidong72 Nov 2, 2022
52b37a4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
a1a3bf4
Minor changes
MaximumEntropy Nov 2, 2022
06625db
Merge branch 't0_dataset_fixes' of github.com:NVIDIA/NeMo into t0_dat…
MaximumEntropy Nov 2, 2022
65da5d6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 2, 2022
54b5556
add reset button
yidong72 Nov 3, 2022
31d7aa3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 3, 2022
5371163
add add eos
yidong72 Nov 3, 2022
f9e1bab
Merge branch 'universal_prompt_fix' of github.com:NVIDIA/NeMo into un…
yidong72 Nov 3, 2022
f52f88b
use a separate bert service
yidong72 Nov 3, 2022
7717163
no loss of accuracy
yidong72 Nov 3, 2022
def6ac1
pin the gradio version
yidong72 Nov 3, 2022
7d20338
Remove bin compat
MaximumEntropy Nov 4, 2022
12ed3eb
Merge
MaximumEntropy Nov 4, 2022
999d242
Fix header lines
MaximumEntropy Nov 4, 2022
9d98f83
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 4, 2022
143ed80
Merge branch 'universal_prompt_fix' into universal_prompt_newdata
yidong72 Nov 4, 2022
eb67182
Merge branch 't0_dataset_fixes' into universal_prompt_newdata
yidong72 Nov 4, 2022
41da78d
evaluate based on text generation
yidong72 Nov 4, 2022
3ffe51f
exact match result aggregation
yidong72 Nov 5, 2022
374865a
working SP and SA
yidong72 Nov 7, 2022
d4adef0
sync
yidong72 Nov 7, 2022
93236ac
fix checkpoint
yidong72 Nov 8, 2022
1cc6c55
fix eval
yidong72 Nov 8, 2022
1dd1be1
backup states
yidong72 Nov 8, 2022
09af294
backup states reset
yidong72 Nov 8, 2022
9ef26c9
fix the bug
yidong72 Nov 8, 2022
84e8df9
fix evaluation for sentence piece
yidong72 Nov 10, 2022
7f4aa82
fix a bug
yidong72 Nov 12, 2022
b4903a8
Merge branch 'main' into universal_prompt_newdata
yidong72 Nov 14, 2022
43cec8b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 14, 2022
f94a374
potential fix in the future
yidong72 Nov 15, 2022
d25650b
Merge branch 'universal_prompt_newdata' of github.com:NVIDIA/NeMo int…
yidong72 Nov 15, 2022
791682c
Merge branch 'main' into text_generation_improvement
yidong72 Nov 16, 2022
b0b06a1
remove the universal codes
yidong72 Nov 16, 2022
8680ef3
remove universal strategy
yidong72 Nov 16, 2022
1db6582
Merge branch 'main' into text_generation_improvement
okuchaiev Nov 16, 2022
09d5854
Merge branch 'main' into text_generation_improvement
yidong72 Dec 8, 2022
5f33b3b
address reviewer comment
yidong72 Dec 8, 2022
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] committed Oct 13, 2022
commit 301a8b770dae2b7bca848dee635f68c81dbdfe48
@@ -426,7 +426,9 @@ def tokenize_batch_with_context_and_completion(self, sentences, max_len, add_BOS
         """
         tokenizer = self.model.tokenizer
         if add_BOS:
-            context_tokens = [[[tokenizer.eos_id]+tokenizer.text_to_ids(s[0]), tokenizer.text_to_ids(s[1])] for s in sentences]
+            context_tokens = [
+                [[tokenizer.eos_id] + tokenizer.text_to_ids(s[0]), tokenizer.text_to_ids(s[1])] for s in sentences
+            ]
         else:
             context_tokens = [[tokenizer.text_to_ids(s[0]), tokenizer.text_to_ids(s[1])] for s in sentences]
         if self.pad_token_for_retrieval:
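For reviewers who want to see what this hunk does to the data, here is a minimal standalone sketch of the same list comprehension. It assumes a tokenizer object exposing `eos_id` and `text_to_ids`, as the diff does; `ToyTokenizer` and `tokenize_pairs` are hypothetical names made up for this illustration and are not part of NeMo. The commit itself only reformats the line, so behavior should be identical before and after.

```python
from typing import List, Sequence, Tuple


class ToyTokenizer:
    """Hypothetical stand-in for the model tokenizer (not the NeMo class)."""

    eos_id = 0

    def text_to_ids(self, text: str) -> List[int]:
        # Map each whitespace-separated word to a positive id, so that
        # no ordinary token ever collides with eos_id (0).
        return [1 + (sum(map(ord, word)) % 97) for word in text.split()]


def tokenize_pairs(
    tokenizer, sentences: Sequence[Tuple[str, str]], add_BOS: bool
) -> List[List[List[int]]]:
    """Tokenize (context, completion) pairs, mirroring the hunk above.

    When add_BOS is set, eos_id is prepended to the context tokens only;
    the completion tokens are left untouched.
    """
    if add_BOS:
        return [
            [[tokenizer.eos_id] + tokenizer.text_to_ids(s[0]), tokenizer.text_to_ids(s[1])]
            for s in sentences
        ]
    return [[tokenizer.text_to_ids(s[0]), tokenizer.text_to_ids(s[1])] for s in sentences]


pairs = [("what is the capital of France", "Paris"), ("two plus two", "four")]
with_bos = tokenize_pairs(ToyTokenizer(), pairs, add_BOS=True)
without_bos = tokenize_pairs(ToyTokenizer(), pairs, add_BOS=False)
```

Each element of the result is a two-item list, `[context_ids, completion_ids]`, and the only difference between the two branches is the leading `eos_id` on the context.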