Fixed errors caused by upper-case model names, and changed the description #82

Merged
merged 85 commits, Aug 29, 2022

Changes from 6 commits (85 commits total)

Commits
83fceae
saved workd
Anhforth Jun 29, 2022
f113f7c
saved workd
Anhforth Jun 29, 2022
708ce15
saved work on 6.29
Anhforth Jun 30, 2022
0d1b079
transformed tokenizer: progressing
Anhforth Jul 1, 2022
2763b5d
Opt 30b (#16)
920232796 Jul 1, 2022
3e52907
fix bert tokenizer issue (#18)
Anhforth Jul 1, 2022
deb2612
reconstruct the tokenizer structure
ZhaodongYan1 Jul 3, 2022
c2c6e9d
tested the new tokenizer
Anhforth Jul 4, 2022
fc2b5d8
removed some redundant codes and added sp model
Anhforth Jul 4, 2022
7da1757
updated the tokenizer
ZhaodongYan1 Jul 4, 2022
7c8c0b1
saved work
Anhforth Jul 5, 2022
3a0c8cb
Opt 66b (#19)
920232796 Jul 6, 2022
265d35a
saved work on 7.6
Anhforth Jul 6, 2022
4f8d715
updated release version
Anhforth Jul 6, 2022
efc1310
fix tokenizer issue
Anhforth Jul 6, 2022
59531e7
temp save
Anhforth Jul 6, 2022
3b6c16a
tokenizer test passed
Anhforth Jul 6, 2022
a7ff8f3
fixed some errors
Anhforth Jul 7, 2022
f4ff1a8
test of tokenizer transform
Anhforth Jul 7, 2022
811d9e9
fixed conflicts
Anhforth Jul 7, 2022
1406d89
fixed error
Anhforth Jul 7, 2022
b30eefa
add encode_plus
Anhforth Jul 8, 2022
9b81869
fix bug multi_gpu_training
920232796 Jul 8, 2022
7ad38a0
Merge pull request #21 from baai-open-internal/fix_multi_gpu_training
Anhforth Jul 8, 2022
72ffd6a
changed the version
Anhforth Jul 8, 2022
e6f89a6
fix_validation_bug (#24)
920232796 Jul 11, 2022
29ea850
updated the version
Anhforth Jul 11, 2022
4c68936
updated
Anhforth Jul 15, 2022
4834f23
modified encoder_plus
Anhforth Jul 15, 2022
8d44329
add vit and examples
920232796 Jul 15, 2022
81c438d
vit and examples
920232796 Jul 15, 2022
da24628
Update base_model.py
marscrazy Jul 15, 2022
aff728b
Update vit.py
marscrazy Jul 15, 2022
e5a0ddb
modify readme.md
920232796 Jul 15, 2022
fe56b8b
modify readme.md
920232796 Jul 15, 2022
fc6c32e
delete annotating code
920232796 Jul 15, 2022
cd45e5c
Vit xzh (#25)
920232796 Jul 15, 2022
5448084
updated
Anhforth Jul 17, 2022
eb555fc
updated
Anhforth Jul 17, 2022
9649aa4
performing tests on examples
Anhforth Jul 17, 2022
67c1288
finished example testing
Anhforth Jul 18, 2022
faee281
Merge branch 'develop' into vit_xzh
BAAI-OpenPlatform Jul 19, 2022
06f0b69
Merge pull request #28 from baai-open-internal/vit_xzh
BAAI-OpenPlatform Jul 19, 2022
deaa120
Merge pull request #27 from baai-open-internal/develop
marscrazy Jul 20, 2022
9558a47
env trainer
920232796 Jul 20, 2022
c35d4b6
Merge pull request #29 from baai-open-internal/env_args
marscrazy Jul 20, 2022
437caa4
vit-checkpoint-activations
920232796 Jul 21, 2022
dc6fc3d
vit-checkpoint-activations
920232796 Jul 21, 2022
c1cec9f
Merge pull request #33 from baai-open-internal/vit-checkpointing-acti…
marscrazy Jul 21, 2022
d74cf92
update
jongjyh Jul 25, 2022
044bc80
Merge pull request #34 from baai-open-internal/fix_eval_loss
marscrazy Jul 25, 2022
d85f8af
merged the master
Anhforth Jul 26, 2022
1b5ecc6
inference and train
wchh-2000 Jul 29, 2022
1fe6d3e
fix bug bert model
xuanricheng Aug 5, 2022
0c243d6
add autoloader and example training data
wchh-2000 Aug 15, 2022
2c28a7d
updated seq2seq
shunxing1234 Aug 16, 2022
e03247e
update
wchh-2000 Aug 16, 2022
4a4b003
Merge pull request #52 from baai-open-internal/add_clip
marscrazy Aug 17, 2022
ce5fd31
Merge branch 'master' into transform_tokenizer
Anhforth Aug 18, 2022
8353cd3
Update train.py
marscrazy Aug 18, 2022
5d5e135
Delete tst_superglue.py
marscrazy Aug 18, 2022
4c6ba56
updated according to comments
BAAI-OpenPlatform Aug 19, 2022
6076287
Merge pull request #50 from baai-open-internal/bert_model
BAAI-OpenPlatform Aug 19, 2022
c11e232
merged the clip tokenizer
BAAI-OpenPlatform Aug 22, 2022
6e135ef
merged clip tokenizer
BAAI-OpenPlatform Aug 23, 2022
fd06e4d
Update inference_clip.py
marscrazy Aug 25, 2022
b61b708
Update auto_loader.py
marscrazy Aug 25, 2022
25b659b
Update glm_10b_en_tokenizer.py
marscrazy Aug 25, 2022
8cffa38
Merge pull request #20 from baai-open-internal/transform_tokenizer
marscrazy Aug 25, 2022
9117f78
swinv1v2
920232796 Aug 25, 2022
f3186d9
Merge pull request #58 from baai-open-internal/swinv1v2_checkpoint_ac…
marscrazy Aug 25, 2022
4bd211d
updated the version
Anhforth Aug 25, 2022
6ef4190
updated the requirement packages list
Anhforth Aug 25, 2022
036e337
fixed some issues
BAAI-OpenPlatform Aug 26, 2022
edfd518
fixed some issues
BAAI-OpenPlatform Aug 26, 2022
497d709
tried to fix the data directory not found error
BAAI-OpenPlatform Aug 26, 2022
1ac43c0
fixed issues in running glm_seq2seq
BAAI-OpenPlatform Aug 26, 2022
351fba7
Update test_glm_seq2seq.py
marscrazy Aug 26, 2022
35b5d9a
Merge pull request #59 from baai-open-internal/fix_issues
marscrazy Aug 26, 2022
b5a14ed
fix glm tokenizer bug
920232796 Aug 29, 2022
9f786e0
fix a glm tokenizer bug
920232796 Aug 29, 2022
18c95e2
Update tokenizer.py
marscrazy Aug 29, 2022
56c081f
Merge branch 'master' into fix_glm_tokenizer
marscrazy Aug 29, 2022
c3c3569
Merge pull request #60 from baai-open-internal/fix_glm_tokenizer
marscrazy Aug 29, 2022
1c28821
merged upstream
Anhforth Aug 29, 2022
2 changes: 1 addition & 1 deletion README.md
@@ -11,7 +11,7 @@ FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensib

* Now it supports **WuDao GLM** with a maximum of 10 billion parameters (see [Introduction to GLM](/docs/GLM.md)). It also supports **BERT**, **RoBERTa**, **GPT2**, **T5**, and models from Huggingface Transformers.

* It provides APIs to quickly download and use those pre-trained models on a given text, fine-tune them on widely-used datasets collected from [SuperGLUE](https://super.gluebenchmark.com/) and [CLUE](https://github.com/CLUEbenchmark/CLUE) benchmarks, and then share them with the community on our model hub. It also provides [prompt-learning](/docs/TUTORIAL_7_PROMPT_LERANING.md) toolkit for few shot tasks.
* It provides APIs to quickly download and use those pre-trained models on a given text, fine-tune them on widely-used datasets collected from [SuperGLUE](https://super.gluebenchmark.com/) and [CLUE](https://github.com/CLUEbenchmark/CLUE) benchmarks, and then share them with the community on our model hub. It also provides [prompt-learning](/docs/TUTORIAL_7_PROMPT_LEARNING.md) toolkit for few shot tasks.

* These models can be applied to (Chinese/English) Text, for tasks like text classification, information extraction, question answering, summarization, and text generation.

2 changes: 1 addition & 1 deletion README_zh.md
@@ -207,7 +207,7 @@ for text_pair in test_data:
* [Tutorial 4: Customize a trainer for model- and data-parallel training](/doc_zh/TUTORIAL_4_TRAINER.md)
* [Tutorial 5: Simplify model and tokenizer initialization with Autoloader](/doc_zh/TUTORIAL_5_INSTRUCTIONS_FOR_AutoLoader.md)
* [Tutorial 6: Use off-the-shelf inference algorithms with Predictor](/doc_zh/TUTORIAL_6_INSTRUCTIONS_FOR_PREDICTOR.md)
* [Tutorial 7: Use the FlagAI prompt-learning toolkit to improve performance on SuperGLUE tasks](/doc_zh/TUTORIAL_7_PROMPT_LERANING.md)
* [Tutorial 7: Use the FlagAI prompt-learning toolkit to improve performance on SuperGLUE tasks](/doc_zh/TUTORIAL_7_PROMPT_LEARNING.md)
* [Tutorial 8: Set up the environment for multi-node training](/doc_zh/TUTORIAL_8_ENVIRONMENT_SETUP.md)
* [Tutorial 9: Text generation with encoder/decoder/encoder-decoder models](/doc_zh/TUTORIAL_9_SEQ2SEQ_METHOD.md)

12 changes: 8 additions & 4 deletions examples/glm_title_generation/train.py
@@ -27,12 +27,16 @@
                  num_checkpoints=1,
                  )

cur_dir = os.path.dirname(os.path.abspath(__file__))
src_dir = cur_dir + '/data/train.src'
tgt_dir = cur_dir + '/data/train.tgt'
# cur_dir = os.path.dirname(os.path.abspath(__file__))
# src_dir = cur_dir + '/data/train.src'
# tgt_dir = cur_dir + '/data/train.tgt'

src_dir = "./data/train.src"
tgt_dir = "./data/train.tgt"


maxlen = 256
auto_loader = AutoLoader("seq2seq",
auto_loader = AutoLoader("lm",
model_name="GLM-large-ch",
model_dir="./state_dict/")
model = auto_loader.get_model()
98 changes: 97 additions & 1 deletion examples/opt/README.md
@@ -52,4 +52,100 @@ out = predictor.predict_generate_randomsample(text,
                                              repetition_penalty=3.0)

print(f"input is {text} \n out is {out}")
```

# Multi-GPU inference
## OPT-30b

To run inference with multiple GPUs and model parallelism, we use torch DDP and the Megatron-LM library.
### Basic steps
1. Set up the model-parallel parameters, such as ```model_parallel_size``` and ```world_size```
2. Initialize torch DDP
3. Initialize Megatron-LM model parallelism
4. Set the random seed
5. Initialize the model and tokenizer
6. Run prediction
### Code
```python
import torch
import os
import argparse
from flagai import mpu
from flagai.auto_model.auto_loader import AutoLoader
import random
import numpy as np
from flagai.model.predictor.predictor import Predictor

# run script : python -m torch.distributed.launch --nproc_per_node=4 --nnodes=1 opt_30b_en_mutigpu.py
os.environ["ENV_TYPE"] = "deepspeed+mpu"
model_parallel_size = 4
world_size = 4

os.environ["MODEL_PARALLEL_SIZE"] = str(model_parallel_size)
os.environ["WORLD_SIZE"] = str(world_size)

def set_random_seed(seed):
"""Set random seed for reproducability."""
if seed is not None and seed > 0:
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
mpu.model_parallel_cuda_manual_seed(seed)

parser = argparse.ArgumentParser()
parser.add_argument('--local_rank',
                    type=int,
                    default=0,
                    help="local_rank")

ds_args = parser.parse_args()
local_rank = ds_args.local_rank

master_addr = os.environ.get('MASTER_ADDR', '127.0.0.1')
master_port = os.environ.get('MASTER_PORT', '17501')

device = torch.device("cuda", local_rank)

def initialize_distributed():
"""Initialize torch.distributed."""
torch.backends.cudnn.enabled = False
# Manually set the device ids.
torch.cuda.set_device(device)
# Call the init process
init_method = 'tcp://'

init_method += master_addr + ':' + master_port
torch.distributed.init_process_group(
backend='nccl', # gloo
world_size=world_size,
rank=local_rank,
init_method=init_method)
mpu.initialize_model_parallel(model_parallel_size)

initialize_distributed()

set_random_seed(123)

print(f"building model...")
loader = AutoLoader("lm", model_name="opt-30b-en")
model = loader.get_model()
tokenizer = loader.get_tokenizer()
model.half()

model.parallel_output = False
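# (parallel_output=False, set above, is taken here to make the model return
# gathered, full-vocab logits on every rank, the usual Megatron-style
# convention, so the predictor can sample directly)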
model.eval()
model.to(device)

torch.distributed.barrier(group=mpu.get_model_parallel_group())

text = """I think The Old Man and the Sea is a very good book, what do you think? I think """

predictor = Predictor(model, tokenizer)
out = predictor.predict_generate_randomsample(text)
if mpu.get_model_parallel_rank() == 0:
print(f"pred is {out}")
```
### Run script
```commandline
python -m torch.distributed.launch --nproc_per_node=4 --nnodes=1 opt_30b_en_mutigpu.py
```
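
Note that ```--nproc_per_node``` should equal the ```model_parallel_size``` and ```world_size``` set in the script (4 here), so that one process drives each model-parallel partition.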
22 changes: 4 additions & 18 deletions examples/opt/generate_opt_30b.py
@@ -1,27 +1,13 @@
from flagai.model.predictor.predictor import Predictor
from flagai.auto_model.auto_loader import AutoLoader
import torch
import os
# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device = torch.device("cpu")
# loader = AutoLoader(task_name="lm",
# model_name="opt-30b")
# loader.load_pretrain_params("/mnt/models_xingzhaohu/opt_30b")

loader = AutoLoader(task_name="lm",
model_name="opt-30b-en")

from flagai.model.opt_model import OPTModel
from flagai.data.tokenizer.opt.opt_en_tokenizer import OPTTokenizer
print(f"正在构建模型")
model = OPTModel.init_from_json(os.path.join("/mnt/models_xingzhaohu/opt_30b", "config.json"))
tokenizer = OPTTokenizer()
model.load_weights("/mnt/models_xingzhaohu/opt_30b/pytorch_model.bin")



# model = loader.get_model()
# tokenizer = loader.get_tokenizer()
model = loader.get_model()
tokenizer = loader.get_tokenizer()
model.eval()
model.to(device)

text = "The trophy doesn’t fit in the suitcase because "
predictor = Predictor(model, tokenizer)
22 changes: 22 additions & 0 deletions examples/opt/generate_opt_66b.py
@@ -0,0 +1,22 @@
from flagai.model.predictor.predictor import Predictor
from flagai.auto_model.auto_loader import AutoLoader
import torch

loader = AutoLoader(task_name="lm",
model_name="opt-66b-en")

model = loader.get_model()
tokenizer = loader.get_tokenizer()
model.eval()

text = """I think The Old Man and the Sea is a very good book, what do you think? Thank you for your question, I think """

predictor = Predictor(model, tokenizer)
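# top_k/top_p restrict sampling to the most likely tokens, and
# repetition_penalty > 1 discourages repeated tokens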
out = predictor.predict_generate_randomsample(text,
                                              input_max_length=100,
                                              out_max_length=300,
                                              top_k=50,
                                              top_p=0.9,
                                              repetition_penalty=3.0)

print(f"input is {text} \n out is {out}")
15 changes: 9 additions & 6 deletions examples/opt/opt_30b_en_mutigpu.py
@@ -1,4 +1,4 @@
# os.environ["CUDA_VISIBLE_DEVICES"] = "0,2"

import torch
import os
import argparse
@@ -7,8 +7,9 @@
import random
import numpy as np
from flagai.model.predictor.predictor import Predictor
import glob
import time

# run script : python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 glm_blank_filling_QA_ch_mutigpu.py
os.environ["ENV_TYPE"] = "deepspeed+mpu"
model_parallel_size = 4
world_size = 4
@@ -58,11 +59,14 @@ def initialize_distributed():

set_random_seed(123)

loader = AutoLoader("lm", model_name="opt-350m-en")
print(f"building model...")
loader = AutoLoader("lm", model_name="opt-30b-en")
model = loader.get_model()
model.half()
tokenizer = loader.get_tokenizer()
# model.parallel_output = False
model.half()

model.parallel_output = False

model.eval()
model.to(device)

@@ -75,4 +79,3 @@ def initialize_distributed():
if mpu.get_model_parallel_rank() == 0:
print(f"pred is {out}")


108 changes: 108 additions & 0 deletions examples/opt/opt_66b_en_mutigpu.py
@@ -0,0 +1,108 @@
# os.environ["CUDA_VISIBLE_DEVICES"] = "0,2"
import torch
import os
import time
os.environ["ENV_TYPE"] = "deepspeed+mpu"
os.environ["MODEL_PARALLEL_SIZE"] = '8'
os.environ["WORLD_SIZE"] = '8'
import argparse
from flagai import mpu
import random
import numpy as np
from flagai.model.predictor.predictor import Predictor
from flagai.model.opt_model import OPTModel
from flagai.data.tokenizer import OPTTokenizer
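
# The helpers below coordinate staggered checkpoint loading across the ranks
# on this node through two small files in the working directory:
# 'current_rank' records which rank should load next, and 'current_pool'
# caps how many ranks may read their weight shard from disk at the same time.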

def get_current_rank():
    with open('current_rank','r',encoding='utf8') as infile:
        line = infile.readline().strip()
        return int(line)

def set_current_rank(rank):
    with open('current_rank','w',encoding='utf8') as outfile:
        outfile.write(str(rank))

def get_current_pool():
    with open('current_pool','r',encoding='utf8') as infile:
        line = infile.readline().strip()
        return int(line)

def set_current_pool(rank):
    with open('current_pool','w',encoding='utf8') as outfile:
        outfile.write(str(rank))

# run script : python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 opt_66b_en_mutigpu.py
parser = argparse.ArgumentParser()
parser.add_argument('--local_rank',
                    type=int,
                    default=0,
                    help="local_rank")

def set_random_seed(seed):
"""Set random seed for reproducability."""
if seed is not None and seed > 0:
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
mpu.model_parallel_cuda_manual_seed(seed)

ds_args = parser.parse_args()
local_rank = ds_args.local_rank

master_addr = os.environ.get('MASTER_ADDR', '127.0.0.1')
master_port = os.environ.get('MASTER_PORT', '17501')

device = torch.device("cuda", local_rank)
model_parallel_size = 8
world_size = 8

def initialize_distributed():
"""Initialize torch.distributed."""
torch.backends.cudnn.enabled = False
# Manually set the device ids.
torch.cuda.set_device(device)
# Call the init process
init_method = 'tcp://'

init_method += master_addr + ':' + master_port
torch.distributed.init_process_group(
backend='nccl', # gloo
world_size=world_size,
rank=local_rank,
init_method=init_method)
mpu.initialize_model_parallel(model_parallel_size)

initialize_distributed()

set_current_pool(4)
set_current_rank(0)
set_random_seed(123)
torch.distributed.barrier(group=mpu.get_model_parallel_group())
tokenizer = OPTTokenizer()
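
# Wait for this rank's turn, then take one slot from the pool so that only a
# limited number of ranks read their checkpoint shard concurrently.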

while get_current_rank() != local_rank:
    time.sleep(10)
while get_current_pool() == 0:
    time.sleep(10)
set_current_pool(get_current_pool()-1)
print("loading rank {}".format(local_rank))
set_current_rank(local_rank + 1)

model = OPTModel.init_from_json('/mnt/models_xingzhaohu/opt-66b-en/config.json')
checkpoint_path = '/mnt/models_xingzhaohu/opt-66b-en/pytorch_model_{:02d}.bin'.format(local_rank)
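# each of the 8 model-parallel ranks will load only its own weight shard
# (pytorch_model_00.bin ... pytorch_model_07.bin) below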
model.half()
model.eval()
model.to(device)
model.load_weights(checkpoint_path)

print("loading rank {} finished".format(local_rank))
set_current_pool(get_current_pool()+1)
print('current rank setting is {}'.format(get_current_pool()))

torch.distributed.barrier(group=mpu.get_model_parallel_group())
text = """I think The Old Man and the Sea is a very good book, what do you think? I think """

predictor = Predictor(model, tokenizer)
out = predictor.predict_generate_randomsample(text)
if mpu.get_model_parallel_rank() == 0:
    print(f"pred is {out}")
