
Multi-GPU training memory usage #3

Open
nilin1998 opened this issue Jun 5, 2023 · 1 comment

Comments

@nilin1998

With fp16 precision, single-GPU fine-tuning uses 16 GB of GPU memory, but multi-GPU training needs 16 GB on each of two cards. My understanding was that in multi-GPU training the required 16 GB would be split evenly across the cards, so each card would only use 8 GB. Where does the problem lie?

Single GPU: [screenshot]

Multi-GPU: [screenshot]

@zejunwang1
Owner

Multi-GPU training usually means data parallelism: each card trains on its own batch of data, so each card still uses 16 GB of memory. With two cards, the benefit of multi-GPU training is that the effective batch_size is twice the single-GPU value, so the total number of training steps is halved and training time goes down.
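
A minimal PyTorch DistributedDataParallel sketch (not from this repo; the model, shapes, and file name are illustrative) of what data parallelism means here: every rank builds its own full fp16 model replica, so per-GPU memory matches single-GPU training, while each rank sees a different slice of the data, doubling the effective batch size with two cards.

```python
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    # torchrun starts one process per GPU and sets LOCAL_RANK for each.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each rank builds a FULL copy of the model on its own GPU.
    # Data parallelism replicates parameters rather than sharding them,
    # which is why per-GPU memory stays the same as single-GPU training.
    model = nn.Linear(1024, 1024).half().cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # DistributedSampler gives each rank a different slice of the data,
    # so the effective batch size is per_gpu_batch * world_size.
    dataset = TensorDataset(torch.randn(4096, 1024).half(),
                            torch.randn(4096, 1024).half())
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    for x, y in loader:
        x, y = x.cuda(local_rank), y.cuda(local_rank)
        loss = nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()   # gradients are all-reduced across ranks here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=2 ddp_sketch.py
```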
