
Pull requests: microsoft/DeepSpeedExamples


Pull requests list

Remove the fixed eot_token mechanism for SFT
#927 opened Sep 15, 2024 by Xingfu-Yi
Add LoRA optimization to the SD training example
#873 opened Mar 8, 2024 by PareesaMS
Replace deprecated transformers.deepspeed module
#872 opened Mar 6, 2024 by HollowMan6
[DeepSpeed-Chat] Fix OOM issue in dataloader
#841 opened Jan 1, 2024 by youkaichao
Add DPO support for DeepSpeed-Chat
#828 opened Dec 8, 2023 by stceum
Inference test enhance
#713 opened Aug 31, 2023 by sakogan
Auto feature selection for deepspeed-chat training stage 1
#605 opened Jun 23, 2023 by cli99
Add FALCON inference-test
#557 opened May 30, 2023 by RezaYazdaniAminabadi
Xiaoxia/fix dropout
#537 opened May 19, 2023 by xiaoxiawu-microsoft
.
#462 opened Apr 28, 2023 by yaozhewei
fix step_time
#454 opened Apr 28, 2023 by thuzhf
Fix numerical instability of critic loss
#424 opened Apr 25, 2023 by s-isaev
Empty partition cache
#360 opened Apr 19, 2023 by tjruwase
fix run_6.7b.sh for single gpu
#293 opened Apr 13, 2023 by FindHao
Fix port number in cifar model compression
#237 opened Feb 17, 2023 by SuperSecureHuman
Fix syntax and dependency error for cifar
#234 opened Jan 14, 2023 by yenchenlin
Fixed dataset bug in bing_bert.
#117 opened Jul 21, 2021 by wenting-zhao
Remove redundant layer norm operation
#67 opened Dec 5, 2020 by owmohamm