Commit
Add O2 support for RETRO model (#4411)
* make sure post-ln works with new coeff
  Signed-off-by: Yi Dong <[email protected]>
* add some comments
  Signed-off-by: Yi Dong <[email protected]>
* second v
  Signed-off-by: Yi Dong <[email protected]>
* fix headscale
  Signed-off-by: Yi Dong <[email protected]>
* works both for pre-ln and post-ln
  Signed-off-by: Yi Dong <[email protected]>
* fix unittest
  Signed-off-by: Yi Dong <[email protected]>
* stop gradient
  Signed-off-by: Yi Dong <[email protected]>
* stop gradient
  Signed-off-by: Yi Dong <[email protected]>
* use half rotary embedding
  Signed-off-by: Yi Dong <[email protected]>
* use default grad clip
  Signed-off-by: Yi Dong <[email protected]>
* turn off rotary embedding
  Signed-off-by: Yi Dong <[email protected]>
* added o2 support
  Signed-off-by: Yi Dong <[email protected]>
* fix style
  Signed-off-by: Yi Dong <[email protected]>
* add debugging
  Signed-off-by: Yi Dong <[email protected]>
* make cyclic lr work
  Signed-off-by: Yi Dong <[email protected]>
* o2 works with cyclic lr
  Signed-off-by: Yi Dong <[email protected]>
* remove deepnet
  Signed-off-by: Yi Dong <[email protected]>
* fix merge error
  Signed-off-by: Yi Dong <[email protected]>
* update the comments
  Signed-off-by: Yi Dong <[email protected]>
* added output scaling for stable training
  Signed-off-by: Yi Dong <[email protected]>
* improve the debug code
  Signed-off-by: Yi Dong <[email protected]>
* fix comment
  Signed-off-by: Yi Dong <[email protected]>
* move debug hook above
  Signed-off-by: Yi Dong <[email protected]>
* move optimizer config to base class
  Signed-off-by: Yi Dong <[email protected]>
* address comments
  Signed-off-by: Yi Dong <[email protected]>

Co-authored-by: Eric Harper <[email protected]>
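
The commit message does not say what "O2" means. Assuming it refers to Apex-style O2 mixed precision (half-precision model weights, with the optimizer updating a higher-precision master copy), the sketch below shows why such a master copy matters. It is a standard-library illustration only, not code from this commit; `round_to` and `train` are hypothetical helpers, and the learning rate and gradient values are made up.

```python
import struct


def round_to(fmt: str, x: float) -> float:
    """Round x to the nearest value representable in a struct format
    ("e" = IEEE 754 half precision, "f" = single precision)."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def train(steps: int, lr: float, grad: float, use_master: bool) -> float:
    """Apply `steps` identical SGD updates; return the final FP16 weight.

    Illustrative assumption: the gradient is constant, so the only
    difference between the two modes is where the update is accumulated.
    """
    master = round_to("f", 1.0)  # FP32 master copy (used only if use_master)
    weight = round_to("e", 1.0)  # FP16 weight seen by the model
    for _ in range(steps):
        if use_master:
            # Update in FP32, then cast down for the model.
            master = round_to("f", master - lr * grad)
            weight = round_to("e", master)
        else:
            # Update directly in FP16: a small step can round away entirely.
            weight = round_to("e", weight - lr * grad)
    return weight


if __name__ == "__main__":
    print("FP16 only:      ", train(1000, 1e-2, 1e-2, use_master=False))
    print("with FP32 master:", train(1000, 1e-2, 1e-2, use_master=True))
```

With these made-up values each update is about 1e-4, which is smaller than half the FP16 spacing just below 1.0, so the FP16-only weight never moves from 1.0, while the run with a master copy ends close to 0.9.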
1 parent d8785e0, commit c9f16fd
Showing 15 changed files with 375 additions and 114 deletions.
    @@ -13,6 +13,7 @@
    # limitations under the License.

    import logging

    import torch
    from torchmetrics import Metric
