LLM-RLHF-Tuning-with-PPO-and-DPO Public
Forked from raghavc/LLM-RLHF-Tuning-with-PPO-and-DPO
Comprehensive toolkit for Reinforcement Learning from Human Feedback (RLHF) training, featuring instruction fine-tuning, reward model training, and support for PPO and DPO algorithms with various c…
Python Updated Mar 18, 2024
-
Xwin-LM Public
Forked from Xwin-LM/Xwin-LM
Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment
Updated Sep 24, 2023
-
HanLP Public
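Forked from hankcs/HanLP
Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing
Python Apache License 2.0 Updated Apr 15, 2023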
-
FangkuaiXiaoXiaoLe Public
Forked from foldcc/FangkuaiXiaoXiaoLe
A casual match-three puzzle game made with Unity
JavaScript Updated Jul 3, 2018