Highlights
LLM
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
Example models using DeepSpeed
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
A modular graph-based Retrieval-Augmented Generation (RAG) system
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
Scalable toolkit for efficient model alignment
SGLang is a fast serving framework for large language models and vision language models.
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
Efficient Triton Kernels for LLM Training
A high-throughput and memory-efficient inference and serving engine for LLMs
[Large Models] Train a small 26M-parameter GPT completely from scratch in 3 hours; a 2GB GPU is enough for inference and training!