Starred repositories
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
A native PyTorch library for large model training
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
EvolKit is a framework that automatically increases the complexity of instructions used for fine-tuning Large Language Models (LLMs).
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
🌟 Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.
A throughput-oriented high-performance serving framework for LLMs
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long-Context Transformer Model Training and Inference
LinkedIn_AIHawk is a tool that automates the job application process on LinkedIn. Using artificial intelligence, it enables users to apply to multiple job offers in an automated and personalized way.
LongRoPE is a method that extends the context window of pre-trained LLMs to 2048k tokens.
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Efficient Triton Kernels for LLM Training
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)
Dafny is a verification-aware programming language
Lean 4 programming language and theorem prover
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Helpful tools and examples for working with flex-attention
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.