Stars
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language Models (LLMs).
High accuracy RAG for answering questions from scientific documents with citations
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
An extremely fast Python linter and code formatter, written in Rust.
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
interactive visualization of 5 popular gradient descent methods with step-by-step illustration and hyperparameter tuning UI
Official inference repo for FLUX.1 models
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Test Software for the Characterization of AI Technologies
Make it easy to automatically and uniformly measure the behavior of many AI Systems.
Code and example data for the paper: Rule Based Rewards for Language Model Safety
Agentic components of the Llama Stack APIs
Utilities intended for use with Llama models.
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various r…
This is a Phi-3 book for getting started with Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) avai…
WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method which reduces LLM performance on WMDP while retaining …
Improving Alignment and Robustness with Circuit Breakers
Generative AI extensions for onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
A curated list of awesome leaderboard-oriented resources for foundation models
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
A Python toolbox for performing gradient-free optimization
GPT4V-level open-source multi-modal model based on Llama3-8B
code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
lightweight, standalone C++ inference engine for Google's Gemma models.
The official PyTorch implementation of Google's Gemma models