Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

vllm-project/llm-compressor

LLM Compressor

A library for compressing large language models using the latest techniques and research in the field, covering both training-aware and post-training methods. The library is designed to be flexible and easy to use on top of PyTorch and HuggingFace Transformers, allowing for quick experimentation.

More Coming Soon!