Stars
3 stars written in Cuda
FlashInfer: Kernel Library for LLM Serving
A throughput-oriented high-performance serving framework for LLMs
libcubwt is a library for GPU-accelerated suffix array and Burrows-Wheeler transform construction.