Starred repositories
🎉 CUDA Learn Notes with PyTorch: fp32, fp16/bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot prod, elementwise, softmax, layernorm, rmsnorm, hist, etc.
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
Making large AI models cheaper, faster and more accessible
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
SGLang is a fast serving framework for large language models and vision language models.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ :news: qbot-mini: https://github.com/Charmve/iQuant
NoSQL data store using the Seastar framework, compatible with Redis
NoSQL data store using the Seastar framework, compatible with Apache Cassandra
A high-throughput and memory-efficient inference and serving engine for LLMs
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A collection of modern/faster/saner alternatives to common Unix commands.
KubeBlocks is an open-source control plane that runs and manages databases, message queues, and other stateful applications on K8s.
[SIGMOD 2023] High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations
🦜🔗 Build context-aware reasoning applications
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…
Pika is a Redis-Compatible database developed by Qihoo's infrastructure team.
PyTorch code and models for the DINOv2 self-supervised learning method.
Eight-legged essay for MySQL. Just a learning project.