Stars
FlashInfer: Kernel Library for LLM Serving
MSCCL++: A GPU-driven communication stack for scalable AI applications
A throughput-oriented high-performance serving framework for LLMs
🚴 Call stack profiler for Python. Shows you why your code is slow!
A large-scale simulation framework for LLM inference
Machnet provides applications such as databases and financial services with an easy way to access low-latency DPDK-based messaging on public cloud VMs: 750K RPS on Azure at 61 µs P99.9.
A tensor-aware point-to-point communication primitive for machine learning
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Alpaca dataset from Stanford, cleaned and curated
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
Free downloads of English-language magazines including The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic; epub, mobi, and pdf formats supported; updated weekly.
libcubwt is a library for GPU-accelerated suffix array and Burrows-Wheeler transform construction.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.
Open-source software for volunteer computing and grid computing.
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading