Starred repositories
🎧 Open-source Spotify client that requires neither Premium nor Electron! Available for both desktop & mobile!
🍃 Organic Maps is a free Android & iOS offline maps app for travelers, tourists, hikers, and cyclists. It uses crowd-sourced OpenStreetMap data and is developed with love by MapsWithMe (MapsMe) fou…
A lightweight library for portable low-level GPU computation using WebGPU.
A modular graph-based Retrieval-Augmented Generation (RAG) system
C++ template library for high performance SIMD based sorting algorithms
Backward compatible ML compute opset inspired by HLO/MHLO
LightSeq: A High Performance Library for Sequence Processing and Generation
A Python framework for high performance GPU simulation and graphics
Develop Desktop, Embedded, Mobile and WebAssembly apps with C# and XAML. The most popular .NET UI client technology
A fairly powerful web mind map.
ChatGLM3 series: Open Bilingual Chat LLMs
Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…
A fast inference library for running LLMs locally on modern consumer-class GPUs
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.
A framework for few-shot evaluation of language models.
Fast and memory-efficient exact attention
Hackable and optimized Transformers building blocks, supporting a composable construction.
Ongoing research training transformer models at scale