Code for the book "High Performance Python 2e" by Micha Gorelick and Ian Ozsvald with O'Reilly

Python 408 135 Updated Jan 18, 2023

LLM101n: Let's build a Storyteller

28,911 1,582 Updated Aug 1, 2024

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 10,112 1,165 Updated Sep 1, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,404 596 Updated Sep 20, 2024

AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,536 363 Updated Sep 13, 2024

The official PyTorch implementation of Google's Gemma models

Python 5,242 502 Updated Jul 31, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,065 836 Updated Jul 1, 2024

Build system, successor to Buck

Rust 3,523 215 Updated Sep 26, 2024

ZP7: Zach's Peppy Parallel-Prefix-Popcountin' PEXT/PDEP Polyfill

C 43 3 Updated Aug 14, 2024

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,714 89 Updated Jan 21, 2024

An annotated implementation of the Transformer paper.

Jupyter Notebook 5,604 1,212 Updated Apr 7, 2024

Inference code for CodeLlama models

Python 15,899 1,848 Updated Aug 12, 2024

ChatLaw: A Powerful LLM Tailored for Chinese Law. A large language model for Chinese law.

6,839 540 Updated Jun 4, 2024

SQuangLe is a C++ API for accessing MySQL servers

C++ 123 54 Updated Sep 26, 2024

RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.

C++ 8,900 1,123 Updated Aug 6, 2024

Header-only C++ library for low precision floating point type emulation.

C++ 163 26 Updated Jan 24, 2020

Inference Llama 2 in one file of pure C

C 17,214 2,053 Updated Aug 6, 2024

Towards a New File Format

C++ 151 8 Updated Sep 16, 2024

Fastest Integer Compression

C 762 111 Updated Mar 1, 2024

Benchmarks of approximate nearest neighbor libraries in Python

Python 4,871 734 Updated Sep 2, 2024

Warp speed Data Transfer (WDT) is an embeddable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.

C++ 2,862 391 Updated Aug 23, 2024

ai4db and db4ai work

647 87 Updated Aug 16, 2024

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of …

Go 10,939 749 Updated Sep 26, 2024

Open-source vector similarity search for Postgres

C 11,888 540 Updated Sep 26, 2024

Awesome-LLM: a curated list of Large Language Models

17,594 1,428 Updated Sep 23, 2024

Cuckoo Index: A Lightweight Secondary Index Structure

C++ 129 18 Updated Dec 2, 2021

A library for efficient similarity search and clustering of dense vectors.

C++ 30,664 3,572 Updated Sep 26, 2024

magic-trace collects and displays high-resolution traces of what a process is doing

OCaml 4,568 85 Updated Jun 4, 2024

Distributed object store

Java 1,741 275 Updated Sep 26, 2024

Run Google Test suites in parallel.

Python 418 104 Updated Jan 16, 2024