yizhang2077

Starred repositories

Inference Llama 2 in one file of pure C

C 17,180 2,043 Updated Aug 6, 2024

LLM training in simple, raw C/CUDA

Cuda 23,352 2,601 Updated Sep 21, 2024

🎉 CUDA Learn Notes with PyTorch: fp32, fp16/bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot prod, elementwise, softmax, layernorm, rmsnorm, hist etc.

Cuda 1,168 121 Updated Sep 21, 2024

📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.

2,516 170 Updated Sep 19, 2024

Making large AI models cheaper, faster and more accessible

Python 38,631 4,331 Updated Sep 19, 2024

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 7,487 689 Updated Sep 21, 2024

learning how CUDA works

Cuda 152 19 Updated Aug 16, 2024

LLM inference in C/C++

C++ 65,203 9,345 Updated Sep 21, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 5,197 369 Updated Sep 21, 2024

Trading module

Python 3,513 794 Updated May 13, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 36,294 5,675 Updated Aug 19, 2024

[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ :news: qbot-mini: https://github.com/Charmve/iQuant

Jupyter Notebook 6,996 939 Updated Sep 21, 2024

Open-source framework for developing quantitative trading platforms, based on Python

Python 24,503 8,595 Updated Sep 15, 2024

Inference code for Llama models

Python 55,553 9,478 Updated Aug 18, 2024

The official Meta Llama 3 GitHub site

Python 26,205 2,951 Updated Aug 12, 2024

NoSQL data store using the SEASTAR framework, compatible with Redis

C++ 1,313 172 Updated Oct 2, 2019

NoSQL data store using the seastar framework, compatible with Apache Cassandra

C++ 13,245 1,256 Updated Sep 20, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 27,085 3,976 Updated Sep 21, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,377 2,128 Updated Aug 12, 2024

A collection of modern/faster/saner alternatives to common unix commands.

30,663 780 Updated Sep 10, 2024

k8s tutorials

Go 4,502 511 Updated Apr 9, 2024

High Performance Embedded Key-Value Store

C 680 57 Updated Sep 20, 2024

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.

Go 2,060 167 Updated Sep 21, 2024

[SIGMOD 2023] High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations

C++ 41 2 Updated Mar 17, 2023

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 92,517 14,812 Updated Sep 21, 2024

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 31,288 5,452 Updated Sep 20, 2024

notes on papers/books/codes

Python 199 21 Updated May 6, 2022

Pika is a Redis-Compatible database developed by Qihoo's infrastructure team.

C++ 5,835 1,189 Updated Sep 19, 2024

PyTorch code and models for the DINOv2 self-supervised learning method.

Jupyter Notebook 8,831 768 Updated Aug 7, 2024

eight-legged essay for MySQL. Just a learning Project

40 Updated Jul 2, 2023