Stars
A collection of recent papers on building autonomous agents. Two topics are included: RL-based and LLM-based agents.
The Triton TensorRT-LLM Backend
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Firefly Chinese LLaMA-2 large model, supporting incremental pre-training of large models such as Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, and Bloom
A high-throughput and memory-efficient inference and serving engine for LLMs
Lightning fast C++/CUDA neural network framework
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
This is a Chinese translation of the CUDA programming guide
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Transformer-related optimization, including BERT and GPT
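Chinese edition of the awesome C++ resource list, covering the standard library, web application frameworks, artificial intelligence, databases, image processing, machine learning, logging, code analysis, and more. Maintained and updated by the teams behind the WeChat official accounts 「开源前哨」 and 「CPP开发者」.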
A curated list of awesome C++ (or C) frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.
C++ Parallel Computing and Asynchronous Networking Framework
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Lightweight thread library for C/C++ coroutine (similar to goroutine), for high performance network servers.
RTSP/RTP/RTMP/FLV/HLS/MPEG-TS/MPEG-PS/MPEG-DASH/MP4/fMP4/MKV/WebM
SRS is a simple, high-efficiency, real-time media server supporting RTMP, WebRTC, HLS, HTTP-FLV, HTTP-TS, SRT, MPEG-DASH, and GB28181.
Named Entity Recognition using multilayered bidirectional LSTM
Expert system with fuzzy control for froth flotation control