-
-
-
vllm Public
Forked from vllm-project/vllmA high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 UpdatedMar 13, 2024 -
TensorRT-LLM Public
Forked from NVIDIA/TensorRT-LLMTensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
C++ Apache License 2.0 UpdatedMar 8, 2024 -
-
TensorRT Public
Forked from NVIDIA/TensorRTNVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
C++ Apache License 2.0 UpdatedFeb 9, 2024 -
-
-