- NVIDIA
- Toronto, Canada
Stars
Generative AI extensions for onnxruntime
The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.
An autoregressive character-level language model for making more things
Neural Networks: Zero to Hero
DSPy: The framework for programming—not prompting—foundation models
llama3.np is a pure NumPy implementation of the Llama 3 model.
A VSCode extension to generate development environments using micromamba and the conda-forge package repository
A book about compiling Racket and Python to x86-64 assembly
A Python framework for high performance GPU simulation and graphics
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Development repository for the Triton language and compiler
Enabling CPython multi-core parallelism via subinterpreters.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.
Fast and memory-efficient exact attention
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
Utilities for using Python's PEP 554 subinterpreters
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.
High accuracy RAG for answering questions from scientific documents with citations
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
Some notes on things I find interesting and important.
`std::execution`, the proposed C++ framework for asynchronous and parallel programming.
A hybrid thread/fiber task scheduler written in C++11
[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl