Tabrizian
  • NVIDIA
  • Toronto, Canada

Organizations

@nuxt-community @kubeflow @triton-inference-server

Generative AI extensions for onnxruntime

C++ 453 110 Updated Oct 8, 2024

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.

TypeScript 2,937 154 Updated Oct 7, 2024

An autoregressive character-level language model for making more things

Python 2,509 659 Updated Jun 4, 2024

Neural Networks: Zero to Hero

Jupyter Notebook 11,670 1,459 Updated Aug 18, 2024

Package management made easy

Rust 3,057 171 Updated Oct 8, 2024

DSPy: The framework for programming—not prompting—foundation models

Python 17,586 1,339 Updated Oct 8, 2024

llama3.np is a pure NumPy implementation for Llama 3 model.

Python 959 73 Updated Jun 2, 2024

A VSCode extension to generate development environments using micromamba and conda-forge package repository

TypeScript 85 10 Updated Sep 13, 2024

CUDA checkpoint and restore utility

Cuda 207 10 Updated Apr 17, 2024

A book about compiling Racket and Python to x86-64 assembly

TeX 1,295 141 Updated Oct 7, 2024

A Python framework for high performance GPU simulation and graphics

Python 4,168 232 Updated Oct 8, 2024

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,725 91 Updated Jan 21, 2024

Development repository for the Triton language and compiler

C++ 12,959 1,577 Updated Oct 8, 2024

Extending JAX with custom C++ and CUDA code

Python 373 21 Updated Aug 18, 2024

Enabling CPython multi-core parallelism via subinterpreters.

245 6 Updated Aug 19, 2022

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,083 841 Updated Jul 1, 2024

The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

11,406 2,107 Updated Sep 25, 2024

Fast and memory-efficient exact attention

Python 13,668 1,254 Updated Oct 8, 2024

MLX: An array framework for Apple silicon

C++ 16,661 957 Updated Oct 8, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,357 940 Updated Oct 8, 2024

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.

Python 720 50 Updated Sep 25, 2024

Utilities for using Python's PEP 554 subinterpreters

Python 112 6 Updated Oct 1, 2024

torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.

C++ 173 35 Updated Jun 20, 2024

High accuracy RAG for answering questions from scientific documents with citations

Python 6,046 568 Updated Oct 7, 2024

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Jupyter Notebook 10,149 1,461 Updated Aug 8, 2024

Some notes on things I find interesting and important.

JavaScript 1,968 177 Updated Sep 11, 2024

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++ 1,542 158 Updated Sep 20, 2024

RAPIDS Memory Manager

C++ 480 195 Updated Oct 8, 2024

A hybrid thread / fiber task scheduler written in C++ 11

C++ 1,863 193 Updated Jul 12, 2024

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,291 186 Updated Feb 7, 2024