Iman Tabrizian (Tabrizian)

342 followers · 180 following

NVIDIA
Toronto, Canada

Highlights

Developer Program Member

Stars

microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

C++ · 453 stars · 110 forks · Updated Oct 8, 2024

twinnydotdev / twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.

TypeScript · 2,937 stars · 154 forks · Updated Oct 7, 2024

karpathy / makemore

An autoregressive character-level language model for making more things

Python · 2,509 stars · 659 forks · Updated Jun 4, 2024

karpathy / nn-zero-to-hero

Neural Networks: Zero to Hero

Jupyter Notebook · 11,670 stars · 1,459 forks · Updated Aug 18, 2024

prefix-dev / pixi

Package management made easy

Rust · 3,057 stars · 171 forks · Updated Oct 8, 2024

stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models

Python · 17,586 stars · 1,339 forks · Updated Oct 8, 2024

likejazz / llama3.np

llama3.np is a pure NumPy implementation for Llama 3 model.

Python · 959 stars · 73 forks · Updated Jun 2, 2024

mamba-org / vscode-micromamba

A VSCode extension to generate development environments using micromamba and conda-forge package repository

TypeScript · 85 stars · 10 forks · Updated Sep 13, 2024

NVIDIA / cuda-checkpoint

CUDA checkpoint and restore utility

Cuda · 207 stars · 10 forks · Updated Apr 17, 2024

IUCompilerCourse / Essentials-of-Compilation

A book about compiling Racket and Python to x86-64 assembly

TeX · 1,295 stars · 141 forks · Updated Oct 7, 2024

NVIDIA / warp

A Python framework for high performance GPU simulation and graphics

Python · 4,168 stars · 232 forks · Updated Oct 8, 2024

S-LoRA / S-LoRA

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python · 1,725 stars · 91 forks · Updated Jan 21, 2024

triton-lang / triton

Development repository for the Triton language and compiler

C++ · 12,959 stars · 1,577 forks · Updated Oct 8, 2024

dfm / extending-jax

Extending JAX with custom C++ and CUDA code

Python · 373 stars · 21 forks · Updated Aug 18, 2024

ericsnowcurrently / multi-core-python

Enabling CPython multi-core parallelism via subinterpreters.

245 stars · 6 forks · Updated Aug 19, 2022

karpathy / minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python · 9,083 stars · 841 forks · Updated Jul 1, 2024

ritchieng / the-incredible-pytorch

The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

11,406 stars · 2,107 forks · Updated Sep 25, 2024

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python · 13,668 stars · 1,254 forks · Updated Oct 8, 2024

ml-explore / mlx

MLX: An array framework for Apple silicon

C++ · 16,661 stars · 957 forks · Updated Oct 8, 2024

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ · 8,357 stars · 940 forks · Updated Oct 8, 2024

triton-inference-server / pytriton

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.

Python · 720 stars · 50 forks · Updated Sep 25, 2024

jsbueno / extrainterpreters

Utilities for using Python's PEP 554 subinterpreters

Python · 112 stars · 6 forks · Updated Oct 1, 2024

pytorch / multipy

torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.

C++ · 173 stars · 35 forks · Updated Jun 20, 2024

Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations

Python · 6,046 stars · 568 forks · Updated Oct 7, 2024

karpathy / micrograd

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Jupyter Notebook · 10,149 stars · 1,461 forks · Updated Aug 8, 2024

frankmcsherry / blog

Some notes on things I find interesting and important.

JavaScript · 1,968 stars · 177 forks · Updated Sep 11, 2024

NVIDIA / stdexec

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++ · 1,542 stars · 158 forks · Updated Sep 20, 2024

rapidsai / rmm

RAPIDS Memory Manager

C++ · 480 stars · 195 forks · Updated Oct 8, 2024

google / marl

A hybrid thread / fiber task scheduler written in C++ 11

C++ · 1,863 stars · 193 forks · Updated Jul 12, 2024

NVIDIA / libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ · 2,291 stars · 186 forks · Updated Feb 7, 2024
