  • GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

    C++ Other Updated Aug 9, 2024
  • A Library for Advanced Deep Time Series Models.

    Python MIT License Updated Jul 23, 2024
  • LLM Inference analyzer for different hardware platforms

    Jupyter Notebook MIT License Updated Jul 15, 2024
  • Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM stan…

    C++ MIT License Updated Jul 10, 2024
  • galois Public

    Forked from mhostetter/galois

    A performant NumPy extension for Galois fields and their applications

    Python MIT License Updated Jul 7, 2024
  • tiny-gpu Public

    Forked from adam-maj/tiny-gpu

    A minimal GPU design in Verilog to learn how GPUs work from the ground up

    SystemVerilog Updated May 1, 2024
  • VAR Public

    Forked from FoundationVision/VAR

    [GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …

    Python MIT License Updated Apr 30, 2024
  • wanda Public

    Forked from locuslab/wanda

    A simple and effective LLM pruning approach.

    Python MIT License Updated Apr 29, 2024
  • llmperf Public

    Forked from ray-project/llmperf

    LLMPerf is a library for validating and benchmarking LLMs

    Python Apache License 2.0 Updated Apr 28, 2024
  • 📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

    GNU General Public License v3.0 Updated Apr 21, 2024
  • LLM-Viewer Public

    Forked from hahnyuan/LLM-Viewer

    Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

    Python MIT License Updated Apr 12, 2024
  • sparsegpt Public

    Forked from IST-DASLab/sparsegpt

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python Apache License 2.0 Updated Apr 6, 2024
  • calm Public

    Forked from zeux/calm

    C(UDA) accelerated language model inference

    C MIT License Updated Apr 3, 2024
  • A collection of benchmarks to measure basic GPU capabilities

    Jupyter Notebook GNU General Public License v3.0 Updated Apr 3, 2024
  • DejaVu Public

    Forked from FMInference/DejaVu
    Python Updated Apr 2, 2024
  • sot Public

    Forked from imagination-research/sot

    [ICLR 2024] Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding

    Python MIT License Updated Mar 1, 2024
  • mixbench Public

    Forked from ekondis/mixbench

    A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)

    C++ GNU General Public License v2.0 Updated Feb 23, 2024
  • Fixes macOS Preview garbled annotations

    Rust MIT License Updated Feb 23, 2024
  • Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Server models.

    Python Apache License 2.0 Updated Feb 20, 2024
  • llm-analysis Public

    Forked from cli99/llm-analysis

    Latency and Memory Analysis of Transformer Models for Training and Inference

    Python Apache License 2.0 Updated Feb 3, 2024
  • Theoretical performance analysis tools for LLMs, supporting parameter count, FLOPs, memory, and latency analysis.

    Python Updated Feb 2, 2024
  • gemmini Public

    Forked from ucb-bar/gemmini

    Berkeley's Spatial Array Generator

    Scala Other Updated Jan 29, 2024
  • BladeDISC Public

    Forked from alibaba/BladeDISC

    BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

    C++ Apache License 2.0 Updated Jan 22, 2024
  • zigzag Public

    Forked from KULeuven-MICAS/zigzag

    HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators

    C++ BSD 3-Clause "New" or "Revised" License Updated Jan 19, 2024
  • Repository to host and maintain scale-sim-v2 code

    Python Updated Jan 2, 2024
  • Verilog CERN Open Hardware Licence Version 2 - Strongly Reciprocal Updated Dec 15, 2023
  • nnfusion Public

    Forked from microsoft/nnfusion

    A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.

    C++ MIT License Updated Nov 22, 2023
  • calculon Public

    Forked from calculon-ai/calculon
    Python Apache License 2.0 Updated Nov 16, 2023
  • pdfs Public

    Forked from tpn/pdfs

    Technically-oriented PDF Collection (Papers, Specs, Decks, Manuals, etc)

    HTML Updated Oct 22, 2023
  • Collect some CS textbooks for learning.

    Updated Oct 17, 2023