Stars
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
A unified framework for 3D content generation.
Generative Models by Stability AI
RobustSAM: Segment Anything Robustly on Degraded Images (CVPR 2024 Highlight)
An open source implementation of CLIP.
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream …
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multi…
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.
A Python class for calculating the confusion matrix for object detection tasks
A toolbox of OCR models and algorithms based on MindSpore
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, supports training and…
Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
A natural language interface for computers
A 3D computer vision development toolkit based on PaddlePaddle. It supports point-cloud object detection, segmentation, and monocular 3D object detection models.
EVA Series: Visual Representation Fantasies from BAAI
(TPAMI 2024) A Survey on Open Vocabulary Learning
LAVIS - A One-stop Library for Language-Vision Intelligence
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
AMBER: Automated annotation and Multimodal Bag Extraction for Robotics
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.