Stars
Simple Finetuning Starter Code for Segment Anything
Python API for Tuya WiFi smart devices using a direct local area network (LAN) connection or the cloud (TuyaCloud API).
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
[ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
[ECCV 2024] The official code of the paper "Open-Vocabulary SAM".
[OpenPAR] An open-source framework for Pedestrian Attribute Recognition, based on PyTorch
[ECCV'24] Official Implementation of SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
[ECCV 2024] SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution
PyTorch implementation for the paper Don't Look into the Dark: Latent Codes for Pluralistic Image Inpainting (CVPR2024).
[ECCV 2024] The official implementation of the paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Universal Monocular Metric Depth Estimation
[ECCV 2024] Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
[arXiv 2024] Improving Unsupervised Video Object Segmentation via Fake Flow Generation
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
Latte: Latent Diffusion Transformer for Video Generation.
[ECCV 2024] ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion
[CVPR24] Official Implementation of 'A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing'
This repo contains the code for our paper "An Image is Worth 32 Tokens for Reconstruction and Generation".
[CVPR 2024] Exploring Orthogonality in Open World Object Detection
This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral, Best Student Paper].
AuraSR: GAN-based Super-Resolution for real-world
[CVPR 2024 Oral] MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation.
[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)