Stars
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Code for PointInfinity: Resolution-Invariant Point Diffusion Models
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
Multiview Compressive Coding for 3D Reconstruction
Code Release for MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition, CVPR 2022
This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data and contains the data, scripts to visualize and proces…
BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)
VOLO: Vision Outlooker for Visual Recognition
This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
A deep learning library for video understanding research.
An end-to-end PyTorch framework for image and video classification
Transformer training code for sequential tasks
Learning Continuous Image Representation with Local Implicit Image Function, in CVPR 2021 (Oral)
MONeT framework for reducing memory consumption of DNN training
You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization
PyTorch implementation of X3D models with Multigrid training.
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations