Awesome Video Object Segmentation

Recent Advances in Video Object Segmentation (VOS). VOS works before 2022 can be found in our review paper:

Deep Learning for Video Object Segmentation: A Review / paper / project page

🧸 We indicate different VOS types with coloured squares:

🟦 SVOS: Semi-Supervised VOS (also termed One-Shot VOS)

🟩 UVOS: Unsupervised VOS (also termed Zero-Shot VOS)

🟧 RVOS: Referring VOS (also termed Language-Guided VOS)

🟥 AVOS: Audio-Guided VOS (also termed Audio-Visual Video Segmentation)

⬜ XVOS: Other types of VOS

🧸 Please feel free to send us pull requests to add VOS works.

Links for a quick jump: ArXiv 2023, ACM MM 2023, ICCV 2023, CVPR 2023, IJCAI 2023, AAAI 2023, Journals 2023, Earlier ArXiv 2023, NeurIPS 2022, ECCV 2022, CVPR 2022, AAAI 2022, Journals 2022

ArXiv 2023 (Within the last 6 months)

🟧 RVOS Nov - paper / code - VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models (🔥 versatile model, supports RVOS)

🟩 UVOS Nov - paper / code - Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation

⬜ XVOS Nov - paper / code - Sketch-based Video Object Segmentation: Benchmark and Analysis

⬜ XVOS Nov - paper / code - Learning the What and How of Annotation in Video Object Segmentation

🟦 SVOS Oct - paper / code - Putting the Object Back into Video Object Segmentation

🟦 SVOS Oct - paper / code - Sub-token ViT Embedding via Stochastic Resonance Transformers (supports SVOS)

🟦 SVOS Sep - paper / DATASET - PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

🟩 UVOS Sep - paper / code - Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation

🟥 AVOS Sep - paper / code - Rethinking Audiovisual Segmentation with Semantic Quantization and Decomposition

🟦 SVOS Aug - paper / code - Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation

🟧 RVOS 🟥 AVOS Aug - paper / code - EPCFormer: Expression Prompt Collaboration Transformer for Universal Referring Video Object Segmentation

🟧 RVOS Aug - paper / code - Learning Referring Video Object Segmentation from Weak Annotation

🟦 SVOS Jul - paper / code - Tracking Anything in High Quality

🟧 RVOS Jul - paper / code - Referring Video Object Segmentation with Inter-Frame Interaction and Cross-Modal Correlation

🟧 RVOS Jul - paper / code - RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation

⬜ XVOS Jul - paper / code - Segment Anything Meets Point Tracking

🟧 RVOS Jun - paper / code - LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation

ACM MM 2023

🟩 UVOS - paper / code - SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation

🟩 UVOS - paper / code - Temporally Efficient Gabor Transformer for Unsupervised Video Object Segmentation

🟦 SVOS - paper / code - Exploring the Adversarial Robustness of Video Object Segmentation via One-shot Adversarial Attacks

🟥 AVOS - paper / code - CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation

🟥 AVOS - paper / code - Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics

ICCV 2023

⬜ XVOS - paper / code - Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation

🟩 UVOS - paper / code - Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations (self-supervised learning for UVOS)

🟩 UVOS - paper / code - Isomer: Isomerous Transformer for Zero-Shot Video Object Segmentation

🟩 UVOS - paper / code - Unsupervised Video Object Segmentation with Online Adversarial Self-Tuning

🟩 UVOS 🟧 RVOS - paper / code - DEVA: Tracking Anything with Decoupled Video Segmentation (🔥 versatile model)

🟧 RVOS - paper / code - Temporal Collection and Distribution for Referring Video Object Segmentation

🟧 RVOS - paper / code - Robust Referring Video Object Segmentation with Cyclic Structural Consensus

🟧 RVOS - paper / code - Spectrum-guided Multi-granularity Referring Video Object Segmentation

🟧 RVOS - paper / code - OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation

🟧 RVOS - paper / code - Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples

🟧 RVOS - paper / code - HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation

🟧 RVOS - paper / DATASET - MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions

🟦 SVOS - paper / code - Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation (🔥 versatile model)

🟦 SVOS - paper / code - XMem++: Production-level Video Segmentation From Few Annotated Frames

🟦 SVOS - paper / code - Scalable Video Object Segmentation with Simplified Framework

🟦 SVOS - paper / code - Alignment Before Aggregation: Trajectory Memory Retrieval Network for Video Object Segmentation

🟦 SVOS - paper / code - SegGPT: Segmenting Everything In Context (🔥 versatile model)

🟦 SVOS - paper / DATASET - LVOS: A Benchmark for Long-term Video Object Segmentation

🟦 SVOS - paper / DATASET - MOSE: A New Dataset for Video Object Segmentation in Complex Scenes

CVPR 2023

🟩 UVOS - paper / code - MED-VT: Multiscale Encoder-Decoder Video Transformer with Application to Object Segmentation

🟦 SVOS - paper / code - Boosting Video Object Segmentation via Space-time Correspondence Learning

🟦 SVOS 🟧 RVOS - paper / code - Universal Instance Perception as Object Discovery and Retrieval (🔥 versatile model)

🟦 SVOS - paper / code - TarViS: A Unified Approach for Target-Based Video Segmentation (🔥 versatile model)

🟦 SVOS - paper / code - Two-shot Video Object Segmentation

🟦 SVOS - paper / code - MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation

🟦 SVOS - paper / code - Look Before You Match: Instance Understanding Matters in Video Object Segmentation

⬜ XVOS - paper / DATASET - Breaking the “Object” in Video Object Segmentation

IJCAI 2023

🟥 AVOS - paper / code - Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation

🟦 SVOS - paper / DATASET - Video Object Segmentation in Panoptic Wild Scenes

AAAI 2023

🟦 SVOS - paper / code - Learning to Learn Better for Video Object Segmentation

Journals 2023

🟩 UVOS - paper / code - TIP Hierarchical Graph Pattern Understanding for Zero-Shot Video Object Segmentation

🟩 UVOS - paper / code - TCSVT Online Unsupervised Video Object Segmentation via Contrastive Motion Clustering

🟩 UVOS - paper / code - TIP Hierarchical Co-Attention Propagation Network for Zero-Shot Video Object Segmentation

🟧 RVOS - paper / code - TPAMI VLT: Vision-Language Transformer and Query Generation for Referring Segmentation

🟧 RVOS - paper / code - TPAMI Local-Global Context Aware Transformer for Language-Guided Video Segmentation

Earlier ArXiv 2023

🟩 UVOS - paper / code - UVOSAM: A Mask-free Paradigm for Unsupervised Video Object Segmentation via Segment Anything Model

🟧 RVOS 🟥 AVOS - paper / code - Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation

🟧 RVOS - paper / code - SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

⬜ XVOS - paper / code - Segment and Track Anything

⬜ XVOS - paper / code - Track Anything: Segment Anything Meets Videos

⬜ XVOS - paper / code - Reliability-Hierarchical Memory Network for Scribble-Supervised Video Object Segmentation

NeurIPS 2022

🟦 SVOS - paper / code - Decoupling Features in Hierarchical Propagation for Video Object Segmentation

⬜ XVOS - paper / code - Self-supervised Amodal Video Object Segmentation

ECCV 2022

🟦 SVOS - paper / code - XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

🟦 SVOS - paper / code - BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation

🟦 SVOS - paper / code - Learning Quality-aware Dynamic Memory for Video Object Segmentation

🟦 SVOS - paper / code - Tackling Background Distraction in Video Object Segmentation

🟦 SVOS - paper / code - Global Spectral Filter Memory Network for Video Object Segmentation

🟩 UVOS - paper / code - Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation

CVPR 2022

🟧 RVOS - paper / code - End-to-End Referring Video Object Segmentation With Multimodal Transformers

🟧 RVOS - paper / code - Language As Queries for Referring Video Object Segmentation

🟧 RVOS - paper / code - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation

🟧 RVOS - paper / code - Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation

🟦 SVOS - paper / code - Recurrent Dynamic Embedding for Video Object Segmentation

🟦 SVOS - paper / code - Accelerating Video Object Segmentation With Compressed Video

🟦 SVOS - paper / code - SWEM: Towards Real-Time Video Object Segmentation With Sequential Weighted Expectation-Maximization

🟦 SVOS - paper / code - Per-Clip Video Object Segmentation

🟥 AVOS - paper / code - Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks

⬜ XVOS - paper / DATASET - YouMVOS: An Actor-Centric Multi-Shot Video Object Segmentation Dataset

AAAI 2022

🟦 SVOS - paper / code - Siamese Network with Interactive Transformer for Video Object Segmentation

🟦 SVOS - paper / code - Reliable Propagation-Correction Modulation for Video Object Segmentation

🟧 RVOS - paper / code - You Only Infer Once: Cross-Modal Meta-Transfer for Referring Video Object Segmentation

🟩 UVOS - paper / code - Iteratively Selecting an Easy Reference Frame Makes Unsupervised Video Object Segmentation Easier

Journals 2022

🟦 SVOS - paper / code - TPAMI Video Object Segmentation Using Kernelized Memory Network With Multiple Kernels

🟦 SVOS - paper / code - TIP From Pixels to Semantics: Self-Supervised Video Object Segmentation With Multiperspective Feature Mining

🟦 SVOS - paper / code - TIP Delving Deeper Into Mask Utilization in Video Object Segmentation

🟦 SVOS - paper / code - TIP Adaptive Online Mutual Learning Bi-Decoders for Video Object Segmentation

End of the list. 🌱

VOS papers and datasets published before 2022 can be found in our review paper:

Deep Learning for Video Object Segmentation: A Review / paper / project page
