Awesome-Deep-Stereo-Matching

Welcome to the "Awesome-Deep-Stereo-Matching" repository, a curated list of state-of-the-art deep stereo matching resources maintained by Fabio Tosi and Matteo Poggi, both from the University of Bologna. This repository, inspired by awesome-computer-vision, aims to provide a comprehensive collection of the latest and most influential papers on deep stereo matching published in top-tier computer vision conferences and prestigious journals.

The methods included in this repository are categorized to ease navigation and to clarify the diverse approaches and techniques employed in deep stereo matching research. We also release the reference bib, which contains the BibTeX entries for all the works included on this page.

We use the 🚩 symbol to highlight the most groundbreaking works.

How to submit a pull request?

If you find this repository valuable, please consider citing it in your work and giving it a star! ⭐

Survey & Fundamentals

Stereo Matching Basics

"A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", Scharstein & Szeliski, International Journal of Computer Vision (TPAMI), 2002. [Paper] [Bibtex] [Google Scholar]
"Evaluation of cost functions for stereo matching", Hirschmuller & Scharstein, CVPR, 2007. [Paper] [Bibtex] [Google Scholar]
SGM: "Stereo processing by semiglobal matching and mutual information", Heiko Hirschmuller, TPAMI, 2007. [Paper] [Bibtex] [Google Scholar]
"Computer Vision: Algorithms and Applications", 2nd Edition - (Chapter 12, Depth Estimation), Richard Szeliski [Slides] [Bibtex] [Google Scholar]
"Stereo Matching", Richard Szeliski, University of Washington [Slides]
"Stereo Vision", Fei-Fei Li, Stanford Vision Lab [Slides]
"Stereo Vision: Algorithms and Applications", Stefano Mattoccia, University of Bologna [Slides] [Bibtex] [Google Scholar]

Deep Stereo Matching

"A survey on deep learning techniques for stereo-based depth estimation", Laga et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020. [Paper] [Bibtex] [Google Scholar]
"On the synergies between machine learning and binocular stereo for depth estimation from images: a survey", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. [Paper] [Bibtex] [Google Scholar]

Learned Confidence Estimation

"Quantitative evaluation of confidence measures in a machine learning world", Poggi et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]
"On the Confidence of Stereo Matching in a Deep-Learning Era: A Quantitative Evaluation", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [Paper] [Bibtex] [Google Scholar]

CodeBase

OpenStereo: "OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline", Guo et al., arXiv, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

Datasets

Real-World

RGB

KITTI 2012: "Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite", Geiger et al., CVPR, 2012. [Paper] [Dataset] [Bibtex] [Google Scholar]
Middlebury v3: "High-resolution stereo datasets with subpixel-accurate ground truth", Scharstein et al., GCPR, 2014. [Paper] [Dataset] [Bibtex] [Google Scholar]
Cityscapes: "The cityscapes dataset for semantic urban scene understanding", Cordts et al., CVPR, 2016. [Paper] [Dataset] [Bibtex] [Google Scholar]
ETH3D: "A multi-view stereo benchmark with high-resolution images and multi-camera videos", Schops et al., CVPR, 2017. [Paper] [Dataset] [Bibtex] [Google Scholar]
DrivingStereo: "DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios", Yang et al., CVPR, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
WSVD: "Web stereo video supervision for depth prediction from dynamic scenes", Wang et al., 3DV, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
Flickr1024: "Flickr1024: A large-scale dataset for stereo image super-resolution", Wang et al., ICCVW, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
ApolloScape: "The apolloscape open dataset for autonomous driving and its application", Huang et al., TPAMI, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
Holopix50k: "Holopix50k: A Large-Scale In-the-Wild Stereo Image Dataset", Hua et al., CVPR, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
A* 3d: "A* 3d dataset: Towards autonomous driving in challenging environments", Pham et al., ICRA, 2020. [Paper] [Github] [Bibtex] [Google Scholar]
A2D2: "Audi Autonomous Driving Dataset", Geyer et al., arXiv, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
InStereo2K: "InStereo2K: A Large Real Dataset for Stereo Matching in Indoor Scenes", Bao et al., Science China Information Sciences, 2020. [Paper] [Github] [Bibtex] [Google Scholar]
Middlebury 2021 Mobile Dataset: [Dataset] [Bibtex]
Booster: "Open Challenges in Deep Stereo: The Booster Dataset", Ramirez et al., CVPR, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
WHU-Stereo: "WHU-Stereo: A challenging benchmark for stereo matching of high-resolution satellite images", Li et al., TGRS, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]

Multimodal/Beyond-Visible

CATS: "CATS: A Color and Thermal Stereo Benchmark", Treible et al., CVPR, 2017. [Paper] [Dataset] [Bibtex] [Google Scholar]
RGB-NIR-Stereo: "Deep material-aware cross-spectral stereo matching", Zhi et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
MVSEC: "The Multivehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception", Zhu et al., RAL, 2018. [Paper] [Dataset] [Bibtex] [Google Scholar]
DSEC: "DSEC: A Stereo Event Camera Dataset for Driving Scenarios", Gehrig et al., RAL, 2021. [Paper] [Code] [Dataset] [Bibtex] [Google Scholar]
RGB-MS: "RGB-Multispectral Matching: Dataset, Learning Methodology, Evaluation", Tosi et al., CVPR, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
M3ED: "M3ed: Multi-robot, multi-sensor, multi-environment event dataset", Chaney et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
Gated Stereo: "Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues", Walz et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
RPS/IPS: "DPS-Net: Deep Polarimetric Stereo Depth Estimation", Tian et al., ICCV, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
MS^2: "Deep Depth Estimation From Thermal Image", Shin et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]

Rendered

The NeRF-Stereo Dataset: "NeRF-Supervised Deep Stereo", Tosi et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]

Synthetic

MPI Sintel: "A naturalistic open source movie for optical flow evaluation", Butler et al., ECCV, 2012. [Paper] [Dataset] [Bibtex] [Google Scholar]
Freiburg SceneFlow: "A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation", Mayer et al., CVPR, 2016. [Paper] [Dataset] [Bibtex] [Google Scholar]
Falling Things: "A synthetic dataset for 3d object detection and pose estimation", Tremblay et al., CVPRW, 2018. [Paper] [Dataset] [Bibtex] [Google Scholar]
HS-VS: "Hierarchical deep stereo matching on high-resolution images", Yang et al., CVPR, 2019. [Paper] [Dataset] [Bibtex] [Google Scholar]
Virtual KITTI: "Virtual kitti 2", Cabon et al., arXiv, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
TartanAir: "TartanAir: A dataset to push the limits of visual slam", Wang et al., IROS, 2020. [Paper] [Dataset] [Bibtex] [Google Scholar]
Semi-synthesis: "Semi-synthesis: A fast way to produce effective datasets for stereo matching", He et al., ICCVW, 2021. [Paper] [Bibtex] [Google Scholar]
UnrealStereo4K: "SMD-Nets: Stereo Mixture Density Networks", Tosi et al., CVPR, 2021. [Paper] [Dataset] [Bibtex] [Google Scholar]
IRS: "IRS: A large naturalistic indoor robotics stereo dataset to train deep models for disparity and surface normal estimation", Wang et al., ICME, 2021. [Paper] [Dataset] [Bibtex] [Google Scholar]
CREStereo: "Practical stereo matching via cascaded recurrent network with adaptive correlation", Li et al., CVPR, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
SimStereo: "Active-Passive SimStereo – Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods", Jospin et al., NeurIPS, 2022. [Paper] [Dataset] [Bibtex] [Google Scholar]
Spring: "Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo", Mehl et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
Dynamic Replica: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]
All-In-One Drive: "A Comprehensive Perception Dataset with High-Density Long-Range Point Clouds", Weng et al., arXiv, 2023. [Paper] [Dataset] [Bibtex] [Google Scholar]

Frameworks

Learning for Stereo Pipeline

Matching Cost

Deep Embed: "A deep visual correspondence embedding model for stereo matching costs", Chen et al., ICCV, 2015. [Paper] [Bibtex] [Google Scholar]
🚩 MC-CNN: "Stereo matching by training a convolutional neural network to compare image patches", Zbontar & LeCun, JMLR, 2016. [Paper] [Code] [Bibtex1] [Bibtex2] [Google Scholar]
Content CNN: "Efficient deep learning for stereo matching", Luo et al., CVPR, 2016. [Paper] [Code] [Bibtex] [Google Scholar]
Per-pixel pyramid-pooling: "Look wider to match image patches with convolutional neural networks", Park et al., SPL, 2016. [Paper] [Bibtex] [Google Scholar]
Consistency and Distinctiveness: "Fundamental principles on learning new features for effective dense matching", Zhang et al., TIP, 2017. [Paper] [Bibtex] [Google Scholar]
MC-CNN-WS: "Weakly supervised learning of deep metrics for stereo reconstruction", Tulyakov et al., ICCV, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
CBMV: "CBMV: A coalesced bidirectional matching volume for disparity estimation", Batsos et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
SDC: "SDC - stacked dilated convolution: A unified descriptor network for dense matching tasks", Schuster et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
Semi-dense Stereo: "Semi-dense Stereo Matching using Dual CNNs", Mao et al., WACV, 2019. [Paper] [Bibtex] [Google Scholar]

Optimization

GCP: "Learning to detect ground control points for improving the accuracy of stereo matching", Spyropoulos et al., CVPR, 2014. [Paper] [Bibtex] [Google Scholar]
LevStereo: "Leveraging stereo matching with learning-based confidence measures", Park et al., CVPR, 2015. [Paper] [Bibtex] [Google Scholar]
O1: "Learning a general-purpose confidence measure based on O(1) features and a smarter aggregation strategy for semi global matching", Poggi et al., 3DV, 2016. [Paper] [Bibtex] [Google Scholar]
PBCP: "Patch Based Confidence Prediction for Dense Disparity Map", Seki et al., BMVC, 2016. [Paper] [Bibtex] [Google Scholar]
Sgm-Nets: "Sgm-Nets: Semi-global matching with neural networks", Seki et al., CVPR, 2017. [Paper] [Bibtex] [Google Scholar]
SGM-Forest: "Learning to fuse proposals from multiple scanline optimizations in semi-global matching", Schonberger et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

Refinement

RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
DRR: "Detect, replace, refine: Deep structured prediction for pixel wise labeling", Gidaris et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
OSD: "Efficient stereo matching leveraging deep local and context information", Ye et al., IEEE Access, 2017. [Paper] [Bibtex] [Google Scholar]
Recresnet: "Recresnet: A recurrent residual cnn architecture for disparity map enhancement", Batsos et al., 3DV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
LRCR: "Left-right comparative recurrent model for stereo matching", Jie et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
FD-Fusion: "Fast stereo disparity maps refinement by fusion of data-based and model-based estimations", Ferrera et al., 3DV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
VRN: "Learned collaborative stereo refinement", Knobelreiter et al., IJCV, 2021. [Paper] [Bibtex] [Google Scholar]
NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex]

End-to-End Architectures

Foundational Deep Stereo Architectures

CNN-based Cost Volume Aggregation

2D Architectures

🚩 DispNet-C: "A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation", Mayer et al., CVPR, 2016. [Paper] [Bibtex] [Google Scholar]
CNN+CRF: "End-to-end training of hybrid CNN-CRF models for stereo", Knobelreiter et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
CRL: "Cascade residual learning: A two-stage convolutional neural network for stereo matching", Pang et al., CVPRW, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
iResNet: "Learning for disparity estimation through feature constancy", Liang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
DispNet-CSS: "Occlusions, motion and depth boundaries with a generic network for disparity, optical flow or scene flow estimation", Ilg et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
EdgeStereo: "Edgestereo: A context integrated residual pyramid network for stereo matching", Song et al., ACCV, 2018. [Paper] [Bibtex] [Google Scholar]
AutoDispNet-CSS: "Autodispnet: Improving disparity estimation with automl", Saikia et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
HD³: "Hierarchical discrete distribution decomposition for match density estimation", Yin et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
AANet: "AANet: Adaptive Aggregation Network for Efficient Stereo Matching", Xu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Bi3D: "Bi3D: Stereo Depth Estimation via Binary Classifications", Badki et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

3D Architectures

🚩 GC-Net: "End-to-end learning of geometry and context for deep stereo regression", Kendall et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]
ECA: "Deep stereo matching with explicit cost aggregation sub-architecture", Yu et al., AAAI, 2018. [Paper] [Bibtex] [Google Scholar]
PSMNet: "Pyramid Stereo Matching Network", Chang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
PDSNet: "Practical deep stereo (pds): Toward applications-friendly deep stereo matching", Tulyakov et al., NeurIPS, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
HSMNet: "Hierarchical deep stereo matching on high-resolution images", Yang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
GWCNet: "Group-wise correlation stereo network", Guo et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
EMCUA: "Multi-Level Context Ultra-Aggregation for Stereo Matching", Nie et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
CSPN: "Learning depth with convolutional spatial propagation network", Cheng et al., TPAMI, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
GA-Net: "Ga-net: Guided aggregation net for end-to-end stereo matching", Zhang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Stereodrnet: "Stereodrnet: Dilated residual stereonet", Chabra et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
CasStereo: "Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching", Gu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
WaveletStereo: "WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching", Wang et al., CVPR, 2020. [Paper] [Bibtex] [Google Scholar]
CFNet: "CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching", Shen et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
UASNet: "UASNet: Uncertainty Adaptive Sampling Network for Deep Stereo Matching", Mao et al., ICCV, 2021. [Paper] [Bibtex] [Google Scholar]
PCR: "Parallax contextual representations for stereo matching", Deng et al., ICIP, 2021. [Paper] [Bibtex] [Google Scholar]
PCWNet: "PCW-Net: Pyramid Combination and Warping Cost Volume for Stereo Matching", Shen et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
ICVP: "Image-Coupled Volume Propagation for Stereo Matching", Kwon et al., ICIP, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

Neural Architecture Search (NAS)

LEAStereo: "Hierarchical Neural Architecture Search for Deep Stereo Matching", Cheng et al., NeurIPS, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
EASNet: "EASNet: searching elastic and accurate network architecture for stereo matching", Wang et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

Iterative Optimization-based Architectures

🚩 RAFT-Stereo: "RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching", Lipson et al., 3DV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
ORStereo: "Orstereo: Occlusion-aware recurrent stereo matching for 4k-resolution images", Hu et al., IROS, 2021. [Paper] [WebPage] [Bibtex] [Google Scholar]
SCV-Stereo: "SCV-Stereo: Learning Stereo Matching from a Sparse Cost Volume", Wang et al., ICIP, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
CREStereo: "Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation", Li et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
EAI-Stereo: "EAI-Stereo: Error Aware Iterative Network for Stereo Matching", Zhao et al., ACCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
IGEV-Stereo: "Iterative Geometry Encoding Volume for Stereo Matching", Xu et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
DLNR: "High-Frequency Stereo Matching Network", Zhao et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Dynamic Stereo: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
CREStereo++: "Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching", Jing et al., ICCV, 2023. [Paper] [Bibtex] [Google Scholar]
Selective-Stereo: "Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching", Wang et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
Any-Stereo: "Any-Stereo: Arbitrary Scale Disparity Estimation for Iterative Stereo Matching", Liang et al., AAAI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
MC-Stereo: "MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching", Feng et al., 3DV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
ICGNet: "Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching", Gong et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
MoCha-Stereo: "MoCha-Stereo: Motif Channel Attention Network for Stereo Matching", Chen et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
XR-Stereo: "Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality", Cheng et al., WACV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Transformer-based Architectures

STTR: "Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective With Transformers", Li et al., ICCV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
CEST: "Context-enhanced stereo transformer", Guo et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
Chitransformer: "Chitransformer: Towards Reliable Stereo From Cues", Su et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
Dynamic Stereo: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
GMStereo: "Unifying Flow, Stereo and Depth Estimation", Xu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
CroCo v2: "CroCo v2: Improved Cross-View Completion Pre-training for Stereo Matching and Optical Flow", Weinzaepfel et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
ELFNet: "Elfnet: Evidential local-global fusion for stereo matching", Lou et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
GOAT: "Global Occlusion-Aware Transformer for Robust Stereo Matching", Liu et al., WACV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Markov Random Field-based Architectures

NMRF: "Neural Markov Random Field for Stereo Matching", Guan et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Efficiency-Oriented Deep Stereo Architectures

Compact Cost Volume Representation

Stereonet: "Stereonet: Guided hierarchical refinement for real-time edge-aware depth prediction", Khamis et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
Fast DS-CS: "Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures", Yee et al., WACV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
DecNet: "A Decomposition Model for Stereo Matching", Yao et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
BTC: "Soft Cross Entropy Loss and Bottleneck Tri-Cost Volume For Efficient Stereo Depth Prediction", Nuanes et al., CVPRW, 2021. [Paper] [Bibtex] [Google Scholar]
ACVNet: "Attention Concatenation Volume for Accurate and Efficient Stereo Matching", Xu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
PCVNet: "Parameterized Cost Volume for Stereo Matching", Zeng et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
IINet: "IINet: Implicit Intra-inter Information Fusion for Real-Time Stereo Matching", Li et al., AAAI, 2024. [Paper] [Bibtex] [Google Scholar]

Efficient Cost Volume Processing

Deeppruner: "Deeppruner: Learning efficient stereo matching via differentiable patchmatch", Duggal et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
CasStereo: "Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching", Gu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
MABNet: "MABNet: a lightweight stereo network based on multibranch adjustable bottleneck module", Xing et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
BGNet: "Bilateral Grid Learning for Stereo Matching Networks", Xu et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
Separable-Stereo: "Separable Convolutions for Optimizing 3D Stereo Networks", Rahim et al., ICIP, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
TemporalStereo: "TemporalStereo: Efficient Spatial-Temporal Stereo Matching Network", Zhang et al., IROS, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

Efficient Inference Schemes

Anytime: "Anytime stereo image depth estimation on mobile devices", Wang et al., ICRA, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
StereoVAE: "StereoVAE: A lightweight stereo-matching system using embedded GPUs", Chang et al., ICRA, 2023. [Paper] [Bibtex] [Google Scholar]

Lightweight Network Architecture Design

NVStereoNet: "On the importance of stereo for accurate depth estimation: An efficient semi-supervised deep neural network approach", Smolyanskiy et al., CVPRW, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
MadNet: "Real-Time Self-Adaptive Deep Stereo", Tonioni et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Fadnet: "Fadnet: A Fast and Accurate Network for Disparity Estimation", Wang et al., ICRA, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
AAFS: "Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices", Chang et al., ACCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
HITNet: "HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching", Tankovich et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
CoEX: "Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation", Bangunharcana et al., IROS, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
RLStereo: "RLStereo: Real-time stereo matching based on reinforcement learning", Yang et al., TIP, 2021. [Paper] [Bibtex] [Google Scholar]
MobileStereoNet: "MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching", Shamsafar et al., WACV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
PBCStereo: "PBCStereo: A Compressed Stereo Network with Pure Binary Convolutional Operations", Cai et al., ACCV, 2022. [Paper] [Bibtex] [Google Scholar]
MadNet2: "Federated Online Adaptation for Deep Stereo", Poggi et al., CVPR, 2024. [Bibtex]
Distill-And-Prune: "Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices", Pan et al., ICRA, 2024. [Paper] [Bibtex] [Google Scholar]

Multi-Task Deep Stereo Architectures

Normal-Assisted Stereo Matching

NA-Stereo: "Normal Assisted Stereo Depth Estimation", Kusupati et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
HITNet: "HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching", Tankovich et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

Joint Stereo Matching and Optical Flow

Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
BridgeDepthFlow: "Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence", Lai et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
UnOS: "UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos", Wang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Feature-Level Collaboration: "Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion", Chi et al., CVPR, 2021. [Paper] [Bibtex]
StereoFlowGAN: "StereoFlowGAN: Co-training for Stereo and Flow with Unsupervised Domain Adaptation", Xiong et al., BMVC, 2023. [Paper] [Bibtex] [Google Scholar]

Joint Stereo Matching and Semantic Segmentation

Segstereo: "Segstereo: Exploiting semantic information for disparity estimation", Yang et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]
DSNet: "DSNet: Joint learning for scene segmentation and disparity estimation", Zhan et al., ICRA, 2019. [Paper] [Bibtex] [Google Scholar]
Dispsegnet: "Dispsegnet: Leveraging semantics for end-to-end learning of disparity estimation from stereo imagery", Zhang et al., RAL, 2019. [Paper] [Bibtex] [Google Scholar]
SSPCV-Net: "Semantic stereo matching with pyramid cost volumes", Wu et al., ICCV, 2019. [Paper] [Bibtex] [Google Scholar]
RSS-Net: "Real-time semantic stereo matching", Dovesi et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]
SGNet: "SGNet: Semantics Guided Deep Stereo Matching", Chen et al., ACCV, 2020. [Paper] [Bibtex] [Google Scholar]

Joint Stereo Matching and Uncertainty

RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
UCN: "Unified confidence estimation networks for robust stereo matching", Kim et al., TIP, 2018. [Paper] [Bibtex] [Google Scholar]
ACN: "Adversarial confidence estimation networks for robust stereo matching", Kim et al., T-ITS, 2020. [Paper] [Bibtex] [Google Scholar]
AcfNet: "Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching", Zhang et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Weak Adversarial Learning: "Leveraging a weakly adversarial paradigm for joint learning of disparity and confidence estimation", Poggi et al., ICPR, 2021. [Paper] [Bibtex] [Google Scholar]
Bayesian: "Joint estimation of depth and its uncertainty from stereo images using bayesian deep learning", Max Mehltretter, ISPRS, 2022. [Paper] [Bibtex] [Google Scholar]
SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

Scene Flow

🚩 FlowNet3.0: "Occlusions, motion and depth boundaries with a generic network for disparity, optical flow or scene flow estimation", Ilg et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
DRISF: "Deep Rigid Instance Scene Flow", Ma et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
DeblurringSF: "Joint stereo video deblurring, scene flow estimation and moving object segmentation", Pan et al., TIP, 2019. [Paper] [Bibtex] [Google Scholar]
IOSF: "Learning Independent Object Motion From Unlabelled Stereoscopic Videos", Cao et al., TPAMI, 2019. [Paper] [Bibtex] [Google Scholar]
EPC++: "Every pixel counts++: Joint learning of geometry and motion with 3d holistic understanding", Luo et al., TPAMI, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
SENSE: "Sense: A shared encoder network for scene-flow estimation", Jiang et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
StereoExpansion: "Upgrading Optical Flow to 3D Scene Flow through Optical Expansion", Yang et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
DWARF: "Learning end-to-end scene flow by distilling single tasks knowledge", Aleotti et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
SceneFlowFields++: "SceneFlowFields++: Multi-frame matching, visibility prediction, and robust interpolation for scene flow estimation", Schuster et al., IJCV, 2020. [Paper] [Bibtex] [Google Scholar]
Effiscene: "Effiscene: Efficient per-pixel rigidity inference for unsupervised joint learning of optical flow, depth, camera pose and motion segmentation", Jiao et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
RAFT-3D: "RAFT-3D: Scene Flow using Rigid-Motion Embeddings", Teed et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
RigidMask: "Learning to Segment Rigid Motions from Two Frames", Yang et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
Self-superflow: "Self-superflow: self-supervised scene flow prediction in stereo sequences", Bendig et al., ICIP, 2022. [Paper] [Bibtex] [Google Scholar]
CamLiFlow: "Learning optical flow and scene flow with bidirectional camera-lidar fusion", Liu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
M-FUSE: "M-fuse: Multi-frame fusion for scene flow estimation", Mehl et al., WACV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
OpticalExpansion: "Learning Optical Expansion from Scale Matching", Ling et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

Beyond Visual Spectrum Deep Stereo Architectures

Depth-Guided Sensor Stereo Networks

LidarStereoFusion: "High-precision depth estimation with the 3d lidar and stereo fusion", Park et al., ICRA, 2018. [Paper] [Bibtex] [Google Scholar]
GSD: "Guided stereo matching", Poggi et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
LidarStereoNet: "Noise-Aware Unsupervised Deep Lidar-Stereo Fusion", Cheng et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Stereo-LiDAR-CCVNorm: "3d lidar and stereo fusion using stereo matching network with conditional cost volume normalization", Wang et al., IROS, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Pseudo-LiDAR++: "Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving", You et al., ICLR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Listereo: "Listereo: Generate dense depth maps from lidar and stereo imagery", Zhang et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]
S³: "S³: Learnable sparse signal superdensity for guided depth estimation", Huang et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
LSMD-Net: "LSMD-Net: LiDAR-Stereo Fusion with Mixture Density Network for Depth Sensing", Yin et al., ACCV, 2022. [Paper] [Bibtex] [Google Scholar]
CamLiFlow: "Learning optical flow and scene flow with bidirectional camera-lidar fusion", Liu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Active Disparity Sampling: "Active Disparity Sampling for Stereo Matching With Adjoint Network", Zhang et al., TIP, 2023. [Paper] [Bibtex] [Google Scholar]
VPP: "Active Stereo Without Pattern Projector", Bartolomei et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
SDG-Depth: "Stereo-LiDAR Depth Estimation with Deformable Propagation and Learned Disparity-Depth Conversion", Li et al., ICRA, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Pattern Projection-Based Stereo Networks

ActiveStereoNet: "ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems", Zhang et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]
Polka Lines: "Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo", Baek et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
Activezero: "Activezero: Mixed domain learning for active stereovision with zero annotation", Liu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
MonoStereoFusion: "Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light", Xu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
Activezero++: "Activezero++: Mixed domain learning stereo and confidence-based depth completion with zero annotation", Chen et al., TPAMI, 2023. [Paper] [Bibtex] [Google Scholar]
ASGrasp: "ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera", Shi et al., ICRA, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

Cross-Spectral Stereo Networks

CS-Stereo: "Deep material-aware cross-spectral stereo matching", Zhi et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
UCSS: "Unsupervised cross-spectral stereo matching by learning to synthesize", Liang et al., AAAI, 2019. [Paper] [Code - Unofficial] [Bibtex] [Google Scholar]
SS-MCE: "There and back again: Self-supervised multispectral correspondence estimation", Walters et al., ICRA, 2021. [Paper] [Bibtex] [Google Scholar]
RGB-MS: "RGB-Multispectral matching: Dataset, learning methodology, evaluation", Tosi et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
DPS-Net: "DPS-Net: Deep Polarimetric Stereo Depth Estimation", Tian et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
CrossSP: "Unsupervised Cross-Spectrum Depth Estimation by Visible-Light and Thermal Cameras", Guo et al., T-ITS, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Gated-RCCB: "Cross-spectral Gated-RGB Stereo Depth Estimation", Brucker et al., CVPR, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

Event Stereo Networks

Event-IntensityStereo: "Event-Intensity Stereo: Estimating Depth by the Best of Both Worlds", Mostafavi et al., ICCV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
SE-CFF: "Stereo Depth From Events Cameras: Concentrate and Focus on the Future", Nam et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
SCSNet: "Selection and Cross Similarity for Event-Image Deep Stereo", Cho et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
DTC-SPADE: "Discrete Time Convolution for Fast Event-Based Stereo", Zhang et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]
EFS: "Event-image fusion stereo using cross-modality feature propagation", Cho et al., AAAI, 2022. [Paper] [Bibtex] [Google Scholar]
ADES: "Learning Adaptive Dense Event Stereo From the Image Domain", Cho et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]
SAFE: "Depth From Asymmetric Frame-Event Stereo: A Divide-and-Conquer Approach", Chen et al., WACV, 2024. [Paper] [Bibtex] [Google Scholar]

Gated Stereo Networks

GatedStereo: "Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues", Walz et al., CVPR, 2023. [Paper] [WebPage] [Bibtex] [Google Scholar]
Gated-RCCB: "Cross-spectral Gated-RGB Stereo Depth Estimation", Brucker et al., CVPR, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

Stereo Networks with Echoes

StereoEchoes: "Stereo Depth Estimation with Echoes", Zhang et al., ECCV, 2022. [Paper] [Bibtex] [Google Scholar]

Architectural Analysis

OpenStereo: "OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline", Guo et al., arXiv, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
"Exploring the Usage of Pre-trained Features for Stereo Matching", Zhang et al., IJCV, 2024. [Paper] [Bibtex] [Google Scholar]

Challenges & Solutions

Addressing the Over-Smoothing Issue

SM-CDE: "On the over-smoothing problem of cnn based disparity estimation", Chen et al., ICCV, 2019. [Paper] [Bibtex] [Google Scholar]
AcfNet: "Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching", Zhang et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
CDN: "Wasserstein Distances for Stereo Disparity Estimation", Garg et al., NeurIPS, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
SMD-Nets: "SMD-Nets: Stereo Mixture Density Networks", Tosi et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex] [Google Scholar]
LaC: "Local similarity pattern and cost self-reassembling for deep stereo matching networks", Liu et al., AAAI, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
ADL: "Adaptive Multi-Modal Cross-Entropy Loss for Stereo Matching", Xu et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Missing Ground Truth Depth

Self-Supervised

🚩 MonoDepth/StereoDepth: "Unsupervised monocular depth estimation with left-right consistency", Godard et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
USM: "Unsupervised learning of stereo matching", Zhou et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]
OASM-Net: "Occlusion aware stereo matching via cooperative unsupervised learning", Li et al., ACCV, 2018. [Paper] [Bibtex] [Google Scholar]
UnOS: "UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos", Wang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
BridgeDepthFlow: "Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence", Lai et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Correspondence Consistency: "Unsupervised stereo matching using confidential correspondence consistency", Joung et al., T-ITS, 2019. [Paper] [Bibtex] [Google Scholar]
Flow2Stereo: "Flow2Stereo: Effective Self-Supervised Learning of Optical Flow and Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
PASMNet: "Parallax attention for unsupervised stereo correspondence learning", Wang et al., TPAMI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
MultiscopicVision: "Stereo matching by self-supervision of multiscopic vision", Yuan et al., IROS, 2021. [Paper] [WebPage] [Bibtex] [Google Scholar]
Feature-Level Collaboration: "Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion", Chi et al., CVPR, 2021. [Paper] [Bibtex]
Occlusion-Aware Stereo: "Unsupervised Occlusion-Aware Stereo Matching With Directed Disparity Smoothing", Li et al., T-ITS, 2022. [Paper] [Bibtex] [Google Scholar]

Cross-Framework/Proxy Supervision

Reversing-Stereo: "Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation", Aleotti et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Revealing-Stereo: "Revealing the Reciprocal Relations between Self-Supervised Stereo and Monocular Depth Estimation", Chen et al., ICCV, 2021. [Paper] [Bibtex] [Google Scholar]
Two-in-One: "Two-in-one depth: Bridging the gap between monocular and binocular self-supervised depth estimation", Zhou et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
NeRF-Supervised Stereo: "NeRF-Supervised Deep Stereo", Tosi et al., CVPR, 2023. [Paper] [Website] [Code] [Bibtex]

Domain Shift

Zero-shot Generalization

Domain-Agnostic Feature Modeling

🚩 DSM-Net: "Domain-invariant Stereo Matching Networks", Zhang et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
FCStereo: "Revisiting Domain Generalized Stereo Matching Networks From a Feature Consistency Perspective", Zhang et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
GraftNet: "GraftNet: Towards Domain Generalized Stereo Matching With a Broad-Spectrum and Task-Oriented Feature", Liu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
ITSA: "ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks", Chuah et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
HVT: "Domain Generalized Stereo Matching via Hierarchical Visual Transformation", Chang et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
MRL-Stereo: "Masked representation learning for domain generalized stereo matching", Rao et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

Non-parametric Cost Volumes

MS-Nets: "Matching-space Stereo Networks for Cross-domain Generalization", Cai et al., 3DV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
ARStereo: "Revisiting Non-Parametric Matching Cost Volumes for Robust and Generalizable Stereo Matching", Cheng et al., NeurIPS, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

Integration of Additional Geometric Cues

NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex] [Google Scholar]
EVHS: "Expansion of Visual Hints for Improved Generalization in Stereo Matching", Pilzer et al., WACV, 2023. [Paper] [Bibtex] [Google Scholar]

Synthetic Data Generation and Domain Translation

StereoGAN: "StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
LSSI: "Learning Stereo from Single Images", Watson et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
FoggyStereo: "FoggyStereo: Stereo Matching with Fog Volume Representation", Yao et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]
NeRF-Supervised Stereo: "NeRF-Supervised Deep Stereo", Tosi et al., CVPR, 2023. [Paper] [Website] [Code] [Bibtex] [Google Scholar]

Knowledge Transfer

DKT-Stereo: "Robust Synthetic-to-Real Transfer for Stereo Matching", Zhang et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Data Augmentation Analysis

NLCA-Net_v2: "Rethinking training strategy in stereo matching", Rao et al., TNNLS, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

Offline Adaptation

Confidence-guided Adaptation: "Unsupervised adaptation for deep stereo", Tonioni et al., ICCV, 2017. [Paper] [Code] [Bibtex1] [Bibtex2]
Open-World Stereo: "Open-world stereo video matching with deep rnn", Zhong et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]
ZOLE: "Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domain", Pang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
AdaStereo: "AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching", Song et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]
UnDAF: "UnDAF: A General Unsupervised Domain Adaptation Framework for Disparity or Optical Flow Estimation", Wang et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
RAG: "Continual Stereo Matching of Continuous Driving Scenes With Growing Architecture", Zhang et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]
UCFNet: "Digging Into Uncertainty-Based Pseudo-Label for Robust Stereo Matching", Shen et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
StereoFlowGAN: "StereoFlowGAN: Co-training for Stereo and Flow with Unsupervised Domain Adaptation", Xiong et al., BMVC, 2023. [Paper] [Bibtex] [Google Scholar]
Few-Shot Stereo Matching: "Few-Shot Stereo Matching with High Domain Adaptability Based on Adaptive Recursive Network", Wu et al., IJCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
RAG-Continual: "Reusable Architecture Growth for Continual Stereo Matching", Zhang et al., TPAMI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Online Continual Adaptation

🚩 MadNet: "Real-Time Self-Adaptive Deep Stereo", Tonioni et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Learning2Adapt: "Learning to adapt for stereo", Tonioni et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Continual Adaptation for Deep Stereo: "Continual adaptation for deep stereo", Poggi et al., TPAMI, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
PointFix: "PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation", Kim et al., ECCV, 2022. [Paper] [Bibtex] [Google Scholar]
FedStereo: "Federated Online Adaptation for Deep Stereo", Poggi et al., CVPR, 2024. [Paper] [Bibtex] [Google Scholar]

Adverse Weather

DTD: "Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models", Vankadari et al., ICRA, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Transparent and Reflective (ToM) Surfaces

DDF: "Deep Depth Fusion for Black, Transparent, Reflective and Texture-Less Objects", Chai et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]
TA-Stereo: "Transparent Objects: A Corner Case in Stereo Matching", Wu et al., ICRA, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
Depth4ToM: "Learning Depth Estimation for Transparent and Mirror Surfaces", Costanzino et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
ASGrasp: "ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera", Shi et al., ICRA, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

Asymmetric Stereo

Visually-Imbalanced Stereo: "Visually Imbalanced Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex]
NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Code] [Bibtex]
DA-AS: "Degradation-agnostic Correspondence from Resolution-asymmetric Stereo", Chen et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]
SASS: "Unsupervised Deep Asymmetric Stereo Matching with Spatially-Adaptive Self-Similarity", Song et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

Confidence Estimation

Machine Learning Approaches

Disparity-based

ENS7: "Ensemble learning for confidence measures in stereo vision", Haeusler et al., CVPR, 2013. [Paper] [Bibtex] [Google Scholar]
O1: "Learning a general-purpose confidence measure based on O(1) features and a smarter aggregation strategy for semi global matching", Poggi et al., 3DV, 2016. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]

Cost Volume-based

ENS23: "Ensemble learning for confidence measures in stereo vision", Haeusler et al., CVPR, 2013. [Paper] [Bibtex] [Google Scholar]
GCP: "Learning to detect ground control points for improving the accuracy of stereo matching", Spyropoulos et al., CVPR, 2014. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]
LEV: "Leveraging stereo matching with learning-based confidence measures", Park et al., CVPR, 2015. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]
FA: "Feature augmentation for learning confidence measure in stereo matching", Kim et al., TIP, 2017. [Paper] [Bibtex] [Google Scholar]

Model-based

Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]

SGM-specific

SGMForest: "Learning to fuse proposals from multiple scanline optimizations in semi-global matching", Schonberger et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

Deep Learning Approaches

Disparity-based

CCNN: "Learning from scratch a confidence measure", Poggi et al., BMVC, 2016. [Paper] [Code] [Bibtex] [Google Scholar]
PBCP: "Patch Based Confidence Prediction for Dense Disparity Map", Seki et al., BMVC, 2016. [Paper] [Bibtex] [Google Scholar]
EFN/LFN: "Stereo matching confidence learning based on multi-modal convolution neural networks", Fu et al., RFMI, 2017. [Paper] [Bibtex] [Google Scholar]
MMC: "Learning confidence measures by multi-modal convolutional neural networks", Fu et al., WACV, 2018. [Paper] [Bibtex] [Google Scholar]
LGC/ConfNet: "Beyond local reasoning for stereo confidence estimation with deep learning", Tosi et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
Self-adapting Confidence: "Self-adapting confidence estimation for stereo", Poggi et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

Cost Volume-based

RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
MPN: "Deep stereo confidence prediction for depth estimation", Kim et al., ICIP, 2017. [Paper] [Bibtex] [Google Scholar]
UCN: "Unified confidence estimation networks for robust stereo matching", Kim et al., TIP, 2018. [Paper] [Bibtex] [Google Scholar]
LAF: "Laf-net: Locally adaptive fusion networks for stereo confidence estimation", Kim et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]
CRNN: "Pixel-Wise Confidences for Stereo Disparities Using Recurrent Neural Networks", Gul et al., BMVC, 2019. [Paper] [Bibtex] [Google Scholar]
CVA: "Cnn-based cost volume analysis as confidence measure for dense matching", Mehltretter et al., ICCVW, 2019. [Paper] [Bibtex] [Google Scholar]
ACN: "Adversarial confidence estimation networks for robust stereo matching", Kim et al., T-ITS, 2020. [Paper] [Bibtex] [Google Scholar]
Disparity Plane Sweep: "Modeling Stereo-Confidence Out of the End-to-End Stereo-Matching Network via Disparity Plane Sweep", Lee et al., AAAI, 2024. [Paper] [Bibtex] [Google Scholar]

Multiple Confidence Fusion

Learning Local Consistency: "Learning to predict stereo reliability enforcing local consistency of confidence maps", Poggi et al., CVPR, 2017. [Paper] [Bibtex] [Google Scholar]
EMC: "Even More Confident Predictions With Deep Machine-Learning", Poggi et al., CVPRW, 2017. [Paper] [Bibtex] [Google Scholar]

Sensor-based

Lidar-Confidence: "Unsupervised confidence for lidar depth maps and applications", Conti et al., IROS, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

Applications

(Not an exhaustive list)

Deep3d: "Deep3d: Fully automatic 2d-to-3d video conversion with deep convolutional neural networks", Xie et al., ECCV, 2016. [Paper] [Code] [Bibtex] [Google Scholar]
Geometry to the Rescue: "Unsupervised cnn for single view depth estimation: Geometry to the rescue", Garg et al., ECCV, 2016. [Paper] [Bibtex] [Google Scholar]
MonoDepth/StereoDepth: "Unsupervised monocular depth estimation with left-right consistency", Godard et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]
SVSM: "Single View Stereo Matching", Luo et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]
MonoResMatch: "Learning monocular depth estimation infusing traditional stereo knowledge", Tosi et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]
Ida-3d: "Ida-3d: Instance-depth-aware 3d object detection from stereo vision for autonomous driving", Peng et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
DSGN: "Deep Stereo Geometry Network for 3D Object Detection", Chen et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]
Stereopifu: "Stereopifu: Depth aware clothed human digitization via stereo vision", Hong et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]
SDCNet: "Stereo-augmented depth completion from a single rgb-lidar image", Choi et al., ICRA, 2021. [Paper] [Bibtex] [Google Scholar]
Smart Glasses: "A Practical Stereo Depth System for Smart Glasses", Wang et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]
Cross Attention Renderer: "Learning to render novel views from wide-baseline stereo pairs", Du et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]
VPPDC: "Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization", Bartolomei et al., 3DV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
CoPoNeRF: "Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs", Hong et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]
StereoNeRF: "Generalizable Novel-View Synthesis using a Stereo Camera", Lee et al., CVPR, 2024. [Paper] [Website] [Bibtex] [Google Scholar]

Workshops

NTIRE 2024: HR Depth from Images of Specular and Transparent Surfaces. P. Z. Ramirez, F. Tosi, L. Di Stefano, R. Timofte, A. Costanzino, M. Poggi, S. Salti, S. Mattoccia; CVPRW 2024, Seattle, US [Website]
NTIRE 2023: HR Depth from Images of Specular and Transparent Surfaces. P. Z. Ramirez, F. Tosi, L. Di Stefano, R. Timofte, A. Costanzino, M. Poggi, S. Salti, S. Mattoccia; CVPRW 2023, Vancouver, Canada [Website]
Robust Vision Challenge (ROB), Zendel et al., ECCV 2022 [Website]

Tutorials & Talks

Facing depth estimation in-the-wild with deep networks. M. Poggi, F. Tosi, F. Aleotti, K. Batsos, P. Mordohai, S. Mattoccia; ECCV 2020, SEC, Glasgow [Website]
Learning and understanding single image depth estimation in the wild. M. Poggi, F. Tosi, F. Aleotti, S. Mattoccia, C. Godard, J. Watson, M. Firman, G.J. Brostow; CVPR 2020, Seattle, Washington, US [Website]
Learning-based depth estimation from stereo and monocular images: successes, limitations and future challenges. M. Poggi, F. Tosi, K. Batsos, P. Mordohai, S. Mattoccia; CVPR 2019, Long Beach, California, US [Website]
Learning-based depth estimation from stereo and monocular images: successes, limitations and future challenges. M. Poggi, F. Tosi, K. Batsos, P. Mordohai, S. Mattoccia; 3DV 2018, Verona, Italy [Website]
Lecture: Computer Vision (Prof. Andreas Geiger, University of Tübingen). [Preliminaries] [Block Matching] [Siamese Networks] [Spatial Regularization] [End-to-End Learning]

Citation

Please consider citing this list if you find this repository useful:

@article{poggi2021synergies,
  title={On the synergies between machine learning and binocular stereo for depth estimation from images: a survey},
  author={Poggi, Matteo and Tosi, Fabio and Batsos, Konstantinos and Mordohai, Philippos and Mattoccia, Stefano},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={9},
  pages={5314--5334},
  year={2021},
  publisher={IEEE}
}

@article{poggi2021confidence,
  title={On the confidence of stereo matching in a deep-learning era: a quantitative evaluation},
  author={Poggi, Matteo and Kim, Seungryong and Tosi, Fabio and Kim, Sunok and Aleotti, Filippo and Min, Dongbo and Sohn, Kwanghoon and Mattoccia, Stefano},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={9},
  pages={5293--5313},
  year={2021},
  publisher={IEEE}
}
