MVS2D: Efficient Multiview Stereo via Attention-Driven 2D Convolutions
- 1 June 2022
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 8564-8574
- https://doi.org/10.1109/cvpr52688.2022.00838
Abstract
Deep learning has made a significant impact on multiview stereo systems. State-of-the-art approaches typically build a cost volume and then apply multiple 3D convolution operations to recover the input image's pixel-wise depth. While such end-to-end learning of plane-sweeping stereo advances accuracy on public benchmarks, these approaches are typically very slow to compute. We present MVS2D, a highly efficient multi-view stereo algorithm that seamlessly integrates multi-view constraints into single-view networks via an attention mechanism. Since MVS2D builds only on 2D convolutions, it is at least $2\times$ faster than all the notable counterparts. Moreover, our algorithm produces precise depth estimations and 3D reconstructions, achieving state-of-the-art results on the challenging ScanNet, SUN3D, and RGBD benchmarks and on the classical DTU dataset. Our algorithm also outperforms all other algorithms in the setting of inexact camera poses. Our code is released at https://github.com/zhenpeiyang/MVS2D
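The abstract does not spell out how the attention mechanism works, so the following is only a rough illustration of the general idea, not the authors' module. It assumes that, for one reference pixel, a source-view feature vector has already been sampled at the pixel's projection under each candidate depth. The function names (`softmax`, `attend_over_depths`), the scaled dot-product scoring, and the soft-argmax readout are all assumptions made for this sketch; the point it shows is that matching evidence can be turned into per-pixel weights with ordinary arithmetic, without a 3D cost-volume convolution.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_depths(ref_feat, src_feats, depth_values):
    """Attention over depth hypotheses for a single reference pixel.

    ref_feat:     feature vector at the reference pixel.
    src_feats:    one source-view feature vector per depth hypothesis
                  (assumed to be sampled beforehand; hypothetical input).
    depth_values: the candidate depths, same length as src_feats.

    Returns (attention weights, attention-weighted depth).
    """
    scale = math.sqrt(len(ref_feat))
    # Scaled dot-product similarity between reference and each sample.
    scores = [
        sum(r * s for r, s in zip(ref_feat, feat)) / scale
        for feat in src_feats
    ]
    weights = softmax(scores)
    # Soft-argmax readout: a simplification chosen for this sketch.
    expected_depth = sum(w * d for w, d in zip(weights, depth_values))
    return weights, expected_depth


# Toy usage: the middle hypothesis matches the reference feature best.
weights, depth = attend_over_depths(
    ref_feat=[1.0, 0.0],
    src_feats=[[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]],
    depth_values=[1.0, 2.0, 3.0],
)
```

In a full network, weights like these would presumably be computed for every pixel and fed into a 2D convolutional backbone; that wiring is not described in the abstract and is left out here.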