NerfingMVS: Guided Optimization of Neural Radiance Fields
for Indoor Multi-view Stereo

¹Tsinghua University

²ETH Zurich

In-the-wild Showcases

We captured data with a hand-held phone at home.



In this work, we present a new multi-view depth estimation method that combines conventional SfM reconstruction with learning-based priors on top of the recently proposed neural radiance fields (NeRF). Unlike existing neural-network-based optimization methods that rely on estimated correspondences, our method optimizes directly over implicit volumes, eliminating the challenging step of matching pixels in indoor scenes. The key to our approach is to use learning-based priors to guide the optimization of NeRF. Our system first adapts a monocular depth network to the target scene by finetuning it on the scene's sparse SfM reconstruction. We then show that the shape-radiance ambiguity of NeRF persists in indoor environments, and we address it by using the adapted depth priors to guide the sampling process of volume rendering. Finally, a per-pixel confidence map, computed from the error of the rendered image, is used to further improve depth quality. Experiments show that the proposed framework significantly outperforms state-of-the-art methods on indoor scenes, and they yield surprising findings on the effectiveness of correspondence-based optimization and NeRF-based optimization over the adapted depth priors. In addition, we show that the guided optimization scheme does not sacrifice the original synthesis capability of neural radiance fields; it improves rendering quality on both seen and novel views.
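The guided sampling and confidence steps described above can be illustrated with a minimal pure-Python sketch for a single ray. This is not the authors' implementation: the function names, the clamping bounds `alpha_low` and `alpha_high`, and the linear error-to-confidence mapping are all assumptions chosen for illustration. Only the volume rendering weights in `render_depth` follow the standard NeRF formulation.

```python
import math
import random


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def guided_bounds(prior_depth, rel_error, alpha_low=0.05, alpha_high=0.15):
    """Near/far bounds centred on the adapted depth prior.

    The interval widens with the estimated relative error of the prior.
    alpha_low and alpha_high are illustrative values (assumption).
    """
    margin = clamp(rel_error, alpha_low, alpha_high)
    return prior_depth * (1.0 - margin), prior_depth * (1.0 + margin)


def stratified_samples(near, far, n_samples, rng):
    """Draw one random sample from each equal-width bin of [near, far]."""
    width = (far - near) / n_samples
    return [near + (i + rng.random()) * width for i in range(n_samples)]


def render_depth(ts, sigmas, far):
    """Expected ray depth using standard volume rendering weights."""
    depth, transmittance = 0.0, 1.0
    for i, (t, sigma) in enumerate(zip(ts, sigmas)):
        next_t = ts[i + 1] if i + 1 < len(ts) else far
        alpha = 1.0 - math.exp(-sigma * (next_t - t))
        depth += transmittance * alpha * t
        transmittance *= 1.0 - alpha
    return depth


def confidence(rendered_rgb, observed_rgb, max_error=1.0):
    """Map mean absolute colour error to [0, 1] (hypothetical mapping)."""
    n = len(rendered_rgb)
    err = sum(abs(a - b) for a, b in zip(rendered_rgb, observed_rgb)) / n
    return 1.0 - clamp(err / max_error, 0.0, 1.0)


# Usage: sample only near the prior depth instead of the whole ray.
rng = random.Random(0)
near, far = guided_bounds(prior_depth=2.0, rel_error=0.5)
ts = stratified_samples(near, far, n_samples=8, rng=rng)
```

Restricting the samples to a narrow interval around the prior is what lets a single sampling pass stand in for whole-space sampling in this sketch. In a full system the densities passed to `render_depth` would come from the radiance field network, which is omitted here.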

Comparison with state-of-the-art methods

We conducted experiments on the ScanNet dataset. This figure shows some qualitative results. While the original NeRF fails to predict reasonable geometry, our method generates visually appealing depth maps. Note that Atlas is trained on ScanNet with ground-truth depth supervision.

View Synthesis Results

Although view synthesis is not the main focus of our work, we observe that the proposed guided optimization scheme also improves the view synthesis quality of NeRF. Moreover, because it requires neither whole-space sampling nor a coarse-to-fine strategy, our method speeds up training by a factor of 3.


  author    = {Wei, Yi and Liu, Shaohui and Rao, Yongming and Zhao, Wang and Lu, Jiwen and Zhou, Jie},
  title     = {NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo},
  booktitle = {ICCV},
  year      = {2021}