Generalizable Novel-View Synthesis
using a Stereo Camera

Haechan Lee1,*   Wonjoon Jin1,*   Seung-Hwan Baek1   Sunghyun Cho1
1POSTECH GSAI & CSE

StereoNeRF takes multi-view stereo-camera images and synthesizes high-quality novel-view images.

Abstract

In this paper, we propose the first generalizable view synthesis approach that specifically targets multi-view stereo-camera images. Since recent stereo matching methods have demonstrated accurate geometry prediction, we introduce stereo matching into novel-view synthesis for high-quality geometry reconstruction. To this end, we propose a novel framework, dubbed StereoNeRF, which integrates stereo matching into a NeRF-based generalizable view synthesis approach. StereoNeRF is equipped with three key components to effectively exploit stereo matching in novel-view synthesis: a stereo feature extractor, depth-guided plane sweeping, and a stereo depth loss. Moreover, we propose the StereoNVS dataset, the first multi-view dataset of stereo-camera images, encompassing a wide variety of both real and synthetic scenes. Our experimental results demonstrate that StereoNeRF surpasses previous approaches in generalizable view synthesis.
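
The abstract names three components but does not spell out their formulation here. As a rough illustration, the sketch below shows one plausible reading of two of them in PyTorch: depth hypotheses concentrated around a stereo-predicted depth map (depth-guided plane sweeping) and an L1 loss pulling the rendered depth toward the stereo depth. The function names, the window parameterization (`rel_radius`), and the masking scheme are our assumptions, not the paper's exact method.

```python
import torch

def depth_guided_hypotheses(stereo_depth, num_planes=8, rel_radius=0.2):
    """Sample per-pixel depth hypotheses centered on a stereo-predicted
    depth map, instead of sweeping uniform planes over [near, far].

    stereo_depth: (B, H, W) depth from a stereo network (e.g. UniMatch).
    Returns: (B, num_planes, H, W) candidate depths per pixel.
    """
    offsets = torch.linspace(-1.0, 1.0, num_planes, device=stereo_depth.device)
    # Window width proportional to the predicted depth (a hypothetical choice).
    radius = rel_radius * stereo_depth                          # (B, H, W)
    return stereo_depth[:, None] + offsets[None, :, None, None] * radius[:, None]

def stereo_depth_loss(rendered_depth, stereo_depth, valid_mask=None):
    """L1 penalty between NeRF's rendered depth and the stereo depth.

    rendered_depth, stereo_depth: (B, H, W); valid_mask optionally excludes
    pixels where stereo matching is unreliable (e.g. occluded regions).
    """
    err = (rendered_depth - stereo_depth).abs()
    if valid_mask is not None:
        err = err * valid_mask
        return err.sum() / valid_mask.sum().clamp(min=1)
    return err.mean()
```

In this reading, the stereo depth acts both as a sampling prior (narrowing the plane sweep to a band around the predicted surface) and as a supervision signal; consult the paper for the actual formulation.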

Video Results

Motivation

The superior performance of stereo estimation motivates us to integrate it into the generalizable novel-view synthesis framework. The stereoscopic constraint and large-scale stereo datasets have facilitated remarkable generalization capabilities in stereo estimation compared to learning-based MVS methods.


Comparison with a Baseline

Since our proposed framework effectively leverages the stereoscopic prior from stereo-camera images, our method achieves better synthesis quality than the baseline method, GeoNeRF, particularly in scenes with complex structures or textureless regions.


Main Comparison

We compare our method with several baseline methods: GNT, IBRNet, GeoNeRF, and NeuRay.


StereoNVS Dataset

We propose the StereoNVS dataset, the first multi-view dataset of stereo-camera images encompassing a wide variety of both real and synthetic scenes. Below are example scenes from the StereoNVS-Real and StereoNVS-Synthetic datasets.


News

  • Our paper is accepted to CVPR 2024.
  • Our dataset is released! Check out our Google Drive Link!
  • Code will be released soon!

Related Links

  • We employ UniMatch as our stereo estimation network thanks to its remarkable generalization capability.
  • We use GeoNeRF as our baseline model.
  • For StereoNVS-Synthetic, we render multi-view stereoscopic images from 3D-FRONT.
  • We borrow a website template from Nerfies. Thanks for the source code.