- on: Dec. 3, 2018
- in: CVPR
- ✨Oral
DeepVoxels: Learning Persistent 3D Feature Embeddings
@inproceedings{sitzmann2019deepvoxels,
author = {Sitzmann, Vincent
and Thies, Justus
and Heide, Felix
and Nie{\ss}ner, Matthias
and Wetzstein, Gordon
and Zollh{\"o}fer, Michael},
title = {DeepVoxels: Learning Persistent 3D Feature Embeddings},
booktitle = {Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
year = {2019}
}
Deep generative models today allow highly realistic image synthesis. While each generated image is of high quality, a major challenge is generating a series of coherent views of the same scene. This requires the network's latent representation to fundamentally understand the 3D layout of the scene; e.g., how would the same chair look from a different viewpoint?
Unfortunately, this is challenging for existing models built from stacks of 2D convolution kernels. Instead of parameterizing 3D transformations, they tend to explain the training data in a higher-dimensional feature space, which leads to poor generalization to novel views at test time, such as the output of Pix2Pix trained on images of the cube above.
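To make the contrast concrete, here is a minimal sketch (not the authors' code) of the alternative idea: instead of a purely 2D image-to-image generator, keep a persistent, view-independent 3D feature volume and explicitly rotate it into the target camera frame before decoding it to an image. The class name `Voxel3DRenderer`, the grid size, and the crude orthographic projection are illustrative assumptions; it only assumes PyTorch.

```python
import math
import torch
import torch.nn.functional as F

class Voxel3DRenderer(torch.nn.Module):
    """Toy example: a persistent 3D feature embedding plus explicit 3D transformation.
    This is a hypothetical sketch, not the DeepVoxels architecture."""
    def __init__(self, channels=8, grid_size=32):
        super().__init__()
        # Persistent, view-independent 3D feature embedding of the scene.
        self.voxels = torch.nn.Parameter(
            0.01 * torch.randn(1, channels, grid_size, grid_size, grid_size))
        # Simple 2D decoder that maps projected features to an RGB image.
        self.decoder = torch.nn.Conv2d(channels, 3, kernel_size=3, padding=1)

    def forward(self, rotation):
        # rotation: (3, 3) world-to-camera rotation. Build a (1, 3, 4) affine
        # matrix and resample the voxel grid into the camera frame.
        theta = torch.cat([rotation, torch.zeros(3, 1)], dim=1).unsqueeze(0)
        grid = F.affine_grid(theta, self.voxels.shape, align_corners=False)
        cam_voxels = F.grid_sample(self.voxels, grid, align_corners=False)
        # Crude orthographic "projection": average features along the depth axis.
        feat_2d = cam_voxels.mean(dim=2)
        return torch.sigmoid(self.decoder(feat_2d))

renderer = Voxel3DRenderer()
yaw = 0.5  # rotate the scene, not the network weights
R = torch.tensor([[math.cos(yaw), 0.0, math.sin(yaw)],
                  [0.0, 1.0, 0.0],
                  [-math.sin(yaw), 0.0, math.cos(yaw)]])
image = renderer(R)  # (1, 3, 32, 32) view rendered from the rotated embedding
```

Because the viewpoint change is applied as an explicit 3D transformation of a single shared feature volume, every rendered view is forced to be consistent with the same underlying scene, rather than being explained independently in a 2D feature space.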