Recognizing Scene Viewpoint using Panoramic Place Representation

Abstract

The pose of an object carries crucial semantic meaning for object manipulation and usage (e.g., grabbing a mug, watching a television). Just as pose estimation is part of object recognition, viewpoint recognition is a necessary and unavoidable component of scene recognition. For instance, as shown in Figure 1, a theater has a clearly distinct distribution of objects (a stage on one side and seats on the other) that defines unique views in different orientations. Just as observers will choose a view of a television that allows them to see the screen, observers in a theater will sit facing the stage when watching a show. The goal of this paper is to study the viewpoint recognition problem in scenes. We aim to design a model which, given a photo, can classify the place category to which it belongs (e.g., a theater) and predict the direction in which the observer is facing within that place (e.g., towards the stage). Our model learns the typical arrangement of visual features in a 360-degree panoramic representation of a place, and learns to map individual views of a place to that representation. Given an input photo, we can then place that photo within a larger panoramic image. This allows us to extrapolate the layout beyond the available view, as if we were to rotate the camera all around the observer.

Paper

J. Xiao, K. A. Ehinger, A. Oliva and A. Torralba.
Recognizing Scene Viewpoint using Panoramic Place Representation.
Proceedings of the 25th IEEE Conference on Computer Vision and Pattern Recognition, 2012.

Video

Short Fast-forward Introduction (Download)	Full Length Video (Download)

Poster

poster.pdf

More Example Results

Example_Results_on_Panorama.pdf: This file contains more examples of result visualization on our panorama dataset. It is an extension of Figure 8 in the paper.
Example_Results_on_SUN.pdf: This file contains more examples of result visualization on the SUN dataset. It is an extension of Figure 8 in the paper.

Algorithm Analysis

Algorithm_Analysis.pdf: This file contains further analysis of the algorithm and its relation to similar algorithms.

Geometry of Panorama

panorama.pdf: This file explains the geometry of panorama images.

Performance

Performance_Table.pdf: This file contains an extended version of Tables 1 and 2 in the paper, showing the performance by category.

More materials

Border_Extension.pdf: This file contains some examples of boundary extension, which extrapolates an image based on texture synthesis.
MTurk_View_Matching_GUI.png: This file shows the Amazon Mechanical Turk GUI that workers used to label the viewpoint of pictures from the SUN dataset.

Slides

Powerpoint Slides

Download full SUN360 dataset

To download the panoramas at various resolutions (we have up to 9104x4552 resolution), go to http://sun360.csail.mit.edu/Images/.
To generate normal field-of-view images from a panorama, download the code "pano2photo.zip" in the source code download section.
To download the images and other data we actually used in the experiments for scene viewpoint recognition, download the file "cvpr2012pano_codeRelease_v1.zip" in the source code download section.
Note: Object annotation on this dataset is in progress.
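
As a rough illustration of how a normal field-of-view image can be generated from a panorama, the following Python sketch maps each pixel of a virtual pinhole camera to a position in the panorama. It is not the pano2photo code. It assumes an equirectangular panorama (longitude along the width, latitude along the height), and the function names, parameters, and nearest-neighbor sampling are hypothetical choices made for this example.

```python
import math

def photo_pixel_to_pano(u, v, out_w, out_h, fov_deg, yaw_deg, pitch_deg,
                        pano_w, pano_h):
    """Map output pixel (u, v) to fractional panorama coordinates (px, py).

    Assumes an equirectangular panorama; all names here are illustrative.
    """
    # Focal length in pixels from the horizontal field of view.
    f = (out_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    # Ray through the pixel in camera coordinates (x right, y up, z forward).
    x = u - (out_w - 1) / 2.0
    y = (out_h - 1) / 2.0 - v
    z = f
    # Tilt the camera up by the pitch angle, then turn it by the yaw angle.
    p, a = math.radians(pitch_deg), math.radians(yaw_deg)
    y1 = y * math.cos(p) + z * math.sin(p)
    z1 = -y * math.sin(p) + z * math.cos(p)
    x2 = x * math.cos(a) + z1 * math.sin(a)
    z2 = -x * math.sin(a) + z1 * math.cos(a)
    # Convert the ray direction to longitude and latitude.
    lon = math.atan2(x2, z2)
    lat = math.atan2(y1, math.hypot(x2, z2))
    # Longitude 0 lands on the panorama's center column, latitude 0 on its center row.
    px = ((lon / (2.0 * math.pi) + 0.5) * pano_w) % pano_w
    py = (0.5 - lat / math.pi) * pano_h
    return px, py

def render_view(pano, out_w, out_h, fov_deg, yaw_deg, pitch_deg=0.0):
    """Sample a perspective view from a panorama given as a list of rows."""
    pano_h, pano_w = len(pano), len(pano[0])
    view = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            px, py = photo_pixel_to_pano(u, v, out_w, out_h, fov_deg,
                                         yaw_deg, pitch_deg, pano_w, pano_h)
            col = int(px) % pano_w                    # wrap horizontally
            r = min(max(int(py), 0), pano_h - 1)      # clamp vertically
            row.append(pano[r][col])                  # nearest-neighbor lookup
        view.append(row)
    return view
```

Calling render_view with different yaw values simulates rotating the camera around the observer. A practical implementation would likely interpolate between panorama pixels rather than taking the nearest one.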

Browse SUN360 dataset

Source code

CVPR2012 code: This folder contains all source code and data used in the experiments. It contains all precomputed results as well as source code to recompute everything from scratch. If you just want to do the viewpoint recognition experiment and compare with our paper, you only need to download this file (no need to download the above links for SUN360 database).
pano2photo: This is a small piece of code to demonstrate how to warp between panorama and normal images. It has been included in the above file.
polarPlot: This is a small piece of code to plot a curve or a histogram in polar coordinates. It has been included in the above file.
OnlineStructuralSVM: a Matlab implementation of the cutting plane algorithm for training a Structural SVM.
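
To make the idea of a histogram in polar coordinates concrete, here is a small Python sketch that bins viewing angles and converts each bin into a point in the plane, ready to be drawn as a closed curve. It is not the polarPlot code. The function names, the bin layout (equal-width bins starting at 0 degrees), and the choice to place each point at the bin center are assumptions made for illustration.

```python
import math

def angle_histogram(angles_deg, n_bins):
    """Count angles (in degrees) falling into n_bins equal bins over [0, 360)."""
    counts = [0] * n_bins
    width = 360.0 / n_bins
    for a in angles_deg:
        # Wrap the angle into [0, 360) and guard against rounding at the edge.
        idx = int((a % 360.0) // width) % n_bins
        counts[idx] += 1
    return counts

def polar_to_xy(counts):
    """Place each bin count at its bin-center angle and return (x, y) points."""
    n = len(counts)
    points = []
    for i, c in enumerate(counts):
        theta = 2.0 * math.pi * (i + 0.5) / n   # bin center in radians
        points.append((c * math.cos(theta), c * math.sin(theta)))
    return points
```

Connecting the returned points in order, and closing the loop back to the first point, gives the outline of the polar histogram.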

Acknowledgments

We thank Tomasz Malisiewicz, Andrew Owens, Aditya Khosla, Dahua Lin and the reviewers for helpful discussions. This work is funded by NSF grant (1016862) to A.O., Google research awards to A.O. and A.T., ONR MURI N000141010933 and NSF Career Award No. 0747120 to A.T., and an NSF Graduate Research Fellowship to K.A.E. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation and other funding agencies. All materials on this website, including images, data, and visualizations, can be used for academic research purposes ONLY.