3D video

We present a system for capturing and rendering 3D video of dynamic scenes. Our modular acquisition setup consists of movable 3D video bricks. Each brick acquires depth maps of the scene using a stereo pair of grayscale cameras. To support the stereo matching process, a projector augments the scene with random vertical stripe patterns. Alternating projection of a pattern and its inverse allows for concurrent acquisition of the scene texture using an appropriately synchronized color camera. Each brick grabs its frames completely independently of the other bricks, except that all frames are timestamped consistently using a common synchronization device. The setup scales to multiple bricks because overlapping projections are explicitly allowed by our depth reconstruction and because the computational load of each brick does not increase during real-time recording.
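The role of the inverse pattern can be illustrated with a small sketch. It assumes, as a simplification not spelled out above, that the color camera integrates light over one projection of the pattern and one of its inverse, so that every pixel receives the same total illumination. The function names and the binary 0/1 intensity model are hypothetical and chosen only for illustration.

```python
import random

def stripe_pattern(width, height, seed=0):
    """Random binary vertical stripes: every row repeats one random column code."""
    rng = random.Random(seed)
    columns = [rng.randint(0, 1) for _ in range(width)]
    return [columns[:] for _ in range(height)]

def inverse(pattern):
    """Swap lit (1) and dark (0) stripes."""
    return [[1 - value for value in row] for row in pattern]

def integrate(first, second):
    """Light accumulated per pixel over two consecutive projections."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]

pattern = stripe_pattern(width=8, height=4, seed=1)
total = integrate(pattern, inverse(pattern))
# Every entry of `total` is 1: the stripes cancel out for a camera
# that sees both projections within a single exposure.
```

Under this assumption the grayscale cameras see a structured pattern in each individual frame, while the color camera records the scene as if it were uniformly lit.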

Using space-time stereo on the acquired pattern images, we extract high-quality depth maps, whose corresponding surface samples are merged into a view-independent, point-based 3D data structure consisting of volumetric Gaussian ellipsoids. This representation allows for effective post-processing. Photo-consistency enforcement and outlier removal lead to a significant decrease in visual artifacts. High-quality renderings from novel viewpoints are generated using EWA volume splatting.
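The idea behind space-time stereo can be sketched as follows: the matching cost of a candidate disparity is accumulated over a spatial window and over several frames, so the temporally varying patterns help disambiguate the match. This is a minimal illustration on rectified 1D scanlines with a sum-of-squared-differences cost and synthetic data. The cost function, window sizes, and function names are assumptions and not necessarily those of the actual system.

```python
import random

def spacetime_cost(left, right, x, d, radius):
    """SSD between left pixel x and right pixel x - d, summed over
    a spatial window (2 * radius + 1 pixels) and over all frames."""
    cost = 0.0
    for left_row, right_row in zip(left, right):   # temporal extent
        for dx in range(-radius, radius + 1):      # spatial extent
            diff = left_row[x + dx] - right_row[x + dx - d]
            cost += diff * diff
    return cost

def best_disparity(left, right, x, max_disparity, radius=1):
    """Winner-takes-all search; x must satisfy
    radius + max_disparity <= x < width - radius."""
    return min(range(max_disparity + 1),
               key=lambda d: spacetime_cost(left, right, x, d, radius))

# Synthetic input: 4 frames of a 32-pixel scanline with random intensities.
# The right view is the left view shifted by a known disparity.
rng = random.Random(0)
WIDTH, FRAMES, TRUE_DISPARITY = 32, 4, 3
left = [[rng.random() for _ in range(WIDTH)] for _ in range(FRAMES)]
right = [[row[x + TRUE_DISPARITY] if x + TRUE_DISPARITY < WIDTH else 0.0
          for x in range(WIDTH)] for row in left]

estimate = best_disparity(left, right, x=10, max_disparity=6)
```

On this noise-free input the search recovers the shift used to build the right view. A real implementation would additionally need sub-pixel refinement and handling of occlusions and image borders.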

Our framework allows for simple and convenient editing of 3D video. Once the three-dimensional information is available, selection and compositing issues become straightforward and can be easily implemented using spatial clustering or bounding box algorithms. The underlying core data structure is a novel 4D spatio-temporal representation which we call the video hypervolume. Conceptually, the processing loop comprises three fundamental operators: slicing, selection, and editing. The slicing operator allows users to visualize arbitrary regions from the 4D data set. The selection operator labels subsets of the footage for spatio-temporal editing. This operator includes a 4D graph-cut based algorithm for semi-automatic object segmentation. The actual editing operators include cut & paste, affine transformations, and compositing with other media, such as images and 2D video. This enables a director to create novel effects like time dilation or spatio-temporal compositing without being restricted to the acquired video streams.
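The three operators can be sketched on a toy hypervolume stored as a flat list of 4D point samples. The `Sample` type, the function names, and the use of a plain axis-aligned 4D bounding box for selection are illustrative assumptions. The graph-cut based segmentation mentioned above is not reproduced here.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Sample:
    """One point sample of the hypervolume (hypothetical layout)."""
    x: float
    y: float
    z: float
    t: float
    label: int = 0  # 0 means unselected

def slice_at_time(samples, t, tolerance=1e-6):
    """Slicing: extract the 3D frame at time t."""
    return [s for s in samples if abs(s.t - t) <= tolerance]

def select_box(samples, lo, hi, label):
    """Selection: label every sample inside a 4D axis-aligned box."""
    def inside(s):
        coords = (s.x, s.y, s.z, s.t)
        return all(a <= c <= b for a, c, b in zip(lo, coords, hi))
    return [replace(s, label=label) if inside(s) else s for s in samples]

def dilate_time(samples, label, factor):
    """Editing: stretch the time axis of the labeled samples only."""
    return [replace(s, t=s.t * factor) if s.label == label else s
            for s in samples]

samples = [
    Sample(0, 0, 0, 0), Sample(1, 0, 0, 0),
    Sample(0, 0, 0, 1), Sample(1, 0, 0, 1),
    Sample(5, 5, 5, 2),
]
frame = slice_at_time(samples, t=1.0)
selected = select_box(samples, lo=(0.5, -1, -1, 0), hi=(1.5, 1, 1, 1), label=1)
slowed = dilate_time(selected, label=1, factor=2.0)
```

Here the two samples at x = 1 are selected across both time steps and then slowed down by a factor of two, while the rest of the footage keeps its original timing.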

3D video brick with acquired pattern and texture images. Acquisition system overview. Re-rendering from novel view.

3D video editing framework. Color camera view and reconstructed depth map.


Last update:
March 31. 2012 04:21:40