Sean Anderson
CS 348C (VR Topics)
March 20, 1996
Project Writeup

Augmented Reality by Chromakeying

Abstract

Ideally, augmented reality would merge the real world and a virtual world flawlessly - without lag and occlusion problems. But in its full generality, this task requires an accurate depth map of the real world, since the 3D graphics must be rendered only in places where the virtual world lies in front of the real world. The needed depth information could someday be provided by a high-speed, sub-millimeter-accurate, compact time-of-flight laser range scanner, but such a device will probably not be available for many years.

In the meantime, I approximated the depth information by using chromakeying, thus yielding two (and potentially three) levels of depth. The idea is to color certain real objects green (or blue, or whatever the chromakey color is) so that when they are viewed through the apparatus described below, parts of a virtual world are displayed in their place. The user wears a head mounted display with a six-degree-of-freedom tracker and a small camera attached. The camera's output is composited on top of a view of the virtual world, as it would be seen by a virtual camera positioned where the real camera is. Chromakeying hardware composites the two so that the virtual world is visible only in places where the view of the real world contains the chromakey color.
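
To make the compositing rule concrete, the sketch below models in C++ what the chromakey step effectively does for each pixel: wherever the real camera image matches the key color, the virtual rendering shows through. This is only an illustrative software model; the Pixel type, matchesKey test, and tolerance value are my own assumptions, not the Galileo board's actual interface.

    #include <cstdint>
    #include <cstdlib>

    struct Pixel { std::uint8_t r, g, b; };

    // True when a camera pixel is close enough to the key color (Studio
    // Blue in this project); the tolerance is an illustrative value.
    static bool matchesKey(Pixel p, Pixel key, int tol)
    {
        return std::abs(p.r - key.r) <= tol &&
               std::abs(p.g - key.g) <= tol &&
               std::abs(p.b - key.b) <= tol;
    }

    // Per-pixel compositing rule: wherever the real camera image matches
    // the key, show the rendered virtual world; everywhere else show the
    // real world.
    void composite(const Pixel* real, const Pixel* virt, Pixel* out,
                   int width, int height, Pixel key, int tol)
    {
        for (int i = 0; i < width * height; ++i)
            out[i] = matchesKey(real[i], key, tol) ? virt[i] : real[i];
    }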

Introduction

Augmented reality has shown great potential in collaborative work, where users can see themselves and each other in addition to a virtual world. Gestures and expressions aid in communication, particularly in the processes of design and education. The Holodeck on Star Trek clearly shows that there are many benefits to augmented reality in its ideal form.

Unfortunately, there are many problems that need to be solved before augmented reality works well. One of the greatest is correctly occluding virtual and real objects so that whatever should be closest appears closest. Several ways of approximating the needed depth information have been developed. Many augmented reality systems use a head mounted display or Boom to put 3D graphics on top of either live video [Bajura92, Wloka95] or whatever may be seen through a Type 2 HMD. This technique assumes that all objects in the real world are always farther away than all objects in the virtual world, which is clearly a poor assumption considering how close our hands and other body parts are to our eyes. Another method [Wloka95] uses the two images of the real world and infers a depth map dynamically using a stereo image-matching algorithm; unfortunately, the resulting depth map is very crude. In CAVEs [Cruz93], yet another form of augmented VR, the virtual world is projected onto the walls of a small room, thereby permitting two levels of depth. Unfortunately, the users' bodies sometimes incorrectly occlude near virtual objects, and only a single user sees an accurate perspective view of the environment. Responsive Workbenches [Krueger95], which display a virtual environment on a table, suffer from the same two shortcomings. Moreover, when one user points at part of a virtual object, the place he is pointing at does not appear the same to other users. A final way of approximating depth is to statically precompute the geometry [Koch93]. This approach is unacceptable for interactive VR, where depth maps must be obtained dynamically in real time.

Chromakeying Approach

The system I built required chromakeying hardware (an SGI Galileo Video board), a graphics workstation (an SGI Indigo2), a head mounted display (virtual iglasses! by Virtual IO), a small camera (a Sony camcorder), and some software to read the tracker and render the virtual world (SGI Performer). The camera was mounted on top of a bicycle helmet worn by the user, and its output went into the Galileo Video board to be chromakeyed in hardware. Meanwhile, the position and orientation of the user's head were tracked with a Polhemus 6 DOF tracker (also attached to the helmet) and sent to the Indigo2. The workstation used Performer to render the view of the virtual environment as it would be seen by the real camera, given the tracker's position and orientation. The rendering on the workstation was fed to the Galileo for real-time video compositing with the camera's video. Finally, the composited (NTSC) signal was sent to the head mounted display to be viewed by the user.
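
The per-frame control flow on the workstation amounted to the loop sketched below. The helper functions (readTracker, sensorToCamera, setVirtualCamera, renderVirtualWorld, sendFrameToCompositor) are hypothetical stand-ins for the actual Performer and Galileo code, with trivial stub bodies, and the fixed sensor-to-camera transform is something that would have to be measured on the real helmet.

    #include <cstdio>

    struct Pose { float pos[3]; float quat[4]; };

    // Stub: would read the Polhemus 6 DOF sensor mounted on the helmet.
    Pose readTracker() { return Pose{{0, 0, 0}, {0, 0, 0, 1}}; }

    // Stub: would apply the fixed, measured transform from the tracker
    // sensor to the camera lens.
    Pose sensorToCamera(const Pose& sensor) { return sensor; }

    // Stubs: would set the rendering view and draw the virtual scene.
    void setVirtualCamera(const Pose&, float /*fovDeg*/) {}
    void renderVirtualWorld() {}

    // Stub: would hand the rendered frame to the Galileo for keying.
    void sendFrameToCompositor() { std::puts("frame out"); }

    int main()
    {
        const float kFovDegrees = 30.0f;  // matched to the HMD and real camera

        for (int frame = 0; frame < 3; ++frame) {  // a real system loops forever
            Pose head   = readTracker();
            Pose camera = sensorToCamera(head);

            // Render the virtual world exactly as the real camera sees the
            // real world, so keyed regions of the video line up.
            setVirtualCamera(camera, kFovDegrees);
            renderVirtualWorld();
            sendFrameToCompositor();
        }
        return 0;
    }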

Figure: Block diagram of the chromakeying system.

A large piece of blue paper (Studio Blue, available at photography supply stores) was used as a window to the virtual world. In one case it lay flat on a table (simulating a Responsive Workbench), and in another it stood upright as a cylindrical display. The head mounted display (HMD) had a 30-degree field of view, so the real and virtual cameras were adjusted to have 30-degree fields of view as well. The real camera was adjusted by pointing it directly at an object and setting the lens zoom to just include an object 15 degrees off-center. Because of limits on the workstation's rendering power, and because the chromakeying hardware could not accept input from a Reality Engine and the camera simultaneously, the virtual worlds were kept quite simple: a gigantic teapot under the table in one case, and in the other a Klingon warship projected onto the cylinder so that the user could walk around and view it from different angles.
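
For reference, the off-center placement follows from simple trigonometry: a point 15 degrees off the optical axis at distance d lies d*tan(15 degrees) to the side, so zooming until such a point just touches the edge of the frame gives a 30-degree horizontal field of view. The distance in the snippet below is only an example value.

    #include <cmath>
    #include <cstdio>

    int main()
    {
        const double pi         = 3.14159265358979;
        const double halfFovDeg = 15.0;  // half of the 30-degree HMD field of view
        const double distance   = 2.0;   // example distance to the aim point, in meters

        // A point 15 degrees off-axis at this distance sits this far to the side.
        const double offset = distance * std::tan(halfFovDeg * pi / 180.0);

        // Zoom until a marker placed at this offset just reaches the image edge.
        std::printf("Place the marker %.2f m off-center at a distance of %.2f m.\n",
                    offset, distance);
        return 0;
    }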

Results

The Polhemus tracker suffered from interference from the camera and HMD when it was placed too close to them, resulting in virtual objects jittering. The head apparatus was heavy, and the helmet tended to push the HMD down; a small "finger" camera would have worked much better than the bulky camcorder, though it might have been difficult to adjust its field of view to match the HMD's 30 degrees. (Initially, I intended to use IndyCams, but lengthening their short 3-foot cables results in a severe degradation in picture quality.) Aside from these unexpected problems and the narrow field of view, biocular view, and low resolution, the system essentially worked.

Future Work

  1. Simulate a CAVE by covering the walls with the chromakey. (Unfortunately, the FOV would not be nearly as good.)
  2. Cover parts of a user's body with the chromakey and attach trackers to those parts to view his skeleton or organs.
  3. Use a scan converter, such as a Lyon Lamb, to take the output of the machine running the Galileo and scan convert a portion of the screen that contains a composited output window. Render additional graphics on top of this window. Send the output of the scan converter to the HMD, and voila, three levels of depth (see the sketch after this list).
    1. If the user wears a tracked VR glove and the workstation renders a 3D model of it, unwanted hand occlusions could be fixed with three levels of depth.
    2. Multiplayer games could be built: Star Wars-style lasers and light sabers become possible with three levels of depth.
  4. With long cables and a blue sky, we could project VR onto the sky. A virtual planetarium would be possible - one that permits flying the Earth very close to other planets, while the horizon and everything below still appear normal.
  5. Cover with the chromakey a large ball or cylinder that the user can walk around, and project 3D virtual objects onto it for 360-degree examination.
  6. Capture a light field at the frame of a door to a nice church, for example, and render the light field onto a piece of paper on the wall. The paper would be the size of the original door frame (and be the chromakey color, naturally).
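
To show why the scan-converter arrangement in item 3 gives three levels of depth, the sketch below models the resulting per-pixel priority: locally rendered graphics in front, then the real camera video, then the virtual world wherever the video matches the key color. As in the earlier sketch, the types, values, and key test are illustrative assumptions, not any board's actual interface.

    #include <cstdint>
    #include <cstdlib>

    struct Pixel { std::uint8_t r, g, b; };
    struct OverlayPixel { Pixel color; bool opaque; };  // graphics drawn over the window

    static bool matchesKey(Pixel p, Pixel key, int tol)
    {
        return std::abs(p.r - key.r) <= tol &&
               std::abs(p.g - key.g) <= tol &&
               std::abs(p.b - key.b) <= tol;
    }

    // Front to back: locally rendered overlay graphics, then the real
    // camera video, then the virtual world wherever the video is keyed.
    Pixel threeLevelPixel(OverlayPixel overlay, Pixel real, Pixel virt,
                          Pixel key, int tol)
    {
        if (overlay.opaque)
            return overlay.color;
        return matchesKey(real, key, tol) ? virt : real;
    }

    int main()
    {
        Pixel key{0, 0, 255};                    // roughly Studio Blue
        OverlayPixel saber{{255, 0, 0}, true};   // a light saber drawn locally
        Pixel video{10, 20, 240};                // camera saw the blue paper here
        Pixel virt{0, 200, 0};                   // virtual world at this pixel
        Pixel p = threeLevelPixel(saber, video, virt, key, 40);
        (void)p;  // the saber wins over both the video and the virtual world
        return 0;
    }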

Acknowledgements

Tamara Munzner was very helpful in getting the video equipment to work and in discussing the project. Bernd Froelich helped with the tracker and with questions about the transformations. Andrew Beers explained how to make a new cable to connect the workstation to the Polhemus tracker. Last but not least, Mark Levoy generously approved the purchase of the virtual iglasses! for me to use. Thank you.

References

[Bajura92] Bajura, Michael, Henry Fuchs, and Ryutarou Ohbuchi. Merging Virtual Objects with the Real World: Seeing Ultrasound Imagery within the Patient. Proceedings of SIGGRAPH '92 (Chicago, Illinois, July 26-31, 1992). In Computer Graphics 26, 2 (July 1992), 203-210.
[Cruz93] Cruz-Neira, C., D. J. Sandin, and T. A. DeFanti. Surround-screen projection-based virtual reality: the design and implementation of the CAVE. Proceedings of SIGGRAPH '93 (Anaheim, California, August 1-6, 1993). In Computer Graphics Proceedings, Annual Conference Series, 1993, ACM SIGGRAPH, pages 135-142.
[Koch93] Koch, Reinhard. Automatic Reconstruction of Buildings from Stereoscopic Image Sequences. In R. J. Hubbold and R. Juan, editors, Eurographics '93, pages 339-350, Oxford, UK, 1993.
[Krueger95] Krueger, W., C.-A. Bohn, B. Froehlich, H. Schueth, W. Strauss, and G. Wesche. The Responsive Workbench. IEEE Computer, Vol. 28, No. 7, July 1995, pages 42-48.
[Wloka95] Wloka, Matthias M. and Brian G. Anderson. Resolving Occlusion in Augmented Reality. Proceedings of the 1995 Symposium on Interactive 3D Graphics (Monterey, California, April 9-12, 1995), pages 5-12.