Sean Anderson
CS 348C (VR Topics)
February 13, 1996
Project Proposal

Augmented Reality by Chromakeying

Ideally, augmented reality would merge the real world and a virtual world flawlessly - without lag or occlusion problems. But in its full generality, this task requires an accurate depth map of the real world, since 3D graphics must be rendered only where they lie in front of real objects. The needed depth information could someday be provided by a high-speed, sub-millimeter-accurate, compact time-of-flight laser range scanner, but such a device will probably not be available for many years.
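
To make the occlusion requirement concrete, the per-pixel test that such a depth map would enable is sketched below in C. The buffer names and layout are illustrative assumptions, not part of any existing system.

    /* Per-pixel depth compositing: show a virtual pixel only where the
       virtual surface is closer than the real one.  Buffer names and
       layout are hypothetical, for illustration only. */
    typedef struct { unsigned char r, g, b; } Pixel;

    void depth_composite(const Pixel *real, const float *real_depth,
                         const Pixel *virt, const float *virt_depth,
                         Pixel *out, int npixels)
    {
        int i;
        for (i = 0; i < npixels; i++)
            out[i] = (virt_depth[i] < real_depth[i]) ? virt[i] : real[i];
    }

A range scanner would supply real_depth directly; everything that follows is about approximating this comparison without one.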

For the meantime, an interesting approach would be to approximate the depth information using chromakeying, yielding two or perhaps three levels of depth. The idea is to color objects green (or blue, or whatever the chromakey color is) so that wherever they appear in view, parts of the virtual world are shown instead.
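
In software terms, the keyer's decision is just a color match in place of a depth comparison. A minimal sketch in C follows; the tolerance test is a simplification of what a hardware keyer actually does, and all names are illustrative.

    #include <stdlib.h>   /* abs */

    typedef struct { unsigned char r, g, b; } Pixel;

    /* Does this camera pixel (approximately) match the key color? */
    static int keyed(Pixel p, Pixel key, int tol)
    {
        return abs(p.r - key.r) <= tol && abs(p.g - key.g) <= tol &&
               abs(p.b - key.b) <= tol;
    }

    /* Wherever the camera sees the key color, substitute the rendered
       virtual pixel; elsewhere, pass the camera image through. */
    void chromakey(const Pixel *camera, const Pixel *virt, Pixel *out,
                   int npixels, Pixel key, int tol)
    {
        int i;
        for (i = 0; i < npixels; i++)
            out[i] = keyed(camera[i], key, tol) ? virt[i] : camera[i];
    }

Painted objects thus act as portable windows into the virtual world: two depth levels fall out of a single color test per pixel.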

Several other ways of approximating depth information have been developed. Many augmented reality systems use a head-mounted display or BOOM to put 3D graphics on top of either live video [Bajura92, Wloka95] or whatever may be seen through a Type 2 HMD. This technique assumes that every object in the real world is farther away than every object in the virtual world, which is clearly a poor assumption considering how close our hands and bodies are to our eyes. Another method [Wloka95] takes the stereo pair of images of the real world that the user would see and dynamically infers a depth map with a stereo image-matching algorithm, but the resulting depth map is very crude. In CAVEs [Cruz93], yet another form of augmented VR, the virtual world is projected onto walls, thereby permitting two levels of depth. Unfortunately, the users' bodies sometimes incorrectly occlude near virtual objects, and only a single user sees an accurate perspective view of the environment. Responsive Workbenches [Krueger95], which display a virtual environment on a table, suffer from these same two shortcomings. A final way of approximating depth is to statically precalculate the geometry [Koch93]. This approach is unacceptable for interactive VR, where depth maps must be obtained dynamically in real time.

Figure 1. The user looks at the chromakey-colored box and sees virtual reality projected on it, while the surrounding environment looks normal.

The proposed system would require at least one small camera; for stereo, two would be needed. SGI IndyCams would be adequate for a prototype, though they are not ideal because of their bulk and their loss of signal strength over even short (three-foot) cables; color finger cameras would be better. The cameras would be mounted on the user's head near the eyes, and their output would go into a chromakey compositing device, such as an SGI Galileo Video board. Meanwhile, the position and orientation of the user's head would be tracked with a Polhemus 6 DOF tracker and sent to a graphics workstation, such as an SGI RealityEngine. The workstation would render the view of the virtual environment as it should be seen given the user's head position and orientation. The output of the workstation would be sent to the chromakeying device for real-time video compositing with the camera video. Finally, the composited (NTSC) signal would be sent to a head-mounted display, such as the i-glasses! by Virtual i/O, to be viewed by the user.
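
The per-frame dataflow through this hardware is summarized in the C skeleton below. Every function is a stub standing in for a device or library call (all names are hypothetical); only the ordering of operations reflects the proposal.

    typedef struct { float pos[3], quat[4]; } Pose;

    /* Hypothetical stand-ins for hardware and library calls. */
    static Pose read_polhemus(void)         { Pose p = {{0,0,0},{0,0,0,1}}; return p; }
    static void render_virtual_view(Pose h) { (void)h; /* RealityEngine pass       */ }
    static void keyer_composite(void)       { /* Galileo chromakey mix             */ }
    static void hmd_display(void)           { /* NTSC out to the i-glasses!        */ }

    int main(void)
    {
        /* Once per video field: track, render, key, display. */
        Pose head = read_polhemus();    /* 6 DOF head position/orientation     */
        render_virtual_view(head);      /* virtual world from that eyepoint    */
        keyer_composite();              /* key rendered frame into camera video */
        hmd_display();                  /* show the composite in the HMD       */
        return 0;
    }

Note that the chromakeying itself happens in dedicated video hardware; the workstation's only real-time obligations are tracking and rendering.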

As a minimal test, one or more pieces of matte chromakey-colored paper could be placed on the floor or a wall to create virtual windows. A photograph of a scene of distant mountains, for example, could be rendered in the windows. Rather than using the Polhemus tracker, the simple 3 DOF angular tracker built into the i-glasses! could be used, since the mountains would be effectively at infinity. Because the rendering would simply involve panning a 2D image, the machine running the Galileo Video board could probably handle redrawing the mountain scene itself.
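
Since the scenery is at infinity, rendering reduces to selecting which window of a wide panoramic image to display for the current head orientation. A sketch of the arithmetic, with all names and conventions as illustrative assumptions:

    /* Map the 3 DOF tracker's yaw/pitch (degrees) to a pixel offset into
       a cylindrical panorama.  deg_per_pixel is the panorama's angular
       resolution (360 / pano_w for a full wraparound image). */
    void pan_offset(float yaw_deg, float pitch_deg, float deg_per_pixel,
                    int pano_w, int pano_h, int *x, int *y)
    {
        int px = (int)(yaw_deg / deg_per_pixel);
        *x = ((px % pano_w) + pano_w) % pano_w;             /* wrap around 360 */
        *y = pano_h / 2 - (int)(pitch_deg / deg_per_pixel); /* up = smaller y  */
    }

Copying the selected window to the output each field is cheap, which is why no RealityEngine should be needed for this test.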

Provided the minimal test works, many more elaborate ideas could be tested, such as the following:

  1. Create a Responsive Workbench by covering a desk with the chromakey color. Potentially multiple people could wear HMDs and each have a correct perspective view.
  2. Similarly, create a CAVE by covering the walls with the chromakey. (Unfortunately, the FOV would not be nearly as good.)
  3. Cover parts of your body with the chromakey and attach trackers to those parts to view your skeleton or organs.
  4. Use a scan converter, such as a Lyon Lamb, to take the output of the machine running the Galileo and scan convert the portion of the screen that contains a composited output window. Render additional graphics on top of this window, send the output of the scan converter to the HMD, and voila: three levels of depth. (A compositing sketch follows this list.)
    1. If the user wears a tracked VR glove and the workstation renders a 3D model of it, unwanted hand occlusions could be corrected with three levels of depth.
    2. Multiplayer games could be built: Star Wars-style lasers and light sabers become possible with three levels of depth.
  5. With long cables and a blue sky, we could project VR onto the sky itself. A virtual planetarium would be possible - one that permits flying from Earth to within close range of other planets, while the horizon and everything below it still appear normal.
  6. Cover a large ball or cylinder with the chromakey so that the user can walk around it, and project 3D virtual objects onto it for 360-degree examination.
  7. Capture a light field at the frame of a door to a beautiful church, for example, and render the light field onto a piece of paper on the wall. The paper would be the size of the original door frame (and be the chromakey color, naturally).
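
The three-level compositing of idea 4 can be stated precisely as a per-pixel priority rule: near virtual graphics win outright, the chromakey opens windows onto far virtual graphics, and the real camera image fills in everywhere else. A hypothetical sketch in C (the names and the alpha convention are assumptions):

    #include <stdlib.h>   /* abs */

    typedef struct { unsigned char r, g, b, a; } Pixel;  /* a != 0: drawn */

    static int keyed(Pixel p, Pixel key, int tol)
    {
        return abs(p.r - key.r) <= tol && abs(p.g - key.g) <= tol &&
               abs(p.b - key.b) <= tol;
    }

    Pixel three_level(Pixel near_virt, Pixel camera, Pixel far_virt,
                      Pixel key, int tol)
    {
        if (near_virt.a)             return near_virt; /* nearest layer    */
        if (keyed(camera, key, tol)) return far_virt;  /* key = far window */
        return camera;                                 /* real world       */
    }

In the actual pipeline the same rule emerges from plumbing: the Galileo keys far virtual graphics into the camera video, and the second machine simply draws near graphics over the scan-converted result.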

References

[Bajura92] Bajura, Michael, Henry Fuchs, and Ryutarou Ohbuchi. Merging Virtual Objects with the Real World: Seeing Ultrasound Imagery within the Patient. Proceedings of SIGGRAPH '92 (Chicago, Illinois, July 26-31, 1992). In Computer Graphics 26, 2 (July 1992), pages 203-210.
[Cruz93] Cruz-Neira, C., D. J. Sandin, and T. A. DeFanti. Surround-Screen Projection-Based Virtual Reality: The Design and Implementation of the CAVE. Proceedings of SIGGRAPH '93 (Anaheim, California, August 1-6, 1993). In Computer Graphics Proceedings, Annual Conference Series, 1993, ACM SIGGRAPH, pages 135-142.
[Koch93] Koch, Reinhard. Automatic Reconstruction of Buildings from Stereoscopic Image Sequences. In R. J. Hubbold and R. Juan, editors, Eurographics '93, pages 339-350, Oxford, UK, 1993.
[Krueger95] Krueger, W., C.-A. Bohn, B. Froehlich, H. Schueth, W. Strauss, and G. Wesche. The Responsive Workbench. IEEE Computer, Vol. 28, No. 7, July 1995, pages 42-48.
[Wloka95] Wloka, Matthias M., and Brian G. Anderson. Resolving Occlusion in Augmented Reality. Proceedings of the 1995 Symposium on Interactive 3D Graphics (Monterey, California, April 9-12, 1995), pages 5-12.