Many of these project ideas could be Siggraph or CVPR/ICCV papers. In fact, some of the projects in this course will probably end up being submitted for publication! To keep you and your fellow students from being scooped, please don't distribute this handout, or the URL of this web page, beyond the Stanford community.
- Using the cameras in our array, create a virtual high-X camera, with high dynamic range, extended depth of field, or unusual spectral selectivity, by adjusting the electronic shutter or focus setting of selected cameras in a pattern, or by placing filters in front of the cameras. This is unexplored territory, so you'll need to invent your own patterns and "demosaicing" algorithms. For an extra challenge, try implementing several of these extensions at once. (A rough sketch of the HDR merge step appears after this list.)
- Try stitching together views from the cameras in our array to create an ultra-resolution video camera (10,000 pixels wide!). You'll need to mount miniature telephoto lenses on our cameras (we have these lenses), rotate the cameras to abut their fields of view, calibrate them geometrically and colorimetrically (we have code for this), and maybe do some blending at the seams. For an extra challenge, try increasing the overlap between the cameras' fields of view and use this redundancy to reduce imaging noise, perform super-resolution, or increase dynamic range as in the previous project.
- Super-resolution techniques provide a way to combine a collection of low-resolution images to produce a single image of higher resolution. Implement one of the many super-resolution algorithms in the literature (a bare-bones shift-and-add baseline is sketched after this list). Try it first using a digital still camera that you "bump" slightly between shots. Then (if you like), try it using one of the cameras in our array. For an extra challenge, try spacetime super-resolution, using data from a video camera. We can provide you with literature references.
- Imagine a camera in which images are captured continuously during aiming and focusing. Could this extra imagery be used to boost the resolution, dynamic range, or other aspects of the snapshot taken when the photographer presses the shutter? Try it. Use image blending (Sawhney, Siggraph 2001), texture synthesis (Wei and Levoy, Siggraph 2000), image analogies (Hertzmann, Siggraph 2001), or an algorithm of your own invention.
- Imagine a camera that sweeps quickly through a range of focus settings, taking an image at every setting. This would allow you to select the focus and depth of field afterwards using an interactive graphics program. Design and implement such a program (the focal-stack merge at its core is sketched after this list). What additional creative effects can you achieve using this technique? For this project, mount your camera on a tripod to avoid the need to first stabilize the imagery.
- Image stabilization, which is a mature technology in video cameras, could have many uses in still photography. Implement an image stabilization algorithm from the vision literature, then use it to stabilize and combine a set of images shot using a handheld camera. This should allow you to shorten exposure times in low-light scenes without enhancing noise, and to produce cleaner shadows in high-contrast scenes. Alternatively, by averaging multiple images to reduce noise, then boosting contrast, you might be able to take clear pictures in murky environments, such as underwater. Using non-linear image combinations, you might be able to remove cars from a freeway, or crowds from a plaza. (The combination step is sketched after this list.)
- By placing suitably oriented polarizing filters in front of the flash unit and lens of a camera, specular highlights can be removed from photographs of human faces. (See for example Debevec, Siggraph 2000.) However, if highlights are completely removed, the result looks unnatural. This suggests computing a blend of polarized and unpolarized images. A future camera could even include a slider to control the blend between the two images. As obvious as this idea sounds, to our knowledge nobody has ever tried doing this.
- If you illuminate a scene first with a flash positioned on the left side of the lens, then with one positioned on the right side, you get differently structured shading and shadows in the two images. What can you do with this kind of multi-flash data? Ramesh Raskar at MERL has been playing in this space, but so far hasn't published it.
- Simple as it may seem, we have not yet performed light field rendering using data from our camera array. Using an existing light field renderer (they're easy to write), or one of your own design (a minimal one is sketched after this list), experiment with different camera layouts. Try surrounding an object with cameras, or jitter the camera locations to reduce light field aliasing. Beware: calibrating a full-surround arrangement could be tricky, because not all the cameras can see the calibration target at once. We'd also love someone to implement a real-time light field renderer using video from our array, but see us first before you attempt this one.
- Implement the world's first synthetically focusable wide-aperture camera. By warping the images captured by our camera array and adding them together, you can synthetically focus the array on any surface in space, even a curved surface. So far, we have tried this only offline. Design a UI for interactively specifying a focal surface, then write a program that grabs images from the array, warps them, and adds them together for display (the planar-focal-surface case is sketched after this list). Try making the system run in realtime by subsampling the images or reducing the number of images you grab. Can you warp them in compressed format, i.e. in DCT space? Eventually, the warps can be performed in hardware using the per-camera FPGAs.
- Each sample in a light field should represent the integral over an aperture equal in size to the spacing between camera positions (see Levoy and Hanrahan, Siggraph 1996). However, the space between adjacent camera positions usually exceeds the diameter of the camera's aperture, so captured light fields contain aliasing. Using our motorized planar or spherical gantry, capture a finely spaced set of images, then combine adjacent images to produce a correctly prefiltered light field (one way to form each prefiltered sample is sketched after this list). This would be the first one ever acquired!
- In a paper to appear in ECCV 2004, Leonard McMillan describes the class of all linear cameras, i.e. all 2D affine subspaces of the 4D light field. These include ordinary perspective images, orthographic images, pushbroom panoramas, cross-slit panoramas, and four other exotic projections. Write a program that implements all eight cameras (pushbroom-style and cross-slit-style slices are sketched after this list). Using imagery from our array or one of our motorized gantries, create images of each type. Is his classification complete? What interesting projections lie outside this class? And what do they look like as images?
- Determining the 3D shape of an object from multiple images is one of the central problems in computer vision, and many algorithms have been proposed for it. Having more images generally improves the performance of these algorithms, so given 128 cameras, we should do well. But which algorithms benefit most from having more images? Try one, off the shelf or roll your own. How about shape-from-focus using synthetically focused images? (That variant is sketched after this list.) Can you think up a new algorithm for the "shape from light field" problem?
- Implement the world's first synthetically focusable wide-aperture projector. By displaying the same image on every projector in an array (of which we have several in the lab), but suitably warped for each projector, the images can be brought into alignment on any surface in front of the array, even a curved surface. Design a UI for interactively specifying a focal surface, then write a program that warps an image and displays it on each projector in the array. This is the illumination equivalent of the wide-aperture camera project (see above).
- We have used an array of video projectors to synthetically focus an image onto a plane in space. However, with a suitable light field as input, synthetic aperture illumination creates a 3D image in space, not 2D, with different parts of the light field coming to a focus at different depths. Try creating such an image, using one of the projector arrays in our lab. What are its lateral and axial resolution limits? How do these limits relate to the number and arrangement of projectors and to their spatial resolution?
- Suppose you have a scene containing one or more foreground objects, an array of projectors, and a set of per-projector mattes that identify for each projector those pixels belonging to the foreground versus the background. Try shaping the illumination that falls on the scene in interesting ways. For example, illuminate one foreground object without illuminating the background or casting shadows on the ground, or illuminate the background while leaving the foreground dark. There are several easy ways to compute the mattes. Imagine the theatrical applications!
- In our Siggraph 2004 paper, we explored only a few applications of shaped illumination; there must be many others. Try using it to cloak an object, i.e. make it disappear. Try shaping the illumination of a sculpture to change its apparent shape or surface properties, or the illumination in an office to create a virtual whiteboard on your bookcase, or a virtual library on your whiteboard. Create (or simulate) adaptively shaped headlights for a car or bicycle.
- We've used an array of projectors to simulate confocal imaging, and others have used it to create autostereoscopic displays, but synthetic aperture illumination is a very general idea that has many uses. Try using it to create raking-angle illumination simultaneously at all points on a curved object, thereby enhancing its surface texture. Or change the apparent BRDF of an object, e.g. make it look shiny. What other uses can you think of?
- Displaying a Powerpoint presentation on an array of projectors allows us to reduce the amount of light falling on a lecturer, and it allows each projector to be dimmer, thereby reducing uncomfortable glare. If the lecturer's silhouette is tracked, their shadow on the screen can be removed, and the part of the slide that would fall on them can be masked or replaced with gentle homogeneous lighting. Some of these ideas have been tried; some have not. Implement one or more of these ideas. Make it a permanent feature of room 392!
- In our Siggraph 2004 paper, we implemented confocal imaging using one projector, one camera, and an array of planar mirrors. Re-implement our pipeline using our camera array and a set of interleaved video projectors, thereby permitting cross-sectional imaging and selective illumination of room-sized objects. This project will require careful calibration.
- If you project synthetically focused textures onto an object, existing shape-from-focus algorithms should perform better. In fact, there are several kinds of texture-enhanced vision algorithms that could benefit from having an array of projectors, an array of cameras, or both. Try implementing one of these algorithms.
- By displaying temporal stripe sequences on two projectors having overlapping fields of view, one can project a unique time code onto every point in 3-space. Given such a volumetric coding of space, a photocell placed anywhere in a room can figure out where it is. Alternatively, one could build a powerful rangefinder, able to see around occlusions, using these two projectors and any number of uncalibrated cameras. To our knowledge, neither of these ideas has been published. (The Gray-code encoding and decoding are sketched after this list.)
- Light field rendering (see above) assumes no geometric knowledge of the scene. However, if we had even a rough 3D model, we could compress the light field more efficiently, improve interpolation between views (Lumigraph-style), compute per-camera mattes, and tackle many other tasks. We have a real-time structured-light rangefinding system in our lab. Try combining it with our camera array, and use it to implement one of the improvements listed above.
- By interleaving projectors and cameras, one can simultaneously create and capture 4D light fields. This allows us to characterize the appearance of a human face under arbitrary lighting, or to measure the BSSRDF (bidirectional surface scattering distribution function) of an onyx vase, which exhibits spatially inhomogeneous subsurface scattering. Warning: nobody has ever measured such a "bidirectional light field" before; this function is 8-dimensional! This may not be a 1-quarter project.
- An array of cameras can be treated as a gigasample-per-second photometer, with wide latitude over how these samples are distributed in space, time, and wavelength. Researchers have investigated alternative pixel lattices for cameras, but nobody has thought about lattices of camera viewpoints, nor the general problem of arranging samples in space, time, dynamic range, focus, and wavelength. One can imagine sampling strategies that are optimal for a given scene or class of scenes, or strategies that are adaptive. This is a wide-open problem; in fact, it's a good dissertation topic.
- Analyze the applicability of an exotic or computational imaging technique from microscopy, astronomy, or another field to photography, i.e. at the human scale using combinations of cameras, projectors, or other optical components. Examples from microscopy might be phase contrast illumination or deconvolution microscopy. Examples from astronomy might be coded-aperture imaging (using an array of pinholes on a mask).
- We've designed and built an array of cameras in our laboratory, and we've envisioned lots of applications for an array of projectors. Design the future Stanford Multi-Projector Array. How small can you make the projectors? What are good arrangements for the array? How should the projectors be connected and synchronized? Where will you store the images? And how will you manage bandwidth? Now stick around for three years and help us build it. (Just kidding.)
- If the future of photography includes multiple images, then image editing programs should support arrays of images as a fundamental datatype, so that one can manipulate panoramic image sets, multi-exposure images, video sequences, light fields, and so on. Such a Photoshop++ program should also support images of arbitrary range dimensionality, such as rgbaz (range images), bump or normal maps, shader parameter maps, etc. Finally, it should offer built-in operators that are appropriate for those datatypes, e.g. differential geometry operators for range images, lighting operators for bump maps and BRDFs, etc. Anybody want to try writing such a program?
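Below are a few rough code sketches for some of the projects above. They are written in Python with NumPy (and SciPy's ndimage in places); the array shapes, parameter names, and calibration assumptions are ours, not part of the project statements, and every sketch assumes grayscale, geometrically registered imagery unless it says otherwise.

For the virtual high-dynamic-range camera: a minimal merge of registered frames taken with different electronic shutter settings, assuming a linear (or already linearized) radiometric response and known exposure times. The hat-shaped weight simply trusts mid-tone pixels most.

```python
import numpy as np

def merge_hdr(images, exposure_times):
    """Merge registered images (floats in [0, 1]) taken with different shutter
    settings into a single radiance map.  Assumes a linear sensor response."""
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(images[0], dtype=np.float64)
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)   # hat weight: down-weight clipped pixels
        num += w * img / t                  # divide out exposure to estimate radiance
        den += w
    return num / np.maximum(den, 1e-6)      # weighted average of the estimates
```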
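For the super-resolution project: the simplest algorithm in the literature is shift-and-add, sketched below. It assumes the sub-pixel shifts between the low-resolution frames are already known (in practice you would estimate them by registration) and it does no deblurring, so treat it as a baseline.

```python
import numpy as np

def shift_and_add(frames, shifts, factor=2):
    """frames: list of HxW low-res grayscale images; shifts: per-frame (dy, dx)
    sub-pixel offsets relative to the first frame; factor: upsampling factor."""
    H, W = frames[0].shape
    acc = np.zeros((H * factor, W * factor))
    cnt = np.zeros_like(acc)
    for img, (dy, dx) in zip(frames, shifts):
        # drop each low-res sample onto the nearest high-res grid position
        ys = np.clip(((np.arange(H) + dy) * factor).round().astype(int), 0, H * factor - 1)
        xs = np.clip(((np.arange(W) + dx) * factor).round().astype(int), 0, W * factor - 1)
        yy = np.broadcast_to(ys[:, None], (H, W))
        xx = np.broadcast_to(xs[None, :], (H, W))
        np.add.at(acc, (yy, xx), img)
        np.add.at(cnt, (yy, xx), 1.0)
    cnt[cnt == 0] = 1.0
    return acc / cnt   # empty grid positions remain; fill them by interpolation
```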
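For the sweep-through-focus camera: once the focal stack is captured from a tripod, the core of the program is a per-pixel focus measure. The sketch below uses local variance and builds an all-in-focus composite plus an index map that a UI could use to let the user pick focus after the fact.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_sharpness(img, radius=4):
    """Local variance as a crude focus measure."""
    m = uniform_filter(img, size=2 * radius + 1)
    m2 = uniform_filter(img * img, size=2 * radius + 1)
    return m2 - m * m

def all_in_focus(stack):
    """stack: (num_focus_settings, H, W) grayscale focal stack from a tripod."""
    sharp = np.stack([local_sharpness(s.astype(np.float64)) for s in stack])
    best = np.argmax(sharp, axis=0)                  # sharpest slice per pixel
    H, W = best.shape
    composite = stack[best, np.arange(H)[:, None], np.arange(W)[None, :]]
    return composite, best                           # image + per-pixel focus index
```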
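For the image-stabilization project: after the frames are aligned, the combining step is short. Averaging reduces noise roughly as the square root of the number of frames; a per-pixel median is the kind of non-linear combination that removes cars from a freeway, provided each pixel sees the background in most frames.

```python
import numpy as np

def combine_aligned(frames):
    """frames: (N, H, W) or (N, H, W, 3) stack of stabilized exposures."""
    stack = np.asarray(frames, dtype=np.float64)
    mean_img = stack.mean(axis=0)          # noise reduction for short, dim exposures
    median_img = np.median(stack, axis=0)  # rejects transient objects per pixel
    return mean_img, median_img
```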
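For the light field rendering project: a minimal renderer for a light field stored as a 4D array L[s, t, row, col], captured by a planar array. It renders a virtual pinhole view at a continuous camera-plane position by bilinearly weighting the four neighboring cameras, after shifting each by a disparity that selects the focal plane. The two-plane geometry is simplified, edge wraparound from np.roll is ignored, and the sign conventions are assumptions.

```python
import numpy as np

def render_view(L, s, t, disparity=0.0):
    """L: (S, T, H, W) grayscale light field; (s, t): continuous camera-plane
    position of the virtual view; disparity: pixel shift per unit camera
    spacing, which selects the focal plane (0 = focus at infinity)."""
    S, T, H, W = L.shape
    s0, t0 = int(np.floor(s)), int(np.floor(t))
    out = np.zeros((H, W))
    wsum = 0.0
    for ds in (0, 1):
        for dt in (0, 1):
            si = int(np.clip(s0 + ds, 0, S - 1))
            ti = int(np.clip(t0 + dt, 0, T - 1))
            w = (1 - abs(s - (s0 + ds))) * (1 - abs(t - (t0 + dt)))  # bilinear weight
            # reproject this camera toward the virtual viewpoint via the focal plane
            dy = int(round((s - si) * disparity))
            dx = int(round((t - ti) * disparity))
            out += w * np.roll(L[si, ti], (dy, dx), axis=(0, 1))
            wsum += w
    return out / max(wsum, 1e-6)
```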
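For the synthetically focusable wide-aperture camera: when the focal surface is a plane parallel to the array, the per-camera warp collapses to a shift proportional to the camera's offset, which makes a good first milestone before handling arbitrary focal surfaces with full homographies. The rectification and the scaling constant are assumed.

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

def synthetic_aperture(images, cam_xy, focal_disparity):
    """images: list of HxW rectified camera images; cam_xy: (x, y) position of
    each camera in array coordinates; focal_disparity: pixels of shift per unit
    of camera offset, i.e. the depth of the planar focal surface."""
    acc = np.zeros_like(images[0], dtype=np.float64)
    for img, (cx, cy) in zip(images, cam_xy):
        # shifting by an amount proportional to the camera offset registers the
        # chosen focal plane across all cameras; everything else stays blurred
        acc += nd_shift(img.astype(np.float64),
                        (cy * focal_disparity, cx * focal_disparity), order=1)
    return acc / len(images)
```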
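For the prefiltered light field project: one crude way to approximate the aperture integral is to average, for each nominal camera position, all finely spaced gantry images that fall inside a synthetic aperture around it, after registering them to the light field's focal plane. This is only a box filter over the aperture; a full prefilter would also account for the image-plane parameterization.

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

def prefiltered_sample(fine_images, fine_xy, center_xy, aperture, focal_disparity):
    """Synthesize one prefiltered light field image at camera position center_xy
    by integrating the finely spaced gantry images inside the synthetic aperture."""
    acc, n = None, 0
    for img, (x, y) in zip(fine_images, fine_xy):
        dx, dy = x - center_xy[0], y - center_xy[1]
        if abs(dx) > aperture / 2 or abs(dy) > aperture / 2:
            continue                                   # outside this sample's aperture
        warped = nd_shift(img.astype(np.float64),
                          (dy * focal_disparity, dx * focal_disparity), order=1)
        acc = warped if acc is None else acc + warped
        n += 1
    return None if n == 0 else acc / n
```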
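For the linear cameras project: with a light field stored as L[s, t, row, col], a pushbroom-style panorama is an axis-aligned 2D slice (one fixed image column from each camera along s), and a cross-slit-style panorama is a sheared slice (the sampled column slides with the camera index). How these map onto McMillan's eight classes depends on the parameterization, so treat the sketch as a way to start looking at the pictures.

```python
import numpy as np

def pushbroom(L, t0=0, v0=None):
    """One fixed image column from each camera along the s axis.
    L: (S, T, U, V) = (cams_x, cams_y, rows, cols)."""
    S, T, U, V = L.shape
    v0 = V // 2 if v0 is None else v0
    return np.stack([L[s, t0, :, v0] for s in range(S)], axis=1)   # (U, S) image

def cross_slit(L, t0=0, shear=1.0, offset=0):
    """The sampled column slides linearly with the camera index: a sheared slice."""
    S, T, U, V = L.shape
    cols = []
    for s in range(S):
        v = int(np.clip(round(shear * s + offset), 0, V - 1))
        cols.append(L[s, t0, :, v])
    return np.stack(cols, axis=1)
```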
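For the shape-from-light-field question: a shape-from-focus variant that sweeps the synthetic focal plane, scores local sharpness at each setting, and takes the per-pixel argmax. The candidate disparities, the planar focal surfaces, and the variance-based focus measure are all simplifying assumptions.

```python
import numpy as np
from scipy.ndimage import shift as nd_shift, uniform_filter

def depth_from_synthetic_focus(images, cam_xy, disparities, radius=4):
    """Returns a per-pixel index into `disparities`, i.e. a coarse depth map."""
    scores = []
    for d in disparities:
        # synthetically refocus the array at disparity d (planar focal surface)
        acc = np.zeros_like(images[0], dtype=np.float64)
        for img, (cx, cy) in zip(images, cam_xy):
            acc += nd_shift(img.astype(np.float64), (cy * d, cx * d), order=1)
        acc /= len(images)
        m = uniform_filter(acc, size=2 * radius + 1)        # local mean
        m2 = uniform_filter(acc * acc, size=2 * radius + 1)
        scores.append(m2 - m * m)                           # local variance = sharpness
    return np.argmax(np.stack(scores), axis=0)
```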
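For the volumetric time-code project: one concrete encoding is a binary-reflected Gray code played as a temporal stripe sequence over each projector's columns (and rows, if you want a full 3D fix). A photocell records one thresholded bit per frame; decoding recovers which stripe the photocell sits in for each projector, and intersecting those constraints localizes it. The frame ordering (least significant bit first) is an assumption.

```python
import numpy as np

def gray_code_stripes(num_cols, num_bits):
    """Column patterns for a Gray-code stripe sequence: frame b is bright
    wherever bit b of the column's Gray code is 1 (tile vertically to project)."""
    cols = np.arange(num_cols)
    gray = cols ^ (cols >> 1)                        # binary-reflected Gray code
    return [((gray >> b) & 1).astype(np.uint8) for b in range(num_bits)]

def decode_photocell(bits):
    """bits: thresholded photocell readings, one per frame, LSB first.
    Returns the projector column whose stripes the photocell saw."""
    gray = 0
    for b, bit in enumerate(bits):
        gray |= int(bit) << b
    col = 0
    while gray:                                      # Gray -> binary conversion
        col ^= gray
        gray >>= 1
    return col
```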