The Digital Michelangelo Project

Marc Levoy
Computer Science Department
Stanford University

Appears in the proceedings of the
Second International Conference on 3D Digital Imaging and Modeling,
Ottawa, Canada, October 5-8, 1999

Introduction

Recent improvements in laser rangefinder technology, together with algorithms developed at Stanford for combining multiple range and color images, allow us to reliably and accurately digitize the external shape and reflectance of many physical objects.

As an application of this technology, I and a team of 30 faculty, staff, and students from Stanford University and the University of Washington spent the 1998-99 academic year digitizing the sculptures and architecture of Michelangelo. During this time, we scanned 10 statues, including the giant figure of David, and 2 building interiors, including the Medici Chapel, which was designed by Michelangelo. As a side project, we also acquired a high-resolution light field of his statue of Night, in the Medici Chapel. Finally, in another side project, we scanned the 1,163 fragments of the Forma Urbis Romae, the giant marble map of ancient Rome. In the months ahead we will process the data we have collected to create 3D digital models of these objects and, in the case of the Forma Urbis, we will try to assemble the map.

The goals of this project are scholarly and educational. Commercial use of the models is not excluded, and many such uses can be imagined, but none is currently planned. In this paper, I will outline the technological underpinnings, logistical challenges, and possible outcomes of this project.

Technological underpinnings

From a technological standpoint, the Digital Michelangelo Project contains two components: a collection of 3D scanners and a suite of software for processing range and color data.

Our principal scanner is a laser triangulation rangefinder and motorized gantry, built to our specifications by Cyberware Inc. and customized for scanning large statues (figure 1). The rangefinder has a standoff distance of 1.2 meters, a Z-resolution of 0.1mm, and an X-Y sample spacing of 0.25mm - sufficient to capture Michelangelo's chisel marks (figure 2a). The scanner head also contains a calibrated white light source and high-resolution color camera with a pixel size of 0.125mm on the statue surface.

The scanner head is mounted on a 4-axis motorized gantry consisting of a 7-meter vertical truss, a 1-meter horizontal arm that translates vertically on the truss, and a pan-tilt assembly that translates horizontally on the arm. To maximize flexibility, the scanner can be mounted in several positions on this pan-tilt head, providing a wide range of scanning configurations. The laser sheet is usually horizontal (parallel to the floor). This orientation is optimized for scanning vertical crevices, e.g. folds in drapery. To facilitate scanning horizontal crevices, the scanner head (and laser sheet) can be rolled 90 degrees. The maximum working volume of the gantry is 3 meters wide by 7.5 meters high - tall enough to scan Michelangelo's David on its pedestal (figure 3a).

For those hard-to-reach places, we used a commercial jointed digitizing arm and small triangulation laser scanner manufactured by Faro Technologies and 3D Scanners. However, its working volume was limited by the reach of a person's arm, and the scanner was fatiguing to use for long periods of time, so we made only moderate use of it in this project.

Our range processing pipeline consists of aligning the scans taken from different gantry positions, combining these scans together using a volumetric algorithm, and filling holes using silhouette carving (Curless and Levoy, 1996). Since gantry movements are not tracked in hardware, alignment is bootstrapped by aligning each scan to its neighbor interactively. This is followed by automatic pairwise alignment of scans using a modified iterated-closest-points (ICP) algorithm and finally by a global relaxation procedure designed to minimize alignment errors across the entire statue (Pulli, 1999).
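To make the automatic pairwise step concrete, the following is a minimal sketch of a plain iterated-closest-points loop. It is not the modified ICP used in this project: it assumes a two-dimensional simplification, brute-force nearest-neighbor matching, and an initial pose that is already close (as the interactive bootstrap is meant to provide). The function names are hypothetical.

```python
import math

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation taking paired src points onto dst."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - cx_s, sy - cy_s      # centred source point
        bx, by = dx - cx_d, dy - cy_d      # centred target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)         # optimal rotation in 2D
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)      # translation aligns the centroids
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, tx, ty

def apply_rigid_2d(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(src, dst, iterations=20):
    """Repeatedly match each moving point to its closest target point and re-fit."""
    current = list(src)
    for _ in range(iterations):
        matched = [min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in current]
        theta, tx, ty = best_rigid_2d(current, matched)
        current = apply_rigid_2d(current, theta, tx, ty)
    return current
```

A practical 3D version would replace the closed-form angle with a 3D rotation estimate and the brute-force search with a spatial index; pairwise results like these are what a global relaxation step would then reconcile across all scans.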

Our color processing pipeline consists of compensating for ambient lighting, discarding pixels affected by shadows or specular reflections, and factoring out the dependence of observed color on surface orientation. This requires knowing the bidirectional reflectance distribution function (BRDF) of the surface being scanned. For marble statues, we have successfully employed a simple dichromatic model consisting of a colored diffuse term and a white specular term (Ikeuchi and Sato, 1991). The result of our range and color processing pipeline is a single, closed, irregular triangle mesh with a diffuse RGB reflectance at each vertex (figure 2b).
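The three color-processing steps can be illustrated for a single pixel as follows. This is a sketch under stated assumptions rather than our actual pipeline: ambient light is assumed to be removed by subtracting an image taken with the scanner's light off, the diffuse term is assumed to be Lambertian, and pixels are simply discarded (rather than modeled) when they face away from the light or lie near the mirror direction, where the white specular term of a dichromatic model would contribute. The thresholds and function names are hypothetical.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def diffuse_reflectance(lit_rgb, ambient_rgb, normal, to_light, to_camera,
                        light_rgb=(1.0, 1.0, 1.0),
                        min_cosine=0.2, highlight_cosine=0.95):
    """Return an estimated diffuse RGB reflectance, or None if the pixel is unreliable."""
    n, l, v = normalize(normal), normalize(to_light), normalize(to_camera)
    n_dot_l = dot(n, l)
    if n_dot_l < min_cosine:
        return None                        # grazing or self-shadowed: discard
    # Mirror reflection of the light direction about the surface normal.
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    if dot(r, v) > highlight_cosine:
        return None                        # likely specular highlight: discard
    # Subtract ambient light, then divide out the cosine falloff and light color.
    return tuple(max(lit - amb, 0.0) / (lc * n_dot_l)
                 for lit, amb, lc in zip(lit_rgb, ambient_rgb, light_rgb))
```

Averaging the surviving estimates from several overlapping color images would then give one diffuse RGB value per vertex.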

Non-photorealistic renderings of our datasets are also possible. For example, by coloring each vertex of a mesh according to its accessibility to a virtual probe sphere rolled around on the mesh (Miller, 1994), a visualization is produced that seems to show the structure of Michelangelo's chisel marks more clearly than a realistic rendering (figure 2c). We believe that the application of geometric algorithms and non-photorealistic rendering techniques to scanned 3D artworks is a fruitful area for future research.
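One way to read "accessibility" is as the radius of the largest sphere that can touch the surface at a vertex without enclosing any other part of the surface. The sketch below computes this on a bare set of surface samples with unit outward normals, which is an assumption made for brevity; a real implementation would work on the mesh and use a spatial index instead of the quadratic loop shown here. The function name and the radius cap are hypothetical.

```python
def tangent_sphere_accessibility(points, normals, max_radius=10.0):
    """For each sample, the largest tangent-sphere radius that encloses no other sample.

    A sphere of radius r tangent at p has its center at p + r*n. Another sample q,
    with d = q - p, lies strictly inside when |d|^2 - 2*r*(d.n) < 0, so each q above
    the tangent plane limits r to |d|^2 / (2 * d.n).
    """
    radii = []
    for i, (p, n) in enumerate(zip(points, normals)):
        radius = max_radius
        for j, q in enumerate(points):
            if j == i:
                continue
            d = (q[0] - p[0], q[1] - p[1], q[2] - p[2])
            d_dot_n = d[0] * n[0] + d[1] * n[1] + d[2] * n[2]
            if d_dot_n <= 0.0:
                continue                   # at or below the tangent plane: never enclosed
            d_sq = d[0] * d[0] + d[1] * d[1] + d[2] * d[2]
            radius = min(radius, d_sq / (2.0 * d_dot_n))
        radii.append(radius)
    return radii
```

Mapping small radii to dark colors and large radii to light ones darkens the bottoms of grooves, which is the effect that makes chisel marks stand out.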

Logistical challenges

The Digital Michelangelo Project was as much a production project as a research project, and we therefore faced logistical challenges throughout its duration.

One significant challenge we faced was the size of our datasets. During 5 months of nearly round-the-clock scanning, we acquired 200 gigabytes of data. Our largest single dataset is of the David (figure 3b). It contains 400 individually aimed scans, comprising 2 billion polygons and 7,000 color images. Losslessly compressed, it occupies 32 gigabytes. Although most of the techniques used in this project are taken from the existing literature, the scale of our datasets precludes the use of many published techniques. In particular, we know of no geometric modeling package into which we can load a 2-billion polygon model, nor any geometric simplification algorithm that will accept a mesh of this size.
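A little arithmetic shows why these numbers are awkward. The sketch below assumes 1 gigabyte = 10^9 bytes and, for the in-memory estimate, a hypothetical indexed triangle mesh with 4-byte coordinates, 4-byte vertex indices, and roughly half as many vertices as triangles (a rule of thumb for closed meshes, not a property of our raw scans).

```python
POLYGONS = 2_000_000_000           # polygons in the David dataset
COMPRESSED_BYTES = 32 * 10**9      # 32 gigabytes, losslessly compressed

def bytes_per_polygon(total_bytes, polygons):
    """Average storage per polygon (here an upper bound, since color images are included)."""
    return total_bytes / polygons

def indexed_mesh_bytes(triangles, bytes_per_coord=4, bytes_per_index=4):
    """Rough in-memory size of an indexed triangle mesh (assumed layout, see above)."""
    vertices = triangles // 2                       # assumption: V is about T / 2
    vertex_bytes = vertices * 3 * bytes_per_coord   # x, y, z per vertex
    index_bytes = triangles * 3 * bytes_per_index   # three vertex indices per triangle
    return vertex_bytes + index_bytes

print(bytes_per_polygon(COMPRESSED_BYTES, POLYGONS))   # 16.0 bytes per polygon
print(indexed_mesh_bytes(POLYGONS) / 10**9)            # 36.0 gigabytes
```

Under these assumptions, even a bare mesh without colors or normals would need about 36 gigabytes of memory to load in one piece.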

A second logistical challenge we faced was ensuring the safety of the statues during scanning. Laser triangulation is fundamentally a non-contact digitization method; only light touches the artwork. Nevertheless, the digitization process involves positioning a scanner close to a precious artwork, so accidental collisions between the scanner and the statue are a constant threat. Our primary defense against collisions was our long standoff distance - over a meter. However, we found it difficult to maintain this standoff while also reaching all parts of a large statue. To protect statues from motorized collisions with the scanner, our design included an elaborate system of manual and automatic motion shutoffs and interlocks. To reduce the chance of damage in the case of inadvertent contact, our scanner head and pan-tilt assembly were encased in foam rubber. Despite these measures, our gantry was moved and aimed by human operators, so the possibility of operator error was omnipresent. To reduce this risk, our scanning teams included at least two people, one of whom acted as a spotter whenever the gantry was moved; we established rigid operating protocols; and we tried to get enough sleep.

A third logistical challenge we faced was the development of meaningful, equitable, and enforceable intellectual property agreements with the cultural institutions whose artistic patrimony we were digitizing. Since the goals of our project were scientific, our arrangement with the museums was simple and flexible: we are allowed to use and distribute our models and computer renderings for scientific use only. In the event we, or they, desire to use the models commercially, there will be further negotiations and probably the payment of royalties.

The corollary issues of distribution, verification, and enforcement, although difficult in principle, were simplified in the near term by the size of our datasets. In particular, they are too large to download over the Internet. Similarly, distinguishing our computer models from other models of Michelangelo's statues is not currently a problem, since none exist. In the long term, we are investigating methods of 3D digital watermarking as they apply to large geometric databases. However, this is still an open area for research.

Uses for our models

The first question people ask us about these models is whether we plan to use them to make copies of the statues for sale. We have no such plans. However, our technology can certainly be used to scan and replicate statues. Among the other clients we envision for these models are art historians, museum curators, educators, and the public.

For art historians, our methods provide a tool for answering specific geometric questions about statues. Questions we have been asked about Michelangelo's statues include computing the number of teeth in the chisels employed in carving the Unfinished Slaves, determining the smallest size block from which each of the allegories in the Medici Chapel could have been carved, and determining whether the giant statue of David is well balanced over his ankles, which have developed hairline cracks. Aside from answering specific questions like these, art historians envision computer models becoming a repository of information about specific works of art. Our model of the David, when it is completed, is expected to become the official record of diagnostics and restorations performed on this statue.

For educators, computer models provide a new tool for studying works of art. In a museum, we see most statues from a limited set of viewpoints. Computer models allow us to look at statues from any viewpoint, change the lighting, and so on. In the case of Michelangelo's statues, most of which are large, the available views are always from the ground looking up. Michelangelo knew this, and he designed his statues accordingly. Nevertheless, it is interesting and instructive to look at his statues from other viewpoints. Looking at the David from unusual directions has taught us many things about the statue's ingenious design (figure 3c).

For museum curators, while models displayed on a computer screen are not likely to replace the experience of walking around a statue, they can nevertheless enhance the experience. As an experiment, we allowed visitors in the Medici Chapel to manipulate our model of Michelangelo's statue of Dawn interactively. On the surface it seems ludicrous to place a computer in front of a statue, and on its screen to display a 3D model of that same statue. In practice, we found that the computer focuses visitors' attention on the statue and allows them to view it in a new way. By exploring the statue themselves, or by changing its virtual lighting, they turn the viewing of art into an active rather than a passive experience. The art museum becomes a hands-on museum.

Finally, for the public, we think that interactive viewing of computer models may eventually have the same impact on the plastic arts that high-quality art books have had on the graphic arts; they give the educated public a level of familiarity with great works of art that was previously possible only by traveling. Some will argue that this increased familiarity degrades the mystique of the original. We disagree; there are two good copies of Michelangelo's David in Florence, but tourists still line up for hours to see the original.

Side projects

Although the primary goal of this project was to scan statues, our team and equipment were involved in several other 3D scanning projects in Italy.

One such side project was the scanning of the architectural settings of Michelangelo's statues. For this purpose we used a time-of-flight laser scanner manufactured by Cyra Technologies. This scanner has a Z-resolution of 5mm and a typical X-Y sample spacing of 4mm at a distance of 10 meters. Its maximum calibrated range is 100 meters. Our prototype configuration included an ultra-high resolution (2K x 2.5K pixel) color camera. Using this scanner and an attached color camera, we built a colored 15-million polygon model of the Tribune del David in the Galleria dell'Accademia (figure 5) and a 50-million polygon model of the Medici Chapel. In this latter endeavor, we were aided by the fortuitous presence of a 25-meter-high scaffolding tower in the center of the chapel. By elevating our scanner to the intermediate floors of this tower, we were able to look both upward and downward on the architectural decorations, enabling us to build a computer model of unprecedented detail and completeness.

Unfortunately, our computer model of the Medici Chapel will be an irregular triangle mesh, like those produced by our other scanners. While a simplified version of this mesh might suffice for virtual reality flythroughs, it is not useful for most practical architectural applications. Converting this dataset into conventional graphical representations such as plans and elevation drawings is not easy. Converting it into a segmented, structured, and annotated architectural database is even harder. In fact, both are open research problems.

A second side project in which we were involved was the fusion of 3D scanning and other imaging modalities. One example was acquisition of a photographic dataset of Michelangelo's David under ultraviolet illumination. Using a calibrated camera and a simple correspondence-based method for locating the camera in 3-space, we mapped this data onto our 3D model, producing a per-vertex UV fluorescence map of the statue. This map, which shows the location of waxes and other organic materials, can be used when planning future cleanings and restorations of the David.
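Once the camera has been located in 3-space, mapping a photograph onto the model amounts to projecting each vertex into the image and reading off a pixel. The sketch below assumes an ideal pinhole camera with no lens distortion, nearest-pixel sampling, and no occlusion test (a real mapping must also skip vertices hidden behind other geometry). All names and parameters are hypothetical.

```python
def project(vertex, rotation, translation, focal, cx, cy):
    """Project a 3D point to pixel coordinates (u, v), or None if it is behind the camera.

    rotation is a 3x3 row-major matrix and translation a 3-vector taking model
    coordinates to camera coordinates; focal is in pixels, (cx, cy) is the image center.
    """
    xc = [sum(rotation[r][k] * vertex[k] for k in range(3)) + translation[r]
          for r in range(3)]
    if xc[2] <= 0.0:
        return None
    return (focal * xc[0] / xc[2] + cx, focal * xc[1] / xc[2] + cy)

def per_vertex_map(vertices, image, rotation, translation, focal, cx, cy):
    """Sample image[row][col] at each vertex's projection; None where it falls outside."""
    height, width = len(image), len(image[0])
    values = []
    for vertex in vertices:
        uv = project(vertex, rotation, translation, focal, cx, cy)
        if uv is None:
            values.append(None)
            continue
        col, row = int(round(uv[0])), int(round(uv[1]))
        inside = 0 <= row < height and 0 <= col < width
        values.append(image[row][col] if inside else None)
    return values
```

Vertices that receive None from one photograph can be filled in from another photograph taken from a different camera position.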

A second sensor fusion project was our acquisition of a dense light field (Levoy and Hanrahan, 1996) of Michelangelo's statue of Night. During 4 consecutive nights in the Medici Chapel, we replaced the 3D scanner head of our Cyberware gantry with a high-quality digital still camera, illuminated the statue with a fixed arrangement of spotlights, and acquired roughly 25,000 color pictures spaced 12.5mm apart in an arc around the statue (figure 4a,b). We have also scanned this statue using our laser scanner. Having a high-resolution 3D model and a dense light field of the same object provides a unique opportunity for exploring image-based representations in which the two datasets are combined. We intend to explore such combinations in the future.
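The size of this light field can be checked from the numbers given in the captions of figures 4a and 4b. The only assumptions in the sketch below are 3 bytes per pixel before compression and 1 gigabyte = 10^9 bytes.

```python
SLABS = 7                          # light field slabs (figure 4a)
IMAGES_PER_SLAB = 62 * 56          # planar array of images per slab
WIDTH, HEIGHT = 1300, 1000         # pixels per image (figure 4b)
BYTES_PER_PIXEL = 3                # assumption: 24-bit RGB before compression
JPEG_RATIO = 6                     # 6:1 compression

total_images = SLABS * IMAGES_PER_SLAB
bytes_per_image = WIDTH * HEIGHT * BYTES_PER_PIXEL / JPEG_RATIO
total_gigabytes = total_images * bytes_per_image / 10**9

print(IMAGES_PER_SLAB)             # 3472
print(total_images)                # 24304
print(round(total_gigabytes, 1))   # 15.8
```

This agrees with the roughly 25,000 pictures and 16GB quoted for the light field.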

The Forma Urbis Romae

A final side project is the scanning of the Forma Urbis Romae - the giant marble map of ancient Rome. The Forma Urbis is the single most important document on ancient Roman topography (Carettoni et al., 1960). Measuring 60 feet across, 45 feet high, and carved onto marble slabs several inches thick, it once graced the back wall of Templum Sacrae Urbis - the census bureau of Rome. Carved between A.D. 203 and 211, during the reign of Septimius Severus, it shows every street, building, room, and staircase in the city - a feat of mapmaking that has never been matched.

The map now lies in fragments - 1,163 of them. Moreover, these fragments represent only 15% of the original map surface. However, experts believe that due to the unique way the map fell from the wall and was buried by other rubble, this 15% is clustered in a few areas of the city. As a result, they believe that many of the fragments are likely to fit together. Fortunately, these areas are important ones, including portions of the imperial forums, the Colosseum, and the Palatine Hill. What does remain of the map is about 200 identified fragments, many of which have also been fit together, 500 unidentified fragments, some partially fit together and some not, and 400 fragments that have no incisions (Almeida, 1981). These blank fragments might correspond to the centers of plazas or the Tiber River, or they might represent the borders of the map - nobody knows.

Piecing this map together is one of the key unsolved problems in classical archeology. Our first idea was to search among digital photographs of the fragments for matches between the borders or incised designs on their top surfaces. Unfortunately, the top surfaces of the fragments are often eroded, reducing the effectiveness of such an approach. Moreover, scholars have been searching for 500 years for matches among these incised designs; it seems unlikely that we will find many more. On the other hand, the fragments are several inches thick, and fragments that do fit together usually mate intimately across at least a portion of the interface surface between them (figure 6). Our idea, not yet tested, is to develop compact signatures for these border surfaces and to search among the signatures for matches.

In order to test these ideas, we first needed to build 3D geometric and photographic models of every fragment of the map at a resolution of 1mm or finer. For this purpose, we moved our computers and scanners to Rome, we had an additional desktop laser scanner brought in from Palo Alto to increase our throughput, and during a 3-week 24-hour-a-day scanning marathon in May and June, we digitized every fragment of the map. In the months (and years) ahead, we will process and edit the data we have collected, then try to assemble the map.

Conclusion

Although we had run many back-of-the-envelope calculations in preparation for scanning the statues of Michelangelo, the actual difficulty of the task surprised us. In particular, the statues contained more recesses and partially occluded surfaces than we anticipated, and positioning our gantry to reach them required more time and effort than we imagined. We typically spent 50% of our time scanning the first 90% of the statue and 50% on the next 9%; the last 1% was unscannable. To improve on these numbers, a scanner system would need a more compact scanner head, more positional and angular flexibility, a variable standoff distance, automatic tracking of the gantry position, and a suite of automatic view planning software.

Another hard lesson we learned was about scattering of laser light from the crystal structure of marble. The effect of this scattering is to introduce noise into the range data (figure 3d). In joint experiments with researchers from the National Research Council of Canada, we found that this scattering is dependent on surface polish, highly dependent on incident beam angle, and can be reduced but not eliminated by narrowing the laser beam. Furthermore, although the scattering is related to laser speckle, it would occur even with incoherent illumination. In short, there appears to be a fundamental resolution limit for structured light scanning of marble surfaces.

Acknowledgements

The sponsors of the Digital Michelangelo Project are Stanford University, Interval Research Corporation, and the Paul G. Allen Foundation for the Arts. Our principal scanner and gantry were designed by a team led by Duane Fulk of Cyberware. For permission to scan the statues of Michelangelo, we thank Antonio Paolucci and Cristina Acidini Luchinat of the Superintendency of Fine Arts in Florence and Franca Falletti of the Accademia Gallery. For permission to scan the fragments of the Forma Urbis, we thank Eugenio La Rocca and Susanna Le Pera of the Superintendency of Museums, Galleries, Monuments, and Excavations of the City of Rome, and Anna Mura Somella and Laura Ferrea of the Capitoline Museum in Rome. The Digital Michelangelo Project represents the joint work of roughly 50 people from Stanford University, the University of Washington, Cyberware, and a half-dozen Italian museums and institutions. Rather than list all these people here, we refer the reader to our project web page: http://graphics.stanford.edu/projects/mich .

References

Almeida, R., Forma urbis marmorea: Aggiornamento generale, Rome, 1981.
Carettoni, G., Colini, A.M., Cozza, L., and Gatti, G., La Pianta Marmorea di Roma Antica (The Marble Map of Ancient Rome), Rome, 1960.
Curless, B., Levoy, M., "A Volumetric Method for Building Complex Models from Range Images," Proc. SIGGRAPH '96 (New Orleans, LA, August 5-9, 1996). In Computer Graphics Proceedings, Annual Conference Series, 1996, ACM SIGGRAPH, pp. 303-312.
De Tolnay, C., Michelangelo, Princeton University Press, 1947.
Ikeuchi, K., Sato, K., "Determining Reflectance Properties of an Object Using Range and Brightness Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 13, No. 11, November 1991, pp. 1139-1153.
Levoy, M., Hanrahan, P., "Light Field Rendering," Proc. SIGGRAPH '96 (New Orleans, LA, August 5-9, 1996). In Computer Graphics Proceedings, Annual Conference Series, 1996, ACM SIGGRAPH, pp. 31-42.
Miller, G., "Efficient Algorithms for Local and Global Accessibility Shading," Proc. SIGGRAPH '94 (Orlando, Florida, July 24-29, 1994). In Computer Graphics Proceedings, Annual Conference Series, 1994, ACM SIGGRAPH, pp. 319-326.
Pulli, K., "Multiview registration for large data sets," Proc. Second International Conference on 3D Digital Imaging and Modeling (October 4-8, 1999, Ottawa, Canada), IEEE Computer Society.

Figure 1: Our principal laser scanner undergoing testing in our temporary computer graphics laboratory in Florence. The orange arrows show the four degrees of motorized freedom of the gantry.

Figure 2a: A single raw scan of the head of Michelangelo's unfinished statue of St. Matthew. The X-Y resolution is 0.29mm. At this scale, individual chisel marks can be distinguished. The scan is from a single direction, so it contains holes. The color and shading are artificial.

Figure 2b: A 7.6-million polygon mesh produced by combining scans from several directions. Most of the holes have been filled. The coloring is derived from photographs of the statue as described in the text. The shading is still artificial.

Figure 2c: A non-photorealistic, accessibility-shaded coloring of the same mesh. To us, it seems to show the structure of Michelangelo's chisel marks more clearly than figures 2a or 2b.

Figure 3a: Our laser scanner gantry positioned in front of Michelangelo's David. The gantry weighs 1800 pounds and stands 7.5 meters tall. The 2' truss section level with David's foot was added at the last minute after we discovered that the statue is nearly 3 feet taller than the height given in most art history books (for example, De Tolnay, 1947).

Figure 3b: An 8-million polygon, 4-millimeter model of the David, constructed from our 2-billion polygon, 0.25-millimeter dataset of the statue. It was acquired over a period of 4 weeks by a crew of 22 people scanning 16 hours per day 7 days a week. The color and shading are artificial.

Figure 3c: A 1-million polygon, 2-millimeter model of David's head. This unusual view, a left profile, is never seen in art books. In fact, it is impossible to obtain this view of the actual statue, since the required viewpoint lies outside the museum walls. Doesn't he look like a Roman coin?

Figure 3d,e: A model of David's eye at the full resolution of our dataset, 0.25mm, and a photograph for comparison. The bumps at the back of the pupil are real - they are drill holes. However, the nodules along the eye's lower rim are not real - they are artifacts of our hole-filling algorithm. The roughness of the eyeball is also not real - it is caused by the scattering of our laser from marble crystals.

Figure 4a: A plan view of the layout of our light field of Michelangelo's Night. The light field is composed of 7 "light field slabs", shown here by yellow line segments. Each slab is a planar array of 62 x 56 images, with images spaced 12.5mm apart on the plane. The total number of images is thus 3,472 per light field slab, or 24,304 for the entire light field.

Figure 4b: A representative image from the light field. The image is 1300 x 1000 pixels, stored as a JPEG image with 6:1 compression, so the entire raw light field occupies 16GB. We expect that vector quantization (VQ) can do much better, but it will more seriously compromise image quality.

Figure 5: An orthographic view, rendered without color, of our 15-million polygon model of the Tribune del David in the Galleria dell'Accademia in Florence. The view is taken from below the floor looking up. This model, together with our more detailed models of Michelangelo's David, Slaves, and St. Matthew, allows us to create a nearly complete virtual reconstruction of this wing of the museum.

Figure 6: Four fragments of the Forma Urbis Romae, fit together in the early 1980's by Emilio Rodriguez-Almeida. As this photograph shows, the matches between fragments are not obvious from an examination of their top surfaces. These particular pieces do fit, however, as can be verified by examining the surfaces, largely hidden in this photograph, where they mate. (Photograph courtesy of Prof. Rodriguez-Almeida.)

levoy@cs.stanford.edu

