Introduction: A significant problem in computer science, which has become increasingly acute recently, is the automatic extraction and cataloging of desired features from large sets of complex images. A solution to this problem could have broad applicability.
As a prototype of this kind of problem, our group has chosen to attempt the automatic retrieval of lava tubes from the Clementine dataset. Lunar lava tubes have long been recognized as desirable locations for the placement of manned lunar bases. Advantages include: (1) little construction is needed; (2) building materials need not be lifted out of Earth's gravity well; (3) natural environmental control; (4) protection from cosmic rays, meteorites, micrometeorites, and impact crater ejecta.
Coombes and Hawke identified about 100 probable lava tubes associated with sinuous rilles in the Lunar Orbiter and Apollo photos, primarily in the nearside maria.
The lava tubes that are visible to Earth-based telescopes might be too large to provide good candidates for lunar bases. Such lava tubes of large diameter need a great depth of overlying rock to keep from collapsing. Any intact large tubes would lie inconveniently far underground. Most useful would be lava tubes that are too small to be discerned from Earth.
The Clementine spacecraft, which mapped the entire surface of the Moon to an unprecedented level of detail in 1994, gives us a view of these smaller lava tubes. Over 1.9 million images in the visible, near infrared, and mid-infrared portions of the spectrum were captured.
Our task is to find and catalog the small lava tubes in the Clementine dataset. Of particular interest are small sinuous rilles that contain interruptions, which represent uncollapsed portions of a tube that has partially collapsed. Once cataloged, the candidate base locations can be examined more closely for suitability. Considerations would be proximity to resources, sites of scientific interest, or favorable locations for siting of a railgun satellite launcher.
Clementine Imagery: Clementine captured images of the lunar surface in several spectral bands, spanning the visible, near infrared and long wavelength infrared. Collapsed lava tubes show up well in the visible part of the spectrum, provided that the sun angle is suitable. Of the 1.9 million images taken, 620,000 were high-resolution images in the visible spectral band. Manual examination of even a significant fraction of those images is far too time-consuming to be feasible. Some form of automated search is the only practical way to thoroughly analyze such a large number of images in a reasonable time.
Difficulty of Characterization: Lunar rilles are inherently difficult to characterize, which makes it hard to teach a computer to find them. Such geological features do not have a common form, or a characteristic diameter or length. Due to differences in topography, some have numerous sharp bends, while others are quite straight. Some appear in clusters, while others seem to be isolated from other rilles.
Lack of Ground Truth: Only twelve human beings have ever set foot on the Moon. Only two of them landed near a rille (Hadley Rille). Since no further human expeditions to the Moon are contemplated any time soon, it is not possible to verify that features identified as rilles are actually collapsed lava tubes. Any identifications based on currently available datasets cannot be absolutely verified.
Lacking on-site verification, the next best method to obtain accurate identifications is consensus ground truth. In this method, several experts independently evaluate a sample of the total data set. These identifications are then compared against each other, and the identifications on which most experts agree are considered to be valid.
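The consensus step described above can be illustrated with a minimal sketch. It assumes that each expert's work is recorded as a set of candidate identifiers marked as rilles, and that "most experts agree" means a strict majority. The function name `consensus_labels`, the `min_fraction` parameter, and the sample identifiers are hypothetical and are not taken from the project or from JARTool.

```python
from collections import Counter


def consensus_labels(expert_labels, min_fraction=0.5):
    """Return the candidate IDs marked by more than min_fraction of experts.

    expert_labels: dict mapping an expert's name to the set of candidate
    IDs that the expert independently marked as rilles (assumed format).
    """
    n_experts = len(expert_labels)
    if n_experts == 0:
        return set()

    # Count how many experts marked each candidate.
    votes = Counter()
    for marked in expert_labels.values():
        votes.update(set(marked))  # set() guards against duplicate marks

    # Keep candidates on which a strict majority of experts agree.
    return {cid for cid, n in votes.items() if n / n_experts > min_fraction}


# Hypothetical example: three experts label five candidate features.
labels = {
    "expert_a": {1, 2, 3},
    "expert_b": {2, 3, 4},
    "expert_c": {3, 5},
}
consensus = consensus_labels(labels)
```

In this example, candidates 2 and 3 are each marked by at least two of the three experts, so only they enter the consensus set. Raising `min_fraction` would demand stronger agreement at the cost of fewer training examples.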
Acquiring Appropriate Training Examples: An adaptive learning feature identification tool must be trained on examples of ground truth before it can find members of a desired class of features. The training set should be as representative as possible of the class of features being sought. In the case of lunar rilles, the consensus ground truth produced by human experts is the best that we can do. The goal is to create an automated system that is comparable to human experts in its ability to identify sinuous rilles caused by ancient lava flows.
An Adaptive Feature Recognition Tool: A similar but smaller-scale problem was faced by researchers at the California Institute of Technology and the Jet Propulsion Laboratory in searching the Magellan radar dataset for small volcanoes on the surface of Venus. An adaptive recognition tool named JARTool was developed for the automated analysis of large datasets. The Magellan dataset was used to test how effectively the tool recognizes target features and rejects features that resemble the targets but do not belong to the class.
The CIT/JPL team, led by M. C. Burl, used JARTool to find volcanoes in a set of 30,000 Magellan radar images that contain approximately 1 million small volcanoes. Burl's team developed an algorithm that proved to be effective at identifying volcanoes. It was based on a series of training images containing volcanoes identified by geologists, which were presented to JARTool before it was tasked with identifying volcanoes in the remaining images.
Applying JARTool to the Clementine Dataset: Our effort has adapted JARTool to identify sinuous rilles in the Clementine images of the lunar surface, particularly those with interruptions or gaps in the rille. We assume that such gaps represent uncollapsed segments of lava tubes. The goal of our project is to produce a catalog of uncollapsed lava tubes on the Moon. Researchers can then search the catalog for a wide variety of purposes, including finding the best candidates for lunar bases based on proximity to lunar resources or areas of scientific interest.
References: Coombes C. B. and Hawke B. R. (1988) Lunar Bases and Space Activities in the 21st Century, II, PGD #541. Burl M. C. et al. (1996) Trainable Cataloging for Digital Image Libraries with Applications to Volcano Detection, Computation and Neural Systems Technical Report CNS-TR-96-01.