

USGS Spectroscopy Lab


Imaging Spectroscopy:
Earth and Planetary Remote Sensing with the
USGS Tetracorder and Expert Systems

Roger N. Clark, Gregg A. Swayze, K. Eric Livo,
Raymond F. Kokaly, Steve J. Sutley, J. Brad Dalton,
Robert R. McDougal, and Carol A. Gent.

U.S. Geological Survey
Box 25046 Federal Center
Denver, CO 80225
(303) 236-1332
(303) 236-1425 FAX
rclark@usgs.gov

Journal of Geophysical Research, 2003.


Version 12f: 8/03/2002

Reference:
Clark, R. N., G. A. Swayze, K. E. Livo, R. F. Kokaly, S. J. Sutley, J. B. Dalton, R. R. McDougal, and C. A. Gent, Imaging spectroscopy: Earth and planetary remote sensing with the USGS Tetracorder and expert systems, J. Geophys. Res., 108(E12), 5131, doi:10.1029/2002JE001847, pages 5-1 to 5-44, December, 2003. http://speclab.cr.usgs.gov/PAPERS/tetracorder

PDF Version in 3 parts:
Main text (429 KBytes).
Figure captions (14.8 MBytes).
Appendix A (206 KBytes).

Full Tetracorder source code and command files are available via ftp by clicking here.
Tetracorder source code and history information.

Abstract

Imaging spectroscopy is a tool that can be used to spectrally identify and spatially map materials based on their specific chemical bonds. Spectroscopic analysis requires significantly more sophistication than has been employed in conventional broad-band remote sensing analysis. We describe a new system that is effective at material identification and mapping: a set of algorithms within an expert system decision-making framework that we call Tetracorder. The expertise in the system has been derived from scientific knowledge of spectral identification. The expert system rules are implemented in a decision tree where multiple algorithms are applied to spectral analysis; additional expert rules and algorithms can be applied based on initial results, and more decisions are made until spectral analysis is complete. Because certain spectral features are indicative of specific chemical bonds in materials, the system can accurately identify and map those materials. In this paper, we describe the framework of the decision-making process used for spectral identification, describe specific spectral feature analysis algorithms, and give examples of what analyses and types of maps are possible with imaging spectroscopy data. We also present the expert system rules that describe which diagnostic spectral features are used in the decision-making process for a set of spectra of minerals and other common materials.

We demonstrate the applications of Tetracorder to identify and map surface minerals, to detect sources of acid rock drainage, and to map vegetation species, ice, melting snow, water and water pollution, all with one set of expert system rules. Mineral mapping can aid in geologic mapping and fault detection, and can provide a better understanding of weathering, mineralization, hydrothermal alteration and other geologic processes. Environmental site assessment, such as mapping source areas of acid mine drainage, has resulted in the acceleration of site cleanup, saving millions of dollars and years in cleanup time. Imaging spectroscopy data and Tetracorder analysis can be used to study both terrestrial and planetary science problems. Imaging spectroscopy can be used to probe planetary systems, including their atmospheres, oceans and land surfaces.


Table of Contents

  • 1. Introduction
  • 2. Tetracorder Materials Detection Concept
    • 2.1 Overview
    • 2.2 Feature Isolation: Continuum Removal Algorithm
    • 2.3 Shape-Matching Algorithm
    • 2.4 Multiple Spectral Features: Weighted Results Algorithm
    • 2.5 Spectral Constraint Algorithms
    • 2.6 Not-Feature Algorithm
    • 2.7 Diagnostic/Optional Features
    • 2.8 Nothing Found is an Answer
    • 2.9 Grouping Decisions
  • 3. Tetracorder applied to imaging spectroscopy
    • 3.1 Comparison of Analysis Methods
    • 3.2 Decisions in real-world situations
    • 3.3 Tetracorder and Mixtures
  • 4. The Tetracorder Expert System
    • 4.1 Expert System Lessons Learned
    • 4.2 Tetracorder Groups and Cases
      • 4.2.0 Group 0: the global view
      • 4.2.1 Group 1: 1-µm broad region
      • 4.2.2 Group 2: 2-µm vibrational absorption region
      • 4.2.3 Group 3: Vegetation Chlorophyll Detection
      • 4.2.4 Group 4: Rare Earth Materials
      • 4.2.5 Case 1: Vegetation Red Edge
      • 4.2.6 Case 2: Vegetation Spectral Type
      • 4.2.7 Case 3, 4, 5: Vegetation Leaf Water Content
  • 5. Verification
    • 5.1 Known Deficiencies in the Presented Tetracorder and Expert System
  • 6. Applications
    • 6.1 Geologic Applications: Mapping Minerals and Amorphous Materials
    • 6.2 Environmental Applications
    • 6.3 Vegetation Species/Communities, Health/Senescence Indicators, and Green Leaf Water Abundance
    • 6.4 Water
    • 6.5 Ice and Snow
    • 6.6 Atmospheric Gases
    • 6.7 Other Planetary Applications
  • 7. Discussion and Conclusions
    • 7.1 Availability
  • Acknowledgments
  • Tables
  • References
  • Appendix A


1. Introduction

Spectroscopy is a tool that has been used for decades to identify, understand, and quantify solid, liquid or gaseous materials, especially in the laboratory. In disciplines ranging from astronomy to chemistry, spectroscopic measurements are used to detect absorption features due to specific chemical bonds, and detailed analyses are used to determine the abundance and physical state of the detected absorbing species. Spectroscopic measurements have a long history in the study of the Earth and planets (e.g. Hunt, 1977; Clark et al., 1990a; Pieters and Englert, 1993; Clark, 1999). Up to the 1990s, remote spectroscopic measurements of Earth and planets were dominated by multispectral imaging experiments that collect high quality images in a few, usually broad, spectral bands. However, a new generation of sensors is now available that combines imaging with spectroscopy to create the new discipline of imaging spectroscopy(1) (e.g. see Goetz et al., 1985; Rencz, 1999 and references therein). Imaging spectrometers acquire data with enough spectral range, resolution and sampling at every pixel in a raster image so that individual absorption features can be identified and spatially mapped (Goetz et al., 1985).

(1) Imaging spectroscopy has many names in the remote sensing community, including imaging spectrometry, hyperspectral imaging, and ultraspectral imaging. Ball (1995) argues that the term spectrometry be limited to measurements not including photons, as in mass spectrometry, leading Clark (1999) to argue for "imaging spectroscopy" as the appropriate term.

Some traditional approaches to remote sensing analysis born in the era of multispectral imaging are based on statistical methods exploiting the large number of samples (pixels) in the remotely sensed data. The demonstrated power of these approaches is vastly multiplied by the large increase in information content inherent when the number of spectral bands increases from order 10 to order 100. Such scene statistical results are by their nature scene-dependent and cannot be applied globally, and the statistical approaches do not exploit the information inherent in each individual spectrum concerning the chemical nature of the remotely sensed surface. Analysis tools from another discipline, signal processing, have had good success at detecting specific spectral signatures across data sets using vector filter approaches. However, this approach requires that the signature be stable from lab to field. Most geologic and many biologic materials do not meet this criterion.

Our team emerged from a spectroscopic discipline, which focuses on the inherent information in each spectrum. Over the past decade we have developed a software system that takes an explicitly spectroscopic approach to "hyperspectral" analysis. That is, the objective driving our system is to determine the chemical, mineralogical or biological nature of each spectrum as an individual, independent of the rest of the hundreds to millions of companion spectra in the data set. The analysis must also be done with such efficiency that this automated analysis can be performed on very large data sets in short periods of time. A central task we have focused on is robust detection of geologic and biologic materials from visible and near-infrared (IR) spectroscopic measurements. Our software system includes many other capabilities and is being continually expanded, but our material identification concept is our unique contribution and is the subject of this paper.

The spectral analysis system described here is called Tetracorder, paying homage to the "Tricorder" © remote analyzer of the Paramount Pictures Star Trek series. This paper will describe the Tetracorder material detection concept, and throughout we will use the term "Tetracorder" synonymously with the detection system. But it should be understood that the Tetracorder software system encompasses more than spectral detection: Tetracorder is a generalized application system where multiple algorithms and analyses can be commanded through an expert system rule set, and decisions about the analyses performed can steer further analyses in certain directions. All these capabilities are beyond the scope of this paper; therefore we will focus on material identification and mapping.

This paper will describe the concepts behind our Tetracorder material detection scheme with illustrative examples, then delve into aspects of the implementation of Tetracorder. We will then discuss verification of Tetracorder performance, which is accomplished with a combination of human verification of Tetracorder spectral analyses, field checking Tetracorder maps in situ, and through laboratory analysis of collected samples. We will present examples of Tetracorder analyses of terrestrial data sets, and close with implications for planetary science.

Tetracorder: a software program containing multiple algorithms which can be commanded as an expert system. This paper describes the software and expert system rule set for Tetracorder version 3.5. When we refer to "Tetracorder" in the text, we mean the Tetracorder software and the expert system driving the analysis.

Expert System: In this paper, for spectroscopic analysis and identification of materials, an expert system is a set of rules used to instruct algorithms to analyze spectral data to attain a certain result, such as the identification of minerals in a spectrum, including the influences of mixtures. The expert system rules presented here are the collective result of a team of spectroscopists, physicists, geologists, and botanists who analyzed spectra and imaging spectroscopy data sets at multiple sites and geologic environments for over a decade.

2. Tetracorder Materials Detection Concept

2.1 Overview

At its highest level, Tetracorder identifies materials by comparing a remotely sensed observed spectrum (the unknown) to a large library of spectra of well-characterized materials, but we do so using several innovations to maximize accuracy and performance. The first of our innovations is that in the comparison of a specific reference to the unknown, only the portions of the spectrum that are known to be diagnostic of the reference material are used. Every spectral feature is due to an interaction of photons of particular energies with the atoms and electrons within the chemical under study, and the nature of the absorption is largely unique to the specific chemical structure. At other wavelengths, photon interactions do not give rise to absorption; mostly transmission or scattering occurs. Taken together, the presence of spectrally "active" and "inactive" spectral regions for a material gives rise to the central concept of a diagnostic absorption feature. Diagnostic absorption features are unique to particular materials in shape (variation in intensity with wavelength over a narrow interval) and usually are concentrated in limited ranges of wavelength by type of absorption. Between diagnostic features are portions of the spectrum which contain little information specific to the material of interest. The focus on diagnostic features in analysis of natural scenes is critical because mixtures which obey nonlinear systematics (e.g. coatings, intimate mixtures, solutions) are common in the natural environment and frustrate simple matching of spectra.

In Tetracorder, each comparison of an unknown to a reference spectrum is highly tailored to the chemistry of the reference material by focusing on diagnostic spectral features (Figures 1A and 1B). The tailoring is based on specific expert knowledge of our team of spectroscopists, geologists and biologists. By neglecting portions of the spectrum that are irrelevant to the chemistry of the reference material, we reduce noise or clutter induced by these "inactive" wavelengths. A corollary of this innovation is that Tetracorder can, and routinely does, detect the presence of the influence of many materials in a spectrum. For example, the key spectral features of iron oxides are in the visible, while clay minerals exhibit diagnostic features between 2 and 2.5 microns. The presence of multiple materials may dilute the strength of their spectral features relative to those in a reference spectrum, but in many important and frequent cases, the signatures do not confound each other.



Figure 1A: Continuum removal process employed by the Tetracorder spectral feature shape matching algorithm. Three reference spectra are shown: goethite, jarosite and hematite. Each spectral feature has its own continuum end-points (illustrated by the boxes). The continuum is removed from both the observed and reference spectra. For example, the hematite 0.9-µm feature continuum is removed from the Cuprite unknown spectrum, then the goethite continuum is removed, and so on. This allows a specific comparison between each spectral library feature and the unknown. The spectra are offset for clarity.





Figure 1B. Spectral features as in Figure 1A, except for comparison with vibrational absorptions near 2.2 µm. Note that alunite also has a diagnostic absorption near 1.5 µm. The spectra are offset for clarity.


We learned early in our work that the degree of the similarity between reference and unknown (as quantified via a least squares shape-matching algorithm) was far from sufficient to allow robust detection. Above we cited the case where different materials have diagnostic spectral features that are well-separated in wavelength, but our large reference library contains many classes of materials that are chemically and spectrally similar, and even spectrally similar but chemically different (e.g. carbonates, organic compounds and some OH-bearing minerals have similar 2.3-µm absorptions). Though not identical, they are similar enough that noise and natural variations in field spectra make the assignment of the proper threshold at which to define identification or misidentification problematic. If we used the properties of only a single material in our identification, we found that for a very large number of materials there was no threshold setting to our least-squares fitting process which simultaneously provided a high probability of identification (low occurrence of missed identifications) with a simultaneously low probability of false alarm or misidentification owing to spectrally similar reference materials in the library. In other words, if we set the identification limit based on our goodness of fit criteria to ensure rejection of false identifications, we also failed to identify many locations where our material of interest was present. Further, the threshold levels seem to be different for different materials and what they are confused with. Conversely, if we set the limits to ensure identification, we admitted too many false alarms. In practice then, we find that in many important cases of comparison of a single material to a set of spectra, even when combined with proper attention to diagnostic features to reduce noise, there is no satisfactory way to map occurrence of materials in those cases with confidence.

Our solution to this problem, and our second innovation, is to quantitatively compare the similarity of an unknown spectrum to all entries in the library with similar diagnostic features. Over the set of library entries that constitute candidate detections, various similarity parameters are compared, and identification is assigned to the reference material with the greatest similarity to the unknown. Thus, Tetracorder not only compares the unknown's spectral properties to the spectral properties of each entry in the library, but the comparisons themselves are quantitatively compared, assessed, and judged to identify the components present.

We discovered that important but surprising false identifications remained even after the above processes were followed. In these cases very different materials share diagnostic spectral regions, and coincidentally have very similar shapes over these regions. Our shape-matching algorithm operates on data that have had the reflectance level and local slope removed and the band depth normalized over a diagnostic spectral region, and the coincidental similarities, which can result in identification ambiguities, almost always arise after this normalization. While the disparate materials in question may be similar in our primary diagnostic region as perceived by our normalization process, they are never similar at all other key wavelengths, or in terms of the local spectral parameters which have been normalized (reflectance, local slope and depth). Our third innovation is to mitigate these coincidental ambiguities using ancillary spectral information. We also use this same approach to resolve ambiguities involving related minerals that have similar diagnostic features but differ in straightforward and consistent ways at other wavelengths. Where possible, this approach complements and supersedes the comparison of goodness-of-fit values.

Our fourth innovation is to partition analyses across the spectrum. As different photon absorption processes tend to operate in different wavelength ranges, we split the spectral identification into several spectral regions that we call groups. This allows multiple components to be identified without the need for mixture analysis. For example, Figure 1A shows diagnostic absorptions due to electronic processes near 1 µm, while Figure 1B shows vibrational absorptions in the 1.5 to 2.5-µm region. It is clear that the AVIRIS spectrum in Figures 1A and 1B displays absorptions due to both electronic and vibrational processes.

Our fifth innovation is to allow Tetracorder to return a "no answer," that is, a non-detection. It is a frequent occurrence that a remotely sensed spectrum does not pass even the liberal thresholds necessary to assign any candidate detections. In other cases the constraints applied to resolve ambiguities result in a rejection of an unknown in every case. Tetracorder explicitly flags as a "no answer" any spectrum that has no similarities to entries in the library as defined in the expert system.
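The decision described above, picking the most similar candidate but allowing a non-detection, can be sketched in a few lines of Python. This is only an illustration and not the Tetracorder implementation: the function name `identify`, the single global `threshold`, and the fit values are all hypothetical, and the text notes that workable thresholds differ from material to material.

```python
def identify(fits, threshold):
    """Return the best-matching reference name, or None for "no answer".

    fits      -- dict mapping reference-material name to a similarity value
    threshold -- hypothetical minimum similarity for a candidate detection
    """
    # Keep only references that pass the (illustrative) detection threshold
    candidates = {name: fit for name, fit in fits.items() if fit >= threshold}
    if not candidates:
        return None  # a non-detection is a valid result
    # Among the candidates, choose the one most similar to the unknown
    return max(candidates, key=candidates.get)

# Hypothetical fit values for two unknown spectra
pixel_a = {"kaolinite": 0.996, "halloysite": 0.963, "muscovite": 0.0}
pixel_b = {"kaolinite": 0.41, "halloysite": 0.38, "muscovite": 0.0}

answer_a = identify(pixel_a, threshold=0.9)
answer_b = identify(pixel_b, threshold=0.9)
```

Here `answer_a` is a material name, while `answer_b` is `None`, which a mapping step would record as "no answer" and not force into the nearest class.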

In summary, Tetracorder identifies materials by comparing them to a large spectral library. Our recognition that material spectral signatures are significant only in their diagnostic wavelengths allows detection of more than one material in a remotely sensed spectrum for common and important combinations of materials. Tetracorder mitigates false alarms caused by spectrally similar materials by quantitatively comparing the degree of similarity of an unknown to a set of spectrally similar reference spectra. Tetracorder mitigates coincidental false alarms permitted by our specific implementation of our shape-matching algorithm by including ancillary information. Finally, Tetracorder is not forced to provide a solution; it allows "No Answer" as an answer.

Implicit in our approach is that our library is comprehensive and unambiguous. In our experience we have found that for our applications it approaches comprehensive status because we have added materials as they have been found in field tests or for specific applications. However, Tetracorder cannot resolve ambiguities inherent to spectroscopy. Tetracorder can go no further than cases where experts differ on the assignment of specific spectral features. For example, the controversy surrounding whether spectra of some portions of Europa's surface represent hydrated salts or radiolytic products cannot be resolved by our approach. However, Tetracorder does solve the problem of how to inspect large numbers of spectra in a way that mimics the method of a trained spectroscopist.

In the following portions of this section we will describe our implementation of the above concept. We will begin with our shape-matching algorithm and its attendant normalizations. We will then illustrate our rejection of false alarms due to similar materials with specific examples. We will also show examples of coincidental false alarms. We will show specific examples of non-detections ("No Answer"s). Closing this section we will describe the expert system framework in which we implement Tetracorder.

First, we describe Tetracorder algorithms, which are the individual algorithms the Tetracorder system applies during analysis. It is our collective opinion that robust material identification with spectroscopy involves diagnostic spectral features, as these are the "fingerprints" of any material. Many analyses of spectra rely on unique spectral characteristics for identification (e.g. Rencz, 1999 and references therein) and the Tetracorder system is similar. The algorithms implemented in this version of Tetracorder isolate and analyze spectral features and their continua, because the levels and slopes in a spectrum contain diagnostic information as do the absorption and emission features.



Figure 2A. The continuum removed spectra from Figure 1A are fit to each other using a modified least squares calculation. The library reference feature strength is increased or decreased to best match the observed feature. Tetracorder compares the least-squares fits to many features from many library reference spectra to determine which one matches best. The solid line in each case is the unknown and the dash double dotted line is the library reference feature. For each feature, the least squares correlation coefficients (the fits) are given, and along a vertical central column, the weighted fits are shown. The best match to the Cuprite spectrum is hematite. Hematite has two features used in the identification: the 0.9 µm feature gives a fit of 0.988, and the 0.5 µm feature gives a fit of 0.965. The area-weighted fit is 0.974.





Figure 2B. The continuum removed spectra from Figure 1B are fit together using a modified least squares calculation. Kaolinite is the best match to the Cuprite spectrum. The muscovite spectrum has two features, one near 2.2 and the other near 2.3 µm. No 2.3-µm muscovite feature could be detected in the Cuprite spectrum, so the weighted fit is zero (left hand column). Note the very similar fits between kaolinite (0.996) and halloysite (0.963), yet the halloysite profile clearly does not match as well as the kaolinite profile. This illustrates that small differences in fit numbers are significant. Alunite has two diagnostic spectral features, but the 1.5-µm feature is not shown.


2.2 Feature Isolation: Continuum Removal Algorithm

In order to identify a spectral feature by its wavelength position and shape, it must be isolated from other effects, such as level changes and slopes due to other absorbing (or emitting) materials. The first step in such isolation is continuum definition and removal (Clark and Roush, 1984). Continuum removal examples are shown in Figures 1 and 2 for several spectral features. A continuum is removed by division in reflectance, transmittance, and emittance spectra because of exponential absorption and scattering processes (Clark and Roush, 1984). Conversely, a continuum should be removed by subtraction with absorbance or absorption coefficient spectra because multiple components are additive.

To isolate and identify absorption features, the continuum removal algorithm first removes a continuum from a library reference spectrum and from the observed spectrum using a wavelength interval on each side of the absorption feature that is to be mapped (Figures 1A, 1B, 2A, and 2B). This can be described mathematically by:

Lc(w) = L(w) / Cl(w) and Oc(w) = O(w) / Co(w), (eqn 1a and 1b)

where L(w) is the library spectrum as a function of wavelength, w, O is the observed spectrum, Cl is the continuum for the library spectrum, Co is the continuum for the observed spectrum, Lc is the continuum-removed library spectrum, and Oc is the continuum-removed observed spectrum.
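A minimal Python sketch of equation 1 for a straight-line continuum follows. It is an illustration under stated assumptions, not the Tetracorder code: the function name `remove_continuum`, the choice to average the channels in each continuum interval to get the two end points, and the synthetic spectrum are all hypothetical.

```python
import numpy as np

def remove_continuum(wavelengths, spectrum, left_interval, right_interval):
    """Divide a spectrum by a straight-line continuum (eqn 1).

    left_interval, right_interval -- (min, max) wavelength ranges on each
    side of the feature. Averaging the channels inside each interval gives
    the two continuum end points (an assumption of this sketch).
    """
    w = np.asarray(wavelengths, dtype=float)
    s = np.asarray(spectrum, dtype=float)
    end_w, end_s = [], []
    for lo, hi in (left_interval, right_interval):
        mask = (w >= lo) & (w <= hi)
        if not mask.any():
            raise ValueError("continuum interval contains no channels")
        end_w.append(w[mask].mean())
        end_s.append(s[mask].mean())
    # Straight line through the two end points, evaluated at every channel
    slope = (end_s[1] - end_s[0]) / (end_w[1] - end_w[0])
    continuum = end_s[0] + slope * (w - end_w[0])
    # Keep only the channels spanned by the feature and its intervals
    inside = (w >= left_interval[0]) & (w <= right_interval[1])
    return w[inside], s[inside] / continuum[inside]

# Synthetic test spectrum: sloping continuum times a Gaussian absorption
w = np.linspace(2.0, 2.4, 81)
line = 0.5 + 0.2 * (w - 2.0)
band = 1.0 - 0.3 * np.exp(-((w - 2.2) / 0.02) ** 2)
w_feat, cr = remove_continuum(w, line * band, (2.05, 2.08), (2.32, 2.35))
```

Because the continuum is removed by division, the sloping background drops out and `cr` recovers the isolated absorption shape, close to 1.0 at the end points and 0.7 at the band center. The same function would be applied to both the library spectrum L and the observed spectrum O with the same intervals.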

Most remotely sensed spectra are composed of mixtures, not pure materials, and as such will have spectral curves that combine to produce a continuum upon which diagnostic absorptions may be superimposed. The continuum removal algorithm removes the effects of these other absorptions in the spectrum (Clark and Roush, 1984; Clark 1999). For instance, a sloping continuum modifies the appearance of an absorption feature by causing a shift of the local minimum in the curve (e.g. see Clark, 1999), and can also result in the absence of a local apparent minimum. If the minimum in the reflectance spectrum was used as a guide, the apparent minimum would shift with changes in contaminants and grain size. However, if the continuum is removed, the minima show a more stable position. Continuum removal normalizes the spectra, thus reducing the effects of lighting geometry on the level of the spectrum, as well as effects of contaminants and grain size variations (see Clark, 1999 and references therein).

Including enough wavelength range in the spectral data is important to accurately define the continuum. Thus, the definition of each continuum includes a wavelength interval on each side of the feature (Figures 1 and 2). We have implemented straight line continua in Tetracorder. The normalization results in a continuum-removed feature such as those in Figures 2A and 2B, which can then be compared to other spectra such as reference spectra of pure materials. In Figure 1A, which shows spectra from some materials in Appendix A, continuum endpoints for diagnostic iron absorptions in goethite, hematite, and jarosite are shown along with a remotely sensed spectrum from Cuprite NV. Strong atmospheric water absorptions at 1.4 and 1.9 µm in the remotely sensed spectrum have the potential to interfere with spectral identifications, as do water absorptions in the surface materials, hence the continuum endpoints are selected to avoid these regions. This is most obvious in the selection of endpoints in the jarosite spectrum (Figure 1A), where the right side of the jarosite absorption band extends into the atmospheric water band. Similarly, in Figure 1B, which shows spectra from materials in Appendix A, the alunite continuum endpoints were selected to avoid the edge of the atmospheric water absorption. Such careful selection of continuum endpoints is crucial to the knowledge base of the expert system.

2.3 Shape-Matching Algorithm

The apparent depth, or strength of an absorption feature relative to the continuum is dependent on the intrinsic absorption strength, the grain size and abundance of the material as well as the abundance, absorbing nature, and grain sizes of the other materials mixed with the sample (e.g. Clark and Roush, 1984). The spectral feature depth is generally proportional to the abundance of the material in the sample (holding grain size constant). The depth of a feature increases to a maximum with larger grain size, then decreases as absorption dominates over scattering (Clark and Lucey, 1984; Lucey and Clark, 1985; Clark, 1999). The apparent depth of an absorption feature, D, relative to the surrounding continuum in a reflectance or emittance spectrum (Clark and Roush, 1984) is:

D = 1 - Rb/Rc, (eqn 2)

where Rb is the reflectance at the absorption-band center (the minimum in the continuum-removed feature), and Rc is the reflectance value of the continuum at the wavelength of the band center (Figure 3).
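As a small worked example of equation 2, the Python sketch below computes the band depth from a feature that has already been continuum-removed. In that case each channel already holds R/Rc, so Rb/Rc is simply the minimum value. The function name `band_depth` and the sample values are hypothetical.

```python
import numpy as np

def band_depth(continuum_removed):
    """Apparent band depth D = 1 - Rb/Rc (eqn 2).

    The input is a continuum-removed feature, so every value is R/Rc and
    the band center is taken as the channel with the minimum value.
    """
    cr = np.asarray(continuum_removed, dtype=float)
    center = int(np.argmin(cr))          # index of the band center
    return 1.0 - float(cr[center]), center

# Hypothetical continuum-removed feature sampled at seven channels
depth, center_index = band_depth([1.00, 0.95, 0.80, 0.72, 0.85, 0.97, 1.00])
```

For these values the band center is the fourth channel and the depth is 0.28.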



Figure 3. Characteristics of an absorption feature. A continuum interval is chosen on each side of the feature to reduce noise. The continuum intervals in this example are about 30nm wide. A continuum is fit between the end points. The reflectance at the band center (Rb) and the corresponding continuum reflectance at the band center (Rc) are found to compute the band depth, D. The continuum is removed by division from both the library reference spectrum and from the unknown.


Owing to the presence of many materials, the diagnostic spectral features of materials measured remotely are almost always much weaker than those of pure reference materials. In addition, variations in lighting and local topographic slope affect the apparent reflectance level of materials. These factors do not allow direct comparison of spectra of reference materials to remotely sensed spectra except in highly specialized cases. Because geologic and biologic materials are almost always complex mixtures, comparisons must be made after normalizations that remove the complicating effects and isolate the diagnostic shape of the spectral features in question.

The Tetracorder shape-matching algorithm is carried out in a two-step process. First, the local spectral slope (the "continuum") is estimated and removed from both the reference and the observed (unknown) spectra by fitting a straight line to predetermined wavelengths that straddle the diagnostic spectral region (or regions) of these spectra, then dividing these lines out of the observed and reference spectra (equations 1a and 1b). The continuum wavelength ranges for all materials and their diagnostic spectral features in our library are presented in an electronic supplement, Appendix A. Table 1 shows a couple of examples from the expert system.

Because of the near universal weakness of remotely sensed features relative to those of pure materials, the intensity of the features must also be normalized prior to comparison. For example, the features in the AVIRIS spectrum from Cuprite, Nevada in Figures 1A and 1B are weaker than the features in any of the reference spectra. Tetracorder normalizes the intensity of the reference to that of the unknown by changing the spectral contrast of the continuum-removed reference over the diagnostic range to best match the continuum-removed unknown spectrum over the same range (Figures 2A and 2B). The continuum-removed depth (which we call "spectral contrast") in a reference library spectrum absorption feature can be modified by a simple additive constant, k, so that a shape match between the unknown and reference feature can be performed. We simultaneously perform the comparison between reference and unknown by determining the contrast that maximizes the correlation between reference and unknown. Equation 3 governs this process:

Lc' = (Lc + k) / (1.0 + k), (eqn 3)

where Lc' is the modified, continuum-removed spectrum that best matches the observed spectrum. If k is less than zero, feature strength (spectral contrast) increases; if greater than zero, feature strength decreases. Equation 3 can be rewritten in the form:

Lc' = a + bLc, (eqn 4)

where

a = k /(1.0 + k), and

b = 1.0/(1.0 + k). (eqn 5)
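The step from equation 3 to equations 4 and 5 is short, and writing it out also shows how k changes the band depth. The derivation below uses only the symbols already defined; the primed depth D' is notation introduced here for the depth of the modified feature.

```latex
\begin{align*}
L_c' &= \frac{L_c + k}{1 + k}
      = \underbrace{\frac{k}{1 + k}}_{a} + \underbrace{\frac{1}{1 + k}}_{b}\, L_c ,
      \qquad a + b = 1, \\[4pt]
D'   &= 1 - L_c' = \frac{1 - L_c}{1 + k} = b\,(1 - L_c) = b\,D .
\end{align*}
```

So the modified feature keeps its shape and has its depth scaled by b = 1/(1 + k). For example, k = 1 halves the depth (b = 0.5), while k = -0.5 doubles it (b = 2), consistent with the statement that negative k increases spectral contrast.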

Equation 4 linearizes the spectral feature strength problem, so a direct solution can be found without iteration. In Equation 4 we want to find the a and b that give the best fit to the observed spectrum Oc. The solution is found using standard linear least squares:

$$a = \frac{\sum O_c - b \sum L_c}{n}, \qquad b = \frac{\sum O_c L_c - \left(\sum O_c \sum L_c\right)/n}{\sum L_c^2 - \left(\sum L_c\right)^2/n}, \qquad \text{and}$$

$$k = \frac{1 - b}{b}, \qquad \text{also:} \quad k = \frac{a}{1 - a}$$
(eqn 6)

where n is the number of spectral channels in the fit.

Finally, the correlation coefficient, F, of the fit is derived for that feature:

$$b' = \frac{\sum O_c L_c - \left(\sum O_c \sum L_c\right)/n}{\sum O_c^2 - \left(\sum O_c\right)^2/n}, \qquad \text{and} \qquad F = \left(b\,b'\right)^{1/2}.$$
(eqn 7)

The fit, F, is a measure of how well the spectral features match. Tetracorder uses the highest fit value to decide which spectral feature is best matched by a given library reference feature, independently of the feature depth, and thus, independent of the abundance of the material. Figures 2A and 2B illustrate graphically and numerically the matches between a remotely sensed spectrum and several Tetracorder library minerals.
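Equations 6 and 7 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Tetracorder implementation: the name `feature_fit` is hypothetical, both inputs are assumed to be continuum-removed and sampled on the same channels, and the handling of a zero or negative b is left to the caller.

```python
import math

def feature_fit(observed, reference):
    """Least-squares fit of Oc ~ a + b*Lc (equation 4).

    Returns (a, b, k, F) following equations 6 and 7. k is None when
    b is zero, because equation 6 is undefined there.
    """
    n = len(observed)
    if n != len(reference) or n < 2:
        raise ValueError("spectra must have the same length (at least 2)")
    so, sl = sum(observed), sum(reference)
    sol = sum(o * l for o, l in zip(observed, reference))
    soo = sum(o * o for o in observed)
    sll = sum(l * l for l in reference)

    cov = sol - so * sl / n      # shared numerator of b and b'
    var_l = sll - sl * sl / n    # denominator of b
    var_o = soo - so * so / n    # denominator of b'
    if var_l <= 0 or var_o <= 0:
        raise ValueError("a flat spectrum has no feature to fit")

    b = cov / var_l
    a = (so - b * sl) / n
    b_prime = cov / var_o
    k = (1.0 - b) / b if b != 0 else None
    # b * b' is the squared correlation, so F does not carry the sign of b;
    # a negative b (anti-correlated shapes) must be checked separately.
    fit = math.sqrt(b * b_prime)
    return a, b, k, fit
```

For a reference feature `[1.0, 0.8, 0.6, 0.8, 1.0]` and an observed feature of exactly half that depth, `[1.0, 0.9, 0.8, 0.9, 1.0]`, the fit gives a = 0.5, b = 0.5, k = 1 and F = 1: the shapes match perfectly even though the observed feature is weaker.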

The effect of different mixing types on shape-matching is shown in Figures 4A and 4B. Figure 4A shows the very distinct difference between spectra of linear (areal) and intimate mixtures of alunite and jarosite. Both shape-matching and linear unmixing algorithms that included all wavelengths would have difficulty properly identifying cases involving the same components and abundances but different mixture types. However, the spectral shapes of the mixtures in diagnostic regions of the two minerals are similar (Figure 4B) and the components could be properly identified.



Figure 4A. Reflectance spectra of alunite, jarosite and mixtures of the two. Two mixture types are shown: intimate and areal. In the intimate mixture, the darker of the two spectral components tends to dominate at any given wavelength. In an areal mixture, the brighter component dominates. The areal mixture is strictly a linear combination and was computed from the end-members, whereas the intimate mixture is non-linear and the spectrum of the physical mixture was measured in the laboratory. Jarosite dominates the 0.3 to 1.4-µm wavelength region in the intimate mixture because of the strong absorption in jarosite at those wavelengths and because the jarosite is finer grained than the alunite and tends to coat the larger alunite grains.





Figure 4B. Continuum-removed spectral features of alunite and alunite plus jarosite mixture spectra from Figure 4A. The features for the pure alunite, intimate and areal mixtures are nearly identical. The Tetracorder feature least squares fit of the pure alunite feature to the intimate mix feature has a correlation coefficient of 0.986 and for the areal mix 0.979. Variations in grain size and partial vegetation cover contributing to an imaging spectrometer pixel would cause variations in fits of similar magnitude.


Variations in grain size usually do not lead to strong variations in band shape, so Tetracorder's task at these wavelengths is to remove the effects of the continuum, reflectance level, and band intensity in order to carry out shape comparisons with reference spectra. For example, continuum-removed and intensity-normalized spectra of the mineral hypersthene from the pyroxene mineral group, as a function of grain size, are shown in Figure 5. For a wide range of grain sizes, the shape of the diagnostic feature is similar. Clay minerals, which naturally have fine grain sizes, show much less variation than do pyroxenes. However, some absorptions are so intense that they are saturated and their widths change with grain size. Hematite and goethite absorptions in the UV and near 0.9 µm often display these properties (e.g. see Clark, 1999). In these cases, the shape of the feature can indicate grain size independent of abundance. The pyroxene spectra in Figure 5 show this effect weakly. Very large pyroxene grains can be spectrally distinguished from very fine grains in this case. However, the pyroxene absorptions shift in wavelength as a function of composition, making grain size determinations difficult to separate from compositional variations if a sample is composed of more than one pyroxene composition.



Figure 5: Reflectance spectra of a pyroxene as a function of grain size. As the grain size becomes larger, more light is absorbed, the reflectance decreases, and the absorption feature bottoms flatten (from Clark et al., 1993b). Note the trace tremolite contamination causing the narrow absorption features near 1.4 and 2.3 µm. The broader pyroxene absorptions form the continuum background to the narrow tremolite features. This example shows how the components in a mixture can be readily identified even though no unmixing analysis is done. The component features are "spectrally separated" in wavelength. Continuum-removed feature fits (top) show the similarity in shape of features at different grain sizes. The small change in shape can be used to coarsely determine grain size from the spectra, independent of abundance.


2.4 Multiple Spectral Features: Weighted Results Algorithm.

For many materials (Table 1, Appendix A) we have defined only a single diagnostic spectral feature in the 0.4 to 2.5-µm spectral range for use in Tetracorder, despite the fact that for many of these materials other diagnostic features exist. In many of these cases, one of the multiple diagnostic features present is considerably stronger than the others. We have found in practice that including diagnostic features that are too weak actually degrades performance because the ratio of spectral contrast to sensor noise is low and inclusion of these features adds noise to the weighted fit (Swayze et al., 2002). However, in a few cases multiple diagnostic features are strong and we have chosen to exploit them. Because the Tetracorder concept includes direct comparison of quality of fit metrics, the use of multiple features raises the issue of how to normalize our fitness metrics so fits to different materials can be compared.

Tetracorder uses the relative sizes (including the widths) of reference continuum-removed spectral features to compute a weighted fit which is used in the decision process. The continuum removal and feature fits to multiple features in a spectrum are illustrated in Figures 1A, 1B, 2A, and 2B. Three parameters for each spectrum are computed: weighted fit, Fw, weighted depth, Dw, and weighted fit times depth (fit*depth), FDw. Each feature's contribution is weighted by the relative area of the corresponding absorption feature in the reference library spectrum:

$$F_w = \sum_i c_i F_i, \qquad D_w = \sum_i c_i D_i, \qquad \text{and} \qquad FD_w = \sum_i c_i F_i D_i,$$
(eqn 8)

where i is the feature number and ci is the area between each library reference feature and its continuum, expressed as a fraction of the total area of all features for that reference material, so that:

$$\sum_i c_i = 1.0.$$
(eqn 9)

Fi, Di are the fits (the correlation coefficients) and depths of the corresponding features. The feature depths and relative areas are calculated from the fitted library reference features. The relative area is found by integration of the continuum-removed feature (the area between the feature curve and 1.0) divided by the sum of the areas of all features analyzed for each reference material. Consider an observed spectrum with weak absorption features. The calculation of the areas of these features may be dominated by noise and could lead to bias in the decision process. Thus, the areas are computed from the library reference spectra. This is also a computational advantage because they are computed only once and then used in tests against multiple unknown spectra.
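The weighting in equations 8 and 9 can be sketched as follows. The function names are illustrative, and the trapezoidal integration of the area between the continuum-removed reference feature and 1.0 is an assumption about how the integral might be evaluated, not a description of the Tetracorder code.

```python
def feature_area(wavelengths, continuum_removed):
    """Trapezoidal area between a continuum-removed feature and 1.0."""
    area = 0.0
    for i in range(1, len(wavelengths)):
        d0 = 1.0 - continuum_removed[i - 1]
        d1 = 1.0 - continuum_removed[i]
        area += 0.5 * (d0 + d1) * (wavelengths[i] - wavelengths[i - 1])
    return area

def weighted_results(reference_areas, fits, depths):
    """Return (Fw, Dw, FDw) from equation 8.

    reference_areas come from the library reference features only, so
    they can be computed once and reused for every unknown spectrum.
    """
    total = sum(reference_areas)
    if total <= 0:
        raise ValueError("reference features must have positive total area")
    c = [a / total for a in reference_areas]   # equation 9: sum(c) == 1
    fw = sum(ci * fi for ci, fi in zip(c, fits))
    dw = sum(ci * di for ci, di in zip(c, depths))
    fdw = sum(ci * fi * di for ci, fi, di in zip(c, fits, depths))
    return fw, dw, fdw
```

With two features weighted 0.7 and 0.3, fits of 0.9 and 0.8, and depths of 0.2 and 0.1 (illustrative numbers), the result is Fw = 0.87, Dw = 0.17 and FDw = 0.15.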

The areas of features help choose the correct solution. Consider the two muscovite features in Figure 1B (top) near 2.2 and 2.35 µm. We will call them feature A and B, respectively. Feature A has a weight of 0.7 and feature B has a weight of 0.3. If an observed spectrum had two features with areas A=0.2, and B=0.8, it would not be a pure muscovite. The fit to feature A would be given more weight, but being weaker, it would be more dominated by noise. The weighted fit would likely be lower and another mineral or mineral mixture would likely show a better fit, thus reducing the probability that pure muscovite is the correct answer.

In the above discussion, we described how we derive our key metrics used for candidate identifications, but sometimes these are not final decisions. In the next section we describe how we use these metrics, and other constraints, to finalize our decisions.

2.5 Spectral Constraint Algorithms

In the shape matching exercise performed on each library entry for an unknown we extract all the constraints we require to perform and refine detection. The identification of materials from their spectra is constrained by 1) the goodness of fit of a spectral feature to a reference, 2) reflectance level, 3) continuum slope, and 4) presence or absence of key ancillary spectral features.

If a spectral match is too poor, we assume either that the material is not present, or that signal-to-noise ratios were too low to allow detection. If the fit metric is below a set threshold, the material is rejected as not detected. It is possible, and frequently happens, that no library entry passes this constraint which leads to Tetracorder declaring a "no answer," rather than forcing a solution dominated by noise. Tetracorder evaluates constraints imposed on each spectral analysis (example constraints are shown in the command set in Table 1).

Because some materials or combinations of materials can mimic one another in terms of shape, we add additional constraints. The first is that the reflectance level of the continuum of the observed spectrum must be consistent with the presence of a particular candidate material. If not, that particular identification will be rejected. Figure 6 shows an example of a continuum difference between two dissimilar materials (water and olivine). If only a continuum-removed spectral feature were analyzed, they would appear similar. Indeed, we have encountered conditions where shallow water mapped as iron-bearing minerals such as olivine. In these cases, we know water has a low reflectance, and has a different continuum slope than many minerals (continuum slope is discussed below).



Figure 6. Continuum-removed spectral features sometimes have similar overall shapes for different materials. Here, an example for olivine and water is shown. The blue scattering peak at green wavelengths is similar in position to the green peak in olivine. When the continuum is removed from both spectra (top), the broad olivine feature roughly tracks the water response. Tetracorder would normally only compare the correlation coefficients of the fitted features to check for olivine. Constraining reflectance levels and the slope of the continuum can help distinguish between the two cases. Water has a lower reflectance and a negative continuum slope but olivine has a higher reflectance and positive slope.


In setting reflectance thresholds, consideration must be given to lighting conditions, contaminants that might be present, and in special cases, like water, what other effects might influence levels and spectral features. For example, shadows are dark and, on the Earth, are illuminated by the blue sky, so spectra of shadows can be similar to those of lakes. By setting continuum level constraints, dark shadows can be rejected from being identified as certain materials. The constraint can be set to a minimum and maximum for each spectral feature. If a continuum level does not fit within the minimum and maximum specified, the candidate material's detection is rejected. We also use this constraint frequently to reject very dark materials because detection is signal-to-noise ratio dependent. The darker a spectrum, the more spectral features are suppressed when in an intimate mixture and the higher the signal-to-noise ratio required to detect that feature. Thus, for many materials the threshold is set at 4% reflectance to eliminate false alarms owing to noise. In principle the fit constraint should catch low signal-to-noise ratio cases, but noise can cause features to occasionally pass this constraint. The continuum reflectance constraint catches additional cases. In practice we employ the right, center and/or left continuum level limits. We should note that continuum level constraints are not a match to the reference material continuum, because mixtures can change both the average reflectance and local spectral slopes.

Example continuum constraints are shown in Table 1. Examine the entries for "Lawn_Grass GDS91." The continuum constraints require a minimum reflectance of 0.05 for the 0.69-µm chlorophyll feature and 0.10 for the 0.95 and 1.15-µm water absorptions. This requires an increasing reflectance from visible to near-IR wavelengths as is commonly observed in spectra of vegetation. The levels are set lower than typical grass reflectance levels because the vegetation may be illuminated at a low sun angle (e.g. north slope of a mountain in the northern hemisphere).
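The continuum level test can be sketched as a simple range check per feature. The minima below are the Lawn_Grass values quoted above; the function names, the dictionary layout and the observed levels in the usage note are hypothetical and do not reproduce the Table 1 command syntax.

```python
def within_continuum_limits(level, minimum=None, maximum=None):
    """True if a continuum reflectance level lies inside the allowed range.

    A limit of None means that side of the range is unconstrained.
    """
    if minimum is not None and level < minimum:
        return False
    if maximum is not None and level > maximum:
        return False
    return True

# Minimum continuum reflectance per feature, from the Lawn_Grass example.
LAWN_GRASS_MINIMA = {"0.69um": 0.05, "0.95um": 0.10, "1.15um": 0.10}

def passes_level_constraints(observed_levels, minima):
    """Reject the candidate if any feature's continuum level is too low."""
    return all(
        within_continuum_limits(observed_levels[name], minimum=limit)
        for name, limit in minima.items()
    )
```

A sunlit grass pixel with continuum levels of, say, 0.06, 0.30 and 0.28 passes, whereas a deeply shadowed pixel at 0.02, 0.03 and 0.03 is rejected regardless of how well its feature shapes happen to fit.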

Just as the continuum reflectance level must be consistent with experience, so too must the local spectral slope across the continuum. The slope constraint we apply is:

slope = Oleft/Oright > X, or

slope = Oright/Oleft > X, (eqn 10)

where Oleft and Oright are the observed continuum levels on the left and right sides of the feature center, respectively, and X is a threshold. If the slope in the spectrum does not exceed the threshold set in equation 10, the fit for that feature (equation 7) is set to zero. For example, this constraint is needed in the mapping of iron-bearing soils. The curvature of the water spectrum in the near IR is sometimes very similar to the hematite absorption feature after continuum normalization, so occasionally the goodness-of-fit metric can pass remotely sensed water as an Fe-bearing mineral. However, a strong positive visible slope is a required characteristic of some minerals such as hematite, in addition to the absorption feature shape. Water is distinguished from Fe-bearing minerals based on slope: an approximately flat or positive sloping spectrum is indicative of Fe-bearing soils, while a negative sloping spectrum is indicative of water. An example of the slope difference between water and an Fe-bearing mineral is shown in Figure 6.
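A minimal sketch of the slope constraint in equation 10, assuming the caller chooses which of the two ratio forms applies to a given feature. The function name, the `form` argument and the numeric values in the usage note are illustrative only.

```python
def slope_constrained_fit(fit, o_left, o_right, threshold, form="right/left"):
    """Return the fit unchanged if the slope test passes, else 0.0.

    form selects which ratio of equation 10 is tested:
    "right/left" requires Oright/Oleft > threshold,
    "left/right" requires Oleft/Oright > threshold.
    """
    if form == "right/left":
        slope = o_right / o_left
    elif form == "left/right":
        slope = o_left / o_right
    else:
        raise ValueError("form must be 'right/left' or 'left/right'")
    return fit if slope > threshold else 0.0
```

With a threshold of 1.0 on Oright/Oleft, a soil-like spectrum rising from 0.10 to 0.20 keeps its fit, while a water-like spectrum falling from 0.08 to 0.04 has its fit set to zero even if the continuum-removed shapes match well.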

2.6 Not-Feature Algorithm

Despite our best efforts including the constraints above, some materials or mixtures of materials have virtually identical spectral features at some diagnostic wavelengths, which makes them too difficult to distinguish there. However, invariably they differ markedly at other wavelengths. For example, Figure 1B shows spectra of the minerals montmorillonite [(Na,Ca)0.33(Al,Mg)2Si4O10(OH)2*nH2O] and muscovite [KAl2(Si3Al)O10(OH,F)2]. These minerals share a very similar and strong diagnostic absorption near 2.21 µm. Clearly in this case the term "diagnostic" is used loosely. More strictly, the 2.21 µm feature is compelling evidence for the presence of montmorillonite, illite or muscovite. However, montmorillonite has no absorption feature near 2.35 µm but both illite and muscovite have a feature there. Therefore, in testing for montmorillonite, we attempt to detect a 2.35 µm feature. If such a feature is detected, montmorillonite is rejected. In this particular case, we borrow the detection parameters from the Tetracorder test for the 2.35 µm feature of muscovite. If the depth and fitness parameters for this feature exceed our defined thresholds we consider the feature detected, and therefore montmorillonite must be rejected. An analogous example is illustrated in Figures 1B and 2B. In this case, the feature fits show there is no 2.3-µm feature present in the Cuprite AVIRIS spectrum shown, therefore muscovite cannot be present and is rejected (while Figures 1B and 2B show what appears to be one continuum from about 2.15 to 2.43 µm, really two features are defined with a common continuum interval near 2.3 µm).

The detection of a feature in the wavelength position of a NOT feature rule can be accomplished by measuring more than one metric. The metrics currently implemented are band depth threshold (equation 2) and threshold relative to the strength of another feature. For example, in Table 1, the NOT condition to detect montmorillonite is a 2.3-µm muscovite feature that is 12% the strength of the montmorillonite 2.2-µm feature. If the 2.35-µm feature strength was determined to be greater than 12%, montmorillonite would be rejected. In practice, identifying montmorillonite-muscovite (or illite) mixtures is possible. The 2.35-µm feature normally is about 30% the strength of the 2.2-µm feature in muscovites and illites, so constructing several tests with different levels of the NOT feature could, in principle, be used to derive several levels of montmorillonite + muscovite/illite mixtures.
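The relative-strength form of the NOT rule can be sketched as below. The 12% ratio is the montmorillonite example from the text; the function name, the separate fit threshold and its value are assumptions made for illustration.

```python
def not_feature_rejects(primary_depth, not_depth, not_fit,
                        relative_threshold=0.12, fit_threshold=0.5):
    """True if the NOT feature is detected, so the candidate is rejected.

    The NOT feature counts as detected when its fit passes fit_threshold
    (hypothetical value) and its depth exceeds relative_threshold times
    the depth of the candidate's primary feature.
    """
    if primary_depth <= 0:
        return False  # no primary feature to compare against
    detected = not_fit >= fit_threshold
    return detected and (not_depth / primary_depth) > relative_threshold
```

For a candidate montmorillonite with a 2.2-µm depth of 0.10, a well-fit 2.35-µm feature of depth 0.03 (30% of the primary, typical of muscovite or illite) triggers rejection, while a 2.35-µm depth of 0.005 (5%) does not.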

2.7 Diagnostic/Optional Features

Some materials have spectra with less-intense and/or subordinate absorption features in addition to stronger diagnostic absorptions. The diagnostic absorptions may be detectable if the abundance of the material is high enough. However, weaker absorptions might be concealed by absorptions from other materials or might be too weak to be detected at low abundances of the material. In Tetracorder, every feature is assigned as either diagnostic or optional. If a feature is defined as diagnostic or optional, and it is detected in the spectrum, its weighted fit will be included in the analysis and decision process. If an optional feature is not detected, its fit and depth are set to zero and the material might still be identified by the presence of other absorption features (either diagnostic or optional). Of course if a feature is defined as diagnostic, the feature must be detected in the spectrum to identify that material. If any diagnostic feature is not detected, even though other features (diagnostic or optional) for that material are detected, that material is not indicated by the spectrum and is declared to be undetected (fit, depth, and fit*depth are all set to zero).
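The diagnostic/optional logic described above can be sketched as a small function. The dictionary keys and the function name are hypothetical; the weights are assumed to be the reference-area weights ci of equation 9.

```python
def combine_features(features):
    """Return (Fw, Dw, FDw) honoring diagnostic and optional flags.

    Each feature is a dict with keys: weight, fit, depth, detected,
    diagnostic. A missing diagnostic feature zeroes all three results;
    a missing optional feature contributes zero fit and zero depth.
    """
    if any(f["diagnostic"] and not f["detected"] for f in features):
        return 0.0, 0.0, 0.0
    fw = dw = fdw = 0.0
    for f in features:
        fit = f["fit"] if f["detected"] else 0.0
        depth = f["depth"] if f["detected"] else 0.0
        fw += f["weight"] * fit
        dw += f["weight"] * depth
        fdw += f["weight"] * fit * depth
    return fw, dw, fdw
```

Note that an undetected optional feature still lowers the weighted fit, because its weight remains in the sum; the material is simply not ruled out on that basis alone.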

Care was used in developing the expert system command set because a material's diagnostic feature could potentially be masked by absorptions in other materials. If that is a possibility, the feature would be flagged as optional and not diagnostic. For example, hematite and goethite spectra (Figures 1A and 2A) have features near 0.5 and 0.9 µm. The shorter wavelength features in both minerals are more intense, and when hematite or goethite is present in low abundance, the 0.9-µm feature can be absent. Thus, the 0.9-µm feature for both minerals is optional and the 0.5-µm feature is diagnostic in the Tetracorder expert command set.

2.8 Nothing Found is an Answer

Tetracorder's spectral feature identification algorithm and supporting constraints are not forced to return a detection. Diagnostic spectral features must be present in the spectrum to find that material in the spectrum. Indeed, it is possible that all Tetracorder output is "nothing found." Not finding a material can be important. For example, in doing an environmental assessment, not finding a toxic material can be an indicator of environmental health.

2.9 Grouping Decisions

The final aspect of the materials detection process in Tetracorder is explicitly dealing with broad classes of detections. Above we discussed how mixtures of materials might have well-separated diagnostic features, in which case all such materials can be detected in a spectrum given sufficient signal-to-noise ratio. We also discussed how we deal with materials that have similar or overlapping diagnostic features by comparing the fitness parameters and exploiting ancillary spectral information. In order to achieve the various detections, Tetracorder analyzes portions of the spectrum by partitioning or isolating portions of the spectrum for various tasks needed to identify different materials. We call this partitioning groups and cases.

Tetracorder makes decisions by explicitly grouping reference materials by the wavelength ranges of their diagnostic features. For example, vegetation and iron bearing minerals with Fe2+ and Fe3+ absorptions occupy one group with their diagnostic features in the visible and very near infrared. Clays and other sheet silicates are grouped because their diagnostic features dominate the 2 to 2.5-µm region. Within a group, spectral features can confound one another so Tetracorder selects a single library entry as present. Thus, to identify mixtures within a group, reference spectra for those mixtures must be included. On the other hand, Tetracorder can in principle (and usually in practice) report a detection from each group, and therefore finds multiple components without specific reference mixture spectra. Mixtures will be discussed in more detail in Sections 3.2 and 3.3. Section 4.2 defines the groups, cases and their wavelength ranges.

Partitioned analyses can be done in parallel or serially, and Tetracorder does both. Groups are partitioned decision-making analyses that are done independently but in parallel before doing any case analyses. A case is an independent analysis partition completed only after all the group analyses are completed and a specific decision is made to perform a case analysis. Case analyses can be both parallel and sequential and can invoke additional case analyses. A group or case analysis can operate in one of two modes: 1) do one analysis and output an answer for that analysis, or 2) do multiple analyses, examining the results from all the analyses in that group or case, and decide which analysis gives the best answer and output that result.

Figures 1 and 2 show an example of partitioned analyses for 2 groups. In Figures 1A and 2A, multiple spectral features are compared and a decision is made to identify hematite as the best answer (Figure 2A). In Figures 1B and 2B, multiple analyses lead to the decision that kaolinite is the best answer (Figure 2B). Again, in each group, only one answer is chosen as the correct answer, but multiple groups lead to multiple answers. In this example, the Cuprite spectrum is identified as a mixture of hematite and kaolinite, even though only pure mineral spectra were used to identify these materials.
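The one-answer-per-group decision can be sketched as follows, assuming each candidate has already been reduced to a single fit value (zero if any constraint rejected it). The function name, group labels and fit values are illustrative and are not results from the Cuprite analysis.

```python
def best_per_group(results):
    """Pick the highest-fit material in each group.

    results is an iterable of (group, material, fit) tuples. A group
    whose candidates all have zero fit maps to None ("nothing found").
    """
    best = {}
    for group, material, fit in results:
        current = best.setdefault(group, (None, 0.0))
        if fit > current[1]:
            best[group] = (material, fit)
    return {group: material for group, (material, _) in best.items()}
```

Given candidates such as `("visible", "hematite", 0.92)`, `("visible", "goethite", 0.85)`, `("2-micron", "kaolinite", 0.95)` and `("2-micron", "alunite", 0.80)`, the function returns hematite for the first group and kaolinite for the second, which is how a mixture can be reported from pure-mineral references alone.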

3. Tetracorder applied to imaging spectroscopy

Tetracorder was developed largely in response to the potential of imaging spectroscopy data sets provided by NASA via the Earth Observing System HIRIS, the Galileo NIMS, the Cassini VIMS and potential Mars imaging spectrometers. However, excellent data sets became available for testing using the JPL Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor, for which testing and verification at test sites were accessible by our team. AVIRIS is arguably the highest performing and most widely used visible and near-IR imaging spectrometer operational today for terrestrial remote sensing. With a spectral sampling and bandpass of ~10 nm and signal-to-noise ratios exceeding 500 at most wavelengths, AVIRIS data are extremely useful for spectral analysis at the individual spectrum level. Corrections of AVIRIS data for atmospheric transmission are well developed, effective and routinely applied so our applications of Tetracorder detection can ignore atmospheric effects, except in portions of the spectrum where transmittance is so low that essentially no reflected signal is present in the data. In this section we will use AVIRIS data collected in 1995 over the Cuprite Mining District (Figure 7) in Nevada to illustrate the processes described above.



Figure 7A. Pseudo-true color composite AVIRIS image of Cuprite, Nevada. The image has a width of 10.5 km (614 pixels) and a length of 18 kilometers (972 pixels). The spacing between pixels is 17 meters, and the size of each pixel is about 20 meters.





Figure 7B. The same AVIRIS data shown in Figure 7A were used to synthesize Landsat TM response for each of the 6 TM bands in the AVIRIS spectral range. The synthesized data have the same spatial resolution as AVIRIS (17 meters not 30 meters of actual TM data) so only the effects of spectral bandpass and sampling are compared. The TM band ratios shown here are commonly used by researchers to discriminate surface materials, and the many colors seen here show the power of this method. The color composite shows the hydrothermal alteration system appearing different from surrounding unaltered areas, but specific mineral identifications are not possible.


3.1 Comparison of Analysis Methods

Let's first examine some common analysis methods, and compare results to a spectroscopic analysis. This range of examples shows that the ability to discriminate materials does not improve much until the analysis becomes quite sophisticated. This may explain why some in the scientific community have not seen the advantage of imaging spectroscopy. Simple spectral analyses produce results that are hardly any better than those achievable with data from broader-band multispectral systems. To illustrate these effects, we synthesized broader-band systems from the AVIRIS data so that there is no change in spatial resolution. The signal-to-noise ratio of the synthesized bands is very high, much higher than that of actual systems such as TM, and the synthesized data have better atmospheric correction. Thus, the examples and observed limitations in the simpler methods are not limited by signal-to-noise ratio.

Multispectral systems contain only a few broad spectral channels. Spectral analysis is not possible with such systems, so more simplistic approaches have been developed to produce "indicator" maps. The ratio of the signals through two different filters (or bands) is called a band ratio and is probably the simplest computation beyond a color image. A color image derived from band ratios is called a color-ratio composite.

A pseudo-true color representation made from three broad bands simulating the color response of the human eye (Figure 7A) shows lighter areas that may be indicative of hydrothermal alteration. Band ratios can be computed from measurements of any two spectral channels, whether from broad bands or narrow spectral bands. The broad bands of Landsat Thematic Mapper (TM) were computed from the AVIRIS data and ratios between the 5 broad bands were calculated. A color image of TM band ratios, called a color-ratio composite (Figure 7B), shows that many surface materials are distinguished, but specific mineralogic identifications cannot be made. Further, sometimes (but not always) the same color is caused by completely different mineralogy (see verified mineralogy in Swayze, 1997). Images such as these are guides for field investigations, and are not mineral maps.

In the next example, let's assume we were interested in finding locations of the clay mineral kaolinite [Al2Si2O5(OH)4]. Well-crystallized kaolinite is the primary mineral used in the production of ceramics. Kaolinite is also commonly found in hydrothermal alteration systems which may contain deposits of economically valuable minerals, such as gold. Weathering of pyrite-rich hydrothermally-altered rocks can produce acidic waters which can pollute drinking water sources (e.g. Swayze et al., 2000 and references therein). Hydrothermal systems may also have provided an environment where life evolved on the Earth, and possibly on Mars as well (Shock, 1996; Farmer, 1996 and references therein). So, for this discussion, let us assume we are interested in locating well-crystallized kaolinite and not other minerals in the kaolinite group, nor other clay minerals.

The sequence of images in Figure 8A to 8G shows the results of increasingly sophisticated analyses. A kaolinite spectrum is shown below each image to illustrate the analysis method. Our reference spectral library (Clark et al., 1993a) entry for kaolinite shows it has a high reflectance (>0.7) at visible wavelengths, and a reflectance >0.7 at 2.1 µm, higher than the reflectance typically found for soils (reflectance ~0.2). Could reflectance alone show kaolinite occurrence? The infrared albedo image (Figure 8A) shows many levels of intensity (apparent surface reflectance) and does not discriminate kaolinite from other minerals.



Figures 8A-C. Cuprite analysis examples. A) 2.07 µm albedo image. B) The AVIRIS-synthesized TM band ratio image is shown for TM band 5 divided by band 7, which indicates a decreasing spectral reflectance from 1.5 to 2.2 µm. Such slopes are common in spectra of clay minerals, but are also common in spectra of carbonates, sulfates, and generally moist or wet soils, rocks, or other materials. C) Image of the AVIRIS 2.07-µm reflectance divided by the 2.17-µm reflectance shows possible kaolinite absorption or strong spectral slope between these two wavelengths. This ratio should discriminate absorptions near 2.2 µm, but surprisingly, there is little difference compared to the TM ratio image in Figure 8B.





Figures 8D-E. D) A "three-point band depth" using AVIRIS data for Cuprite, Nevada shows locations where an absorption feature, centered near 2.2 µm, like that in kaolinite, is expressed in spectra of surface materials. In all these images, brighter levels indicate a greater spectral abundance of that material. E) A corresponding "three-point band depth" for alunite shows locations where an absorption feature, centered near 2.17 µm, is expressed in spectra of surface materials. There is little difference between the alunite and kaolinite 3-point band depth images, showing that more sophistication is needed to spectrally discriminate between these two minerals.





Figures 8F-G. F) A spectral feature shape match was applied to the imaging spectrometer data with a kaolinite 2.2-µm feature shape. The resulting image distinguishes kaolinite better, but spectral features from other minerals also contribute to the bright areas in the image. G) The Tetracorder decision results, shown as an image of kaolinite locations, indicate much less kaolinite than is seen in the other images. The Tetracorder expert system has determined locations of kaolinite versus other minerals based on comparison to a library of spectral features from many materials.


A commonly used "clay" mineral discriminator is a TM band ratio: TM channel 5 divided by 7. A high 5/7 ratio indicates a decreasing slope to the spectrum from 1.6 to 2.2 µm, a trait commonly found in clay minerals. The ratio image of TM bands 5/7 (Figure 8B) shows areas of decreasing IR slope as lighter portions of the image. The TM ratio image shows the two alteration centers in the middle of the image as lighter, indicating possible clay content, but does not distinguish kaolinite versus other clay mineralogy. Similar slopes are also found in spectra of any mineral or soil containing OH or water and in spectra of carbonates (e.g. Clark et al., 1993a) so the spectral slope is not a unique indicator of kaolinite.

The higher spectral resolution of AVIRIS allows refinement of the band ratio concept (Figure 8C). A "narrow-band-ratio" image computed from the reflectance at wavelength 2.07 µm divided by the reflectance at 2.17 µm shows the spectral slope over a short wavelength range (lighter in the image means greater slope), and has a better probability of indicating the presence of a narrow absorption feature than does a broad-band ratio. In this case, a few more areas are darker in Figure 8C than in Figure 8B. It may be somewhat surprising that there is so little difference between Figures 8B and 8C. That is because there are several minerals in the imaged area that have strong absorption near 2.2 µm, including alunite [KAl3(SO4)2(OH)6], muscovite [KAl2Si3O10(OH)2], and the clay montmorillonite [(Na,Ca)0.33(Al,Mg)2Si4O10(OH)2*nH2O].
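A narrow-band ratio for a single spectrum can be sketched as below. Choosing the channel nearest to each requested wavelength is an assumption, as are the function names; applying the function to every pixel would produce a ratio image like the one described.

```python
def nearest_channel(wavelengths, target):
    """Index of the channel whose center wavelength is closest to target."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target))

def band_ratio(wavelengths, reflectance, numerator_um, denominator_um):
    """Reflectance at one wavelength divided by reflectance at another."""
    i = nearest_channel(wavelengths, numerator_um)
    j = nearest_channel(wavelengths, denominator_um)
    return reflectance[i] / reflectance[j]
```

A ratio well above 1 for 2.07 µm over 2.17 µm indicates falling reflectance between the two channels, but, as the text notes, it cannot say which mineral is responsible.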

Because AVIRIS spectra resolve absorption bands, a simple "three-point band depth" image can be computed (Figure 8D). Such a depth computation further restricts the analysis to detect a relative minimum between the two "continuum" end points. The image is coded to show increasing absorption strength as increasing brightness level (whiter). The results in Figure 8B, C, and D show a lot of similarity, which could lead the analyst to conclude there is extensive kaolinite in the region. However, as noted above, other minerals also have absorptions near 2.2 µm that overlap the kaolinite absorption. A three-point band depth computation for alunite (Figure 8E) shows nearly the same image as that for the kaolinite 3-point band depth. Field checking shows that neither the alunite nor kaolinite is as extensive as indicated by these images (Swayze et al., 1992, Clark et al., 1993b, Swayze, 1997). More sophistication is required to derive the correct locations of kaolinite deposits.
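
A three-point band depth can be sketched as follows, assuming the common definition of band depth as one minus the ratio of the reflectance at band center to a straight-line continuum interpolated between two end points. The wavelengths and reflectance values are illustrative and are not those used to produce Figures 8D and 8E; the function name is hypothetical.

```python
import numpy as np

def three_point_band_depth(wavelengths, reflectance, left_um, center_um, right_um):
    """Depth of a band relative to a linear continuum through two end points.

    reflectance may be a single spectrum or a cube with the spectral axis last.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)

    def nearest(target_um):
        return int(np.argmin(np.abs(wavelengths - target_um)))

    il, ic, ir = nearest(left_um), nearest(center_um), nearest(right_um)
    # Fractional position of the band center between the continuum end points.
    t = (wavelengths[ic] - wavelengths[il]) / (wavelengths[ir] - wavelengths[il])
    continuum = reflectance[..., il] + t * (reflectance[..., ir] - reflectance[..., il])
    return 1.0 - reflectance[..., ic] / continuum

# Illustrative three-channel spectrum with a minimum between the end points.
depth = three_point_band_depth([2.12, 2.21, 2.30], [0.60, 0.45, 0.60], 2.12, 2.21, 2.30)
```

Because only three channels enter the computation, any material with a minimum near the chosen center wavelength produces a response, which is the ambiguity described above.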

A simple 3-point band-depth analysis does not examine the unique shape of the kaolinite doublet feature nor distinguish it from similar absorptions in spectra of other minerals. Our shape matching algorithm (equations 2-8) was applied to the Cuprite data with a kaolinite spectral library reference spectrum (Figure 8F). Note that more of the imaged area is now dark compared to the images in Figures 8B-E. The shape matching gives results closer to the actual locations of kaolinite, but it is still not completely correct. Other minerals with absorptions near 2.2 µm also show a match to the least squares fit, but usually at reduced intensity (note the alluvial fans are darker than the source regions for the fans). Alunite, in particular, gives a response very similar to kaolinite in this analysis as it did in the 3-point band depth analysis (Figure 8D, E).
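
The idea of a least squares shape match can be sketched as below. This is a simplified stand-in rather than a transcription of equations 2-8: it assumes both spectra are already continuum removed, fits the observed feature as a linear function of the reference feature, reports the correlation coefficient as the fit, and scales the reference depth by the fitted slope. The function name and the synthetic Gaussian feature are hypothetical.

```python
import numpy as np

def feature_fit(observed_cr, reference_cr):
    """Least squares shape match of a continuum-removed reference to an observation.

    Returns (fit, depth): fit is the correlation coefficient of the linear fit
    and depth is the reference band depth scaled by the fitted slope.
    """
    o = np.asarray(observed_cr, dtype=float)
    l = np.asarray(reference_cr, dtype=float)
    if np.std(o) < 1e-12:
        return 0.0, 0.0                      # no spectral structure to match
    slope, _intercept = np.polyfit(l, o, 1)  # o ~ intercept + slope * l
    if slope <= 0.0:
        return 0.0, 0.0                      # inverted feature: treat as no match (assumption)
    fit = float(np.corrcoef(l, o)[0, 1])
    depth = float(slope * (1.0 - l.min()))
    return fit, depth

# Synthetic reference feature near 2.2 um and an observation at half its strength.
w = np.linspace(2.10, 2.30, 21)
reference = 1.0 - 0.30 * np.exp(-0.5 * ((w - 2.20) / 0.02) ** 2)
observed = 0.5 + 0.5 * reference
fit, depth = feature_fit(observed, reference)
```

In this sketch a weaker feature of identical shape keeps a fit near 1 while its depth falls, which is why fit and depth carry different information.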

It should be clear from these examples that there is no simple algorithm that can be applied to imaging spectroscopy data to map a single mineral (or material) without inadvertent inclusion of other materials. There are too many other common materials and minerals with absorption features similar to each other for simple analyses, like shape matching algorithms, to map materials robustly. Successful materials mapping must be able to distinguish between materials with similar spectral properties.

The Tetracorder map of kaolinite is shown in Figure 8G. The distinguishing step is the Tetracorder decision process: evaluating multiple tests of the spectral feature to determine the best match. The difference between the images in Figures 8F and 8G is that, for most of the pixels that respond in 8F, Tetracorder determined that the pixels contain minerals other than well-crystallized kaolinite. This is the quantum leap of the Tetracorder analysis methods over the previous generations of analyses: the comparison of competing results and the decision process allow materials to be uniquely distinguished, identified and mapped.

A simple pattern matching algorithm, like that in Figure 8F, produces image maps that are dependent on how much the image is contrast stretched. Compare Figure 8F to 8H. Any curvature to the spectrum will give a response in the shape matching analysis. If the resulting image is stretched hard, so that the smallest absorption depths show as white, one might conclude that kaolinite is present throughout many parts of the image (Figure 8H). This has led to the analyst needing to adjust images based on their knowledge of the area to indicate what is there. In other words, the results are subjective.


image FIGURES/fig8h-i.gif

Figures 8H-I. These two images, H and I, are the same as in Figure 8F and 8G, except that all of the DNs greater than zero in the image are stretched to appear white. The feature fit image (H) shows kaolinite-like response over large areas of the image, where we know kaolinite is not present. The result from such a simplistic analysis is subjective and depends on how much the analyst stretches the resulting image. The Tetracorder result (I), however, shows little change from Figure 8G. The main difference is that low abundance areas of kaolinite now appear white in the image. The result agrees with field work (Swayze, 1997).


The decision making process of Tetracorder, however, is more objective. Tetracorder has made the decisions as to which spectra indicate the presence of kaolinite, and which ones indicate other minerals through its spectral identification process. In our Cuprite example, Figure 8I shows the maximum stretch for the Tetracorder analysis, and it appears similar to the less stretched image in Figure 8G. Compare that result to the maximum stretch in the least squares single feature analysis (Figure 8H), which shows large portions of the image as kaolinite compared to the less stretched Figure 8F image. The Tetracorder results are less subjective because the decision process has robustly identified spectral features. The hard stretch of the Tetracorder result (Figure 8I) shows where the well-crystallized kaolinite occurs with high confidence even at low feature strengths (which correlate with low abundance). The kaolinite mapped in Figures 8G and 8I agrees with field observations and verifications (Swayze, 1997 and references therein).

Because Tetracorder spectrally identifies materials, a color coded map of materials can be constructed that presents more information than a single material image (as in Figure 8), and is more specific than a color-ratio composite (as in Figure 7B). If the image in Figure 8G is colored (in this example, yellow) such that a stronger absorption feature strength indicates a brighter intensity of that color, and other mineral images are coded as different colors, the color-coded images can be combined into mineral maps (Figures 9A and 9B). Mineral maps such as these have been extensively field checked to confirm the accuracy of the algorithm (e.g. Swayze, 1997). Comparison of the images in Figures 7 and 8 to those in Figure 9 shows how much more information can be derived from imaging spectroscopy compared to that from broad-band remote sensing. The minerals in Figure 9A correspond to the Tetracorder group analysis for the electronic absorptions in the visible and near-IR, while those in Figure 9B correspond to the group analysis for vibrational absorptions occurring primarily in the 2-2.5 µm wavelength region.
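
The color coding can be sketched as below, assuming each material image holds a feature strength per pixel that is zero wherever that material was not identified. The function name, the choice of colors other than yellow for kaolinite, and the per-image normalization are assumptions made for illustration only.

```python
import numpy as np

def color_coded_map(strength_images, colors):
    """Combine single-material strength images into one RGB image.

    strength_images: dict of material name -> 2-D array (0 where not identified)
    colors:          dict of material name -> (r, g, b) with components in 0..1
    """
    names = list(strength_images)
    shape = np.asarray(strength_images[names[0]]).shape
    rgb = np.zeros(shape + (3,), dtype=float)
    for name in names:
        image = np.asarray(strength_images[name], dtype=float)
        peak = image.max()
        scaled = image / peak if peak > 0 else image  # stronger feature -> brighter
        rgb += scaled[..., None] * np.asarray(colors[name], dtype=float)
    return np.clip(rgb, 0.0, 1.0)

# Two hypothetical material images over a 1 x 2 pixel scene.
strengths = {
    "kaolinite": np.array([[0.20, 0.00]]),
    "alunite": np.array([[0.00, 0.10]]),
}
colors = {"kaolinite": (1.0, 1.0, 0.0),   # yellow, as in the text
          "alunite": (1.0, 0.0, 0.0)}     # red is an arbitrary choice here
rgb = color_coded_map(strengths, colors)
```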


image FIGURES/fig9a.cuprite95.1um_map.tgif.gif

Figure 9A. Tetracorder mapping results from AVIRIS imaging spectrometer data over Cuprite, Nevada. The Tetracorder results distinguish kaolinite minerals as well as many others, show a much more limited extent of kaolinite than was seen in Figures 8A-H, separate kaolinite from alunite areas, and also indicate where both occur as mixtures.




image FIGURES/fig9b.cuprite95.tgif.2.2um_map.gif

Figure 9B. Tetracorder mapping results from AVIRIS imaging spectrometer data over Cuprite, Nevada. The Tetracorder results distinguish kaolinite minerals as well as many others, show a much more limited extent of kaolinite than was seen in Figures 8A-H, separate kaolinite from alunite areas, and also indicate where both occur as mixtures.


3.2 Decisions in real-world situations

The decision processes we've discussed so far are straightforward but give no measure of the difficulty in making correct decisions in real-world situations. To begin, we will use data over "alunite hill" in the Cuprite scene where well-exposed outcrops of muscovite, alunite, kaolinite and other minerals are present. Figure 10 shows index images of Cuprite from Figure 9B (mineral map) and Figure 7A (pseudo-true color) and higher spatial resolution, low-altitude AVIRIS data from 1997. Figure 10 also shows a traverse across a portion of alunite hill that has been field sampled and the fits derived from Tetracorder analyses of low-altitude AVIRIS data for 6 minerals/mineral mixtures along the traverse. Of note are small differences between some of the decisions: fit value differences of <0.01 when the fit is greater than 0.95 are significant!


image FIGURES/fig10.plot.traverse.2ab+loc3.tgif.gif

Figure 10. Tetracorder mapping over "alunite hill." The upper left index image is a portion of Figure 9B for context. The other index images zooming in are low altitude AVIRIS data from a 1998 flight having approximately 2.3 meters/pixel. A traverse across a portion of alunite hill shows varying mineralogy along the traverse. The samples along the traverse line were collected and analyzed in our laboratory, including XRD analysis. The Tetracorder fit values for the six dominant minerals/mixtures are shown in the plot. Note the scale change. The colors of the curves match the colors in the map and those along the traverse line. Tetracorder chooses the highest fit in the identification process. There are subtle differences in fit values for which decisions are made: sometimes differences less than 0.01 are significant. The kaolinite outcrop labeled Y at pixel position 28 was nearly impossible to identify in the field as all rocks were similarly bleached and fine grained, but field sampling (locations numbered) at station 5 verified the Tetracorder identifications. Note the relatively small difference in fit values between kaolinite and alunite, and even smaller differences between mixtures.


We show in Figure 11A, the mapped depths, fits, and fit*depths for 6 minerals+mixtures from the analysis of the AVIRIS data that are shown in Figure 10. The band depth maps (Figure 11A, left column), where bright represents the deepest band and dark represents the shallowest, are all very similar except for the muscovite. This raises the issue of how to use these band depth maps to distinguish the two minerals. The band depth maps for kaolinites, alunites and mixtures are highly correlated suggesting that applying a simple threshold to the images will give rise to many false alarms, or in other words ambiguous detections. The same is true for the fit and fit*depth data (Figure 11A, center and right, respectively).

We could define a detection of a mineral where the band depth of one exceeds that of another, perhaps scaled to their relative strengths in the library of reference spectra. Such an "identification" based on band depth is shown in Figure 11B, left column. By choosing the maximum depth, we see that different areas show different mapped minerals. However, this method would include all the factors besides relative abundance which control band depth. Further, this method would not distinguish a pure exposure from a mixture dominated by one component or the other. Indeed, from our field data, we know the mineral maps based on maximum depth are not correct. Neither are the maximum fit*depth maps (Figure 11B, right column).
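
Choosing the per-pixel maximum of a parameter can be sketched as below, assuming the depth or fit images for all candidate materials are stacked into one array. Pixels of materials that are not the maximum are set to zero, following the convention stated in the Figure 11B caption. The function name and the numbers are hypothetical; they are chosen so that the depth winner and the fit winner disagree at one pixel.

```python
import numpy as np

def keep_maximum(stack):
    """Zero every material except the per-pixel maximum.

    stack: array of shape (n_materials, rows, cols).
    Returns (masked_stack, winner_index).
    """
    stack = np.asarray(stack, dtype=float)
    winner = np.argmax(stack, axis=0)
    material_index = np.arange(stack.shape[0])[:, None, None]
    masked = np.where(material_index == winner[None, ...], stack, 0.0)
    return masked, winner

# Three hypothetical materials over a 1 x 2 pixel image.
fits = np.array([[[0.97, 0.90]], [[0.98, 0.85]], [[0.96, 0.80]]])
depths = np.array([[[0.20, 0.10]], [[0.10, 0.05]], [[0.05, 0.02]]])
max_fit, fit_winner = keep_maximum(fits)
max_depth, depth_winner = keep_maximum(depths)
```

At the first pixel the deepest band belongs to material 0 while the best fit belongs to material 1, separated by a fit difference of only 0.01.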


image FIGURES/fig11a.depths+fits.fd.alunite.hill.tgif.gif

Figure 11A. The six dominant minerals/mixtures from the alunite hill traverse in Figure 10 were mapped using Tetracorder with the identification step turned off to illustrate the derived depths, fits and fit*depth images. Note the similarity of the images; the muscovite image shows the greatest difference. Most pixels show a response (illustrated as increasing brightness) to the least squares fitting of the features for each mineral. This is due to the similarity in the spectral features which all occur near 2.2 µm, not because all these materials are present in these locations. Compare with Figure 11B.




image FIGURES/fig11b.depths+fits.fd.alunite.hill.MAXID.tgif.gif

Figure 11B. Illustration of choosing the maximum value to identify materials. The depth column (left) makes an "identification" assuming the maximum depth, fit column (middle) assuming the maximum fit and the right column the maximum fit*depth. For materials that are not maximum at a given pixel, the value of the pixel is set to zero. Can depth be used for identification? The alunite + kaolinite row illustrates why identification based on depth is incorrect: the depth and fit*depth columns show no alunite + kaolinite, inconsistent with field verification results. Only the identification based on fit agrees with verification data. Identification is based on spectral feature position and shape and not on feature strength.


Contrast the maximum band depth maps (Figure 11B, left column) with maps of maximum goodness of fit (Figure 11B, center column). Clearly the patterns mapped by maximum fit are different from those mapped by either maximum depth or fit*depth. But which is correct? Some have suggested that plotting the value of one parameter (like depth) on one mineral against the same parameter of another mineral would show the separation between the two (this is called a two-dimensional histogram). The degree of separation, however, is not necessarily indicative of a correct answer. Proof of the correct answer requires ground sampling to show the mapped minerals are correctly identified.

We conducted a traverse across part of alunite hill, doing field sampling and subsequent X-Ray Diffraction (XRD) analysis of the collected field samples. Figure 12 shows the results for sample locations shown on Figure 10. The field sample numbers (1-7) indicate the XRD sampling locations (Figures 10 and 12). In Figure 12, the spectra of the field samples, AVIRIS spectra, and reference spectral features used to make the identifications in Tetracorder are shown. Note the subtle absorption feature shifts in the Na-K alunite compared to other alunites (the arrows on the diagram are all at the same wavelength). It is small spectral changes like these shifts that mean the difference in correct spectral identifications, and why the differences in fit between such similar spectra are so small. The XRD analysis results are shown in Table 2. The Tetracorder and XRD results for this traverse confirm that the maximum fit correctly identifies the observed mineralogy. Note the small vein of kaolinite at field sample location 5 is correctly mapped by the Tetracorder fit value (see Figure 11B, middle column). Note also that the maximum depth and fit*depth (Figure 11B) miss the alunite+kaolinite mixture, but the maximum fit correctly shows it, agreeing with the traverse results.


image FIGURES/fig12.alunitehill.travers.gif

Figure 12. Spectra from the alunite hill traverse from Figure 10. The spectrum from the AVIRIS pixel where a sample was collected is shown in blue, the laboratory spectrum of the field sample is shown in black, and the reference spectral feature(s) used by Tetracorder for identification are shown in red. The mineral listed is what Tetracorder found and agrees with the XRD analysis of the sample (Table 2).


3.3 Tetracorder and Mixtures

When different materials have widely separated spectral features and occur in different Tetracorder groups, Tetracorder deals with mixtures by explicitly detecting the separate components. By definition, a detection of more than one material constitutes a detection of a mixture. When a mixture occurs within a group the solution is more complicated. In this case, mixtures are particularly insidious because the shape of a spectrum of a mixture is a poor fit to both of the components of the mixture.

A simple mixture series that illustrates Tetracorder identification of mixtures is shown in Figures 13A and 13B. Using a simple linear combination we constructed areal mixture spectra for a kaolinite-montmorillonite mixture series. In Figure 13A (top) we show the pure kaolinite (red) and montmorillonite (green) spectra and a 50-50 mix (blue). In Figure 13A (bottom) is a mixture series between the two end members. If we only used the kaolinite and montmorillonite end members and fit each of the mixtures to these end members with Tetracorder, we would derive the fit curves in Figure 13B (top). Where the kaolinite fit is higher than that for montmorillonite, kaolinite would be chosen as the answer (red on Figure 13B, top). Similarly for montmorillonite (green on Figure 13B, top). Clearly the fit value "droops" as the mixture approaches about 60:40 (the asymmetry is due to the area of the reference spectral features; see section 2.4). Mixture reference spectra need to be included if we want to identify mixtures.
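
An areal mixture series and its end-member fits can be sketched as below. The two spectra are synthetic Gaussian features (a doublet and a single broader band) with illustrative parameters, not the library spectra of Figure 13A, and the fit is taken as the correlation coefficient only. Mixing is done directly on the continuum-removed features, which is a simplification that holds when both end members share the same continuum.

```python
import numpy as np

def gaussian_band(w, center, width, depth):
    """Gaussian absorption profile (positive values mean absorption)."""
    return depth * np.exp(-0.5 * ((w - center) / width) ** 2)

w = np.linspace(2.10, 2.30, 41)
# Synthetic stand-ins: a doublet and a single broader band (illustrative values).
kaolinite_like = (1.0 - gaussian_band(w, 2.165, 0.012, 0.15)
                      - gaussian_band(w, 2.207, 0.012, 0.30))
montmorillonite_like = 1.0 - gaussian_band(w, 2.205, 0.025, 0.20)

def areal_mixture(fraction_b, spec_a, spec_b):
    """Linear (areal) mixture: each material covers part of the pixel."""
    return (1.0 - fraction_b) * spec_a + fraction_b * spec_b

def fit(observed, reference):
    """Correlation coefficient of the least squares fit of reference to observed."""
    return float(np.corrcoef(reference, observed)[0, 1])

fractions = np.linspace(0.0, 1.0, 11)  # montmorillonite fraction in 10% steps
series = [areal_mixture(f, kaolinite_like, montmorillonite_like) for f in fractions]
fit_kaolinite = [fit(s, kaolinite_like) for s in series]
fit_montmorillonite = [fit(s, montmorillonite_like) for s in series]
```

Plotting fit_kaolinite and fit_montmorillonite against fractions gives curves of the same kind as Figure 13B (top), although the crossover point for these synthetic features need not match the one in the figure.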


image FIGURES/fig13a.plot_spec.gif

Figure 13A. Montmorillonite-Kaolinite areal mixture series. End members montmorillonite (green) and kaolinite (red) spectra and a 50-50% mixture in blue are shown (top). A mixture series at 10% intervals, for a continuum removed spectral feature is shown (bottom).




image FIGURES/fig13b.plot_fig.percent.mix.gif

Figure 13B. The Tetracorder feature fits for the mixture series in Figure 13A are shown. If the end member kaolinite feature is fit to each mixture spectrum, the correlation coefficient (the fit) is shown as the red or dashed line. For the montmorillonite feature fit to the mixture spectra, the fit values are shown in green or dash-dot-dot. If only the end member reference spectra are used by Tetracorder (top), Tetracorder would derive a kaolinite or montmorillonite answer where the curve is colored red or green, respectively. The crossover point occurs at about 65% montmorillonite. So any kaolinite-montmorillonite mixture more than 35% kaolinite would be identified as kaolinite. However, if Tetracorder included one 50%-50% kaolinite-montmorillonite mixture reference spectrum (bottom), kaolinite abundances from about 73% to 18% kaolinite (27% to 82% montmorillonite) would be identified as a kaolinite-montmorillonite mixture. More mixtures could be added to the reference library, but the differences between mixtures, and therefore the fit differences needed to identify different mixture abundances, become correspondingly smaller. Other factors, including grain-size effects, coatings, and other components such as vegetation contributing to the spectral signature confuse accurate mixture abundance determinations. For example, 50% dry-vegetation cover reduces the fit of the pure end members slightly (designated by the points labeled "v" on the bottom plot).


If we include a 50-50 mixture spectrum in the Tetracorder reference library (blue in Figure 13A), then Tetracorder can find one of three answers: pure kaolinite, pure montmorillonite, and kaolinite-montmorillonite mixture. Applying the mixture series to the 3 reference spectra, Tetracorder would derive the fits shown in Figure 13B, bottom. Here we see that the mixture is identified between about 27% and 81% montmorillonite; otherwise the pure end members are identified. More reference mixture spectra could be added to do finer binning of the mixture series, but note how small the differences in fit would be to distinguish between bins. Finer discrimination of kaolinite versus 50% kaolinite+50% montmorillonite would require fit differences on the order of <0.01 with fits >0.99. The two small points plotted at the pure end members, labeled "v" in Figure 13B, indicate the degradation of the fit due to an areal mixture of 50% dry, non-photosynthetic vegetation on the spectra. The fact here is that other components will degrade the fits, so one wonders how accurately such mixtures could be separated. Add the issues of grain-size effects, coatings, intimate versus areal mixtures, and accurate fine abundance binning of the mixture series becomes problematic. We have included coarse mixture bins such as that shown here in the Tetracorder reference library to the extent we believe such mixtures can be differentiated. This solution to mixture problems is simple and surprisingly effective: we include a small number of mixtures as library entries, and do not try to rigorously refine the mixture amounts.
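
Adding a mixture as its own library entry can be sketched as below, again with synthetic Gaussian features standing in for the library spectra and the correlation coefficient standing in for the fit. The library names, band parameters and the identify function are hypothetical; the range of fractions identified as a mixture depends on these synthetic shapes and need not match Figure 13B.

```python
import numpy as np

def band(w, center, width, depth):
    """Gaussian absorption profile (positive values mean absorption)."""
    return depth * np.exp(-0.5 * ((w - center) / width) ** 2)

w = np.linspace(2.10, 2.30, 41)
kaol = 1.0 - band(w, 2.165, 0.012, 0.15) - band(w, 2.207, 0.012, 0.30)
mont = 1.0 - band(w, 2.205, 0.025, 0.20)

# Three reference entries: two end members plus one coarse 50-50 mixture bin.
library = {
    "kaolinite": kaol,
    "montmorillonite": mont,
    "kaolinite+montmorillonite": 0.5 * kaol + 0.5 * mont,
}

def identify(observed, library):
    """Return the library entry with the highest fit, plus all fit values."""
    fits = {name: float(np.corrcoef(ref, observed)[0, 1])
            for name, ref in library.items()}
    best = max(fits, key=fits.get)
    return best, fits

# Walk the mixture series in 10% steps and report the chosen entry.
for percent in range(0, 101, 10):
    f = percent / 100.0
    best, fits = identify((1.0 - f) * kaol + f * mont, library)
    print(f"{percent:3d}% montmorillonite -> {best} (fit {fits[best]:.4f})")
```

The printed fit values also show how little separates the winning entry from the runner-up near the bin boundaries, which is the practical limit on finer binning discussed above.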

4. The Tetracorder Expert System

The development of Tetracorder has been evolutionary, and early in the development process it became evident that using an expert system structure facilitated development of spectral identification methods. We regularly add new constraints, minerals and new types of analyses to our system. Using an expert system gives us ease of modification and a logical layout for implementing our constraints as a set of parallel and hierarchical rules. However, there is no reason why our concept cannot be implemented in other ways.

Specifically an analysis of a spectrum proceeds as follows:

  • 1) Before analysis of spectra, prepare the spectral library: remove the continuum from each feature, compute relative areas and find absorption/emission minima/maxima. These computations need only be performed once thereby speeding up the imaging spectroscopy analysis.

  • 2) Choose algorithm(s) to be applied. For the spectral feature analysis algorithm, the following describes the analysis sequence. Other algorithms could have a different sequence. The sequence is repeated for all entries in all groups.
    • A) Apply initial algorithms, if necessary (e.g. data conversions; ratio to a specific curve or spectrum).
    • B) Remove the continuum for each spectral feature in the unknown using the same continuum wavelengths as those defined in the reference library spectrum (equation 1b).
    • C) Perform a feature-fitting shape analysis deriving a "fit" parameter for each spectral feature of each material (equation 7).
    • D) Apply constraints regarding the presence of diagnostic and optional features, continuum level, and slope constraints (equation 10) to each feature.
    • E) Derive the weighted fit (equation 8) for each material that passes the constraints of step D.

  • 3) Apply constraints regarding the goodness of fit values (threshold values).

  • 4) Find the highest fit for each material in each group: this is the best answer in each group.

  • 5) Implement further analysis (e.g. determine vegetation red edge or apply other algorithms not directly involved in detection, but contingent on a particular detection). These are cases. A case can call another case, so an answer from (step 4) can lead to multiple additional answers.

  • 6) Write results of all analyses. We record 3 results: the weighted fit, weighted depth and weighted fit*depth for each library entry. For those materials not chosen in step 4 and step 5, the results are set to zero. For a Tetracorder imaging spectroscopy analysis using 300 reference materials, 900 output images are created. Note, that for AVIRIS data with 224 spectral channels, the Tetracorder output can be greater than the input (more output fit images than input wavelengths).
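
The sequence above can be sketched for a single spectrum as follows. This is a simplified illustration under stated assumptions, not the Tetracorder implementation: each material has one feature (so no weighted fit over several features), the fit is the correlation coefficient, the constraints of step 2D and the cases of step 5 are omitted, and all names, channel indices and thresholds are hypothetical.

```python
import numpy as np

def remove_continuum(w, r, lo, hi):
    """Divide channels lo..hi by a straight line through the two end channels."""
    seg_w, seg_r = w[lo:hi + 1], r[lo:hi + 1]
    line = seg_r[0] + (seg_r[-1] - seg_r[0]) * (seg_w - seg_w[0]) / (seg_w[-1] - seg_w[0])
    return seg_r / line

def prepare_library(w, entries):
    """Step 1: continuum-remove each reference feature once, before analysis."""
    return [dict(e, ref_cr=remove_continuum(w, e["spectrum"], e["lo"], e["hi"]))
            for e in entries]

def analyze(w, unknown, library):
    """Steps 2B, 2C, 3, 4 and 6 for one spectrum, one feature per material."""
    # Step 6 default: entries that are not chosen keep zero in all three outputs.
    results = {e["name"]: (0.0, 0.0, 0.0) for e in library}
    best = {}  # group -> (fit, depth, name)
    for e in library:
        obs_cr = remove_continuum(w, unknown, e["lo"], e["hi"])      # step 2B
        if np.std(obs_cr) < 1e-12:
            continue                                                 # nothing to fit
        slope = np.polyfit(e["ref_cr"], obs_cr, 1)[0]
        fit = float(np.corrcoef(e["ref_cr"], obs_cr)[0, 1])          # step 2C
        if slope <= 0.0 or fit < e["fit_threshold"]:                 # step 3
            continue
        depth = float(slope * (1.0 - e["ref_cr"].min()))
        if e["group"] not in best or fit > best[e["group"]][0]:      # step 4
            best[e["group"]] = (fit, depth, e["name"])
    for fit, depth, name in best.values():
        results[name] = (fit, depth, fit * depth)                    # step 6
    return results

# Hypothetical two-entry library in one group, and an unknown with a sloped continuum.
w = np.linspace(2.0, 2.4, 41)

def band(center, depth):
    return 1.0 - depth * np.exp(-0.5 * ((w - center) / 0.02) ** 2)

entries = [
    {"name": "A-like", "group": 2, "lo": 10, "hi": 30, "fit_threshold": 0.9,
     "spectrum": 0.6 * band(2.20, 0.30)},
    {"name": "B-like", "group": 2, "lo": 20, "hi": 40, "fit_threshold": 0.9,
     "spectrum": 0.6 * band(2.30, 0.30)},
]
library = prepare_library(w, entries)
unknown = (0.5 + 0.2 * (w - 2.0)) * band(2.20, 0.15)
results = analyze(w, unknown, library)
```

Each library entry yields three values per spectrum, which is why an image analysis with 300 reference materials produces 900 output images.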

4.1 Expert System Lessons Learned

Reflectance and emittance spectra of mixtures typically found in nature are complex not only because of the numerous mixture possibilities, but also because of the multiple scattering typically encountered by photons interacting with a particulate surface (Clark, 1999 and references therein). Thus, spectroscopic analysis must handle such conditions, including changes in grain size (Figure 5), abundance in a mixture (Figure 4), overlapping absorption bands from multiple materials in a mixture (Figure 4), level shifts due to incident lighting, the finite spectral bandpass of each spectral channel and the spectral sampling. In developing, testing, and verifying Tetracorder accuracy, we learned many lessons. Here are some of the important ones.

Continuum-removed kaolinite and montmorillonite spectra are shown in Figure 13A. How is it best to distinguish between these two minerals as well as other possible materials? The end points to the continuum-removed spectra, by definition, average 1.0 after normalization, and because all spectra are analyzed similarly, it may seem that using the continuum end points would not help in a least squares analysis. However, by comparing identification success as a function of signal-to-noise ratio we found that including continuum end points significantly improves identification success (Swayze et al., 2002).

Using fit thresholding, Tetracorder finds nothing in the spectrum as the spectral features become too weak to identify, rather than identify the wrong material. The fit threshold appears in the expert rules tables (Appendix A) as parameters with "FIT" in their labels.

Some minerals have a main diagnostic absorption plus a few subordinate absorptions, which if detected further indicate the presence of that material. Can the weaker features help in identification? Comparing identification accuracy with and without the weaker features, we found that weak features do not help (Swayze et al., 2002).

Sometimes an absorption occurs at a similar wavelength position as an absorption in some other material. Such is the case with some Fe3+-bearing minerals (Figure 1A), plants with chlorophyll absorption and some photosynthetic bacteria and algae. For example, goethite has a UV absorption near 0.5 µm, an absorption near 0.65 µm, and an absorption near 1-µm (Figure 1A), but chlorophyll in plants has an intense 0.6 to 0.7-µm absorption centered at nearly the same location as the middle goethite feature. Trace vegetation contributing to a spectrum can appear to intensify the 0.65-µm Fe3+ absorption and can change its shape. Thus, we do not use the 0.65-µm feature when trying to identify Fe3+-bearing minerals. Confusion with other spectral features that is not handled correctly in the expert system is described in the deficiencies section (Section 5).

4.2 Tetracorder Groups and Cases

We have chosen to program the Tetracorder expert system by separating analysis into groups and special cases. This section gives an overview of the strategy.

4.2.0 Group 0: the global view

The group 0 entries in Appendix A, Table 0 provide an overview of the spectrum and what spectral features dominate. For example, if the spectral signature of water dominates, it may be difficult to detect mineral absorption features. If snow or vegetation features dominate the spectrum, then it will be difficult to detect other mineral features. All group 0 entries are included in each group (2, 3, 4, etc.) analysis. Separating these entries into a group reduces repetition in coding and in computation time. In the Tetracorder analysis, these entries are computed once and the results are added to the results for the other groups before the identification decisions are made.

4.2.1 Group 1: 1-µm broad region

The group 1 entries in Appendix A, Table 1 describe electronic and other absorption features seen in materials in the visible and near-infrared.

4.2.2 Group 2: 2-µm vibrational absorption region

The group 2 entries in Appendix A, Table 2 describe vibrational overtone and combination bands in the 2 to 2.5-µm spectral region with supporting information at shorter wavelengths when appropriate.

4.2.3 Group 3: Vegetation Chlorophyll Detection

The chlorophyll detection method presented in Appendix A, Table 3 gives a result proportional to green leaf area index. Note the entries include several combinations, from a green lawn grass spectrum (100% cover), to 20% grass cover on a hematitic soil background (material 228 in Appendix A, Table 3). These entries provide detection of trace vegetation in environments ranging from deserts to full canopy forests.

4.2.4 Group 4: Rare Earth Materials

The rare earth absorptions are quite narrow (see spectra in Clark et al., 1993a, Clark, 1999), and thus have different requirements for detection (Appendix A, Table 4). They can be detected in the presence of other broad absorptions such as those due to Fe2+ and Fe3+. However, vegetation spectra seem to include a weak spectral structure that can be confused with trace neodymium oxide, thus a "Not feature" NOTGREENVEG variable is necessary. This implies that it is difficult to detect neodymium oxide in the presence of vegetation, so if significant vegetation is indicated in the spectrum, there can be no detection of lower abundances of neodymium.

4.2.5 Case 1: Vegetation Red Edge

The position of the chlorophyll red edge absorption feature can be found, as shown in Appendix A, Table 5, by ratioing to a fixed reference spectrum (Clark et al., 1995; Clark, 1999). Detection of shifts of a fraction of the channel-to-channel spacing is possible with a ratio method. The reference spectra are unshifted spectra divided by shifted spectra, and the spectral structure in the ratio is used for the red-edge position detection. This special case is only done if vegetation is detected by an analysis in group 3.

Similar methods could be employed to track shifts in spectral features in other materials. For example, this method could be used to track temperature when an absorption shifts wavelength position with temperature, like the conduction band edge in sulfur.

4.2.6 Case 2: Vegetation Spectral Type

Vegetation spectra change with species. Appendix A, Table 6 spans a range of shapes found in the chlorophyll absorption from desert plants to lush lawns. The spectra are not unique identifiers of these species, but indicate a spectral type. In different environments, it is our experience that these spectral types often delineate vegetation communities. This special case is only done if vegetation is detected by an analysis in group 3.

4.2.7 Case 3, 4, 5: Vegetation Leaf Water Content

The absorption strength of each leaf water absorption (Appendix A, Table 7) is determined by case 3: 0.95 µm, case 4: 1.15 µm, and case 5: 1.4 µm absorptions. The strength of these features correlates with the amount of water in the plants and the fractional cover by plants in the pixel. These special cases are only done if vegetation is detected by an analysis in group 3.

5. Verification

Through our experience in analyzing millions of spectra from imaging spectroscopy data, field checking the results (Clark et al., 1990b, 1991; Clark and Swayze, 1990; Clark et al., 1992, 1993; King et al., 1995, 1999; Swayze, 1997, Swayze et al., 1992, 1996, 1999a and references therein), analyzing over 40 million synthetic spectra (Swayze and Clark, 1995, Swayze et al., 2002), and selected tests with laboratory spectra, the Tetracorder material identification system has worked well for our team and our sponsors at several agencies. However, strict statistical verification of Tetracorder results as applied to remote sensing data is extremely difficult because ground truthing of pixels 3 to 20 m in size in itself is extremely challenging at best. In the absence of highly accurate subpixel ground-truth, we assume a different stance regarding verification.

In principle we would like verification to be expressed in terms of four metrics: true positives, true negatives, false positives and false negatives. The relative occurrences of these metrics give us formal detection performance. In practice deriving these metrics has widely varying difficulty. True positives are straightforward: determine the number of cases where Tetracorder predicted the presence of a mineral, and the mineral is present. True negatives are much more difficult because they require establishing the predicted absence of a material to some level of confidence. For some materials this is straightforward (water, vegetation) but for minerals fully sampling a 20-m pixel is a challenge. False positives are similarly difficult: can we be sure that our field verification methodology is as effective as Tetracorder? Finally, false negatives are similarly difficult because we must find materials incorrectly predicted to be absent.
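
Where ground truth does exist for a set of sites, the four metrics can be tallied as in the sketch below. The site arrays are hypothetical and are not data from this study; the function name is also an assumption.

```python
import numpy as np

def detection_counts(predicted, truth):
    """Count true/false positives and negatives for one material.

    predicted: True where the analysis reports the material as present
    truth:     True where field or laboratory checks found the material
    """
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return {
        "true_positive": int(np.sum(predicted & truth)),
        "true_negative": int(np.sum(~predicted & ~truth)),
        "false_positive": int(np.sum(predicted & ~truth)),
        "false_negative": int(np.sum(~predicted & truth)),
    }

# Six hypothetical verification sites.
predicted = [True, True, False, False, True, False]
truth = [True, False, False, True, True, False]
counts = detection_counts(predicted, truth)
```

The arithmetic is trivial; the difficulty described above lies entirely in establishing the truth array for pixels 3 to 20 m in size.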

To our knowledge, no remote sensing study has ever statistically sampled a scene to determine if the mineralogy mapped correctly, especially on the scale of imaging spectroscopy identifications such as presented here. We investigate Tetracorder results by several methods, checking for true positives, true negatives, false positives and false negatives by characterizing multiple sites to the extent that our time and resources allow. Within the context of this claim, Tetracorder is extremely successful as shown below. We verify this claim in two ways, corresponding to the two types of verification of remote sensing imagery information: virtual (King and Clark, 2000) and in situ. Virtual verification can be done by examining the remote sensing data directly if there is sufficient spatial and/or spectral information to positively identify objects in the image by inspection. In situ verification requires direct sampling of the environment to verify the remotely sensed information.

Note that there is a distinction between "identification" and "classification" of results from the analysis of remote sensing imagery. While some objects can be identified directly from the imagery, others can only be inferred to some level of confidence (classification) that requires in situ field checking. Information gained from the analysis of remotely sensed data increases with better spatial resolution and/or spectral resolution and, in general, is maximized when both high spectral and high spatial resolution are used together. In some instances, however, only high spatial or high spectral resolution is needed. For example, identifying cars requires high spatial resolution, but only low spectral resolution. Black and white imagery would suffice to identify cars; a simple color photo could be used to determine their color.

Using imaging spectroscopy data, with suitable spectral resolution, it is possible to identify specific minerals in soils, such as kaolinite, based on the wavelength position and shape of characteristic absorption features. The detection of unique kaolinite spectral absorption features allows the positive identification of the mineral and the capability to map its distribution, based on the limits of the spatial resolution of the instrument. In this case, there is no need for in-situ field checking because the spectra are of sufficient resolution to be certain of their identification. The derived kaolinite maps can be verified by examining spectra from the imaging spectrometer data at a computer monitor in a laboratory. This is virtual verification.

From the above discussion, it follows that certain things can be verified directly from the imaging spectroscopy data, both by spectral and spatial contexts. For example, bodies of water, including lakes and streams, are self evident in the images. Verification by spatial context can be further confirmed by examining spectra. Thus, Tetracorder maps of bodies of water are readily verifiable by examining the resulting images, and further verified by comparing to published maps. Fields of vegetation are also easily verified by spatial context and examination of the unique spectral properties of vegetation (both green chlorophyll absorptions and drier vegetation showing lignin, cellulose and protein absorption features). Ice and snow, at least in terrestrial environments, can also be verified by spatial context using apparent albedo and, with essentially 100% confidence, by examining the spectra. By spatial context and spectral features, we have verified that Tetracorder analysis for water, snow and vegetation has a high true positive and true negative rate and low false positive and false negative rate. There are exceptions, and these are noted in the deficiencies section, below.

Certain minerals have unique absorption features so that they can be verified by extraction of spectra at a computer without visiting the site. However, such virtual verification can only have a high certainty if the spectral feature is strong enough relative to the noise in the spectrum, and if there are no other minerals present with similar features that can confuse the identification. Examples of such cases we have encountered for minerals will be discussed in the "deficiencies" section below. In our experience, such mineral mixtures are usually evident on examination of the spectra, meaning that a spectroscopist can tell the spectrum is ambiguous and that the Tetracorder result is uncertain. Such areas are commonly targeted for field investigation to determine what is really there, how Tetracorder performed, and what modifications, if any, are required to do a better job. It is these field investigations at hundreds of locations in many different geologic environments that have led to the current sophistication of the expert system.
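
The requirement that a feature be "strong enough relative to the noise" can be sketched numerically. The Python example below assumes the standard continuum-removed band depth, D = 1 - Rb/Rc, with a straight-line continuum between two shoulder channels. The function names, the example spectrum, and the factor-of-three criterion are illustrative assumptions, not Tetracorder thresholds.

```python
def band_depth(wavelengths, reflectance, left, right):
    """Largest depth D = 1 - Rb/Rc between two continuum shoulder indices.

    Rc is a straight-line continuum drawn between the shoulder channels
    at indices `left` and `right`; Rb is the reflectance in the band.
    """
    w0, w1 = wavelengths[left], wavelengths[right]
    r0, r1 = reflectance[left], reflectance[right]
    deepest = 0.0
    for i in range(left + 1, right):
        t = (wavelengths[i] - w0) / (w1 - w0)
        continuum = r0 + t * (r1 - r0)
        deepest = max(deepest, 1.0 - reflectance[i] / continuum)
    return deepest


def is_confident(depth, noise_sigma, min_ratio=3.0):
    """Assumed criterion: depth must exceed `min_ratio` times the noise.

    `noise_sigma` is taken to be the noise standard deviation expressed
    in the same continuum-removed units as the depth.
    """
    if noise_sigma <= 0:
        return depth > 0
    return depth / noise_sigma >= min_ratio


# Hypothetical absorption near 2.2 micrometers on a flat continuum.
wavelengths = [2.10, 2.15, 2.20, 2.25, 2.30]
reflectance = [0.50, 0.45, 0.35, 0.46, 0.50]
depth = band_depth(wavelengths, reflectance, 0, 4)
print(depth, is_confident(depth, 0.05), is_confident(depth, 0.20))
```

With the same band depth, the identification passes the assumed criterion at low noise and fails it at high noise, which is the situation in which a spectroscopist would flag the spectrum for field checking.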

Tetracorder results are based on the following experiences.

First, we have made numerous spectral measurements in the laboratory of minerals and other materials where sample purity, grain size, and measurement viewing geometry were either closely controlled or measured.

Second, we have analyzed millions of real-world spectra obtained with the NASA AVIRIS and subsequently verified the resulting mineral maps in the field (e.g. Swayze et al., 2000, 2002; Clark et al., 2001, and references in these papers and in the applications section below). We have discovered situations not envisioned in the laboratory, made spectral measurements of relevant samples encountered in the field, and devised strategies to detect them.

Third, we have encountered, investigated, and simulated many noisy mineral spectra and identified them under variable signal-to-noise ratio, bandpass and sampling conditions (e.g. Swayze et al., 2002), showing which ones were confused with each other and which diagnostic spectral features could be used to better discriminate among them.

Fourth, we have measured spectra of minerals, rocks, soils and other materials in the field where the spectroscopist could visually assess conditions, again not necessarily envisioned in previous laboratory studies, and rapidly obtain and evaluate the spectra under those conditions. (We use two portable field spectrometers which provide spectra in the 0.4-2.5-µm wavelength region on time scales of a few seconds per measurement.)

Fifth, in field studies, we collect hand samples to verify the Tetracorder results. We have brought those samples back to our laboratories and done additional analyses on them, including laboratory spectroscopy (range 0.2 to 150 µm), X-Ray Diffraction (XRD), electron microprobe, X-Ray Fluorescence (XRF), Mössbauer spectroscopy, petrographic and binocular microscope examinations, and other analyses as appropriate. Example results of field verification studies are shown in Table 2.

Sixth, we have analyzed spectra, collected by telescopes and interplanetary spacecraft, of materials under non-laboratory conditions some of which have yet to be duplicated and measured in the lab. From this we have gained a better understanding of these materials from modeling done by us and from published research.

Table 2 presents a portion of our results of verifying Tetracorder analyses at numerous sites studied in the western United States. In verifying a Tetracorder result in the field, the field team identifies an area in the image that needs verification, and goes to that location. This is not always easy, as sometimes visual field geologic methods cannot identify minerals (e.g. a few percent clay in a soil). Portable field spectrometers are used to survey the area. Such surveys, assuming clear weather conditions, are often sufficient to verify the presence of materials in question. Hand samples are collected that appear representative of the surface materials that contribute to a pixel. Even with field spectroscopic verification, traditional field geologic techniques, like hand lens examination and acid fizz tests, are done. We also collect hand samples and return them to the lab for more detailed analyses.

Table 2 shows verification analyses for over 100 examples of Tetracorder identifications from Cuprite, NV; Leadville, CO; Arches National Park, UT; Canyonlands National Park, UT; the Oquirrh Mountains region, UT; Summitville, CO; Mountain Pass, CA; Barstow, CA; and Joshua Tree National Park, CA. In nearly every case, the materials predicted were identified in hand samples returned from the field pixel location by at least one analysis method. Tetracorder did not report the presence of all materials detected by XRD, nor did XRD identify all the minerals evident spectroscopically, but this reflects the differing sensitivities of the two techniques. This exercise shows that detections reported by Tetracorder are extremely reliable indicators of the presence of the predicted minerals, and that Tetracorder exhibits a high rate of true positives.

In examining the verification results in Table 2, note that different methods are sensitive to different abundances of materials. For example, the visible to near-IR (Vis-NIR) spectrum (0.4 to 2.5 µm) is very sensitive to Fe3+-bearing minerals and to clay minerals, more so than XRD (e.g. see Farmer, 1974). However, because quartz and low-iron feldspars have no diagnostic absorptions in this spectral range, Vis-NIR spectroscopy cannot detect them, whereas XRD is very sensitive to them. For example, a red sandstone with an obvious hematite and kaolinite spectrum may only show quartz by XRD. Laboratory reflectance spectroscopy, however, can be a definitive test of the presence of hematite and kaolinite, if the absorptions appear strong. We have found XRD to be of the greatest help in verifying mixtures. Mixtures of minerals with overlapping absorption bands can be difficult to interpret with spectroscopy, unless suitable examples are known. XRD has provided that critical link.

5.1 Known Deficiencies in the Presented Tetracorder and Expert System

The algorithms and set of expert system rules presented here, while doing an excellent job of spectral identification and mapping, are not perfect. The known deficiencies are presented below so others who may use our system will be able to appropriately interpret the results. In all cases cited below, additional research is required to address the deficiency.

Deficiency 1: calcite-epidote-chlorite mixtures. These three minerals have similar absorptions near 2.3 µm, and in a mixture can be difficult to distinguish. The chlorite absorption shifts with composition, which adds to the complexity. The pure end members are well mapped with the current expert rules, but as increasing abundances of the other two are added, results can be inaccurate. For example, initial mapping in the Animas watershed region of Colorado showed abundant calcite that could not be located in the field. Calcite is important for buffering acidic rock drainage in the area. It was found that much of what mapped as calcite was actually chlorite-epidote-calcite mixtures, including minor amounts of calcite.

Deficiency 2: shallow water/sediment suspended in water. Certain shallow water depths have a spectral signature that is a combination of water transmission and bottom reflectance that results in unusual combined absorption features. These often map as unusual minerals that have broad absorptions in the 0.6- to 1.3-µm region, for example olivine. These errors are readily identifiable by spatial image context. The solution is to include more water spectra in the spectral library simulating such conditions, and to place continuum constraints on reference minerals to restrict water-like continuum slopes.

Deficiency 3: halloysite versus kaolinite plus montmorillonite/muscovite/illite. Kaolinite absorption plus another clay absorption near 2.2 µm can produce a spectral feature similar to that of halloysite. The long wavelength side of the halloysite feature rises faster than the mixtures, so additional mixture reference spectra will solve the problem. The expert rules already contain such mixtures, so some mixtures are correctly identified, but additional reference spectra are needed to increase the accuracy of the identifications.

Deficiency 4: talc-hectorite-saponite. These minerals have similar absorptions near 2.3 µm and are difficult to distinguish at AVIRIS sampling (~0.01 µm) and bandpasses (~0.01 µm). The absorption features change shape with different grain sizes. A grain size series is needed for each library mineral. Higher spectral resolution than that of AVIRIS would also help distinguish between these minerals.

Deficiency 5: wet vegetation and/or some desert vegetation maps as melting snow plus vegetation. As ice melts, the absorptions shift to shorter wavelengths. The absorption due to water in plants can be broader than in liquid or solid water, and thus can be similar to a combination of ice and vegetation. Succulent desert plants (e.g. cactus family) often show broader absorptions than grasses and trees. The current set of vegetation reference spectra in the expert system is limited, so some vegetation is sometimes misidentified as snow plus vegetation (this error is usually obvious when mapping desert regions with data acquired in the heat of summer). The solution is to add spectra of additional vegetation species to the expert system.

Deficiency 6: clouds. Most of our data sets have been acquired under excellent clear sky conditions, so we have limited experience with the effects of clouds on mapping results. A small cloud can reflect light from adjacent surfaces and change the shape of broad spectral features (like Fe2+ or Fe3+ absorptions) causing unusual minerals to be mapped where the cloud or cloud shadow is located. Cloud and cloud shadow detection algorithms must be designed and implemented.

Deficiency 7: Calibration. While not directly a deficiency, we must note that calibration errors can translate into spectral features or modification of spectral features. This can cause Tetracorder to misidentify materials. Such errors seem to occur mostly between mixtures where the spectral characteristics are only slightly different. Thus, the accuracy of the resulting maps is directly proportional to the accuracy of the calibration to reflectance. Wavelength calibration is also important to the Tetracorder identification accuracy: shifts in wavelength calibration can result in errors in identification.

Deficiency 8: How diagnostic is the Fe2+ absorption? The expert system includes many minerals with Fe2+ absorptions. However, many such absorptions are similar in position and shape (e.g. see Hunt, 1977; Clark, 1999, and references therein). Minerals like jadeite, cummingtonite, and others often map over large areas where those minerals do not exist. We typically label our maps "Fe2+-bearing mineral" and not the specific mineral identification. Fe2+ absorptions must be analyzed in greater detail to see how diagnostic such absorptions are. Studies to date that have indicated diagnostic abilities have included only a limited number of samples. When all minerals are considered, along with likely mixtures found in terrestrial environments, it is not clear how diagnostic the absorption is, except perhaps in a few cases, like that of olivine.

Deficiency 9: Alunite paleothermometry may give false high temperature results for some low temperature supergene alunites. The direct measurement of formation temperatures from the width of the 2.17-µm absorption may be restricted to those areas dominated by Al-deficient alunites. Apparently, the width of this absorption is sensitive to Al deficiencies in the octahedral layer, and when there are none, as in some supergene alunites, the shape of the 2.17-µm band assumes that of the high temperature configuration. More study is needed to confirm this restriction (Swayze, 1997).

Deficiency 10: The reference entry for staurolite maps vegetation plus water because overall its continuum resembles vegetation plus water, and staurolite only has one broad band and no smaller diagnostic spectral features. Thus, maps of staurolite are a spectral shape indicator only, and usually only indicate a spectral vegetation type or certain vegetation density on a background soil. The solution is to include NOT features that exclude snow and vegetation features.

Deficiency 11: Pyrophyllite incorrectly maps too much in areas of hydrothermal alteration. Low levels of pyrophyllite map because of spectral structure near 2.16 µm, the position of the strong pyrophyllite absorption feature, due to incomplete removal of atmospheric absorption features, and/or spectral structure in vegetation that may cover part of a pixel. Pyrophyllite is a high temperature indicator mineral so its detection is important for determining characteristics of hydrothermal systems.
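
The sensitivity to wavelength calibration noted under Deficiency 7 can be illustrated with a small Python sketch. It assumes two hypothetical reference minerals whose only difference is a 0.01-µm offset in band center, models each band as a Gaussian, and uses a simple Pearson correlation as a stand-in for the spectral fit. None of the names or values below come from the Tetracorder library or code.

```python
import math


def gaussian_band(wavelength, center, depth=0.3, sigma=0.02):
    """Continuum-removed reflectance of an idealized Gaussian absorption."""
    return 1.0 - depth * math.exp(-0.5 * ((wavelength - center) / sigma) ** 2)


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_match(observed, references):
    """Name of the reference spectrum with the highest correlation."""
    return max(references, key=lambda name: correlation(observed, references[name]))


# Channels at an assumed 0.01-micrometer sampling, 2.10 to 2.30 micrometers.
channels = [2.10 + 0.01 * i for i in range(21)]
references = {
    "mineral_A": [gaussian_band(w, 2.20) for w in channels],
    "mineral_B": [gaussian_band(w, 2.21) for w in channels],
}


def observe(shift):
    """Spectrum of mineral_A when reported wavelengths are too long by `shift`."""
    return [gaussian_band(w - shift, 2.20) for w in channels]


print(best_match(observe(0.0), references))   # correct calibration
print(best_match(observe(0.01), references))  # one-channel wavelength error
```

In this idealized case a one-channel wavelength error is enough to move the best match from one reference to the other, consistent with the observation that calibration errors matter most where the spectral characteristics of the candidates differ only slightly.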

6. Applications

Planetary surfaces are complex. The Earth's surface is probably the most complex in our solar system, showing varied geology, oceans, ice caps, abundant life and anthropogenic influences. Other planets have different geology and different surface compositions. In order to understand our own planet as well as others, we produce maps of materials and other measurable quantities. Maps of the Earth's surface can depict many themes, including geology, ecosystems, environmental, hazards, land management and global change. Geologic mapping can include the depiction of geologic formations (thus providing information on ages and placements of units through geologic time), soils, minerals occurrence, faults, mineralized zones, and aggregate for building materials. Ecosystems maps might include habitat, vegetation species/communities, vegetation health and canopy chemistry, and riparian zone distributions. Environmental applications can include acid rock drainage, oil or toxic waste spills, forest fire potential (including fuel load), water quality, and other distributions. Geologic hazards maps can include volcanic eruption potential, swelling clays and landslide hazards. Land management maps can include ecosystems impact by human activity, grazing impacts by cattle and others. Global change maps might include surface albedo as a function of time, vegetation type and global distribution, ice and snow distribution. The need for accurate and more detailed maps has never been greater. Here are a few examples of how imaging spectroscopy can provide useful data in some of these areas.

6.1 Geologic Applications: Mapping Minerals and Amorphous Materials

Spectra of minerals as well as amorphous materials show many diagnostic absorption features. Minerals are the components of rocks, soils, and geologic formations, so maps of mineralogy contain geologic information.

Continuing with our Cuprite, Nevada example, we mapped the area for minerals, and amorphous and other iron oxides (Figure 12A). The Cuprite AVIRIS data were calibrated to apparent surface reflectance to remove atmospheric absorptions and scattering and to remove the solar response (Clark et al., 1993b; Swayze, 1997). Next, the reflectance data were analyzed by the Tetracorder algorithm and 254 spectral categories were sought, producing 762 output images (254 * 3: fit, depth, and fit*depth). About 70 categories had significant fits as determined by spatial groupings of pixels with fit values above the thresholds. Some of those categories are vegetative; others are mineralogical, and the best 45 (approximately) of those are shown in Figure 9. Some mixture series were combined into one color for display in Figure 9. The material maps, shown in Figures 9A and 9B, show the diversity of minerals and depict their complex distribution. Also significant, but not shown, were vegetation occurrence, vegetation species (spectral type), and vegetation leaf water abundance. Some minerals mapped in only a few pixels and are not shown, but subsequent higher resolution imaging has confirmed these small outcrops. All mineralogic entries in the map keys have been confirmed by field work except pyrophyllite. Pyrophyllite plus alunite has been searched for without success in the field. Subsequent work indicates that trace vegetation plus alunite sometimes mimics this mixture spectrally, confirming deficiency 11, above.
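
The bookkeeping described above (three output images per category, and significance judged from spatial groupings of pixels above a fit threshold) can be sketched in Python as follows. The 3 x 3 example arrays, the threshold of 0.8, and the 4-connected neighbor criterion are assumptions chosen for illustration; the actual thresholds and grouping procedure used for the Cuprite analysis are not specified here.

```python
def output_images(fit, depth):
    """Return the three per-category images: fit, depth, and fit*depth."""
    product = [
        [f * d for f, d in zip(fit_row, depth_row)]
        for fit_row, depth_row in zip(fit, depth)
    ]
    return {"fit": fit, "depth": depth, "fit*depth": product}


def grouped_pixel_count(fit, threshold):
    """Count above-threshold pixels with an above-threshold 4-connected neighbor.

    This is an assumed, simplified stand-in for "spatial groupings of pixels".
    """
    rows, cols = len(fit), len(fit[0])
    above = [[value > threshold for value in row] for row in fit]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not above[r][c]:
                continue
            neighbors = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if any(0 <= i < rows and 0 <= j < cols and above[i][j]
                   for i, j in neighbors):
                count += 1
    return count


# Hypothetical fit and depth images for one spectral category.
fit = [[0.90, 0.95, 0.20],
       [0.10, 0.30, 0.20],
       [0.20, 0.10, 0.92]]
depth = [[0.10, 0.12, 0.01],
         [0.00, 0.02, 0.01],
         [0.01, 0.00, 0.08]]

images = output_images(fit, depth)
n_categories = 254
total_images = n_categories * len(images)
grouped = grouped_pixel_count(fit, threshold=0.8)
print(total_images, grouped)
```

In the example, the two adjacent high-fit pixels in the top row count as a group, while the isolated high-fit pixel in the lower right corner does not, which is the kind of distinction a spatial grouping test is meant to draw between coherent occurrences and isolated noise.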

The Cuprite area consists of dual hydrothermal systems whose origins appear to be chronologically independent (Swayze, 1997). The western hydrothermal center is zoned from muscovite (sericite) at the exterior, to kaolinite, Na-alunite, through K-alunite (Figure 9, spectra in Figures 1 and 12). The eastern center is similarly zoned from intermediate (Na-K) alunite, through K-alunite, with a central cap of siliceous rocks. The western center is eroded so that the highest portions of the hydrothermal system, where the siliceous cap would be, are no longer present, and we are now seeing exposed a once deeper level of the system. Some of the minerals exposed in the eastern center formed higher up in the hydrothermal system, further from the source of heat, and thus had lower temperatures involved in their formation compared to the minerals exposed in the western center (Swayze, 1997). The presence of certain minerals and their positions in their solid solution series (e.g. high versus low aluminum muscovite, Na- versus K-alunite, dickite, and others) in the different centers places constraints on the formation conditions (Swayze, 1997) that would be difficult to determine in the field without considerable sampling and extensive laboratory analysis. The Cuprite mineral maps also provide new insight into structures: there are boundaries in the mineral maps that indicate faults not shown on existing geologic maps (Swayze, 1997). Even with detailed field sampling, it is unlikely that maps of this detail showing the complex relationships could be produced by any other method than imaging spectroscopy.

Additional geologic studies at Yellowstone National Park using Tetracorder analysis (Livo et al., 1999, 2000) relate altered mineral occurrences to geologic processes and hydrothermal water chemistry. Kaolinite, alunite, and hematite form from rising acidic waters where the water flow is restricted in volume. These mineral deposits form topographic highs resistant to erosion. Higher-volume neutral pH waters form altered ground within basins that are capped with siliceous sinter and montmorillonite and usually lack iron oxides.

Geologists usually map geologic formations. Mineral occurrence maps, such as those for Cuprite, do not necessarily depict geologic formations. However, geologic formations are composed of minerals, so mineral maps can usually be used to map the extent of geologic formations. Mineral maps can also be used to sub-divide geologic member units, if there is a compositional change between members. Mineral maps provide a significant new tool for the field geologist, enabling him or her to focus on the interesting areas, helping to produce a better pro