USGS Spectroscopy Lab

http://speclab.cr.usgs.gov

Spectroscopic Determination of Leaf Biochemistry
Using Band-Depth Analysis of Absorption Features
and Stepwise Linear Regression

Raymond F. Kokaly and Roger N. Clark

U.S. Geological Survey, MS 973
Box 25046 Federal Center
Denver, CO 80225-0046
(303) 236-1359
(303) 236-1425 FAX

raymond@speclab.cr.usgs.gov
rclark@speclab.cr.usgs.gov

Derived from (use the following as reference):

Kokaly, R.F. and Clark, R.N., Spectroscopic Determination of Leaf Biochemistry Using Band-Depth Analysis of Absorption Features and Stepwise Linear Regression, Remote Sensing of Environment, Vol. 67, pp. 267-287, 1999.




ABSTRACT

We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band-depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 µm, 2.10 µm, and 2.30 µm that were highly correlated with the chemistry of samples from eastern U.S. forests. Band-depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin and cellulose concentrations. Correlations were highest for nitrogen (R2 from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band-depths. If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin and cellulose concentrations over large areas for use in ecosystem studies.
 

INTRODUCTION

Ecosystems play an important role in the exchange of water, energy and greenhouse gases between soil, vegetation, and the atmosphere (e.g. Mooney et al. 1987, Stuedler et al. 1989, and Wofsy et al. 1993). The ability to detect changes in ecosystem processes such as carbon fixation, nutrient cycling, net primary production and litter decomposition is an important part in defining global biogeochemical cycles and identifying changes in climate. In models of forest ecosystems, these processes have been linked to canopy biochemical content, specifically to nitrogen, lignin and cellulose concentrations (e.g. Aber and Federer 1992, and references therein). However, estimates of canopy chemistry by traditional field sampling methods are time consuming and difficult to make for large regional and global studies. Therefore, remote sensing measurement of canopy biochemistry is crucial to studying changes in ecosystem functioning. In this paper, we present a new approach for using spectroscopic measurements to estimate leaf biochemistry.

The measurement of plant biochemical content by remote sensing is a complex problem. Vegetation reflectance is primarily influenced by the optical properties of plant materials (e.g. proteins, lignin, cellulose, sugar, starch, etc.). Plant materials are composed largely of hydrogen, carbon, oxygen, and nitrogen. Thus, the absorption bands observed in reflectance spectra of vegetation arise from vibrations of C-O, O-H, C-H, and N-H bonds, as well as overtones and combinations of these vibrations (see review by Curran 1989). The absorptions from the different plant materials are similar and overlapping, so a single absorption band cannot be isolated and directly related to the chemical abundance of one plant constituent.

Several methods for estimating vegetation biochemistry with remote sensing are being investigated. These methods, including empirical equations and plant reflectance models, relate spectral reflectance measurements to the concentrations of plant biochemical constituents. Models describing radiative transfer through vegetation canopies are increasing the understanding of plant reflectance. Investigations by Aber et al. (1994) have shown that simple linear mixing models combining end member spectra of leaf constituents (chlorophyll, proteins, starch, etc.) are inadequate for making quantitative estimates of biochemical concentrations in leaves. More complex leaf radiative transfer models have been investigated (Fourty et al. 1994, Ganapol et al. 1998). At the canopy level, radiosity based models have demonstrated non-linear mixing effects on vegetation reflectance spectra (Borel and Gerstl 1994).

Many past studies have suggested that empirical estimates of canopy chemistry based on remote spectroscopic measurements may be possible (e.g. Card et al. 1988, Curran 1989, Wessman et al. 1989, and Martin and Aber 1997). These studies used stepwise multiple linear regression to predict canopy chemistry from derivative reflectance spectra. This methodology is based on laboratory techniques developed in the agriculture industry for rapid estimation of forage quality from the reflectance spectra of dried and ground foliage (Norris et al. 1976 and Marten et al. 1989). The original applications stressed the importance of controlled laboratory methods for reducing noise levels and the limited application of regression equations to samples of the same type used in calibration (Marten et al. 1989).

Recent studies have correlated plant canopy chemistry to imaging spectrometer data (Johnson et al. 1994, LaCapra et al. 1996 and Martin and Aber 1997); however, the results at leaf and canopy scales are inconsistent and the derived regression equations are not reliable predictors for other remotely-sensed data. Grossman et al. (1996) found the use of regression techniques with derivative reflectance spectra to give inconsistent results between different forest sites. Under the NASA Accelerated Canopy Chemistry Program, derivative analysis had also given inconsistent results between test sites (ACCP 1994).

In this paper, we develop a new approach for using spectroscopic measurements to estimate leaf biochemistry. This method is first applied to laboratory reflectance measurements of dried and ground leaves and then examined for its extension to remote sensing data. Traditional techniques of spectral analysis used in remote sensing by the terrestrial geology and planetary science disciplines were utilized, specifically, continuum-removal and band ratios. In this approach, continuum removal (Clark and Roush, 1984) was applied to broad absorption features in dry leaf spectra and absorption-band-depths relative to the continuum were calculated. Band-depths in each absorption feature were normalized. Normalization was investigated using the band-depth at the center of the feature and also the area under the band-depth curve. Figure 1 shows how spectral changes due to non-foliar influences are reduced by using continuum removal and normalization to the area of the absorption band. Figure 1a shows a typical dry leaf spectrum and the same spectrum modified by influences of 10% leaf water content, areal mixture with 25% soil background, and residual atmosphere absorptions due to a 50 meter layer of atmosphere. These spectra were analyzed by our new approach and the commonly employed 1st and 2nd derivative methods. In Figure 1b, three broad absorption features have been continuum-removed and normalized. The plotted lines from the original and modified spectra overlay closely and show the invariability of the normalized band-depth approach. In contrast, the 1st and 2nd derivative methods, Figures 1c and 1d, respectively, are sensitive to these common influences on remote sensing spectra and show more variation between the original and modified spectral shapes.
 


Figure 1. Spectral curves showing the dry leaf spectrum for sugar maple (solid line) and the effects of three contaminants to remote sensing data, an added 10% leaf water content, 25% soil background, and absorptions due to a 50 meter layer of atmosphere (dashed line). Spectral curves are for 1a) reflectance, 1b) continuum removed and normalized band-depths for three absorption features, 1c) the 1st derivative of log (1/reflectance), and 1d) the 2nd derivative of log (1/reflectance).


Normalized band-depths were analyzed by a stepwise multiple linear regression algorithm to select wavelengths highly correlated with leaf nitrogen, lignin and cellulose concentrations. The absorption bands used, the continuum end points, and the analysis were constrained to be applicable to remotely sensed imaging spectrometer data such as that from the NASA Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). In order to test the applicability of the approach to different data sets, wavelength selection was based on only two of the seven data sets used in this paper. Subsequently, the selected wavelengths were tested for correlation with the chemical concentrations of the five other data sets. To further test the robustness of this approach, regression equations developed from a subset of three of the data sets were used to predict the concentrations in the remaining samples. Finally, since a broadly applicable method for dry leaves serves only as a foundation for a remote sensing algorithm, the method was critiqued for its extension to remote sensing measurements of plant canopies. The influences of leaf water content, sensitivity to noise, required sensor bandwidths, incomplete vegetation coverage, and atmospheric absorptions were considered.
 
 

METHODS

Data Sets

Data used in this study comprised reflectance spectra and foliar chemistry measured from specimens of dried and ground leaves. These samples were gathered and analyzed by the NASA Accelerated Canopy Chemistry Program (ACCP 1994). We examined data from seven sites: three eastern U.S. forests (Blackhawk Island, Wisconsin; Harvard Forest, Massachusetts; and Howland, Maine), a slash pine plantation near Gainesville, Florida, rice fields in California, Douglas-fir seedlings grown in a greenhouse, and a data set consisting of a diversity of plants collected from Long Term Ecological Research (LTER) sites. More than 30 deciduous and coniferous tree species were represented by the 840 samples. Prior to analysis, all samples were oven dried at 70 °C for 48 hours, ground through a 1 mm mesh, and homogenized by Newman et al. (1994) as part of the ACCP. Spectral reflectance was measured with a NIRSystems Model 6250 scanning monochromator with spinning sample cup module by Bolster et al. (1996). Reflectance data were gathered over the wavelength range from 1.100-2.498 µm at a 2 nm interval with a 10 nm bandpass. For Douglas-fir samples, Johnson and Billow (1996) measured reflectance with a NIRSystems Model 6500 scanning monochromator over the wavelength range 0.400-2.498 µm at the same sampling interval and bandpass as the NIRS 6250 instrument.

With the exception of the Douglas-fir greenhouse samples, all data sets had chemical concentrations (% by dry weight) of nitrogen, cellulose and lignin determined using wet chemistry methods by Newman et al. (1994). Nitrogen determination was done using a Perkin-Elmer 2400 CHN Elemental Analyzer. Lignin and cellulose content were determined by a modified wood-products chemistry procedure. Douglas-fir samples gathered and measured by Johnson and Billow (1996) had total nitrogen determined with a Perstorp Analytical RFA/2 continuous flow autoanalyzer after samples were digested using a sulfuric acid mercuric oxide catalyst. Cellulose and lignin concentrations were not measured for these data.

Newman et al. (1994) discuss the accuracy of the wet chemistry methods. The coefficient of variation (C.V.), defined as the average standard deviation (S.D.) of replicate samples relative to the mean concentration, was less than 10% for lignin and cellulose. The combustion analysis applied to National Institute of Standards and Technology (NIST) pine standards underestimated nitrogen concentrations. The results of the CHN analysis were an average of 0.10% lower than the standard concentration of 1.20%, an 8.1% relative error. Similarly, Johnson and Billow (1996) found an average relative error of 6.6% for nine samples of NIST pine material for their total nitrogen determinations for Douglas-fir needles using a Perstorp flow autoanalyzer.

Statistical descriptors of the chemical concentrations at each site are shown in Table 1. Blackhawk Island was the most nutrient rich site with the highest average nitrogen concentration (2.38%). The samples included a wide range of temperate forest species, primarily sugar maple, basswood, oak, red and white pine and hemlock. At the Harvard Forest, the species composition consisted of conifer stands of red and white pine, Norway spruce and hemlock, and mixed hardwood stands, mostly oak and maple. Samples from Howland, Maine, a boreal-northern hardwood transitional forest, had the lowest average nitrogen concentration (1.34%) of the eastern U.S. forest data sets and included primarily hemlock, red maple, and pine. The slash pine samples were gathered from a plantation near Gainesville, Florida by Kupiec and Curran (1995). These samples have low nitrogen concentrations (mean = 1.00%), characteristic of coniferous species relative to deciduous species. Rice samples were gathered from fields in Pleasant Grove and Dunnigan, California, north of Sacramento, by LaCapra et al. (1996). Rice samples had low nitrogen concentrations (mean = 0.91%) and low lignin concentrations (mean = 14.7%), but high cellulose concentrations (mean = 56%). Douglas-fir samples covered a wide range of nitrogen concentrations (0.68-3.35%) as a result of giving the seedlings different amounts of fertilizer. The last data set examined in this paper, the LTER samples, consists of leaf, wood and root tissues collected from sites throughout North America and the tropics as part of the National Science Foundation (NSF) Long-term Ecological Research (LTER) inter-site decomposition study. These data include 14 species of coniferous and deciduous trees, seven of which are not present in the other data, as well as grass specimens and root tissue. The range of chemical concentrations in this diverse data set exceeds that of all others.
Overall, the foliar chemistry of the 840 samples covers a wide range, from approximately 0.2 to 3.5% for nitrogen, 8-44% for lignin, and 24-74% for cellulose.

Continuum Removal

Working with spectral reflectance data, broad absorption features in the dry leaf spectra centered near 1.730, 2.100, and 2.300 µm, were selected for continuum analysis. These are demonstrated for a dry leaf spectrum in Figure 2a. Previous studies have shown many chemical bonds in foliar constituents have vibrational absorptions in these wavelength regions (see the review by Curran 1989). The continuum is simply an estimate of the other absorptions present in the spectrum, not including the one of interest (Clark and Roush 1984). In practice, linear segments can be used to approximate the continuum. However, a Gaussian analysis might be more accurate for strong overlapping absorptions (e.g. see Clark 1981, Sunshine et al. 1990, Sunshine and Pieters 1993).


Figure 2. Continuum analysis is demonstrated for a white pine sample. Figure 2a shows the continua used to isolate each major absorption feature in dry leaf reflectance spectra (1.73 µm, 2.10 µm, and 2.30 µm). Figure 2b shows the result of continuum removal for the three features. The continuum end points are defined in Table 2.


Once the continuum line is established, the continuum-removed spectra are calculated by dividing the original reflectance values by the corresponding values of the continuum line (Figure 2b). The end points of the continua of the three absorption features used in this study are defined in Table 2. Although some imaging spectrometers, such as AVIRIS, fully cover the wavelength range from 0.350-2.500 µm, the analysis here excludes wavelengths near strong atmospheric absorptions (around 1.400 µm and 1.900 µm) and regions where the AVIRIS signal-to-noise ratio (S/N) is low due to atmospheric water absorption and decreasing solar flux (wavelengths greater than 2.400 µm). From the continuum-removed reflectance, the band-depth (D) of each point in the absorption feature is computed by:

D = 1 - R'                      (eqn 1)

where R' is the continuum-removed reflectance value.
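The continuum-removal step and eqn 1 can be sketched as follows. This is a minimal illustration, assuming a straight-line continuum drawn between the first and last channels of a feature. The function name `continuum_removed` and the end points used in the example are hypothetical; the end points actually used in this study are those defined in Table 2.

```python
import numpy as np

def continuum_removed(wavelengths, reflectance, left, right):
    """Return wavelengths, continuum-removed reflectance R' and band depth D
    for the absorption feature bounded by the end points `left` and `right`."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    inside = (wavelengths >= left) & (wavelengths <= right)
    w = wavelengths[inside]
    r = reflectance[inside]
    # Straight-line continuum through the first and last channels of the feature.
    continuum = np.interp(w, [w[0], w[-1]], [r[0], r[-1]])
    r_prime = r / continuum      # continuum-removed reflectance
    depth = 1.0 - r_prime        # eqn 1
    return w, r_prime, depth

# Synthetic example (not measured data): a sloped background multiplied by
# a Gaussian absorption that is 30% deep at 2.10 um.
w = np.linspace(2.0, 2.2, 101)
background = 0.5 + 0.2 * (w - 2.0)
spectrum = background * (1.0 - 0.3 * np.exp(-((w - 2.1) / 0.02) ** 2))
_, r_prime, depth = continuum_removed(w, spectrum, 2.0, 2.2)
```

In this synthetic case the sloped background is divided out, so the recovered band depth at the center is close to the 0.3 that was put in, and the depth is zero at both continuum end points.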

Band-depth Normalization

Reflectance spectra of plants measured in the laboratory vary subtly with changing leaf chemistry. However, remotely sensed measurements of vegetation canopies are also affected by other factors, including atmospheric absorptions, the abundance of other absorbers in the leaf (such as water), and soil exposed by incomplete coverage of vegetation. Therefore, analytical methods for estimation of plant biochemistry must overcome any sensitivity to these extraneous factors. Band ratios are routinely used with broad-band remote sensing data such as Landsat to reduce topographic and atmospheric effects. Similarly, we used a normalization procedure on band-depths calculated from continuum-removed reflectance spectra to minimize these influences. The normalized band-depths (Dn) within the continuum-removed absorption band are calculated by dividing the band-depth of each channel by the band-depth at the band center (Dc):

Dn = D / Dc                        (eqn 2)

where the band center is the minimum of the continuum-removed absorption feature (see Table 2). Variations of Dn with wavelength describe the shape of the absorption feature. Changes in the shape of absorption features, such as those shown in Figure 3, are correlated to foliar chemistry.
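A minimal sketch of eqn 2 follows. It assumes the band depths D of one feature are already available as an array. The function name `normalize_to_center` and its optional `center_index` argument are hypothetical; by default the deepest channel of the individual spectrum is taken as the band center, whereas a fixed center channel (as listed in Table 2) could be passed instead.

```python
import numpy as np

def normalize_to_center(depth, center_index=None):
    """Normalized band depths Dn = D / Dc (eqn 2)."""
    depth = np.asarray(depth, dtype=float)
    if center_index is None:
        # Assumption: the band center is the deepest channel of this spectrum.
        center_index = int(np.argmax(depth))
    return depth / depth[center_index]

# Illustrative band depths (not measured data).
strong = np.array([0.0, 0.1, 0.3, 0.2, 0.0])
weak = 0.5 * strong   # same shape, half the absorption strength

dn_strong = normalize_to_center(strong)
dn_weak = normalize_to_center(weak)
```

The two profiles differ only by a constant factor, so their normalized band depths are identical. This idealized case illustrates why Dn describes the shape of a feature rather than its overall strength.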


Figure 3. Example shape changes in the normalized band-depth profiles for the 2.30-µm absorption feature in dry leaf data are shown for two samples, white pine (solid line) and red maple (dashed line). The original reflectance spectra have been continuum removed and the band depths normalized to the band center at 2.304 µm. This process clearly shows the different absorption shapes in these two samples. The shape changes are due to varying amounts of absorbers in the dry leaf, including nitrogen contained in proteins, and structural biochemicals such as lignin and cellulose.


Stepwise Multiple Linear Regression

Normalized band-depth (Dn) values for all wavelengths in the three continuum-removed absorption features were analyzed using a stepwise multiple linear regression routine to determine wavelengths correlated with chemistry. Stepwise multiple linear regression fits an observed dependent data set (e.g., chemical concentration of nitrogen, lignin or cellulose) using a linear combination of independent variables (e.g., Dn values at discrete wavelengths). The result of this statistical method was a number (N) of wavelengths (λi, i = 1 to N) correlated to the dependent variable and a linear equation combining the values of the independent data set at these wavelengths with coefficients (a0 and ai, i = 1 to N) established by the regression. The linear equation determined by the regression has the form:

Chemical concentration = a0 + a1 Dn(λ1) + a2 Dn(λ2) + ... + aN Dn(λN)        (eqn 3)

where Dn(λi) is the normalized band-depth at wavelength λi.

The stepwise regression was run separately for each of the three leaf constituents: nitrogen, lignin, and cellulose. A stepwise regression routine in IDL (Interactive Data Language), STEPWISE, was used. This routine is based on an algorithm by Afifi and Azen (1971).
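The general idea of selecting wavelengths by regression can be illustrated with a simplified greedy forward selection. This sketch is not the IDL STEPWISE routine and does not implement the Afifi and Azen (1971) algorithm: it omits the statistical tests for entering and removing variables and simply adds the wavelength that most improves R2 until the gain falls below a threshold. The names `fit_linear` and `forward_select` and the parameters `max_terms` and `min_gain` are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = a0 + sum_i a_i * X[:, i]; returns (coefficients, R2)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2

def forward_select(Dn, y, max_terms=5, min_gain=0.01):
    """Greedy forward selection of columns (wavelengths) of Dn, shape (samples, wavelengths)."""
    selected, best_r2 = [], 0.0
    while len(selected) < max_terms:
        candidates = [j for j in range(Dn.shape[1]) if j not in selected]
        if not candidates:
            break
        # R2 obtained by adding each remaining wavelength to the current set.
        scores = [fit_linear(Dn[:, selected + [j]], y)[1] for j in candidates]
        k = int(np.argmax(scores))
        if scores[k] - best_r2 < min_gain:
            break
        selected.append(candidates[k])
        best_r2 = scores[k]
    coef, r2 = fit_linear(Dn[:, selected], y)
    return selected, coef, r2

# Synthetic example: the concentration depends on columns 3 and 7 only.
rng = np.random.default_rng(0)
Dn = rng.normal(size=(60, 10))
y = 1.0 + 2.0 * Dn[:, 3] - 1.5 * Dn[:, 7]
selected, coef, r2 = forward_select(Dn, y)
```

The returned `coef` holds a0 followed by one coefficient per selected wavelength, in the order of `selected`, matching the form of eqn 3.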

Simulation of Wider Bandpass

Most past studies using stepwise regression to relate chemical concentrations to reflectance spectra have not used the reflectance measurements directly. Instead, these methods are applied to an approximation of absorptance which is calculated by taking the logarithm of the inverse of the reflectance, log(1/R). Furthermore, the best results to date have been obtained with approximations of the first and second derivatives of the log(1/R) data. This requires smoothing of the data, usually over a 10 nm wavelength range, and data transformation using finite-difference approximations to derivatives (Dixit and Ram 1985). In general, the calculation of log(1/R) derivatives exacerbates the influence of noise and, although smoothing compensates in part for the reduction in signal-to-noise, wide smoothing ranges can severely attenuate absorption features (O'Haver and Begley 1981). These methods have been routinely applied to high resolution laboratory spectrometer data that have very high signal-to-noise and fine sampling intervals.
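For comparison, the derivative transformation described above can be sketched as follows. This is only an illustration under stated assumptions: a base-10 logarithm, a simple boxcar (running mean) smoother, and central finite differences. The function name `log_inverse_derivatives` and the `window` parameter are hypothetical, and the exact smoothing and differencing schemes used in the cited studies may differ.

```python
import numpy as np

def log_inverse_derivatives(wavelengths, reflectance, window=5):
    """Smooth log10(1/R), then return its 1st and 2nd finite-difference derivatives."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    absorbance = np.log10(1.0 / np.asarray(reflectance, dtype=float))
    # Boxcar smoothing; 5 channels at 2 nm spacing span roughly 10 nm.
    # mode="same" pads with zeros, so the first and last few channels are unreliable.
    kernel = np.ones(window) / window
    smoothed = np.convolve(absorbance, kernel, mode="same")
    first = np.gradient(smoothed, wavelengths)
    second = np.gradient(first, wavelengths)
    return first, second

# Synthetic check: log10(1/R) that is linear in wavelength has a constant
# first derivative and a zero second derivative away from the edges.
w = np.linspace(2.0, 2.2, 101)
R = 10.0 ** (-(0.3 + 0.5 * (w - 2.0)))
first, second = log_inverse_derivatives(w, R)
```

Because each differencing step divides small differences between neighboring channels by the channel spacing, any noise in the reflectance is amplified, which is the sensitivity noted in the text.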

In order to develop a method applicable to remote sensing data, differences between laboratory and remote sensing measurements must be considered. Calculations appropriate for laboratory spectra with 10 nm resolution and 2 nm spacing may not transfer directly to remote sensing data, because sampling positions and intervals are different. Also, the laboratory spectrometers used to measure the dry leaf reflectance spectra had much higher S/N than is common for current imaging spectrometers like AVIRIS. As a result, we averaged reflectance values over an increasing range of wavelengths to investigate the use of larger bandpasses. This potentially allows more than one AVIRIS channel to be included in the analysis, providing better signal-to-noise. The results of simulating a wider bandpass are presented in a later section of the paper. In the majority of the paper, unless otherwise noted, calculations were made using single channel values which have a nominal 10 nm bandpass.
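One simple way to simulate a wider bandpass is a running mean over adjacent channels. The sketch below assumes equally weighted channels, which is an assumption made for illustration and not necessarily the weighting used in this study; the function name `widen_bandpass` is hypothetical.

```python
import numpy as np

def widen_bandpass(values, n_channels):
    """Running mean over n_channels adjacent channels.

    Returns len(values) - n_channels + 1 averaged values. Apply the same
    function to the wavelength array to obtain the matching band centers."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(n_channels) / n_channels
    return np.convolve(values, kernel, mode="valid")

# Synthetic example: uncorrelated noise averaged over 4 channels.
rng = np.random.default_rng(0)
noise = rng.normal(scale=0.01, size=20000)
averaged = widen_bandpass(noise, 4)
noise_ratio = averaged.std() / noise.std()
```

For uncorrelated noise, averaging n channels reduces the noise standard deviation by about the square root of n, so `noise_ratio` is close to 0.5 here. The cost is a coarser spectral resolution, which can attenuate narrow spectral structure.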
 

RESULTS

Wavelength Selection from Blackhawk Island and Harvard Forest Data Sets

Two of the eastern U.S. forest data sets, Blackhawk Island and Harvard Forest, were used to derive wavelengths correlated with nitrogen, lignin and cellulose concentrations. Stepwise multiple linear regression was applied to the normalized band-depths of these data. Figures 4a-c show the locations of the wavelengths selected by the regression, plotted on a dry leaf spectrum together with the continua and band centers (C1730, C2100 and C2300). Five wavelengths were selected in the nitrogen regressions, all in the 2.100 µm absorption feature. Lignin required six wavelengths, two in the 1.730 µm and four in the 2.300 µm absorption features. Eight wavelengths, a few in each of the three broad absorption features present in dry leaf spectra, were selected in the regression to cellulose concentrations.


Figure 4. Wavelengths correlated with dry leaf chemistry as selected using stepwise linear regression on normalized band-depths for samples from the Blackhawk Island and Harvard Forest data sets. The position of correlated wavelengths and the absorption band centers are shown for nitrogen, lignin and cellulose in Figures 4a, 4b, and 4c, respectively.


Testing Correlation of Wavelengths to the Chemistry at Other Sites

Following wavelength selection using the Blackhawk Island and Harvard Forest sites, linear regression was used with these wavelengths to independently establish regression equations at each of the sites. Table 3 shows the R2 and standard error of calibration (SEC) for each of the sites and each constituent. The standard error of calibration is the root mean square error (RMSE) between the chemical concentrations calculated from the regression equation and the values obtained by wet chemistry laboratory methods. Nitrogen correlations were very high (R2 from 0.90 to 0.97) and SEC values were low (0.06 to 0.17% nitrogen by dry weight). Relative to the mean nitrogen concentration of each data set, the SEC values were less than 10%.
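The statistics reported here can be computed as in the following sketch. It follows the definition of SEC given above (the RMSE between regression estimates and wet chemistry values) and assumes that R2 is the squared Pearson correlation between the two. The function name `regression_stats` is hypothetical and the numbers in the example are illustrative, not values from Table 3.

```python
import numpy as np

def regression_stats(measured, estimated):
    """R2, RMSE (SEC) and RMSE relative to the mean measured concentration."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    rmse = float(np.sqrt(np.mean((estimated - measured) ** 2)))
    # Assumption: R2 is the squared correlation of estimates and measurements.
    r = np.corrcoef(measured, estimated)[0, 1]
    relative = 100.0 * rmse / float(measured.mean())
    return {"r2": float(r ** 2), "rmse": rmse, "relative_error_pct": relative}

# Illustrative nitrogen concentrations (% dry weight), not data from this study.
wet_chemistry = [1.0, 1.5, 2.0, 2.5]
from_regression = [1.1, 1.4, 2.1, 2.4]
stats = regression_stats(wet_chemistry, from_regression)
```

In this example every estimate is off by 0.1% nitrogen, so the RMSE is 0.1 and the error relative to the mean concentration of 1.75% is about 5.7%.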

Cellulose and lignin regressions resulted in lower correlations. In general, cellulose correlations were good (R2 from 0.75 to 0.93). Correlations for lignin were all significant (R2 from 0.65 to 0.83) except for the rice field data, which had a low R2 of 0.32. This rice data set had a much lower average lignin concentration relative to the other data sets (approximately 25% lower). In fact, many of the samples in the rice data set were below the range of lignin concentrations present in the calibration data set. However, the SEC relative to the mean lignin concentration for the data set was only 6.2%.

Testing the Predictive Ability of Regression Equations

Regression equations were tested for their ability to predict chemistry for new data sets. To accomplish this, we used two-thirds of the samples from the eastern U.S. forest sites (Blackhawk Island, Harvard Forest and Howland, Maine) and the previously derived wavelengths to establish the coefficients in the regression equations. This calibration equation was then used to predict the chemical concentrations of the remaining validation data sets: the one-third of forest samples not used in the calibration, the slash pine data, the rice data, and the Douglas-fir data. The results for the calibration regression and validation predictions are presented in Table 4.
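The calibration and validation procedure can be sketched as follows: coefficients are fit on a calibration subset using Dn values at the previously selected wavelengths, and the resulting equation is then applied unchanged to the held-out samples. How the two-thirds subset was chosen is not described in this section, so the random split below is an assumption, and the function names `split_two_thirds`, `calibrate` and `predict` are hypothetical.

```python
import numpy as np

def split_two_thirds(n_samples, seed=0):
    """Randomly assign two-thirds of the samples to calibration, the rest to validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_cal = (2 * n_samples) // 3
    return order[:n_cal], order[n_cal:]

def calibrate(Dn_cal, y_cal):
    """Least-squares coefficients [a0, a1, ..., aN] for eqn 3."""
    A = np.column_stack([np.ones(len(y_cal)), Dn_cal])
    coef, *_ = np.linalg.lstsq(A, y_cal, rcond=None)
    return coef

def predict(coef, Dn):
    """Apply a calibration equation to normalized band depths at the same wavelengths."""
    return coef[0] + Dn @ coef[1:]

# Synthetic example: 90 samples, Dn at 5 selected wavelengths.
rng = np.random.default_rng(1)
Dn = rng.normal(size=(90, 5))
y = 1.2 + Dn @ np.array([0.5, -0.3, 0.8, 0.1, -0.6])
cal, val = split_two_thirds(len(y))
coef = calibrate(Dn[cal], y[cal])
y_val_pred = predict(coef, Dn[val])
```

The same `predict` call can be applied to any other data set, provided its band depths are computed at the same wavelengths and normalized in the same way as the calibration data.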

As expected, correlations were highest and RMSE lowest for the predictions of the remaining one-third of the eastern forest samples. The RMSE in nitrogen estimates for slash pine and Douglas-fir data sets were only slightly worse than the eastern forest validation set. Predictions for lignin concentrations of the slash pine data were poor. However, cellulose estimates for the slash pine samples were better (R2=0.35 and RMSE=2.60, only a 7.2% error relative to the mean concentration).

The rice data set provided a test for the application of regression equations derived from forest foliage to non-forest vegetation. Nitrogen predictions were very good (R2=0.83 and RMSE=0.13). However, the lignin and cellulose concentrations for the rice data were not well predicted. This might be influenced by structural or biochemical differences particular to this grass species compared to all the other samples of tree foliage. Furthermore, as shown in Table 1, mean concentrations for foliage constituents in the rice samples are significantly lower for nitrogen and lignin and much higher for cellulose than the other sites. Indeed, the majority of rice samples fall outside the concentration ranges of the other data sets.

The LTER data set also contained some types of plant tissue not present in the calibration data set, including forest species different from those in the calibration set, grasses, and roots. Again, lignin and cellulose predictions were not robust. Nitrogen predictions were surprisingly good (R2=0.93 and RMSE=0.23).
 
 

DISCUSSION

Leaf Chemical Concentrations

Wavelength Selection

Three of the five wavelengths selected for nitrogen by the stepwise multiple linear regression are at or near wavelengths related to chemical bonds in leaf materials. Curran (1989; Table 1) and Peterson and Hubbard (1992; Tables 1-4) summarized wavelengths from 0.400-2.500 µm that have been attributed to vibrational absorption by chemical bonds in leaf material. These tables also listed which leaf materials (e.g., lignin, cellulose, protein, nitrogen, starch, sugar, etc.) have been linked to the wavelength of the absorption. Two wavelengths selected by the regression, 2.05 µm and 2.18 µm, are at wavelengths related to protein bonds, specifically bonds including nitrogen. The wavelength at 2.078 µm is near an O-H stretch/O-H deformation absorption linked by past studies to sugar or starch. The remaining wavelengths selected by the regression, 2.036 and 2.152 µm, have not been attributed to any bonds within leaf material. Finally, the band center wavelength, 2.106 µm, which was not selected by the regression analysis but was used for normalization of the band-depths, is near an absorption attributed to the third overtone of O-H bend/C-O stretch/C-O-C stretch, which in past studies has been used for cellulose estimations. Bolster et al. (1996) also applied stepwise multiple linear regression and partial least squares methods to derivatives of log(1/R) for the eastern U.S. forest data. Similar to the results presented here, the wavelengths 2.056 µm, 2.076 µm and 2.168 µm were found to be highly correlated with nitrogen by these methods.

All wavelengths used in the lignin regression, including the band centers, are near wavelengths related to chemical bonds in the leaf. The locations are within 10 nm of those listed in Curran (1989) and Peterson and Hubbard (1992). The majority of the eight wavelengths used in the regression are near C-H or O-H bonds. The band center at 2.306 µm is also near an N-H stretch. Perhaps due to the complex chemistry of the lignin molecule and the presence of the same bond in many leaf materials, these wavelengths have been related in past studies to a variety of leaf constituents, including lignin, protein, starch, cellulose and sugar (Curran 1989).

Cellulose regressions used some wavelength positions near those of nitrogen and lignin regressions. Three wavelengths were unique to the cellulose regression, 2.066 µm, 2.202 µm and 2.288 µm. An N-H bond is associated with the 2.066 µm wavelength selection. A C-H bond has been linked to an absorption near 2.288 µm. Finally, the 2.202 µm wavelength has not been attributed to any chemical bonds in leaf materials by past studies.

In summary, most of the wavelengths selected in the nitrogen, lignin and cellulose regressions are near wavelengths corresponding to vibrational absorptions of chemical bonds within leaf materials. Nitrogen had the clearest correspondence between regression wavelengths and chemistry. Two of the five wavelengths in the nitrogen regression (2.050 and 2.180 µm) were at wavelengths where protein bonds involving nitrogen are located. In many past studies, wavelengths near these have been selected by regression analysis. The band center of the 2.100 µm absorption feature is near an absorption associated with O-H and C-O bonds that have been related to starch and cellulose. For cellulose and lignin, there was no explicit correspondence between wavelengths selected by stepwise regression and chemical bonds solely present in these compounds. The wavelengths were all near O-H, C-H, and C-O bonds that have been attributed to a wide variety of leaf constituents. The selection of similar wavelengths by the cellulose, lignin and nitrogen regressions may be influenced by the fact that these bonds are common in organic molecules.

Application of Wavelengths Derived from Blackhawk Island and Harvard Forest to All Sites

As shown in Table 3, wavelengths derived from an analysis of only the Blackhawk Island and Harvard Forest sites were used to successfully estimate concentrations with multiple linear regression. The results for nitrogen regressions have high correlations (R2 > 0.90) for both forest and nonforest data sets. These consistent results are significant considering that Grossman et al. (1996) found that wavelengths derived from any single data set were not able to reliably predict nitrogen concentrations in other data sets. Those tests were performed on log(1/R) and its first and second derivatives and only achieved low R2 (0.14 to 0.49). A few predictions in these previous studies were higher, but they did not consistently give good results for other data sets. In contrast, the results of this study, which used normalized band-depths describing the shape of absorption bands, show consistently high correlations and low errors for nitrogen, lignin and cellulose.

Predictive Ability of Regression Equations

In order to be useful for remote sensing, an algorithm for predicting concentrations should be applicable over a wide variety of vegetation types. To test the method presented here, regression equations established by using two-thirds of the eastern U.S. forest data as a calibration were used to predict the concentrations in the remaining data (Table 4). As expected, predictions for the remaining one-third of eastern U.S. forest data were good. Predictions for the slash pine and Douglas-fir seedlings were less accurate but still very good. Predictions of nitrogen concentrations in rice and LTER samples were good despite the fact that these tissues differ from the leaf and needle material of the calibration data and the fact that the mean concentration of the rice data is lower than the concentration of the calibration data.
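The calibration and validation procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic stand-ins (not the paper's samples), the function names are hypothetical, and R2 is taken here as the squared Pearson correlation between measured and predicted values, which may differ from the definition used for Table 4.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = b0 + sum_i b_i * X[:, i]; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply a fitted equation to normalized band-depths X (samples x wavelengths)."""
    return coef[0] + X @ coef[1:]

def validation_stats(y_true, y_pred):
    """Return (R2, RMSE, bias) of predictions against measured values."""
    resid = y_pred - y_true
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    bias = float(np.mean(resid))
    r2 = float(np.corrcoef(y_true, y_pred)[0, 1] ** 2)  # assumed R2 definition
    return r2, rmse, bias

# Synthetic stand-in: 300 samples, 5 "band-depth" predictors (hypothetical values).
rng = np.random.default_rng(0)
n_samples, n_bands = 300, 5
X = rng.uniform(0.2, 1.0, size=(n_samples, n_bands))
true_coef = np.array([1.0, 2.0, -1.5, 0.5, 1.0, -0.8])
y = predict(true_coef, X) + rng.normal(0.0, 0.05, n_samples)

# Two-thirds of the samples calibrate the equation, one-third validates it.
order = rng.permutation(n_samples)
n_cal = (2 * n_samples) // 3
cal, val = order[:n_cal], order[n_cal:]
coef = fit_linear(X[cal], y[cal])
r2, rmse, bias = validation_stats(y[val], predict(coef, X[val]))
```

Reporting bias separately from RMSE matters here because a systematic offset for a new vegetation type can hide behind a high correlation.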

Extending the regression equations to other data sets may be influenced by the normalization procedure, since the band center used for normalization was not selected by the regression. For nitrogen, the band center of the 2.100 µm absorption feature is near an absorption arising from bonds not containing nitrogen. In order to assess the influence of the normalization, predictions were also made from band-depths normalized to the area under the band-depth curve (see Figure 3). The results of these predictions are shown in Table 5. Overall, these predictions are highly accurate and improved for the slash pine, rice, and Douglas-fir data sets. The bias evident in predictions of nitrogen for slash pine and Douglas-fir using normalization to band center (Table 4) was almost fully removed in predictions made with normalization to the band area (Table 5). Figure 5 shows the accuracy (R2 = 0.95, RMSE = 0.17, bias = 0.00) of all predicted nitrogen concentrations using area normalization plotted against the wet chemistry measured values. Such highly accurate predictions for vegetation types and concentrations beyond those present in the calibration highlight the consistency of the normalized band-depth approach. Indeed, inconsistency has been a major criticism of SMLR applied to derivatives of log(1/R) (Grossman et al., 1996).
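The two normalizations compared above can be illustrated with a short sketch. It assumes a straight-line continuum joining the end points of the feature and takes the band center to be the channel of maximum depth. The spectra are synthetic Gaussian features, and all function names are hypothetical.

```python
import numpy as np

def band_depth(wavelength, reflectance):
    """Continuum-removed band depth, D = 1 - R/Rc, with Rc a straight line
    joining the first and last channels of the feature (assumed continuum)."""
    continuum = np.interp(wavelength, [wavelength[0], wavelength[-1]],
                          [reflectance[0], reflectance[-1]])
    return 1.0 - reflectance / continuum

def trapezoid_area(wavelength, depth):
    """Area under the band-depth curve by the trapezoid rule."""
    return float(np.sum(0.5 * (depth[1:] + depth[:-1]) * np.diff(wavelength)))

def normalize_to_center(depth):
    """Divide by the depth at the band center (taken as the maximum depth)."""
    return depth / depth.max()

def normalize_to_area(wavelength, depth):
    """Divide by the area under the band-depth curve."""
    return depth / trapezoid_area(wavelength, depth)

# Two synthetic 2.1 µm features with the same shape but different strength.
wl = np.linspace(2.0, 2.2, 41)
shape = np.exp(-((wl - 2.1) / 0.035) ** 2)
strong = band_depth(wl, 0.45 - 0.12 * shape)
weak = band_depth(wl, 0.45 - 0.06 * shape)

center_strong, center_weak = normalize_to_center(strong), normalize_to_center(weak)
area_strong, area_weak = normalize_to_area(wl, strong), normalize_to_area(wl, weak)
```

In this idealized case both normalizations remove the overall band strength and leave only the band shape, so the strong and weak features give identical normalized curves. The two differ in what they divide by: a single channel for center normalization, and an integral over the whole feature for area normalization.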

Figure 5. Prediction of nitrogen concentrations in the validation data set using the regression equation derived from area normalized band-depths of the calibration data set (two-thirds of the eastern U.S. forest samples). The data from different sites are represented by the various symbols: triangle = LTER, asterisk = Rice, cross = slash pine, square = one-third eastern U.S. forests validation set, diamond = Douglas-fir.


Lignin predictions in Tables 4 and 5 also show that normalization to the area of the absorption band is the more accurate method. Forest leaf and needle samples are well predicted by the regression equation developed from the area normalized band-depths of the eastern U.S. calibration data set. The forest validation set and slash pine data set had low RMSE (2.57 and 2.79, respectively). The rice and LTER data have higher errors of prediction (RMSE of 7.49 and 6.88, respectively). The mean lignin concentration of the rice samples (~15%) falls well below that of the forest data (~22%). Thus, in contrast to predictions for nitrogen, the predictions of lignin concentrations for samples beyond the range of the calibration data set and for materials not present in the calibration data set are not robust.

Cellulose predictions from band center normalization (Table 4) and band area normalization (Table 5) show the same trends as lignin predictions. However, band center normalization is slightly more accurate. This may be explained by the fact that the absorption band centers (1.728, 2.106, and 2.306 µm) lie near C-H, O-H and C-O bonds. Such bonds are present in the cellulose molecule. As with lignin, predictions for the forest data were better than for non-forest tissues that have cellulose concentrations out of the range of the calibration data.

Regressions with the Entire Data Set

Considering the improved results for nitrogen predictions using band area normalization and the poor predictions for lignin and cellulose concentrations in non-forest samples, more generally applicable regression equations should be derived using area normalized band-depths. Figures 6a-c show the results of such regressions using all samples in the data sets for nitrogen, lignin and cellulose, respectively. The derived constants and coefficients for terms in these regression equations are given in Table 6. The nitrogen correlation is again very high (R2 = 0.94, SEC = 0.17) compared with good results for lignin (R2 = 0.66, SEC = 2.90) and cellulose (R2 = 0.82, SEC = 3.57) regressions. Given the wide range of biochemical concentrations and numerous vegetation types in the 840 samples, these results suggest that these equations may be applicable to future dry leaf spectral studies. In the remainder of this paper, calculations of the concentrations of nitrogen, lignin and cellulose are made using normalization to the band area and the constants and coefficients in Table 6.

Figure 6. Results of regressions of band-depths normalized to band area with leaf chemistry using all data sets. Regressions for nitrogen, lignin and cellulose are shown in Figures 6a, 6b, and 6c, respectively. The data from different sites are represented by the various symbols: circle = Blackhawk Island, filled circle = Harvard Forest, triangle = LTER, asterisk = Rice, cross = slash pine, x = Howland, Maine, diamond = Douglas-fir (nitrogen only). Regression equations are given in Table 6.


Although nitrogen is a small fraction of the dry leaf, the nitrogen correlation is higher than for lignin and cellulose which comprise a more substantial portion of the dry leaf (see Table 1). Proteins contain the majority of nitrogen in the leaf. According to Elvidge (1990), the most abundant nitrogen bearing compound in green leaves is D-ribulose 1-5-diphosphate carboxylase. As shown by Elvidge (1990), the N-H bonds in this compound absorb at 2.05 µm and 2.17 µm on the edge of the broad 2.1 µm absorption feature in dry leaves. In contrast, Elvidge (1990) shows the reflectance spectra of dry plant materials such as cellulose, hemicelluloses, lignin, starch and pectins which all have absorptions centered near 2.1 µm and 2.3 µm.

Absorption of near-infrared radiation by cellulose, a carbohydrate molecule, is due to O-H and C-H bonds. For this large molecule, overlapping overtone and combination absorptions arising from these basic bonds contribute to the broad nature of the 2.1 µm absorption feature. These O-H and C-H bonds are common to most organic materials within the leaf and as a result may be confused with absorptions due to other leaf constituents. For example, starch has nearly the same chemical structure as cellulose. Both are composed of monomers of glucose with the only difference being a change in configuration of hydrogen and hydroxyl on one carbon atom. As a result, the near infrared spectra of these leaf constituents are very similar, as presented by Elvidge (1990). We suggest that this similarity in cellulose and starch reflectance degrades the estimation of cellulose concentrations from reflectance spectra and explains the larger errors in cellulose estimations compared to nitrogen estimations in this and other studies.

Lignin consists of carbon, oxygen and hydrogen in an aromatic structure. The aromatic structure with its conjugated double bonds changes its absorption features and this may aid in estimation of its concentration using near-infrared reflectance spectroscopy. This may explain the lower errors in lignin estimation compared to cellulose.

Thus, chemical bonds that are unique in the wavelength location of their absorption compared to other dry leaf constituents increase the ability to estimate biochemical concentrations from reflectance spectra. Nitrogen-hydrogen bonds are uniquely situated, in comparison to other absorptions in the dry leaf, on the shoulders of the broad 2.1 µm absorption feature. The changing strength of N-H absorptions on the edges of this feature alters its shape and allows the estimation of the concentration of such a small fraction of the dry leaf. In comparison, cellulose, a large fraction of the leaf by weight, has absorption features similar to those of other leaf components, and errors in estimation of this compound are higher than for nitrogen.

Accuracy Requirements

Schimel (1995) discusses accuracy and precision required from remote sensing in order to map large scale variations in foliar nitrogen and lignin. Accuracy of ~0.5%(absolute) nitrogen is necessary to distinguish between ecosystems with differences in nitrogen large enough to affect photosynthesis. Accuracy of ~5.0% lignin concentration is needed to detect between-system gradients. Higher resolution, ~1% lignin, is required to detect changes within a single forest type in order to map changes relevant to decomposition and nutrient cycling.

The method used in this study has demonstrated errors below the required accuracy for nitrogen concentrations, even when making predictions to different vegetation types (i.e., tree foliage to rice). Lignin estimates were also within the desired accuracy for detecting between-system gradients. However, the 1% resolution needed for detecting changes within a forest type was not reached with this method. Such a high accuracy may be difficult to achieve since lignin is not a well defined compound and even the errors in laboratory wet chemistry methods are at this limit. Finally, while the results from the analysis of laboratory measured reflectance spectra of dried and ground samples presented here are encouraging, accuracy will degrade for extensions of this method to fresh, whole leaves or remotely sensed canopies. The next part of this paper discusses the effects that common influences on remotely sensed data will have on the prediction of chemical concentrations.

Remote Sensing Considerations

Although the analysis of dried and ground leaves indicates that accurate predictions of chemical concentrations can be made, additional complexities will be encountered at the remote sensing scale, including: different instrument characteristics (S/N and bandpass), atmospheric effects, leaf water, fractional canopy coverage (e.g., the influence of soil background), and canopy architecture (e.g., leaf area index and leaf angle distribution). This section addresses the effects of several of these influences on this method.

Noise Sensitivity Analysis

Sensitivity of the regression equations to noise was tested by adding Gaussian noise at various levels to a dry leaf spectrum. Chemical concentrations were calculated using the noisy spectra and the equations from Table 6. For each level of noise, measured by the root mean square, many "noisy" spectra were generated and a new calculation of chemical concentrations made. The root mean square difference between the original and the "noisy" spectra was calculated at each noise level. Figures 7a-c show the results for nitrogen, lignin and cellulose, respectively. A high signal-to-noise (S/N) of approximately 700 is required to achieve an RMSE of 0.50% nitrogen (absolute) as specified by Schimel (1995). For an RMS error of 5% (absolute) in lignin concentration as required by Schimel (1995), the S/N need only be 100. However, for the desired accuracy of 1% lignin concentration, the S/N must be 1250. For cellulose, an S/N of 550 gives an RMSE of 5%.
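A Monte Carlo test of this kind might be sketched as follows. The spectrum is synthetic, the regression coefficients and channel indices are hypothetical placeholders for the Table 6 values, and S/N is assumed to mean the mean reflectance divided by the noise standard deviation, so the numbers produced are illustrative only.

```python
import numpy as np

def area_normalized_depth(wavelength, reflectance):
    """Continuum-removed band depth divided by the area under the depth curve."""
    continuum = np.interp(wavelength, [wavelength[0], wavelength[-1]],
                          [reflectance[0], reflectance[-1]])
    depth = 1.0 - reflectance / continuum
    area = np.sum(0.5 * (depth[1:] + depth[:-1]) * np.diff(wavelength))
    return depth / area

def estimate(wavelength, reflectance, coef, band_idx):
    """Linear equation applied to normalized band-depths at selected channels."""
    nbd = area_normalized_depth(wavelength, reflectance)
    return float(coef[0] + nbd[band_idx] @ coef[1:])

def noise_rmse(wavelength, reflectance, coef, band_idx, snr, n_trials, rng):
    """RMS difference between estimates from noisy spectra and the clean estimate."""
    clean = estimate(wavelength, reflectance, coef, band_idx)
    sigma = reflectance.mean() / snr  # assumed definition of S/N
    errors = []
    for _ in range(n_trials):
        noisy = reflectance + rng.normal(0.0, sigma, reflectance.size)
        errors.append(estimate(wavelength, noisy, coef, band_idx) - clean)
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic dry-leaf feature near 2.1 µm and hypothetical coefficients.
wl = np.linspace(2.0, 2.2, 41)
leaf = 0.45 - 0.12 * np.exp(-((wl - 2.1) / 0.035) ** 2)
coef = np.array([0.5, 0.10, -0.05, 0.08])   # hypothetical, not Table 6
band_idx = np.array([12, 20, 30])           # hypothetical channel choices
rng = np.random.default_rng(1)
rmse_by_snr = {snr: noise_rmse(wl, leaf, coef, band_idx, snr, 500, rng)
               for snr in (100, 300, 1000)}
```

Plotting `rmse_by_snr` against S/N and reading off where the curve crosses a target error is one way to obtain required S/N values of the kind quoted above.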

Reflectance levels measured over forest canopies by AVIRIS are typically in the 5% reflectance range in the 2.100 µm and 2.300 µm regions, and ~15% in the 1.700 µm region (e.g. Martin and Aber 1994, Clark et al. 1994). These low reflectance levels are due to water in the vegetation, which has absorption bands near these wavelength regions. The required S/N calculated from the dry leaves measured in the laboratory, with an average reflectance level of 40%, must increase by a factor of 8 for AVIRIS data over vegetation canopies. This results in requirements for S/N of 800, 4400, and 5600 for lignin, cellulose and nitrogen, respectively. Presently, we have observed AVIRIS signal-to-noise ratios of about 360 at 2.100 µm. Several methods may be employed to increase the S/N of the remote sensing measurements, including channel averaging and pixel averaging. The S/N increases with the square root of the number of pixels averaged. Thus, averaging 5 AVIRIS pixels increases the S/N to the level required for predicting lignin concentrations to the error level of 5% by dry weight. However, the errors in leaf chemistry predictions are affected by other factors, some of which are addressed in the remainder of this paper.
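The arithmetic in this paragraph can be reproduced directly. The sketch below uses only the values quoted in the text; the variable and function names are hypothetical, and the square-root rule assumes the noise in averaged pixels is uncorrelated.

```python
import math

lab_reflectance_pct = 40      # average dry-leaf reflectance in the laboratory
canopy_reflectance_pct = 5    # typical AVIRIS canopy reflectance near 2.1 and 2.3 µm
scale = lab_reflectance_pct / canopy_reflectance_pct   # factor of 8

# Laboratory S/N needed for 5% lignin, 5% cellulose and 0.5% nitrogen RMSE.
lab_snr = {"lignin": 100, "cellulose": 550, "nitrogen": 700}
canopy_snr = {name: snr * scale for name, snr in lab_snr.items()}

observed_snr = 360            # AVIRIS S/N reported at 2.100 µm

def pixels_needed(required_snr, single_pixel_snr):
    """Smallest number of averaged pixels whose S/N reaches the requirement,
    assuming S/N grows as the square root of the number of pixels."""
    return math.ceil((required_snr / single_pixel_snr) ** 2)

lignin_pixels = pixels_needed(canopy_snr["lignin"], observed_snr)
```

With these inputs, `lignin_pixels` evaluates to 5, in agreement with the text; the same function applied to the cellulose and nitrogen requirements gives much larger pixel counts.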

Figure 7. The effect of increasing signal-to-noise (S/N) on errors in predictions using the area normalized band-depth approach. Needed S/N to achieve required accuracy in leaf chemistry are shown plotted on the curves for nitrogen and lignin in Figures 7a and 7b, respectively. The variation in errors in cellulose prediction with S/N is shown in Figure 7c.


Required Sensor Bandpass

We simulated broader bandpasses by averaging a block of channels around the selected wavelengths. By averaging an increasing number of channels we simulated bandpasses up to 78 nm. Samples from all data sets were used in the regressions. The results of this analysis are presented in Table 7. For nitrogen, R2 decreased only slightly over the range of simulated bandpasses. At a simulated bandpass (BP) of 30 nm, the regression was still very good (R2 = 0.94, SEC = 0.17). Even at a simulated BP of 78 nm the correlation remained strong (R2 = 0.91, SEC = 0.21). Lignin and cellulose results were similar at the one channel (10 nm) and five channel averages (18 nm simulated BP). For lignin, using averages of more than five channels caused the correlations and errors to worsen. A similar trend is seen in the cellulose results. However, a 31 channel average (simulated BP of 70 nm) only slightly decreased the R2 from the single channel result (a change from 0.82 to 0.81). The error increased from 3.57% to 3.68% cellulose.
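Block averaging of this kind is simple to sketch. The width formula below is an assumption: with a 10 nm channel bandpass and 2 nm channel spacing it reproduces the 18 nm and 70 nm values quoted for 5 and 31 channels, but the text does not state how the simulated bandpass was computed. Function names are hypothetical.

```python
import numpy as np

def block_average(spectrum, center_idx, n_channels):
    """Mean of an odd number of adjacent channels centered on center_idx."""
    if n_channels % 2 == 0:
        raise ValueError("n_channels must be odd so the block is centered")
    half = n_channels // 2
    lo, hi = center_idx - half, center_idx + half + 1
    if lo < 0 or hi > len(spectrum):
        raise ValueError("block extends past the ends of the spectrum")
    return float(np.mean(spectrum[lo:hi]))

def simulated_bandpass_nm(n_channels, channel_bandpass_nm=10, sampling_nm=2):
    """Approximate width of the averaged block (assumed formula, see lead-in)."""
    return channel_bandpass_nm + (n_channels - 1) * sampling_nm

# Example: average 5 channels around channel 5 of a toy spectrum.
toy = np.arange(10.0)
averaged = block_average(toy, 5, 5)
```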

Instead of averaging channels, a more accurate approach would be to convolve the laboratory data to AVIRIS sampling and bandpass (which is 10 nm bandpass and 10 nm sampling). The results from block averaging demonstrate that lignin and cellulose regressions still perform very well at 18 nm bandpass. Nitrogen regressions are still high at 30 nm bandpass. Furthermore, regressions using a simulated bandpass of 70 nm have surprisingly strong correlations to cellulose and nitrogen. These results indicate the possibility of averaging two or more channels for computations with AVIRIS data, helping to offset the lower S/N of the imaging spectrometer relative to laboratory instruments.
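The convolution alternative mentioned here could look like the following sketch. It assumes a Gaussian spectral response whose full width at half maximum equals the sensor bandpass, which is a common approximation rather than anything specified in the text. The wavelength grids are illustrative and the function name is hypothetical.

```python
import numpy as np

def convolve_to_sensor(wl_lab, refl_lab, centers, fwhm):
    """Resample a laboratory spectrum to sensor channels using a normalized
    Gaussian response (assumed shape) of the given FWHM at each channel center."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    out = np.empty(len(centers))
    for i, center in enumerate(centers):
        weights = np.exp(-0.5 * ((wl_lab - center) / sigma) ** 2)
        out[i] = np.sum(weights * refl_lab) / np.sum(weights)
    return out

# Laboratory grid at 2 nm spacing; sensor channels every 10 nm with 10 nm FWHM.
wl_lab = np.linspace(1.9, 2.3, 201)          # µm
centers = np.linspace(2.0, 2.2, 21)          # µm
flat = convolve_to_sensor(wl_lab, np.full_like(wl_lab, 0.4), centers, 0.010)
ramp = convolve_to_sensor(wl_lab, wl_lab.copy(), centers, 0.010)
```

Unlike block averaging, the weights fall off smoothly from each channel center, so narrow features near the edge of a channel contribute less to the result.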

Leaf Water

The largest effect in reflectance spectra of dried and ground leaves compared to spectra of a vegetation canopy is due to leaf water. Leaves in a plant canopy can be composed of 40 to 80% water by weight (Elvidge 1990). Because water is highly absorbing in the near infrared, and because water comprises so much of the leaf, the spectral signatures of the other chemical components are, to a large degree, masked by the water. To test the sensitivity of our method to water, spectra of dry leaves with added liquid water were computed using the Hapke (1981) radiative transfer theory. We measured spectra of liquid H2O on a Nicolet Fourier transform spectrometer. Using these data and the known index of refraction of water (Irvine and Pollack 1968), we added water to dry leaf spectra. The absorption coefficients of the dry leaf component were derived as a function of wavelength by inverting the Hapke equations (e.g. Hapke 1981, Hapke 1983, and Clark and Roush 1984), using the index of refraction of water. Given the optical constants (index of refraction and absorption coefficients as a function of wavelength), reflectance spectra of dry leaf plus water were computed for 10%, 20%, 30%, 40%, 50%, 60%, 70%, and 80% water (% by weight) added to the dry leaf (Figure 8).

Figure 8. The spectrum of a dry leaf (top) and with increasing amounts of water added using the Hapke radiative transfer model. Top to bottom are: no added water, 10%, 20%, 30%, 40%, 50%, 60%, 70% and 80% added water. Note how the apparent band minimum at 1.73 µm shifts to longer wavelengths with increasing water, and how the structure in the 2.3-µm band becomes approximately a straight line with increasing water. The sequence shows the necessity of accurately removing the water signature to recover the spectral signatures of the other chemical components of the leaf.


As can be seen from the spectra in Figure 8, the masking of dry leaf spectral features increases with increasing water. The absorptions originally apparent in the dry leaf spectrum decrease in strength and change shape with water content. Also, the local slope of a spectrum changes as a function of water content. The continuum-removed spectral features of "wet" leaves show large changes in depth and shape for the 1.730 and the 2.100 µm features, going from well defined absorption shapes at low water content, to convex shapes at high water contents. The 2.300 µm feature is more stable in shape because the spectral response of liquid H2O is nearly linear over that absorption feature. For each of the broad absorption features, the same trend is present to varying degree: as the water content increases the depth of the absorption at the band center decreases.

In order to assess the effect of leaf water on calculations of leaf chemistry, we added the influence of water to six samples from the data set. These samples were selected to span a range of chemistry and include different species: red oak, white pine, hemlock, red maple, sugar maple, and slash pine. The average errors of chemistry calculated from the dry leaf plus water compared to the original dry leaf spectra are presented in Table 8. The largest impact is on lignin and cellulose estimates. The errors increase rapidly for additions of water greater than 20%. Errors in nitrogen estimates remain small for 10% and 20% added water but increase rapidly for 30% and greater added water. Thus, to apply the equations developed from dried leaves measured in the laboratory to remotely sensed canopy spectra, the influence of water must be spectrally/computationally removed to an accuracy of at least 10%.

The results show the long established fact that water has a dominant influence on the reflectance from green leaves and canopies. In fact, it was once proposed that reflectance in the 1.450-2.500 µm range might be completely explained by a thin sheet of water (Allen et al. 1969). Recently, however, Gao and Goetz (1994) have emphasized the effects of both leaf water and leaf biochemicals on near infrared reflectance. Prominent liquid water absorptions are centered at 0.760, 0.970, 1.190, 1.450 and 1.940 µm. Thus, the influence of water absorption is great throughout the near infrared wavelength region. These bands are quickly saturated, and increasing water content extends its influence to reflectance on the wings of these bands. As a result, all methods, including this method and log(1/R) derivatives, will be sensitive to variations in leaf reflectance that may be solely caused by changes in leaf water content. Indeed, at the canopy level, Matson et al. (1994) correlated AVIRIS data from 1.500-1.800 µm to nitrogen concentration estimates and found that the first wavelength selected, explaining 64% of the variance, was associated with absorption due to water rather than nitrogen. Any remote sensing algorithm must remove the influence of leaf water. We experimented with using radiative transfer unmixing of the water component. However, the water in the leaves is hydrogen bonded, causing the absorption bands to be shifted slightly to shorter wavelengths as compared to liquid H2O. We are working on deriving appropriate optical constants for water within leaves before attempting accurate unmixing with fresh leaf spectra.

Incomplete Vegetation Canopy

A common complication in remote sensing of vegetation is the partial exposure of the underlying soil surface. When vegetation cover is not 100%, other components, such as soil, water, or man-made objects, will contribute to the remotely sensed signal. If the sensor spectral coverage includes the visible portion of the spectrum, the chlorophyll absorption can be used to estimate the fractional coverage of green vegetation in the pixel. While traditional methods are susceptible to errors in chlorophyll estimation, our method, using continuum-removed band-depth ratios, is less sensitive to fractional canopy coverage as long as the background components do not have strong spectral signatures. Consider the soil spectrum presented in Figure 9a. The soil spectrum shows only weak absorption features due to water (at 1.4 and 1.9 µm) and the presence of the clay mineral kaolinite (at 2.2 µm). Combining the spectra of a dried and ground sugar maple leaf with soil in a 75% leaf to 25% soil mixture yields Figure 9b. The calculations of chemical concentration from the soil contaminated spectrum (2.01% nitrogen, 17.52% lignin, and 37.10% cellulose) are very close to the original dry leaf results (2.12% nitrogen, 18.28% lignin, and 34.13% cellulose). In this way, normalizing the band-depth alleviates the soil influences on the spectrum.

Figure 9. The effect of the reflectance of a spectrally bland soil background (Figure 9a) added to the reflectance of a dry leaf (solid line in Figure 9b). The spectrum of a linear mixture of 25% soil and 75% leaf is represented by the dashed line in Figure 9b.


To test the influence of soil contamination on a larger data set, the samples from Blackhawk Island were added linearly with the soil spectrum for increasing proportions of soil. The RMSE between the calculations from the soil contaminated spectra and the original dry leaf spectra are presented in Table 9. Nitrogen estimates are relatively insensitive to soil background until a soil cover of 40% and greater is reached. Errors in lignin estimates are relatively low until half the signal is comprised of the soil spectrum. Errors in cellulose estimates, on the other hand, are large and increase rapidly beyond a 20% soil cover.
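The reason normalization helps with a bland background can be seen in an idealized sketch. It assumes a perfectly flat soil spectrum and a leaf feature whose continuum is flat; in that special case the raw band depth shrinks with added soil while the area-normalized band-depth curve is unchanged. Real soils with their own features (such as the kaolinite band at 2.2 µm) break this exact cancellation. The spectra are synthetic and the names hypothetical.

```python
import numpy as np

def band_depth(wavelength, reflectance):
    """Continuum-removed band depth with a straight-line continuum (assumed)."""
    continuum = np.interp(wavelength, [wavelength[0], wavelength[-1]],
                          [reflectance[0], reflectance[-1]])
    return 1.0 - reflectance / continuum

def area_normalize(wavelength, depth):
    """Divide the band-depth curve by its trapezoid-rule area."""
    area = np.sum(0.5 * (depth[1:] + depth[:-1]) * np.diff(wavelength))
    return depth / area

def linear_mix(leaf, soil, soil_fraction):
    """Linear (areal) mixture of leaf and soil reflectance."""
    return (1.0 - soil_fraction) * leaf + soil_fraction * soil

wl = np.linspace(2.0, 2.2, 41)
leaf = 0.45 - 0.12 * np.exp(-((wl - 2.1) / 0.035) ** 2)   # synthetic dry leaf
soil = np.full_like(wl, 0.30)                              # idealized flat soil
mixed = linear_mix(leaf, soil, 0.25)

depth_leaf, depth_mixed = band_depth(wl, leaf), band_depth(wl, mixed)
shape_leaf = area_normalize(wl, depth_leaf)
shape_mixed = area_normalize(wl, depth_mixed)
```

Here `depth_mixed` is shallower than `depth_leaf`, yet `shape_mixed` and `shape_leaf` agree to rounding error, which is the behavior the normalized band-depth approach relies on.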

Atmospheric Effects (Path Radiance & Residual Atmosphere Absorptions)

Path radiance can be a significant part of canopy spectra. Clark et al. (1998) observed path radiance in the Blackhawk Island data set of about 2% in the 2.000-2.500 µm region, where the canopy reflectance was only ~5%. Thus, path radiance was 40% of the returned signal. Because path radiance is an additive component, this algorithm is independent of the path radiance component (similar to the spectrally bland soil background) and the analysis could be conducted with no path radiance correction.

Atmospheric influences on remotely sensed vegetation canopies include the incomplete removal of atmospheric absorptions. Residual atmosphere absorption features due to water vapor, carbon dioxide and other gases are commonly observed in AVIRIS data. Methods for estimating canopy chemistry should not be sensitive to these residuals. In our analysis we restricted the absorption features examined to avoid strong atmospheric absorptions (see Figure 2a). All analyses should similarly avoid these wavelength regions.

We tested our method for sensitivity to atmosphere residuals. We used MODTRAN (Berk et al., 1989) to calculate the transmittance spectrum of a 50 meter layer of atmosphere at an elevation of 8000 feet. Even a thin layer like this reduces transmittance to 0.87 at 1.84 µm. Each sample in the Blackhawk Island data set was multiplied by this residual atmosphere. Calculations were made using these "contaminated" spectra and compared to results from the original dry leaf spectra. The RMSEs were 0.14%, 0.72%, and 1.03% for nitrogen, lignin, and cellulose, respectively.
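A toy version of this test shows why restricting the analysis to windows free of strong atmospheric lines matters. In the sketch below, the transmittance spectra are invented for illustration (they are not MODTRAN output), and the 0.87 value is borrowed from the text only as a convenient magnitude. A wavelength-independent transmittance cancels in continuum removal, whereas a narrow residual line inside the feature does not.

```python
import numpy as np

def band_depth(wavelength, reflectance):
    """Continuum-removed band depth with a straight-line continuum (assumed)."""
    continuum = np.interp(wavelength, [wavelength[0], wavelength[-1]],
                          [reflectance[0], reflectance[-1]])
    return 1.0 - reflectance / continuum

def rmse(a, b):
    """Root mean square difference between two curves."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

wl = np.linspace(2.0, 2.2, 41)
leaf = 0.45 - 0.12 * np.exp(-((wl - 2.1) / 0.035) ** 2)    # synthetic dry leaf

# Hypothetical residual transmittances: one flat, one with a narrow gas line.
uniform_t = np.full_like(wl, 0.87)
line_t = 1.0 - 0.13 * np.exp(-((wl - 2.06) / 0.005) ** 2)

depth_clean = band_depth(wl, leaf)
err_uniform = rmse(depth_clean, band_depth(wl, leaf * uniform_t))
err_line = rmse(depth_clean, band_depth(wl, leaf * line_t))
```

In this example `err_uniform` is zero to rounding error, while `err_line` is not, so residual lines falling inside an absorption feature propagate into the band-depths and from there into the chemistry estimates.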

Combined Effects (Atmosphere, Soil, Leaf Water)

In a final test for the sensitivity of the area normalized band-depth method to the influences encountered at the remote sensing level, we contaminated dry leaf spectra with 10% leaf water, added a soil background of 25%, and multiplied by the 50 meter atmosphere residual. A comparison, for one sample, of the original dry leaf spectrum to the "contaminated" spectrum is shown in Figure 1. As presented, our approach (Figure 1b) is less sensitive to these influences than derivative methods (Figures 1c-d). We added the soil, water, and atmosphere contaminations to the spectra for the six samples mentioned previously (see the section on leaf water). Chemical concentrations were calculated from these "contaminated" spectra and compared to calculations from the original unaltered spectra. The average errors for estimates are shown in Table 10. In addition to the normalized band-depth method, we also calculated errors from 1st and 2nd derivative computations made according to Bolster et al. (1996). The advantage of continuum removal and band normalization in reducing the impact of extraneous influences was shown in Figure 1, and is also reflected by the lower errors for this method as shown in Table 10. Nitrogen errors are doubled when using the 2nd derivative methods. Errors for lignin estimates are extremely high for the derivative approaches. For the normalized band-depth approach, the errors for estimates of the biochemical concentrations in the "contaminated" spectra are at a reasonably low level for these realistic non-biochemical influences on canopy spectra.
 

CONCLUSIONS

Remote sensing algorithms are needed to make measurements of canopy chemistry for large scale monitoring of ecosystem functioning. This paper developed a new method for estimating nitrogen, lignin and cellulose concentrations in foliage using spectroscopy. Our approach used normalized band-depths, calculated from continuum-removed reflectance spectra, coupled with stepwise multiple linear regression. Laboratory spectra of dry, ground leaves were used to establish the baseline validity of this empirical method. The method was designed with an awareness of the influences that will be encountered in remote sensing applications. A set of wavelengths highly correlated with leaf chemistry was determined. Linear regressions using these wavelengths, applied independently to the data from each of seven sites, accurately estimated chemical concentrations. Furthermore, regression equations developed from a calibration subset of data were used to predict the chemical concentrations of the remaining validation samples. Lignin and cellulose predictions were accurate for new forest samples but less accurate for non-forest data. Nitrogen predictions were successful using all samples and area normalized band-depths (R2 from 0.85 to 0.94 and RMSE from 0.12 to 0.23% (absolute) nitrogen). The method was consistent across independent data sets and a wide diversity of species.

The results of this study suggest the possibility of developing generally applicable equations to simply and rapidly estimate chemical concentrations in dried leaves from their reflectance spectra. These laboratory results are a necessary first step in establishing the validity of this empirical approach before analyzing remote sensing data. However, additional complexities were considered for the development of a remote sensing algorithm. We demonstrated that the normalized band depth approach was insensitive to common influences of soil background, atmosphere absorptions, and changing leaf water content. Of those effects analyzed in this paper, the foremost is the effect of leaf water. In order for this method to work for fresh whole leaves or complete vegetation canopies, the influence of leaf water on spectral reflectance must be removed to within 10%. This presents a challenging problem and an area for future research before important canopy chemistry information may be acquired on a large scale for the study of ecosystems.
 

ACKNOWLEDGMENTS

This research was supported by the NASA Accelerated Canopy Chemistry Program through Interagency Agreement W-18,644.
 

REFERENCES

ACCP (1994), Accelerated Canopy Chemistry Program Final Report to NASA-EOS-IWG, National Aeronautics and Space Administration, Washington D.C., 19 October 1994.

Aber, J.D., and Federer, C.A. (1992), A generalized, lumped-parameter model of photosynthesis, evapotranspiration and net primary production in temperate and boreal forest ecosystems, Oecologia 92: 463-474.

Aber, J.D., Bolster, K.L., Newman, S.D., Soulia, M., and Martin, M.E. (1994), Analyses of forest foliage II: Measurement of carbon fraction and nitrogen content by end-member analysis, Journal of Near Infrared Spectroscopy 2: 15-23.

Afifi, A.A. and Azen, S.P. (1971), Statistical Analysis, A Computer Aided Approach, Academic Press, London.

Allen, W.A, Gausman, H.W., Richardson, A.J., and Thomas, J.R. (1969), Interaction of isotropic light with a compact leaf, Journal of the Optical Society of America 59: 1376-1379.

Berk, A., Bernstein, L.S., and Robertson, D.C. (1989), MODTRAN: A moderate resolution model for LOWTRAN 7, Final Report, GL-TR-0122, AFGL, Hanscom AFB, MA, 42 pp.

Bolster, K.L., Martin, M.E., and Aber, J.D. (1996), Determination of carbon fraction and nitrogen concentration in tree foliage by near infrared reflectance: a comparison of statistical methods, Canadian Journal of Forest Research 26: 590-600.

Borel, C.C. and Gerstl, S.A.W. (1994), Nonlinear spectral mixing models for vegetative and soil surfaces, Remote Sensing of Environment 47: 403-416.

Card, D.H., Peterson, D.L., Matson, P.A., and Aber, J.D. (1988), Prediction of leaf chemistry by the use of visible and near infrared reflectance spectroscopy, Remote Sensing of Environment 26: 123-147.

Clark, R.N. (1981), Water frost and ice: the near-infrared spectral reflectance 0.65-2.5 µm, Journal of Geophysical Research 86: 3087-3096.

Clark, R.N. and Roush, T.L. (1984), Reflectance spectroscopy: Quantitative analysis techniques for remote sensing applications, Journal of Geophysical Research 89: 6329-6340.

Clark, R.N., Swayze, G.A., Heidebrecht, K., Green, R.O., and Goetz, A.F.H. (1998), Calibration to surface reflectance of terrestrial imaging spectrometry data: Comparison of methods, To be submitted to Applied Optics.

Curran, P.J. (1989), Remote sensing of foliar chemistry, Remote Sensing of Environment 30: 271-278.

Dixit, L., and Ram, S. (1985), Quantitative analysis by derivative electronic spectroscopy, Applied Spectroscopy Reviews 21: 311-418.

Elvidge, C.D. (1990), Visible and near infrared reflectance characteristics of dry plant materials, International Journal of Remote Sensing 11: 1775-1795.

Fourty, Th., Baret, F., Jacquemoud, S., Schmuck, G. and Verdebout, J. (1994), Leaf optical properties with explicit description of its biochemical composition: Direct and inverse problems, Remote Sensing of Environment 56: 104-117.

Ganapol, B.D., Johnson, L.F., Hammer, P.D, Hlavka, C.A., and Peterson, D.L. (1998), LEAFMOD: A new within leaf radiative transfer model, Remote Sensing of Environment 63: 182-193.

Gao, B. and Goetz, A.F.H. (1994), Extraction of dry leaf spectral features from reflectance spectra of green vegetation, Remote Sensing of Environment 47: 369-374.

Grossman, Y.L., Ustin, S.L., Jacquemoud, S., Sanderson, E.W., Schmuck, G., and Verdebout, J. (1996), Critique of stepwise multiple linear regression for the extraction of leaf biochemistry information from leaf reflectance data, Remote Sensing of Environment 56: 1-12.

Hapke, B. (1981), Bidirectional reflectance spectroscopy, 1: Theory, Journal of Geophysical Research 86: 3039-3054.

Irvine, W.M., and Pollack, J. B. (1968), Infrared optical properties of water and ice spheres, Icarus 8: 324-360.

Johnson, L.F., Hlavka, C.A., and Peterson, D.L. (1994), Multivariate analysis of AVIRIS data for canopy biochemical estimation along the Oregon transect, Remote Sensing of Environment 47: 216-230.

Johnson, L.F. and Billow, C.R. (1996), Spectrometric estimation of total nitrogen concentration in Douglas-fir foliage, International Journal of Remote Sensing 17: 489-500.

Kupiec, J.A., and Curran, P.J. (1995), Decoupling the effects of the canopy and foliar biochemicals in AVIRIS spectra, International Journal of Remote Sensing 16: 1731-1739.

LaCapra, V.C., Melack, J.M., Gastil, M., and Valeriano, D. (1996), Remote sensing of inundated rice with imaging spectrometry, Remote Sensing of Environment 55: 50-58.

Marten, G.C., Shenk, J.S., and Barton, F.E. II. Eds. (1989), Near-Infrared Reflectance Spectroscopy (NIRS): Analysis of Forage Quality, U.S. Dept. of Agric. Handbook 643, USDA, Washington D.C., pp. 1-96.

Martin, M.E. and Aber, J.D. (1997), High spectral resolution remote sensing of forest canopy lignin, nitrogen and ecosystem process, Ecological Applications 7: 431-443.

Matson, P., Johnson, L., Billow, C., Miller, J., and Pu, R. (1994), Seasonal patterns and remote spectral estimation of canopy chemistry across the Oregon transect, Ecological Applications 4: 280-298.

Mooney, H., Vitousek, P., and Matson, P. (1987), Exchange of materials between terrestrial ecosystems and the atmosphere, Science 238: 926-932.

Newman, S.D., Soulia, M.E., Aber, J.D., Dewey, B., and Ricca, A. (1994), Analyses of forest foliage I: Laboratory procedures for proximate carbon fractionation and nitrogen determination, Journal of Near Infrared Spectroscopy 2: 5-14.

Norris, K.H., Barnes, R.F., Moore, J.E., and Shenk, J.S. (1976), Predicting forage quality by infrared reflectance spectroscopy, Journal of Animal Science 43: 889-897.

O'Haver, T.C., and Begley, T. (1981), Signal-to-noise ratio in higher order derivative spectrometry, Analytical Chemistry 53: 1876-1878.

Peterson, D.L., and Hubbard, G.S. (1992), Scientific issues and potential remote sensing requirements for plant biogeochemical content, Journal of Imaging Science and Technology 36: 445-455.

Schimel, D. (1995), Terrestrial biogeochemical cycles: Global estimates with remote sensing, Remote Sensing of Environment 51: 49-56.

Stuedler, P., Bowden, R., Melillo, J.M., and Aber, J.D. (1989), Influence of nitrogen fertilization on methane uptake in temperate forest soils, Nature 341: 314-316.

Sunshine, J.M., Pieters, C.M., and Pratt, S.F. (1990), Deconvolution of mineral absorption bands: An improved approach, Journal of Geophysical Research 95: 6955-6966.

Sunshine, J.M. and Pieters, C.M. (1993), Estimating modal abundances from the spectra of natural and laboratory pyroxene mixtures using the modified Gaussian model, Journal of Geophysical Research 98: 9075-9087.

Wessman, C.A., Aber, J.D., and Peterson, D.L. (1989), An evaluation of imaging spectrometry for estimating forest canopy chemistry, International Journal of Remote Sensing 10: 1293-1316.

Wofsy, S.C., Goulden, M.L., Munger, J.W., Fan, S.M., Bakwin, P., Daube, B., Bassow, S., and Bazzaz, F.A. (1993), Net exchange of CO2 in a mid-latitude forest, Science 260: 1314-1317.
 
 
 
 
 
 

Table 1. Chemistry and sample size of data sets.
 
Chemistry by Site     # of       Mean of          Std. Dev. of     Minimum          Maximum
                      Samples    Concentration    Concentration    Concentration    Concentration

Nitrogen
  Blackhawk Island    184         2.38             0.54             1.09             3.51
  Harvard Forest      193         1.86             0.54             0.93             3.19
  Howland, Maine      189         1.34             0.44             0.69             2.67
  Slash pine           78         1.00             0.29             0.62             1.67
  Rice                 69         0.91             0.18             0.54             1.29
  Douglas-fir*         96         1.85             0.64             0.68             3.35
  LTER                 31         0.88             0.58             0.22             2.40
  All combined        840         1.67             0.71             0.22             3.51

Lignin
  Blackhawk Island    184        23.24             4.37            12.42            31.00
  Harvard Forest      193        21.64             4.66            13.75            33.70
  Howland, Maine      189        22.72             4.55            13.83            32.20
  Slash pine           78        22.12             1.68            18.84            25.54
  Rice                 69        14.70             1.05            11.62            16.65
  Douglas-fir*          -          -                -                -                -
  LTER                 31        23.17             8.44             7.60            44.62
  All combined        744        21.78             4.93             7.60            44.62

Cellulose
  Blackhawk Island    184        42.33             4.07            33.22            54.41
  Harvard Forest      193        37.19             6.88            23.69            67.57
  Howland, Maine      189        34.45             5.17            24.20            48.12
  Slash pine           78        36.03             2.36            31.94            42.42
  Rice                 69        56.19             3.05            50.83            62.69
  Douglas-fir*          -          -                -                -                -
  LTER                 31        46.28            13.92            24.56            74.73
  All combined        744        39.79             8.43            23.69            74.73
* indicates lignin and cellulose concentrations were not measured for this data set
 
 
 
 

Table 2. Continuum end point and band center definitions for the large absorption features observed in dried, ground leaves.
 
Absorption      Continuum Line    Band Center    Continuum Line
Feature (µm)    Start (µm)        (µm)           End (µm)

1.730           1.652             1.728          1.778
2.100           2.030             2.106          2.218
2.300           2.238             2.304          2.366
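
The continuum end points in Table 2 can be turned into normalized band-depths along the following lines. This is a minimal sketch, not the processing code used for the paper. It assumes a straight-line continuum between the reflectance values at the two end points, the band-depth definition D = 1 - R/Rc of Clark and Roush (1984), and, for the area normalization, a trapezoidal estimate of the feature area. The function names and the synthetic spectrum are hypothetical and serve only to illustrate the steps.

```python
# Continuum end points and band centers from Table 2, in µm:
# feature -> (continuum line start, band center, continuum line end)
FEATURES = {
    1.730: (1.652, 1.728, 1.778),
    2.100: (2.030, 2.106, 2.218),
    2.300: (2.238, 2.304, 2.366),
}

EPS = 1e-9  # tolerance for floating-point wavelength comparisons


def band_depths(wavelengths, reflectance, start, end):
    """Return [(wavelength, depth)] for the channels between start and end.

    Assumption: the continuum is a straight line through the reflectance at
    the first and last channel of the window, and depth = 1 - R / Rc.
    """
    window = [(w, r) for w, r in zip(wavelengths, reflectance)
              if start - EPS <= w <= end + EPS]
    (w0, r0), (w1, r1) = window[0], window[-1]
    depths = []
    for w, r in window:
        continuum = r0 + (r1 - r0) * (w - w0) / (w1 - w0)
        depths.append((w, 1.0 - r / continuum))
    return depths


def normalize_to_center(depths, center):
    """Divide each depth by the depth at the channel nearest the band center."""
    _, d_center = min(depths, key=lambda p: abs(p[0] - center))
    return [(w, d / d_center) for w, d in depths]


def normalize_to_area(depths):
    """Divide each depth by the feature area (trapezoidal rule, an assumption)."""
    area = sum(0.5 * (d0 + d1) * (w1 - w0)
               for (w0, d0), (w1, d1) in zip(depths, depths[1:]))
    return [(w, d / area) for w, d in depths]


# Synthetic 2.30-µm feature on a flat continuum (illustrative values only).
start, center, end = FEATURES[2.300]
wavelengths = [round(start + 0.002 * i, 3) for i in range(65)]
reflectance = [0.5 * (1.0 - 0.2 * max(0.0, 1.0 - abs(w - center) / 0.062))
               for w in wavelengths]

depths = band_depths(wavelengths, reflectance, start, end)
center_norm = normalize_to_center(depths, center)
area_norm = normalize_to_area(depths)
```

Either normalization removes the overall strength of the feature and keeps only its shape, which is the quantity the regressions in the following tables operate on.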

 
 
 

Table 3. Results of using stepwise multiple linear regression to estimate leaf chemical concentrations from band-depths, normalized to the band center, calculated from near infrared spectra of dried and ground leaves.
 
Site                  Nitrogen          Lignin            Cellulose
                      R2      SEC       R2      SEC       R2      SEC

Blackhawk Island      0.94    0.14      0.83    1.85      0.75    2.10
Harvard Forest        0.94    0.14      0.81    2.04      0.85    2.74
Howland, Maine        0.95    0.11      0.77    2.21      0.79    2.40
Slash pine            0.95    0.07      0.72    0.93      0.81    1.10
Rice                  0.90    0.06      0.32    0.91      0.81    1.41
Douglas-fir*          0.93    0.17      -       -         -       -
LTER                  0.97    0.12      0.65    5.58      0.93    4.33
* indicates lignin and cellulose concentrations were not measured for this data set
 
 
 
 

Table 4. Results of stepwise linear regression equation applied to band-depths, normalized by band center, calibrated with two-thirds of eastern U.S. forest data and used to predict remaining data.
 
Chemistry by Site               # of Samples    R2       RMSE     Bias (calc-meas)

Calibration (Eastern U.S. Forests - 2/3)
  Nitrogen                      377             0.94      0.17
  Lignin                        377             0.65      2.73
  Cellulose                     377             0.78      2.98

Validation

Nitrogen
  Eastern U.S. Forests (1/3)    189             0.94      0.17      0.00
  Slash pine                     78             0.86      0.30      0.22
  Rice                           69             0.83      0.13     -0.08
  Douglas-fir*                   96             0.86      0.35      0.23
  LTER                           31             0.93      0.23     -0.05

Lignin
  Eastern U.S. Forests (1/3)    189             0.72      2.52     -0.23
  Slash pine                     78             0.43      3.53      0.54
  Rice                           69             0.01      9.38     -8.69
  Douglas-fir*                    -             -         -         -
  LTER                           31             0.38      7.90     -1.71

Cellulose
  Eastern U.S. Forests (1/3)    189             0.75      3.41     -0.17
  Slash pine                     78             0.35      2.60     -1.47
  Rice                           69             0.00     13.31    -12.01
  Douglas-fir*                    -             -         -         -
  LTER                           31             0.28     16.24     -7.12
* indicates lignin and cellulose concentrations were not measured for this data set
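
The three validation statistics reported above can be computed from paired measured and calculated concentrations as in the sketch below. The bias follows the column header (calculated minus measured) and the RMSE is the usual root mean square of those differences. Treating R2 as the squared Pearson correlation between measured and calculated values is an assumption; the tables do not state which definition was used. The function name and the sample values are hypothetical.

```python
import math


def validation_stats(measured, calculated):
    """Return (r2, rmse, bias) for paired concentration values.

    bias = mean(calculated - measured), as in the table header.
    r2 is the squared Pearson correlation (an assumed definition).
    """
    n = len(measured)
    residuals = [c - m for m, c in zip(measured, calculated)]
    bias = sum(residuals) / n
    rmse = math.sqrt(sum(e * e for e in residuals) / n)

    mean_m = sum(measured) / n
    mean_c = sum(calculated) / n
    cov = sum((m - mean_m) * (c - mean_c) for m, c in zip(measured, calculated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    r2 = cov * cov / (var_m * var_c)
    return r2, rmse, bias


# Illustrative values only: every estimate is 0.5 too high.
r2, rmse, bias = validation_stats([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
```

With a constant offset such as this one, the correlation stays perfect while RMSE and bias do not, which is why the tables report all three quantities rather than R2 alone.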
 
 
 
 

Table 5. Results of stepwise linear regression equation applied to band-depths, normalized by band area, calibrated with two-thirds of eastern U.S. forest data and used to predict remaining data.
 
Chemistry by Site               # of Samples    R2       RMSE     Bias (calc-meas)

Calibration (Eastern U.S. Forests - 2/3)
  Nitrogen                      377             0.94      0.17
  Lignin                        377             0.64      2.75
  Cellulose                     377             0.79      2.92

Validation

Nitrogen
  Eastern U.S. Forests (1/3)    189             0.93      0.18      0.00
  Slash pine                     78             0.93      0.14      0.02
  Rice                           69             0.85      0.12      0.09
  Douglas-fir*                   96             0.94      0.17      0.02
  LTER                           31             0.92      0.23     -0.11

Lignin
  Eastern U.S. Forests (1/3)    189             0.71      2.57     -0.27
  Slash pine                     78             0.49      2.79     -0.12
  Rice                           69             0.02      7.49     -6.94
  Douglas-fir*                    -             -         -         -
  LTER                           31             0.52      6.88     -1.86

Cellulose
  Eastern U.S. Forests (1/3)    189             0.77      3.29     -0.22
  Slash pine                     78             0.03      3.34     -2.03
  Rice                           69             0.30     17.31    -15.93
  Douglas-fir*                    -             -         -         -
  LTER                           31             0.14     18.97     -8.51
* indicates lignin and cellulose concentrations were not measured for this data set
 
 
 
 

Table 6. Regression equations developed from all data - Normalized to band area.
 
Biochemical    Term    Wavelength of    Coefficient
Estimated              Term (µm)        Value

Nitrogen       a0      Constant          15.1911
               a1      2.036             -5.5917
               a2      2.050              3.3523
               a3      2.078             -2.1982
               a4      2.152             -0.7080
               a5      2.180             -0.3221

Lignin         a0      Constant          18.1348
               a1      1.666              1.0782
               a2      1.762              2.9033
               a3      2.246              4.8285*
               a4      2.266              1.8642
               a5      2.324             -5.7570
               a6      2.346              0.5254

Cellulose      a0      Constant          -8.6890
               a1      1.660             -4.9517
               a2      1.766             -2.7841
               a3      2.066              8.9206
               a4      2.186             26.8997
               a5      2.202            -31.2168
               a6      2.266              2.7751
               a7      2.288             -6.6202
               a8      2.322              1.9227
* The correct sign for this quantity is positive. The published paper has an incorrect negative sign for this quantity (see Kokaly, R.F. and Clark, R.N., Spectroscopic Determination of Leaf Biochemistry Using Band-Depth Analysis of Absorption Features and Stepwise Linear Regression, Remote Sensing of Environment, Vol. 67, pp. 267-287, 1999).
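
Applying one of the Table 6 equations amounts to evaluating a0 plus the sum of each coefficient times the area-normalized band-depth at its wavelength. The sketch below assumes that linear form and takes the normalized band-depths as a mapping from wavelength (µm) to value. The names `TABLE_6` and `estimate_concentration` are hypothetical, and the scaling of the band-depth inputs must match whatever was used when the coefficients were derived, which is not restated in this table.

```python
# Coefficients copied from Table 6: chemical -> (constant a0, [(wavelength µm, coefficient)]).
# The lignin a3 coefficient uses the corrected positive sign noted in the footnote.
TABLE_6 = {
    "nitrogen": (15.1911, [
        (2.036, -5.5917), (2.050, 3.3523), (2.078, -2.1982),
        (2.152, -0.7080), (2.180, -0.3221),
    ]),
    "lignin": (18.1348, [
        (1.666, 1.0782), (1.762, 2.9033), (2.246, 4.8285),
        (2.266, 1.8642), (2.324, -5.7570), (2.346, 0.5254),
    ]),
    "cellulose": (-8.6890, [
        (1.660, -4.9517), (1.766, -2.7841), (2.066, 8.9206),
        (2.186, 26.8997), (2.202, -31.2168), (2.266, 2.7751),
        (2.288, -6.6202), (2.322, 1.9227),
    ]),
}


def estimate_concentration(chemical, normalized_depths):
    """Evaluate a0 + sum(ai * Dn(wavelength_i)) for one chemical.

    normalized_depths maps wavelength in µm to the area-normalized band-depth
    at that wavelength; every wavelength used by the equation must be present.
    """
    constant, terms = TABLE_6[chemical]
    return constant + sum(coef * normalized_depths[wl] for wl, coef in terms)


# Placeholder input (not real data): a unit band-depth at 2.050 µm, zero elsewhere.
example_depths = {wl: 0.0 for wl, _ in TABLE_6["nitrogen"][1]}
example_depths[2.050] = 1.0
example_value = estimate_concentration("nitrogen", example_depths)
```

The placeholder input only exercises the arithmetic (constant plus one coefficient); it does not represent a plausible leaf spectrum.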
 
 
 

Table 7. Normalized to band area - The impact of simulating wider bandpass on stepwise multiple linear regression to estimate leaf concentrations for all data.
 
Wavelength Range      Simulated    Nitrogen         Lignin           Cellulose
(# of averaged        Bandpass     R2      SEC      R2      SEC      R2      SEC
channels)             (nm)

 1                    10           0.95    0.16     0.66    2.90     0.82    3.57
 5                    18           0.95    0.16     0.66    2.90     0.81    3.68
11                    30           0.94    0.17     0.59    3.16     0.66    4.93
15                    38           0.93    0.19     0.46    3.64     0.66    4.95
21                    50           0.94    0.18     0.34    4.02     0.65    5.01
25                    58           0.93    0.19     0.44    3.70     0.73    4.40
31                    70           0.92    0.20     0.44    3.69     0.81    3.68
35                    78           0.91    0.21     0.35    4.00     0.80    3.79
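
The first two columns of Table 7 follow a simple pattern: each simulated bandpass equals 10 nm plus 2 nm for every channel beyond the first, which is what a 10 nm native bandpass sampled every 2 nm would give. The sketch below encodes that inferred relationship and shows one plausible way to simulate the wider bandpass, a running (boxcar) mean over an odd number of adjacent channels. Both the native values and the choice of a running mean are assumptions drawn from the table, not a description of the authors' procedure, and the function names are hypothetical.

```python
def simulated_bandpass_nm(n_channels, native_bandpass_nm=10, spacing_nm=2):
    """Bandpass implied by averaging n_channels adjacent channels.

    The defaults (10 nm native bandpass, 2 nm channel spacing) are inferred
    from the first two columns of Table 7.
    """
    return native_bandpass_nm + spacing_nm * (n_channels - 1)


def boxcar_average(values, n_channels):
    """Running mean over n_channels (odd); edge channels without a full window are dropped."""
    if n_channels < 1 or n_channels % 2 == 0:
        raise ValueError("n_channels must be a positive odd integer")
    half = n_channels // 2
    return [sum(values[i - half:i + half + 1]) / n_channels
            for i in range(half, len(values) - half)]


# Reproduce the bandpass column of Table 7 from the channel counts.
channel_counts = [1, 5, 11, 15, 21, 25, 31, 35]
bandpasses = [simulated_bandpass_nm(n) for n in channel_counts]

# Smooth a short illustrative reflectance series over three channels.
smoothed = boxcar_average([1.0, 2.0, 3.0, 4.0, 5.0], 3)
```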

 
 
 

Table 8. Normalized to band area: The effect of leaf water content on calculated chemical concentrations (RMSE from 6 samples).
 
Water (%)    RMSE in         RMSE in       RMSE in
             Nitrogen (%)    Lignin (%)    Cellulose (%)

 0           0.00             0.00           0.00
10           0.19             1.82           1.98
20           0.22             5.75           7.17
30           0.36            34.18          49.07
40           1.67            99.13         111.38
50           2.82            18.85          76.17
60           2.34            38.79         442.21
70           1.77            42.34         200.66
80           1.26            60.70          40.90

 
 
 

Table 9. Normalized to band area: The effect of soil background on calculated chemical concentrations (Blackhawk Island).
 
Areal coverage    RMSE in         RMSE in       RMSE in
of soil (%)       Nitrogen (%)    Lignin (%)    Cellulose (%)

 0                0.00            0.00           0.00
10                0.04            0.34           1.05
20                0.10            0.70           2.35
25                0.13            0.89           3.12
30                0.16            1.09           3.99
40                0.25            1.53           6.16
50                0.37            2.04           9.18
60                0.55            2.69          13.70
70                0.85            3.60          21.32
75                1.09            4.24          27.50
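
The soil-background simulation can be illustrated with a linear (areal) mixture of a leaf spectrum and a soil spectrum, the form described in the Figure 9 caption for a 25% soil and 75% leaf mixture. The sketch below assumes that linear form at every coverage level. The function name and the reflectance values are hypothetical.

```python
def mix_with_soil(leaf_reflectance, soil_reflectance, soil_fraction):
    """Linear areal mixture: f * soil + (1 - f) * leaf at each channel."""
    if not 0.0 <= soil_fraction <= 1.0:
        raise ValueError("soil_fraction must lie between 0 and 1")
    if len(leaf_reflectance) != len(soil_reflectance):
        raise ValueError("spectra must have the same number of channels")
    return [soil_fraction * s + (1.0 - soil_fraction) * l
            for l, s in zip(leaf_reflectance, soil_reflectance)]


# Illustrative three-channel spectra: a leaf absorption dip over a flat soil.
leaf = [0.40, 0.30, 0.40]
soil = [0.20, 0.20, 0.20]
mixed = mix_with_soil(leaf, soil, 0.25)  # 25% soil, 75% leaf
```

In this toy case the mixed spectrum keeps the dip at the middle channel but with a smaller contrast against its neighbours, which is the kind of change the errors in Table 9 reflect as soil coverage grows.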

 
 
 

Table 10. Normalized to band area: The effect of 10% leaf water content, 25% soil background, and 50 meter atmosphere residual on calculated chemical concentrations from Normalized Band-Depth, 1st Derivative, and 2nd Derivative methods (RMSE from 6 samples).
 
Chemical     RMSE for Normalized    RMSE for 1st         RMSE for 2nd
Estimated    Band-Depth Method      Derivative Method    Derivative Method

Nitrogen      0.21                   0.31                 0.39
Lignin        1.36                  12.63                12.98
Cellulose     3.44                  27.76                 2.87

 
 
 
 
 

Figure Captions

Figure 1. Spectral curves showing the dry leaf spectrum for sugar maple (solid line) and the effects of three contaminants of remote sensing data: an added 10% leaf water content, a 25% soil background, and absorptions due to a 50 meter layer of atmosphere (dashed line). Spectral curves are for 1a) reflectance, 1b) continuum removed and normalized band-depths for three absorption features, 1c) the 1st derivative of log (1/reflectance), and 1d) the 2nd derivative of log (1/reflectance).

Figure 2. Continuum analysis is demonstrated for a white pine sample. Figure 2a shows the continua used to isolate each major absorption feature in dry leaf reflectance spectra (1.73 µm, 2.10 µm, and 2.30 µm). Figure 2b shows the result of continuum removal for the three features. The continuum end points are defined in Table 2.

Figure 3. Example shape changes in the normalized band-depth profiles for the 2.30-µm absorption feature in dry leaf data are shown for two samples, white pine (solid line) and red maple (dashed line). The original reflectance spectra have been continuum removed and the band depths normalized to the band center at 2.304 µm. This process clearly shows the different absorption shapes in these two samples. The shape changes are due to varying amounts of absorbers in the dry leaf, including nitrogen contained in proteins, and structural biochemicals such as lignin and cellulose.

Figure 4. Wavelengths correlated with dry leaf chemistry as selected using stepwise linear regression on normalized band-depths for samples from the Blackhawk Island and Harvard Forest data sets. The position of correlated wavelengths and the absorption band centers are shown for nitrogen, lignin and cellulose in Figures 4a, 4b, and 4c, respectively.

Figure 5. Prediction of nitrogen concentrations in the validation data set using the regression equation derived from area normalized band-depths of the calibration data set (two-thirds of the eastern U.S. forest samples). The data from different sites are represented by the various symbols: triangle = LTER, asterisk = Rice, cross = slash pine, square = one-third eastern U.S. forests validation set, diamond = Douglas-fir.

Figure 6. Results of regressions of band-depths normalized to band area with leaf chemistry using all data sets. Regressions for nitrogen, lignin and cellulose are shown in Figures 6a, 6b, and 6c, respectively. The data from different sites are represented by the various symbols: circle = Blackhawk Island, filled circle = Harvard Forest, triangle = LTER, asterisk = Rice, cross = slash pine, x = Howland, Maine, diamond = Douglas-fir (nitrogen only). Regression equations are given in Table 6.

Figure 7. The effect of increasing signal-to-noise (S/N) on errors in leaf chemistry predictions using the area normalized band-depth approach. The S/N needed to achieve the required accuracy in leaf chemistry is plotted on the curves for nitrogen and lignin in Figures 7a and 7b, respectively. The variation in errors in cellulose prediction with S/N is shown in Figure 7c.

Figure 8. The spectrum of a dry leaf (top) and spectra with increasing amounts of water added using the Hapke radiative transfer model. Top to bottom are: no added water, 10%, 20%, 30%, 40%, 50%, 60%, 70% and 80% added water. Note how the apparent band minimum at 1.73 µm shifts to longer wavelengths with increasing water, and how the structure in the 2.3-µm band becomes approximately a straight line with increasing water. The sequence shows the necessity of accurately removing the water signature to recover the spectral signatures of the other chemical components of the leaf.

Figure 9. The effect of the reflectance of a spectrally bland soil background (Figure 9a) added to the reflectance of a dry leaf (solid line in Figure 9b). The spectrum of a linear mixture of 25% soil and 75% leaf is represented by the dashed line in Figure 9b.


U.S. Geological Survey, a bureau of the U.S. Department of the Interior
This page URL= http://speclab.cr.usgs.gov/PAPERS/chanchem99/canchem99.html
This page is maintained by: Raymond F. Kokaly raymond@speclab.cr.usgs.gov and
Roger N. Clark rclark@speclab.cr.usgs.gov
Last modified December 13, 1999.