Mixture Modeling

Mixed pixels

In any given image, the smallest area that can be seen is a pixel (short for "picture element"). The size of a pixel varies depending on the sensor used, but in Landsat Thematic Mapper images a pixel represents a square on the ground about 30 meters on a side. For the Coastal Marsh Project (CMP) these have been resampled to 28.5 meters. An area this size on the ground can contain all or part of many different objects: for example, steel buildings, concrete roads, dirt roads, bridges, ponds, trees, sand, mud and grasslands. Despite the variety of objects possible in a pixel, the satellite sensor records a single value per band for that pixel, and the light detected by the satellite is the total of all the light reflected from each object in the pixel.

Often, it would be useful to know what percent of a pixel is composed of open water, grasslands and bare soil, or, in some instances, to separate different land uses, for example, row crops, orchards and buildings. This is the point of spectral unmixing - to find these hidden percentages.

The Coastal Marsh Project is most interested in water, vegetation and soil, so that example is followed for the rest of this discussion. Before any calculations can be made it is necessary to look at what input information is available. Generally, the reflectances of water, vegetation and soil are known in several spectral bands. The Thematic Mapper records seven bands; the CMP uses three of them to record how bright each pixel is (its reflectance) in the red, near infrared and middle infrared. These are, respectively, 0.63 - 0.69 micrometers (band 3), 0.76 - 0.90 micrometers (band 4) and 1.55 - 1.75 micrometers (band 5).

This provides a set of three equations, one per band:

1. fw * r3w + fv * r3v + fs * r3s = R3
2. fw * r4w + fv * r4v + fs * r4s = R4
3. fw * r5w + fv * r5v + fs * r5s = R5

where fw, fv and fs are the fractions of the pixel covered by water, vegetation and soil; r3w, r4w and r5w are the reflectances of water in each of the three spectral bands (and likewise r3v, r4v, r5v for vegetation and r3s, r4s, r5s for soil); and R3, R4 and R5 are the signals recorded at the satellite in bands 3, 4 and 5.

Therefore, Equation 1 can be interpreted as saying:

• (fraction of the pixel covered by water) * (reflectance of water in band 3)
+
• (fraction of the pixel covered by vegetation) * (reflectance of vegetation in band 3)
+
• (fraction of the pixel covered by soil) * (reflectance of soil in band 3)
=
• (total reflectance recorded by the satellite in band 3)

This mixing model assumes that the reflectance seen at the satellite is the additive sum (or linear mixture) of the reflectances of the individual elements, which is a reasonable approximation. It also requires that the scene being analyzed has been corrected for atmospheric effects and for changes in the position of the sun and the satellite.
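As an illustration of the linear mixing model, the short Python sketch below computes the signal a mixed pixel would produce in bands 3, 4 and 5. The endmember reflectances in ENDMEMBERS and the function name mix are hypothetical, chosen only for illustration; they are not CMP values.

```python
import numpy as np

# Hypothetical endmember reflectances (illustrative values, not CMP data).
# Rows: bands 3, 4, 5. Columns: water, vegetation, soil.
ENDMEMBERS = np.array([
    [0.04, 0.05, 0.20],   # band 3 (red)
    [0.02, 0.45, 0.30],   # band 4 (near infrared)
    [0.01, 0.20, 0.40],   # band 5 (middle infrared)
])

def mix(fractions, endmembers=ENDMEMBERS):
    """Return (R3, R4, R5) for cover fractions (fw, fv, fs), i.e. equations 1 - 3."""
    fractions = np.asarray(fractions, dtype=float)
    # Each row of the product is one equation: fw*r_w + fv*r_v + fs*r_s.
    return endmembers @ fractions

# A pixel that is 50% water, 30% vegetation and 20% soil.
mixed = mix([0.5, 0.3, 0.2])
print(mixed)  # approximately [0.075, 0.205, 0.145]
```

With these assumed values, band 3 works out to 0.5 * 0.04 + 0.3 * 0.05 + 0.2 * 0.20 = 0.075, which is equation 1 evaluated term by term.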

Looking at equations 1 - 3, the individual cover type reflectances are known and the total reflectance in each of the three bands is known, leaving only the three cover fractions unknown. With three equations, the three unknowns can be solved for. Two additional constraints help to check whether the answer is correct:

4. fw + fv + fs = 1; that is:
(fraction of water) + (fraction of vegetation) + (fraction of soil) = 1
5. fw, fv and fs >= 0

Equation 4 will only be true, of course, if accurate endmembers have been chosen (see below) and there are no other materials in the pixel.
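The following Python sketch solves equations 1 - 3 for a single pixel and then checks the two constraints. It is a minimal example under stated assumptions: the endmember values, the function name unmix_pixel and the tolerance tol are hypothetical and not taken from the CMP processing.

```python
import numpy as np

# Hypothetical endmember reflectances (illustrative values, not CMP data).
# Rows: bands 3, 4, 5. Columns: water, vegetation, soil.
ENDMEMBERS = np.array([
    [0.04, 0.05, 0.20],
    [0.02, 0.45, 0.30],
    [0.01, 0.20, 0.40],
])

def unmix_pixel(signal, endmembers=ENDMEMBERS, tol=1e-6):
    """Solve equations 1 - 3 for (fw, fv, fs) and test constraints 4 and 5.

    Returns the fractions and a flag that is True only when the fractions
    sum to 1 and none is negative (both within the assumed tolerance tol).
    """
    fractions = np.linalg.solve(endmembers, np.asarray(signal, dtype=float))
    sums_to_one = abs(fractions.sum() - 1.0) <= tol
    non_negative = np.all(fractions >= -tol)
    return fractions, bool(sums_to_one and non_negative)

# Signal of a pixel that really is 50% water, 30% vegetation, 20% soil.
fractions, plausible = unmix_pixel([0.075, 0.205, 0.145])

# A signal the three endmembers cannot explain fails the checks.
_, plausible_other = unmix_pixel([0.3, 0.1, 0.05])
```

A failed check suggests that the endmembers are poorly chosen for that pixel or that some other material is present, as described above.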

Millions of pixels are assessed for image work, so the fractions are calculated with matrix algebra using the method of Freemantle and Gong (Gong, P., Miller, J., Freemantle, J., and Chen, B. 1991. Spectral decomposition of Landsat Thematic Mapper data for urban land-cover mapping. Proceedings of the 14th Canadian Symposium on Remote Sensing.), available from PCI Inc., Toronto, Ontario.
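To give a feel for why matrix algebra suits this task, the Python sketch below solves equations 1 - 3 for every pixel of an image in a single call. It is a plain linear solve written for illustration, not the method of the cited paper or the PCI software; the endmember values and the names unmix_image and tol are assumptions.

```python
import numpy as np

# Hypothetical endmember reflectances (illustrative values, not CMP data).
# Rows: bands 3, 4, 5. Columns: water, vegetation, soil.
ENDMEMBERS = np.array([
    [0.04, 0.05, 0.20],
    [0.02, 0.45, 0.30],
    [0.01, 0.20, 0.40],
])

def unmix_image(image, endmembers=ENDMEMBERS, tol=1e-6):
    """Unmix an image of shape (rows, cols, 3) holding bands 3, 4 and 5.

    Returns the fractions, shape (rows, cols, 3), ordered water, vegetation,
    soil, and a boolean mask of pixels satisfying constraints 4 and 5.
    """
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).T               # (bands, n_pixels)
    solved = np.linalg.solve(endmembers, pixels)      # (3, n_pixels)
    fractions = solved.T.reshape(rows, cols, -1)
    valid = (np.abs(fractions.sum(axis=-1) - 1.0) <= tol) & np.all(
        fractions >= -tol, axis=-1
    )
    return fractions, valid

# Build a tiny 2 x 2 test image from known fractions, then recover them.
true_fractions = np.array([
    [[0.5, 0.3, 0.2], [1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0], [0.2, 0.2, 0.6]],
])
image = true_fractions @ ENDMEMBERS.T
recovered, valid = unmix_image(image)
```

Solving all pixels against one endmember matrix avoids a Python loop over millions of pixels, which is the practical reason for the matrix formulation.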

Selecting Endmembers

What are endmembers? Figure 1 shows a plot of TM bands 3 & 4 in a two-dimensional image space. The spectral endmembers are the points that define the extreme limits of the data. In a system composed entirely of water and vegetation (perhaps a large lily-pond), the endmembers shown would represent pure water and pure vegetation. The amount of water and vegetation in each pixel that falls between the endmembers in the spectral space could be calculated approximately from the distance the pixel is from each of the endmembers. The endmembers are average values for a set of pixels thought to represent a particular cover type. These numeric values give the reflectances needed for equations 1 - 3. For three cover types, three endmembers would be needed and this is best represented in three-dimensional space.

Figure 1a) Scatterplot of TM band 3 (horizontal axis) and band 4 (vertical axis) showing water and vegetation endmembers for a two-covertype system.

Figure 1b) Gradations of color on the same plot (band 3 vs band 4) show bands of percent water (i.e. 10 - 20 %, 20 - 30 %) from pure water to pure vegetation.
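The two-endmember case can be sketched in Python as follows. An endmember is taken as the mean of a set of sample pixels, and the water fraction of a pixel is estimated from its position along the line joining the two endmembers in band 3 / band 4 space. The sample values and the function names are hypothetical, and projecting the pixel onto that line is one simple way to turn "distance from each endmember" into a fraction, not necessarily the method used to produce Figure 1b.

```python
import numpy as np

def endmember_from_samples(samples):
    """Average a set of (band 3, band 4) sample pixels into one endmember."""
    return np.asarray(samples, dtype=float).mean(axis=0)

def water_fraction(pixel, water, vegetation):
    """Estimate the water fraction of a pixel in a water/vegetation system.

    The pixel is projected onto the line from the vegetation endmember to
    the water endmember; 0 means pure vegetation and 1 means pure water.
    """
    pixel, water, vegetation = (
        np.asarray(v, dtype=float) for v in (pixel, water, vegetation)
    )
    axis = water - vegetation
    t = np.dot(pixel - vegetation, axis) / np.dot(axis, axis)
    # Pixels beyond either endmember are clipped to the valid range.
    return float(np.clip(t, 0.0, 1.0))

# Hypothetical sample pixels thought to be pure water and pure vegetation.
water = endmember_from_samples([[0.03, 0.01], [0.05, 0.03]])
vegetation = endmember_from_samples([[0.04, 0.44], [0.06, 0.46]])

# A pixel halfway between the two endmembers is about half water.
halfway = (water + vegetation) / 2.0
fraction = water_fraction(halfway, water, vegetation)
```

The vegetation fraction in this two-cover system is simply one minus the water fraction.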

The three cover types of interest--water, vegetation and soil--can be separated spectrally by Landsat TM bands 3, 4 and 5; that is, the spectral values for water are very different from those of vegetation and soil in these bands. So if it can be determined just what the digital value for each type of landcover is, then it will be possible to calculate fractions of land cover that will produce the given spectra.

The procedure for choosing the endmembers is the main point of difference among mixture modeling techniques. In general there are three options:

1. develop spectra in the laboratory under carefully controlled conditions.
2. extract spectra from the image.
3. create a new dataspace and extract the endmembers from that.

Pure substances can be used for laboratory measurements, and the lighting and other factors (such as atmospheric interference between the target and the sensor) can be carefully controlled. On the other hand, the reflectances of pure substances do not necessarily correspond to any reflectances in a satellite image, because pure substances rarely occur in nature. Using laboratory spectra also requires that the image being studied be corrected for all variations in lighting and atmospheric effects to match the laboratory results.

There are a number of techniques for extracting endmembers from the image itself. This has the advantage that the endmembers selected do represent actual substances in the image and include lighting and general atmospheric components. Since most scenes can be corrected for much of the atmospheric effect and for changes in the position of the sun, image endmembers will represent surface covers in the scene. The problem with this method is that virtually all pixels in a scene are mixtures, so it may be difficult to know exactly what a given endmember represents.

The third method has several options, each with its own set of advantages and disadvantages.

One of the problems with water, vegetation and soil as endmembers is that they have a natural variance in their spectra, especially soil. Soil brightness can vary from a dark soil to bright sand while still registering as a "soil" type. Figure 2 shows a scatterplot of bands 3 and 4 again, but this time in a system with water, vegetation and soil. Clearly the water and vegetation form tight clusters; however, there is no true endmember for soil, since soil spectra vary across a wide area of the scatterplot. Two choices are available in dealing with this large variability (over a factor of 2): try to choose a different soil endmember for each type of soil (this is difficult because there is essentially an infinite variety of soil colors) or try to find a spectral space in which all soils look the same.

Figure 2: Three-component system showing endmembers of water, vegetation and soil. The yellow stripe represents soil spectra in this spectral space.