
Digital Image Classification Geography 4354 Remote Sensing

Lab 11 Dr. James Campbell December 10, 2001

Group #4 Mark Dougherty Paul Bartholomew Akisha Williams Dave Trible Seth McCoy

Table of Contents:
Introduction
Overview of Procedure
Preliminary procedures
  Training Sets
  Signature Editor
CLASSIFICATION
  Unsupervised Classification
  Supervised Classification
    Non-Parametric Classification Techniques
Results
  Error Matrices
  Land-use Percentages As Classified
Conclusion

Introduction
The purpose of this lab was to give students hands-on experience in digital image classification. Government and private agencies use image classification, such as that presented here, as a tool in urban planning, in policy development, and in implementing laws dealing with present and future land use. The objective was to classify the SW quarter of a Landsat TM image that approximately covers the area of the USGS 15-minute Radford Quadrangle (Figures 1 and 2).

Figure 1 - Landsat scene approximating much of the area represented in the USGS Radford 15-minute quadrangle. Image is 1024 x 1024 pixels.

Figure 2 - Subset of the original nrv.img image. The subset covers the 512 x 512 pixel area in the SW quarter of nrv.img.

Overview of Procedure
The image subset was classified into four classes: 1) urban, suburban, or bare (fallow); 2) agriculture; 3) water; and 4) forested. ERDAS Imagine 8.4 was used to manipulate the image and to create the classification and analysis. Areas that appeared to be related to the land types of interest were digitized using the Area of Interest (AOI) tool and added to a signature file. Each digitized area was added to the program's signature editor and checked to determine whether its brightness signature had a unimodal, leptokurtic distribution. A unimodal distribution indicates that the reflectance values for the AOI likely come from one type of land feature, such as forest or cropland. Once the AOIs were chosen and checked statistically, their signatures were merged into the four classes described above. These classes were used as training sets for the final supervised classification procedure. Various algorithms were run for the final supervised classification and compared to one another for accuracy. Microsoft Excel was used to analyze the error matrices, which were used to check the separability of the training sets used in classifying the various land uses, and to determine the percentage of land use in each category, based on 30 m x 30 m pixel counts.

Preliminary procedures
The following procedures were used in this project to obtain a classified image of land use. Training sets and the signature editor, described below, were the first steps used in this procedure.

Training Sets
After subsetting the image to the SW quadrant, areas of visually homogeneous spectral response were chosen as AOIs and added to the spectral signature editor (Figure 3).

Figure 3 - Areas that visually seemed to represent the spectral characteristics of the land use categories of forest, agriculture, urban, and water were digitized and added to the Signature Editor.

The regions were digitized using two methods: the polygon tool and the seed tool. The polygon method simply allows one to draw a polygon around an area whose spectral characteristics represent a particular class. Alternatively, the seed tool allows one to pick a single pixel from which to grow a region by way of a statistical algorithm. The water of the New River, shown in Figure 3, was chosen this way. In using the seed tool, a Region Grow Properties dialog box can be invoked to set the way in which the seed tool works. In Figure 4, the 500000 given in the Area box sets the maximum number of pixels to which the seed can grow. The Euclidean distance parameter constrains the variation in spectral characteristics for the pixels that can be selected.

Figure 4 - Seed Growing Properties dialog box as used to select the training data.

In short, the subsetted image was classified into spectral classes and then assigned meaningful informational classes. Spectral classes are classes that are uniform with respect to the spectral values that compose each pixel. Informational classes are composed of numerous spectral classes, spectrally distinct groups of pixels that together may be assembled to form a unique descriptive class. Spectral classes and informational classes are used together when identification is more or less obvious from the spectral properties of the class. It is possible to match spectral and informational categories by examining patterns on the image. Many informational categories are recognizable by the positions, sizes, and shapes of individual parcels and their spatial correspondence with areas of known identity. It was possible to take advantage of this feature with the imagery provided because of our familiarity with the geographic region.
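The seed tool's region growing, described above, can be illustrated with a small sketch. This is not the ERDAS algorithm itself: it assumes a 4-connected flood fill in which each neighbor is compared with the seed pixel only, and the names `grow_region`, `max_distance`, and `max_pixels` are hypothetical stand-ins for the Euclidean distance and Area settings of the dialog box. The tiny two-band image is made-up data.

```python
from collections import deque
from math import sqrt


def grow_region(image, seed, max_distance, max_pixels):
    """Grow a 4-connected region from `seed` over a grid of band tuples.

    A neighbor joins the region when the Euclidean distance between its
    band values and the seed pixel's band values is <= max_distance.
    Growth stops once max_pixels pixels have been collected.
    (Assumption: comparison is against the seed pixel, not a running mean.)
    """
    rows, cols = len(image), len(image[0])
    seed_values = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue and len(region) < max_pixels:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in region:
                continue
            dist = sqrt(sum((a - b) ** 2
                            for a, b in zip(image[nr][nc], seed_values)))
            if dist <= max_distance:
                region.add((nr, nc))
                queue.append((nr, nc))
                if len(region) >= max_pixels:
                    break
    return region


# Made-up 3 x 3 image with two bands per pixel: dark "water" on the left,
# bright "land" on the right.
image = [
    [(10, 5), (11, 6), (60, 80)],
    [(12, 5), (61, 79), (62, 81)],
    [(10, 4), (11, 5), (63, 80)],
]

water_region = grow_region(image, seed=(0, 0), max_distance=5, max_pixels=500000)
capped_region = grow_region(image, seed=(0, 0), max_distance=5, max_pixels=3)
```

With a generous pixel cap the region covers only the five dark pixels; with a cap of 3 the growth stops early, which is the role the Area setting plays.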

Signature Editor
Once the base training sets are established (Figure 6), each training set signature needs to be scrutinized by looking at the brightness count histogram for each band of each set. The histogram should exhibit a unimodal distribution in each band (Figure 5). A bimodal distribution would be evidence that the training area contained two distinct classes of pixels instead of one (e.g., a region picked to train pixels of agricultural land may also include some forested area). Training sets that fail this test need to be deleted and, if necessary, replaced.

Figure 5 - Histogram of an area picked as a training region for forest. The left shows the blue band, the middle is the green band, and at right is the mid-infrared band. The mid-infrared band has a marginally unimodal distribution.

Figure 6 - Signature Editor showing training sets. Individual training sites were picked so they were evenly distributed throughout the image.

Training signatures were merged to produce five signatures (the four classes plus one for forested areas in shadow), as shown in Figure 7.

Figure 7 - Signature editor with five subclasses including a separate shadow spectral signature

Because the shadowed forest areas remain spectrally distinct from the water areas (they differ in infrared reflectivity), we merged the forest and shadowed forest signatures into a single forest signature, for a total of four signatures, as shown in Figure 8.

Figure 8 - Signature editor with four subclasses; the shadow spectral signature is merged with forest

CLASSIFICATION
The following sections describe the various forms of image classification used in this project, all of which were forms of supervised classification. Supervised classification is usually appropriate when you want to identify relatively few classes, when you have selected training sites that can be verified with ground truth data, or when you can identify distinct, homogeneous regions that represent each class. A brief discussion of the difference between unsupervised and supervised classification is presented in order to better understand the two main groups of classification procedures available. In order to describe the classification procedures used in an intuitive manner, the supervised classification techniques are divided into two standard groups called non-parametric and parametric procedures.

Unsupervised Classification
Unsupervised classification is the identification of natural groups, or structures, within multispectral data by the algorithms programmed into the software. The following characteristics apply to an unsupervised classification. No extensive prior knowledge of the region is required, unlike supervised classification, which requires detailed knowledge of the area. The opportunity for human error is minimized because the operator may specify only the number of categories desired and, sometimes, constraints governing the distinctness and uniformity of groups. Many of the detailed decisions required for supervised classification are not required, which creates less opportunity for the operator to make errors. Unsupervised classification allows unique classes to be recognized as distinct units. Supervised classification may allow these unique classes to go unrecognized; they could inadvertently be incorporated into other classes, creating error throughout the entire classification.

Supervised Classification
Supervised classification is the process of using samples of known identity to classify pixels of unknown identity. The following characteristics apply to a supervised classification. The analyst has control of a selected menu of informational categories tailored to a specific purpose and geographic region. Supervised classification is tied to specific areas of known identity, provided by selecting training areas. Supervised classification is not faced with the problem of matching spectral categories on the final map with the informational categories of interest. The operator may be able to detect serious errors by examining training data to determine whether they have been correctly classified. In supervised training, it is important to have a set of desired classes in mind and then create the appropriate signatures from the data. You must also have some way of recognizing pixels that represent the classes that you want to extract.

Non-Parametric Classification Techniques:


A nonparametric classifier uses a set of nonparametric signatures to assign pixels to a class based on their location, either inside or outside the area in the feature space image. A nonparametric signature is based on an AOI that you define in the feature space image for the image file being classified.
Figure 9 - Parallelepiped Results

Parallelepiped: or the box decision rule classifier


In this procedure two image bands are used to determine the training area of the pixels in each band based on maximum and minimum pixel values. Although parallelepiped proved the most accurate of the classification techniques tested here, it is not the most widely used because it has several disadvantages. The most important disadvantage is that it can leave many unclassified pixels. Another disadvantage is that the regions defined by different training sets can overlap. In the parallelepiped decision rule, the data file values of the candidate pixel are compared to upper and lower limits. These limits can be either: the minimum and maximum data file values of each band in the signature; the mean of each band, plus and minus a number of standard deviations; or any limits that you specify, based on your knowledge of the data and signatures. This knowledge may come from the signature evaluation techniques using the Parallelepiped Limits utility in the Signature Editor. There are high and low limits for every signature in every band. When a pixel's data file values are between the limits for every band in a signature, the pixel is assigned to that signature's class. Two-dimensional parallelepiped classifications are common. Any Feature Space AOI can be defined as a non-parametric signature as part of a classification procedure, as follows: a polygon is drawn in the feature space image that corresponds to an area that you have identified as a specific type of land cover. The specific land types are identified in a feature space image by looking at the value and frequency of the two-band combinations used in the image. The intensity of the colors indicates the number of pixels for each band in a certain brightness range, as shown in Figures 10 and 11, below.
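The parallelepiped decision rule described above can be sketched in a few lines. This is a minimal illustration, not the ERDAS implementation; the function name `parallelepiped_classify` and the two-band limits are hypothetical values chosen only to show an unclassified pixel and an overlap.

```python
def parallelepiped_classify(pixel, limits):
    """Return every class whose low/high limits contain the pixel in all bands.

    pixel  : sequence of data file values, one per band
    limits : dict mapping class name -> list of (low, high) pairs, one per band
    An empty list means the pixel is left unclassified; more than one name
    means the boxes overlap at this pixel.
    """
    return [
        name
        for name, bounds in limits.items()
        if all(low <= value <= high
               for value, (low, high) in zip(pixel, bounds))
    ]


# Hypothetical two-band limits (illustrative values, not from the lab data).
limits = {
    "water": [(5, 20), (0, 15)],
    "forest": [(15, 40), (10, 90)],
}

inside_one_box = parallelepiped_classify((10, 5), limits)
inside_overlap = parallelepiped_classify((18, 12), limits)
outside_all = parallelepiped_classify((100, 100), limits)
```

The second and third calls show the two disadvantages named in the text: a pixel that falls in two boxes at once, and a pixel that falls in none.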

Figure 10 - Feature Space Band 3,4

Figure 11 - Feature Space with labels

The feature space decision rule determines whether or not a candidate pixel lies within the nonparametric signature in the feature space image. When a pixel's data file values fall inside the feature space signature, the pixel is assigned to that signature's class; this is a two-dimensional feature space classification. The polygons in the image are AOIs used to define the feature space signatures.
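One way to picture the feature space decision rule is as a point-in-polygon test on a pair of band values. The sketch below uses a standard ray-casting test; it is an illustration under that assumption, not a description of how ERDAS evaluates feature space AOIs, and the polygon coordinates are made up.

```python
def in_feature_space_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside the polygon?

    polygon is a list of (x, y) vertices in feature space, for example
    (band 3 value, band 4 value). Points exactly on an edge are not
    handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical "water" AOI drawn in a band 3 / band 4 feature space image.
water_aoi = [(5, 0), (25, 0), (25, 20), (5, 20)]

dark_pixel_is_water = in_feature_space_polygon(10, 10, water_aoi)
bright_pixel_is_water = in_feature_space_polygon(60, 40, water_aoi)
```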

Parametric Classification Techniques


Parametric methods of supervised classification take a statistical approach. A parametric signature is based on statistical parameters (e.g., mean and covariance matrix) of the pixels that are in the training sample or cluster. A parametric signature includes the following attributes in addition to the standard attributes for signatures: the number of bands in the input image (as processed by the training program); the minimum and maximum data file value in each band for each sample or cluster (minimum vector and maximum vector); the mean data file value in each band for each sample or cluster (mean vector); the covariance matrix for each sample or cluster; and the number of pixels in the sample or cluster.

Maximum Likelihood: This classification method uses the training data to estimate the means and variances of the classes, which are then used to estimate probabilities. Maximum likelihood classification considers not only the mean or average values in assigning classification, but also the variability of brightness values in each class. It is the most powerful of the classification methods as long as accurate training data is provided. Therefore this method requires excellent training data. An advantage of this method is that it provides an estimate of overlap areas based on statistics. This method differs from parallelepiped, which uses only maximum and minimum pixel values.

Figure 12 - Maximum Likelihood classification
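The maximum likelihood approach can be sketched for the two-band case with equal class probabilities. The code below is an illustration under those assumptions, with hypothetical helper names (`signature_stats`, `log_likelihood`, `classify_ml`) and made-up training pixels; it estimates a mean vector and covariance matrix per class and assigns a pixel to the class with the highest Gaussian log-likelihood.

```python
from math import log


def signature_stats(samples):
    """Mean vector and 2x2 sample covariance of two-band training pixels."""
    n = len(samples)
    m0 = sum(s[0] for s in samples) / n
    m1 = sum(s[1] for s in samples) / n
    c00 = sum((s[0] - m0) ** 2 for s in samples) / (n - 1)
    c11 = sum((s[1] - m1) ** 2 for s in samples) / (n - 1)
    c01 = sum((s[0] - m0) * (s[1] - m1) for s in samples) / (n - 1)
    return (m0, m1), ((c00, c01), (c01, c11))


def log_likelihood(pixel, mean, cov):
    """Gaussian log-likelihood, dropping the constant shared by all classes."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    # (x - m)^T C^-1 (x - m) for a symmetric 2x2 covariance matrix
    quad = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return -0.5 * log(det) - 0.5 * quad


def classify_ml(pixel, signatures):
    """Pick the class with the highest log-likelihood (equal priors assumed)."""
    return max(signatures,
               key=lambda name: log_likelihood(pixel, *signatures[name]))


# Made-up two-band training pixels (not values from the lab image).
training = {
    "water": [(10, 5), (12, 6), (11, 4), (13, 7), (9, 5)],
    "urban": [(60, 40), (80, 70), (50, 55), (90, 45), (70, 60)],
}
signatures = {name: signature_stats(px) for name, px in training.items()}

dark_pixel_class = classify_ml((12, 6), signatures)
bright_pixel_class = classify_ml((65, 50), signatures)
```

Because the covariance matrix enters both the determinant and the quadratic term, a tightly clustered class such as water rejects pixels much sooner than a widely varying class such as urban.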

The maximum likelihood decision rule is based on the probability that a pixel belongs to a particular class. The basic equation assumes that these probabilities are equal for all classes and that the histograms of the input bands have normal distributions. If this is not the case, you may have better results with the parallelepiped or minimum distance decision rule, or by performing a first-pass parallelepiped classification.

Mahalanobis distance: Mahalanobis distance classification is similar to minimum distance classification, except that the covariance matrix is used in the equation. Variance and covariance are figured in so that clusters that are highly varied lead to similarly varied classes, and vice versa. For example, when classifying urban areas (typically a class whose pixels vary widely), correctly classified pixels may be farther from the mean than those of a class for water, which is usually not a highly varied class. The Mahalanobis distance algorithm assumes that the histograms of the bands have normal distributions. If this is not the case, you may gain better results with the parallelepiped or minimum distance decision rule, or by performing a first-pass parallelepiped classification.

Figure 13 - Mahalanobis distance results

Minimum distance: The minimum distance decision rule (also called spectral distance) calculates the spectral distance between the measurement vector for the candidate pixel and the mean vector for each signature. The candidate pixel is assigned to the class whose mean is closest. The minimum distance classifier can be used as a supplement to the parallelepiped classification method, which can leave unclassified pixels. The technique takes pixels of known identity and then assigns the pixels closest to them to the same class. Like the other methods of classification, this method uses two bands to evaluate the training data.
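The difference between the minimum distance and Mahalanobis distance rules can be seen in a short two-band sketch. The signature means and covariance matrices below are hypothetical illustrative values, not statistics from the lab image, and the function names are assumptions made for this example.

```python
from math import sqrt


def euclidean_distance(pixel, mean):
    """Spectral (Euclidean) distance between a pixel and a class mean."""
    return sqrt(sum((p - m) ** 2 for p, m in zip(pixel, mean)))


def mahalanobis_sq(pixel, mean, cov):
    """Squared Mahalanobis distance for a symmetric 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det


# Hypothetical signatures: water is tightly clustered, urban varies widely.
signatures = {
    "water": {"mean": (11.0, 5.4), "cov": ((2.5, 1.25), (1.25, 1.3))},
    "urban": {"mean": (70.0, 54.0), "cov": ((250.0, 25.0), (25.0, 142.5))},
}


def classify(pixel, rule):
    """Assign the pixel to the nearest class under the chosen distance rule."""
    if rule == "minimum":
        def distance(name):
            return euclidean_distance(pixel, signatures[name]["mean"])
    else:  # "mahalanobis"
        def distance(name):
            sig = signatures[name]
            return mahalanobis_sq(pixel, sig["mean"], sig["cov"])
    return min(signatures, key=distance)


pixel = (25, 12)
by_minimum_distance = classify(pixel, "minimum")
by_mahalanobis = classify(pixel, "mahalanobis")
```

For this made-up pixel the plain spectral distance favors water, whose mean is nearer, while the Mahalanobis distance favors urban, because the pixel lies many standard deviations outside the narrow water cluster but well within the spread of the urban class.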

Figure 14 - Minimum Distance results

The source of much of the above information on classification came from ERDAS Field Guide, Fifth Edition, Revised and Expanded. Taken from ERDAS IMAGINE On-Line Help Copyright (c) 1982-1999 ERDAS, Inc.

Results
The following section presents the results of this project, including separability analyses, error matrix analysis, and final land use classification acreages and percentages.

Separability and Error Matrices: Transformed divergence (TD) has upper and lower bounds. If the calculated divergence is equal to the upper bound, then the signatures can be said to be totally separable in the bands being studied. A calculated divergence of zero means that the signatures are inseparable. According to the ERDAS Field Guide, TD ranges between 0 and 2000. A separability listing is a report of the computed divergence for every class pair for one band combination. The listing contains every divergence value for the bands studied for every possible pair of signatures. The separability listing also contains the average divergence and the minimum divergence for the band set. These numbers can be compared to other separability listings (for other band combinations) to determine which set of bands is the most useful for classification. The separability cell array shown in Figure 15 presents the results of one of the classifications, showing the range of values (from 0 to 2000) used to quantify the separability of the signatures.

Figure 15 - Separability array
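For reference, transformed divergence is commonly written in terms of the class mean vectors and covariance matrices. The form below is a sketch of that standard definition using symbols that do not appear in the report: $\mu_i$ and $C_i$ are assumed to be the mean vector and covariance matrix of signature $i$, and $\operatorname{tr}$ is the matrix trace.

```latex
D_{ij} = \tfrac{1}{2}\operatorname{tr}\!\left[(C_i - C_j)\,(C_j^{-1} - C_i^{-1})\right]
       + \tfrac{1}{2}\operatorname{tr}\!\left[(C_i^{-1} + C_j^{-1})\,(\mu_i - \mu_j)(\mu_i - \mu_j)^{T}\right]

TD_{ij} = 2000\left(1 - \exp\!\left(-\frac{D_{ij}}{8}\right)\right)
```

Because the divergence is never negative, the exponential term lies between 0 and 1, so TD is 0 when the divergence is zero and approaches 2000 as the divergence grows, which gives the 0 to 2000 range quoted above.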

As can be seen in Figure 15, all separability values are above 1900, showing very good separability between the five classes and suggesting that the results of the final classification are accurate. The results would still need to be ground-truthed as a final accuracy check. Error matrices, presented in Table 1, show the percent accuracy of each classification method using the image provided.


Error Matrices

Table 1 (PA = producer's accuracy, CA = consumer's accuracy, EO = error of omission, EC = error of commission)

Parallelepiped
                      ag     water     urban    forest    totals       PA%       EO%       EC%
Ag                  3859         0        48         8      3915   98.5696  1.430396  1.253838
Water                  0      5654         0       517      6171   91.6221  8.377897   2.06229
Urban                 17         0      3134        11      3162  99.11448  0.885515  1.508485
Forest                32        98         0      5598      5728  97.73045  2.269553  8.738181
column total        3908      4752      3182      6134     18245
CA(%)           98.74616  118.9815  98.49151  91.26182

Maximum Likelihood
                      ag     water     urban    forest    totals       PA%       EO%       EC%
Ag                  3859         0        52        20      3931   98.1684  1.831595  1.253838
Water                  0      4650         0       521      5171  89.92458  10.07542  2.146465
Urban                 17         0      3130        31      3178  98.48962  1.510384  1.634192
Forest                32       102         0      5562      5696  97.64747  2.352528  9.325073
column total        3908      4752      3182      6134     17201
CA(%)           98.74616  97.85354  98.36581  90.67493

Mahalanobis
                      ag     water     urban    forest    totals       PA%       EO%       EC%
Ag                  3819         0        25        47      3891  98.14958  1.850424   2.27738
Water                  0      3866         0       154      4020  96.16915  3.830846  18.64478
Urban                 65         0      3157        96      3318  95.14768  4.852321  0.785669
Forest                24       886         0      5837      6747  86.51252  13.48748  4.841865
column total        3908      4752      3182      6134     16679
CA(%)           97.72262  81.35522  99.21433  95.15813

Minimum Distance
                      ag     water     urban    forest    totals       PA%       EO%       EC%
Ag                  3773         0        22       180      3975  94.91824  5.081761  3.454452
Water                  0      4750         0      2240      6990  67.95422  32.04578  0.042088
Urban                122         0      2925       972      4019   72.7793   27.2207  8.076681
Forest                13         2       235      2742      2992  91.64439  8.355615  55.29834
column total        3908      4752      3182      6134     14190
CA(%)           96.54555  99.95791  91.92332  44.70166
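The accuracy columns in the error matrices can be reproduced with a short calculation. The sketch below uses the Maximum Likelihood counts from Table 1 and follows the layout used in this report, where PA is the diagonal count divided by the row total, CA is the diagonal count divided by the column total, EO is 100 minus PA, and EC is 100 minus CA. The function name `accuracy_report` is hypothetical, and the overall accuracy figure is an addition that the report itself does not tabulate.

```python
classes = ["ag", "water", "urban", "forest"]

# Counts from the Maximum Likelihood matrix in Table 1 (rows and columns
# in the order: ag, water, urban, forest).
matrix = [
    [3859, 0, 52, 20],
    [0, 4650, 0, 521],
    [17, 0, 3130, 31],
    [32, 102, 0, 5562],
]


def accuracy_report(matrix):
    """Return PA, EO, CA, EC (one value per class) and overall accuracy, in %."""
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]
    pa = [100 * diagonal[i] / row_totals[i] for i in range(n)]
    ca = [100 * diagonal[i] / col_totals[i] for i in range(n)]
    eo = [100 - p for p in pa]
    ec = [100 - c for c in ca]
    overall = 100 * sum(diagonal) / sum(row_totals)
    return pa, eo, ca, ec, overall


pa, eo, ca, ec, overall = accuracy_report(matrix)
for name, p, c in zip(classes, pa, ca):
    print(f"{name:<8} PA {p:8.4f}  CA {c:8.4f}")
print(f"overall  {overall:8.4f}")
```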


Land-use Percentages As Classified


A tabular list of acres for each informational class, as well as a percentage for each land use, is shown in table 2 below. Acreages were determined using the 30m-pixel resolution. Minimum distance classification was found to be the most different of the four methods used, as far as percent of land in each category.
Table 2 - Classified Land-use Acreages and Percentages
Method               Urban acres     Urban %    Ag acres        Ag % Water acres     Water % Forest acres   Forest %
Minimum distance           18519       31.76       10813       18.54        2975        5.10        26002      44.59
Parallelepiped             11723       20.11       12691       21.76        1313        2.25        32582      55.88
Maximum likelihood         11081       19.00       14737       25.27        1317        2.26        31174      53.46
Mahalanobis                14537       24.93       14073       24.14         978        1.68        28722      49.26
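The acreage figures follow from simple pixel arithmetic, sketched below. The conversion factor is the international acre (4046.8564224 square meters); the report does not say which factor was used in Excel, so small differences from the tabulated acreages are expected. The function name `pixels_to_acres` is hypothetical.

```python
SQ_M_PER_ACRE = 4046.8564224  # international acre (assumed conversion factor)
PIXEL_SIZE_M = 30.0           # Landsat TM pixel edge length used in the report


def pixels_to_acres(pixel_count, pixel_size_m=PIXEL_SIZE_M):
    """Convert a count of square pixels to acres."""
    return pixel_count * pixel_size_m ** 2 / SQ_M_PER_ACRE


# The classified subset is 512 x 512 pixels.
total_pixels = 512 * 512
total_acres = pixels_to_acres(total_pixels)

# A class percentage is its acreage over the sum of all four classes,
# here the parallelepiped row of the land-use table.
parallelepiped_acres = {"urban": 11723, "agriculture": 12691,
                        "water": 1313, "forest": 32582}
forest_percent = (100 * parallelepiped_acres["forest"]
                  / sum(parallelepiped_acres.values()))

print(f"subset area: {total_acres:,.0f} acres")
print(f"forest share (parallelepiped): {forest_percent:.2f}%")
```

The subset works out to roughly 58,300 acres, close to the row sums in the table, and the forest share reproduces the tabulated 55.88 percent.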

Conclusion
The parallelepiped classification method was determined to be the most accurate of the methods tested in this study, based on both producer and consumer accuracies (found in the error matrices). This is interesting because the parallelepiped classifier does not take a statistical approach. Rather, this method uses two image bands to determine the training area of the pixels in each band based on maximum and minimum pixel values. Although parallelepiped is one of the most accurate of the classification techniques, it is not the most widely used because it has several disadvantages, the most important of which is that it can leave many unclassified pixels. The minimum distance classifier was found to be the least accurate of the four classification techniques evaluated, with the lowest overall accuracies. The Mahalanobis distance classifier resulted in the most aesthetically pleasing image because it did the best job of filtering out shadows. A possible explanation is that the parametric Mahalanobis distance classifier is able to classify highly varied clusters, such as the various shaded areas, into similarly varied classes. The maximum likelihood classification method was found to be intermediate relative to the other classifiers evaluated.
