Remotely-Sensed Images: An Introduction
Second Edition
Paul M. Mather
School of Geography, The University of Nottingham, UK
kappa = [N Σ x_ii − Σ (x_i+ · x_+i)] / [N² − Σ (x_i+ · x_+i)]   (sums over i = 1, ..., r)

where x_ii are the diagonal elements of the confusion matrix, x_i+ and x_+i are the row and column marginal totals, N is the total number of pixels and r is the number of classes. For the data of Table 8.4:

kappa = (410 × 350 − 28 820) / (168 100 − 28 820) = 114 680 / 139 280 = 0.82
Table 8.4 Confusion or error matrix for six classes. The row labels (Ref.) are those given by an operator using ground reference data. The column labels (Class.) are those generated by the classification procedure. See text for explanation. The four right-hand columns are as follows: (i) number of pixels in class i from ground reference data; (ii) estimated classification accuracy (per cent); (iii) class i pixels in reference data but not given label i by the classifier; and (iv) pixels given label i by the classifier but not class i in reference data. The sum of the diagonal elements of the confusion matrix is 350, and the overall accuracy is therefore (350/410) x 100 = 85.4%.
[Body of Table 8.4 not recoverable from the scan. Row labels (Ref.) and column labels (Class.) cover the six classes; the column sums are 71, 72, 76, 67, 81 and 43, giving a total of 410 pixels.]
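The kappa coefficient described in this section can be computed directly from a confusion matrix. The sketch below is illustrative only: it is not one of the programs supplied on the book's CD, and the two-class matrix is invented for demonstration.

```python
def kappa(matrix):
    """Kappa coefficient of agreement for a square confusion matrix,
    given as a list of rows (reference classes) by columns (classifier)."""
    n = sum(sum(row) for row in matrix)                # total pixels N
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    # Sum over classes of (row marginal total x column marginal total).
    chance = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
                 for i in range(len(matrix)))
    return (n * diagonal - chance) / (n * n - chance)

# Invented 2-class example: 85 of 100 pixels fall on the diagonal.
m = [[40, 10],
     [5, 45]]
print(kappa(m))  # 0.7
```

For the six-class matrix of Table 8.4 this procedure reproduces the values used in the worked example (N = 410, diagonal sum 350, chance term 28 820, kappa = 0.82).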
8.10 Classification accuracy
[Text truncated in scanning: only the right-hand ends of these lines survive. The passage discusses per-class accuracy computed from the rows and columns of the confusion matrix, and introduces the kappa coefficient, which is calculated from the confusion matrix and its row and column marginal totals (r = 6 classes in this example).]
values could be summarised by a conventional probability distribution, for example the hypergeometric distribution, which describes a situation in which there are two outcomes to an experiment, labelled P (success) and Q (failure), and where samples are drawn from a population of finite size. If the population being sampled is large, the binomial distribution (which is easier to calculate) can be used in place of the hypergeometric distribution. These statistical distributions allow the evaluation of confidence limits, which can be interpreted as follows: if a very large number of samples of size N is taken, and if the true proportion of successful outcomes is P, then 95% of all the sample values will lie between P_L and P_U (the lower and upper 95% confidence limits around P). The values of the upper and lower confidence limits depend on (i) the level of probability employed and (ii) the sample size N. The confidence limits get wider as the probability level increases towards 100%, so that we can always say that the 100%
confidence limits range from minus infinity to plus infinity. Confidence limits also get wider as the sample size N
becomes smaller, which is self-evident.
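This dependence on sample size is easy to demonstrate numerically. The sketch below uses the normal approximation to the binomial with a hypothetical observed accuracy of 80%; the 1.96 multiplier is the two-tailed 95% normal deviate.

```python
import math

def ci_half_width(p_percent, n, z=1.96):
    """Half-width (in percentage points) of the normal-approximation
    confidence interval for a percentage p_percent estimated from
    a sample of n observations."""
    q = 100.0 - p_percent
    return z * math.sqrt(p_percent * q / n)

# The interval around an observed accuracy of 80% narrows as N grows.
for n in (20, 80, 320, 1280):
    print(n, round(ci_half_width(80.0, n), 2))
```

Quadrupling the sample size halves the half-width, since the width varies with the square root of N.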
Jensen (1986, p. 228) provides a formula for the calculation of the lower confidence limit associated with a classification accuracy value obtained from a training sample of N pixels. The formula used to determine the required 95% lower confidence limit, given the values of P, Q and N, is:

s = P - [1.645 sqrt(PQ/N) + 50/N]

For an observed accuracy of P = 79.375% (so Q = 20.625%) based on a training sample of N = 480 pixels:

s = 79.375 - [1.645 sqrt(79.375 x 20.625 / 480) + 50/480] = 76.233%

This result indicates that, in the long run, 95% of training samples with observed accuracies of 79.375% will have true accuracies of 76.233% or greater. As mentioned earlier, the size of the training sample influences the confidence limits. If the training sample in the above example had been composed of 80 rather than 480 pixels then the lower 95% confidence limit would be

s = 79.375 - [1.645 sqrt(79.375 x 20.625 / 80) + 50/80] = 71.308%
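The lower-limit calculation can be scripted directly. The sketch below is not code from the book; it assumes the form s = P - [1.645 sqrt(PQ/N) + 50/N], which reproduces the 71.308% figure in the worked example above.

```python
import math

def lower_confidence_limit(p_percent, n, z=1.645):
    """Lower confidence limit (per cent) for a classification accuracy
    of p_percent estimated from n training pixels, including the 50/n
    small-sample correction term."""
    q = 100.0 - p_percent
    return p_percent - (z * math.sqrt(p_percent * q / n) + 50.0 / n)

print(round(lower_confidence_limit(79.375, 480), 2))  # close to 76.23
print(round(lower_confidence_limit(79.375, 80), 2))   # close to 71.31
```

The z value of 1.645 is the one-tailed 95% normal deviate, appropriate because only the lower limit is of interest here.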
[Text truncated in scanning: only the right-hand ends of these lines survive. The passage covers errors of omission and commission derived from the confusion matrix, cites literature on the assessment of classification accuracy and on area estimation from remotely-sensed data, and discusses the spatial pattern of classification errors across the image.]
8.11 SUMMARY
Compared to other chapters of this book, this chapter shows the greatest increase in size relative to the 1987 edition. To some extent this is a reflection of the author's own interests. However, the developments in classification methodology over the past 10 years have been considerable, and the problem has been what to omit. The introduction of artificial neural net classifiers, fuzzy methods, new techniques for computing texture features, and new models of spatial context have all occurred during the past decade. This chapter has hardly scratched the surface, and readers are encouraged to follow up the references provided at various points. I have deliberately avoided providing potted summaries of each paper or book to which reference is made in order to encourage readers to spend some of their time in the library. However, 'learning by doing' is always to be encouraged. The CD supplied with this book contains some programs for image classification. These programs are intended to provide the reader with an easy way into image classification. More elaborate software is required if methods such as artificial neural networks, evidential reasoning and fuzzy classification procedures are to be used. It is important, however, to acquire familiarity with the established methods of image classification before becoming involved in advanced methods and applications.
Despite the efforts of geographers following in the
footsteps of Alexander von Humboldt over the past 150
years, we are still a long way from being able to state with
any acceptable degree of accuracy the proportion of the
Earth's land surface that is occupied by different cover
types. At a regional scale, there is a continuing need to
observe deforestation and other types of land cover
change, and to monitor the extent and productivity of
agricultural crops. More reliable, automatic, methods of
image classification are needed if answers to these problems are to be provided in an efficient manner. New [...] becoming available. The early years of the new millennium will see a very considerable increase in the volumes of Earth observation data being collected from space platforms, and much greater computer power (with intelligent software) will be needed if the maximum value is to
be obtained from these data. An integrated approach to
geographical data analysis is now being adopted, and
this is having a significant effect on the way image
classification is performed. The use of non-remotely-sensed data in the image classification process is providing the possibility of greater accuracy, while, in turn, the greater reliability of image-based products is improving the capabilities of environmental GIS, particularly
with respect to studies of temporal change.
All of these factors will present challenges to the
remote sensing and GIS communities, and the focus of
research will move away from specialised algorithm
development to the search for methods that satisfy user
needs and are broader in scope than the statistically
based methods of the 1980s, which are still widely used
in commercial GIS and image processing packages. If
progress is to be made then high-quality interdisciplinary work is needed, involving mathematicians, statisticians, computer scientists and engineers as well as Earth
scientists and geographers. The future has never looked
brighter for researchers in this fascinating and challenging area.
8.12 QUESTIONS
1. Explain the following terms: labelling, classification,
clustering, unsupervised, supervised, pattern, feature,
pattern recognition, Euclidean space, per-pixel, perfield, texture, context, divergence, decision rule,
spatial autocorrelation, prior probability, neuron,
feed-forward, multi-layer perceptron, steepest descent, geostatistics, variogram, image segmentation,
GLCM, fractal dimension, kappa.
2. What is meant by the term 'feature space'? How can
you measure similarities between points (representing objects to be classified) in a feature space of n
dimensions, where n > 3?
3. Compare the operation of the k-means and
ISODATA unsupervised classifiers. Use the programs k-means and isodata (described in Appendix B) to carry out two unsupervised classifications
of one of the test images on the CD. Summarise your
experiences in note form.
4. The parallelepiped, supervised k-means and maximum likelihood classifiers are described as parametric. Explain. These three classifiers use, respectively,
the extreme pixel values in each band, the mean pixel