
An Open Source Implementation of Automated Orthorectification using a Rational Polynomial Coefficients Model


Christoph Stallmann, James Meyer, Laurette Pretorius, Francois Maass
Department of Computer Science, University of Pretoria, Pretoria, 0002, South Africa
foxhat.solutions@gmail.com

Abstract
Every day, thousands of satellite and aerial photographs are taken for use in domains such as GIS
analysis and city planning. These images are subject to distortion due to camera tilt and ground relief.
In order to use the imagery, this distortion must be corrected by creating a top-down view of the source
image. This process is called orthorectification and enables the end-user to accurately measure real
distances, angles and areas. In our paper we present an open source implementation for automated
orthorectification using a set of cross-referenced Ground Control Points (GCPs), called AutoGCP.
Commercial software packages providing automated orthorectification are either extremely expensive
or inaccurate in detecting GCPs. Manual GCP selection on the other hand is labour intensive and may
result in warped images, due to misplaced GCPs. The openness of the source code allows for
continuous improvements and by using academically proven algorithms and calculations, AutoGCP
provides a trusted implementation that can be verified. In our paper we firstly explain the theoretical
process of detecting and extracting GCPs from a previously orthorectified reference image,
cross-referencing these points with the distorted raw image and finally correcting the raw image, creating an
ortho-image. Secondly, the practical implementation of the above-mentioned processes is discussed.
The AutoGCP C++ library extends the existing open source Quantum GIS (QGIS) library and a
Python plug-in for QGIS provides an aesthetically pleasing front-end interface for the automated
orthorectification processes.

1. Introduction
Satellite images have become an important part of many modern technologies and services. When
used by GPS navigation, agricultural and district planning or climate observation, satellite and air
photographs provide a base map layer that can be used for spatial analysis. These images also play an
important role during both disaster prediction and the management thereof. For example, in May 2008
the United Nations Institute for Training and Research (UNITAR) Operational Satellite Application
Program (UNOSAT) used satellite imagery to indicate the path and calculate the impact of Cyclone
Nargis, which struck Myanmar (Burma) (Bjorgo et al, 2008). However, raw satellite images are subject to
distortions caused by various factors such as ground relief and camera tilt, resulting in a
misrepresentation of the area covered by a satellite image. This may lead to incorrect measurements of
distances, areas and angles, rendering the raw satellite images unsuitable for tasks relying on map-like
accuracy. Orthorectification is the process of reducing the distortion by creating an orthographic view
(or ortho-image) of the raw input satellite image and eliminating most of the above-mentioned
distortions.
In cases such as the cyclone in Burma, time is of utmost importance to take the necessary
precautions or start an evacuation. Most current software implementations of this process still require
that many tasks be performed manually by the user, resulting in a very time-consuming process, while
others have limited accuracy, or a low level of usability. These limitations are described in more detail
in each relevant section.
In this paper we discuss an orthorectification implementation that automates the process of
correcting a distorted image. Orthorectification is available through many open source and commercial
applications. Commercial products, such as PCI Geomatica, are mostly well established and produce
good results, but due to the cost of purchasing licences, the use of these products is limited
to big companies or single users who can afford it. An alternative is offered by open source
applications like Orfeo Toolbox (OTB). These applications are free of charge and mostly provide
solutions that are as good as their commercial counterparts. The major problems with
most open source remote sensing applications are, firstly, that they are difficult to use, especially for users
with only a limited GIS background. Secondly, the individual components for the orthorectification process in
these applications are often poorly linked or even absent. The user may be required to use
different applications for different processes, for example one application for detecting the GCPs and
another one for the orthorectification. Thirdly, open source applications often require parameters that
are not known by the user but could easily be read from the image's metadata. For instance, OTB's
orthorectification filter requires the input of the image's UTM zone and the coordinates of the top left
corner. These parameters are in most cases available in the metadata header of the image. The proposed
solution in this article provides an open source implementation of the orthorectification process from
beginning to end without the need of human interaction in-between. All parameters and algorithms (as
far as possible) are automatically retrieved and adjusted, leaving the only task for the user to specify the
input images, elevation model and output destination. The implementation was provided to the Council
for Scientific and Industrial Research (CSIR) Satellite Application Centre (SAC) as an extension to an
existing GIS application, Quantum GIS (QGIS). A library and plug-in called AutoGCP
(http://foxhat.org) were developed as an extension to QGIS (http://www.qgis.org), which is a
user-friendly, open-source desktop Geographic Information System. It is a project of the Open Source
Geospatial Foundation (OSGeo) and provides viewing, analysis and editing functionality in Windows,
Linux, Unix and Mac OS X.
There are several ways of achieving orthorectification. Some aerial and satellite cameras produce
exact physical sensor model data, embedded in the metadata of the raw images, that can be used to
correct those images accurately. Unfortunately this metadata is not always incorporated
into the images. The approach taken by AutoGCP is to construct an analytical sensor model that
approximates physical models to a very high degree of accuracy. The process consists of the selection
of ground control points (GCP) from an existing orthographic reference image of the same area
covered by the newly taken raw image. These control points are then found on the raw image and their
relative coordinate offsets are used to construct the analytical sensor model.
The following sections will describe the four primary stages of the process as it is implemented in
the AutoGCP library, namely the collection of ground control points from sharp edges on the reference
image, projection of the GCP and surrounding chip (area of an image with a user defined width and
height surrounding a GCP) to suitable target coordinate systems, cross-correlating the GCP with the
raw image, and producing the corrected image by constructing an analytical Rational Polynomial
Coefficients (RPC) sensor model.

2. The Orthorectification Process in AutoGCP


The rectification process as implemented in AutoGCP can be divided into four sub-processes,
namely edge detection, chip projection, cross-correlation and rectification of the image. A previously
orthorectified image will be used as reference image from which to detect prominent edges and extract
the GCP which will then be cross-referenced with the new (raw) image. The information collected
during the cross-correlation is then used to build an RPC model to orthorectify the raw image.
2.1 Edge detection
The first step in rectifying an image is to collect Ground Control Points (GCP) on the reference
image. GCPs are prominent points on the earth's surface with fixed coordinates over time, for example
crossroads, rivers or cliffs. A set of cross-referenced GCPs serves as input to the orthorectification
process. Software such as the Georeferencer plug-in (http://gis-lab.info/qa/qgis-georef-new-eng.html)
for QGIS relies on the selection of GCPs by hand. Manual selection is time consuming and labour
intensive, and a single accidentally misplaced GCP will lead to a poorly constructed RPC model and
therefore an unwanted warped ortho-image. This might further result not only in low productivity, but
also in too few GCPs being selected leading to reduced rectification accuracy. Furthermore, at the time
of writing, the Georeferencer plug-in does not provide access to the RPC correction model and is
limited to two-dimensional algorithms.
For automated detection algorithms a common problem is that the set of GCPs collected from the
reference image is often limited to only a small surface area of the image, due to high variance in these
areas. This might occur when the algorithm selects GCPs around the edges of clouds or lakes. Such an
uneven distribution of GCPs over the image may result in poorly corrected, distorted, or warped
images. Therefore, any proposed implementation would also need to compensate for this. The
AutoGCP library aims to automate the edge detection process, but also to ensure the even distribution
of the GCPs extracted from these edges. The algorithm used in AutoGCP for edge detection is the Haar
wavelet transform. The wavelet transform can be compared to the Fourier transform, with the difference
that a wavelet is a localised function. A further difference is that Fourier analysis starts from
sinusoids as basis functions, from which the resulting properties follow, whereas wavelet analysis
starts from the desired properties, which are then used to derive the resulting basis functions (Burrus et al,
1998). A discussion of the wavelet implementation follows.
The Haar wavelet transform expresses local image variation at different scales, by decomposing an
image into a multi-resolution representation of that image. Lower resolutions are obtained through a
higher level of decomposition. This property allows for the detection of salient points at any required
degree of localisation (Van den Dool, 2005). Park et al (2000) found in their study that wavelet
transformations in general are more efficient and more accurate than other functions such as the
Intensity Hue Saturation (IHS) transformation. They concluded that the Haar and Daubechies wavelets
result in almost the same quality, but that Daubechies is far more efficient. Since the Haar wavelet
transform is more commonly known and easier to implement than the Daubechies wavelet transform,
the former approach was followed for edge detection.
Van den Dool (2005) proposed using the Haar wavelet transform on-board Low Earth Orbit (LEO)
satellites to geo-reference images on the fly. However, no known open source implementation of the
wavelet for these applications exists. The commercial software application Geomatica (PCI Geomatics,
2011) collects GCPs in a rigid uniform grid without concern for the underlying terrain. In most
instances the results are acceptable, but when the terrain covered by the image contains many repetitive
areas, such as crop fields, or water, the method is not optimal. Figure 1(a) shows a set of 16 GCPs
detected by PCI Geomatica using the AUTOGCP module (not to be confused with the library and
plug-in introduced in this paper) as opposed to Figure 1(b), which illustrates the same number of GCPs
detected by us using the Haar wavelet transform. With the grid-based approach from Geomatica we
can clearly see that some GCPs may not be optimally detected on an edge, as shown in Figure 2(a),
compared to the Haar wavelet detection in Figure 2(b).

(a)
(b)
Figure 1. The difference in GCP selection between PCI Geomatica (a) and AutoGCP (b)

(a)
(b)
Figure 2. PCI Geomatica's grid-based GCP selection (a) versus AutoGCP's edge-based GCP
selection (b)
The implemented wavelet function takes as input the rows and columns of pixels from the source
image and outputs a lower-resolution version of the input image as well as the information required to
reconstruct the original image. For each group of four neighbouring data elements (which, at the first
level of decomposition, correspond to the original pixels) an average coefficient is calculated and
stored along with the detail coefficient (which is required to reconstruct the original image). The
calculated average coefficient becomes the input data element for the next pass of the transform, at its
relative position. The process is repeated a certain number of times, depending on the optimal pixel size
of the geographical features being detected (more on this to follow). An n-level transform decomposes
a region of $2^n \times 2^n$ pixels into a single final average coefficient and $2^{n-1} \times 2^{n-1}$ detail coefficients. Once a
region has been decomposed, the one of the four data elements used to calculate the average coefficient
that lies furthest from the mean is selected as the salient point at that level. This process is repeated
recursively back up to the original data pixels until a single most salient point is found for the region.
Although every decomposed region will deliver a point that is determined to be the most salient in
that region, this point might not be sufficiently prominent. This is especially the case in areas with
highly repetitive terrain, such as large bodies of water or crop fields. Because only good points are
useful to the implementation, their degree of saliency is calculated from the sum of the differences
from the mean at all levels of decomposition.
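The salient-point search described above can be sketched in a few lines of Python. This is a minimal illustration and not the AutoGCP C++ code: the function names `haar_averages` and `most_salient_point` are hypothetical, the region is assumed to be a square list of lists with a power-of-two side, and, in line with the memory refinement described in this section, only the average coefficients are kept while the detail coefficients are not stored.

```python
def haar_averages(region):
    """Return the pyramid of 2x2 block averages for a square region.

    pyramid[0] holds the original pixels and pyramid[-1] is a 1x1 grid
    holding the final average coefficient.
    """
    pyramid = [region]
    while len(pyramid[-1]) > 1:
        prev = pyramid[-1]
        half = len(prev) // 2
        pyramid.append([
            [(prev[2 * r][2 * c] + prev[2 * r][2 * c + 1]
              + prev[2 * r + 1][2 * c] + prev[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(half)]
            for r in range(half)
        ])
    return pyramid


def most_salient_point(region):
    """Trace the largest deviation from the block mean down to pixel level.

    Returns ((row, col), saliency), where saliency is the sum of the
    absolute deviations from the mean selected at every level.
    """
    size = len(region)
    if size < 2 or size & (size - 1) or any(len(row) != size for row in region):
        raise ValueError("region must be square with a power-of-two side >= 2")
    pyramid = haar_averages(region)
    row = col = 0
    saliency = 0.0
    # Walk from the coarsest 2x2 level back down to the original pixels.
    for level in range(len(pyramid) - 2, -1, -1):
        grid = pyramid[level]
        mean = pyramid[level + 1][row][col]
        children = [(2 * row + dr, 2 * col + dc) for dr in (0, 1) for dc in (0, 1)]
        # Select the element that lies furthest from the mean of its block.
        row, col = max(children, key=lambda rc: abs(grid[rc[0]][rc[1]] - mean))
        saliency += abs(grid[row][col] - mean)
    return (row, col), saliency
```

A region whose saliency falls below a chosen cut-off (for example a uniform patch of water, which scores zero) would simply not contribute a GCP.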
As mentioned above, the number of transformation levels is determined by the pixel size of the
geographical features to be detected. This size depends largely on the scale of the image.
Due to practical memory constraints and source images as large as one billion pixels (approximately
1GB per colour band), we made some obvious refinements to the proposed implementation. Most
notably, instead of compressing the entire image for every pass, minimal regions are compressed one
by one. This reduces the amount of in-memory data considerably. Reconstruction data was completely
discarded, as the small memory size of a region meant that the entire original data could be used for
variation comparisons instead.

2.2 Coordinate Transformation


Whether you treat the Earth as a sphere or as a spheroid, you must transform its three-dimensional
surface to create a flat map sheet. This mathematical transformation is commonly referred to as a map
projection (Kennedy, 1994). Map projections are used to convert geographic coordinates into an X
(longitude) and Y (latitude) coordinate system, which makes it easier to display an area as a twodimensional layer. The Geographic Coordinate System (GCS), sometimes also called the Geographic
Reference System (GRS), differs from image to image, due to human pre-processing or the image
capturing software of the satellite. People, communities, organisations and governmental institutions
have different requirements and preferences when it comes to the GCS they use for their spatial
imagery. Apart from this, the variety in satellite camera types creates inconsistency when processing
images. AutoGCP can handle images with a diversity of GCSs that are supported by the Geospatial
Data Abstraction Library (GDAL) and QGIS. The chips extracted around the detected GCPs on the
reference image, may have different coordinate references than the GCPs on the raw image. The
coordinates of the GCP chips are transformed using an identical GCS to create consistency between
these images and their coordinates. This conversion ensures that the GCPs can be cross-referenced.
2.3 Cross-correlation
Once GCPs have been identified on the reference image, they are to be matched with the possible
GCPs found on the raw image. Any two image chips can be compared using normalised
cross-correlation, as described by Hong and Zhang (2008), which assigns a correlation coefficient to a pair of
chips. The chip on the raw image that shows the highest correlation to a specific GCP chip on the
reference image is assumed to be a match, i.e. the same GCP has been found on the raw image. The
major problem that arises if an exhaustive search is done for each reference GCP through all the raw
GCPs is clearly that of computation. A worst-case search of O(n^2) chip comparisons is conducted, without even taking into
account the instructions required to compare each pair of GCP chips.
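For illustration, the correlation coefficient of a pair of chips can be computed as below. This is a minimal pure-Python sketch of standard zero-mean normalised cross-correlation, not the AutoGCP implementation; the function name and the list-of-rows chip representation are assumptions made for the example.

```python
import math


def normalised_cross_correlation(chip_a, chip_b):
    """Correlation coefficient in [-1, 1] between two equally sized chips."""
    a = [value for row in chip_a for value in row]
    b = [value for row in chip_b for value in row]
    if len(a) != len(b) or not a:
        raise ValueError("chips must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                            * sum((y - mean_b) ** 2 for y in b))
    # A completely flat chip has no texture to correlate against.
    return numerator / denominator if denominator > 0 else 0.0
```

Because both chips are reduced by their mean and divided by their spread, the coefficient is insensitive to uniform brightness and contrast differences between the reference and raw images.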
For every reference GCP one would instead prefer to search only through the likely matches. A third
degree RPC model is sufficient to correct the distortions caused by optical projection, earth curvature,
camera vibration and the like (Tao & Hu, 2001). The nature of the distortion is therefore such that the
search space for each GCP can be defined by directly projecting the GCP onto the raw image and then
choosing a sufficiently large area around the point. This principle can be implemented in the following
way.
The raw image is divided into a grid of which the size of the blocks is chosen so as to allow for the
kind of distortion likely to have occurred. Each block is assigned a list of GCPs identified within its
area and just outside its boundaries. Each reference GCP is then projected directly onto the raw image,
its corresponding grid block is identified and each GCP within the list is correlated with it. The highest
of these correlations is assumed to be the match, unless the correlation coefficient is beneath a chosen
threshold, in which case the GCP is rejected. The reference GCP is then updated to include information
regarding its pixel coordinates on the raw image.
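The grid-based search can be sketched as follows. The names `build_grid` and `match_gcp`, the `margin` parameter (how far outside a block a point may lie and still be listed in it) and the default threshold of 0.75 are assumptions chosen for this example; `score` stands for any chip-similarity function that returns a correlation coefficient.

```python
from collections import defaultdict


def build_grid(raw_points, block_size, margin):
    """Bucket raw-image candidate points (x, y) by grid block (row, col).

    A point is also listed in a neighbouring block when it lies within
    `margin` pixels of that block's boundary.
    """
    grid = defaultdict(list)
    for index, (x, y) in enumerate(raw_points):
        cols = range(int((x - margin) // block_size),
                     int((x + margin) // block_size) + 1)
        rows = range(int((y - margin) // block_size),
                     int((y + margin) // block_size) + 1)
        for row in rows:
            for col in cols:
                grid[(row, col)].append(index)
    return grid


def match_gcp(projected_xy, ref_chip, raw_chips, grid, block_size, score,
              threshold=0.75):
    """Return (index, coefficient) of the best raw candidate, or None."""
    x, y = projected_xy
    block = (int(y // block_size), int(x // block_size))
    best = None
    # Only the candidates listed for this block are correlated.
    for index in grid.get(block, []):
        coefficient = score(ref_chip, raw_chips[index])
        if best is None or coefficient > best[1]:
            best = (index, coefficient)
    if best is None or best[1] < threshold:
        return None  # no sufficiently similar candidate: reject the GCP
    return best
```

Each reference GCP is now compared only with the candidates in one block, so the total number of chip comparisons grows with the number of candidates per block rather than with the product of the two GCP counts.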


2.4 Rectification
Once all GCPs have information about their pixel position on the raw image, a Rational Polynomial
Coefficients (RPC) model can be constructed that captures and models the distortion information
indicated by the GCPs. The advantage of using an RPC model for the rectification of images as
compared to other algorithms and models such as the Thin Plate Spline (TPS), is that not only the
longitude (X) and latitude (Y) offsets are taken into account, but also the height variances (Z) between
different points in the image. The height values are most commonly acquired from a Digital Elevation
Model (DEM) and denote the heights of given points above sea level (and sometimes below sea level).
Lutes (2004) conducted tests showing that RPCs remain valid for regions with extreme heights, even
for elevations of up to 8000 metres. RPCs are therefore appropriate not only for areas with
average heights (0 m to 3000 m), but for all regions on the earth.
The process of using GCPs to construct an RPC model is described in detail in Tao & Hu (2001) and will
only be briefly summarised here. The model takes the form of two ratios of third degree
polynomials, each ratio representing a transformation from geographical coordinates to
pixel coordinates, one returning the pixel row and one the pixel column of the geographical point. Both
the geographical and pixel coordinates are offset and scaled in order to normalise the model.
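$$ r = \frac{p_1(X, Y, Z)}{p_2(X, Y, Z)} \qquad [1] $$

$$ c = \frac{p_3(X, Y, Z)}{p_4(X, Y, Z)} \qquad [2] $$

$$ M = \begin{bmatrix}
1 & X_1 & \cdots & Z_1^3 & -r_1 X_1 & \cdots & -r_1 Z_1^3 \\
1 & X_2 & \cdots & Z_2^3 & -r_2 X_2 & \cdots & -r_2 Z_2^3 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & X_n & \cdots & Z_n^3 & -r_n X_n & \cdots & -r_n Z_n^3
\end{bmatrix} \qquad [3] $$

$$ R = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix} \qquad [4] $$

$$ (M^T M)\, J = M^T R \qquad [5] $$

$$ J = \begin{bmatrix} a_1 & \cdots & a_{39} \end{bmatrix}^T \qquad [6] $$

$$ \frac{p_1}{p_2} = \frac{a_1 + a_2 X + a_3 Y + a_4 Z + \cdots + a_{18} X^3 + a_{19} Y^3 + a_{20} Z^3}{1 + a_{21} X + a_{22} Y + a_{23} Z + \cdots + a_{37} X^3 + a_{38} Y^3 + a_{39} Z^3} \qquad [7] $$

$$ \frac{p_3}{p_4} = \frac{b_1 + b_2 X + b_3 Y + b_4 Z + \cdots + b_{18} X^3 + b_{19} Y^3 + b_{20} Z^3}{1 + b_{21} X + b_{22} Y + b_{23} Z + \cdots + b_{37} X^3 + b_{38} Y^3 + b_{39} Z^3} \qquad [8] $$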

Equations [1] and [2] represent the model, with equations [7] and [8] showing the form of the
polynomials p1, p2, p3 and p4. Matrix M is constructed as shown, with Xi, Yi, and Zi representing the
geographical coordinates of the i-th GCP in the set and ri the pixel row coordinate of the same GCP. A
similar matrix N can be constructed by replacing each ri with the column coordinate ci of the GCP, as well
as a column matrix C similar to R. Solving equations [3] to [5] produces J [6], whose elements are the
coefficients of p1 and p2 as shown in [7]. Replacing M with N and R with C in [3] and [4] produces a
coefficient matrix K whose elements are the coefficients of p3 and p4 in [8].
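The least-squares construction in [3] to [7] can be illustrated with the following pure-Python sketch. It is not the AutoGCP code: all function names are hypothetical, the ordering of the polynomial terms is an assumption (the equations above elide the middle terms), and the normal equations are solved directly without the regularisation a production implementation would probably need. The `degree` parameter exists only so that the sketch can be tried on small problems; `degree=3` gives the 20 numerator and 19 denominator coefficients of [7].

```python
from itertools import product


def monomials(x, y, z, degree=3):
    """All terms X^i Y^j Z^k with i + j + k <= degree, constant term first.

    The ordering within each degree is an assumption of this sketch.
    """
    powers = sorted((p for p in product(range(degree + 1), repeat=3)
                     if sum(p) <= degree),
                    key=lambda p: (sum(p), [-v for v in p]))
    return [x ** i * y ** j * z ** k for i, j, k in powers]


def design_row(x, y, z, pixel, degree=3):
    """One row of M: the terms, followed by -pixel * term for each
    non-constant term (the denominator coefficients)."""
    terms = monomials(x, y, z, degree)
    return terms + [-pixel * t for t in terms[1:]]


def solve_normal_equations(rows, rhs):
    """Solve (M^T M) J = M^T R by Gaussian elimination with partial pivoting."""
    n = len(rows[0])
    # Augmented matrix [M^T M | M^T R].
    aug = [[sum(m[i] * m[j] for m in rows) for j in range(n)]
           + [sum(m[i] * r for m, r in zip(rows, rhs))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("GCPs do not constrain the model; add more points")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_rational(gcps, degree=3):
    """Fit one ratio of polynomials to GCPs given as (X, Y, Z, pixel).

    Coordinates are assumed to be normalised already. In this unregularised
    sketch a unique solution needs at least as many GCPs as coefficients.
    """
    gcps = list(gcps)
    rows = [design_row(x, y, z, p, degree) for x, y, z, p in gcps]
    return solve_normal_equations(rows, [p for _, _, _, p in gcps])


def evaluate_rational(coeffs, x, y, z, degree=3):
    """Evaluate numerator / denominator for the fitted coefficients."""
    terms = monomials(x, y, z, degree)
    k = len(terms)
    numerator = sum(a * t for a, t in zip(coeffs[:k], terms))
    denominator = 1.0 + sum(a * t for a, t in zip(coeffs[k:], terms[1:]))
    return numerator / denominator
```

Calling `fit_rational` once with the row coordinates and once with the column coordinates of the GCPs yields the two coefficient sets, after which `evaluate_rational` maps any normalised geographical coordinate to a normalised pixel row or column.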
With an RPC model that accurately converts between geographical coordinates and pixel coordinates
on the raw image, the image is ready to be orthorectified. A target geographic region is determined for
the output ortho-image from the raw image extents. Then, for each pixel in this target region, the
corresponding raw image pixel coordinates are calculated using the RPC functions. The correct data
value is subsequently sampled from the raw image, interpolated if necessary and written to the output
image. Interpolation is the construction of new data points that lie between existing data points. In this
case, the pixel value for a coordinate is calculated from the nearest four neighboring pixels, by first
performing linear interpolation in one direction (X) and then in the other (Y) direction.
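The interpolation step can be written compactly as shown below. This is a generic bilinear interpolation sketch rather than the AutoGCP routine; the function name and the representation of the image as a list of rows are assumptions made for the example.

```python
import math


def bilinear_sample(image, x, y):
    """Sample image (a list of rows) at fractional column x and row y."""
    height, width = len(image), len(image[0])
    if not (0 <= x <= width - 1 and 0 <= y <= height - 1):
        raise ValueError("coordinate lies outside the image")
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    # Clamp the second neighbour so that samples on the last row or
    # column stay inside the image.
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along X on the two bracketing rows ...
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    # ... then along Y between the two intermediate values.
    return top * (1 - fy) + bottom * fy
```

During rectification, `x` and `y` would be the raw image pixel coordinates returned by the RPC functions for one pixel of the output ortho-image.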
An important aspect of RPC model construction is the number of GCPs used as X, Y and Z inputs for
[1] and [2]. Too few (and sometimes too many) points can reduce the accuracy of the RPCs. Ahmad
(2001) concluded in his research that 8 and 10 are, respectively, the optimum numbers of GCPs for SPOT
Panchromatic and RadarSat Fine Mode datasets. The OTB orthorectification process on the other hand
requires a minimum of 16 GCPs for accurate estimations (depending on the accuracy of the GCPs); this
approach was adopted by AutoGCP.

3. The Implementation
As mentioned earlier, one of the base requirements of the project was that the software should be
developed as a plug-in extension to the open source QGIS application. QGIS provides an effective
means to download and install plug-ins at runtime by the user. To take advantage of this system, the
plug-in had to be developed using Python as its programming language. Furthermore, the QGIS
application is built on a powerful C++ library that can be used not only by its plug-ins, but also by
external third-party applications. This enables other open source developers to reuse the code and
possibly suggest improvements.
Based on this a C++ library was developed as the core of the project, to handle all the
computationally expensive tasks. The library was developed according to QGIS development
standards, with the aim of integrating it into the QGIS library as part of its analysis module. (At the
time of writing the library is still in the process of being accepted into the QGIS main build.) This will
allow other QGIS developers to build custom applications and plug-ins using the AutoGCP library.
A Python plug-in featuring an easy-to-use graphical user interface was written as the front end to the
AutoGCP library. The user interface was developed using the Qt cross-platform application and UI
framework (http://qt.nokia.com/products), the same technology used by QGIS itself.
Due to the characteristics of Python as a dynamic programming language, no pre-compilation is
required and the user can easily download and install the AutoGCP plug-in through either the QGIS
plug-in manager or the user-contributed online repository (http://pyqgis.org/repo/contributed).
In order for the Python front-end to interface with the required C++ library elements, a bridge has to
be established between the two languages. The technology used to generate these bindings is SIP
(http://www.riverbankcomputing.co.uk/software/sip/intro).
PostgreSQL and SQLite database connections are available and can be used to store GCP sets that
can be used at a later stage to orthorectify images without the need to redetect the GCPs. The databases
can also be used to share GCP data between users.
GDAL (Geospatial Data Abstraction Library) was used as the primary means for performing
efficient reading and writing of geospatial data in various different formats (http://www.gdal.org).
Besides the orthorectification process using an RPC model, the user also has the option of
georeferencing the image or rectifying it by using the Thin Plate Spline (TPS) algorithm provided by
GDAL.
Figure 3 shows the graphical user interface of AutoGCP under Ubuntu. Other similar software
packages often rely on command line input for extensive operations, which makes them difficult to use
and rules out an aesthetically pleasing presentation. GDAL, for example, is driven solely from the
command line and relies on text input. OTB provides a partial graphical user interface and wrapper
applications such as Monteverdi. These interfaces provide limited functionality, and more complicated
queries may even require users to write their own code to use OTB's functionality. Most commercial
packages such as Geomatica provide their entire functionality through their GUI and therefore have an
advantage over open source implementations, especially for novice users.

Figure 3. The AutoGCP plug-in under Ubuntu


Figure 4 is an enlargement showing the different attributes of the cross-correlated GCPs. The source
coordinates (X and Y) represent the position of the individual GCPs on the reference image, whereas
the destination coordinates are extracted from the raw image. Notice the slight difference between these
values, which is acceptable since the provided images differ (which is mostly the case). By applying the
rectification process, we try to reduce these differences as much as possible. The match quality shows
the correlation between the reference and raw GCP, based on the extracted chip.

Figure 4. The GCP coordinates and match quality

4. Conclusion
There are many implementations of the orthorectification process. Certain packages such as
Geomatica are well established and have many advantages over open source packages (for instance
server-side processing). They might however get trapped in their old traditional ways of doing things
without providing the users with a wider range of options. AutoGCP provides an alternative way
of detecting GCPs, which arguably leads to better results. Manual GCP selection, on which many
implementations rely, is labour intensive and may result in warped images as a result of misplaced
GCPs. AutoGCP provides a solution to all these problems in a user-friendly environment which is
available for a variety of platforms. The openness of the source code allows for continuous
improvements and extensions. By using academically proven algorithms and calculations, AutoGCP
provides a trustworthy implementation that can be easily verified.
Automating the orthorectification process is an important improvement for many GIS applications
which rely on the correction of distorted images. By reducing the manual input for this process and by
providing a user-friendly interface, AutoGCP can easily be used by users without prior advanced GIS
knowledge. The implementation as an open source product also creates a fresh alternative compared to
the costly commercial products. The library can be integrated into a satellite image processing software
package, which will allow images to be directly drawn from satellites, corrected, orthorectified and
made available to the end-user for further processing. This will eliminate much of the human
interaction for a level three satellite image product, increasing the throughput and reducing human
faults. Future work includes the extension of the GCP detection module. Currently a reference image of
the same area is needed to register the raw image. These reference images are not always available,
especially to users who only have a small number of images without access to satellite image
repositories. A solution to this problem is to use Google Maps as base reference layer for automatic
GCP detection. DEMs are also available online free of charge. When incorporating this functionality,
the user will only have to specify the input and output paths, reducing the complexity of using the
plug-in even further by introducing total automation of the orthorectification process.

5. Acknowledgements
The authors would like to thank Wolfgang Lück from the CSIR SAC for the theoretical support and
introduction to the required GIS background; as well as Tim Sutton from Linfiniti Consulting, one of
the main developers of QGIS, for the technical support related to the development of AutoGCP and the
integration of the underlying QGIS libraries.

References:
Ahmad, S 2001, Orthorectification of Stereo SPOT Panchromatic and RadarSat Fine Model Data, Malaysian
Journal of Remote Sensing & GIS, vol. 2, pp. 19-24.
Bjorgo, E, Pisano, F, Lyons, J, Heisig, H 2008, Satellite imagery in use, Forced Migration Review, no. 31, pp.
72-73.
Burrus, CS, Gopinath, RA, Guo, H 1998, Introduction to Wavelets and Wavelet Transforms: A Primer, Prentice
Hall, New Jersey.
Georeferencing raster data in QGIS using polynomials 2010, viewed 8 January 2011,
<http://gis-lab.info/qa/qgis-georef-new-eng.html>.
GDAL Geospatial Data Abstraction Library, viewed 4 January 2011, <http://www.gdal.org>.
Park, JH, Tateishi, R, Wikantika, K 2000, Multisensor Data Fusion Using Multiresolution Analysis (MRA),
International Archives of Photogrammetry and Remote Sensing, vol. 33, part B2, pp. 430-437.
PCI Geomatics 2011, Geomatica, viewed 25 January 2011, <http://www.pcigeomatics.com>.
Hong, G & Zhang, Y 2008, Wavelet-based image registration technique for high-resolution remote sensing
images, Computers & Geosciences, no. 34, pp. 1708-1720.
Kennedy, M 1994, Understanding Map Projections, ArcInfo 8, Esri Press, viewed 2 January 2011,
<http://www.duc.auburn.edu/academic/classes/fory/7470/lab08/understanding%20map%20projections.pdf>.
Lutes, J 2004, Accuracy Analysis of Rational Polynomial Coefficients for Ikonos Imagery, ASPRS Annual
Conference Proceedings, May 2004, Denver, Colorado.
QGIS 1.x User-Contributed Python Plugins, viewed 4 January 2011, <http://pyqgis.org/repo/contributed>.
Qt cross-platform application and UI framework 2011, viewed January 2011,
<http://qt.nokia.com/products>.
Stallmann, FC 2010, Foxhat Solutions, viewed 6 January 2011, <http://www.foxhat.org>.
Tao, C & Hu, Y 2001, A Comprehensive Study On The Rational Function Model For Photogrammetric
Processing, Photogrammetric Engineering and Remote Sensing, vol. 67, no. 12, pp. 1347-1357.
The Quantum GIS Project, OSGeo Project, viewed 7 January 2011, <http://www.qgis.org>.
Van den Dool, R 2005, Onboard Image Geo-referencing for LEO Satellites, MSc(Eng), University of
Stellenbosch.
