
This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 1

OS-SIFT: A Robust SIFT-Like Algorithm for High-Resolution Optical-to-SAR Image Registration in Suburban Areas

Yuming Xiang, Feng Wang, and Hongjian You

Abstract—Although the scale-invariant feature transform (SIFT) algorithm has been successfully applied to both optical image registration and synthetic aperture radar (SAR) image registration, SIFT-like algorithms have failed to register high-resolution (HR) optical and SAR images due to large geometric differences and intensity differences. In this paper, to perform optical-to-SAR (OS) image registration, we propose an advanced SIFT-like algorithm (OS-SIFT) that consists of three main modules: keypoint detection in two Harris scale spaces, orientation assignment and descriptor extraction, and keypoint matching. Considering the inherent properties of SAR and optical images, the multiscale ratio of exponentially weighted averages and multiscale Sobel operators are used to calculate consistent gradients for the SAR and optical images, on the basis of which two Harris scale spaces are constructed. Keypoints are detected by finding the local maxima in the scale space, followed by a localization refinement method based on the spatial relationship of the keypoints. Moreover, gradient location orientation histogram-like descriptors are extracted using multiple image patches to increase the distinctiveness. The experimental results on simulated images and several HR satellite images show that the proposed OS-SIFT algorithm gives a robust registration result for optical-to-SAR images and outperforms other state-of-the-art algorithms in terms of registration accuracy.

Index Terms—High resolution (HR), image registration, multisensor, optical, scale-invariant feature transform (SIFT), synthetic aperture radar (SAR).

Manuscript received November 13, 2017; revised December 19, 2017; accepted January 2, 2018. This work was supported by the National Natural Science Foundation of China under Grant 61331017. (Corresponding author: Yuming Xiang.)
Y. Xiang and H. You are with the Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China, and also with the University of Chinese Academy of Sciences, Beijing 100000, China (e-mail: z199208081010@163.com).
F. Wang is with the Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2018.2790483
0196-2892 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

WITH the rapid development of remote sensing sensors, a large number of high-resolution (HR) satellites have been put into use in both the optical and synthetic aperture radar (SAR) domains, such as TerraSAR-X, COSMO-SkyMed, Ikonos, Quickbird, and the Chinese GaoFen (GF) series. The significant increase of HR imagery has led to a large number of applications in diverse areas, such as disaster reduction, water conservancy, and ocean observation. Images obtained by active SAR sensors provide the capability to view through clouds and to perform analysis in all-day and all-weather conditions [1]. Images taken by passive optical sensors, however, have very different characteristics in comparison with SAR images. Consequently, it is important but difficult to jointly exploit both optical and SAR images.

As the fundamental task of these image applications, image registration is the process of aligning two or more images of the same scene captured at different times, from different viewpoints, or by different sensors [2]. Numerous studies have been devoted to the registration of remote sensing images, and these studies can be roughly divided into two categories: feature-based algorithms and area-based algorithms. Feature-based techniques depend on detecting and matching distinctive features from the images, whereas area-based techniques implement geometric transformation estimation by optimizing a similarity measure between the two images [3].

Area-based techniques first define a template in the sensed image and then search for the optimal correspondence in the reference image using different kinds of similarity measures, including mutual information [4], the normalized cross-correlation coefficient [5], and cross-cumulative residual entropy [6]. Area-based methods have been widely used in the registration of multimodal remote sensing images with intensity differences [7]. However, these methods may lead to local extrema and are associated with high computational loads [8]. Moreover, in order to solve the geometric distortion between multisensor and multimodal images, area-based techniques require georeferenced and orthorectified data provided by satellites. Although the latest satellites can provide these data products, there still exist some distortion differences between various data sets as a result of the lack of digital elevation model data and due to rectification errors. In addition, manual preregistration is time-consuming and subject to interpretation errors [9]. For these reasons, we still need a feature-based method to eliminate the obvious scale, rotation, and translation differences between these images.

Compared with area-based techniques, feature-based algorithms can successfully tackle scale differences and rotation differences by establishing reliable feature matches. These algorithms are mainly composed of three steps: feature detection, feature matching, and transformation estimation.


First, significant features, such as points [10], lines [11],
and regions [12], are detected in the two images. Second,
the corresponding features are extracted by computing the
similarity of various feature descriptors, such as the famous
scale-invariant feature transform (SIFT) descriptor [13], shape
context [14], and phase congruency [2]. Finally, the geometric
transformation model between two images can be established
using the reliable feature correspondences. Among the feature-
based methods, the SIFT-like algorithms are the most widely
used techniques owing to their efficient performance and
invariance to scale, rotation, and illumination changes [13].
Considering the complex nature of remote sensing images,
the traditional SIFT algorithm does not perform well [15]. As a
consequence of significant differences in intensity and geomet-
ric information, it may be difficult to find enough repeatable
keypoint correspondences for remote sensing image pairs [16].
A number of improvements have been made to the SIFT
algorithm. Some algorithms have improved the SIFT algorithm
by extracting features starting from the second octave [15],
skipping the dominant orientation assignment when the matching images do not have a rotation transformation [17], replacing the Gaussian filter with several anisotropic filters [18], [19], and designing a new gradient specifically dedicated to SAR images [9]. Some algorithms combine the SIFT algorithm with other techniques, such as image segmentation [20], mutual information [21], and novel matching methods [22].

In general, optical-to-SAR (OS) image registration is a challenging task because of three major problems: different geometric properties, different intensity characteristics, and speckle noise [11]. In HR imagery, building structures in urban areas become visible. The side-looking mechanism of SAR sensors causes urban areas to suffer a series of geometrical distortions, such as foreshortening, layover, and shadow, that do not exist in the corresponding optical image [3]. The intensity information differs widely as a result of different imaging techniques and conditions. Moreover, SAR images are strongly corrupted with multiplicative speckle noise, which inhibits the process of feature detection.

Recently, a few studies have been conducted on the registration of optical and SAR images. Suri and Reinartz [3] used mutual information to form a histogram-based method for automatic registration between TerraSAR-X and Ikonos images over urban areas. Gong et al. [21] proposed a coarse-to-fine scheme by combining the SIFT algorithm with mutual information. Merkle et al. [23] focused on geometric feature templates like roundabouts in both SAR and optical images and matched them by an intensity-based method. Fan et al. [17] obtained spatially consistent matches by exploring the spatial relationships between keypoints detected by an improved SIFT algorithm. Ye et al. [24], [25] proposed two new similarity metrics for optical-to-SAR registration: the dense local self-similarity (DLSS) based on shape properties and the histogram of orientated phase congruency (HOPC) based on structural similarity. Sui et al. [11] combined iterative line extraction and the Voronoi integrated spectral point matching method to register optical and SAR images. Salehpour and Behrad [26] proposed a hierarchical approach using the local and global geometrical relationship between binary robust invariant scalable keypoint features for optical-to-SAR coregistration.

Fig. 1. Optical-to-SAR image pair. (a) SAR image. (b) Optical image. (c) Keypoints in SAR image. (d) Keypoints in optical image.

Even though these approaches have achieved automatic and robust results on various optical and SAR images, there are limitations that cannot be ignored. First, all methods use the same feature detector for both optical and SAR images. However, highly repeatable features are very difficult to extract due to the substantial differences between optical and SAR images. Herein, we choose a pair of corresponding images to illustrate this problem, where Fig. 1(a) and (b) shows the SAR and optical images and Fig. 1(c) and (d) shows the keypoints detected by the Laplacian of Gaussian (LoG) method. It can be observed that the detected keypoints are almost unrepeatable. Second, some approaches have focused on special features, such as roundabout templates and line intersections. These approaches can be applied only on some particular scenes, and the feature detectors are strongly affected by speckle noise in SAR images. Moreover, the area-based techniques based on similarity metrics, such as MI, DLSS, and HOPC, can be computationally expensive when solving HR optical-to-SAR image registration, and they need a coarse registration stage to remove obvious scale, rotation, and translation differences between two images.

To the best of our knowledge, there does not exist a robust feature-based algorithm that can be widely used for coregistration between optical and SAR images. Aiming to solve the aforementioned limitations, we propose an automatic and robust SIFT-like algorithm (OS-SIFT). The main contributions of this paper are given as follows.
1) A feature-based algorithm is proposed, which can be widely used for optical-to-SAR image registration.
2) Specifically, the proposed OS-SIFT utilizes two different operators, multiscale ratio of exponentially weighted

averages (ROEWA) and multiscale Sobel, to calculate gradients for SAR and optical images, respectively. Instead of building a Gaussian scale space, we construct two Harris scale spaces for SAR and optical images. Considering the gradient reversal phenomenon, the gradient orientation is restricted to the interval [0°, 180°).
3) A localization refinement method is proposed to detect highly repeatable keypoints. Multiple image patches are aggregated to construct a gradient location orientation histogram (GLOH)-like descriptor.
Since the SIFT algorithm is so well known, we do not
describe this algorithm in detail. The rest of this paper is orga-
nized as follows. The methodology of the proposed algorithm
is described in Section II. The experimental results on the
simulated and real-world images are illustrated in Section III.
The performance and comparative analyses are discussed in
this section, followed by the conclusions in Section IV.
II. METHODOLOGY

In this section, we first introduce the consistent gradient computation based on the multiscale ROEWA and Sobel operators for SAR and optical images. Then, we utilize the gradient magnitude and orientation to detect keypoints and extract feature descriptors. As a SIFT-like algorithm, OS-SIFT also consists of three main modules: keypoint detection, orientation assignment and descriptor extraction, and keypoint matching.

A. Consistent Gradient Computation

As mentioned in Section I, the intensity information differs widely as a result of the different imaging techniques and conditions of optical and SAR images. If the same gradient operator is applied to both optical and SAR images, the significant intensity differences may result in different gradient orientations and gradient magnitudes. Therefore, the keypoint detector cannot extract repeatable features, as shown in Fig. 1. Moreover, the feature descriptors are not robust to these differences, because the dominant orientation of each keypoint is used in the process of descriptor extraction to ensure rotation invariance [16]. In the feature matching stage, both a small number of repeatable keypoints and dissimilar feature descriptors may lead to misregistrations. To obtain consistent gradient magnitudes and orientations for optical and SAR images, we introduce a new method to calculate multiscale gradients without building Gaussian image pyramids.

Considering the coherent imaging mechanism, SAR images are always corrupted with multiplicative speckle noise. The classical difference-based gradient operator is strongly affected by random high-frequency components caused by speckle, especially in homogeneous areas of high reflectivity, and a few false keypoints may be detected in these areas after thresholding. The gradient by ratio (GR) used in the SAR-SIFT algorithm [9] is specifically dedicated to SAR images. Benefiting from the constant false alarm rate property of the ROEWA detector, gradients derived by the GR method are robust to speckle noise. The ROEWA operator is an advanced ratio-based detector obtained by computing the ratio of exponentially weighted local means [27]. This operator can be regarded as a 2-D separable filter composed of two orthogonal 1-D filters: the infinite symmetric exponential filter (ISEF) as its vertical filter and another ISEF as its parallel filter.

Fig. 2. (a) Horizontal processing window. (b) Vertical processing window. (c) Parallel ISEF filter. (d) Vertical ISEF filter.

The ROEWA operators oriented in the horizontal and vertical directions are given as follows:

R_{h,\alpha_i} = \frac{\sum_{m=-M/2}^{M/2} \sum_{n=1}^{N/2} I(x+m,\, y+n)\, e^{-\frac{|m|+|n|}{\alpha_i}}}{\sum_{m=-M/2}^{M/2} \sum_{n=-N/2}^{-1} I(x+m,\, y+n)\, e^{-\frac{|m|+|n|}{\alpha_i}}}   (1)

R_{v,\alpha_i} = \frac{\sum_{m=1}^{M/2} \sum_{n=-N/2}^{N/2} I(x+m,\, y+n)\, e^{-\frac{|m|+|n|}{\alpha_i}}}{\sum_{m=-M/2}^{-1} \sum_{n=-N/2}^{N/2} I(x+m,\, y+n)\, e^{-\frac{|m|+|n|}{\alpha_i}}}   (2)

where M and N denote the size of the processing window, which relates to the scale parameter α_i; (x, y) stands for the location of the central point; and I represents the pixel intensity. Fig. 2(a) and (b) shows the corresponding horizontal and vertical windows, and the vertical and parallel ISEF filters are presented in Fig. 2(c) and (d). Then, the horizontal and vertical gradients can be computed by the horizontal and vertical ROEWA operators, which are given as follows:

G_{h,\alpha_i} = \log(R_{h,\alpha_i}), \quad G_{v,\alpha_i} = \log(R_{v,\alpha_i}).   (3)

The gradient magnitude and orientation are derived as follows:

G_{m,\alpha_i} = \sqrt{(G_{h,\alpha_i})^2 + (G_{v,\alpha_i})^2}, \quad G_{o,\alpha_i} = \arctan\left(\frac{G_{v,\alpha_i}}{G_{h,\alpha_i}}\right).   (4)

In the ROEWA operator, α_i, which relates to the scale parameter, controls the size of the processing window. Consequently, the multiscale ROEWA can be obtained by setting several values for α_i. Assuming that we have a list {α_1, ..., α_n} with the relationship α_{i+1}/α_i = k between two consecutive scale values, multiscale gradient magnitudes and orientations are derived. Here, k is constant for i ∈ {1, ..., n − 1}.

For the optical images, we introduce the multiscale Sobel operators to calculate gradients. The traditional Sobel operator is one of the best edge detection operators. Generally,

Fig. 3. One-dimensional ISEF filter and mean filter.

the Sobel operator calculates the horizontal gradient and vertical gradient by convolving the image intensity with two templates [28]. The two templates are given as follows:

f_H = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad f_V = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}.   (5)

Each template can be regarded as two rectangular subwindows convolved with a weighted neighborhood average. The gradients are strongly affected by the size of the rectangular processing window. Consequently, the multiscale Sobel operators can be obtained by setting different values for the template size, which are given as follows:

F_{h,\beta_j} = G_{\beta_j} \times H_{\beta_j}, \quad F_{v,\beta_j} = G_{\beta_j} \times V_{\beta_j}   (6)

where F_{h,β_j} and F_{v,β_j} denote the horizontal and vertical gradients, respectively, and G_{β_j} stands for a Gaussian kernel with standard deviation β_j, which is used as the weighted average value. H_{β_j} and V_{β_j} are the horizontal and vertical rectangular windows with a size of β_j, and × denotes the matrix multiplication operator. Then, the gradient magnitude and orientation can be derived as follows:

F_{m,\beta_j} = \sqrt{(F_{h,\beta_j})^2 + (F_{v,\beta_j})^2}, \quad F_{o,\beta_j} = \arctan\left(\frac{F_{v,\beta_j}}{F_{h,\beta_j}}\right).   (7)

Similar to the multiscale ROEWA operator, a list {β_1, ..., β_n} is also assigned for the multiscale Sobel operator. In order to obtain consistent gradients for optical and SAR images, the relationship between the scale parameters α_i and β_j is studied. We typically choose the vertical filter to address this issue. The vertical filter of the ROEWA operator at the first scale α_1 is a 1-D ISEF filter denoted as f_{ISEF,α_1}, and the vertical filter of the Sobel operator at the first scale β_1 is a 1-D mean filter (a rectangular window can be regarded as two orthogonal 1-D mean filters) denoted as f_{MF,β_1}. Both f_{ISEF,α_1} and f_{MF,β_1} are presented in Fig. 3, and in order to obtain consistent gradients, the effective support region of the two filters must be the same [29]. Then, the relationship can be computed as follows:

\int_{-\infty}^{+\infty} e^{-\frac{|x|}{\alpha_1}}\, dx = \int_{-\infty}^{+\infty} f_{MF,\beta_1}\, dx, \quad f_{MF,\beta_1} = \begin{cases} 1, & |x| \le \beta_1 \\ 0, & \text{else.} \end{cases}   (8)

Fig. 4. (a) Simulated SAR image. (b) Simulated optical image. (c) Gradient magnitude on SAR image by ROEWA. (d) Gradient magnitude on optical image by Sobel. (e) Gradient orientation on SAR image by ROEWA. (f) Gradient orientation on optical image by Sobel.

Consequently, the scale parameters α and β meet the following requirements:

\alpha_{i+1}/\alpha_i = k, \quad \beta_{i+1}/\beta_i = k, \quad \alpha_1 = \beta_1   (9)

where k is a constant. Moreover, the gradient magnitudes G_{m,α} and F_{m,β} are normalized to further reduce the differences.

Fig. 4 illustrates the gradient magnitudes and orientations on two simulated images. The first image is a rectangle corrupted with speckle noise and is marked as the simulated SAR image, and the second image is the same rectangle corrupted with Gaussian noise and is labeled as the simulated optical image. The intensity value inside the rectangle is 150, whereas the intensity value outside the rectangle is 50. The speckle noise is a three-look multiplicative noise with μ = 1 and σ = 2, and the Gaussian noise is an additive noise with μ = 50 and σ = 50. For the multi-ROEWA and multi-Sobel operators, we choose the results of the first scale, which are shown in Fig. 4(c)–(f). Since the two operators are specifically designed for SAR and optical images, both show robustness to noise, especially for the homogeneous areas of high reflectivity in the simulated SAR image. We can see from these figures that the gradient magnitudes and orientations of the two simulated images are very consistent, meaning that the gradient computation has successfully reduced the differences between the simulated SAR and optical images, which can be further utilized in keypoint detection and descriptor extraction.

B. Proposed OS-SIFT Algorithm

The flowchart of the proposed algorithm is presented in Fig. 5. The three main modules of the proposed OS-SIFT algorithm are described in turn.
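Before turning to the modules themselves, the consistent gradient computation of Section II-A can be sketched in Python with NumPy. This is only a sketch under stated assumptions: the window half-size 3α, the Gaussian-weighted signed half-windows for the multiscale Sobel of (6), and the demo noise parameters are illustrative choices, not values fixed by the paper.

```python
import numpy as np

def _correlate(img, kern):
    """Brute-force 'same' correlation with zero padding (fine for small images)."""
    kh, kw = kern.shape
    ph, pw = kh // 2, kw // 2
    pad = np.pad(img.astype(float), ((ph, ph), (pw, pw)))
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(pad[i:i + kh, j:j + kw] * kern)
    return out

def roewa_gradients(img, alpha):
    """Single-scale ROEWA gradients after (1)-(4): log-ratio of exponentially
    weighted means on opposite sides of each pixel. The half-size 3*alpha is
    an assumed effective support."""
    h = int(3 * alpha)
    m = np.arange(-h, h + 1)
    w2 = np.exp(-(np.abs(m)[:, None] + np.abs(m)[None, :]) / alpha)
    eps = 1e-8
    right = w2 * (m[None, :] > 0)    # n > 0 half-window
    left = w2 * (m[None, :] < 0)     # n < 0 half-window
    below = w2 * (m[:, None] > 0)    # m > 0 half-window
    above = w2 * (m[:, None] < 0)    # m < 0 half-window
    g_h = np.log((_correlate(img, right) + eps) / (_correlate(img, left) + eps))
    g_v = np.log((_correlate(img, below) + eps) / (_correlate(img, above) + eps))
    return g_h, g_v

def sobel_gradients(img, beta):
    """Multiscale Sobel after (6), read as Gaussian-weighted signed half-windows
    (a plausible interpretation, not the paper's exact construction)."""
    h = int(3 * beta)
    m = np.arange(-h, h + 1)
    g = np.exp(-(m[:, None] ** 2 + m[None, :] ** 2) / (2 * beta ** 2))
    f_h = g * np.sign(m[None, :])    # +1 on the right half, -1 on the left
    f_v = g * np.sign(m[:, None])
    return _correlate(img, f_h), _correlate(img, f_v)

if __name__ == "__main__":
    # Rough analog of the Fig. 4 setup: 150 inside the rectangle, 50 outside.
    rng = np.random.default_rng(0)
    base = np.full((48, 48), 50.0)
    base[12:36, 12:36] = 150.0
    sar = base * rng.gamma(3.0, 1.0 / 3.0, base.shape)   # multiplicative, mean 1
    opt = base + rng.normal(50.0, 50.0, base.shape)      # additive Gaussian
    g_h, g_v = roewa_gradients(sar, alpha=2.0)
    f_h, f_v = sobel_gradients(opt, beta=2.0)
```

On a noiseless rectangle, both operators respond strongly on the edges and stay near zero in the homogeneous interior, which is the consistency property the section relies on.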

Fig. 5. Flowchart of the proposed OS-SIFT algorithm.

1) Keypoint Detection: In the conventional SIFT algorithm, a Gaussian image pyramid is first constructed by convolving the image with Gaussian filters at different scales, and then, a series of difference of Gaussian (DoG) images, as an approximation of the LoG, is obtained by subtracting adjacent Gaussian images. Keypoints are extracted by finding local maxima in the DoG scale space in three dimensions (x, y, α), followed by a process of subpixel localization and unstable keypoint elimination by a criterion based on the Hessian matrix [13]. However, the LoG approach cannot detect reliable keypoints in SAR images, because this approach, based on second derivatives, is strongly corrupted by multiplicative speckle noise [9].

Compared with keypoints detected by the DoG method, we find that corner points are more stable and repeatable in both optical and SAR images. Consequently, we focus on corner point detection in the proposed OS-SIFT algorithm. Instead of constructing the DoG scale space, we build two Harris scale spaces based on the aforementioned gradient computation. The multiscale Harris function [30] can be derived by replacing the first derivatives with the multiscale gradient computations

M_S(\alpha_i) = G_{\sqrt{2}\,\alpha_i} * \begin{bmatrix} (G_{h,\alpha_i})^2 & (G_{h,\alpha_i}) \cdot (G_{v,\alpha_i}) \\ (G_{v,\alpha_i}) \cdot (G_{h,\alpha_i}) & (G_{v,\alpha_i})^2 \end{bmatrix}, \quad R_S(\alpha_i) = \det(M_S(\alpha_i)) - d \cdot \operatorname{tr}(M_S(\alpha_i))^2   (10)

M_O(\beta_j) = G_{\sqrt{2}\,\beta_j} * \begin{bmatrix} (F_{h,\beta_j})^2 & (F_{h,\beta_j}) \cdot (F_{v,\beta_j}) \\ (F_{v,\beta_j}) \cdot (F_{h,\beta_j}) & (F_{v,\beta_j})^2 \end{bmatrix}, \quad R_O(\beta_j) = \det(M_O(\beta_j)) - d \cdot \operatorname{tr}(M_O(\beta_j))^2   (11)

where G_{h,α_i} and G_{v,α_i} are the horizontal and vertical gradients at scale α_i for SAR images, F_{h,β_j} and F_{v,β_j} are the horizontal and vertical gradients at scale β_j for optical images, d is an arbitrary parameter, G stands for a Gaussian kernel, * is the convolution operator, det denotes the matrix determinant, and tr signifies the matrix trace. Consequently, the SAR-Harris scale space R_S and the optical-Harris scale space

R_O are constructed by setting two lists of scales {α_1, ..., α_n} and {β_1, ..., β_n}. The relationship between the scale parameters is given in (9).

By finding local maxima in the Harris scale space, candidate keypoints are then selected at each level, followed by nonmaxima suppression and thresholding. Fig. 6(a) and (b) shows the keypoints on the two simulated images using the multi-Harris scale spaces. We also present the detection results of the LoG approach in Fig. 6(c) and (d). In this experiment, eight scales are used for the multiscale Harris method. We can see from the results that the SAR-Harris and optical-Harris methods both successfully find 32 keypoints located at the four corners of the rectangle and that there are no false detections. For the LoG approach, a few false detections appear in homogeneous areas with high reflectivity in the simulated SAR image, which is in accordance with the previous analysis. There also exist false alarms on the boundary of the rectangle in the simulated optical image.

Fig. 6. Keypoint detection on two simulated images. (a) SAR-Harris on SAR image. (b) Optical-Harris on optical image. (c) LoG on SAR image. (d) LoG on optical image. (e) Enlarged corner on SAR image. (f) Enlarged corner on optical image. (g) Enlarged corner on SAR image after localization refinement. (h) Enlarged corner on optical image after localization refinement.

Despite the good detection results for the multiscale Harris method, we find that the locations of corresponding keypoints in the two simulated images are not very precise, which can be seen from the enlarged images in Fig. 6(e) and (f). Deviations exist between the corresponding keypoints due to the effect of strong noise. Hence, a keypoint localization refinement needs to be done. The accurate keypoint localization [13] used in the LoG approach is based on the Hessian matrix, which is no longer suitable in our proposed algorithm. Herein, we propose a localization refinement method based on the spatial relationship of keypoints. Keypoints of different scales that belong to the same corner should have similar structural properties, which means that they lie on a straight line. The method is described as follows.

1) Due to the different imaging mechanisms, isolated bright pixels in SAR images cannot be detected as keypoints in optical images, so these pixels in SAR images are first abandoned by comparing the intensity with that in the eight-pixel neighborhood. Then, keypoints that belong to the same corner are selected by the restriction of location and dominant orientation

\| p(x_1, y_1, \alpha_1) - p(x_i, y_i, \alpha_i) \|_2 \le 2i, \quad \alpha_i = \alpha_1 k^{i-1}   (12)

|\theta(x_1, y_1, \alpha_1) - \theta(x_i, y_i, \alpha_i)| \le \frac{\pi}{36}, \quad i = 2, \ldots, N   (13)

where ‖·‖_2 denotes the Euclidean distance, N is the number of scales, k is a constant that controls the ratio between adjacent scales, and θ is the dominant orientation of each keypoint. An orientation histogram is computed based on gradient orientations on a scale-dependent neighborhood, and the dominant orientation corresponds to the maximum bin in the histogram.

2) We assume that the mathematical model of a straight line is y = ax + b. By minimizing the sum of the offsets from each keypoint to the line, the model parameters can be computed using the least-squared-error criterion [31]. The offset of each keypoint is measured by the sum of squared errors as follows:

T(a, b) = \sum_{i=1}^{N} (a x_i + b - y_i)^2   (14)

where (x_i, y_i) is the coordinate of the keypoint at scale α_i. The extremum is obtained when the two partial derivatives fulfill ∂T/∂a = 0 and ∂T/∂b = 0. Then, the model parameters a and b can be derived as

a = \frac{N \sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2}, \quad b = \left( \sum_{i=1}^{N} y_i - a \sum_{i=1}^{N} x_i \right) \Big/ N.   (15)

3) Finally, the estimation error defined as e_r = ax + b − y is calculated for each keypoint. If the error is larger than a given threshold, the keypoint location is refined using the line model.

Fig. 7. Keypoint detection on two satellite images. (a) SAR image. (b) Optical image. (c) SAR-Harris on SAR image. (d) Optical-Harris on optical image. (e) LoG on SAR image. (f) LoG on optical image.

Figs. 6(e)–(h) show the results of the keypoint refinement method on the previous simulated images. We can see from the comparative performance that the deviated keypoints have been rectified. Moreover, an experiment is conducted on real-world images to evaluate the detection performance. It can be seen from the results in Fig. 7 that the multi-Harris method can detect many repeatable keypoints, whereas the keypoints extracted by the LoG approach cannot be matched.

2) Orientation Assignment and Descriptor Extraction: In SIFT-like algorithms, dominant orientations are assigned to keypoints to maintain rotation invariance. Both the dominant orientation assignment and the descriptor extraction are based on the gradient orientation histogram on a scale-dependent neighborhood. Consequently, the gradient orientations between optical and SAR images need to be consistent. However, it is very common in multisensor images that the gradients of corresponding parts of the images change their directions by exactly 180°, which is called the gradient reversal [32]. This can be one of the main reasons that SIFT-like methods fail at registering multimodal images [33]. In order to make the orientation invariant to this reversal, the orientation is restricted to the interval [0°, 180°).

Fig. 8. Scheme of the log-polar sectors.

Instead of using a square neighborhood and 4 × 4 square sectors as in the original SIFT descriptor, a GLOH-like [34] circular neighborhood with a radius of 12α and log-polar sectors (17 location bins) is utilized to create a feature descriptor, as shown in Fig. 8. Since we have considered the gradient reversal, the gradient orientations are quantized in eight bins, resulting in a descriptor length of 136.

To make the descriptor more distinctive, we utilize multiple image patches to construct the descriptor. The larger the neighborhood used for building a descriptor, the more structural information the descriptor will contain, and the higher the possibility of yielding a stable registration result [35]. In the proposed OS-SIFT, three GLOH-like circular neighborhoods with sizes of {8α, 12α, 16α} are used for descriptor construction.

3) Keypoint Matching: The conventional SIFT-like algorithm selects correspondences by a matching strategy based on the distance between descriptors. A few matching strategies have been studied, such as the nearest neighbor distance ratio (NNDR) method [13], dual matching [18], spatially consistent matching [17], enhanced feature matching [16], and sparse representation [10]. The most commonly used strategy is the NNDR method, which consists of two parts, NN and DR. In the NN step, the nearest Euclidean distance between descriptors is chosen. Then, a threshold is applied on the ratio of the closest distance to the second closest distance to filter unstable matches. Moreover, outliers are removed using the fast sample consensus (FSC) algorithm [36]. The FSC algorithm can get more correct matches than RANSAC in fewer iterations. The sensed image is rectified using the affine transformation model estimated by the correctly matched correspondences.

III. EXPERIMENTAL RESULTS AND DISCUSSION

In this section, we first evaluate the performance of keypoint detection on the simulated images with different noise conditions. Then, several satellite SAR and optical images are utilized to test the proposed OS-SIFT algorithm. The registration performances are evaluated in two ways. The first is a visual check of the checkerboard mosaic image and enlarged subimages. The second includes two quantitative criteria, the root mean square error (RMSE) and the correct matching rate (CMR). All the experiments were conducted with the MATLAB R2014a software on a computer with an Intel Core 3.2-GHz processor and 8.0 GB of physical memory.

A. Parameter Settings and Data Sets

For the SAR-Harris method, the first scale is set as α_1 = 2, the constant between two adjacent scales is k = 2^{1/2.5}, the number of scales is set to 8, and the arbitrary parameter d is set to 0.04. Scale parameters of the Optical-Harris method follow the relationship in (9), and the other parameters are the same as those in the SAR-Harris method. The thresholds that are used in keypoint detection have a significant influence
Fig. 9. Simulated images. (a) HR optical image. (b) Simulated optical image. (c) Simulated SAR image.

TABLE I
IMAGE PAIRS AND THEIR CHARACTERISTICS

on the performance. These thresholds are empirically set to 0.4 and 0.6 for the simulated SAR and optical images and to 1 and 5 for the satellite SAR and optical images. For different data sets, the thresholds need to be fine-tuned to obtain a reasonable number of keypoints.

For the descriptor extraction and keypoint matching, the gradient orientations are quantized into eight bins, and three GLOH-like circular neighborhoods with sizes of {8α, 12α, 16α} are aggregated to construct the descriptor. The ratio threshold used in the NNDR method is set to 0.9.

In the keypoint detection experiment, an HR optical image is utilized to generate two simulated images by adding speckle noise and Gaussian noise. Fig. 9 illustrates the HR image, the simulated SAR image (three-look), and the simulated optical image. In order to assess the robustness of the detection method to noise, simulated images with different noise levels are also tested.

In the registration performance experiments, four pairs of real-world images are used, and details of the test images are listed in Table I. The first, second, and third image pairs all consist of a TerraSAR image and a Google Earth image. The sampling resolution of these images is 1 m/pixel. The TerraSAR images are set as the reference images and are shown in Figs. 11(a), 12(a), and 13(a), and the Google Earth images are set as the sensed images and are shown in Figs. 11(b), 12(b), and 13(b). The first image pair describes an airport in Tucson, AZ, USA. It can be observed that there exist a slight rotation difference and a translation difference. The second image pair describes an industrial area in Tucson, AZ, USA. It is obvious that there exist a scale difference and a slight rotation difference. The third image pair displays an urban area with roads and low buildings in Tucson, AZ, USA. The fourth image set is a pair of GF images describing the Forbidden City of Beijing, China. The GF-2 satellite is an optical satellite with the highest resolution available in China and was launched in August 2014, and the GF-3 satellite is a multipolarized C-band SAR satellite that was launched in August 2016. These satellites are part of the GF series, which serves as the China HR Earth Observation System [37]. The sampling resolutions of the two images are 2 and 1.6 m/pixel. We set the GF-3 image as the reference image, which is shown in Fig. 14(a), and the GF-2 image as the sensed image, which is shown in Fig. 14(b). The scale difference between them is 1.25 times, and we can observe that there exist a rotation difference (about 30°) and a translation difference.

B. Experiments on Keypoint Detection

In this section, the keypoint repeatability rate is used to evaluate the proposed method. Given a pair of registered images, two keypoints are considered as repeatable points only if their coordinates satisfy

||p_i(x, y) − q_j(x, y)||_2 < d_r    (16)

where p_i is the ith keypoint in the reference image, q_j is the jth keypoint in the sensed image, and d_r is the distance threshold. The repeatability rate is then defined as

γ_r = N_cor / N_total    (17)

where N_cor is the number of repeatable keypoints and N_total is the number of total keypoints.

In the detection experiments, two simulated images are used and are shown in Fig. 9. For the simulated SAR image, multiplicative noise with different numbers of looks is simulated, ranging from one-look to nine-look imagery. Correspondingly, additive Gaussian noise with different variances is also simulated. The distance threshold d_r is set to 2. The proposed method (multi-Harris with refinement) is compared with the multi-Harris detection method and the LoG approach. Curves of the repeatability rate γ_r against the noise level are drawn in Fig. 10. It can be observed that the proposed detection method gives the best performance, and it is robust to noise. Even when the noise is very strong, such as the case with the largest noise level (corresponding to the one-look imagery for multiplicative noise and the maximum variance for additive noise), it still reaches a repeatability rate of 50%. For the smallest noise level, the differences between the two simulated images caused by noise are small, and hence, the three methods show similar performances.

Fig. 10. Percentage of repeatability rate versus different noise levels, where a small noise level refers to a high number of looks in the synthetic SAR images and a small variance in the synthetic optical images.

C. Experiments on Matching Performance

In this section, the matching performance of the proposed algorithm is presented. For each image pair, the reference and sensed images are first presented. Large geometric differences and intensity differences can be observed between the two images. Then, the correctly matched keypoint correspondences are demonstrated in Figs. 11–13 and 14(c). The transformation model is estimated from the correctly matched keypoints between the images, and then the sensed image can be rectified using this model. The fusion results of the two images are illustrated in checkerboard mosaic images, which can be seen in Figs. 11–13 and 14(d). Several subregions in the checkerboard mosaic images are enlarged to make straightforward illustrations. In order to enhance the visualization, we decreased the contrast and increased the opacity of the dark parts. It can be observed from the mosaic images and enlarged subimages that the proposed OS-SIFT gives a good matching performance.
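The per-pair workflow just described (fit a transformation model to the matched keypoints, rectify the sensed image, and fuse the pair into a checkerboard mosaic for visual inspection) can be sketched as follows. This is an illustrative NumPy outline, not the authors' code; the least-squares affine model and the function names are assumptions, since the paper does not fix the transformation type at this point.

```python
import numpy as np

def estimate_affine(ref_pts, sen_pts):
    """Least-squares affine model mapping sensed coordinates to reference
    coordinates from matched keypoints (arrays of shape (N, 2), N >= 3)."""
    sen = np.asarray(sen_pts, dtype=float)
    ref = np.asarray(ref_pts, dtype=float)
    A = np.hstack([sen, np.ones((len(sen), 1))])      # (N, 3) design matrix
    params, *_ = np.linalg.lstsq(A, ref, rcond=None)  # (3, 2) affine parameters
    return params

def checkerboard_mosaic(img_a, img_b, tile=64):
    """Fuse two co-registered, same-size single-band images into a
    checkerboard mosaic: alternating tiles come from each image, so any
    residual misalignment shows up as broken edges at the tile borders."""
    ys, xs = np.indices(img_a.shape[:2])
    mask = ((ys // tile + xs // tile) % 2).astype(bool)
    return np.where(mask, img_b, img_a)
```

In practice the affine fit would be wrapped in a consensus step (the paper uses FSC) so that the model is estimated only from inlier correspondences.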

D. Comparison With Other Registration Algorithms

In this section, the matching performance of the OS-SIFT algorithm is compared with that of two other methods. Since the OS-SIFT algorithm is a feature-based technique, we choose two state-of-the-art feature-based methods for comparison. The first method, proposed by Fan et al. [17], is denoted as SIFT-M, i.e., modified SIFT. This method applies three modifications to SIFT in the feature extraction stage: extracting features from the second octave, skipping the dominant orientation assignment, and using multiple support regions to construct the descriptor. The second method, proposed by Ma et al. [16] [position scale orientation SIFT (PSO-SIFT)], replaces the traditional gradient computation with a new method that is robust to complex nonlinear intensity transformations. Both SIFT-M and PSO-SIFT show good performances for multisensor remote sensing image registration. The parameter settings of the comparative methods follow their authors' instructions, and we fine-tuned the detection thresholds to ensure that the comparative methods extract similar numbers of keypoints. All methods use the same keypoint matching methods (NNDR and FSC).

Fig. 11. Registration result of the first image pair. (a) Reference image. (b) Sensed image. (c) Corresponding keypoints. (d) Fusion result. (e)–(g) Enlarged subimages.

We evaluate the registration performance in three ways. The first is a visual check of the checkerboard mosaic image and the enlarged subimages. The second comprises the quantitative criteria, RMSE and CMR. The RMSE can
be computed as

ξ_rmse = sqrt( (1/N_corr) · Σ_{i=1}^{N_corr} ||H(x_1^i, y_1^i) − (x_2^i, y_2^i)||_2^2 )    (18)

where H represents the ground-truth transformation model between the reference image and the sensed image, ||·||_2 is the Euclidean distance, and (x_1^i, y_1^i) and (x_2^i, y_2^i) are the coordinates of the ith corresponding pair. The model H has been estimated by manually selecting 20 pairs of corresponding control points for each image pair. The CMR is defined as

γ_corr = N_corr / N_orig    (19)

Fig. 12. Registration result of the second image pair. (a) Reference image. (b) Sensed image. (c) Corresponding keypoints. (d) Fusion result. (e) and (f) Enlarged subimages.

Fig. 13. Registration result of the third image pair. (a) Reference image. (b) Sensed image. (c) Corresponding keypoints. (d) Fusion result. (e)–(g) Enlarged subimages.
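As a concrete sketch, the two criteria in (18) and (19) can be written in a few lines of NumPy. This is an illustrative reimplementation with hypothetical function names, not the authors' evaluation code; H is taken here as a 3 × 3 homogeneous transform, and the symbols follow the definitions in the surrounding text.

```python
import numpy as np

def rmse(H, pts_ref, pts_sen):
    """Eq. (18): root mean square error between the ground-truth-mapped
    reference keypoints and their matched sensed keypoints."""
    ref_h = np.hstack([np.asarray(pts_ref, float),
                       np.ones((len(pts_ref), 1))])  # homogeneous coordinates
    mapped = ref_h @ np.asarray(H, float).T
    mapped = mapped[:, :2] / mapped[:, 2:3]          # back to Cartesian
    err2 = np.sum((mapped - np.asarray(pts_sen, float)) ** 2, axis=1)
    return float(np.sqrt(np.mean(err2)))

def cmr(n_corr, n_orig):
    """Eq. (19): share of the NNDR-selected matches that are correct."""
    return n_corr / n_orig
```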
where N_corr is the number of correctly matched keypoints after eliminating false matches, and N_orig is the number of keypoints selected by the NNDR method. The third is the computational cost; the running times for the four experiments are presented.

Fig. 14. Registration result of the fourth image pair. (a) Reference image. (b) Sensed image. (c) Corresponding keypoints. (d) Fusion result. (e)–(g) Enlarged subimages.

Comparisons of the RMSE, CMR, and running time for the four image pairs are presented in Table II. Generally, a large CMR indicates that there exist more correctly registered keypoints, resulting in a more accurate transformation model. A small RMSE indicates that the accuracy of the pixel location is high. We can see from Table II that the proposed OS-SIFT gives the best matching performance among the three comparative algorithms, on both the RMSE and the CMR.

For the first image pair, SIFT-M fails to correctly register the two images due to the nonlinear intensity difference. PSO-SIFT utilizes a new gradient computation that is robust to the nonlinear intensity transform, and thus it can successfully register the two images. The OS-SIFT algorithm considers the inherent properties of SAR and optical images, and its consistent gradient computation is also robust to the nonlinear intensity difference. However, the PSO-SIFT algorithm detects more keypoint correspondences than the proposed OS-SIFT with the NNDR method. The number of correct correspondences in OS-SIFT is higher than that in PSO-SIFT, resulting in a higher CMR. Moreover, the RMSE of OS-SIFT is smaller than that of PSO-SIFT. For the second image pair, similar conclusions can be drawn. Both the first and second image pairs describe suburban areas with low altitudes, and the geometrical distortions between the reference and sensed images are small, so the proposed OS-SIFT gives a good registration performance with a subpixel RMSE.

The third image pair describes an urban area with some low-height buildings. In HR remote sensing images, building structures become visible. However, as a consequence of the different imaging mechanisms of SAR and optical sensors, the appearances of buildings are different, which increases the geometrical distortions between the reference and sensed images. Moreover, the third image set is a pair of images describing a large area. It is challenging and time-consuming to match very large remote sensing images. Consequently, only the proposed OS-SIFT has successfully matched the two images, and it runs for more than half an hour with a relatively low registration accuracy. The fourth image pair shows the Forbidden City in China, which is an urban region with medium-height buildings. Affected by the speckle noise and geometric distortions, PSO-SIFT fails to correctly register the two images. We can see from the reference and sensed images that the intensity difference is linear and relatively small. Hence, the SIFT-M algorithm can successfully match the two images. Although the proposed OS-SIFT algorithm gives a better performance than SIFT-M, the RMSEs of both algorithms are above 2 pixels, and the CMRs are also relatively lower than those for the first two image pairs. Meanwhile, we can see from Figs. 13(e)–(g) and 14(e)–(g) that there exists a slight deviation, whereas the enlarged subimages of the first and second image pairs are precisely registered.

As mentioned in Section I, the side-looking mechanism of SAR sensors makes building areas suffer from geometrical distortions such as foreshortening, layover, and shadow, which do not exist in the corresponding optical image. These geometrical distortions may lead to deviations in keypoint location, resulting in a relatively high RMSE. Some related studies have considered this problem. Esch et al. [38] analyzed the speckle statistics of different land cover types and extracted urban areas by means of an unsupervised analysis of scatterplots and standardized histograms of the local coefficient of variation. Byun et al. [39] proposed an area-based multisensor
TABLE II
COMPARISON OF RMSE, CMR, AND RUNNING TIME FOR TEST IMAGES
image fusion method. The authors designed different fusion rules for homogeneous regions and heterogeneous regions to improve the fusion quality. Han and Byun [40] removed densely extracted keypoints in urban regions to improve the registration reliability between very HR optical and SAR images. However, numerous keypoints can be detected in urban regions owing to their rich structural properties, and abandoning these keypoints may result in a lack of enough correspondences. Consequently, registration in urban areas between SAR and optical images is still a problem to be solved for feature-based techniques. For our proposed OS-SIFT, we will further investigate an invariant feature description for building structures in urban areas, which represents one of our future works.

The running times of the four image pairs using the three comparative algorithms are also presented in Table II. The computational complexity of SIFT-like algorithms highly depends on the size of the descriptor and the number of keypoints. Since SIFT-M and OS-SIFT both use multiple support regions to construct the descriptor, the sizes of their descriptors are larger than that in PSO-SIFT. For the number of keypoints, we have fine-tuned the detection thresholds to ensure that the three algorithms all extract similar numbers of keypoints. Consequently, the shortest running time is given by PSO-SIFT, followed by the proposed OS-SIFT. Moreover, the computational cost becomes very large for registering large remote sensing images, such as the third image pair shown in Fig. 13. All the experiments were conducted in MATLAB, and the computational efficiency can be further improved by implementing the proposed algorithm in C/C++.

Our proposed OS-SIFT algorithm yields the smallest RMSE and largest CMR for all four image pairs primarily for three reasons, which are given as follows.
1) Benefiting from the consistent gradient computation and the keypoint refinement method, the proposed OS-SIFT can find more repeatable keypoints between SAR and optical images than the LoG approach.
2) In order to be invariant to the gradient reversal situation, we restrict the gradient orientation to the interval [0°, 180°), so that large intensity differences between multisensor images can be reduced.
3) Multiple image patches are utilized to increase the distinctiveness of the descriptor.

IV. CONCLUSION AND FUTURE WORK

In this paper, an automatic and robust SIFT-like algorithm, denoted as OS-SIFT, is proposed to solve the registration of optical-to-SAR images. The proposed OS-SIFT algorithm utilizes two different operators to calculate consistent gradients for the SAR and optical images. Then, two Harris scale spaces are constructed for the two images. Repeatable keypoints are detected by finding the local maxima in the two Harris scale spaces, followed by a proposed localization refinement method. Orientation restriction and multiple image patches are utilized to construct the GLOH-like descriptors. Finally, corresponding keypoints are selected by the NNDR method and the FSC algorithm. The experimental results on simulated images and several HR satellite images have demonstrated that the proposed OS-SIFT gives a good performance in keypoint detection and image registration. Compared with two state-of-the-art optical-to-SAR image registration algorithms, the registration accuracy of the OS-SIFT approach is higher while requiring a comparable running time.

However, there still exist some limitations of the proposed algorithm that need to be addressed in our future work. First, the registration precision can be further increased by combining the proposed OS-SIFT algorithm with an area-based method such as mutual information, resulting in a two-stage registration scheme. Second, we discovered that the registration performance on HR optical-to-SAR images in urban areas is worse than that in suburban and rural areas. Building areas in HR SAR images are strongly affected by geometrical distortions due to the side-looking mechanism. Nevertheless, there exist numerous keypoints in building areas owing to the abundant structural information. We will focus on finding an intermediate feature description or exploring invariant features for building areas in future works.

REFERENCES

[1] R. Hänsch, O. Hellwich, and X. Tu, "Machine-learning based detection of corresponding interest points in optical and SAR images," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), Jul. 2016, pp. 1492–1495.
[2] A. Wong and D. A. Clausi, "ARRSI: Automatic registration of remote-sensing images," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 5, pp. 1483–1493, May 2007.
[3] S. Suri and P. Reinartz, "Mutual-information-based registration of TerraSAR-X and Ikonos imagery in urban areas," IEEE Trans. Geosci. Remote Sens., vol. 48, no. 2, pp. 939–949, Feb. 2010.
[4] M. Mellor and M. Brady, "Phase mutual information as a similarity measure for registration," Med. Image Anal., vol. 9, no. 4, pp. 330–343, 2005.
[5] J. Lewis, "Fast normalized cross-correlation," Vis. Interface, vol. 10, no. 1, pp. 120–123, 1995.
[6] F. Wang and B. C. Vemuri, "Non-rigid multi-modal image registration using cross-cumulative residual entropy," Int. J. Comput. Vis., vol. 74, no. 2, pp. 201–215, 2007.
[7] Y. Ye and J. Shan, "A local descriptor based registration method for multispectral remote sensing images with non-linear intensity differences," ISPRS J. Photogram. Remote Sens., vol. 90, pp. 83–95, Apr. 2014.
[8] Y. Keller and A. Averbuch, "Multisensor image registration via implicit similarity," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 794–801, May 2006.
[9] F. Dellinger, J. Delon, Y. Gousseau, J. Michel, and F. Tupin, "SAR-SIFT: A SIFT-like algorithm for SAR images," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 1, pp. 453–466, Jan. 2015.
[10] J. Fan, Y. Wu, F. Wang, P. Zhang, and M. Li, "New point matching algorithm using sparse representation of image patch feature for SAR image registration," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 3, pp. 1498–1510, Mar. 2017.
[11] H. Sui, C. Xu, J. Liu, and F. Hua, "Automatic optical-to-SAR image registration by iterative line extraction and Voronoi integrated spectral point matching," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 11, pp. 6058–6072, Nov. 2015.
[12] B. Xiong, Z. He, C. Hu, Q. Chen, Y. Jiang, and G. Kuang, "A method of acquiring tie points based on closed regions in SAR images," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), Jul. 2012, pp. 2121–2124.
[13] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004.
[14] L. Huang and Z. Li, "Feature-based image registration using the shape context," Int. J. Remote Sens., vol. 31, no. 8, pp. 2169–2177, 2010.
[15] P. Schwind, S. Suri, P. Reinartz, and A. Siebert, "Applicability of the SIFT operator to geometric SAR image registration," Int. J. Remote Sens., vol. 31, no. 8, pp. 1959–1980, 2010.
[16] W. Ma et al., "Remote sensing image registration with modified SIFT and enhanced feature matching," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 1, pp. 3–7, Jan. 2017.
[17] B. Fan, C. Huo, C. Pan, and Q. Kong, "Registration of optical and SAR satellite images by exploring the spatial relationship of the improved SIFT," IEEE Geosci. Remote Sens. Lett., vol. 10, no. 4, pp. 657–661, Jul. 2013.
[18] S. Wang, H. You, and K. Fu, "BFSIFT: A novel method to find feature matches for SAR image registration," IEEE Geosci. Remote Sens. Lett., vol. 9, no. 4, pp. 649–653, Jul. 2012.
[19] F. Wang, H. You, and X. Fu, "Adapted anisotropic Gaussian SIFT matching strategy for SAR registration," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 1, pp. 160–164, Jan. 2015.
[20] H. Goncalves, L. Corte-Real, and J. A. Goncalves, "Automatic image registration through image segmentation and SIFT," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 7, pp. 2589–2600, Jul. 2011.
[21] M. Gong, S. Zhao, L. Jiao, D. Tian, and S. Wang, "A novel coarse-to-fine scheme for automatic image registration based on SIFT and mutual information," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 7, pp. 4328–4338, Jul. 2014.
[22] H. Zhu, W. Ma, B. Hou, and L. Jiao, "SAR image registration based on multifeature detection and arborescence network matching," IEEE Geosci. Remote Sens. Lett., vol. 13, no. 5, pp. 706–710, May 2016.
[23] N. Merkle, R. Müller, and P. Reinartz, "Registration of optical and SAR satellite images based on geometric feature templates," Int. Archives Photogram., Remote Sens. Spatial Inf. Sci., vol. 40, no. 1, pp. 447–452, 2015.
[24] Y. Ye, L. Shen, M. Hao, J. Wang, and Z. Xu, "Robust optical-to-SAR image matching based on shape properties," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 4, pp. 564–568, Apr. 2017.
[25] Y. Ye, J. Shan, L. Bruzzone, and L. Shen, "Robust registration of multimodal remote sensing images based on structural similarity," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 5, pp. 2941–2958, May 2017.
[26] M. Salehpour and A. Behrad, "Hierarchical approach for synthetic aperture radar and optical image coregistration using local and global geometric relationship of invariant features," J. Appl. Remote Sens., vol. 11, no. 1, p. 015002, 2017.
[27] R. Fjortoft, A. Lopes, P. Marthon, and E. Cubero-Castan, "An optimal multiedge detector for SAR image segmentation," IEEE Trans. Geosci. Remote Sens., vol. 36, no. 3, pp. 793–802, May 1998.
[28] Q.-J. Yang and X.-M. Xiao, "A new registration method base on improved Sobel and SIFT algorithms," in Proc. 3rd Int. Conf. Comput. Elect. Eng. (ICCEE), 2012, pp. 1–6.
[29] Q.-R. Wei and D.-Z. Feng, "An efficient SAR edge detector with a lower false positive rate," Int. J. Remote Sens., vol. 36, no. 14, pp. 3773–3797, 2015.
[30] C. Harris and M. Stephens, "A combined corner and edge detector," in Proc. Alvey Vis. Conf., 1988, vol. 15, no. 50, pp. 147–152.
[31] E. J. Hannan and M. Deistler, The Statistical Theory of Linear Systems. Philadelphia, PA, USA: SIAM, 2012.
[32] M. T. Hossain, G. Lv, S. W. Teng, G. Lu, and M. Lackmann, "Improved symmetric-SIFT for multi-modal image registration," in Proc. Int. Conf. Digit. Image Comput. Techn. Appl. (DICTA), Dec. 2011, pp. 197–202.
[33] A. Kelman, M. Sofka, and C. V. Stewart, "Keypoint descriptors for matching across multiple image modalities and non-linear intensity variations," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2007, pp. 1–7.
[34] K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 10, pp. 1615–1630, Oct. 2005.
[35] J. Fan, Y. Wu, M. Li, W. Liang, and Q. Zhang, "SAR image registration using multiscale image patch features with sparse representation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 4, pp. 1483–1493, Apr. 2016.
[36] Y. Wu, W. Ma, M. Gong, L. Su, and L. Jiao, "A novel point-matching algorithm based on fast sample consensus for image registration," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 1, pp. 43–47, Jan. 2015.
[37] X. Tong, W. Zhao, J. Xing, and W. Fu, "Status and development of China High-Resolution Earth Observation System and application," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), Jul. 2016, pp. 3738–3741.
[38] T. Esch, A. Schenk, T. Ullmann, M. Thiel, A. Roth, and S. Dech, "Characterization of land cover types in TerraSAR-X images by combined analysis of speckle statistics and intensity information," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 6, pp. 1911–1925, Jun. 2011.
[39] Y. Byun, J. Choi, and Y. Han, "An area-based image fusion scheme for the integration of SAR and optical satellite imagery," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 5, pp. 2212–2220, Oct. 2013.
[40] Y. Han and Y. Byun, "Automatic and accurate registration of VHR optical and SAR images using a quadtree structure," Int. J. Remote Sens., vol. 36, no. 9, pp. 2277–2295, 2015.

Yuming Xiang received the B.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 2013. He is currently pursuing the Ph.D. degree with the Institute of Electronics, Chinese Academy of Sciences, Beijing. His research interests include multisensor remote-sensing image registration and feature detection in remote-sensing images.

Feng Wang received the B.S. degree in photoelectric information engineering from the Beijing University of Aeronautics and Astronautics, Beijing, China, in 2010, and the Ph.D. degree in signal and information processing from the University of Chinese Academy of Sciences, Beijing, in 2015. Since 2015, he has been an Assistant Researcher with the Institute of Electronics, Chinese Academy of Sciences, Beijing. His research interests include multisource remote sensing image processing, image registration, and change detection.

Hongjian You received the B.S. degree in engineering from Wuhan University, Wuhan, China, in 1992, the M.S. degree from Tsinghua University, Beijing, China, in 1995, and the Ph.D. degree from the University of Chinese Academy of Sciences, Beijing, in 2001. He is currently a Professor with the Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing. His research interests include remote sensing image processing and analysis and SAR image applications.
