
Proceedings of the IEEE ITSC 2006 TB4.

2006 IEEE Intelligent Transportation Systems Conference
Toronto, Canada, September 17-20, 2006

Color Model-Based Real-Time Learning for Road Following

Ceryen Tan, Tsai Hong, Tommy Chang, and Michael Shneier
National Institute of Standards and Technology
Gaithersburg, MD 20899

Abstract - Road following is a skill vital to the development and deployment of autonomous vehicles. Over the past few decades, a large number of road following computer vision systems have been developed. All of these systems have limitations in their capabilities, arising from assumptions of idealized conditions. The systems show dependency on highly structured roads, road homogeneity, simplified road shapes, and idealized lighting conditions. In the real world, the systems are only effective in specialized cases.

This paper proposes a vision system that is capable of dealing with many of these limitations, accurately segmenting unstructured, nonhomogeneous roads of arbitrary shape under various lighting conditions. The system uses color classification and learning to construct and use multiple road and background models. Color models are constructed on a frame-by-frame basis and used to segment each color image into road and background by estimating the probability that a pixel belongs to a particular model. The models are constructed and learned independently of road shape, allowing the segmentation of arbitrary road shapes. Temporal fusion is used to stabilize the results. Preliminary testing demonstrates the system's effectiveness on roads not handled by previous systems.

Index Terms - Road following, learning, color models, temporal fusion, segmentation.

I. INTRODUCTION

Road following is a widely studied field of computer vision that has clear importance in the development of sensor systems for autonomous vehicles. In this paper, we make a distinction between road detection, in which the goal is to locate roads anywhere in an image, and road following, in which we assume the vehicle is on the road, must stay on the road as it travels, and must detect the road in front of it. This paper deals only with road following.

Since the early 1980s, a large number of computer vision systems have been developed to address road following. Powerful algorithms now exist that segment roads in video images, often in real time. DeSouza and Kak [9] and Dickmanns [7] provide excellent surveys. These systems are, however, demonstrably limited in their capabilities. Limitations arise from various assumptions of idealized conditions, which reduce the effectiveness of the systems in real-world conditions.

An examination of previous approaches highlights a number of difficult road conditions that road following algorithms must handle. Existing approaches make assumptions about these conditions in order to simplify the task of road detection. The approach we describe here attempts to deal with most of these conditions, including:

- Unstructured roads
- Unclear road edges
- Nonhomogeneous road appearance
- Arbitrary road shape
- Poor, inconsistent lighting conditions (e.g., shadows)
- Paved and unpaved roads

One class of road following algorithms focuses on the detection of highway lane markings (Dickmanns [8], Thorpe [22], Schneiderman [21], Rotaru et al. [20], Beucher [4], and Kosecks [15]). These algorithms are fast and well suited to the task of highway driving. They do, however, make assumptions about the structure of the road, and do not work on roads that do not exhibit this structure, such as unmarked paved roads and dirt roads. In addition, faded or obscured markings and poor, inconsistent lighting conditions such as shadows may lead to problems. The algorithms must make additional assumptions about the condition of the road and the ambient illumination.

Another approach is to detect road boundaries through the use of gradient-based edge techniques (Hong et al. [11], Broggi and Berte [3], Rotaru et al. [20]). These algorithms assume that road edges are clear and fairly sharp. Road boundaries can, however, be obscured by dirt and leaves. In addition, for specific roads, such as dirt roads, it may be difficult to locate edges at all. Lighting conditions also play a major role in these approaches. Shadows can lead to false edges that are often stronger than the actual boundaries of the road, while poor lighting can eliminate road edges in the image. Thus, these algorithms also work only under ideal road and lighting conditions.

Other systems attempt to capitalize on the often homogeneous appearance of roads. However, the assumption of homogeneous appearance often fails, for example in the presence of shadows. Color and texture are common features used in road extraction. A well-known computer vision system known as SCARF (Supervised Classification Applied to Road Following) (Crisman and Thorpe [5]) has proven its

1-4244-0094-5/06/$20.00 © 2006 IEEE


effectiveness in areas where other algorithms fail. The algorithm, based upon color segmentation using road/background Gaussian mixture color modeling, is capable of detecting nonhomogeneous, unstructured roads with poor edges under heavy shadow conditions. Close examination of this result, however, shows that the algorithm makes assumptions about the shape of the road. It is first assumed that the width of the road is known. It is also assumed that the road is linear. For the most part, the system can get away with the assumption of linearity through high frame rates. However, the authors note that less accurate results arise on sharply curving roads. Difficulties may also arise when road widths change. The system displays its inability to segment arbitrary road shapes through the fact that intersections must be explicitly modeled in order to be detected.

The UNSCARF (UNSupervised Classification Applied to Road Following) system (Crisman and Thorpe [6]) is another algorithm that works through road shape recognition from groups of similarly colored pixels. This algorithm, like SCARF, works on nonhomogeneous, unstructured roads with poor edges under heavy shadow conditions. The system is once again capable of recognizing only linear roads of known width. Additional road shapes must be explicitly modeled to be recognized, at the cost of increased processing time.

Various neural network approaches also show potential in solving the road following problem. ALVINN (Pomerleau [19]) is a neural-network-based algorithm that takes a video image as input and outputs a steering direction. Trained properly, the system can handle nonhomogeneous roads in various lighting conditions. However, the approach only works on straight roads. A neural network-based system (Fdish and Takeuchi [10]) makes use of a dynamically trained neural network to distinguish between areas of road and nonroad. This approach is capable of dealing with nonhomogeneous road appearance if the nonhomogeneity is accurately represented in the training data. In order to generate training data, assumptions about the shape and location of the road are made: six predefined regions of road and nonroad are used to generate the training data. As noted in the paper, positioning these six predefined regions is difficult for arbitrary road shapes.

As can be seen from this summary, existing approaches make simplifying assumptions about road conditions in order to achieve the goal of road following. This paper proposes an algorithm for the real-time segmentation of road from color video images that attempts to overcome rather than simplify the difficult conditions seen on the road. The algorithm is similar to SCARF, which appears to be the most successful of all existing approaches. Like SCARF, the algorithm is capable of segmenting nonhomogeneous, unstructured roads under a wide range of lighting conditions, but has the additional capability to segment arbitrary road shapes.

II. ALGORITHM DESCRIPTION

The idea behind the algorithm is the segmentation of road from background through the use of color models. Data are collected from a standard video camera mounted on an all-terrain vehicle. In each video frame, color models of the road and background are constructed through a scheme that makes only one assumption about the shape and location of the road: that the region directly in front of the vehicle is road. The color models are used to calculate the probability that each pixel in a frame is a member of the road class. Temporal fusion of these road probabilities helps to stabilize the models, resulting in a probability map that can be thresholded to determine areas of road and nonroad (Fig. 1).

Fig. 1. The system architecture after initialization. (Diagram: the road region is histogrammed and the road color model updated; the background is histogrammed using the temporal fusion result and the background color model updated; road probabilities are calculated; the temporal fusion result is updated.)

A. Building the Color Models

The algorithm needs a method to accurately represent the color distributions seen in road and background. Previous approaches to color modeling have often made use of Gaussian mixture models. However, this approach assumes Gaussian color distributions, and our experiments have shown this assumption to be false. A superior method is to make use of color histograms. In addition, many road detection systems have simply made use of the RGB color space. However, previous research (He et al. [11], Kristensen [14], Lin and Chen [17]) has shown that other color spaces may offer advantages in terms of robustness against changes in illumination, which should prove useful in the real-world environment.

We found that 30x30 histograms of normalized red (R) and green (G) gave the best results in color modeling of road and background. Normalized R and G are fairly robust to changes in illumination, while at the same time being fast to calculate. Normalized R and G are calculated as R / (R + G + B) and G / (R + G + B). Note that dividing by R + G + B eliminates the need for normalized B: since normalized R + G + B = 1, once normalized R and G are known, normalized B is determined. Further experimentation may be carried out to determine whether histograms of different sizes give better results, and whether other color spaces offer additional benefit.
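As a rough illustration of the normalized-RG representation described above, such a 30x30 histogram can be built with NumPy. This is a sketch under stated assumptions: the paper gives only the color transform and bin count, so the function name, data layout, and zero-pixel handling here are ours.

```python
import numpy as np

def normalized_rg_histogram(image, bins=30):
    """Build a 2D histogram of normalized red and green.

    `image` is an H x W x 3 array with channels (R, G, B); only the
    relative channel proportions matter. The 30x30 binning follows the
    paper; everything else is an illustrative assumption.
    """
    rgb = image.astype(np.float64)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0            # avoid division by zero on black pixels
    r = rgb[..., 0] / total            # normalized red, in [0, 1]
    g = rgb[..., 1] / total            # normalized green, in [0, 1]
    # normalized blue is redundant: r + g + b = 1
    hist, _, _ = np.histogram2d(r.ravel(), g.ravel(),
                                bins=bins, range=[[0, 1], [0, 1]])
    return hist
```

A region of interest (e.g., the area assumed to be road) would simply be cropped from the frame before being passed in.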

Fig. 2. (a) The red box area is used to construct the initial road model. (b) The histogram of colors in the box.

Fig. 3. Nonhomogeneous road.

Separate models are created for road and background. Both models make use of color distributions, represented as 2D histograms of normalized R and G (Fig. 2b). The major difference between the models is that for the road model, we assume that the road is constructed from multiple color distributions (currently, four), whereas the background is represented with a single model. Our approach is to make one assumption, that the area in front of the vehicle is road (Fig. 2a). This is a common assumption in road following algorithms.

Modeling the road as multiple color distributions helps to increase robustness. It allows the algorithm to adapt to nonhomogeneous roads, shadows, illumination changes, and any other condition that causes spatial change in appearance (Fig. 3). Having multiple color distributions allows the algorithm to learn and remember previously seen road conditions.

There are several reasons why the background is not similarly modeled by multiple color distributions. The first is relative ease of segmentation. It is a much simpler task to segment the road into multiple color distributions than the background. This is due to the fact that in many situations, color distributions on the road are encountered one at a time rather than all at once. For example, while driving along a road that contains shadows in places, often the vehicle is either in shadow or it is not, allowing it to learn a single distribution at a time. The background, however, has many different color distributions all appearing at the same time, leading to difficulties in segmentation. Whereas the road can usually be represented by a small number of color distributions, the background can be arbitrarily complicated, and since we are not interested in segmenting the background, there is no need to try to model it except in how it differs from the road.

A color distribution model is implemented as a set of histograms created over time. A color distribution model starts off with a single histogram. New histograms may be added to update the color distribution model until a maximum number has been reached. At this point, in order to update the color distribution model, the oldest histogram is removed to make room for the new. This is a form of temporal integration carried out on color distributions, done in order to increase robustness, stability, and accuracy. While multiple histograms are constructed for the road, a single background histogram is constructed from frame to frame. This is because we are only interested in the road, so there is no need to identify multiple background regions. The background color distribution model is a summation of a number of previous background histograms. If the temporal fusion contains mistakes, using a summation of previous results reduces their impact. The number of histograms in each color model controls how quickly the algorithm adapts to new data.

As new data are processed, each color distribution model is updated with new histograms, changing with time to fit changing conditions. For example, the road model may contain a color distribution representing shadows. As time progresses, shadows may slowly get darker. The color distribution model will be updated to reflect this change, allowing the overall model to see darker shadows as road.

The way in which the models are constructed and updated is a direct result of our assumption that the area in front of the vehicle is guaranteed to be road. The scheme was designed so that no other assumptions are needed to segment the road from the background. There are two consequences of this assumption. First, the road model must accurately learn and model the road only from the region of the image assumed to be road. Second, the background model must be constructed without any prior knowledge of where background appears in the image, and must be accurate enough to prevent the road color model from classifying too much of the image as road. These requirements lead to different methods for updating the road and background classes.

The first task is to learn and model the road from just the region of the image assumed to be road. In the first frame, the region assumed to contain road is simply histogrammed and used to create an initial color distribution model. Subsequent frames are processed as follows.
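The histogram-queue scheme described above (a bounded, time-ordered set of histograms whose oldest member is dropped once a maximum is reached, and which behaves as the summation of its histograms) can be sketched as follows. The class name, the default capacity of 8, and the NumPy representation are our assumptions; the paper does not specify the maximum number of histograms.

```python
from collections import deque

import numpy as np

class ColorDistributionModel:
    """A color distribution model as a bounded, time-ordered set of
    2D color histograms (illustrative sketch of the scheme in the text)."""

    def __init__(self, first_histogram, max_histograms=8):
        # deque with maxlen silently discards the oldest entry on append,
        # which is exactly the "remove the oldest to make room" behavior.
        self.histograms = deque([first_histogram], maxlen=max_histograms)

    def update(self, histogram):
        """Add a new histogram; the oldest is dropped once at capacity."""
        self.histograms.append(histogram)

    def summed(self):
        """The model acts as the summation of its histograms, a simple
        form of temporal integration over color distributions."""
        return np.sum(np.stack(list(self.histograms)), axis=0)
```

A larger `max_histograms` makes the model adapt more slowly to new data, matching the paper's remark that the number of histograms controls the adaptation rate.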

Assuming that the road model already contains a number of existing color distribution models, the algorithm must decide when to create a new model. At each frame, the region assumed to be road is histogrammed. The algorithm then has two choices. It can either update an existing color distribution model or create a new distribution. If the new histogram is too different from all existing distributions (determined by a threshold), the algorithm enters learning mode. Otherwise, the histogram is used to update the model that is most similar.

In learning mode, the algorithm attempts to locate a histogram that is most different from the color distributions that already exist. The same region in successive images is histogrammed and compared with existing distributions. While the difference continues to increase, learning mode is extended. Once the difference stops increasing, the histogram with the largest difference is used to create a new color distribution model.

The purpose of learning mode is to locate color distributions that are as different from each other as possible. The color distributions may potentially overlap with each other. This may happen, for example, when the region assumed to be road contains colors from two different color distributions. Learning mode attempts to eliminate this possibility by selecting the most different histograms as candidates for new color distributions.

The histogram comparison function is geared towards speed rather than accuracy. Both histograms are convolved with a 3x3 Gaussian mask and normalized. The function then calculates the sum of the squared differences between the bins of the two histograms.

Construction of the background model is simpler. The background is assumed to be where the road is not. To construct the background model, the algorithm looks at the previous segmentation result to determine areas of road and nonroad. The algorithm then randomly samples the nonroad areas to construct a color histogram, which is used to update the background color distribution model.

Note that both road and background color models are constructed from the original image without subsampling.

B. Road Probability Calculation

Given the road and background models, the image must be segmented into road and background regions. This is done by computing, for each pixel, the probability that it belongs to each model. The pixel is then assigned to the most probable model. The algorithm visits every pixel in the image and calculates a road probability based upon its color. The end result is a probability map that represents the likelihood that an area is road. Given a specific pixel color, a road color model, and the background color model, the road probability is calculated as:

    P_road = N_road / (N_road + N_background)

Here P_road is the road probability, N_road is the number of observations in the road histogram distribution (the number of elements in the bin in which the pixel falls), and N_background is the number of observations in the corresponding bin in the background distribution.

As there are multiple road models, multiple road probabilities are calculated at each pixel. The largest road probability is selected as the road probability for that pixel. The justification is that a color may fit into one road model better than all of the other models. For example, a dark pixel may fit into a model of shaded road better than into a model of unshaded road.

In order to reduce processing requirements, probabilities are calculated on a reduced version of the original image. The original image is resized to 128x96 through an averaging filter. This step has the additional benefit of noise reduction. Experiments show that this step does not significantly impact the final segmentation. A noteworthy aspect of this algorithm is that the color models are constructed from the original image for better accuracy, whereas probabilities are calculated on a reduced version of the image for greater speed.

C. Temporal Fusion

This step takes road probabilities from multiple frames and fuses them. The end result is a final probability map that is more consistent than individual probability maps constructed from single frames. The temporal fusion algorithm computes a running average with a parameter to adjust for the influence of new data:

    P_t = (w * P_(t-1) + P) / (w + 1)
    w = w + 1  if w < w_max

where P is the current probability, P_t is the temporal fusion probability, and w is a weight.

The ability to adjust the influence of new data is an important aspect of the temporal fusion algorithm. For example, while the vehicle is turning, it is likely that the probability maps from frame to frame will have very little overlap. In such a case, it is helpful to set the influence of new data very high.

To determine areas of road and background from the final probability map, the probability map is thresholded. However, this may result in multiple regions with a high probability of being road. To select the proper region, 20 random seeds are selected within the area in front of the vehicle that is assumed to be road. Regions of high road probability connected to these 20 random seeds are taken as regions of road. Fig. 4 shows the result of temporal fusion. Fig. 5 to Fig. 7 show examples of the results of the algorithm. In these figures, blue and white represent regions classified as road, while red and black are background.

Fig. 4. (a) Road probability after thresholding. (b) After temporal fusion. Blue represents road, while black and red are background.

D. Fitting a Model to the Road Edges

As a postprocessing step, a model of the road can be created. While road region extraction is independent of road shape, road modeling is limited in the number of road shapes that can be represented, depending on the selected model. This step thereby represents a loss in generality. However, this step may not be necessary for many applications.

In previous research (Crisman and Thorpe [5], Hong et al. [12], Aufrere et al. [1], [2]), a number of road models have been proposed. The SCARF road model, for example, is capable of robustly fitting straight roads and intersections. We estimate road boundaries using a second-order polynomial fit to edge points of the area extracted as road. Example results are shown in Fig. 5 through Fig. 7.

III. CONCLUSION AND RESULTS

We have described a real-time color learning algorithm that is capable of dealing with unstructured, nonhomogeneous, complex road shapes under varying lighting conditions. The system has been tested on image sequences taken from a camera mounted on a vehicle driving on the road. The image sequences contain thousands of frames taken at different times of day and on different roads. The roads included trails in the woods, homogeneous and nonhomogeneous roads, urban roads, and dirt roads.

Color segmentation using color models was explored in the SCARF system. However, there are a number of key differences between that algorithm and ours. SCARF depends on mathematical road modeling for a robust result. Our algorithm makes use of accurate color modeling and temporal fusion to try to achieve the same robustness. For the temporal fusion, we assume a fast cycle time so that successive frames are close together. SCARF uses road model results to construct color models in subsequent frames. The elimination of dependency on mathematical road modeling is what allows our algorithm to segment arbitrary road shapes.

The implementation of the algorithm can be separated into two major parts, road area extraction and road model fitting. Road area extraction is responsible for the extraction of arbitrarily-shaped road. Road model fitting takes the road area extraction and attempts to fit a mathematical model. This is done to simplify tasks for a robot controller. This step is separate from the actual road detection and is optional.

The average computation time for the algorithm is approximately 20-30 milliseconds on a Pentium-M CPU machine. A more systematic performance evaluation [14], [10] of the algorithm will be carried out in the near future.

In summary, the system makes use of color classification and learning to create road and background color models. Color models are constructed incrementally and used to segment a color image into road and background by computing the most likely model for each pixel in the image. Color models are constructed and learned through a scheme independent of road shape. Temporal fusion is used to stabilize the results. Preliminary testing demonstrates the system's effectiveness on roads not handled by previous systems.

Fig. 5. Nonhomogeneous road example. Upper left shows the raw image, bottom left is the current road probability, bottom right is the temporal fusion, and upper right is the segmented road.
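The per-pixel probability of Section II.B and the running-average update of Section II.C can be sketched as follows. This is an illustrative sketch: the function names, the bin-index interface, and the value of w_max are our assumptions (the paper does not give a value for w_max).

```python
import numpy as np

def road_probability(pixel_bin, road_hists, background_hist):
    """P_road = N_road / (N_road + N_background) for one pixel.

    `pixel_bin` is the (r, g) histogram bin index of the pixel's color.
    With multiple road color models, the largest probability wins,
    as described in Section II.B.
    """
    n_bg = background_hist[pixel_bin]
    best = 0.0
    for hist in road_hists:            # one histogram per road color model
        n_road = hist[pixel_bin]
        denom = n_road + n_bg
        if denom > 0:
            best = max(best, n_road / denom)
    return best

def temporal_fusion(p_prev, p_new, w, w_max=10):
    """Running average P_t = (w * P_(t-1) + P) / (w + 1), per Section II.C.

    Returns the fused probability and the updated weight; w grows until
    it reaches w_max, after which new data keeps a fixed influence.
    """
    p_t = (w * p_prev + p_new) / (w + 1)
    if w < w_max:
        w += 1
    return p_t, w
```

Because the update is elementwise, `temporal_fusion` works unchanged on whole NumPy probability maps as well as on scalars; lowering w (or resetting it) raises the influence of new data, as suggested for sharp turns.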
Fig. 6. Nonhomogeneous and complex road shape example. Upper-left shows the raw image of an intersection; bottom-left is the current road probability, bottom-right is the temporal fusion, and upper-right is the segmented road.

When the background is similar in color to the road, the fundamental color assumptions are violated. In this case, additional information such as 3D structure (e.g., from LADAR, stereo, optical flow, etc.) needs to be incorporated.

Fig. 7. A difficult case with a nonuniform road surface.

In future work, we plan to use this system to learn to identify traversable and nontraversable areas for autonomous on- and off-road driving.

ACKNOWLEDGEMENT

This work was partly funded by the Army Research Laboratory and by the DARPA LAGR program. Their support is gratefully recognized.

REFERENCES

[1] Aufrere, R., Chapuis, R., Chausse, A fast and robust vision-based road following algorithm, Proc. IEEE Symposium on Intelligent Vehicles, Oct. 2000.
[2] Aufrere, R., Marmoiton, F., Chapuis, R., Collange, F., and Derutin, J.P., Road detection and vehicle tracking by vision for adaptive cruise control, The International Journal of Robotics Research, April 2001.
[3] Broggi, A. and Berte, S., Vision-based road detection in automotive systems: a real-time expectation-driven approach, Journal of Artificial Intelligence Research, 1995.
[4] Beucher, S. and Bilodeau, M., Road segmentation and obstacle detection by a fast watershed transformation, Proc. Intelligent Vehicles '94 Symposium, 296-301, 1994.
[5] Crisman, J.D. and Thorpe, C.E., SCARF: A color vision system that tracks roads and intersections, IEEE Trans. Robotics and Automation, Feb. 1993.
[6] Crisman, J. and Thorpe, C., UNSCARF, a color vision system for the detection of unstructured roads, Proc. International Conference on Robotics and Automation, 2496-2501, 1991.
[7] Dickmanns, E.D., The development of machine vision for road vehicles in the last decade, Intelligent Vehicle Symposium, 2002.
[8] Dickmanns, E.D., Vehicles capable of dynamic vision, International Joint Conference on Artificial Intelligence, 1577-1592, 1997.
[9] DeSouza, G. and Kak, A., Vision for mobile robot navigation: a survey, IEEE Trans. Pattern Analysis and Machine Intelligence, Feb. 2002.
[10] Fdish, M. and Takeuchi, A., Adaptive real-time road detection using neural networks, Proc. 7th International IEEE Conference on Intelligent Transportation Systems, Oct. 2004.
[11] He, Y., Wang, H., Zhang, B., Color-based road detection in urban traffic scenes, IEEE Trans. Intelligent Transportation Systems, Dec. 2004.
[12] Hong, T., Rasmussen, C. and Shneier, M., Road detection and tracking for autonomous mobile robots, Proc. SPIE 4715, April 2002.
[13] Jochem, T. and Baluja, S., A massively parallel road follower, Computer Architecture for Machine Perception, Dec. 1993.
[14] Kristensen, D., Autonomous road following: a study of methods for tracking unmarked roads in image sequences, Master's degree project, Stockholm, Sweden, 2004.
[15] Kosecks, R., Blasi, C.J., Taylor, J., Malik, J., A comparative study of vision-based lateral control strategies for autonomous highway driving, IEEE International Conference on Robotics and Automation, 1998.
[16] Kuan, D., Phipps, G., Hsueh, A., Autonomous robotic vehicle road following, IEEE Trans. Pattern Analysis and Machine Intelligence, Sept. 1988.
[17] Lin, X. and Chen, S., Color image segmentation using modified HSI system for road following, Proc. 1991 IEEE International Conference on Robotics and Automation, 1991.
[18] Pomerleau, D., RALPH: Rapidly adapting lateral position handler, Proc. IEEE Intelligent Vehicles Symposium, 506-511, 1995.
[19] Pomerleau, D., ALVINN: An Autonomous Land Vehicle in a Neural Network, Technical Report CMU-CS-89-107, Carnegie Mellon, 1989.
[20] Rotaru, C., Graf, T., Zhang, J., Extracting road features from color images using a cognitive approach, IEEE Intelligent Vehicle Symposium, 2004.
[21] Schneiderman, H. and Nashman, M., A discriminating feature tracker for vision-based autonomous driving, IEEE Trans. Robotics and Automation, vol. 10, no. 6, 1994.
[22] Thorpe, C., Hebert, M.H., Kanade, T., Shafer, S.A., Vision and navigation for the Carnegie-Mellon Navlab, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 10, May 1988.
[23] Zhang, J. and Nagel, H., Texture-based segmentation of road images, Intelligent Vehicle Symposium, Oct. 1994.

