Germano Guimarães
Emanoel Xavier
Guilherme Silva
Veronica Teichrieb
Judith Kelner
Keywords
FPGA, Image Processing Algorithms, Augmented Reality,
Embedded Systems.
1. INTRODUCTION
With the evolution of embedded systems and of applications using
Virtual Reality (VR), it has become feasible to build systems that
augment, in real time, the user's perception of the real world.
Augmented Reality (AR) is the research area that studies and
creates systems that support the coexistence of virtual and real
worlds [1]. Unlike VR, the user is not immersed in an artificial
environment, but in the real world, with virtual objects or
information superimposed on it. Ideally, in AR, virtual and real
objects coexist in a natural way. A case in point is the movie
Who Framed Roger Rabbit, where virtual characters are shown
in the real world. In an industrial environment, AR can help to
identify problems and point to solutions, ranging from a simple
help in a procedure to a simulation of the future based on current
information. Many other areas can use (and they are already
using) AR tools, including medicine, production and repair
manufacturing
applications.
lines,
robotics,
entertainment
and
military
2. RELATED WORK
Although several AR applications have been developed, they are
targeted to devices which have a general purpose processor, like
computers [3] and PDAs (Personal Digital Assistants) [4]. In
these contexts, all the processing is done in software, which
usually implies performance loss or image quality reduction
when the application operates under stringent real-time
constraints. Reconciling the use of general purpose processors
with applications that must sustain good real-time performance
invariably results in a high overall project cost, due to factors
such as the need for higher clock frequencies and power.
Often users need to carry a notebook in order to run a real-time
application, which makes the solution less versatile and more
expensive, mainly because of its weight and power consumption
[3].
In the preliminary research performed for this project, no
solution that is flexible from both the hardware and the software
points of view was found for AR applications. Most existing AR
solutions are still not accessible to the general audience,
because they are in the research phase and/or are dedicated to
specific applications [3], [5], [6]. For instance, ID CAM [6] is an
3. FRAMEWORK ARCHITECTURE
Related research on smart camera architectures is normally
targeted at image processing problems, such as pattern detection
used to identify gestures, recognize defects and track object
movement. ARCam intends not only to implement image
processing modules in hardware, but also to provide the
infrastructure needed to superimpose virtual elements on the
captured image of the real world, as a way to improve the
interface with the application's user.
The proposed solution in ARCam uses an Altera development
board with the FPGA Stratix II, an image sensor and a VGA
video monitor.
The flexibility of implementing firmware through a hardware
description language, such as VHDL, makes the design scalable:
a component can be duplicated inside the FPGA in order to
improve processing performance. Systems of this kind, which
integrate several modules in a single FPGA, are called SoCs
(Systems-on-a-Chip). Figure 1 shows the development
environment used in the project, where an image sensor (lower
right) was connected to an FPGA-based development board. The
implemented components are described in the next section.
4. IMPLEMENTED COMPONENTS
Several image processing components for different purposes were
implemented to compose the ARCam framework. These
components perform typical image functions and are intended to
be used in the design of AR applications. Hardwire, a component
responsible for rendering 3D wireframe virtual elements, was also
implemented.
The overall descriptions of the implemented features along with
the obtained results are given next.
Pout = white, if the input pixel value is above the threshold
Pout = black, otherwise    (2)
In order to show this result on the display, it is possible to choose
one color to represent the virtual world and another one to
represent the real world. In Figure 3, white was chosen to
represent the real world and black represents the virtual world.
Thus, when a white pixel is found in memory, the corresponding
real world pixel is printed on the screen, and black pixels are
printed on the screen representing the virtual world.
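This binarization step can be sketched in software as follows (a Python model rather than the VHDL used in ARCam; the 8-bit pixel range and the threshold value are assumptions):

```python
WHITE, BLACK = 255, 0  # assumed 8-bit pixel values

def binarize(gray, threshold=128):
    # Equation (2): a pixel above the threshold maps to white,
    # every other pixel maps to black.
    return [[WHITE if p > threshold else BLACK for p in row]
            for row in gray]
```

In hardware, the same comparison is performed per pixel as the image is streamed from the sensor, with no intermediate frame buffer required.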
4.2 Labeling
Labeling of a binary image refers to the act of assigning a unique
value to pixels belonging to the same connected region [11]. This
process is often used in marker based AR applications [10].
The definition of a connected region can consider only 4
neighbors (left, right, up and down) or 8 neighbors; different
results can be obtained depending on this choice. Figure 4 shows
this difference. The process reads the binary image from top to
bottom and left to right.
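The labeling pass can be sketched in software as follows (a breadth-first flood fill in Python; the hardware implementation may use a different, scan-order based scheme), with the connectivity (4 or 8 neighbors) as a parameter:

```python
from collections import deque

def label(image, connectivity=4):
    # image: 2D list of 0/1 pixels. Assigns a unique label to each
    # connected region, scanning top-to-bottom, left-to-right.
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if image[y][x] and not labels[y][x]:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in offsets:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w and
                                image[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels
```

For two diagonally adjacent pixels, 4-connectivity yields two distinct labels, while 8-connectivity merges them into one region, which is the difference illustrated in Figure 4.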
this happens, the next step of the quad detection, namely vertex
reduction, takes place using as its input the memories with the
coordinates of the border pixels.
Once quad detection is finished for the last found border, the
border tracing algorithm searches for another border in the current
frame. This border is then traced, and the process is repeated
until quad detection has been executed on every border present in
the current frame; only then is a new frame loaded.
Currently, only the border tracing step is implemented and Figure
13 shows its results. Each border was rendered with a different
color in order to show that they can be distinguished by the
application.
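A minimal software model of the border-extraction idea, assuming a 4-connected background test (the exact tracing order used by the hardware is not specified here):

```python
def border_pixels(image):
    # A foreground pixel is on the border when at least one of its
    # 4-neighbors is background (or lies outside the image).
    h, w = len(image), len(image[0])
    border = []
    for y in range(h):
        for x in range(w):
            if not image[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not image[ny][nx]:
                    border.append((y, x))
                    break
    return border
```

The resulting coordinates play the role of the border memories that feed the vertex-reduction step of the quad detection.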
4.8 Hardwire
4.8.3 3D rotations
4.8.4 3D projections
Similarly to an artist representing the image of a 3D object on
paper, there is also a need to generate a projection of the object
to be displayed by the computer. A 3D projection is nothing more
than a 2D representation of a 3D object. The simplest projection
is the orthogonal one, and the most used is the perspective
projection. The latter can simulate the projection performed by
human vision when images of an object are captured.
In our initial prototype of Hardwire, a very simple scheme of
orthogonal projection was implemented, achieved simply by
ignoring the vertex z coordinate on the viewer's coordinate
system. This way, the 3D object is projected directly on the
projection plane, keeping the original size of objects independent
from the viewer's distance.
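The rotation and projection steps can be sketched as follows (a Python model; Hardwire itself operates on vectors in VHDL). Rotation about the y axis is shown as one representative case, and the orthogonal projection simply discards the z coordinate:

```python
import math

def rotate_y(vertices, angle):
    # Rotate each (x, y, z) vertex about the y axis by `angle` radians.
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in vertices]

def project_orthogonal(vertices):
    # Orthogonal projection: discard the z coordinate of each vertex
    # in the viewer's coordinate system, so the projected size does
    # not depend on the viewer's distance.
    return [(x, y) for x, y, z in vertices]
```

A hardware version replaces the trigonometric calls with precomputed fixed-point sine/cosine values, but the structure of the computation is the same.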
5.1 Pong
An AR application using the architecture described in Section 3
was implemented as a case study, namely the Pong game, where
two players control two bars that can strike an object that moves
on the screen. Each time the player does not move the bar to
prevent the collision of the ball with the side edges of the scene,
he/she loses points and the edges of the scene blink in red,
announcing the collision. Figure 18 shows Pong at the moment
that the ball (red) collides with the center of the left side edge
(blue player), because the player did not strike it. The blue and
green bars are the players' representations and compose the scene
of the game, as well as the upper bars that indicate the players'
scores.
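A hypothetical software model of one game tick may clarify the logic (the function name and the collision test are illustrative, not the actual VHDL processes):

```python
def step_ball(x, y, dx, dy, width, height):
    # One tick of ball movement: advance the ball, bounce it off the
    # top/bottom edges, and flag a miss when it reaches a side edge
    # (where the players' bars are).
    x, y = x + dx, y + dy
    if y <= 0 or y >= height - 1:
        dy = -dy
    missed = x <= 0 or x >= width - 1
    return x, y, dx, dy, missed
```

In the FPGA, the equivalent update runs in a clocked process paced by the slow game clock, with the miss flag driving the blinking red edges and the score update.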
[Table: Pong processes. Columns: Description, Function, and the
processes (A-G) that Read and Write each signal. The signals
include a 0.5 Hz clock (the 100 MHz system clock divided down to
0.5 Hz), the ball's X and Y positions, the vertical and horizontal
movement, the players' Y positions and the players' scores.]
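The 0.5 Hz game clock is obtained by dividing the 100 MHz system clock; a counter-based divider can be modeled in software as follows (a Python sketch with illustrative names; the real divider is a VHDL counter process):

```python
def divider_threshold(f_in_hz, f_out_hz):
    # Toggling the output every f_in / (2 * f_out) input cycles
    # produces a square wave at f_out_hz.
    return int(f_in_hz // (2 * f_out_hz))

def divided_clock(f_in_hz, f_out_hz, n_cycles):
    # Simulate the divided clock level over n_cycles input cycles.
    limit = divider_threshold(f_in_hz, f_out_hz)
    level, count, out = 0, 0, []
    for _ in range(n_cycles):
        count += 1
        if count == limit:
            count, level = 0, 1 - level
        out.append(level)
    return out
```

For the 100 MHz to 0.5 Hz case, the counter toggles the output every 10^8 input cycles, which requires a 27-bit counter register in hardware.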
6. LESSONS LEARNED
Hardware development in FPGA adds a series of difficulties and
challenges different from those encountered in software
development. Some of them are the FPGA size (number of logical
elements available), the frequency clock of the circuit and its
speed. These variables must be carefully studied during the
project of an embedded system.
The time to compile and generate the simulation hardware is much
longer than that of a software compilation process: a simple piece
of software can be compiled in a matter of seconds, while even
simple hardware takes several minutes to compile. This makes
hardware development more complex and must be taken into
account in any hardware development schedule.
Concerning the framework architecture, there were some
difficulties to start the image sensor capture, due to the fact that
the camera uses an I2C register bus to configure the image. The
image sensor documentation is incomplete and the authors
experienced some problems in identifying its features. In fact,
even at this moment, one of the problems still faced is that the
image appears green, even when using an RGB format (8 bits per
color).
Another problem was parallel access to memory. In the beginning
of the project, there were some moments where two processes
were accessing the memory concurrently leading to data
inconsistency. Currently, these two processes have been merged
and only one process is responsible for memory access.
In Hardwire, lessons were learned about algebraic manipulation
of signed and unsigned vectors, and about the differences between
the signed and arithmetic VHDL libraries.
In the implementation of Pong, the most important challenge was
memory access. The internal memory was not sufficient to store a
640x480 frame together with the game logic, so the use of an
external memory is important in order to increase the frame
resolution, which currently is 320x240 pixels.
8. ACKNOWLEDGMENTS
The authors want to thank MCT and CNPq, for financially
supporting this research (process 507194/2004-7).
9. REFERENCES
[1] Azuma, R. A survey of augmented reality. Presence, 6, 4
(Aug. 1997), 355-385.
[2] Guimarães, G., Silva, S., Silva, G., Lima, J., Teichrieb, V.,
and Kelner, J. ARCam: Solução Embarcada para Aplicações
em Realidade Aumentada. In Workshop de Realidade
Aumentada (WRA 06) (Rio de Janeiro, Brazil, September
27-29, 2006). Brazilian Computer Society, São Paulo, SP,
2006, pp. 23-26.
[3] Umlauf, E., Piringer, H., Reitmayr, G., and Schmalstieg, D.