
“DRAFT” Proceedings of the ASME 2010 International Mechanical Engineering Congress & Exposition

November 12-18, 2010, Vancouver, British Columbia, Canada



Wook Choi
Mechanical and Aerospace Engineering Dept.
University of California, Los Angeles (UCLA)
Los Angeles, CA, USA

Vladimir Rubtsov
Intelligent Optical Systems, Inc. (IOS)
Torrance, CA, USA
vrubtsov@intopsys.com

Chang-Jin “CJ” Kim

Mechanical and Aerospace Engineering Dept.
University of California, Los Angeles (UCLA)
Los Angeles, CA, USA

ABSTRACT

Depth information from an image can greatly increase work efficiency when observing or inspecting objects, because the size, distance, and relative locations can be estimated. Various stereo imaging methods are being used to find depth information in a wide range of application fields, typically by placing multiple optical systems side-by-side to create multiple shifted images. In this study, we develop a miniature stereo image generating device, which can augment an existing single optical system (i.e., a two-dimensional image capturer) with three-dimensional capability. Developed with MEMS technology, the device consists of a single translating aperture, which shifts laterally between two positions (up to 100 µm apart demonstrated) by means of electrostatic comb actuators. Attached at the objective end of conventional (i.e., nonstereo) optical systems, this stereo converter with an aperture 900 µm in diameter is shown to successfully generate slightly different viewing angles, providing stereo images. Being miniaturized, this device is suitable for microscopic or endoscopic applications, where the size of the system is limited or the axial depth of focus is relatively large.

1. INTRODUCTION

Analyzing two-dimensional (2D) images to extract three-dimensional (3D) information is a complex and time-consuming task mostly relying on limited information such as shading or surface discontinuity [1-2]. The difficulties in finding the 3D information, however, can be greatly reduced if any depth information about an object is provided for the image analysis. To find such depth information, various stereo imaging methods are used these days, clarifying the size, distance, and 3D shapes of objects in a range of application fields. The most common method is using two or more separate imaging systems having certain distances between them [3-5] to provide multiple viewing angles using triangulation methods. However, such multiple imaging systems lead to an increase in size and structural complexity of the overall system. To address this issue, several stereo systems with a single imaging system [6-8] or an off-axis aperture [9-11] have been introduced.

We have previously proposed dramatic miniaturization of those stereo systems through use of micro electro-mechanical systems (MEMS) technologies, reporting success with a single glass disc that flips around the neutral angle in the light path [12]. The operating mechanism for the flipping was electrostatic actuation at resonance, consuming low power. The glass parts, including the glass disc, were fabricated by thermal molding into a micromachined silicon mold, followed by lapping and polishing steps. Although the device was successful, we have since been exploring new designs suitable for more conventional MEMS fabrication, which would be more reliable and lower in cost.

In this study, we report an all-silicon (i.e., no glass parts) device much simpler in structure and fabrication processes. The completed device is attachable to an existing optical system

1 Copyright © 2010 by ASME

(e.g., a camera or a microscope) to add stereo imaging capability if a correctly sized aperture is used.

2. IMAGING PRINCIPLE

Several off-axis aperture stereo imaging methods have been developed as alternatives to multiple-lens stereo imaging methods to avoid the large size and complexity of the usual multiple-lens systems. Such off-axis aperture methods include the use of pinholes (i.e., no lens) located at a certain distance from the sensor which is widely used in the pinhole stereo photography, and the use of multiple apertures off-axis in front of an imaging system with lenses [10]. However, the pinhole method generally requires a significant exposure time or strong illumination, which is not suitable for general imaging applications, while the multiple-apertures method requires several imaging filters and post imaging processes to extract separate images coming from each aperture. To avoid those limitations and complications, a stereo imaging method with a single translatable aperture placed in front of an existing imaging system [11] is used in this study.

Fig. 1 shows three objects (a square, a triangle, and a circle) in front of the objective lens of the imaging system, with the triangle at the focusing distance of the lens. Because of their different physical distances from the lens, their imaging locations behind the lens (L1, L2, and L3, respectively) are also different. If all those three objects are lined up along the optical axis, it is hard to distinguish their relative locations (Fig. 1(a)) simply by observing the images on the sensor. However, when a screen with an aperture is placed in front of the lens and translates up and down, the images of the square and the circle shift down and up, respectively, while the image of the triangle remains still. That is, objects closer or farther than the lens' focusing distance have their images on the sensor shifted in the same or the opposite direction, respectively, as the aperture translation, providing views from different angles as shown in Fig. 1 (b) and (c).

If a thin lens is assumed as in Fig. 1 and the aperture is very close to the lens, the image shift at the sensor can be estimated by the following equation [9]:

\[ \frac{1}{d} = \frac{h}{v}\left(\frac{1}{F} - \frac{1}{D}\right) + \frac{1}{D} \]

where d: distance to an object in front of the lens
      v: aperture translating distance
      h: image shift at the image sensor
      D: distance to a plane conjugate to sensor plane
      F: focal length of the lens

Fig. 2 shows the actual image shift test using a translating aperture. There are two pens placed at different distances; the ballpoint pen (without a cap) in the front is closer to the camera than the marker pen (with a cap) in the back. A screen with an aperture 1 mm in diameter is placed in front of the camera. The camera is kept focused on the marker pen in the back during this test. As the aperture is moved up and down, only the image of the pen in the front which is not in focus shifts while the image of the pen in the back on which the camera focuses does not move, providing depth information to the observer that cannot be easily obtained using a regular 2D camera. This experiment confirms that when an aperture of adequate size is used in front of the imaging system, only the image of objects not in focus shifts as the aperture translates, which can be used to find the relative locations of each object.

Figure 1. Image shift by aperture translation. (a) Objects at different locations have different imaging distances behind the lens. When an aperture is used and translates upward (b) or downward (c), each object has its image on the sensor shifted according to its relative location.

Figure 2. Captured image shift by a camera. A screen with an aperture 1 mm in diameter is placed in front of the camera. The camera focuses on the capped marker pen in the back. As the aperture translates up and down, only the image of the opened ballpoint pen in the front shifts while the image of the pen in the back remains still.
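The thin-lens image-shift relation of Section 2 can be rearranged either to predict the shift h for a known object distance d, or to estimate d from a measured shift. The Python sketch below illustrates both directions. The function names and the numerical values (F = 10 mm, D = 1 m, v = 100 µm) are hypothetical and chosen only for illustration; they are not taken from the experiments reported here.

```python
def image_shift(d, v, D, F):
    """Image shift h at the sensor for an object at distance d.

    Rearranged from 1/d = (h/v)(1/F - 1/D) + 1/D.
    All lengths must use the same unit.
    """
    return v * (1.0 / d - 1.0 / D) / (1.0 / F - 1.0 / D)


def object_distance(h, v, D, F):
    """Object distance d recovered from a measured image shift h."""
    inv_d = (h / v) * (1.0 / F - 1.0 / D) + 1.0 / D
    return 1.0 / inv_d


# Hypothetical optics, all lengths in millimetres (assumed values).
F = 10.0     # focal length
D = 1000.0   # distance to the plane the lens is focused on
v = 0.1      # aperture translation (100 um)

for d in (100.0, 1000.0, 5000.0):
    h = image_shift(d, v, D, F)
    print(f"d = {d:7.1f} mm -> h = {h * 1000:+.3f} um")
```

With this sign convention, an object closer than the focusing distance gives a positive h (a shift in the same direction as the aperture), an object at the focusing distance gives h = 0, and a farther object gives a negative h, which matches the behavior described for Fig. 1.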



Fig. 3 illustrates the implementation of the stereo converter introduced in the previous section in the form of a miniaturized stereo converting device in conjunction with a lens system. A silicon screen with an aperture patterned at the center is located right in front of an optical system. With the translation motion of this screen, images of objects viewed through the aperture shift to generate stereo images from what would otherwise be a planar 2D image, giving the attached optical system 3D imaging capabilities. The required aperture size depends on (1) the imaging applications because the aperture size determines the images' depth of focus, (2) the light sensitivity of the imaging system, and (3) the illumination setup used for the imaging test because the brightness of the image through the aperture is proportional to the square of the aperture size.

Figure 3. A proposed stereo imaging system. A silicon microdevice with a translating aperture screen is located right in front of the optical system to generate image shifts.

Fig. 4 shows a simplified view of the stereo converter's design. The entire converter is made of silicon, and the central screen with the aperture at the center is suspended by folded spring structures which guide the screen's translating motion. The translation is generated by two sets of electrostatic comb drive actuators patterned on the screen. By alternately applying an electric potential to the comb drives, the aperture's translation motion can be achieved. Considering the minimal overall device size for future endoscopic applications, the folded spring fixtures are designed to have 1 mm of length. The spring constant of the folded spring fixtures in Y-direction is given by

\[ K_Y = \frac{2 E t w^3}{L^3} \]

where t is the thickness of the folded springs and the remaining symbols are defined below.

Figure 4. Illustration of the stereo-generating silicon device. A screen with an aperture in the center translates (up and down in the figure) by the opposing comb drive actuators. The translation is guided by the attached folded spring

With the given folded spring length, the translating distance is determined by the number of the comb fingers at the actuators and the width of the folded springs, and can be calculated by

\[ D_Y = \frac{\varepsilon\, n\, L^3}{4\, g\, E\, w^3}\, V^2 \]

where ε: permittivity constant of air
      n: number of gaps between the comb fingers
      L: length of the folded spring
      g: gap between comb fingers
      E: Young's modulus of the silicon
      w: width of the folded springs
      V: applied voltage

The miniaturized 3D image converter is fabricated by using a series of dry etching methods on a silicon-on-insulator (SOI) wafer used as a starting material. The fabrication sequence in detail is shown in Fig. 5. The SOI wafer used for the device in this study has 50 µm of device silicon, 0.5 µm of buried oxide (BOX), and 500 µm of handle silicon layer (Fig. 5(a)). To avoid additional processes for metal pads for electric contact, a highly doped device layer is used. As the first step,


silicon dioxide layers are deposited on top and bottom of the wafer by plasma enhanced chemical vapor deposition (PECVD) method, followed by patterning those layers using reactive ion etching (RIE) (Fig. 5(b)). After that, the silicon handle layer is etched deeply (Fig. 5(c)) by using deep reactive ion etching (DRIE) or an anisotropic wet etching method (i.e., KOH etching). This step is to free the silicon device layer on the top so that the patterned structure can move freely, while the remaining thick structures in the handle layer will be used to provide mechanical strength to certain areas on the thin device layer at the points of electric contact. The device layer with the aperture screen and the comb drive actuator is then patterned by DRIE, followed by silicon dioxide layer removal by RIE (Fig. 5(d)), finalizing the device fabrication.

The fabricated device is not only much simpler in fabrication process with only two silicon etching steps, but also much more robust compared to the flipping device with a thick glass plate [12] because no heavy structures need to be suspended by the folded springs, as shown in the final cross section view (Fig. 5(d)).

Figure 5. Process flow for the silicon stereo converter device. (a) SOI as starting material. (b) Oxide deposition and patterning (top and bottom surfaces). (c) Backside anisotropic etching. (d) Frontside etching and oxide removal.

5. DEVICE OPERATION

Fig. 6 (a) shows the fabricated silicon stereo converting device. It is 5 mm in overall diameter and has an aperture 900 µm in diameter. The design with 150 moving comb fingers and 1 mm-long folded spring fixtures, among several variations, has been used for device tests in this study. The gap between the opposing fingers is 3 µm. The device layer has been patterned to form both signal and ground electrodes. For secure electrical connections, copper tapes were used to deliver the driving and ground electric signals directly to each electrode with silver paint brushed around the contact point between the tapes and the device electrodes.

Fig. 6 (b) and (c) are microscopic views of the white dotted circle area in Fig. 6 (a), with ruler bars on the left with 10 µm increments. Fig. 6 (b) shows the comb drive actuator with top stationary and bottom moving comb fingers when no voltage is applied to the actuator, which is the initial finger position as fabricated. Fig. 6 (c) shows the comb drive actuation when 44 V is applied to the actuator, translating 60 µm of distance when a 4 µm-wide folded spring is used.

Figure 6. (a) Photo of the fabricated silicon device 5 mm in overall diameter. Microscopic views of the white dotted circle area are shown in (b) and (c) during the comb drive operation. (b) Initial comb finger location when no voltage is applied, and (c) 60 µm of aperture translation when 44 V is applied.

Using the same device, up to 62 µm of screen translation could be achieved at 45 V of applied voltage, as shown in Fig. 7. When 5 µm-wide springs were used, up to 60 µm of finger translation could be achieved at 58 V before some fingers laterally touched each other. Even though 4 µm- and 5 µm-wide folded spring fixtures were used, their translating motion behaved like that of springs with slightly thinner widths. This was because of the tapering down of the 50 µm-thick folded spring structures during the DRIE process, resulting in a narrower spring width at the bottom than at the top.
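To see how strongly the translation depends on the spring width, the design equations for K_Y and D_Y can be evaluated numerically. The Python sketch below is only an illustration under stated assumptions: the Young's modulus (169 GPa), the use of the vacuum permittivity for air, and the gap count n = 300 (two gaps for each of the 150 moving fingers) are assumed values that the paper does not state, and the comb-drive force uses the standard parallel-plate expression rather than a formula given in the text. The results should therefore not be read as a reproduction of Fig. 7.

```python
EPS_AIR = 8.854e-12   # F/m; permittivity of air approximated by the vacuum value
E_SILICON = 169e9     # Pa; assumed Young's modulus (not stated in the paper)


def spring_constant(t, w, L, E=E_SILICON):
    """Y-direction stiffness of the folded springs, K_Y = 2*E*t*w^3 / L^3 (N/m)."""
    return 2.0 * E * t * w**3 / L**3


def comb_force(V, n, t, g, eps=EPS_AIR):
    """Comb-drive force: n gaps, each contributing eps*t*V^2 / (2*g) (N)."""
    return n * eps * t * V**2 / (2.0 * g)


def translation(V, n, L, g, w, E=E_SILICON, eps=EPS_AIR):
    """Static translation D_Y = eps*n*L^3*V^2 / (4*g*E*w^3) (m)."""
    return eps * n * L**3 * V**2 / (4.0 * g * E * w**3)


# Geometry from the paper: 1 mm springs, 3 um finger gap, 50 um device layer.
L, g, t = 1e-3, 3e-6, 50e-6
n_gaps = 300  # hypothetical: two gaps for each of the 150 moving fingers

for w_um in (3, 4, 5):
    d_y = translation(45.0, n_gaps, L, g, w_um * 1e-6)
    print(f"w = {w_um} um: D_Y = {d_y * 1e6:.1f} um at 45 V")
```

Dividing `comb_force` by `spring_constant` gives the same value as `translation`, since the thickness t cancels. Because D_Y scales with 1/w^3, a spring that is nominally 4 µm wide but effectively 3 µm wide after DRIE tapering would translate about 2.4 times farther at the same voltage, which is consistent in direction with the observation that the tested springs behaved like thinner ones.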


Figure 7. Aperture translation by comb drive actuators. 4 µm and 5 µm-wide folded springs were used for the testing. (Plot of aperture translation (µm) versus applied voltage (V), with curves labeled for 3 µm-, 4 µm-, and 5 µm-wide springs.)

6. VIEWING ANGLE CHANGES

A stereo system on a regular point-and-shoot camera (NV3, Samsung Electronics) was built by attaching the fabricated silicon device right in front of the camera lens. Additional black screen material was attached in front of the silicon device to block unnecessary light coming through the spring and comb drive areas. Fig. 8 (a) shows a conventional 2D photo of a LEGO® figure (without using the stereo converter) 4.5 cm in height placed approximately 10 cm from the camera. The background is a hallway with doors at the end. Fig. 8 (b) and (c) are photos taken with the camera optically zoomed to the dotted square area shown in Fig. 8 (a) with the fabricated stereo converter attached in the front. With the camera kept focused on the doors during the experiment, the aperture screen translated horizontally 50 µm in each direction. As the aperture translated to the left and right, the image of the LEGO® figure, which was closer and not in focus, shifted to the right and left, respectively, while the image of the door at the end of the hallway did not move.

Figure 8. (a) 2D photo showing a LEGO® figure in the front and a door in the back at the end of the hallway without the stereo converter attached. When the converter is attached and the aperture translates left and right, the image of the figure shifts (b) to the right and (c) left, respectively. For (b) and (c), the camera focuses on the door in the back whose image remains still.

The same silicon stereo converter was used for a microscopic stereo imaging test as in Fig. 9, which shows the images of a ruler covering approximately 3 cm captured at the two extreme aperture locations. As the aperture made a ±50 µm horizontal translation, the distance between each marking on the ruler expanded and contracted due to the slight change in viewing angle. The microscope focused on the marking in the middle (at 6.5 cm location), such that the image of that point remained unmoved. As in these experiments, relative locations or views from different angles of objects can be found by observing the directions and the amount of each object's image shift.

Figure 9. Recorded microscopic images of a ruler viewed through the translating aperture device. As the aperture translates ±50 µm horizontally, the width of each division on the ruler changes (left) due to the viewing angle change, which is equivalent to looking at the ruler from two different angles (above).

CONCLUSIONS

Sets of stereo images can be used as a powerful tool to find 3D information when observing or inspecting an object by providing depth information. This paper presented a microfabricated stereo imaging generator using a relatively simple stereo imaging method with a single translatable aperture. This miniaturized stereo converter was shown to


successfully generate image shift when attached and operated in front of imaging systems, even with the limited translating distances microfabricated silicon actuators can provide. The presented technique would be especially beneficial for microscopic object inspection or endoscopic applications where extraction of depth information from 2D images is more challenging due to the relatively high axial depth of focus within the working distance of the device.

ACKNOWLEDGEMENTS

The project has been supported by the Small Business Innovation Research (SBIR) grants from National Institutes of Health (NIH). Authors want to thank Mr. James Jenkins for valuable discussions and the input regarding the assembly of the paper.

REFERENCES

[1] Zhang, R., Tsai, P., Cryer, J. E., and Shah, M., 1999, “Shape from shading: A survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, pp. 690-706.

[2] Saxena, A., Sun, M., and Ng, A. Y., 2007, “Learning 3-D scene structure from a single still image”, in Proceedings of IEEE International Conference on Computer Vision, pp. 1-8, Rio de Janeiro, Brazil.

[3] Durrani, A. F. and Preminger, G. M., 1995, “Three-dimensional video imaging for endoscopic surgery”, Computers in Biology and Medicine, vol. 25, pp. 237-247.

[4] Schreier, H. W., Garcia, D., and Sutton, M. A., 2004, “Advances in light microscope stereo vision”, Experimental Mechanics, vol. 44, pp. 278-288.

[5] Okutomi, M. and Kanade, T., 1993, “A multiple-baseline stereo”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, pp. 353-363.

[6] Goshtasby, A. and Gruver, W. A., 1993, “Design of a single-lens stereo camera system”, Pattern Recognition, vol. 26, pp.

[7] Gao, C. and Ahuja, N., 2004, “Single camera stereo using planar parallel plate”, in Proceedings of the International Conference on Pattern Recognition, vol. 4, pp. 108-111, Cambridge, UK.

[8] Lee, D. H. and Kweon, I. S., 2000, “A novel stereo camera system by a biprism”, IEEE Transactions on Robotics and Automation, vol. 16, pp. 528-541.

[9] Adelson, E. H. and Wang, J. Y. A., 1992, “Single lens stereo with a plenoptic camera”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, pp. 99-106.

[10] Seo, S., 2005, “Stereo-image capturing device”, US Patent No. US 6,977,674 B2.

[11] Dou, Q. and Favaro, P., 2008, “Off-axis aperture camera: 3D shape reconstruction and image restoration”, in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-7, Anchorage, USA.

[12] Choi, W., Akbarian, M., Rubtsov, V., and Kim, C.-J., 2009, “Microfabricated flipping glass disc for stereo imaging in endoscopic visual inspection”, in Proceedings of IEEE International Conference on Micro Electro Mechanical Systems, pp. 160-163, Sorrento, Italy.
