
THE DABR: A MULTITOUCH SYSTEM FOR INTUITIVE 3D SCENE NAVIGATION

Jörg Edelmann, Andreas Schilling
University of Tübingen, WSI/GRIS

ABSTRACT

Multi-touch capable displays are one of the central emerging technologies in Human Computer Interfaces, and many commercial applications like the Apple iPhone or the Microsoft Surface already show the benefit of this interaction technique. However, most applications are limited to 2D interaction, and only little effort has been spent on intuitive 3D interaction techniques. Since 3D scene navigation is a central aspect of 3DTV applications, we present easy and intuitive multi-touch gestures to control a virtual 3D camera. These gestures were implemented on The DabR, a complete and robust multitouch system designed at our institute. The underlying computer vision algorithms that track the users' fingers are GPU accelerated to ensure very low latency for the interaction.

Index Terms: HCI, multi-touch, tabletop, 3D navigation, natural user interface, 3DTV

1. INTRODUCTION

For a successful entry into the consumer market, it is essential for 3DTV applications to provide an intuitive user interface for interacting with 3D content. Multi-touch interaction concepts provide natural interfaces in many fields. Especially 2D interaction techniques that allow very direct manipulation show the benefit of multi-touch and have proven very appealing to users. In this paper we present our multi-touch system The DabR (Fig. 1). It allows for direct 3D scene manipulation, i.e., a setup where the users' fingers and the rendered 3D scene are closely coupled, thus extending the well-known benefits of direct 2D interaction to the 3D world. This includes both temporal directness (low latency), which is addressed by a GPU accelerated image processing pipeline, and spatial directness, which is addressed by novel navigation gestures. This paper is organized as follows. First, Section 2 describes related work. Then we present the architecture and design of The DabR system in Section 3. Finally, experimental results are described in Section 4 before we conclude this paper.

Sven Fleck SmartSurv Vision Systems GmbH

Fig. 1. Multi-touch system prototype The DabR

2. RELATED WORK

Different approaches for multi-touch sensing have been published. The DiamondTouch [1] uses capacitive sensing. In contrast to most optical systems, it can distinguish different users. However, contact with a special mat is required. Moreover, only front projection is available, leading to occlusions while interacting. Other approaches are based on optical touch sensing. These systems can be used with rear-projected images and allow shadow-free interaction. The reacTable [2], for example, illuminates the display with infrared light from below the surface. An advantage besides the high spatial resolution obtained by using a camera is that objects can also be used for tangible interaction. However, objects detected slightly above the surface can lead to confusion during interaction. The multi-touch system Surface from Microsoft uses a similar

1 The DabR: Swabian for the finger marks left after touching something.

technique to sense fingers and objects. A different promising technique for optical multi-touch sensing was presented by Han [3]. A plexiglass surface is illuminated from the sides with infrared light. Touching the surface leads to frustrated total internal reflection (FTIR): the infrared light escapes and is captured by an infrared sensitive camera below the display. Hence, only clear contacts, and nothing above the screen, are detected. Due to these advantages, we use this method as the basis for our work. Related work with respect to 3D navigation with a 2D input device comprises the following publications. The Navidget proposed in [4] is a well-designed graphical user interface element for determining the camera position with single pointing devices. However, it has to be learned and cannot be operated by intuition. Moreover, it does not take any advantage of multi-touch devices, nor does it include any direct manipulation functionality. Jung et al. [5] presented gestures for 3D navigation within multi-touch systems. However, the feeling of direct manipulation is also missing. The work of Steinicke et al. [6] concentrates on gestures for scene manipulation on multi-touch systems that visualize interscopic data; however, only basic navigation concepts similar to keyboard cursor navigation are presented. To summarize, none of the existing approaches cover techniques for direct viewpoint manipulation.

3. The DabR SYSTEM

In this section we describe our The DabR multi-touch system with respect to hardware setup, image processing pipeline and user interaction design. Moreover, some practical experiences gained during the design of The DabR are noted.

3.1. Hardware Setup

The hardware architecture of The DabR is illustrated in Fig. 2. The operational principle is based on the FTIR method presented by Han [3]. To ensure a static and robust setup, all components are connected with aluminium structural beams.
For visualization, an XGA projector together with a mirror is used to perform shadow-free rear-projection onto the surface. To also enable stereo visualization for stronger immersion, a standard color anaglyph approach is taken. The surface itself is coated with a silicone layer and is illuminated by 50 IR LEDs. Due to the FTIR principle, total internal reflection is disturbed when the surface is touched. A high speed infrared sensitive camera senses the resulting frustrated light from the touch surface. This video stream is processed as described in the following section.

Fig. 2. The hardware architecture of The DabR: silicone-coated projection surface on plexiglass with IR-LEDs, XGA projector, mirror, Firewire camera (80 Hz), and PC.

3.2. Image Processing Pipeline

MacKenzie and Ware [7] showed that lag in HCI degrades users' performance in motor-sensory tasks on interactive systems. Especially within touch environments, the lack of low-latency navigation can quickly turn the best gesture based navigation approach into an undesirable experience, due to the tightly coupled input and feedback mechanism. Hence, low latency is one of our top priority goals for The DabR. To this end, besides using a camera with high temporal resolution, we implemented a GPU accelerated image processing pipeline to detect and track fingers on the surface. The operations are implemented using the CUDA framework, a hard- and software architecture by NVIDIA. The pipeline architecture is illustrated in Fig. 3. After uploading the live camera image onto the GPU, an efficient background model detects foreground pixels, i.e., potential regions where users interact with The DabR. Afterwards, a low pass convolution filter is applied for noise removal, followed by thresholding. The result is then passed to the erosion stage. Finally, connected components are extracted to abstract from pixel level to object level (e.g., fingertips). The resulting images are downloaded to main memory again (as shown in Fig. 3) for further processing. After computing the centroids, point tracking for continuous interaction is performed. The inherent correspondence problem is solved using the Hungarian algorithm, which leads to an optimal tracking result. The whole process, including GPU based image processing and CPU based tracking, is performed in 3.6 ms on the integrated PC equipped with a Core2Duo 2.6 GHz and an NVIDIA GeForce 8800 GTS. This pipeline was integrated into the open source multi-touch framework Touchlib. The standard implementation of the framework is based on OpenCV and performs blob tracking in about 33 ms on VGA video streams, as also illustrated in Fig. 4.
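The order of the pipeline stages can be sketched with standard Python tools. This is a CPU-side illustration only, not the paper's CUDA implementation; the function name, the simple difference-based background model, and all thresholds are our own assumptions.

```python
import numpy as np
from scipy import ndimage

def detect_fingertips(frame, background, diff_thresh=30, min_area=20):
    """Sketch of the pipeline: background removal, box blur,
    threshold, erosion, connected components, centroid extraction."""
    # Background removal: difference against a static background model
    fg = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    # Box blur (low pass filter) to suppress sensor noise
    blurred = ndimage.uniform_filter(fg.astype(np.float32), size=5)
    # Threshold to a binary touch mask
    mask = blurred > diff_thresh
    # Erosion removes remaining single-pixel speckles
    mask = ndimage.binary_erosion(mask)
    # Connected components lift pixels to object level (fingertips)
    labels, n = ndimage.label(mask)
    centroids = ndimage.center_of_mass(mask, labels, range(1, n + 1))
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    # Keep only blobs that are plausibly fingertip-sized
    return [c for c, s in zip(centroids, sizes) if s >= min_area]
```

A real-time system would replace the static background difference with an adaptive background model and run each stage as a CUDA kernel, but the data flow is the same.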

Fig. 3. The image processing pipeline architecture of The DabR: camera capture and upload to GPU (CPU); background removal, box blur, threshold, erosion and connected components (GPU, CUDA); download from GPU and point tracking (CPU).
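The Hungarian correspondence step between the fingertip centroids of consecutive frames can be illustrated with SciPy. This is a sketch under the assumption of a plain Euclidean distance cost; the paper does not specify its cost function.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_points(prev_pts, curr_pts):
    """Globally optimal correspondence between fingertip centroids
    of consecutive frames via the Hungarian algorithm."""
    prev = np.asarray(prev_pts, dtype=float)
    curr = np.asarray(curr_pts, dtype=float)
    # Cost matrix: pairwise Euclidean distances
    cost = np.linalg.norm(prev[:, None, :] - curr[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    # Map index in previous frame -> index in current frame
    return dict(zip(rows, cols))
```

Because the assignment minimizes the total cost over all pairs, two fingers crossing paths are matched consistently, which a greedy nearest-neighbour matcher can get wrong.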


Fig. 4. Performance comparison of the resulting lag due to image processing: The DabR (~3.6 ms) vs. Touchlib (~33 ms).

3.3. Intuitive 3D Scene Navigation

Fig. 5. Multi-touch gestures used for 3D scene navigation. Left column: pan/tilt operation. Right column: object movement and zooming.
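The core of the pan/tilt gesture of Fig. 5 is that the scene point initially under the finger stays locked to the fingertip. Under a simple pinhole camera model this amounts to rotating the camera by the angular difference between the viewing rays through the old and new finger positions. The following is a minimal sketch; the paper does not give its exact formulation, and the `focal_px` parameter and coordinate convention are our assumptions.

```python
import math

def pan_tilt_from_drag(p0, p1, focal_px):
    """Camera pan/tilt (radians) that keeps the scene point initially
    under the finger locked to it as it drags from pixel p0 to p1.
    Pixel coordinates are relative to the principal point."""
    # Angles of the viewing rays through the two fingertip positions
    yaw0, yaw1 = math.atan2(p0[0], focal_px), math.atan2(p1[0], focal_px)
    pitch0, pitch1 = math.atan2(p0[1], focal_px), math.atan2(p1[1], focal_px)
    # Rotating the camera by the angular difference makes the original
    # viewing ray pass through the new fingertip position
    return yaw0 - yaw1, pitch0 - pitch1
```

Dragging the finger to the right yields a negative yaw, i.e., the camera pans left so the touched object follows the finger to the right, which is exactly the direct-manipulation feel described below.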

3D scene navigation is inherently challenging using 2D input devices. However, multi-touch systems increase the degrees of freedom (DoF), as every finger in action adds another independent 2 DoF. In this section we introduce gestures as illustrated in Fig. 5. In contrast to previously proposed methods, our gestures were designed to provide maximum directness, resulting in a more intuitive user interaction. Directness means that the selected points on the 3D objects are always directly coupled to the user's fingers. To simplify the problem, we only allow pan/tilt rotation, translation along the view vector, and translation parallel to the image plane. Rolling the camera or changing the field of view is not supported, to avoid confusion. The gestures work as follows: Moving one finger leads to a pan/tilt of the camera around its actual view point. This is performed in a way that the touched object strictly stays at the fingertip. Using two fingers allows moving the camera. If the two fingers are moved apart from each other, zooming is performed, still keeping the direct connection to the touched object as close as possible. Parallel movement of the fingers moves the camera parallel to the image plane.

4. RESULTS

The hardware setup, as shown in action in Fig. 6, turned out to serve as a robust platform both for user studies under controlled laboratory conditions and for real world applications. For example, the system has been shown to the public and worked reliably during the annual University of Tübingen Day. For proof of concept, we implemented the presented 3D navigation gestures in RadioLab, a real-time photorealistic visualization software. The as yet informal user studies showed quite good results. The users could interact with the system intuitively after very short training.

4.1. Application Study: Surveillance Visualization

The DabR allows visualizing of and interacting with video surveillance systems. Work in progress is illustrated in the following.
Fig. 6. The DabR in action, interactively visualizing a 3D model.

Pan-tilt-zoom (PTZ) cameras are widely used in surveillance applications. Nowadays, the underlying mechanical unit is often replaced with a high resolution imaging sensor and a wide angle lens (e.g., as within the AXIS 212 PTZ IP camera). This results in purely virtual PTZ functionality, which allows for instant control as no mechanical movement is necessary. The user interface used today is based on a joystick which performs pan and tilt; zooming is done by rotating the joystick. Obviously, this is not a perfectly direct and intuitive interaction technique. This 2D application is directly covered by our navigation system: applying The DabR to PTZ cameras naturally offers this directness and intuitiveness. Our work in [8] describes a system where the scene is analyzed from multiple viewpoints concurrently and the results are integrated into one consistent 3D world model, which can now be visualized on The DabR. The system consists of multiple smart cameras which perform fully embedded background modeling, person tracking and activity recognition (e.g., fall detection). The world model is either a 3D model acquired by our Wägele platform, as shown in Fig. 7, or Google Earth. For users familiar with 2D PTZ surveillance, the step to The DabR's 3D navigation becomes a straightforward, natural extension.
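Virtual PTZ on a high-resolution wide-angle sensor is, in essence, cropping a movable window out of the full sensor image. The sketch below illustrates this principle with NumPy; the function and its parameters are illustrative and not part of any camera's actual API.

```python
import numpy as np

def virtual_ptz(frame, center, zoom):
    """Emulate pan/tilt/zoom on a high-resolution wide-angle image
    by cropping a window around `center` that shrinks with `zoom`."""
    h, w = frame.shape[:2]
    win_h, win_w = int(h / zoom), int(w / zoom)
    cy, cx = center
    # Clamp the window so it stays inside the sensor image
    top = min(max(cy - win_h // 2, 0), h - win_h)
    left = min(max(cx - win_w // 2, 0), w - win_w)
    return frame[top:top + win_h, left:left + win_w]
```

Because no mechanics are involved, panning is just moving `center` and zooming is just changing `zoom`, which is why the instant, direct fingertip control of The DabR maps onto it so naturally.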

Fig. 7. SmartSurv surveillance visualization: (1, 3) live smart camera view with overlaid tracking information; (2, 4) results on The DabR: live rendering of the tracked object embedded in the 3D model acquired by the Wägele platform.

5. CONCLUSION

In this paper, we have presented The DabR, a complete and robust multi-touch hardware system. As a key contribution, we have introduced a multi-touch 3D navigation approach that uses the benefit of direct interaction. This comprises a low latency GPU accelerated image processing mechanism which leads to virtually lag-free behavior. Experimental tests within informal user studies show promising results. Moreover, application studies in the field of surveillance visualization have been presented where The DabR is directly applicable. The known benefits of a multi-touch approach, coupled with a low latency user interface and intuitive navigation gestures, are a desirable combination for a truly intuitive and authentic experience. Since The DabR is a robustly working platform, formal user studies will be the most important next step. Future work also includes additional gestures and further scene interaction, including object manipulation and collaborative use.

6. REFERENCES

[1] Paul Dietz and Darren Leigh, "DiamondTouch: a multi-user touch technology," in ACM Symposium on User Interface Software and Technology (UIST 2001), 2001.

[2] Martin Kaltenbrunner, Sergi Jorda, Günter Geiger, and Marcos Alonso, "The reacTable: A collaborative musical instrument," in Workshop on Enabling Technologies (WETICE 2006), 2006.

[3] Jefferson Y. Han, "Low-cost multi-touch sensing through frustrated total internal reflection," in ACM Symposium on User Interface Software and Technology (UIST 2005), 2005.

[4] Martin Hachet, Fabrice Dècle, Sebastian Knödel, and Pascal Guitton, "Navidget for easy 3D camera positioning from 2D inputs," in IEEE Symposium on 3D User Interfaces (3DUI 2008), 2008.

[5] Y. Jung, J. Keil, J. Behr, S. Webel, M. Zöllner, T. Engelke, H. Wuest, and M. Becker, "Adapting X3D for multi-touch environments," in ACM International Symposium on 3D Web Technology (Web3D 2008), 2008.

[6] Frank Steinicke, Klaus H. Hinrichs, Johannes Schöning, and Antonio Krüger, "Multi-touching 3D data: Towards direct interaction in stereoscopic display environments coupled with mobile devices," in Advanced Visual Interfaces Workshop (AVI 2008), 2008.

[7] I. Scott MacKenzie and Colin Ware, "Lag as a determinant of human performance in interactive systems," in ACM Conference on Human Factors in Computing Systems (CHI 1993), 1993.

[8] Sven Fleck and Wolfgang Straßer, "Smart camera based monitoring system and its application to assisted living," Proceedings of the IEEE, Special Issue on Distributed Smart Cameras, vol. 96, no. 10, 2008.
