Real-time camera-based 3D hand tracking
Hand tracking has applications in many fields, for example navigation in virtual environments, virtual prototyping, gesture recognition, and motion capture. The goal of this project is to track the global position and all finger joint angles of a human hand in real time.
Due to measurement noise, occlusion, cluttered background, inappropriate illumination, high dimensionality, and real-time constraints, hand-tracking is a scientific challenge.
We use multiple cameras to capture images of the hand from different directions. Cues such as skin segmentation, edge detection, skin texture, and the previous hand position can be used to extract the 2D shapes of the hand in the images. We use dimension reduction techniques to cope with the high complexity of the tracking problem (the hand has about 21 local degrees of freedom (DOFs) and 6 global DOFs).
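To make the idea of dimension reduction concrete, the following is a minimal sketch, not the method used in this project. It assumes a simple linear technique (the first principal axis, found by power iteration) applied to synthetic joint-angle data in which all 21 local DOFs are driven by one hypothetical "grasp" parameter. All function names and the training data are illustrative assumptions.

```python
import math

LOCAL_DOFS = 21   # finger joint angles, as stated in the text
GLOBAL_DOFS = 6   # global position and orientation, as stated in the text


def mean_vector(samples):
    """Component-wise mean of a list of equally long vectors."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]


def first_principal_axis(samples, iterations=50):
    """Return (mean, unit axis) of largest variance via power iteration."""
    mean = mean_vector(samples)
    centered = [[x - m for x, m in zip(s, mean)] for s in samples]
    dim = len(mean)
    axis = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # Apply the (unnormalized) covariance matrix without forming it:
        # C v = sum_k (c_k . v) c_k
        nxt = [0.0] * dim
        for c in centered:
            dot = sum(a * b for a, b in zip(c, axis))
            for i in range(dim):
                nxt[i] += dot * c[i]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        axis = [v / norm for v in nxt]
    return mean, axis


def project(pose, mean, axis):
    """Reduce a 21-DOF pose to a single coefficient."""
    return sum((p - m) * a for p, m, a in zip(pose, mean, axis))


def reconstruct(coeff, mean, axis):
    """Map the coefficient back to a full 21-DOF pose."""
    return [m + coeff * a for m, a in zip(mean, axis)]


# Synthetic training poses (assumption): every joint angle scales with one
# grasp parameter t in [0, 1], so the data lies on a 1D line in 21D space.
weights = [(i + 1) / LOCAL_DOFS for i in range(LOCAL_DOFS)]
training = [[(t / 10.0) * w for w in weights] for t in range(11)]
mean, axis = first_principal_axis(training)

pose = [0.35 * w for w in weights]
coeff = project(pose, mean, axis)
restored = reconstruct(coeff, mean, axis)
```

In this toy setting a tracker would search over one coefficient plus the 6 global DOFs instead of 27 parameters. Real hand motion is not exactly linear, so a practical system would likely keep several components or use a nonlinear technique.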
The following figure shows the overall architecture of our system.
Poster
This poster illustrates the main steps of the tracking algorithm.
Publications
- Continuous Edge Gradient-Based Template Matching for Articulated Objects. VISAPP 2009 (International Conference on Computer Vision Theory and Applications), Lisbon, Portugal, February 2009
- Segmentation of Distinct Homogeneous Color Regions in Images. CAIP 2007 (The 12th International Conference on Computer Analysis of Images and Patterns), Vienna University of Technology, August 2007
- More publications can be found on our list of publications.
Position Papers
- Real-Time Hand Tracking for Natural and Direct Interaction. Whole Body Interaction Workshop, part of the 28th Annual CHI Conference on Human Factors in Computer Systems, Atlanta, Georgia, USA, April 2010
Results
Continuous Edge Gradient-Based Template Matching
Videos
Note:
- At each frame, the values in the combined confidence maps are scaled to the full gray scale range for better visualization.
- The homogeneous black regions in the confidence map generated by the chamfer distance result from the edge-pixel-to-edge-pixel distance limit of 20: at every pixel in these black regions, the maximum distance is reached.
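The two notes above can be illustrated with a small sketch. It assumes a two-pass city-block distance transform (the text does not say which distance metric is used) capped at the limit of 20, and a plain min-max scaling to the 0..255 gray range. The function names are hypothetical.

```python
MAX_DIST = 20  # distance limit named in the note above


def truncated_distance_transform(edges, limit=MAX_DIST):
    """City-block distance to the nearest edge pixel, capped at `limit`.

    `edges` is a 2D list of 0/1 values, where 1 marks an edge pixel.
    """
    h, w = len(edges), len(edges[0])
    dist = [[0 if edges[y][x] else limit for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top-left.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom-right.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def scale_to_gray(values, lo=0, hi=255):
    """Min-max scale a 2D map to the full gray range for visualization."""
    flat = [v for row in values for v in row]
    vmin, vmax = min(flat), max(flat)
    if vmax == vmin:
        return [[lo for _ in row] for row in values]
    scale = (hi - lo) / (vmax - vmin)
    return [[round(lo + (v - vmin) * scale) for v in row] for row in values]


# One image row with a single edge pixel at x = 0.
edges = [[1] + [0] * 29]
dist = truncated_distance_transform(edges)
# Higher confidence means closer to an edge.
confidence = [[MAX_DIST - d for d in row] for row in dist]
gray = scale_to_gray(confidence)
```

Every pixel at least 20 steps away from the edge receives the same capped distance and therefore the same gray value of 0, which is how a homogeneous black region arises in this simplified setting.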
The videos demonstrate that our approach generates fewer and far more pronounced maxima (candidate hand positions), which makes finding the true hand position considerably easier.
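As a rough illustration of how candidate hand positions could be read off a confidence map, the sketch below collects strict local maxima above a threshold in an 8-neighborhood. This is an assumed, simplified procedure and not necessarily the one used in this project. The function name and the threshold parameter are hypothetical.

```python
def local_maxima(conf, min_value):
    """Return (x, y, value) for strict 8-neighborhood maxima >= min_value.

    Results are sorted by descending confidence. Plateaus (equal
    neighboring values) are not reported, since the test is strict.
    """
    h, w = len(conf), len(conf[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = conf[y][x]
            if v < min_value:
                continue
            neighbors = [
                conf[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(v > n for n in neighbors):
                peaks.append((x, y, v))
    return sorted(peaks, key=lambda p: -p[2])


# Toy confidence map with one strong and one weak peak.
toy_map = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 0, 0, 5, 0],
    [0, 0, 0, 0, 0],
]
candidates = local_maxima(toy_map, min_value=4)
```

The fewer and more pronounced the maxima in a map, the shorter this candidate list becomes, and the fewer positions need to be verified afterwards.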
Skin Segmentation
Original Image | Jones and Rehg | Our Approach
Video
The video below shows the segmentation algorithm in action. The camera is positioned at the rear right. On the screen, the left window shows the image captured by the camera, and the right window shows the segmentation result of our algorithm.
- DivX 2.2MB, Windows/Linux
- MOV 3.5MB, MacOS


