In a recent paper published in the journal Applied Sciences, researchers explored advancements in pupil tracking using event cameras, also known as neuromorphic cameras. The study applies classical machine-learning-based computer vision techniques to remote pupil tracking.
Background
Pupil tracking is an essential task in human-computer interaction, virtual reality (VR), computer vision, and augmented reality (AR) systems. It enables applications such as gaze estimation, attention tracking, biometric identification, and autostereoscopic 3D displays. It is also used in psychology and medicine to diagnose conditions such as stress by studying eye movements and related body signals. Extensive research exists on head-mounted eye-pupil tracking, designed primarily for wearable devices. Remote eye tracking, by contrast, is a non-intrusive technology that provides insights into users' visual attention and cognitive processes, and advances in remote methods have enabled practical, non-intrusive implementations.

Event cameras, known for their unique capabilities in dynamic vision tasks, asynchronously capture pixel-level intensity changes triggered by scene alterations. They excel at capturing fast eye movements and subtle motions that traditional frame-based systems often miss. By recording both positive and negative intensity changes at each pixel, event cameras represent dynamic scenes concisely, with minimal redundancy and motion blur.
Advancing pupil tracking with event camera technology
In prior work, the authors established an effective face-centric eye-tracking method that employs 11-point eye-nose shape tracking based on the supervised descent method (SDM). The current work extends this machine-learning-based approach to event camera imaging. The proposed pupil-tracking algorithm encompasses eye-nose detection, feature extraction, and alignment methods customized for event camera data.
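SDM-based alignment works as a cascade of learned descent steps: starting from an initial landmark guess, each stage extracts local appearance features around the current landmark estimates and applies a linear regressor to nudge the shape toward the ground truth. The Python sketch below illustrates this generic SDM update; the function names, the flattened 11-landmark parameterization, and the regressor format are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sdm_align(image, x0, regressors, extract_features):
    """Run a learned SDM cascade from an initial landmark guess.

    x0               -- initial 11-point eye-nose shape, flattened to
                        shape (22,) as interleaved (x, y) coordinates,
                        e.g. a mean shape placed in the detected
                        eye-nose box
    regressors       -- list of (R_k, b_k) pairs learned offline by
                        regressing ground-truth shape increments onto
                        local appearance features
    extract_features -- callable returning a 1-D feature vector, e.g.
                        SIFT descriptors sampled around each landmark
    """
    x = x0.copy()
    for R_k, b_k in regressors:
        phi = extract_features(image, x)  # appearance at current shape
        x = x + R_k @ phi + b_k           # learned descent step
    return x
```

Because the features used in the paper are SIFT descriptors, each descent step reduces to inexpensive linear algebra once the descriptors are computed, which is one source of the method's speed.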
By combining principles from conventional frame-based eye tracking with their prior research on bare-face eye tracking, the authors aim to unlock the capabilities of event camera imaging for more efficient pupil tracking. This approach promises to advance eye-tracking technologies for practical, real-world applications.
Event cameras instantly register pixel intensity changes induced by scene alterations, yielding high temporal resolution and minimal latency. Unlike frame-based cameras, they produce events in real time, offer a concise representation of dynamic scenes, and consume significantly less power. Event camera data, characterized by precise timestamps, intensity changes, and pixel coordinates, is well suited to capturing the swift, subtle eye movements crucial for accurate tracking.
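Concretely, an event stream can be modeled as a sequence of timestamped tuples. The minimal Python sketch below uses illustrative field names; DAVIS-family cameras report microsecond-resolution timestamps.

```python
from dataclasses import dataclass

@dataclass
class Event:
    t: int         # timestamp in microseconds
    x: int         # pixel column (0..345 on a 346 x 260 DAVIS 346)
    y: int         # pixel row (0..259)
    polarity: int  # +1 for a brightness increase, -1 for a decrease
```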
In their research, the authors employed a 346 × 260 pixel event camera (DAVIS 346) with an active pixel frame sensor, positioned 50 to 100 cm from the user's face. The setup is tailored to autostereoscopic 3D PC displays and AR 3D head-up display (HUD) systems, enabling non-intrusive, user-friendly eye-tracking solutions for diverse real-world scenarios. The DAVIS 346 gives the authors the tools needed to explore event camera imaging for more advanced pupil tracking.
The proposed pupil-tracking algorithm first creates event frames by accumulating events over 33 ms intervals, translating the asynchronous stream into a familiar visual format. The pipeline comprises eye-nose region detection, eye center position tracking, and a tracking checker that monitors tracking status so tracking can be maintained or quickly recovered. Eye-nose region detection uses cascaded AdaBoost classifiers with multi-block local binary patterns (LBPs), optimizing CPU efficiency, while eye center position tracking employs the SDM with scale-invariant feature transform (SIFT) features for accurate, real-time pupil localization. The result is a comprehensive, efficient machine-learning-based computer vision alternative with speed advantages over convolutional neural network (CNN)-based algorithms.
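The event-frame step can be pictured as follows: events within each 33 ms window are drawn onto a blank canvas, with positive and negative polarities rendered as distinct gray levels, yielding frame-like images that standard detectors can consume. This Python sketch assumes the Event record shown earlier; the specific rendering (white/black on neutral gray) is an illustrative choice, not necessarily the paper's exact scheme.

```python
import numpy as np

FRAME_US = 33_000  # 33 ms accumulation window, as described in the paper

def make_event_frames(events, width=346, height=260):
    """Accumulate a time-sorted stream of Event records into event frames.

    Yields one uint8 image per 33 ms window, with +1 events drawn white,
    -1 events drawn black, and untouched pixels left neutral gray.
    """
    frame = np.full((height, width), 128, dtype=np.uint8)
    window_end = None
    for ev in events:
        if window_end is None:
            window_end = ev.t + FRAME_US
        elif ev.t >= window_end:
            yield frame
            frame = np.full((height, width), 128, dtype=np.uint8)
            window_end = ev.t + FRAME_US
        frame[ev.y, ev.x] = 255 if ev.polarity > 0 else 0
    if window_end is not None:
        yield frame
```

Once the stream is rendered this way, conventional frame-based machinery such as the cascaded AdaBoost detector and SDM aligner can operate on each event frame directly.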
The authors prepared a specialized event camera image database captured with the DAVIS 346. This dataset plays a pivotal role in training both the eye-nose region detector and the aligner. By including distinct motion categories in the training data, the authors enable the algorithms to adapt to varying motion levels and the diverse eye shapes encountered in real-world scenarios.
Notably, their event-camera-based pupil-tracking method excels at capturing rapid eye movements, which is challenging for traditional RGB-frame-based systems. The asynchronous operation and high temporal resolution of event cameras enable precise tracking of fast eye movements, a significant advantage over conventional frame-based cameras.
Results and analysis
The evaluation of event-camera-based pupil tracking involved extensive experiments on a diverse dataset spanning various eye movement scenarios and lighting conditions. Compared with previous frame-based eye-tracking algorithms, the proposed work highlighted the potential of event camera imaging to significantly enhance tracking accuracy, especially during rapid eye movements.
The algorithm employed cascaded AdaBoost classifiers with multi-block LBPs for eye-nose detection and an SDM-based 11-point eye-nose alignment technique integrating SIFT features for pupil localization. Tracking accuracy was evaluated by comparing detected pupil centers with ground-truth positions, using the inter-pupil distance (IPD) as a reference. Under this evaluation, detection accuracy reached 98.1 percent and tracking accuracy 80.9 percent. The training and testing dataset was constructed to ensure adaptability to real-world scenarios with varying motion levels and lighting conditions, with particular emphasis on accurate pupil tracking during rapid eye movements.
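A common way to read "IPD as a reference" is to normalize the pixel error of each predicted pupil center by the ground-truth inter-pupil distance, then count a frame as correct when the normalized error falls below a threshold. The Python sketch below illustrates that convention under stated assumptions; the paper's exact threshold and averaging scheme are not restated here.

```python
import numpy as np

def ipd_normalized_error(pred, gt):
    """Mean pupil localization error normalized by the ground-truth IPD.

    pred, gt -- arrays of shape (2, 2): [[lx, ly], [rx, ry]] giving the
    predicted and ground-truth left/right pupil centers in pixels.
    """
    ipd = np.linalg.norm(gt[0] - gt[1])          # inter-pupil distance
    per_eye = np.linalg.norm(pred - gt, axis=1)  # per-eye pixel error
    return float(per_eye.mean() / ipd)

# A frame might then count as correctly tracked when, say,
# ipd_normalized_error(pred, gt) < 0.05 -- this threshold is purely
# illustrative, not taken from the paper.
```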
Conclusion
In summary, the event-camera-based pupil-tracking algorithm showed promising results in the detection and tracking of rapid eye movements in real time. Challenges remain, especially with subtle movements and occluded eyes. Future research may expand datasets and explore new machine-learning approaches to enhance performance across diverse eye movement scenarios and lighting conditions.