In a paper published in the journal Scientific Reports, researchers presented a solution to the limitations of computer vision-driven table tennis ball detection. Their method determines ball landing points in real time while minimizing reliance on visual acquisition equipment, improving both processing speed and accuracy through dynamic color thresholding, target area filtering, keyframe extraction, and trajectory-based detection algorithms.
Tests performed on the Jetson Nano development board demonstrated strong performance, with high detection speed and a substantial accuracy rate in identifying landing-point frames. The method shows particular promise for real-time detection of table tennis ball drop points in environments with lower frame rates and limited visual acquisition devices.
Background
In sports like table tennis, capturing accurate ball landing information is crucial for refereeing using video-based slow-motion playback. Previous methods heavily relied on complex setups like multi-vision or high frame rate cameras, neural networks, and extensive data labeling. However, these approaches faced challenges in practical implementation due to equipment dependency, slow processing, and substantial data requirements.
Detection Methodologies for Table Tennis
In the present study, researchers explore the methodologies used to detect table tennis balls' landing points. The detection process focuses on mitigating interference from additional balls and noise within video sequences captured by a monocular camera under natural lighting conditions. Leveraging the Hue, Saturation, and Value (HSV) color model, a color gamut method isolates the table tennis ball by defining a color threshold interval. The dynamic color threshold method adjusts the HSV thresholds for each round, countering dynamic illumination variations and ensuring consistent, accurate ball detection.
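The HSV thresholding idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the threshold values are made-up placeholders for an orange ball, and the per-round adjustment shown here (re-centering the Value interval on the frame's mean brightness) is one plausible way to realize "dynamic" thresholds, which the article does not specify.

```python
import numpy as np

def hsv_mask(hsv_frame, h_range, s_range, v_range):
    """Boolean mask of pixels whose H, S, V values all fall inside
    the given closed intervals. hsv_frame: (H, W, 3) array."""
    h, s, v = hsv_frame[..., 0], hsv_frame[..., 1], hsv_frame[..., 2]
    return ((h_range[0] <= h) & (h <= h_range[1]) &
            (s_range[0] <= s) & (s <= s_range[1]) &
            (v_range[0] <= v) & (v <= v_range[1]))

def dynamic_v_range(hsv_frame, spread=40):
    """Hypothetical per-round adjustment: center the V interval on the
    frame's mean brightness so the mask tolerates lighting changes."""
    mean_v = float(hsv_frame[..., 2].mean())
    return (max(0.0, mean_v - spread), min(255.0, mean_v + spread))
```

In a real pipeline the frame would first be converted from BGR to HSV (e.g. with OpenCV's `cv2.cvtColor`), and `hsv_mask` plays the role of `cv2.inRange`.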
Moreover, noise-prone areas, excluding the table, are masked within the video frames to enhance accuracy. A strategic exclusion of unrelated objects, guided by contour area measurements, ensures the retention of valid table tennis ball contours. Subsequently, keyframe extraction optimizes computation by pinpointing frames relevant to the ball's trajectory, mainly focusing on its landing phase.
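The contour-area filter and keyframe selection described above can be sketched in a few lines. The area bounds and the table line below are illustrative assumptions; the article reports the techniques but not their parameter values, and the candidate/track tuple formats are hypothetical.

```python
def filter_by_area(candidates, min_area=20.0, max_area=400.0):
    """Keep only contours whose area is plausible for a table tennis
    ball, discarding small noise blobs and large unrelated objects.
    candidates: list of (cx, cy, area) tuples for one frame."""
    return [c for c in candidates if min_area <= c[2] <= max_area]

def extract_keyframes(track, table_y):
    """Keep only frames where the ball has dropped below table_y in
    image coordinates (y grows downward), i.e. frames near the
    landing phase. track: list of (frame_idx, cx, cy) tuples."""
    return [p for p in track if p[2] >= table_y]
```

Restricting later processing to these keyframes is what lets the method skip most of the video while still covering the landing phase.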
The subsequent segment discusses the four-frame differential slope method, which capitalizes on the spatiotemporal correlation of table tennis ball movements. This method identifies the landing coordinates by scrutinizing the directional changes in the ball's motion trajectory. Furthermore, a polygon area determination algorithm divides the table into zones. It locates the landing point within these delineated areas, supporting accurate data representation for post-match analysis and refereeing purposes. By integrating dynamic color gamut methods with interference and noise exclusion techniques, alongside keyframe extraction and trajectory analysis methodologies, this approach offers a comprehensive solution for precise table tennis ball landing point detection.
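The core of the two trajectory steps can be sketched as follows, under stated assumptions: the landing frame is taken to be where the vertical slope of the ball's path flips from descending to ascending across a four-frame window (the paper's exact slope computation may differ), and the zone lookup uses a standard ray-casting point-in-polygon test as one way to realize polygon area determination.

```python
def find_landing_frame(track):
    """Scan windows of four consecutive ball centres; report the frame
    where vertical motion flips from falling to rebounding (image y
    grows downward). track: list of (frame_idx, cx, cy) tuples."""
    for i in range(len(track) - 3):
        (_, _, y1), (_, _, y2), (f3, x3, y3), (_, _, y4) = track[i:i + 4]
        falling = (y2 - y1) > 0 and (y3 - y2) > 0
        rebounding = (y4 - y3) < 0
        if falling and rebounding:
            return f3, x3, y3  # lowest point of the window ~ landing
    return None

def point_in_polygon(pt, poly):
    """Ray-casting test: is pt inside the polygon (list of vertices)?"""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

Running `point_in_polygon` against each table zone in turn then maps the detected landing coordinates to a zone label for post-match analysis.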
Experimental Evaluation of Table Tennis
The effectiveness of the landing point detection and landing area determination methods was validated on human–machine sparring table tennis ball datasets captured in real scenarios. The experimental platform featured a Jetson Nano, built around an NVIDIA Tegra X1 system-on-chip with a quad-core Arm Cortex-A57 (ARM Cortex-A57) CPU and running a Linux operating system, combined with a distortion-free industrial camera operating at 60 fps with a resolution of 720×480. The camera was mounted 166 cm above the ground and positioned 75 cm from the table's edge.
Analyzing the detection results of various algorithms revealed nuanced performance variations. The dynamic color threshold method and the Visual Background Extractor (ViBe) algorithm each showed distinct strengths and weaknesses in speed, accuracy, and adaptability to hardware configurations.
While You Only Look Once version 5 (YOLOv5) exhibited superior accuracy on high-performance graphics processors, its slower detection speed on lower-performance development boards posed challenges for real-time small target detection. The study highlighted the dynamic color threshold method's efficiency in detection speed and accuracy, especially on low frame rate vision acquisition devices and development boards.
Researchers also investigated the relationship between camera frame rate and the accuracy of detecting table tennis balls. Results indicated that a 60fps camera provided optimal conditions for accurate detection, with significant improvements observed compared to 30fps or 120fps setups.
Researchers also examined how interfering balls and noise within the video frames affected detection. Applying the noise exclusion techniques substantially improved landing point detection accuracy by removing interference- and noise-related detection errors. Moreover, the keyframe extraction method proved efficient, significantly reducing the number of video frames to be processed while maintaining a high accuracy rate in identifying the frames relevant to table tennis ball analysis.
Further evaluation of landing point detection methods, including trajectory fitting, Table Tennis Network (TTNet), and the four-frame differential slope method, highlighted each method's computational efficiency, accuracy, and suitability for different hardware configurations. The four-frame differential slope method emerged as a promising approach due to its computational speed and effectiveness, particularly on low-frame rate vision acquisition devices and low-computational development boards.
Conclusion
To sum up, this paper introduces a table tennis landing point detection algorithm based on spatial domain information. It incorporates target area threshold exclusion and spatial Euclidean distance checks to avoid false detections, applies keyframe extraction to improve computing efficiency, and uses the four-frame differential slope method for accurate landing point detection, achieving a 78.5% accuracy rate.
Future work will address current limitations by employing neural network algorithms for fast small-target detection, integrating global and local methods for enhanced precision, and excluding interfering objects based on differences in movement speed and joint spatial intersection ratios. Additionally, the authors plan to refine the detection accuracy of table tennis balls and their landing points using a monocular camera without full 3-dimensional reconstruction.