Combining RFID technology with cutting-edge AI, this revolutionary smart mask overcomes the barriers of face masks, paving the way for enhanced communication and next-gen hearing aids.
A visual illustration of lip-reading: (a) vowels, (b) consonants, (c) words. Research: Artificial intelligence enabled smart mask for speech recognition for future hearing devices
In a paper published in the journal Scientific Reports, researchers proposed a radio frequency identification (RFID)-based smart mask for lip-reading that addresses the limitations of camera and wearable technologies.
The system used RFID technology for RF sensing-based lip-reading, enabling speech recognition under face masks. This approach leverages passive RFID tags for low-cost and privacy-preserving applications, offering a unique advantage over vision-based systems that struggle with occlusion and privacy concerns.
A dataset of vowels, consonants, and words was collected and fed into machine learning (ML) models for classification. The study used several ML models, fine-tuned through hyperparameter optimization, to ensure robust performance. Of these, the Random Forest (RF) model achieved the highest accuracy on the combined RFID dataset.
Related Work
Past work on RFID-based lip reading has faced challenges such as signal interference, limited accuracy in detecting subtle lip movements, and difficulty handling multiple users simultaneously. Additionally, these systems often struggle with localized identification in dynamic environments and require significant computational resources for real-time applications. Privacy concerns and discomfort associated with wearing RFID-enabled devices also pose challenges for widespread adoption.
RFID-Based Smart Mask System
The passive ultra-high-frequency (UHF) RFID tag used in the proposed smart mask is a flexible, low-profile textile laundry tag. It features a copper dipole antenna and an Impinj Monza R6P integrated circuit (IC) that complies with Gen2 standards. This tag is engineered with a lumped-element model to optimize frequency response and enhance impedance matching with the RFID reader, ensuring reliable operation within the UHF range of 868–920 MHz. The tag measures 58 mm × 15 mm × 1.5 mm and is designed for versatile attachment methods.
Characterization tests measured the tag's forward and backscatter signals across frequencies from 800 to 1000 MHz. The tag exhibited a read sensitivity of up to −22.1 dBm when used with a dipole antenna, and its autotune technology maintained consistent performance across varying dielectric materials. This autotune feature supports operation in diverse environmental conditions, a key consideration for integration into Internet of Things (IoT) applications. However, the limited read range of passive RFID tags, up to 6.5 meters in dry conditions, remains a constraint.
(a) Linearised RF model of the tag: (i) tag chip lumped-element model; (ii) tag antenna lumped-element model. (b) Experimental setup for tag measurements, using the Tagformance Pro device. (c) Analysed power on tag forward and backscatter signal at 800–1000 MHz with multiple transmit-power levels for both the dry and wet tag. (d) Read range measurements of the tag in both dry and wet conditions.
For the data collection phase, lip-reading data were gathered with the RFID-based smart mask. The RFID tag was stitched onto disposable face masks, and several mask variants of different colors and thicknesses were tested to validate the system.
Participants were positioned 0.50 meters away from the RFID reader, with a time limit of 4 seconds for each activity. A total of 2,800 data samples were collected from four participants (two males and two females), aged 16 to 50 years, ensuring a diverse representation of speech patterns. These data were divided into training and testing subsets, with 80% of the data used for training and 20% for testing.
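The 80/20 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_samples`, the fixed seed, and the placeholder sample indices are assumptions made for this example. The article does not state how the samples were shuffled or whether the split was stratified by class or participant.

```python
import random

def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it into train and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Placeholder indices standing in for the 2,800 recorded samples.
train, test = split_samples(range(2800))
print(len(train), len(test))  # 2240 560
```

With 2,800 samples, this yields 2,240 training samples and 560 test samples.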
The collected received signal strength indicator (RSSI) data was pre-processed and analyzed using ML models, including RF, k-nearest neighbors (KNN), and support vector machine (SVM). Various features such as mean, median, standard deviation, and moments like skewness and kurtosis were extracted from the data. These features provided robust descriptors for characterizing lip movement patterns.
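The statistical features named above could be computed from a single RSSI recording roughly as shown below. This is a sketch under stated assumptions, not the authors' pipeline: the function name `rssi_features`, the sample values, the use of the population standard deviation, and the choice of excess kurtosis are all illustrative, since the article does not specify these conventions.

```python
import statistics

def rssi_features(rssi):
    """Return summary statistics for one RSSI time series (values in dBm)."""
    n = len(rssi)
    mean = statistics.fmean(rssi)
    median = statistics.median(rssi)
    # Population standard deviation (divides by n); an assumed convention.
    std = statistics.pstdev(rssi)
    # Third and fourth central moments, standardised by the standard deviation.
    m3 = sum((x - mean) ** 3 for x in rssi) / n
    m4 = sum((x - mean) ** 4 for x in rssi) / n
    skewness = m3 / std ** 3 if std > 0 else 0.0
    kurtosis = m4 / std ** 4 - 3.0 if std > 0 else 0.0  # excess kurtosis
    return {
        "mean": mean,
        "median": median,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }

# Hypothetical RSSI readings (dBm) from one 4-second recording.
window = [-52.0, -51.5, -53.0, -55.5, -54.0, -52.5, -51.0, -52.0]
print(rssi_features(window))
```

Each recording is reduced to a short feature vector, which is what a classifier would then receive as input.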
The RF model is a meta-estimator that averages the predictions of many decision trees to limit overfitting, while the KNN model's performance depends on selecting a suitable value of k. The SVM's gamma and C parameters balance margin maximization against classification accuracy. These models were trained and tested on datasets corresponding to vowels, consonants, and words to differentiate and classify lip-reading patterns.
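To make the role of k concrete, here is a small pure-Python sketch of a k-nearest-neighbors classifier together with one simple way of picking k: trying several candidates and keeping the one that scores best on held-out data. The function names, the candidate values, and the toy two-dimensional feature vectors are assumptions for illustration; the article does not describe how the study selected k, and the RF and SVM models are not shown here.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def choose_k(train_X, train_y, val_X, val_y, candidates=(1, 3, 5)):
    """Return the candidate k with the highest accuracy on a validation set."""
    def accuracy(k):
        hits = sum(knn_predict(train_X, train_y, x, k) == y
                   for x, y in zip(val_X, val_y))
        return hits / len(val_y)
    # On ties, max() keeps the first (smallest) candidate.
    return max(candidates, key=accuracy)

# Toy feature vectors for two hypothetical classes, "A" and "E".
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["A", "A", "A", "E", "E", "E"]
val_X = [(0.05, 0.05), (1.05, 1.0)]
val_y = ["A", "E"]

best_k = choose_k(train_X, train_y, val_X, val_y)
print(best_k, knn_predict(train_X, train_y, (0.15, 0.1), best_k))
```

In practice, k is often chosen by cross-validation rather than a single validation split; the single split is used here only to keep the example short.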
RFID Lip-Reading Framework
This study introduced an RFID-based smart mask for lip-reading recognition and compared the performance of several ML models: RF, KNN, and SVM with a radial basis function (RBF) kernel. The framework relied on RSSI signal variations caused by lip movements, converting them into distinguishable features for classification.
The results revealed that the RF algorithm achieved the highest accuracy across all datasets, with an 80% classification accuracy for the combined vowels dataset, 89.5% for consonants, and 93.0% for words. The confusion matrix showed that while all three models could classify the 14 classes, RF outperformed the others. However, minor misclassifications were noted, particularly between similar phonemes, such as 'U' and 'F,' and these limitations were more pronounced in certain models like KNN.
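As a reminder of how an overall accuracy figure relates to a confusion matrix, the sketch below builds a matrix from true and predicted labels and divides the diagonal (correct predictions) by the total count. The labels and predictions are made up for illustration and are not results from the study; the function names are likewise hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Made-up example: one 'U' sample is misclassified as 'F'.
classes = ["U", "F", "A"]
true_labels = ["U", "U", "F", "F", "A", "A"]
predicted = ["U", "F", "F", "F", "A", "A"]

cm = confusion_matrix(true_labels, predicted, classes)
print(cm)                    # [[1, 1, 0], [0, 2, 0], [0, 0, 2]]
print(overall_accuracy(cm))  # 5 correct out of 6
```

Off-diagonal entries, such as the 'U' row and 'F' column here, are where confusions between similar classes show up.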
The RFID-based lip-reading framework used RSSI signals from an RFID reader to detect lip movements, offering a way to improve communication in contexts such as coronavirus disease 2019 (COVID-19), where masks obscure the lips and facial expressions. The system reached a maximum classification accuracy of 93.0%, obtained on the words dataset.
The study highlighted the RFID technology's potential in standalone devices or as a complementary tool for hearing aids, addressing challenges like visual obstructions caused by mask-wearing. The integration of RFID into multimodal hearing aids could transform communication aids by combining audio and RF-based lip detection.
While the RFID smart mask improves lip detection and addresses privacy concerns, it presents practical challenges for real-world deployment. The cost of RFID-enabled masks and the required infrastructure, including RFID readers and antennas, can be significant. Furthermore, scalability is a concern, as the system must accommodate diverse users with varying speech patterns. Real-time applications may require higher computational efficiency and optimization of data transmission protocols. Integrating RFID with existing hearing aids also requires seamless synchronization to ensure reliable performance. Future work will focus on refining the system for real-time applications and addressing these challenges to enhance communication for individuals with hearing impairments.
Conclusion
This paper presented a contactless, privacy-preserving lip-reading recognition framework using a passive RFID tag embedded in a wearable mask. ML models, including RF, KNN, and SVM with an RBF kernel, were applied to RSSI data, achieving a maximum of 93.0% classification accuracy for words and an overall accuracy of 80% across all 14 classes. Future work aims to develop a real-time system for broader word recognition and enhance user accessibility, particularly for individuals who are deaf or blind.
Journal reference:
- Hameed, H., Usman, M., Kazim, J. U., Assaleh, K., Arshad, K., Hussain, A., Imran, M., & Abbasi, Q. H. (2024). Artificial intelligence enabled smart mask for speech recognition for future hearing devices. Scientific Reports, 14(1), 1-11. DOI: 10.1038/s41598-024-81904-y, https://www.nature.com/articles/s41598-024-81904-y