In a study published in the journal Mathematics, researchers developed a real-time sensor-based mobile user authentication approach using deep learning to enhance security in complex environments. With the widespread use of smart mobile devices, protecting user privacy has become an urgent issue.
Many existing methods, such as passwords or biometric authentication, have usability, security, and privacy limitations. To address this, the researchers proposed using motion sensors, namely the accelerometer, gyroscope, and gravity sensor, for implicit and continuous user authentication. However, extracting subtle user features from sensor data and handling noisy labels remain key challenges.
Data Collection and Preprocessing
In this study, the researchers presented a system called ST-SVD that combines the S-transform (ST) and singular value decomposition (SVD) to convert sensor time-series signals into 2D time-frequency images. This representation exposes subtle user micro-features that are not visible in the raw signals. The data collection involved 1513 volunteers using mobile devices in complex real-world environments over ten days, gathering 283 million sensor data samples.
For each user, the built-in motion sensors captured accelerometer, gyroscope, and gravity signals while the device was in use. Data collection was triggered when the screen lit up or the foreground app changed, recording sensor values for 3 seconds at 50 Hz. The International Mobile Equipment Identity (IMEI) identified each user on the server side. The raw time-series signals were preprocessed with ST-SVD to create S-matrix images: the ST produced a time-frequency representation using a Gaussian window, and SVD denoised the signals and enhanced subtle traits of individual usage patterns in the resulting images.
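The Python sketch below illustrates the general shape of this preprocessing step, using a common FFT-based formulation of the S-transform and a simple truncated-SVD denoiser. The retained rank and the use of the S-matrix magnitude are illustrative assumptions, since the paper does not publish reference code.

```python
import numpy as np

def s_transform(x):
    """FFT-based S-transform (Stockwell transform) of a 1-D signal.

    Returns a complex (n // 2, n) time-frequency matrix: rows are frequency
    bins, columns are time samples (150 for a 3 s window at 50 Hz)."""
    n = len(x)
    X = np.fft.fft(x)
    m = np.fft.fftfreq(n) * n                # signed frequency indices
    S = np.zeros((n // 2, n), dtype=complex)
    for k in range(1, n // 2 + 1):           # skip DC, whose window is undefined
        gaussian = np.exp(-2.0 * np.pi**2 * m**2 / k**2)  # frequency-domain Gaussian window
        S[k - 1] = np.fft.ifft(np.roll(X, -k) * gaussian)
    return S

def svd_denoise(S, rank=10):
    """Keep only the `rank` largest singular values of the S-matrix magnitude,
    suppressing noise while preserving the dominant usage-pattern structure."""
    U, s, Vt = np.linalg.svd(np.abs(S), full_matrices=False)
    s[rank:] = 0.0
    return U @ np.diag(s) @ Vt

# One 3 s accelerometer-axis window sampled at 50 Hz (placeholder data)
window = np.random.randn(150)
image = svd_denoise(s_transform(window))     # 2-D input image for the CNN
```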
The Model
A semi-supervised Teacher-Student tri-training algorithm reduced label noise in the training data. The dataset was split into labeled, unlabeled, and verification subsets. Three distinct classifiers were trained on bootstrap samples of the labeled data and used to relabel the unlabeled data by majority voting, as sketched below. This iterative process identified and corrected mislabeled samples, improving training accuracy.
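Here is a minimal sketch of the majority-vote relabeling step, using scikit-learn classifiers as stand-ins for the paper's unspecified base learners; the model choices and hyperparameters are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

def tri_train_relabel(X_labeled, y_labeled, X_unlabeled, seed=0):
    """Train three diverse classifiers on bootstrap samples of the labeled
    set, then relabel unlabeled points on which at least two of them agree."""
    rng = np.random.default_rng(seed)
    models = [
        RandomForestClassifier(n_estimators=100, random_state=seed),
        LogisticRegression(max_iter=1000),
        KNeighborsClassifier(n_neighbors=5),
    ]
    preds = []
    for model in models:
        idx = rng.integers(0, len(X_labeled), size=len(X_labeled))  # bootstrap sample
        model.fit(X_labeled[idx], y_labeled[idx])
        preds.append(model.predict(X_unlabeled))
    p0, p1, p2 = preds
    confident = (p0 == p1) | (p0 == p2) | (p1 == p2)   # a majority exists
    majority = np.where(p1 == p2, p1, p0)              # the agreed-upon label
    return X_unlabeled[confident], majority[confident]
```

In the iterative scheme the paper describes, the relabeled samples would then be folded back into the training pool and the loop repeated.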
The refined dataset was input to a convolutional neural network (CNN) with convolutional, pooling, and fully connected layers for feature extraction and classification. The input layer took the 2D ST-SVD matrix images. The convolutional layers extracted local features, which the pooling layers downsampled into global features. The fully connected layers flattened the features into a classification vector, which a softmax output layer converted into per-user probabilities.
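As a concrete illustration, here is a minimal PyTorch sketch of such a network; the channel counts, kernel sizes, and pooling choices are assumptions for readability, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class STSVDNet(nn.Module):
    """Minimal CNN sketch for classifying 2-D ST-SVD time-frequency images."""

    def __init__(self, n_classes, in_channels=3):  # assume one channel per sensor
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),  # local features
            nn.ReLU(),
            nn.MaxPool2d(2),                                       # downsample
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),                          # global summary
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                      # flatten into a classification vector
            nn.Linear(32 * 4 * 4, n_classes),  # softmax is applied inside the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: a batch of 8 three-channel S-matrix images (75 frequency bins x 150 samples)
logits = STSVDNet(n_classes=1513)(torch.randn(8, 3, 75, 150))
```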
Experimental Results and Evaluation
Comprehensive evaluations demonstrated the effectiveness of the proposed techniques. ST-SVD captured nuanced user behaviors better than both the standard and the generalized S-transform. The semi-supervised tri-training algorithm corrected noisy labels, raising authentication accuracy from 92.89% to 96.32%; this overall test accuracy outperformed existing state-of-the-art methods. The approach also showed strong resilience against imitation attacks.
The computational performance was also analyzed. On the server side, the average training time per user was around 48 seconds with GPU acceleration. On Android clients, battery drain, CPU usage, and memory footprint were minimal, enabling real-time on-device authentication with low latency.
Advantages Over Prior Works
Compared to prior research in sensor-based authentication, this study has several notable advantages. Many existing methods rely on expert-defined feature extraction, which cannot capture complex nonlinear relationships in sensor data; the proposed deep learning approach automatically learns feature representations tailored to the dataset. Additionally, most studies do not address label noise, which allows errors to accumulate during model training. By using semi-supervised tri-training, this approach explicitly corrects mislabeled samples to improve accuracy.
Another critical benefit of the new approach is converting time-series signals into 2D images, which suit CNN processing better than raw sequences. Hand-crafted statistical features often discard information in mixed sensor data, whereas learning directly from 2D representations preserves contextual relationships among the temporal sensor values. Moreover, the sizeable real-world dataset validates the techniques under practical conditions: many prior works were evaluated on controlled lab data, which fails to generalize to uncontrolled environments. The robust performance of this approach highlights its applicability to real-world mobile authentication scenarios.
Future Outlook
This study highlights the feasibility of leveraging motion sensors and deep learning for robust smartphone user authentication without compromising usability or privacy. The techniques can enhance mobile device security in enterprise and government settings. Future work will focus on improving the system's efficiency and generalization, for example by reducing the amount of training data required or enabling cross-device authentication. Overall, the proposed approach provides an automated and reliable means of verifying device owners in real-world conditions.
Journal reference:
- Weng, Z., Wu, S., Wang, Q., & Zhu, T. (2023). Enhancing Sensor-Based Mobile User Authentication in a Complex Environment by Deep Learning. Mathematics, 11(17), 3708. DOI: https://doi.org/10.3390/math11173708. Available at: https://www.mdpi.com/2227-7390/11/17/3708