The proliferation of Internet of Things (IoT) devices has raised concerns about their vulnerability to cyberattacks, making robust security mechanisms necessary. Intrusion detection systems (IDS) play a crucial role by analyzing system activity to identify attacks and raise alerts. However, traditional IDS are poorly suited to IoT because of device resource limitations and the decentralized nature of IoT deployments. To overcome these challenges, an article published on the arXiv* server explored a smart system that uses federated learning (FL) and machine learning (ML) techniques to monitor and detect anomalies in IoT devices.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
Background
While intrusion detection systems have been extensively studied for traditional systems, only limited work has been conducted on host-based IDS (HIDS) for IoT devices. IoT devices pose unique challenges because of their limited resources and the diversity of their technologies and protocols. A HIDS analyzes data from the host system's audit and logging mechanisms to identify signs of intrusion. System call traces, which capture the sequences of system calls executed by processes, are commonly used in HIDS to detect intrusions. These traces enable the identification of behavioral patterns and abnormal activities.
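To make the idea of detecting abnormal behavior from system call traces concrete, the following is a minimal sketch of a classic sliding-window approach. It is not the method of the article discussed here: the function names, the window length of three, and the toy traces are all assumptions chosen for illustration.

```python
def ngrams(trace, n=3):
    """Return all contiguous windows of n system calls in a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def build_profile(normal_traces, n=3):
    """Collect every window observed during normal operation."""
    return {gram for trace in normal_traces for gram in ngrams(trace, n)}


def anomaly_score(trace, profile, n=3):
    """Fraction of windows in the trace that never occur in the normal profile."""
    grams = ngrams(trace, n)
    if not grams:
        return 0.0
    unseen = sum(1 for gram in grams if gram not in profile)
    return unseen / len(grams)


# Hypothetical traces; real traces would come from the host's audit logs.
normal = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
]
profile = build_profile(normal)
score_ok = anomaly_score(["open", "read", "read", "close"], profile)
score_bad = anomaly_score(["open", "execve", "socket", "connect"], profile)
```

A trace made up only of windows seen during normal operation scores 0, while a trace with entirely unfamiliar windows scores 1. A threshold between the two would have to be tuned on real data.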
Leveraging federated learning
To address privacy concerns linked to centralized methods, this research proposes the use of federated learning (FL). FL is a distributed ML technique that enables training models on data located on multiple devices without centralizing the data. In traditional ML scenarios, data from various devices are aggregated on a centralized server, posing privacy risks. FL allows cooperative training of a shared global model while ensuring data privacy, as user data is not shared with a central entity. This is particularly advantageous in IoT, where privacy is a significant concern.
Design of experiments in IoT
The experimental design focuses on developing a host-based intrusion detection system (HIDS) for IoT devices. The proposed architecture includes data extraction, feature extraction, feature selection, model training, and result evaluation. System call traces are recorded and stored as part of the normal profile. These traces, which represent system behavior, are processed using feature extraction techniques such as the trivial, vector space, and TF-IDF representations. Principal component analysis (PCA) is then employed in the feature selection stage to reduce dimensionality.
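The TF-IDF representation mentioned above can be illustrated with a short sketch. It assumes each trace is a list of integer system call numbers and uses a smoothed inverse document frequency. The function name `tfidf_vectors`, the toy traces, and the exact weighting are assumptions rather than details taken from the article, and the subsequent PCA step is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(traces):
    """Turn system call traces into TF-IDF vectors (illustrative sketch)."""
    vocab = sorted({call for trace in traces for call in trace})
    n_docs = len(traces)
    # Document frequency: number of traces containing each call at least once.
    df = Counter(call for trace in traces for call in set(trace))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {call: math.log((1 + n_docs) / (1 + df[call])) + 1.0 for call in vocab}
    vectors = []
    for trace in traces:
        counts = Counter(trace)
        total = len(trace) or 1  # guard against empty traces
        # Term frequency (relative count) scaled by the call's IDF.
        vectors.append([counts[call] / total * idf[call] for call in vocab])
    return vocab, vectors


# Hypothetical traces of system call numbers.
traces = [
    [6, 6, 63, 6, 42],
    [6, 63, 6, 63, 120],
    [6, 6, 6, 6, 6],
]
vocab, vectors = tfidf_vectors(traces)
```

Each trace becomes a fixed-length vector with one entry per distinct system call. Calls that appear in every trace get the lowest IDF, so rarer calls contribute more weight per occurrence.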
To train the model, anomalous data is generated by simulating an environment where the operating system is under intrusion. This data trains a model capable of recognizing anomalous device behavior. Two training approaches are evaluated: a centralized model and a decentralized FL-based approach. In the FL approach, IoT clients are represented as nodes, enabling distributed training without centralizing the data.
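The decentralized approach can be sketched as a loop of federated rounds: each client trains on its own data, and only the resulting model weights are sent to the server. The sketch below uses logistic regression as a stand-in local model, since the local training procedure is not detailed here. The function names, learning rate, and toy data are all hypothetical.

```python
import math


def local_update(weights, samples, lr=0.1, epochs=5):
    """Run a few epochs of logistic-regression SGD on one client's private data."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in samples:
            z = sum(wi * xi for wi, xi in zip(w, features))
            pred = 1.0 / (1.0 + math.exp(-z))
            # Gradient step on the log loss for a single sample.
            for i, xi in enumerate(features):
                w[i] -= lr * (pred - label) * xi
    return w


def federated_round(global_weights, client_datasets):
    """One round: every client trains locally, then the server averages the weights."""
    updates = [local_update(global_weights, data) for data in client_datasets]
    n = len(updates)
    return [sum(u[i] for u in updates) / n for i in range(len(global_weights))]


def predict(weights, features):
    """Return 1 (anomalous) if the linear score is positive, else 0 (benign)."""
    z = sum(wi * xi for wi, xi in zip(weights, features))
    return 1 if z > 0 else 0


# Hypothetical per-client data: ([bias term, feature], label), 1 = anomalous.
clients = [
    [([1.0, 0.1], 0), ([1.0, 0.9], 1)],
    [([1.0, 0.2], 0), ([1.0, 0.8], 1)],
    [([1.0, 0.0], 0), ([1.0, 1.0], 1)],
]
weights = [0.0, 0.0]
for _ in range(200):
    weights = federated_round(weights, clients)
```

The raw samples never leave the `clients` list entries they belong to. The server only sees the weight vectors returned by `local_update`, which is the property that motivates the federated setup.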
Empirical findings
Performance evaluation of ML algorithms using TF-IDF representation on the ADFA-LD dataset demonstrates promising results. Decision trees, random forest, K-nearest neighbors (KNN), and multi-layer perceptron (MLP) achieve high accuracy, recall, and precision. Random forest outperforms other classifiers, achieving the highest F1 score. In the FL setting, the algorithm exhibits comparable performance to the baseline, achieving approximately 96% accuracy. Weighted federated averaging (WFA) performs slightly better than federated averaging (FA) when data distribution is uneven.
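The difference between the two aggregation rules can be shown with a small sketch. It assumes that weighted federated averaging weights each client's parameters by its number of training samples, which is a common formulation but is not spelled out here. The function names and numbers are illustrative only.

```python
def federated_average(client_weights):
    """Plain averaging: every client contributes equally."""
    n = len(client_weights)
    dim = len(client_weights[0])
    return [sum(w[i] for w in client_weights) / n for i in range(dim)]


def weighted_federated_average(client_weights, sample_counts):
    """Weighted averaging: clients contribute in proportion to their sample counts."""
    total = sum(sample_counts)
    dim = len(client_weights[0])
    return [
        sum(w[i] * c for w, c in zip(client_weights, sample_counts)) / total
        for i in range(dim)
    ]


# Hypothetical model parameters from three clients with uneven data sizes.
client_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sample_counts = [100, 100, 800]

fa = federated_average(client_weights)
wfa = weighted_federated_average(client_weights, sample_counts)
```

With these numbers, plain averaging gives `[3.0, 4.0]`, while the weighted version is pulled toward the third client, which holds 80% of the data. When all clients hold the same number of samples, the two rules coincide.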
Future directions and implications
While the proposed intelligent mechanism for monitoring and detecting intrusions in IoT devices shows promise, there are several areas for future research. Incorporating deep learning techniques such as convolutional neural networks (CNN), recurrent neural networks (RNN), and long short-term memory (LSTM) can enhance the system's performance by capturing complex patterns and temporal dependencies in system call traces. These techniques are well-suited for analyzing sequence-based data and improving the system's ability to detect sophisticated attacks.
Implementing multi-class classification to differentiate between various types of attacks is another important direction for future work. Currently, the system performs binary classification, labeling each system call trace as either benign or malicious. Identifying and classifying different attack types can provide valuable insights into the nature and severity of intrusions, enabling more targeted countermeasures.
Adapting the intrusion detection system to handle the increasing scale and complexity of IoT deployments is crucial. This involves addressing challenges related to device heterogeneity, protocols, and network architectures. The system should be capable of handling diverse IoT environments and adapting to emerging threats and attack vectors.
Practical deployment and integration of the proposed HIDS in real-world IoT devices should be explored, considering resource-constrained environments. Optimizing the system for lightweight IoT devices with limited computational power, storage, and memory resources is necessary. Evaluating the system's performance and efficiency in these constrained settings ensures its practical viability and scalability.
Conclusions
The proposed smart system for monitoring and detecting anomalies in IoT devices leverages federated learning and machine learning techniques and offers a robust solution to security challenges in IoT environments. The system detects intrusions by analyzing system call traces and achieves high accuracy in classifying benign and malicious samples. By leveraging federated learning, it keeps user data on the devices and mitigates the privacy risks associated with centralized approaches.
This research contributes to the development of an autonomous trust, security, and privacy management framework for IoT devices. Incorporating deep learning techniques and multi-class classification can further enhance the system's capabilities and provide a comprehensive intrusion detection solution.
As IoT deployments continue to expand, refining and adapting the system to handle the evolving threat landscape and meet the requirements of resource-constrained IoT devices is crucial. With continued research and development, intelligent intrusion detection mechanisms can play a pivotal role in safeguarding IoT devices and ensuring the security and privacy of IoT ecosystems.