Detecting Advanced Persistent Threats with Machine Learning

In the modern era, nations confront many electronic threats spanning various sectors, including private corporations and public institutions. Among these, advanced persistent threats (APTs) represent a prominent menace. APTs are meticulously crafted, stealthy network attacks designed to gain unauthorized access and remain undetected for prolonged periods. In a recent paper published in the journal Sustainability, researchers addressed APTs through a multi-stage, time-series-based automated detection framework, an approach that distinguishes their work from previous models.

Study: Detecting Advanced Persistent Threats with Machine Learning. Image credit: ozrimoz/Shutterstock

Background

Understanding network vulnerabilities is crucial for effective cyberattack detection and mitigation. Teams must grasp the motives of the attackers and target information to devise appropriate responses. Among cyber threats, APTs stand out. APTs are prolonged, targeted cyberattacks seeking unauthorized access and operating covertly. They prioritize data theft over immediate damage, making them perilous for national defense, industry, and finance.

APT groups employ advanced techniques, exploiting zero-day vulnerabilities and social engineering methods. They adapt their code and use concealment tactics to evade detection. APT attacks often have state or criminal backing, aiming for competitive advantage or financial gain. Detecting APT attacks is challenging because they reveal themselves through behavioral patterns rather than readily identifiable signatures. Although APT attacks are hard to spot, data exfiltration can serve as evidence: cybersecurity professionals treat anomalies in outgoing data as indicators of a network breach. In smart cities and ad hoc architectures, securing data transmission and information becomes paramount.
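The idea of treating unusual outgoing data as a breach indicator can be illustrated with a minimal sketch. It assumes a simple z-score test against a baseline of outbound volume per time window; the function name, the threshold of 3.0, and the traffic figures are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev

def flag_outbound_anomalies(baseline, observed, z_threshold=3.0):
    """Return indices of observed windows whose outbound volume deviates
    from the baseline mean by more than z_threshold standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: treat any change as a deviation.
        return [i for i, v in enumerate(observed) if v != mu]
    return [i for i, v in enumerate(observed)
            if abs(v - mu) / sigma > z_threshold]

# Hypothetical outbound megabytes per hour (illustrative values only).
baseline_mb = [12.0, 15.0, 11.0, 14.0, 13.0, 12.0, 16.0, 14.0]
observed_mb = [13.0, 15.0, 240.0, 12.0]

suspicious = flag_outbound_anomalies(baseline_mb, observed_mb)
print(suspicious)  # indices of windows worth a closer look
```

A real deployment would likely need per-host baselines and seasonality handling, since a patient attacker can exfiltrate slowly enough to stay under a single global threshold.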

To build a secure digital content delivery system in smart cities, this study proposes a dedicated system using secure architecture and machine learning analytics. Continuous performance monitoring is essential for societal acceptance. While AI adoption in the commercial industry is slow, it has made strides in scientific services.

Machine learning approaches for APT detection

Previous studies on APT detection have revealed shortcomings, including the inability to identify real-time attacks and a significant occurrence of false alarms. APT attacks employ sophisticated techniques, challenging traditional intrusion detection systems (IDS) relying on fixed signatures or identifiers. Machine learning (ML) methods have emerged as promising for APT detection.

A literature review reveals that many ML approaches combine features such as system logs, network traffic data, and user behavior to train models. Previous studies employed different ML techniques for APT attacks, including support vector machines (SVM), combined SVM and genetic algorithms, combined bee colony and SVM, and combined K-NN, naïve Bayes, and one-class SVM. These algorithms were applied to two public datasets for network-based IDSs: KDD-99 and NSL-KDD.

On the NSL-KDD dataset, applying dimensionality reduction to an SVM with a radial basis function (RBF) kernel achieved the best detection accuracy (97.22 percent). Among ML algorithms combined with feature selection, the decision tree (J48) model and AdaBoost achieved 97 percent accuracy on KDD-99 data.

Building an APT detection system with machine learning

The current study aims to develop an APT detection system utilizing machine learning techniques to differentiate between normal and abnormal network traffic patterns. The approach offers the advantage of identifying APT attacks, even when their signatures are unknown, by detecting deviations from standard traffic patterns. Common machine learning methods, such as support vector machines (SVM, a supervised classifier) and k-means (an unsupervised clustering algorithm), are employed via Microsoft Azure Cloud for real-time and precise anomaly detection.
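Detecting deviations from standard traffic patterns can be sketched with k-means: cluster known-normal flows, then flag any new flow that lies far from every cluster center. The example below is a plain-Python illustration under stated assumptions, not the study's Azure pipeline; the two features, the sample flows, and the 1.5x margin on the threshold are all hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means. Centroids start at the first k points so that the
    result is repeatable (a simplification, not a recommended init)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids

def distance_to_nearest(point, centroids):
    """Euclidean distance from point to its closest centroid."""
    return min(math.dist(point, c) for c in centroids)

# Hypothetical normal flows: (packets per second, mean packet size in KB).
normal_flows = [(10, 0.5), (12, 0.6), (11, 0.4),
                (50, 1.2), (52, 1.1), (49, 1.3)]
centroids = kmeans(normal_flows, k=2)

# Assumed threshold: the largest distance seen in training, plus a margin.
threshold = 1.5 * max(distance_to_nearest(p, centroids) for p in normal_flows)

def is_anomalous(flow):
    """True if the flow is farther from every centroid than the threshold."""
    return distance_to_nearest(flow, centroids) > threshold

print(is_anomalous((11, 0.5)), is_anomalous((200, 9.0)))
```

In practice the features would be scaled to comparable ranges first; here packets per second dominates the distance because it is numerically much larger than packet size.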

The study uses a dataset with 57 network packet file features, categorized into live system logs, journal logs, and various network appliance logs. It undergoes extensive preprocessing, including data cleaning, transformation, reduction, integration, discretization, splitting, feature selection, and augmentation.
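A few of the preprocessing steps listed above (cleaning, transformation, discretization, and splitting) can be sketched as follows. The field names `duration` and `bytes_out`, the bin edges, and the sample records are hypothetical; the article does not list the 57 features or the exact procedures used.

```python
import random

def clean(rows):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

def min_max_scale(rows, field):
    """Transformation: rescale one numeric field to the range [0, 1]."""
    values = [r[field] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant fields
    return [{**r, field: (r[field] - lo) / span} for r in rows]

def discretize(value, edges):
    """Discretization: index of the bin that value falls into,
    given ascending bin edges."""
    return sum(value >= e for e in edges)

def split(rows, test_fraction=0.25, seed=0):
    """Splitting: shuffle reproducibly, then cut into train and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical flow records (illustrative values only).
raw = [
    {"duration": 1.2, "bytes_out": 300},
    {"duration": 0.4, "bytes_out": None},   # incomplete record
    {"duration": 9.8, "bytes_out": 52000},
    {"duration": 2.1, "bytes_out": 1200},
    {"duration": 0.7, "bytes_out": 800},
]

cleaned = clean(raw)
scaled = min_max_scale(cleaned, "bytes_out")
binned = [discretize(r["bytes_out"], edges=[0.33, 0.66]) for r in scaled]
train, test = split(scaled)
print(len(cleaned), binned, len(train), len(test))
```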

The model development process includes univariate and linear regression analysis, feature extraction, and the utilization of APT history to identify anomalies in live network flows. Various datasets are employed, and manual labeling is done. A composition-based decision tree (CDT) model, coupled with the naïve Bayes algorithm, generates rules for real-time APT detection using an intrusion detection and prevention system (SNORT).
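How a learned condition might be turned into a rule for SNORT can be sketched as below. The article does not show the rules the study generated, so the mapping here (a payload-size condition on traffic to one destination port) is an assumption made purely for illustration; only standard Snort rule options (`msg`, `flow`, `dsize`, `sid`, `rev`) are used, and the helper name is hypothetical.

```python
def to_snort_rule(sid, message, dst_port, min_payload_bytes):
    """Render one learned condition as a Snort rule string.

    Hypothetical mapping: 'payload larger than N bytes sent from the
    home network to external port P' becomes a single alert rule.
    """
    options = [
        f'msg:"{message}"',
        "flow:to_server,established",
        f"dsize:>{min_payload_bytes}",
        f"sid:{sid}",   # SIDs of 1000000 and above are for local rules
        "rev:1",
    ]
    header = f"alert tcp $HOME_NET any -> $EXTERNAL_NET {dst_port}"
    return f"{header} ({'; '.join(options)};)"

# Illustrative condition, as if read off one branch of a decision tree.
rule = to_snort_rule(
    sid=1000001,
    message="Possible APT exfiltration",
    dst_port=443,
    min_payload_bytes=1400,
)
print(rule)
```

Generating rule text from a model keeps the detection logic in the IDS itself, so alerts can fire on live traffic without calling the model for every packet.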

The live-case attack simulation involves flow capture, manual classification, feature extraction, model training, and rule generation for the IDS application. Specialized datasets with real APT cases are used for evaluation and comparative analysis.

Results and analysis

This research contributes significantly by introducing a three-stage model. In the first stage, it detects APT attacks. The second stage analyzes the attack's nature through nine different patterns, which provides deeper insights. The third stage correlates scenarios by training SNORT rules with CDT.

The CDT algorithm proposed in this study was evaluated against existing algorithms on the same dataset. Results demonstrate its superiority in classifying both malicious and benign traffic. For instance, it achieved a precision of 96 percent for malicious attacks, surpassing existing algorithms. On average, the proposed model achieved a precision of 94.3 percent, again outperforming existing algorithms. The evaluation was conducted using WEKA software. This approach effectively addresses a machine learning problem by continuously detecting attacks, recording their behaviors, and accurately classifying them as APTs.
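For readers unfamiliar with the metric, precision for a class is the share of predictions of that class that were correct: true positives divided by the sum of true positives and false positives. A minimal sketch follows; the labels are made up for illustration and do not reproduce the study's figures.

```python
def precision_per_class(y_true, y_pred):
    """Precision for each label: TP / (TP + FP)."""
    labels = sorted(set(y_true) | set(y_pred))
    result = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        result[label] = tp / (tp + fp) if (tp + fp) else 0.0
    return result

# Toy ground truth and predictions (illustrative only).
y_true = ["malicious", "malicious", "malicious", "benign",
          "benign", "benign", "benign", "malicious"]
y_pred = ["malicious", "malicious", "benign", "benign",
          "benign", "malicious", "benign", "malicious"]

per_class = precision_per_class(y_true, y_pred)
average = sum(per_class.values()) / len(per_class)  # unweighted mean
print(per_class, average)
```

An unweighted average over classes is one plausible reading of an "average" precision; the article does not say how the study's average was computed.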

Conclusion

In summary, the researchers reviewed recent work on using ML for APT detection. Various ML algorithms were evaluated, highlighting the potential for improving intrusion detection systems with ML. While challenges remain, the proposed CDT model consistently outperformed existing algorithms. Future research can expand the three algorithms to detect multiple attack types simultaneously, not just APTs. Enhanced attack identification may involve monitoring employee movements within the organization and correlating them with computer and network events, aiming for proactive attack detection.

Written by

Dr. Sampath Lonka

Dr. Sampath Lonka is a scientific writer based in Bangalore, India, with a strong academic background in Mathematics and extensive experience in content writing. He has a Ph.D. in Mathematics from the University of Hyderabad and is deeply passionate about teaching, writing, and research. Sampath enjoys teaching Mathematics, Statistics, and AI to both undergraduate and postgraduate students. What sets him apart is his unique approach to teaching Mathematics through programming, making the subject more engaging and practical for students.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Lonka, Sampath. (2023, September 19). Detecting Advanced Persistent Threats with Machine Learning. AZoAi. Retrieved on July 07, 2024 from https://www.azoai.com/news/20230919/Detecting-Advanced-Persistent-Threats-with-Machine-Learning.aspx.

  • MLA

    Lonka, Sampath. "Detecting Advanced Persistent Threats with Machine Learning". AZoAi. 07 July 2024. <https://www.azoai.com/news/20230919/Detecting-Advanced-Persistent-Threats-with-Machine-Learning.aspx>.

  • Chicago

    Lonka, Sampath. "Detecting Advanced Persistent Threats with Machine Learning". AZoAi. https://www.azoai.com/news/20230919/Detecting-Advanced-Persistent-Threats-with-Machine-Learning.aspx. (accessed July 07, 2024).

  • Harvard

    Lonka, Sampath. 2023. Detecting Advanced Persistent Threats with Machine Learning. AZoAi, viewed 07 July 2024, https://www.azoai.com/news/20230919/Detecting-Advanced-Persistent-Threats-with-Machine-Learning.aspx.
