Enhancing Cyber Threat Intelligence with AI

Cyber threat intelligence (CTI) is a branch of cyber security (CS) concerned with the contextual information surrounding cyber-attacks. This involves understanding the past, present, and anticipated tactics, techniques, and procedures (TTPs) of diverse threat actors. Organizations use CTI to help their security teams protect networks from cyber-attacks by integrating threat data feeds into their systems and networks. This article discusses the importance and applications of artificial intelligence (AI) in CTI.

Image credit: Golden Dayz/Shutterstock

CTI Basics

The complexity and frequency of cyber threats are constantly growing as cybercriminals successfully bypass organizations' security controls using customized TTPs and intrusion kill chains. The development and implementation of robust CTI is one of the feasible approaches to mitigate security breaches. For instance, nation-states already use CTI as an efficient solution for devising preventive CS measures in advance.

CTI is also a proactive security measure that involves gathering, collating, and analyzing information on potential attacks in real time to prevent data breaches and their consequences. The primary objective is to provide thorough information on the security threats that pose a significant risk to the organization’s infrastructure and, at the same time, to guide security teams on preventative actions.

Importance of AI in CTI

AI methods, specifically machine learning (ML), can significantly improve CS measures as cyber-attacks continue to increase. For instance, AI/ML-powered CS applications can detect anomalies on a network more effectively than conventional methods.
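To make the idea of network anomaly detection concrete, the following is a minimal sketch that flags traffic measurements deviating strongly from the median, using a modified z-score based on the median absolute deviation (MAD). It is a simple statistical stand-in rather than a trained ML model, and the traffic figures, the function name `mad_anomalies`, and the threshold of 3.5 are illustrative assumptions.

```python
from statistics import median

def mad_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the data, so this method cannot flag anything
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical bytes-per-minute counts observed for one host
traffic = [510, 495, 502, 498, 505, 9800, 500, 507]
anomalous = mad_anomalies(traffic)
print(anomalous)  # index of the measurement that stands out
```

A learned detector would replace the fixed threshold with a model fitted to historical traffic, but the input and output shape would look much the same.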

Several AI/ML applications are employed in CS solutions, including hacking incident forecasting, CS ratings, secure user authentication, botnet detection, credit scoring and next-best offers, fraud detection, network intrusion detection and prevention, and spam filter applications.

In the context of CTI, AI/ML methods allow organizations to automate data acquisition and processing, integrate with their existing security solutions, ingest unstructured data from diverse sources, and link information from different places by adding context on compromises and the modi operandi of malicious actors.

This is especially important in the big data context as the massive processing scales necessitate comprehensive automation. The processing must comprise the fusion of data points from different sources, such as technical, dark web, deep web, and open web sources, to devise a more effective strategy.

This approach can assist in converting these massive amounts of data into actionable CTI. Additionally, by separating and assembling concepts, AI/ML techniques can be leveraged to structure the data into categories of entities depending on their relationships to each other, properties, names, and events.

This approach supports robust searches across categories and automates data sorting, eliminating manual effort. AI/ML techniques can also structure text in multiple languages through natural language processing (NLP). For instance, text from very large volumes of unstructured documents across several languages can be analyzed and categorized into language-independent groups and events.

Moreover, ML methods can be developed to categorize text as prose, code, or data logs, and to resolve ambiguities between entities that share a name by using contextual clues in the surrounding text. By applying statistical methods and ML, events and entities can be sorted further by significance, for example by assigning risk scores to malicious entities.
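Disambiguation by contextual clues can be sketched very simply: compare the words around a mention with a keyword profile for each candidate entity and pick the best overlap. The candidate profiles, the sample sentence, and the function name `disambiguate` below are hypothetical, and a production system would use learned representations rather than hand-written keyword sets.

```python
def disambiguate(mention_context, candidates):
    """Pick the candidate whose profile keywords overlap most with the context."""
    context_words = set(mention_context.lower().split())
    scores = {name: len(context_words & keywords)
              for name, keywords in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical keyword profiles for two entities that share the name "Zeus"
candidates = {
    "Zeus (malware)": {"banking", "trojan", "botnet", "credentials", "malware"},
    "Zeus (mythology)": {"greek", "god", "olympus", "myth", "thunder"},
}
sentence = "The Zeus trojan steals banking credentials through a botnet"
best, scores = disambiguate(sentence, candidates)
print(best, scores)
```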

Risk scores are calculated by an ML model that has been trained on a previously examined dataset. Classifiers such as risk scores deliver both a judgment and the context that explains it, for example that several sources corroborate that a particular IP address is malicious.
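The pairing of a score with its supporting context can be illustrated with a small rule-based sketch. The source names, the weights in `SOURCE_WEIGHTS`, and the `RiskAssessment` structure are assumptions made for illustration; in the ML setting described above, the weights would be learned from labeled data rather than fixed by hand.

```python
from dataclasses import dataclass, field

@dataclass
class RiskAssessment:
    indicator: str
    score: int                                      # 0-100, higher means riskier
    evidence: list = field(default_factory=list)    # context that explains the score

# Hypothetical per-source weights (a trained model would learn these)
SOURCE_WEIGHTS = {"blocklist": 40, "honeypot": 35, "dark_web_mention": 25}

def assess(indicator, sightings):
    """Combine sightings from several sources into a score plus its context."""
    seen = sorted({s for s in sightings if s in SOURCE_WEIGHTS})  # de-duplicate
    score = min(100, sum(SOURCE_WEIGHTS[s] for s in seen))
    evidence = [f"reported by {s}" for s in seen]
    return RiskAssessment(indicator, score, evidence)

# 203.0.113.7 is an address from a range reserved for documentation
result = assess("203.0.113.7", ["blocklist", "honeypot", "blocklist"])
print(result.score, result.evidence)
```

Returning the evidence alongside the number is what lets an analyst see why an indicator was prioritized instead of having to trust an opaque score.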

Automating risk classification saves a significant amount of time by sorting through false positives and determining which risks must be prioritized. ML can also predict events and entity properties: drawing on deep pools of previously mined and categorized data, it can generate predictive models that are more accurate than those developed by humans.

ML techniques can also serve as active sensors that feed data into a common threat intelligence network shared by the entire user base. However, the implementation of AI/ML methods is at very different stages across the various CTI levels. For instance, work on operational intelligence is still in the research and experimentation stage and requires substantial resources.

ML Applications in CTI

Applications of ML techniques to threat intelligence, especially attribution, are currently being tweaked, tested, and developed. Attribution is expected to remain a major problem due to its political and convoluted nature. However, ML can automate several parts of the analysis process to increase the scalability of attribution and threat intelligence efforts by decreasing the threat intelligence analyst workload.

Microsoft Defender Advanced Threat Protection: The Microsoft Defender Advanced Threat Protection Research Team has developed an NLP system that extracts TTPs from documents available publicly, identifies categories, and labels relationships between those identified categories. Specifically, the system is trained using documentation of known threats, receives unstructured text as input, and identifies attack techniques, threat actors, malware families, and relationships to create attacker timelines and graphs.
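The general shape of such a pipeline (unstructured text in, labeled entities and relationships out) can be sketched with plain keyword matching. This is not the Microsoft system, which is a trained NLP model; the lexicon, the made-up report text, and the function name `extract_relations` are assumptions, and only the MITRE ATT&CK technique IDs are real identifiers.

```python
import re

# Hypothetical keyword lexicon mapping phrases to MITRE ATT&CK technique IDs
TECHNIQUES = {
    "spearphishing": "T1566",       # Phishing
    "powershell": "T1059",          # Command and Scripting Interpreter
    "credential dumping": "T1003",  # OS Credential Dumping
}
MALWARE = {"emotet", "trickbot"}

def extract_relations(text):
    """Return (malware, technique_id) pairs that co-occur in the same sentence."""
    relations = set()
    for sentence in re.split(r"(?<=[.!?])\s+", text.lower()):
        malware = {m for m in MALWARE if m in sentence}
        techniques = {tid for kw, tid in TECHNIQUES.items() if kw in sentence}
        relations |= {(m, t) for m in malware for t in techniques}
    return sorted(relations)

# Made-up report text, used only to exercise the function
report = ("Emotet arrived through spearphishing attachments. "
          "The loader then launched PowerShell to fetch TrickBot.")
relations = extract_relations(report)
print(relations)
```

The resulting pairs are the raw material for the attacker graphs and timelines mentioned above; a learned model would mainly improve how entities and relations are recognized, not the structure of the output.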

This ML model was employed to identify the common techniques between the Emotet malware family and identified threat actor groups, enabling organizations to implement defensive choke points that prevent or detect these attacker techniques and stop both commodity malware and high-profile targeted attacks. Thus, the platform is a good example of ML being applied to provide actionable threat intelligence for preventing cyber-attacks.

APTinder: APTinder is an ML model under development by FireEye that aims to help automate the daunting manual process of intelligence analysis and threat actor grouping. FireEye possesses the large existing dataset required to develop the model.

The primary objectives of the model are to build a single interpretable similarity metric between groups, assess past analytical decisions, and identify new potential matches. Every potentially unique feature or topic has its own model, which allows each model to be fine-tuned and topic weights to be changed for the final grouping. Similar to the approach followed by the Microsoft Defender Advanced Threat Protection Research Team, the data required for FireEye’s project is gathered from a vast body of reports.

The inverse document frequency (IDF) technique scores the uniqueness of terms from vectorized reports. Additionally, cosine similarity is employed to measure similarities between different groups. The groups are represented by the vectors from the earlier step. Cosine similarity refers to the cosine value of the angle between two vectors, which primarily determines how parallel these vectors are. This process is repeated for every topic or category.
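The combination of IDF weighting and cosine similarity described above can be sketched in a few lines of standard-library Python. The group names and term lists are hypothetical, and the smoothed IDF formula (`log(n / df) + 1`) is one common variant chosen for illustration, not necessarily the one FireEye uses.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF vectors (as dicts) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Rare terms get a higher weight; the +1 keeps common terms above zero
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical terms extracted from reports about three groups
groups = {
    "group_a": "spearphishing powershell mimikatz rdp".split(),
    "group_b": "spearphishing powershell mimikatz smb".split(),
    "group_c": "watering_hole javascript exploit_kit rdp".split(),
}
names = list(groups)
vecs = dict(zip(names, tfidf_vectors([groups[name] for name in names])))
sim_ab = cosine(vecs["group_a"], vecs["group_b"])
sim_ac = cosine(vecs["group_a"], vecs["group_c"])
print(round(sim_ab, 3), round(sim_ac, 3))
```

Groups A and B share three of four terms and so score much higher than A and C, which share only one. A value near 1 means the vectors point in almost the same direction, and 0 means the groups have no terms in common.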

Although the categories are currently weighted with a straight average, an objective weighting system must be built based on existing data for the overall model. A robust attack attribution can only be achieved by automating and enabling the analysis of massive quantities of data in a scalable manner.
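The difference between a straight average and a weighted combination of per-topic similarities can be shown with a short sketch. The topic names, similarity values, and weights are hypothetical; how to derive objective weights from existing data is exactly the open question noted above.

```python
def combine(topic_sims, weights=None):
    """Combine per-topic similarity scores into a single metric.

    With weights=None every topic counts equally (a straight average).
    """
    if weights is None:
        weights = {topic: 1.0 for topic in topic_sims}
    total = sum(weights[t] for t in topic_sims)
    return sum(topic_sims[t] * weights[t] for t in topic_sims) / total

# Hypothetical per-topic cosine similarities between two groups
topic_sims = {"malware": 0.9, "infrastructure": 0.6, "targets": 0.3}
straight = combine(topic_sims)
weighted = combine(topic_sims, {"malware": 3.0, "infrastructure": 2.0, "targets": 1.0})
print(round(straight, 2), round(weighted, 2))
```

Here, weighting malware overlap more heavily raises the combined score, which shows how strongly the final grouping depends on the choice of weights.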

Recent Developments

A paper published in Information Systems Security proposed a model to generate actionable threat intelligence by implementing a supervised ML approach employing the Naïve Bayes classifier. The objective of the ML-based model was to extract the potential threat intelligence from structured data sources and predict the threat.

Although several algorithms, such as convolutional neural networks and recurrent neural networks, are effective for text analysis and NLP, the researchers adopted the Naïve Bayes classifier to extract high-level threat intelligence from the dataset through text classification, using 70% of the data for training and 30% for testing.
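A compact sketch of this kind of workflow follows: a multinomial Naïve Bayes text classifier with add-one smoothing, trained on 70% of a tiny dataset and evaluated on the remaining 30%. The ten labeled sentences are invented for illustration and the class `NaiveBayesText` is not the paper's implementation. The split is taken in order to keep the example deterministic, whereas a real pipeline would normally shuffle first.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            logp = math.log(n_docs / total_docs)  # class prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in doc.lower().split():
                if word in self.vocab:  # skip words never seen in training
                    logp += math.log((self.word_counts[label][word] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

# Invented toy dataset: (text, label)
data = [
    ("ransomware encrypts files and demands payment", "threat"),
    ("quarterly meeting moved to friday", "benign"),
    ("phishing email steals credentials", "threat"),
    ("office lunch menu updated", "benign"),
    ("trojan opens backdoor for remote access", "threat"),
    ("team offsite schedule announced", "benign"),
    ("botnet launches ddos attack", "threat"),
    ("new ransomware steals credentials", "threat"),
    ("lunch meeting schedule updated", "benign"),
    ("phishing attack opens backdoor", "threat"),
]
split = len(data) * 7 // 10  # 70% training, 30% test
train, test = data[:split], data[split:]

model = NaiveBayesText().fit([t for t, _ in train], [l for _, l in train])
predictions = [model.predict(text) for text, _ in test]
accuracy = sum(p == l for p, (_, l) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```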

Additionally, the model treats every feature as independent of the presence of any other feature, so each feature contributes to the prediction on its own, without assuming any correlation between features. The text vector acted as the data feed for training this model, which was followed by model testing to assess the model performance. Finally, the model predicted true events, such as malware or threats, for unknown data.
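The independence assumption is what gives the Naïve Bayes decision rule its simple product form. In the standard textbook formulation, with c denoting a class (for example, threat or benign) and x_1, ..., x_n denoting the features of a document (the symbols are notation introduced here, not taken from the paper), the predicted class is:

```latex
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

Each factor P(x_i | c) is estimated separately from the training data, which is why the classifier can be trained quickly even on large text collections.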

The researchers evaluated several performance metrics of the Naïve Bayes model, such as F1-score, precision, and accuracy, on both the training and test datasets. The model demonstrated 98.2% and 96.6% accuracy on the training and test datasets, respectively.
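For readers unfamiliar with these metrics, the sketch below computes accuracy, precision, recall, and F1-score from lists of true and predicted labels. The labels are invented for illustration and bear no relation to the paper's results. The function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(y_true, y_pred, positive="threat"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented labels for eight test samples
y_true = ["threat", "threat", "benign", "benign", "threat", "benign", "threat", "benign"]
y_pred = ["threat", "benign", "benign", "threat", "threat", "benign", "threat", "benign"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading when threats are rare, which is why precision and F1-score are usually reported alongside it.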

To summarize, AI/ML techniques are playing an effective role in automating and improving CTI. However, the challenges of using ML, especially bias and discrimination, explainability, and adversarial attacks, must be addressed to fully exploit the advantages of these techniques.

References and Further Reading

Montasari, R., Carroll, F., Macdonald, S., Jahankhani, H., Hosseinian-Far, A., Daneshkhah, A. (2021). Application of artificial intelligence and machine learning in producing actionable cyber threat intelligence. Digital Forensic Investigation of Internet of Things (IoT) Devices, 47-64. https://doi.org/10.1007/978-3-030-60425-7_3

Barker, C. (2020). Applications of Machine Learning to Threat Intelligence, Intrusion Detection and Malware. https://digitalcommons.liberty.edu/honors/985

Dutta, A., Kant, S. (2020). An overview of cyber threat intelligence platform and role of artificial intelligence and machine learning. Information Systems Security, 81-86. https://doi.org/10.1007/978-3-030-65610-2_5

Last Updated: Jan 9, 2024

Written by Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citation

Dam, Samudrapom. (2024, January 09). Enhancing Cyber Threat Intelligence with AI. AZoAi. Retrieved on November 23, 2024 from https://www.azoai.com/article/Enhancing-Cyber-Threat-Intelligence-with-AI.aspx.
