AI Boosts Advanced SQL Injection Detection

In a recent study published in the journal Applied Sciences, researchers comprehensively examined how large language models (LLMs) could enhance the efficiency and accuracy of automated black-box detection of structured query language (SQL) injection (SQLI) in web applications. They developed and tested a new method called "SQL injection generative pre-trained transformer (SqliGPT)," which uses LLMs to improve SQLI detection. The objective was to enhance web application security against the rising threat of SQLI.

Study: AI Boosts Advanced SQL Injection Detection. Image Credit: Ar_TH/Shutterstock.com

Background

SQLI is a major cybersecurity risk in which an attacker inserts malicious SQL code into the queries an application sends to its database, which can lead to data breaches and system takeover. Despite previous research, SQLI remains one of the most common attacks on web applications, necessitating more effective detection methods.
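To make the attack concrete, the following minimal sketch contrasts a query built by string concatenation with a parameterized one, using Python's built-in sqlite3 module. The table, column names, and helper functions are assumptions made for illustration and do not come from the study.

```python
import sqlite3

def make_db():
    # In-memory toy database; table and rows are illustrative assumptions.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_vulnerable(conn, name):
    # User input is concatenated into the SQL text, so it can change
    # the structure of the query itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def lookup_safe(conn, name):
    # User input is bound as a parameter and is only ever treated as data.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]

conn = make_db()
payload = "nobody' OR '1'='1"

leaked = lookup_vulnerable(conn, payload)   # condition becomes always true
blocked = lookup_safe(conn, payload)        # no user has this literal name

print("vulnerable lookup returned:", leaked)
print("parameterized lookup returned:", blocked)
```

The concatenated version returns every stored secret because the injected text rewrites the WHERE clause, while the parameterized version returns nothing.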

LLMs, like OpenAI's GPT series and Anthropic's Claude models, have emerged as a transformative technology in natural language processing, exhibiting remarkable advancements in various domains, including text generation, translation, summarization, and question-answering.

These models are trained on massive datasets of text and code, enabling them to learn complex language patterns, understand context, and generate human-like text. Their adaptability and knowledge make them promising tools for addressing cybersecurity challenges, especially in vulnerability detection and penetration testing.

About the Research

In this paper, the authors aimed to explore how LLMs could enhance SQLI black-box detection, where external attack scenarios are simulated without knowledge of the application’s structure. Existing black-box detection techniques often rely on fixed rules, limiting their effectiveness and accuracy.
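The limitation of fixed rules can be illustrated with a small simulation. In this sketch, the scanner sees only the text a page returns and applies a single boolean-based rule. The three toy pages, the rule tuples, and the boolean_probe helper are hypothetical and are not taken from the paper or from any real scanner.

```python
import sqlite3

# Toy database standing in for the application's hidden back end.
_conn = sqlite3.connect(":memory:")
_conn.execute("CREATE TABLE items (id INTEGER, title TEXT)")
_conn.execute("INSERT INTO items VALUES (1, 'widget')")

def _render(query, params=()):
    # The scanner only ever sees this returned string, never the query.
    try:
        rows = _conn.execute(query, params).fetchall()
    except sqlite3.Error:
        return "error"
    return rows[0][0] if rows else "not found"

def numeric_page(item_id):
    # Vulnerable: input spliced into a numeric context.
    return _render("SELECT title FROM items WHERE id = " + item_id)

def quoted_page(item_id):
    # Vulnerable: input spliced into a quoted string context.
    return _render("SELECT title FROM items WHERE id = '" + item_id + "'")

def safe_page(item_id):
    # Not vulnerable: input is bound as a parameter.
    return _render("SELECT title FROM items WHERE id = ?", (item_id,))

# Each rule is a (true-condition suffix, false-condition suffix) pair.
NUMERIC_RULE = (" AND 1=1", " AND 1=2")
QUOTED_RULE = ("' AND '1'='1", "' AND '1'='2")

def boolean_probe(page, rule, base_value="1"):
    # Flag the page if a true condition leaves the response unchanged
    # while a false condition changes it.
    true_suffix, false_suffix = rule
    baseline = page(base_value)
    when_true = page(base_value + true_suffix)
    when_false = page(base_value + false_suffix)
    return when_true == baseline and when_false != baseline

# A scanner that knows only the numeric rule.
results = {
    page.__name__: boolean_probe(page, NUMERIC_RULE)
    for page in (numeric_page, quoted_page, safe_page)
}
print(results)
```

With only the numeric rule, the scanner flags numeric_page but misses quoted_page, which is just as vulnerable; calling boolean_probe(quoted_page, QUOTED_RULE) does flag it. Choosing a suitable probe for each context is the kind of decision the study hands to an LLM.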

To address these challenges, the researchers proposed "SqliGPT", an LLM-based SQLI black-box scanner that leverages the contextual and reasoning capabilities of LLMs to improve detection efficiency and bypass weak defense mechanisms. They conducted an exploratory study using two approaches, LLM4Sqli-Manual and LLM4Sqli-Sqlmap.

The authors showed that LLM4Sqli-Sqlmap, which uses LLMs to guide the Sqlmap scanner, outperformed LLM4Sqli-Manual in detecting SQLI vulnerabilities on the SqliMicroBenchmark, a benchmark designed for SQLI detection. However, two challenges arose: inefficient detection and difficulties bypassing weak defense mechanisms.

SqliGPT introduced two key modules: the Strategy Selection Module, which improves detection by prioritizing likely injection points, and the Defense Bypass Module, which identifies the target application’s defense mechanisms and uses retrieval-augmented generation (RAG) and payload transformation to bypass them.
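The two ideas can be sketched in a few lines. This is a simplified illustration and not the authors' implementation: in SqliGPT an LLM makes these decisions, whereas here a hand-written score and a single well-known transformation (replacing spaces with inline SQL comments) stand in for it. The InjectionPoint type, the weights, and the toy filter are all assumptions, and the RAG step is not shown.

```python
from dataclasses import dataclass

@dataclass
class InjectionPoint:
    name: str
    location: str    # "query", "body", "cookie", or "header"
    reflected: bool  # whether the value appears to influence the response

def priority(point):
    # Hypothetical heuristic standing in for the LLM's judgment.
    weights = {"query": 3, "body": 2, "cookie": 1, "header": 0}
    return weights.get(point.location, 0) + (2 if point.reflected else 0)

def rank_points(points):
    # Strategy selection: test the most promising points first.
    return sorted(points, key=priority, reverse=True)

def weak_filter(value):
    # Toy defense: strips spaces, hoping to break injected SQL keywords apart.
    return value.replace(" ", "")

def space_to_comment(payload):
    # Payload transformation: SQL inline comments can act as token
    # separators, so the payload no longer depends on spaces.
    return payload.replace(" ", "/**/")

points = [
    InjectionPoint("session", "cookie", False),
    InjectionPoint("id", "query", True),
    InjectionPoint("user_agent", "header", False),
    InjectionPoint("comment", "body", True),
]
ordered = [p.name for p in rank_points(points)]

probe = "1 AND 1=1"
filtered_plain = weak_filter(probe)
filtered_transformed = weak_filter(space_to_comment(probe))

print(ordered)
print(filtered_plain)
print(filtered_transformed)
```

The untransformed probe collapses into a single token after filtering, while the transformed probe keeps its separators. A defense that only strips characters is what the article refers to as a weak defense mechanism.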

Furthermore, the study conducted a comprehensive evaluation of SqliGPT against six state-of-the-art SQLI scanners, including PentestGPT, SQIRL, and four industry-leading tools (Sqlmap, ZAP, BurpSuite, and Arachni), using the SqliMicroBenchmark, a custom-designed micro-benchmark based on the Sqli-labs testing range.

Research Findings

The results showed that SqliGPT successfully detected all 45 targets in the SqliMicroBenchmark, outperforming the other scanners, particularly on targets with insufficient defense mechanisms. This highlights the effectiveness of SqliGPT's Defense Bypass Module in identifying and overcoming defense mechanisms that posed challenges for the other scanners.

In terms of efficiency, SqliGPT was competitive, slightly underperforming SQIRL and Arachni on 27 targets but surpassing them on the remaining 18. The gains on those 18 targets were attributed to the Strategy Selection Module, which prioritizes likely injection points.

The authors also conducted an ablation study to assess the individual contributions of the Defense Bypass Module and the Strategy Selection Module. The results confirmed that both modules contribute significantly to SqliGPT's overall performance, with the Defense Bypass Module playing a crucial role in detecting advanced-difficulty targets with insufficient defense mechanisms.

Applications

This research provides a foundation for further studies and practical applications in web application security. By combining LLMs' contextual understanding and reasoning with modules that improve detection and bypass defenses, SqliGPT offers a promising method for enhancing automated SQLI vulnerability detection. The successful detection of real-world common vulnerabilities and exposures (CVEs) in the SqliMicroBenchmark also underscores SqliGPT's practical value in protecting web applications from evolving SQLI threats.

Conclusion

In summary, SqliGPT effectively improved SQLI black-box detection, highlighting the potential of LLMs in bypassing weak defenses and identifying vulnerabilities. This research could lead to further integration of LLMs in web application security, helping enhance the detection and mitigation of critical SQLI threats.

Future work should focus on combining SqliGPT with other security tools, applying it to detect other web vulnerabilities like cross-site scripting (XSS) and cross-site request forgery (CSRF), and expanding its evaluation on real-world applications to test scalability and robustness.

Incorporating techniques like federated learning or transfer learning could also boost SqliGPT's capabilities without compromising privacy. As LLMs continue to advance, their use in cybersecurity tasks like automated penetration testing and vulnerability detection could have a significant impact on the future of web security.

Journal reference:
  • Gui, Z.; et al. SqliGPT: Evaluating and Utilizing Large Language Models for Automated SQL Injection Black-Box Detection. Appl. Sci. 2024, 14, 6929. DOI: 10.3390/app14166929, https://www.mdpi.com/2076-3417/14/16/6929

Written by

Muhammad Osama

Muhammad Osama is a full-time data analytics consultant and freelance technical writer based in Delhi, India. He specializes in transforming complex technical concepts into accessible content. He has a Bachelor of Technology in Mechanical Engineering with specialization in AI & Robotics from Galgotias University, India, and he has extensive experience in technical content writing, data science and analytics, and artificial intelligence.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Osama, Muhammad. (2024, August 14). AI Boosts Advanced SQL Injection Detection. AZoAi. Retrieved on September 16, 2024 from https://www.azoai.com/news/20240814/AI-Boosts-Advanced-SQL-Injection-Detection.aspx.

  • MLA

    Osama, Muhammad. "AI Boosts Advanced SQL Injection Detection". AZoAi. 16 September 2024. <https://www.azoai.com/news/20240814/AI-Boosts-Advanced-SQL-Injection-Detection.aspx>.

  • Chicago

    Osama, Muhammad. "AI Boosts Advanced SQL Injection Detection". AZoAi. https://www.azoai.com/news/20240814/AI-Boosts-Advanced-SQL-Injection-Detection.aspx. (accessed September 16, 2024).

  • Harvard

    Osama, Muhammad. 2024. AI Boosts Advanced SQL Injection Detection. AZoAi, viewed 16 September 2024, https://www.azoai.com/news/20240814/AI-Boosts-Advanced-SQL-Injection-Detection.aspx.
