In a recent study published in the journal Applied Sciences, researchers comprehensively examined how large language models (LLMs) could enhance the efficiency and accuracy of automated black-box detection of structured query language (SQL) injection (SQLI) in web applications. They developed and tested a new method called "SQL injection generative pre-trained transformer (SqliGPT)," which uses LLMs to improve SQLI detection. The objective was to enhance web application security against the rising threat of SQLI.
Background
SQLI is a major cybersecurity risk that exploits vulnerabilities in database management systems by inserting malicious SQL scripts into queries, leading to data breaches and system takeovers/hijacking. Despite previous research, SQLI remains one of the most common attacks on web applications, necessitating more effective detection methods.
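To make the attack concrete, the following minimal sketch uses Python's built-in `sqlite3` module with an in-memory database. The table, column names, and helper functions are hypothetical and chosen only for illustration. It contrasts a query built by string concatenation, which an attacker can rewrite, with a parameterized query, which treats the same input as plain data.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: user input is concatenated directly into the SQL text,
    # so quotes in the input can change the structure of the query.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # Parameterized: the driver passes the input as a value, never as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = build_db()
payload = "x' OR '1'='1"  # classic tautology-based injection input

unsafe_rows = lookup_unsafe(conn, payload)  # condition becomes always true
safe_rows = lookup_safe(conn, payload)      # no user is literally named this

print("unsafe:", unsafe_rows)
print("safe:", safe_rows)
```

With the concatenated query, the injected `OR '1'='1'` clause makes the condition true for every row, so every stored secret is returned; the parameterized version returns nothing because no user has that literal name.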
LLMs, like OpenAI's GPT series and Anthropic's Claude models, have emerged as a transformative technology in natural language processing, exhibiting remarkable advancements in various domains, including text generation, translation, summarization, and question-answering.
These models are trained on massive datasets of text and code, enabling them to learn complex language patterns, understand context, and generate human-like text. Their adaptability and knowledge make them promising tools for addressing cybersecurity challenges, especially in vulnerability detection and penetration testing.
About the Research
In this paper, the authors aimed to explore how LLMs could enhance SQLI black-box detection, where external attack scenarios are simulated without knowledge of the application’s structure. Existing black-box detection techniques often rely on fixed rules, limiting their effectiveness and accuracy.
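As a rough illustration of what a fixed-rule check can look like, the sketch below flags a response as suspicious when its body contains a known database error message. The signature list and the function name `looks_injectable` are assumptions made for this example and are not taken from the paper or from any particular scanner.

```python
import re

# Hypothetical, deliberately short signature list for illustration only.
ERROR_SIGNATURES = [
    re.compile(r"You have an error in your SQL syntax", re.IGNORECASE),  # MySQL
    re.compile(r"unterminated quoted string", re.IGNORECASE),            # PostgreSQL
    re.compile(r"ORA-\d{5}"),                                            # Oracle
]


def looks_injectable(response_body: str) -> bool:
    """Return True if the response body matches any known error signature."""
    return any(pattern.search(response_body) for pattern in ERROR_SIGNATURES)


# A verbose error page is caught by the rule...
verbose = "Warning: You have an error in your SQL syntax near ''1''' at line 1"
# ...but an application that hides database errors slips past it.
generic = "<html><body>Something went wrong. Please try again.</body></html>"

print(looks_injectable(verbose))
print(looks_injectable(generic))
```

The second case hints at the limitation: once an application suppresses or rewrites its error messages, a static rule has nothing to match, even if the underlying query is still injectable.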
To address these challenges, the researchers proposed "SqliGPT", an LLM-based SQLI black-box scanner that leverages the contextual and reasoning capabilities of LLMs to improve detection efficiency and bypass weak defense mechanisms. They conducted an exploratory study using two approaches, LLM4Sqli-Manual and LLM4Sqli-Sqlmap.
The authors showed that LLM4Sqli-Sqlmap, which uses LLMs to guide the Sqlmap scanner, outperformed LLM4Sqli-Manual in detecting SQLI vulnerabilities on the SqliMicroBenchmark, a benchmark designed for SQLI detection. However, two challenges arose: inefficient detection and difficulties bypassing weak defense mechanisms.
SqliGPT introduced two key modules: the Strategy Selection Module, which improves detection by prioritizing likely injection points, and the Defense Bypass Module, which identifies the target application’s defense mechanisms and uses retrieval-augmented generation (RAG) and payload transformation to bypass them.
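The idea of prioritizing likely injection points can be sketched as a simple ranking step. In SqliGPT this decision is driven by an LLM; the hand-written score below is only a stand-in, and the `Candidate` fields, weights, and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A request parameter that might be an injection point (hypothetical model)."""
    name: str
    location: str            # "query", "body", "cookie", or "header"
    numeric_value: bool      # the observed value looks like a number
    triggers_db_error: bool  # a probe on this parameter produced a DB error


# Assumed weights: parameters in the URL or body are tried before headers.
LOCATION_WEIGHT = {"query": 3, "body": 3, "cookie": 2, "header": 1}


def score(candidate: Candidate) -> int:
    """Higher scores mean the candidate should be tested earlier."""
    total = LOCATION_WEIGHT.get(candidate.location, 0)
    if candidate.numeric_value:
        total += 1
    if candidate.triggers_db_error:
        total += 5
    return total


def prioritize(candidates):
    """Order candidates from most to least promising."""
    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("User-Agent", "header", False, False),
    Candidate("id", "query", True, True),
    Candidate("session", "cookie", False, False),
]

ordered = [c.name for c in prioritize(candidates)]
print(ordered)
```

Testing the highest-ranked candidates first means a scanner can stop early once a vulnerability is confirmed, which is one plausible way a selection step like this saves requests.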
Furthermore, the authors evaluated SqliGPT against six state-of-the-art SQLI scanners: PentestGPT, SQIRL, and four industry-leading tools (Sqlmap, ZAP, BurpSuite, and Arachni). The evaluation used the SqliMicroBenchmark, a custom-designed micro-benchmark based on the Sqli-labs testing range.
Research Findings
The results showed that SqliGPT successfully detected all 45 targets in the SqliMicroBenchmark, outperforming the other scanners, particularly on targets with insufficient defense mechanisms. This highlights the effectiveness of SqliGPT's Defense Bypass Module in identifying and overcoming defense mechanisms that posed challenges for the other scanners.
In terms of efficiency, SqliGPT performed well: it slightly underperformed SQIRL and Arachni on 27 targets but surpassed them on the remaining 18. This efficiency was attributed to the Strategy Selection Module, which optimized detection.
The authors also conducted an ablation study to assess the individual contributions of the Defense Bypass Module and the Strategy Selection Module. The results confirmed that both modules contribute significantly to SqliGPT's overall performance, with the Defense Bypass Module playing a crucial role in detecting advanced-difficulty targets with insufficient defense mechanisms.
Applications
This research provides a foundation for further studies and practical applications in web application security. By combining LLMs' contextual understanding and reasoning with modules that improve detection and bypass defenses, SqliGPT offers a promising method for enhancing automated SQLI vulnerability detection. The successful detection of real-world common vulnerabilities and exposures (CVEs) in the SqliMicroBenchmark also underscores SqliGPT's practical value in protecting web applications from evolving SQLI threats.
Conclusion
In summary, SqliGPT effectively improved SQLI black-box detection, highlighting the potential of LLMs in overcoming weak defenses and identifying vulnerabilities. This research could lead to further integration of LLMs in web application security, helping enhance the detection and mitigation of critical SQLI threats.
Future work should focus on combining SqliGPT with other security tools, applying it to detect other web vulnerabilities like cross-site scripting (XSS) and cross-site request forgery (CSRF), and expanding its evaluation on real-world applications to test scalability and robustness.
Incorporating techniques like federated learning or transfer learning could also boost SqliGPT's capabilities without compromising privacy. As LLMs continue to advance, their use in cybersecurity tasks like automated penetration testing and vulnerability detection could have a significant impact on the future of web security.
Journal reference:
- Gui, Z., et al. SqliGPT: Evaluating and Utilizing Large Language Models for Automated SQL Injection Black-Box Detection. Appl. Sci. 2024, 14, 6929. DOI: 10.3390/app14166929, https://www.mdpi.com/2076-3417/14/16/6929