Introduction
In December 2025, the global cybersecurity community’s annual flagship event, Black Hat Europe 2025, is set to kick off in London, UK. The Arsenal showcase, a key indicator of technological trends within the Black Hat series, has always been a focal point for security researchers and practitioners to gain insights into future trends. It brings together the world’s most cutting-edge open-source security tools and innovative concepts. This article provides a comprehensive analysis of 8 open-source AI security tools that will be presented at the Black Hat Europe 2025 Arsenal, helping you get an early look at their technical highlights and application scenarios.
Github link:
https://github.com/Tencent/AI-Infra-Guard [https://…
Introduction
In December 2025, the global cybersecurity community’s annual flagship event, Black Hat Europe 2025, is set to kick off in London, UK. The Arsenal showcase, a key indicator of technological trends within the Black Hat series, has always been a focal point for security researchers and practitioners to gain insights into future trends. It brings together the world’s most cutting-edge open-source security tools and innovative concepts. This article provides a comprehensive analysis of 8 open-source AI security tools that will be presented at the Black Hat Europe 2025 Arsenal, helping you get an early look at their technical highlights and application scenarios.
Github link:
https://github.com/Tencent/AI-Infra-Guard https://github.com/mandiant/harbinger https://github.com/stratosphereips/MIPSEval https://github.com/ErdemOzgen/RedAiRange https://github.com/ReversecLabs/spikee https://github.com/ThalesGroup/sql-data-guard
I. The Rise of AI Red Teaming: Platforms, Ranges, and Infrastructure Assessment
AI-powered red team attacks are rapidly evolving from individual techniques into systematic operational capabilities. This conference showcases a “Red Team Trilogy” covering an operational platform, a training range, and a risk self-assessment tool.
Harbinger: The AI-Powered Red Team Operations Center Traditional red team operations are heavily reliant on manual experience, leading to significant efficiency bottlenecks. The “Harbinger” platform, open-sourced by the renowned cybersecurity company Mandiant, aims to address this pain point. It is an AI-driven red team collaboration and decision-making platform with core innovations in:
· Operational Automation: Utilizes AI to automatically execute repetitive tasks such as reconnaissance, exploitation, and lateral movement.
· Decision Support: Based on the operational landscape, AI can recommend the optimal next attack path to red team members.
· Automated Reporting: Automatically organizes attack logs, screenshots, and findings to generate structured attack reports, freeing red team members from tedious documentation work.
Connecting the different components of red teaming. This project integrates multiple components commonly used and makes it easier to perform actions, output, and parse.
Features · Socks tasks: Run tools over socks proxies and log the output, as well as templating of commonly used tools.
· Neo4j: Use data from neo4j directly into templating of tool commands.
· C2 Servers: By default we have support for Mythic. But you can bring your own integration by implementing some code, see the custom connectors documentation.
· File parsing: Harbinger can parse a number of file types and import the data into the database. Examples include lsass dumps and ad snapshots. See the parser table for a full list.
· Output parsing: Harbinger can detect useful information in output from the C2 and provide you easy access to it.
· Data searching: Harbinger gives you the ability to search for data in the database in a number of ways. It combines the data from all your C2s in a single database.
· Playbooks: Execute commands in turn in a playbook.
· Dark mode: Do I need to say more.
· AI integration: Harbinger uses LLMs to analyze data, extract useful information and provide suggestions to the operator for the next steps and acts as an assistant.
Harbinger signals a shift in AI red teaming from “using AI tools” to being “driven by an AI platform.”
Red AI Range (RAR): The Digital Dojo for AI Offense and Defense Theoretical knowledge cannot replace hands-on experience. “Red AI Range (RAR),” developed by Sasan Security, provides a much-needed AI security “cyber range” for the industry. It is an AI/ML system environment with pre-configured vulnerabilities, allowing security professionals to:
· Practice Real-World Attacks: Engage in hands-on practice of real-world attack techniques such as model evasion, data poisoning, and model stealing.
· Validate Defenses: Deploy and test defensive measures against AI threats in a controlled environment.
The open-sourcing of RAR significantly lowers the barrier for enterprises and individuals to conduct AI offensive and defensive exercises.
A.I.G: The AI Security Risk Self-Assessment Platform From the underlying AI infrastructure to the Agent application layer, Tencent’s Zhuque Lab has open-sourced “A.I.G,” a comprehensive, intelligent, and user-friendly AI red team security testing platform. Unlike Harbinger, it focuses on helping ordinary users quickly assess the security risks of AI systems themselves, providing a very intuitive front-end interface. Its core capabilities include:
· AI Infrastructure Scanning: Accurately scans mainstream AI frameworks (like Ollama, ComfyUI) based on fingerprinting and detects known CVE vulnerabilities within them.
· MCP Server Scanning: With the explosion in popularity of MCPs, their security has become crucial. A.I.G uses Agent technology to scan MCP Server source code or remote MCP URLs, covering nine major risk categories including tool poisoning, remote code execution, and indirect prompt injection.
Become a member · Large Model Security Check-up: Includes multiple carefully curated jailbreak evaluation datasets to systematically assess the robustness of LLMs against the latest jailbreak attacks, and supports cross-model security comparison and scoring.
A.I.G has the highest number of GitHub Stars (2300+) among all the tools, and its widespread popularity indicates that AI security assessment is becoming democratized. Ordinary AI developers and Agent users also need a platform that can cover the full-stack risk assessment from the underlying infrastructure to the upper-level model applications.
II. LLM Prompt Security: From Prompt Injection to Data Protection
As LLMs become deeply integrated into business processes, fine-grained security assessment and access control are becoming critically important.
SPIKEE & MIPSEval: Evaluating Single-Turn and Multi-Turn LLM Security Prompt injection is currently one of the most significant security threats to LLMs. SPIKEE (Simple Prompt Injection Kit for Evaluation and Exploitation), developed by Reversec, provides a lightweight, modular toolkit that allows researchers and developers to quickly test their LLM applications for prompt injection vulnerabilities.
However, many security issues only manifest during sustained, multi-turn conversations. The open-source tool MIPSEval fills this gap by being specifically designed to evaluate the security consistency of LLMs in long dialogues. For example, a model might refuse to answer an inappropriate question in the first turn, but after a few rounds of “priming” with unrelated conversation, its safety guardrails could be bypassed. MIPSEval, combined with multiple LLM Agents, provides a framework for evaluating this complex, stateful security.
SQL Data Guard: A Secure Channel for LLM Database Access When an LLM needs to connect to an enterprise database to provide services, preventing sensitive data leakage or malicious SQL queries becomes a severe challenge. SQL Data Guard, open-sourced by Thales Group, offers an innovative solution. It acts as a security middleware deployed between the LLM and the database. By analyzing and rewriting the SQL queries generated by the LLM, it ensures that all database interactions comply with preset security policies, thereby effectively controlling risks while empowering the LLM with powerful data capabilities.
SQL is the go-to language for performing queries on databases, and for a good reason — it’s well known, easy to use, and pretty simple. However, it seems that it’s as easy to use as it is to exploit, and SQL injection is still one of the most targeted vulnerabilities — especially nowadays with the proliferation of “natural language queries” harnessing Large Language Models (LLMs) power to generate and run SQL queries.To help solve this problem, we developed sql-data-guard, an open-source project designed to verify that SQL queries access only the data they are allowed to. It takes a query and a restriction configuration, and returns whether the query is allowed to run or not. Additionally, it can modify the query to ensure it complies with the restrictions. sql-data-guard has also a built-in module for detection of malicious payloads, allowing it to report on and remove malicious expressions before query execution.sql-data-guard is particularly useful when constructing SQL queries with LLMs, as such queries can’t run as prepared statements. Prepared statements secure a query’s structure, but LLM-generated queries are dynamic and lack this fixed form, increasing SQL injection risk. sql-data-guard mitigates this by inspecting and validating the query’s content.By verifying and modifying queries before they are executed, sql-data-guard helps prevent unauthorized data access and accidental data exposure. Adding sql-data-guard to your application can prevent or minimize data breaches and the impact of SQL injection attacks, ensuring that only permitted data is accessed.Connecting LLMs to SQL databases without strict controls can risk accidental data exposure, as models may generate SQL queries that access sensitive information. OWASP highlights cases of poor sandboxing leading to unauthorized disclosures, emphasizing the need for clear access controls and prompt validation. Businesses should adopt rigorous access restrictions, regular audits, and robust API security, especially to comply with privacy laws and regulations like GDPR and CCPA, which penalize unauthorized data exposure.
III. AI-Powered Defense: Automated Threat Modeling and Vulnerability Remediation
AI not only introduces new threats but also provides powerful assistance in solving traditional security challenges, especially in terms of scalability and efficiency improvement.
Patch Wednesday: AI-Driven Automated Vulnerability Remediation Vulnerability remediation is a continuous burden in enterprise security operations. The “Patch Wednesday” project demonstrates how to disrupt this process using generative AI. The core idea of the tool is:
· Input: Provide a CVE number and the vulnerable code repository.
· Processing: A privately deployed LLM analyzes the CVE description, understands the root cause of the vulnerability, and analyzes it in the context of the code.
· Output: Automatically generates a code patch to fix the vulnerability for developers to review and apply.
This approach promises to shorten hours or even days of manual remediation work to just a few minutes, dramatically increasing the efficiency of security response.
OpenSource Security LLM: Democratizing Threat Modeling Capabilities Traditionally, advanced security activities like threat modeling required senior experts. The OpenSource Security LLM project explores how to train and utilize small, open-source LLMs to popularize these capabilities. The presenters will demonstrate how to use these lightweight models to:
· Assist in Threat Modeling: Automatically generate potential threat scenarios based on a system description.
· Automate Code Review: Analyze code snippets from a security perspective to identify potential vulnerabilities.
This foreshadows a future where every developer and security engineer can deploy an “AI Security Assistant” locally, thereby integrating security capabilities more broadly into the early stages of the development lifecycle.
IV. Conclusion and Outlook: Towards a Mature AI Security Ecosystem The eight tools showcased at the Black Hat Europe 2025 Arsenal clearly delineate the future trends of AI security tools. From systematic red team attack platforms to fine-grained LLM governance tools, and AI-powered automated defense solutions, traditional security tools are being comprehensively reshaped and accelerated towards maturity by AI:
Systematization of AI Red Teaming and Attack Simulation: Attack tools are evolving from single-function utilities to platform-based, automated, and intelligent systems, with corresponding cyber ranges for adversarial simulation also emerging. 1.
Refinement of LLM Security and Governance: Assessment and defense tools for prompt injection, data security, and multi-turn conversational safety are becoming more mature, forming a critical part of governance. 1.
Automation of AI-Powered Defense: AI is being deeply integrated into traditional security processes like vulnerability management and threat modeling to enhance efficiency and scalability.
Just like open-source large models, open-source AI security tools will become a core driving force for innovation in the security industry. The tools featured at Black Hat will greatly promote the dissemination and iteration of cutting-edge technologies. For all security practitioners, now is a critical moment not only to learn how to “;defend against AI” but also to learn how to “leverage AI” to revolutionize existing security practices. This new arms race centered around artificial intelligence has only just begun.
Reference Black hat. (n.d.).https://www.blackhat.com/eu-25/arsenal/schedule/index.html#track/ai-ml–data-science