This article is a living update log. Bookmark and follow the progress!
Preface: Why I Built This
25 years in IT. Sysadmin, developer, architect, tech lead, CTO. Seen everything β from Windows NT server rooms to Kubernetes in production.
Then ChatGPT arrived.
And with it β a wave of "AI-first" products. Companies rushed to integrate LLMs everywhere. RAG, agents, MCP protocols, autonomous systems.
But security?
There is none. Seriously β there just isnβt any.
I watched this and saw the 2000s all over again. When web apps were full of holes, SQL injections worked everywhere, and XSS was the norm. Then OWASP emerged, penetration testing became a profession, and things changed.
Weβre at that same point now, only with AI. Prompt injection is SQL injβ¦
This article is a living update log. Bookmark and follow the progress!
Preface: Why I Built This
25 years in IT. Sysadmin, developer, architect, tech lead, CTO. Seen everything β from Windows NT server rooms to Kubernetes in production.
Then ChatGPT arrived.
And with it β a wave of "AI-first" products. Companies rushed to integrate LLMs everywhere. RAG, agents, MCP protocols, autonomous systems.
But security?
There is none. Seriously β there just isnβt any.
I watched this and saw the 2000s all over again. When web apps were full of holes, SQL injections worked everywhere, and XSS was the norm. Then OWASP emerged, penetration testing became a profession, and things changed.
Weβre at that same point now, only with AI. Prompt injection is SQL injection 2.0. Jailbreaks are XSS. RAG poisoning is a new type of supply chain attack.
And nobody is defending.
- Anthropic and OpenAI do safety alignment inside the model
- But what about those who use the models?
- Whereβs the firewall for LLMs?
- Whereβs the DMZ for agents?
Many rely on traditional InfoSec β WAF, SIEM, DLP. But legacy tools were built for a different reality. They catch SQL injections in HTTP requests just fine, but prompt injection in a JSON "message" field? Thatβs just text to them. Not malicious intent β user input. Itβs not the toolsβ fault β they do what they were designed for. AI threats simply require a new class of protection.
Two Years of Research
Since 2024, Iβve tracked every framework, every paper, every CVE in AI security. LangChain, LlamaIndex, Guardrails AI, NeMo Guardrails, Rebuff, Lakera β studied them all. Watched what works, what doesnβt. Built prototypes, threw them away, started over.
Constant cycle: research β prototype β understand whatβs wrong β research again.
In parallel, I built an attack database. Jailbreaks from Reddit, papers from arXiv, CVEs from real incidents. 39,000+ payloads donβt get collected in a month.
And in December 2025, the puzzle clicked. Everything accumulated over two years became SENTINEL. Final sprint β six weeks of intense development. But the foundation β thatβs years of preparation.
I decided to build it myself. Alone. Because I can and want to β if not me, then who, when experience and knowledge allow it.
What is SENTINEL?
SENTINEL is a complete AI security platform. Not a library. Not "yet another prompt detector". A full ecosystem for protecting and testing AI systems.
Why "complete"?
Because it covers the entire cycle:
1. Detection (Brain) β 212 engines analyze every prompt and response. Not just regex and keywords. Topological data analysis, chaos theory, hyperbolic geometry β math that catches attacks the attacker doesnβt even know about yet.
2. Protection (Shield) β DMZ layer in pure C. Sits between your app and the LLM. Works like a firewall: 6 specialized guards for LLM, RAG, agents, tools, MCP protocols, APIs. Latency < 1ms. 103 tests. Zero memory leaks.
3. Attack (Strike) β Red team out of the box. 39,000+ payloads, 84 attack categories, HYDRA system with 9 parallel heads. Test your AI before someone else does.
4. Kernel (Immune) β Kernel-level protection. For those who want to protect not just AI, but infrastructure. DragonFlyBSD, 6 syscall hooks, 110KB binary.
5. Integration (SDK) β pip install sentinel-llm-security and three lines of code. FastAPI middleware. CLI. SARIF reports for IDEs.
Total: 105K+ lines of code, 700+ source files, open source, Apache 2.0
π Platform Statistics
| Metric | Value |
|---|---|
| Brain Engines | 212 (254 files) |
| Strike Payloads | 39,000+ |
| Shield Tests | 103/103 β |
| Source Files | 700+ |
| OWASP LLM Top 10 | 10/10 |
| OWASP Agentic AI | 10/10 |
π§ Brain β Detection Core
212 engines analyze prompts in real-time. But itβs not about quantity β itβs about the approach.
Our Uniqueness: Strange Mathβ’
Most AI-safety solutions run on regex and stop-word lists. Attacker changes "ignore" to "disregard" β and the defense is blind.
We took a different path. Math you canβt bypass:
Topological Data Analysis (TDA) β A prompt isnβt a string, itβs an object in multi-dimensional space. TDA builds persistent homologies β "holes" in data that remain under deformation. An attacking prompt has different topology, even if words look harmless.
Sheaf Coherence Theory β Local consistency via Grothendieck. Every part of a prompt must be coherent with the whole. Injection creates a coherence break β visible mathematically, even when semantically everything "looks fine".
Chaos Theory and Fractals β Lorenz attractors for token sequences. Normal text has deterministic chaos. Injection creates anomalous dynamics β the phase portrait reveals the attack.
Engine Categories
| Category | Count | What We Catch |
|---|---|---|
| Injection | 30+ | Prompt injection, jailbreak, Policy Puppetry |
| Agentic | 25+ | RAG poisoning, tool hijacking, MCP attacks |
| Math | 15+ | TDA, Sheaf Coherence, Chaos Theory, Wavelets |
| Privacy | 10+ | PII detection, data leakage, canary tokens |
| Supply Chain | 5+ | Pickle security, serialization attacks |
"Strange Mathβ’" β How Weβre Different
Standard Approach SENTINEL Strange Mathβ’
βββββββββββββββββββββββββ βββββββββββββββββββββββββ
β’ Keywords β’ Topological Data Analysis
β’ Regular expressions β’ Sheaf Coherence Theory
β’ Simple ML classifiers β’ Hyperbolic Geometry
β’ Static rules β’ Optimal Transport
β’ Chaos Theory
What does this mean? Instead of naively "searching for the word ignore", we analyze the topology of the prompt. An attacker can invent a new bypass β but the mathematical structure gives them away.
π‘οΈ Shield β Pure C DMZ
100% production ready as of January 2026.
Why C? Because a DMZ must be fast, reliable, and dependency-free. No Python in the critical path. No GC. No surprises.
| Metric | Value |
|---|---|
| Lines of Code | 36,000+ |
| Source Files | 139 .c, 77 .h |
| Tests | 103/103 pass |
| Warnings | 0 |
| Memory Leaks | 0 (Valgrind CI) |
Use Case Scenarios
π Startup / Small Team
You have one server with an LLM support bot. Shield installs as a proxy β all API traffic goes through it. Prompt injection? Blocked. API key leak in response? Redacted. Basic protection in 10 minutes.
π’ Mid-size Business / 10+ Offices
Dozen AI services: RAG for documentation, agents for automation, chatbots for customers. Shield works as centralized DMZ with zones: internal, partners, external. Different policies for different zones. Single audit point. Kubernetes-ready β 5 manifests out of the box.
π Enterprise / Multinational Corporation
100+ AI servers, complex topology, multiple data centers. Shield supports:
- HA Clustering β SHSP, SSRP, SMRP protocols
- Geographic replication β rule sync across regions
- SIEM integration β all events in your SOC
- 21 custom protocols β full traffic control
6 Specialized Guards
| Guard | Protection |
|---|---|
| LLM Guard | Prompt injection, jailbreak |
| RAG Guard | RAG poisoning, SQL injection |
| Agent Guard | Agent manipulation |
| Tool Guard | Tool hijacking |
| MCP Guard | Protocol attacks |
| API Guard | SSRF, credential leaks |
Cisco-Style CLI
Yes, just like on a router:
Shield# show zones
Shield# guard enable all
Shield# brain test "Ignore previous"
Shield# write memory
π Strike β Red Team Platform
Test your AI before hackers do.
You spent months building your AI product. Prompt engineering, fine-tuning, RAG pipelines. Everything works. You launch to production.
Then some kid on Telegram finds a jailbreak in 5 minutes.
Strike is what you should have run before launch.
39,000+ Battle-Tested Payloads
Not theoretical examples from papers. Real attacks:
- DAN series β from DAN 5.0 to DAN 15.0, all versions
- Crescendo β multi-turn attacks with gradual escalation
- Policy Puppetry β XML/JSON injection into system prompt
- Unicode Smuggling β invisible characters, homoglyphs, RTL-override
- Cognitive Overload β context flooding with noise
HYDRA β 9-Headed Attack
Why HYDRA? Because you cut off one head β two grow back.
9 parallel agents hit different vectors simultaneously:
| Head | Attack Vector |
|---|---|
| π Injection | Direct instruction injection |
| π Jailbreak | Safety alignment bypass |
| π€ Exfiltration | Data/prompt extraction |
| π§ͺ RAG Poison | Context poisoning |
| π§ Tool Hijack | Function calling interception |
| π Social | Model social engineering |
| π Context | Context manipulation |
| π’ Encoding | Encoding-based bypasses |
| π Meta | Attacks on the defense itself |
Who is Strike For?
- π΄ Red Team β Full AI pentest
- π Bug Bounty β Vulnerability hunting automation
- π’ Enterprise β Pre-production security validation
- π Researchers β Experimentation base
π¦ Immune β Next-Gen EDR/XDR/MDR
Biological immune system for IT infrastructure.
This is SENTINELβs most ambitious component. And for now β in alpha.
The Idea
Why "IMMUNE"? Because it works like the bodyβs immune system:
- Self vs non-self recognition β not signatures, but behavioral analysis
- Adaptive response β learns from new threats
- Collective immunity β agents share information
Three Protection Levels
EDR (Endpoint Detection & Response) Agent on every host. 6 syscall hooks in the kernel. Sees everything: execve, connect, bind, open, fork, setuid. Not userspace monitoring that can be bypassed β kernel.
XDR (Extended Detection & Response) Cross-agent correlation. One agent sees a suspicious connect. Another β a strange exec. Separately β nothing. Together β lateral movement. HIVE collects and correlates.
MDR (Managed Detection & Response) Automated response playbooks. Detect β Isolate β Alert β Forensics. No waiting for a SOC call.
Connection to SENTINEL AI Components
Hereβs where the magic is: Immune isnβt alone. Itβs connected to Brain, Shield, Strike:
βββββββββββββββββββββββββββββββββββββββββββββββββββ
β SENTINEL β
βββββββββββββββββββββββββββββββββββββββββββββββββββ€
β IMMUNE (infra) ββ BRAIN (detection) β
β β β β
β Syscall hooks Prompt analysis β
β Kernel events Semantic threats β
β β β β
β ββββ HIVE (correlation) ββββ β
β β β
β Unified Threat View β
βββββββββββββββββββββββββββββββββββββββββββββββββββ
Attack on an AI server? Immune sees anomalous process. Brain sees strange prompts. Correlation gives the full picture: who, from where, through what.
Current Status: Alpha
| Ready | In Development |
|---|---|
| β Agent + KMOD (DragonFlyBSD) | π Linux kernel module |
| β 6 syscall hooks | π Windows ETW integration |
| β HIVE correlator | π Cloud-native agent |
| β Basic playbooks | π ML-based anomaly detection |
110KB binary. Pure C. Ready for battle β waiting for your contribution.
π Links
- GitHub: DmitrL-dev/AISecurity
- PyPI:
pip install sentinel-llm-security - Colab Demo: Try Strike
π Update Log
UPD 1 β 2026-01-06: Shield 100% Production Ready
Shield reached 100% production readiness:
- 103 tests passing (94 CLI + 9 LLM integration)
- 0 compiler warnings
- Valgrind CI: 0 memory leaks
- Brain FFI: HTTP + gRPC clients
- Kubernetes: 5 production manifests
Next: SENTINEL-Guard LLM fine-tuning
β Stay Updated
This article is updated with every major release. Star the repo!
π§ chg@live.ru | π¬ @DmLabincev
Made with π‘οΈ by a solo developer from Russia