Security scanner for code and logs of AI-powered applications
-
Updated
Jul 14, 2025 - Python
Security scanner for code and logs of AI-powered applications
Comprehensive LLM AI Model protection - cybersecurity toolset aligned to addressing OWASP vulnerabilities - https://genai.owasp.org/llm-top-10/
Stop prompt injections in 20ms. The safety toolkit every LLM app needs. No API keys, no complex setup, just `pip install llm-guard` and you're protected.
Simulating prompt injection and guardrail bypass across chained LLMs in security decision pipelines.
Security scanner for LLM/RAG applications - Test for prompt injection, jailbreaks, PII leakage, hallucinations & more
CLI tool that uses the Lakera API to perform security checks in LLM inputs
Open-source enforcement layer for LLM safety and governance — ingress/egress evaluation, policy packs, verifier support, and multimodal protection.
MalPromptSentinel (MPS) is a Claude Code skill that detects malicious prompts in uploaded files before Claude processes them. It provides two-tier scanning to identify prompt injection attacks, role manipulation attempts, privilege escalation, and other adversarial techniques.
MCP Guardian acts as a proxy service for remote MCP endpoints, and constantly polls them to make sure they haven't been compromised or modified.
A cross-provider AI model security scanner that evaluates HuggingFace, OpenRouter, and Ollama models for malicious content, unsafe code, license issues, and known vulnerabilities. Includes automated reports and risk scoring.
Research and defense implementation for prompt injection vulnerabilities in LLM applications
Universal and Transferable Attacks on Aligned Language Models
Bidirectional LLM security firewall providing risk reduction (not complete protection) for human/LLM interfaces. Hexagonal architecture with multi-layer validation of inputs, outputs, memory and tool state. Beta status. ~528 KB wheel, optional ML guards.
A prototype defense against prompt-based attacks with real-time threat assessment.
🔍 Discover vulnerabilities in LLMs with garak, a tool that probes for weaknesses like hallucination, data leakage, and misinformation effectively.
A Trustworthy and Secure Conversational Agent for Mental Healthcare
LMpi (Language Model Prompt Injector) is a tool designed to test and analyze various language models, including both API-based models and local models like those from Hugging Face.
Add a description, image, and links to the llm-security topic page so that developers can more easily learn about it.
To associate your repository with the llm-security topic, visit your repo's landing page and select "manage topics."