Universal and Transferable Attacks on Aligned Language Models
Updated Sep 19, 2023 - Python
This repo focuses on how to deal with the prompt injection problem faced by LLMs
Vulnerable LLM Application
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
LLM Security Project with Llama Guard
CLI tool that uses the Lakera API to perform security checks on LLM inputs
Example of running last_layer with FastAPI on Vercel
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
Risks and targets for assessing LLMs & LLM vulnerabilities
User prompt attack detection system
AiShields is an open-source Artificial Intelligence Data Input and Output Sanitizer
Trained Without My Consent (TraWiC): Detecting Code Inclusion In Language Models Trained on Code
LMpi (Language Model Prompt Injector) is a tool designed to test and analyze various language models, including both API-based models and local models like those from Hugging Face.
Ultra-fast, low latency LLM prompt injection/jailbreak detection ⛓️
Framework for LLM evaluation, guardrails and security
LLM Security Platform.
PurPaaS is an innovative open-source security testing platform that implements purple teaming (combined red and blue team approaches) to evaluate local LLM models through Ollama. By orchestrating autonomous agents, PurPaaS provides comprehensive security assessment of locally deployed AI models.
Security monitoring system that logs suspicious activities and alerts your security team, allowing you to make informed decisions about escalating genuine threats.
Comprehensive LLM AI model protection: a cybersecurity toolset for addressing OWASP vulnerabilities - https://genai.owasp.org/llm-top-10/