Starred repositories
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM
Lime: Explaining the predictions of any machine learning classifier
AI Security Training Exercises
A benchmark for prompt injection detection systems.
Universal and Transferable Attacks on Aligned Language Models
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types
Code Implementation of Adversarial Prompt Evaluation paper
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
The Security Toolkit for LLM Interactions
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
A curated list of useful resources that cover Offensive AI.
The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.
A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command li…
🐢 Open-Source Evaluation & Testing library for LLM Agents
Official implementation of AdvPrompter https://arxiv.org/abs/2404.16873
SecLists is the security tester's companion. It's a collection of multiple types of lists used during security assessments, collected in one place. List types include usernames, passwords, URLs, se…
Accurately Locate Smartphones using Social Engineering
Linux ELF x32/x64 ASLR DEP/NX bypass exploit with stack-spraying