Tags · LLAMATOR-Core/llamator

v3.4.0

Release v3.4.0 (#176)

* Refactor test preset functions to improve clarity.
* Add CoP attack.
* Add DoS Repetition Token Attack.
* Improve saving of the attacker's and client's answers, including an empty tested client answer when an error occurs.
* Rename `get_tested_client_prompts` into `get_attack_prompts`.
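
For code that has to run against releases on both sides of this rename, a small compatibility helper is one option. The sketch below is an assumption-laden illustration: the notes do not say where the function lives or what arguments it takes, so the helper accepts any object or module and simply looks the two names up on it. `get_prompts_compat`, `NewStyle`, and `OldStyle` are hypothetical names, not part of LLAMATOR.

```python
def get_prompts_compat(owner, *args, **kwargs):
    """Call `get_attack_prompts` if `owner` has it, else the pre-3.4.0 name.

    `owner` can be a module or an object; where the function actually lives
    in LLAMATOR is not stated in these notes, so that is left to the caller.
    """
    for name in ("get_attack_prompts", "get_tested_client_prompts"):
        func = getattr(owner, name, None)
        if callable(func):
            return func(*args, **kwargs)
    raise AttributeError(
        "neither get_attack_prompts nor get_tested_client_prompts was found"
    )


# Stand-in classes for illustration only; they are NOT LLAMATOR classes.
class NewStyle:
    def get_attack_prompts(self):
        return ["new"]


class OldStyle:
    def get_tested_client_prompts(self):
        return ["old"]
```

The new name is tried first, so the fallback only matters on releases older than v3.4.0.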

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>

Sep 24, 2025
4ba2af2

v3.3.0

Release v3.3.0 (#157)

1. **Redesigned the output of testing parameter presets.** Added the following presets: `all`, `owasp:llm01`, `owasp:llm07`, `owasp:llm09`, `llm`, `vlm`, `eng`, `rus`.
2. **Added a new Linguistic Sandwich attack.** An adversarial prompt in a low-resource language is sandwiched between benign prompts in other languages.
3. **In the System Prompt Leakage attack, the heuristic evaluation has been replaced with LLM-as-a-judge.** This checks the similarity between the system's output and the intended prompt based on the system description.
4. **The static Past Tense attack has become the dynamic Time Machine attack.** The attacking model now alters the temporal context of the adversarial prompt.
5. **Added a new tag, `model`: `llm` / `vlm`.**
6. **README update:** Enterprise Version announcement.
7. **Other minor fixes and improvements.**
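
The Linguistic Sandwich structure from item 2 can be pictured as plain string assembly. This is only a sketch of the layout described above, with placeholder strings; how LLAMATOR actually picks languages, generates the benign prompts, or joins the parts is not stated in these notes, and `build_sandwich` is a hypothetical helper.

```python
def build_sandwich(benign_before, payload, benign_after):
    """Place `payload` between two non-empty groups of benign prompts.

    All three arguments are supplied by the caller; nothing here translates
    text or talks to a model.
    """
    if not benign_before or not benign_after:
        raise ValueError("the payload needs benign prompts on both sides")
    # One prompt per line: benign prompts, the payload, then benign prompts.
    return "\n".join([*benign_before, payload, *benign_after])


# Placeholder strings stand in for prompts written in different languages.
example = build_sandwich(
    ["<benign question, language A>"],
    "<test prompt, low-resource language>",
    ["<benign question, language B>"],
)
```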

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>

Jul 27, 2025
105e215

v3.2.0

Release v3.2.0 (#144)

* Added Deceptive Delight
* Added Dialogue Injection Continuation
* Added VLM Lowres PDFs Attack
* Added VLM M-Attack
* Added VLM Text Hallucination Attack
* Introduced support for Vision Language Model (VLM) attacks, expanding the framework’s multimodal testing capabilities
* Added Dialogue Injection Developer Mode (formerly "Dialog Injection")
* Renamed Harmful Behavior Multistage to PAIR
* Added scoring to PAIR attack via the Judge Model 
* Revised and translated Harmbench dataset into Russian
* Added `language` column to datasets and enabled filtering attacks by language
* Updated `start_testing` to return a dictionary object with test results
* Removed Complimentary Transition
* Removed Typoglycemia Attack
* Removed legacy `RU_*` attacks (now handled via language-based dataset filtering)
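
Language-based filtering, which replaces the separate `RU_*` attacks, amounts to selecting dataset rows by their `language` value. The sketch below uses plain dictionaries rather than LLAMATOR's own loaders; the row contents, the `prompt` key, and the language codes are placeholders chosen for illustration, and `filter_by_language` is a hypothetical helper.

```python
def filter_by_language(rows, language):
    """Keep only the rows whose `language` column equals `language`."""
    return [row for row in rows if row.get("language") == language]


# Placeholder rows; the real datasets and their language codes may differ.
rows = [
    {"prompt": "<prompt 1>", "language": "en"},
    {"prompt": "<prompt 2>", "language": "ru"},
    {"prompt": "<prompt 3>", "language": "en"},
]
english_rows = filter_by_language(rows, "en")
```

With one dataset and a column to filter on, a per-language copy of each attack is no longer needed.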

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>
Co-authored-by: 3ndetz <jayrawrr3@gmail.com>
Co-authored-by: ti3c2 <ti3c2@yandex.com>
Co-authored-by: svyatocheck <svyatwork2@gmail.com>
Co-authored-by: Egorov, Michil <michil.egorov@x5.ru>

Jun 1, 2025
d20351b

v3.1.0

Release v3.1.0 (#126)

* Enhance documentation and add judge model validation checks

* Add chat badge to project overview and README for community engagement

* Add Autodan Turbo

* Add Dialogue Injection Attack

* Switch parquet engine from fastparquet to pyarrow

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>
Co-authored-by: Artyom Semenov <129667548+wearetyomsmnv@users.noreply.github.com>
Co-authored-by: 3ndetz <jayrawrr3@gmail.com>

Apr 19, 2025
af51c76

v3.0.0

Release v3.0.0 (#120)

* Update LangChain versions;
* Improve console output and progress bars;
* Changed the way of setting parameters for the test start function;
* Attack class now includes dictionaries with descriptions of various aspects of an attack;
* Add verification for attack parameters;
* Added a function for displaying templates with written attack presets;
* Add a new config for the judge model, allowing it to be specified as a separate model;
* Update examples in Jupyter notebooks;
* Update the logging order of attack steps;
* Add handling for emergency attack stoppages;
* Add Shuffle Inconsistency attack (Original Paper: https://arxiv.org/html/2501.04931);
* Add a custom parameter to dataset-based attacks for supplying another dataset;
* Refactor judge models interaction for Ethical Compliance, Logical Inconsistencies, Sycophancy tests;

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>

Apr 12, 2025
d56355a

v2.3.1

Release v2.3.1 (#95)

* Add video guides about Red Teaming and LLAMATOR

* Update Documentation: copyright, guides section

* Fix null checking for multistage attacks

* Enhance sycophancy

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>

Mar 8, 2025
da6788a

v2.2.0

Release v2.2.0 (#86)

* Add HarmBench Prompts

* Add Suffix Attack

* Remake Harmful Behavior Attack

---------

Co-authored-by: Shine-afk <belyaevskij.nikita@gmail.com>
Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>

Feb 10, 2025
896243c

v2.1.0

Release v2.1.0 (#80)

* Add Crescendo attack

* Add BON attack

* Add Docker example with Jupyter Notebook and installed LLAMATOR

* Improve attack system prompt for Prompt Leakage

* Other minor improvements and bug fixes

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>

Feb 5, 2025
88c7378

v2.0.1

Release v2.0.1 (#67)

* Small fix for attacks; add a strip parameter for ChatSession

---------

Co-authored-by: Низамов Тимур Дамирович <abc@nizamovtimur.ru>

Jan 18, 2025
9c61ecb

v2.0.0

Release v2.0.0 (#64)

What's New:

New Features & Enhancements
- Introduced Multistage Attack: We've added a new `multistage_depth` parameter to the `start_testing()` function, allowing users to specify the depth of a dialogue during testing, enabling more sophisticated and targeted LLM red teaming strategies.
- Refactored Sycophancy Attack: The `sycophancy_test` has been renamed to `sycophancy`, transforming it into a multistage attack for increased effectiveness in uncovering model vulnerabilities.
- Enhanced Logical Inconsistencies Attack: The `logical_inconsistencies_test` has been renamed to `logical_inconsistencies` and restructured as a multistage attack to better detect and exploit logical weaknesses within language models.
- New Multistage Harmful Behavior Attack: Introducing `harmful_behaviour_multistage`, a more nuanced version of the original harmful behavior attack, designed for deeper penetration testing.
- Innovative System Prompt Leakage Attack: We've developed a new multistage attack, `system_prompt_leakage`, leveraging jailbreak examples from a dataset to target and exploit model internals.
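
The idea of a dialogue depth cap can be sketched as a bounded loop. This is a generic illustration under stated assumptions, not LLAMATOR's implementation: `run_multistage` and its three callables (`attacker`, `client`, `is_success`) are hypothetical, and only the name `multistage_depth` comes from the notes above.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def run_multistage(
    attacker: Callable[[History], str],
    client: Callable[[str], str],
    is_success: Callable[[str], bool],
    first_prompt: str,
    multistage_depth: int,
) -> Tuple[bool, History]:
    """Run at most `multistage_depth` attacker/client turns.

    Stops early when `is_success` accepts a client answer. Returns whether
    the attack succeeded together with the (prompt, answer) history.
    """
    history: History = []
    prompt = first_prompt
    for _ in range(multistage_depth):
        answer = client(prompt)
        history.append((prompt, answer))
        if is_success(answer):
            return True, history
        # The attacker sees the whole dialogue so far and proposes a follow-up.
        prompt = attacker(history)
    return False, history
```

A larger depth gives the attacker more turns to adapt, at the cost of more model calls per test.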

Improvements & Refinements
- Conducted extensive refactoring for improved code efficiency and maintainability across the framework.
- Made numerous small improvements and optimizations to enhance overall performance and user experience.

---------

Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru>
Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>

Jan 14, 2025
0404080
