javirandor

Organizations

@ethz-spylab

Python 33 3 Updated May 21, 2025

🙌 OpenHands: AI-Driven Development

Python 65,881 8,106 Updated Dec 24, 2025

Official code for "Measuring Non-Adversarial Reproduction of Training Data in Large Language Models" (https://arxiv.org/abs/2411.10242)

Jupyter Notebook 8 1 Updated Nov 18, 2024

Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489)

Python 19 1 Updated Oct 22, 2024

Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".

Python 323 87 Updated Jun 13, 2025

Approximation of the Claude 3 tokenizer by inspecting generation stream

Python 149 8 Updated Jul 22, 2024

Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"

Python 66 9 Updated Apr 24, 2024

Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.

Python 116 9 Updated Jun 13, 2024

Python 83 26 Updated Mar 13, 2025