- 🔭 I'm a Ph.D. student 👩‍🎓 at CISPA Helmholtz Center for Information Security, focused on Trustworthy Machine Learning Security.
- 🌱 I'm also a science fiction writer and publish novels in Science Fiction World (《科幻世界》) and elsewhere.
- ⚡ I love reading, handcrafting, RPG games 🎮, and every creative thing. I'm trying to fall in love with fitness, but it hasn't worked out yet 💪.
Pinned
- jailbreak_llms: [CCS'24] A dataset consisting of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
- prompt-stealing-attack: [USENIX'24] Prompt Stealing Attacks Against Text-to-Image Generation Models
- TrustAIRLab/GPTracker: [S&P'25] GPTracker: A Large-Scale Measurement of Misused GPTs
- TrustAIRLab/HateBench: [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
- TrustAIRLab/VoiceJailbreakAttack: Code for Voice Jailbreak Attacks Against GPT-4o.