- 🌱 I’m currently learning and working on LLM safety
- 🔭 I’m currently a research intern at ShanghaiAILab
- 👯 I’m looking to collaborate on any research topics related to LLMs and Agents
- 💬 Ask me about anything here
- 📫 Reach me by email: hxh_create@outlook.com
- 🎓 Fudan Univ | BIT, Shanghai, China
Pinned
- OpenSafetyLab/SALAD-BENCH: 【ACL 2024】 SALAD benchmark & MD-Judge
- AI45Lab/VLSBench: [ACL 2025] Data and code for the paper "VLSBench: Unveiling Visual Leakage in Multimodal Safety"
- AI45Lab/IS-Bench: Data and code for the paper "IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks"
- LLM_Deceive_Unintentionally: Experimental resources for the paper "LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions"
- wonderNefelibata/Awesome-LRM-Safety: Awesome Large Reasoning Model (LRM) Safety, a curated collection of safety-related research on large reasoning models such as DeepSeek-R1 and OpenAI o1