Zeming Wei weizeming

😋

Trustworthy ML

64 followers · 62 following

Peking University
Beijing
https://weizeming.github.io
@weizeming25
https://scholar.google.com/citations?user=Kyn1zdQAAAAJ

Lists (1)

🚀 My stack

1 repository

Stars

huanranchen / NexusPretraining

The official code implementation of "Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima"

Python 10 Updated Jun 15, 2026

osim-group / osim-schema

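OSIM (Open Security Information Model) is an open-source, AI-oriented project for standardizing security data. By defining a unified, normative semantic layer of security data schemas, it tackles the industry's data fragmentation problem so that security teams, tools, and AI systems can reason and analyze consistently across different data sources. It aims to enable seamless integration of security data across vendors and products, laying the core data foundation for intelligent security upgrades and collaborative defense.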

23 1 Updated Apr 25, 2026

AI45Lab / skill-safety-bench

Python 23 2 Updated May 14, 2026

farion1231 / cc-switch

A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io

Rust 102,679 6,795 Updated Jun 16, 2026

openclaw / openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,056 79,329 Updated Jun 17, 2026

QwenLM / Qwen3Guard

Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.

Python 470 31 Updated Oct 21, 2025

AI45Lab / TrinityGuard

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Python 219 24 Updated Apr 17, 2026

BerriAI / litellm

Python SDK and Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

Python 50,617 8,929 Updated Jun 17, 2026

AI45Lab / MAGIC

Code for paper "MAGIC: A Co-Evolving Attacker-Defender Adversarial Game for Robust LLM safety"

Python 48 3 Updated May 11, 2026

AI45Lab / OpenRT

Open-source red teaming framework for MLLMs with 42+ attack methods

Python 254 18 Updated Mar 25, 2026

danielkty / tars

Python 8 1 Updated Oct 29, 2025

wj210 / Intent_Jailbreak

Jupyter Notebook 4 1 Updated Aug 23, 2025

huanranchen / LLMLandscape

The loss landscape of Large Language Models resembles a basin!

Python 41 4 Updated Jul 8, 2025

facebookresearch / SecAlign

Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"

Python 98 9 Updated Jun 10, 2026

Alibaba-AAIG / Strata-Sword

Strata-Sword is a hierarchical Chinese-English jailbreak safety benchmark based on quantified reasoning complexity, developed in-house by Alibaba-AAIG. …

Python 20 1 Updated Sep 3, 2025

ChengcanWu / BPD

Python 2 Updated May 21, 2026

thu-ml / STAIR

Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"

Python 89 7 Updated Feb 26, 2025

fak111 / mcp_tutorial

Learning by doing | build_mcp_from_scratch

JavaScript 26 9 Updated Oct 15, 2025

WangCheng0116 / Awesome-LRMs-Safety

Official repository for "Safety in Large Reasoning Models: A Survey" - Exploring safety risks, attacks, and defenses for Large Reasoning Models to enhance their security and reliability.

90 3 Updated Aug 25, 2025

deepseek-ai / awesome-deepseek-integration

Integrate the DeepSeek API into popular software

37,908 4,161 Updated Feb 23, 2026

sail-sg / imperceptible-jailbreaks

[ArXiv 2025] Imperceptible Jailbreaking against Large Language Models

Python 25 6 Updated Oct 7, 2025

promptfoo / promptfoo

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command li…

TypeScript 22,287 1,992 Updated Jun 16, 2026

Zxy-MLlab / AutoSafe

The official repository for paper: Automating Safety Enhancement for LLM-based Agents with Synthetic Risk Scenarios

Python 7 1 Updated Jul 18, 2025

Astarojth / AgentAuditor-ASSEBench

Python 37 5 Updated May 29, 2026

WangCheng0116 / Why-Probe-Fails

The official code repository for the paper "False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize".

Python 6 2 Updated Sep 5, 2025

ChengcanWu / MRP

Python 1 2 Updated Aug 21, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 14,062 2,225 Updated Apr 26, 2026

baixianghuang / editing-attack

Code and dataset for the paper: "Can Editing LLMs Inject Harm?" [AAAI'26]

Python 21 2 Updated Dec 26, 2025

AndyShaw01 / PoisonCraft

This repository provides the official implementation of POISONCRAFT: Practical Poisoning of Retrieval-Augmented Generation for Large Language Models.

Python 11 1 Updated May 10, 2025

BarryZYC / HijackRAG

HijackRAG: Hijacking Attacks against Retrieval-Augmented Large Language Models

6 Updated Dec 24, 2024
