
The official code implementation of "Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima"

Python 10 Updated Jun 15, 2026

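OSIM (Open Security Information Model) is an open-source, AI-oriented project for standardizing security data. By defining a standardized, unified semantic schema layer for security data, it addresses the industry's data fragmentation problem, enabling security teams, tools, and AI systems to reason and analyze consistently across different data sources. It aims to achieve seamless integration of security data across vendors and products, laying the core data foundation for upgrading security intelligence and building collaborative defense.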

23 1 Updated Apr 25, 2026
Python 23 2 Updated May 14, 2026

A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io

Rust 102,679 6,795 Updated Jun 16, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,056 79,329 Updated Jun 17, 2026

Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.

Python 470 31 Updated Oct 21, 2025

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Python 219 24 Updated Apr 17, 2026

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

Python 50,617 8,929 Updated Jun 17, 2026

Code for paper "MAGIC: A Co-Evolving Attacker-Defender Adversarial Game for Robust LLM safety"

Python 48 3 Updated May 11, 2026

Open-source red teaming framework for MLLMs with 42+ attack methods

Python 254 18 Updated Mar 25, 2026
Python 8 1 Updated Oct 29, 2025
Jupyter Notebook 4 1 Updated Aug 23, 2025

The loss landscape of Large Language Models resembles a basin!

Python 41 4 Updated Jul 8, 2025

Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"

Python 98 9 Updated Jun 10, 2026

Strata-Sword is a hierarchical Chinese-English jailbreak safety benchmark based on quantified reasoning complexity, developed in-house by Alibaba-AAIG.

Python 20 1 Updated Sep 3, 2025
Python 2 Updated May 21, 2026

Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"

Python 89 7 Updated Feb 26, 2025

Learning by doing || build_mcp_from_scratch

JavaScript 26 9 Updated Oct 15, 2025

Official repository for "Safety in Large Reasoning Models: A Survey" - Exploring safety risks, attacks, and defenses for Large Reasoning Models to enhance their security and reliability.

90 3 Updated Aug 25, 2025

Integrate the DeepSeek API into popular software

37,908 4,161 Updated Feb 23, 2026

[ArXiv 2025] Imperceptible Jailbreaking against Large Language Models

Python 25 6 Updated Oct 7, 2025

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command li…

TypeScript 22,287 1,992 Updated Jun 16, 2026

The official repository for paper: Automating Safety Enhancement for LLM-based Agents with Synthetic Risk Scenarios

Python 7 1 Updated Jul 18, 2025

The official code repository for the paper "False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize".

Python 6 2 Updated Sep 5, 2025
Python 1 2 Updated Aug 21, 2025

Nano vLLM

Python 14,062 2,225 Updated Apr 26, 2026

Code and dataset for the paper: "Can Editing LLMs Inject Harm?" [AAAI'26]

Python 21 2 Updated Dec 26, 2025

This repository provides the official implementation of POISONCRAFT: Practical Poisoning of Retrieval-Augmented Generation for Large Language Models.

Python 11 1 Updated May 10, 2025

HijackRAG: Hijacking Attacks against Retrieval-Augmented Large Language Models

6 Updated Dec 24, 2024