Zhenhong Zhou ydyjya

LLM Safety

135 followers · 10 following

Nanyang Technological University
Singapore
https://www.zhihu.com/people/warrior-18-53

Stars

openai / mle-bench

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 1,581 253 Updated Apr 24, 2026

fogsightai / fogsight

Fogsight is an AI agent and animation engine powered by Large Language Models.

JavaScript 2,481 383 Updated Mar 21, 2026

huggingface / smolagents

🤗 smolagents: a barebones library for agents that think in code.

Python 27,840 2,687 Updated Jun 9, 2026

mem0ai / mem0

Universal memory layer for AI Agents

Python 58,491 6,720 Updated Jun 13, 2026

browser-use / browser-use

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 98,678 11,010 Updated Jun 13, 2026

executeautomation / mcp-playwright

Playwright Model Context Protocol Server - Tool to automate Browsers and APIs in Claude Desktop, Cline, Cursor IDE and More 🔌

TypeScript 5,552 519 Updated Dec 13, 2025

qianlima-lab / awesome-lifelong-llm-agent

TPAMI 2026 | This repository collects surveys, resources, and papers on lifelong learning LLM agents

Python 311 19 Updated Feb 5, 2026

shyamsaktawat / OpenAlpha_Evolve

OpenAlpha_Evolve is an open-source Python framework inspired by the groundbreaking research on autonomous coding agents like DeepMind's AlphaEvolve.

Python 1,025 152 Updated May 31, 2025

LIFEBench / LIFEBench

LIFEBENCH: Evaluating Length Instruction Following in Large Language Models

Python 17 2 Updated Apr 23, 2026

algorithmicsuperintelligence / openevolve

Open-source implementation of AlphaEvolve

Python 6,541 1,044 Updated Mar 18, 2026

deepseek-ai / DeepSeek-R1

92,011 11,714 Updated Jun 27, 2025

FoundationAgents / MetaGPT

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 68,769 8,787 Updated Jan 21, 2026

IAAR-Shanghai / Awesome-Attention-Heads

An awesome repository & A comprehensive survey on interpretability of LLM attention heads.

TeX 410 12 Updated Mar 2, 2025

ydyjya / SafetyHeadAttribution

Python 70 7 Updated Jun 1, 2025

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,896 370 Updated Dec 17, 2025

GIGABaozi / AED

The code for AED, a method that helps LLMs defend against jailbreaks

Python 4 Updated Jul 29, 2024

boyiwei / alignment-attribution-code

[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Python 90 17 Updated Mar 30, 2025

IS2Lab / S-Eval

S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models

116 6 Updated Feb 13, 2026

pillowsofwind / Knowledge-Conflicts-Survey

[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"

157 8 Updated Sep 21, 2024

HoagyC / sparse_coding

Using sparse coding to find distributed representations used by neural networks.

Jupyter Notebook 305 39 Updated Nov 10, 2023

openai / sparse_autoencoder

Python 588 68 Updated Jul 19, 2024

ydyjya / LLM-IHS-Explanation

Jupyter Notebook 60 4 Updated Jun 13, 2024

JailbreakBench / jailbreakbench

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]

Python 610 72 Updated Apr 4, 2025

alexandrasouly / strongreject

Repository for "StrongREJECT for Empty Jailbreaks" paper

Jupyter Notebook 158 7 Updated Nov 3, 2024

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 30,206 3,640 Updated Jun 26, 2025

huggingface / trl

Train transformer language models with reinforcement learning.

Python 18,634 2,790 Updated Jun 13, 2026

openai / transformer-debugger

Python 4,116 241 Updated Apr 15, 2026

OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,489 828 Updated May 22, 2026

chawins / llm-sp

Papers and resources related to the security and privacy of LLMs 🤖

Python 580 44 Updated Jun 8, 2025

HowieHwong / TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Python 627 67 Updated Jun 24, 2025
