MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 1,581 253 Updated Apr 24, 2026

Fogsight is an AI agent and animation engine powered by Large Language Models.

JavaScript 2,481 383 Updated Mar 21, 2026

🤗 smolagents: a barebones library for agents that think in code.

Python 27,840 2,687 Updated Jun 9, 2026

Universal memory layer for AI Agents

Python 58,491 6,720 Updated Jun 13, 2026

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 98,678 11,010 Updated Jun 13, 2026

Playwright Model Context Protocol Server - Tool to automate Browsers and APIs in Claude Desktop, Cline, Cursor IDE and More 🔌

TypeScript 5,552 519 Updated Dec 13, 2025

TPAMI 2026 | This repository collects awesome surveys, resources, and papers for lifelong learning LLM agents

Python 311 19 Updated Feb 5, 2026

OpenAlpha_Evolve is an open-source Python framework inspired by the groundbreaking research on autonomous coding agents like DeepMind's AlphaEvolve.

Python 1,025 152 Updated May 31, 2025

LIFEBENCH: Evaluating Length Instruction Following in Large Language Models

Python 17 2 Updated Apr 23, 2026

Open-source implementation of AlphaEvolve

Python 6,541 1,044 Updated Mar 18, 2026

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 68,769 8,787 Updated Jan 21, 2026

An awesome repository & A comprehensive survey on interpretability of LLM attention heads.

TeX 410 12 Updated Mar 2, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,896 370 Updated Dec 17, 2025

The code for AED, a method that helps LLMs defend against jailbreaks

Python 4 Updated Jul 29, 2024

[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Python 90 17 Updated Mar 30, 2025

S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models

116 6 Updated Feb 13, 2026

[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"

157 8 Updated Sep 21, 2024

Using sparse coding to find distributed representations used by neural networks.

Jupyter Notebook 305 39 Updated Nov 10, 2023
Python 588 68 Updated Jul 19, 2024
Jupyter Notebook 60 4 Updated Jun 13, 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]

Python 610 72 Updated Apr 4, 2025

Repository for "StrongREJECT for Empty Jailbreaks" paper

Jupyter Notebook 158 7 Updated Nov 3, 2024

LLM training in simple, raw C/CUDA

Cuda 30,206 3,640 Updated Jun 26, 2025

Train transformer language models with reinforcement learning.

Python 18,634 2,790 Updated Jun 13, 2026

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,489 828 Updated May 22, 2026

Papers and resources related to the security and privacy of LLMs 🤖

Python 580 44 Updated Jun 8, 2025

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Python 627 67 Updated Jun 24, 2025