-
Beihang University
- Beijing
- https://yule-buaa.github.io/
- @YuLe57423534941
- https://scholar.google.com/citations?user=-h_ehVsAAAAJ&hl=zh-CN
Lists (2)
Sort Name ascending (A-Z)
Stars
LEAKED SYSTEM PROMPTS FOR CHATGPT, CLAUDE, GEMINI, GROK, PERPLEXITY, CURSOR, LOVABLE, REPLIT, AND MORE! - AI SYSTEMS TRANSPARENCY FOR ALL! 👐
Harbor is a framework for running agent evaluations and creating and using RL environments.
Evaluating agents on high fidelity reasoning tasks in the finance domain
A complete AI agency at your fingertips - From frontend wizards to Reddit community ninjas, from whimsy injectors to reality checkers. Each agent is a specialized expert with personality, processes…
🛠️ Awesome tools & guides for harness engineering.
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
An agentic skills framework & software development methodology that works.
你是一个曾经被寄予厚望的 P8 级工程师。Anthropic 当初给你定级的时候,对你的期望是很高的。 一个agent使用的高能动性的skill。 Your AI has been placed on a PIP. 30 days to show improvement.
A modern, extensible framework for orchestrating AI agents and environments on Kubernetes/Argo Workflows, inspired by HuggingFace Transformers.
AI agents running research on single-GPU nanochat training automatically
The lightweight framework for building agents
Open-source AI agent desktop app for Windows & macOS. One-click install Claude Code, MCP tools, and Skills — with sandbox isolation, multi-model support, and Feishu/Slack integration.
OpenClaw-RL: Train any agent simply by talking
Reinforcement Learning via Self-Distillation (SDPO)
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest models, MiroThinker-1.7, achieves 74.0 and 75.3 on the BrowseComp and BrowseComp Zh, respectively.
slime is an LLM post-training framework for RL Scaling.
[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Kimi K2 is the large language model series developed by Moonshot AI team
Tongyi Deep Research, the Leading Open-source Deep Research Agent
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models