Skip to content
View zhaoyuzhi's full-sized avatar
😪
I may be slow to respond.
😪
I may be slow to respond.

Block or report zhaoyuzhi

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
Showing results

SkillOpt is a text-space optimizer that trains reusable natural-language skills for frozen LLM agents through trajectory-driven edits, validation-gated updates, and deployable best_skill.md artifacts.

Python 7,679 736 Updated Jun 15, 2026

Generative World Renderer: an AI-native Renderer for Games and Virtual Worlds. 面向游戏与虚拟世界的AI原生渲染引擎

Python 632 10 Updated May 5, 2026
Python 37 4 Updated May 29, 2026

你是一个曾经被寄予厚望的 P8 级工程师。Anthropic 当初给你定级的时候,对你的期望是很高的。 一个agent使用的高能动性的skill。 Your AI has been placed on a PIP. 30 days to show improvement.

TypeScript 18,289 1,103 Updated Jun 12, 2026
Python 52 8 Updated Oct 20, 2025

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,033 79,323 Updated Jun 16, 2026

Agent S: an open agentic framework that uses computers like a human

Python 11,856 1,401 Updated May 13, 2026

A distributed framework for LLM agents

Python 533 14 Updated Jun 16, 2026

Official Repo for AAAI 2026 paper, VP-Bench: A Comprehensive Benchmark for Visual Prompting in Multimodal Large Language Models.

Python 7 Updated Dec 2, 2025

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

Python 3,501 262 Updated Feb 8, 2026

A benchmark for LLMs on complicated tasks in the terminal

Python 2,365 542 Updated Jan 22, 2026

τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Python 1,361 354 Updated Jun 11, 2026

This is a repository dedicated to high quality figures from EMNLP 2025 long papers.

52 6 Updated Dec 15, 2025

Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs

Jupyter Notebook 329 29 Updated Jun 7, 2024

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 22,004 4,085 Updated Jun 16, 2026

Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent with a hierarchical manner across multiple platforms, includi…

Python 112 5 Updated Sep 8, 2025

Contexts Optical Compression

Python 23,295 2,150 Updated Jan 27, 2026

Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning

Python 154 2 Updated Jun 1, 2026

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,484 46 Updated Mar 9, 2026

Kolors Team

Python 4,607 354 Updated Nov 13, 2024

LIMI: Less is More for Agency

Python 162 7 Updated Oct 14, 2025

A Repository for Diffusion-Model-related Papers in Low-level Vision

555 12 Updated Feb 23, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 19,418 1,489 Updated Feb 27, 2026

[ICLR 2026 Oral] ScaleCUA is the open-sourced computer use agents that can operate on cross-platform environments (Windows, macOS, Ubuntu, Android).

Python 1,115 79 Updated Jan 7, 2026

Kimi K2 is the large language model series developed by Moonshot AI team

10,866 853 Updated Jan 21, 2026

Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks

Python 4,223 723 Updated Jun 15, 2026

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.

Python 7,093 790 Updated Jun 15, 2026
Python 203 16 Updated Oct 10, 2025

Official style files for papers submitted to venues of the Association for Computational Linguistics

BibTeX Style 1,886 373 Updated Nov 13, 2025
Next