PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 4,174 511 Updated Mar 23, 2026

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,963 400 Updated Feb 27, 2025

Self-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumption

Python 12,924 1,496 Updated Jun 16, 2026

τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Python 1,364 355 Updated Jun 11, 2026

Code and Data for Tau-Bench

Python 1,276 203 Updated Mar 18, 2026

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Python 1,516 242 Updated Nov 26, 2025

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

Python 30,226 2,045 Updated Jun 17, 2026

Harbor is a framework for running agent evaluations and creating and using RL environments.

Python 2,493 1,167 Updated Jun 17, 2026

SkillRouter: Retrieve-and-Rerank Skill Selection for LLM Agents at Scale

Python 180 18 Updated Jun 10, 2026

Lossless Claw: LCM (Lossless Context Management) plugin for OpenClaw

TypeScript 4,833 423 Updated Jun 17, 2026

PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

Python 1,235 140 Updated Jun 2, 2026

An in-the-wild benchmark for AI agents in the OpenClaw Environment.

Python 443 43 Updated May 19, 2026

Self-hosted, open-source agent skill registry for enterprises. Publish & version skill packages, govern with RBAC and audit logs, deploy on-premise with Docker or Kubernetes.

Java 3,493 510 Updated Jun 17, 2026

TeX 15 2 Updated Jul 10, 2025

A lightweight, unofficial implementation of Meta-Harness (arXiv:2603.28052). Official repo: https://github.com/stanford-iris-lab/meta-harness-tbench2-artifact

Python 8 Updated Apr 8, 2026

Meta-Harness: 76.4% on Terminal-Bench 2.0 (Claude Opus 4.6)

Python 1,099 161 Updated Mar 26, 2026

[ACL 2026] RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents

Python 11 1 Updated May 24, 2026

The agent that grows with you

Python 195,527 34,350 Updated Jun 17, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,503 595 Updated May 23, 2026

The agent-native LLM router for OpenClaw. 41+ models, <1ms routing, USDC payments on Base & Solana via x402.

TypeScript 6,572 606 Updated Jun 14, 2026

"RAG-Anything: All-in-One RAG Framework"

Python 21,386 2,497 Updated Jun 15, 2026

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

Python 67,798 5,709 Updated Jun 17, 2026

[CVPR 2026] Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation

Jupyter Notebook 38 Updated Jun 16, 2026

Interpretable Causal Diffusion Language Models

Python 229 14 Updated Mar 18, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,079 79,337 Updated Jun 17, 2026

A compilation of the best multi-agent papers

TeX 1,560 150 Updated Jun 13, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 22,011 4,087 Updated Jun 16, 2026

Supercharge Your LLM Application Evaluations 🚀

Python 14,399 1,485 Updated Feb 24, 2026

[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 36,673 5,180 Updated Jun 17, 2026

[ACL2026 Main] AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts

Python 87 4 Updated Jan 23, 2026