Stars
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future …
This is the official repo for the paper "LongCat-Flash-Omni Technical Report"
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
The official code of [ICLR 2026] TFPI: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
An open-source AI agent that brings the power of Gemini directly into your terminal.
We introduce BabyVision, a benchmark revealing the infancy of AI vision.
SkillsBench evaluates how well skills work and how effective agents are at using them
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
[ICLR 2026] Official PyTorch implementation for "ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding"
Agent S: an open agentic framework that uses computers like a human
Agent Memory Playground: AI Agent Memory Design & Optimization Techniques
ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows.
🔥[NeurIPS'25] DeepFund: Pilot for Your Next Fund Investment
Latent Collaboration in Multi-Agent Systems
Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models
This repository allows reproduction of Poetiq's record-breaking submission to the ARC-AGI-1 and ARC-AGI-2 benchmarks.
A collection of token reduction (token pruning, merging, clustering, etc.) techniques for ML/AI
Official repository for DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
Official implementation of the paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models"
MiroFlow is an agent framework that enables tool-use agent tasks, featuring a reproducible GAIA score of 82.4%.
Designing Multi-Agent Systems with Zero Supervision
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
dInfer: An Efficient Inference Framework for Diffusion Language Models