Stars
GDM Science Skills to speed up agentic scientific workflows with better grounding and higher token efficiency. Integrate insights from AlphaGenome, AFDB, UniProt and 30+ other databases and tools.
Claude Autoresearch Skill — Autonomous goal-directed iteration for Claude Code. Inspired by Karpathy's autoresearch. Modify → Verify → Keep/Discard → Repeat forever.
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…
Cultural Learning-Based Culture Adaptation of Language Models (https://aclanthology.org/2025.acl-long.156/)
[ICLR'26] RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding
(AAAI 2026) SoMe: A Realistic Benchmark for LLM-based Social Media Agents
[ICLR 2024] heterogeneous MoE: mixture of weak & strong experts on graphs https://openreview.net/pdf?id=wYvuY60SdD
Can AI agents predict whether they will succeed at a task?
A curated collection of AI agent research papers released in 2026, covering agent engineering, memory, evaluation, workflows, and autonomous systems.
This is our repository for the training code on the DeepResearch-9K dataset.
Recursive-Open-Meta-Agent v0.1 (Beta). A meta-agent framework to build high-performance multi-agent systems.
A collection of AI Agents papers (Updated biweekly)
🏆 Top-1 on 5+ benchmarks | Web UI | Supports MiroThinker, Claude, Kimi, OpenAI
HoTPP: An Event Sequence Prediction Benchmark
Elevate your AI research writing, no more tedious polishing ✨
Source code for our paper "Finding What Matters: Anchoring Context Knowledge with Evolving Indices for Iterative Retrieval"
CL-bench: A Benchmark for Context Learning
[paper][ACL 2026 main] Temp-R1: A Unified Autonomous Agent for Complex Temporal KGQA via Reverse Curriculum Reinforcement Learning
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞