Starred repositories
Code for 'Do Sparse Autoencoders Capture Concept Manifolds?'
A curated collection of papers and resources on On-Policy Distillation for Large Language Models.
Ongoing research training transformer models at scale
A Mechanistic Interpretability Toolkit for Cross-Layer Transcoder Training and Attribution-Graph Visualization
A Chinese-language financial trading framework based on multi-agent LLMs - TradingAgents Chinese enhanced edition
Open source interpretability artefacts for R1.
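🎓 An in-depth walkthrough of the Minimind project for training a large model from scratch, including but not limited to the architecture and algorithms used, plus LLM interview experience
Notes on the basic end-to-end LLM training pipeline (Tokenizer -> PreTraining -> SFT -> DPO -> GRPO)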
The agent that grows with you
🧠 "Large model": train a 64M-parameter LLM completely from scratch in just 2h!
Open-source AI video generation workspace powered by AI Agents, Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI: novel → character/scene/prop design → script → storyboard → video, with characters and scenes kept consistent across shots
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
AI agents running research on single-GPU nanochat training automatically
Run agents like Hermes and OpenClaw more securely inside NVIDIA OpenShell with managed inference
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
[ICML 2026] "Step-Level Sparse Autoencoder for Reasoning Process Interpretation"
Probing the Trajectories of Reasoning Traces in Large Language Models
[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models