- Tsinghua University
- Beijing
- https://orcid.org/0000-0001-6637-8346
Stars
AI agents running research on single-GPU nanochat training automatically
🦞 Just talk to your agent — it learns and EVOLVES 🧬.
OpenClaw-RL: Train any agent simply by talking
OpenViking is an open-source context database designed specifically for AI Agents (such as openclaw). OpenViking unifies the management of context (memory, resources, and skills) that Agents need th…
Fast, small, and fully autonomous AI personal assistant infrastructure, ANY OS, ANY PLATFORM — deploy anywhere, swap anything 🦀
Tiny, Fast, and Deployable anywhere — automate the mundane, unleash your creativity
A lightweight alternative to OpenClaw that runs in containers for security. Connects to WhatsApp, Telegram, Slack, Discord, Gmail and other messaging apps, has memory, scheduled jobs, and runs dir…
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
PiloTY: AI pilot for PTY operations via MCP - enables AI agents to control interactive terminals like a human
Post-training with Tinker
LLMRouter: An Open-Source Library for LLM Routing
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Hands-on Ollama: deploy large models on a CPU. Read online at https://datawhalechina.github.io/handy-ollama/
code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Code for the paper "Planning with Diffusion for Flexible Behavior Synthesis"
Integrate the DeepSeek API into popular software
FlashMLA: Efficient Multi-head Latent Attention Kernels
Fully open reproduction of DeepSeek-R1
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
verl: Volcano Engine Reinforcement Learning for LLMs
Minimal reproduction of DeepSeek R1-Zero
Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKT…
A generative world for general-purpose robotics & embodied AI learning.
🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX