Starred repositories
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
A toolkit for developing and comparing reinforcement learning algorithms.
Python sample code and a textbook for robotics algorithms.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, OpenClaw, Factory Droid, Trae). Turn any folder of code, docs, papers, images, videos, or YouTube links into a queryable…
The conversational control layer for customer-facing AI agents - Parlant is an agentic harness optimized for controlling customer interactions.
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Refine high-quality datasets and visual AI models
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
YOLOX is a high-performance anchor-free YOLO that exceeds YOLOv3 through YOLOv5, with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO support. Documentation: https://yolox.readthedocs.io/
Unified framework for robot learning built on NVIDIA Isaac Sim
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
Google Drive Public File Downloader when Curl/Wget Fails
XLeRobot: Practical Dual-Arm Mobile Home Robot for $660
Transparent and Efficient Financial Analysis
Visual localization made easy with hloc
Open-source hardware and software platform for building a small-scale self-driving car.
A collection of high-quality models for the MuJoCo physics engine, curated by Google DeepMind.
pySLAM is a hybrid Python/C++ Visual SLAM pipeline supporting monocular, stereo, and RGB-D cameras. It provides a broad set of modern local and global feature extractors, multiple loop-closure stra…
Isaac Gym Reinforcement Learning Environments
Isaac Gym Environments for Legged Robots
A fast and simple implementation of learning algorithms for robotics.