Beihang University
Beijing
@Chuanye_Wang888
Stars
Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
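A tool for splitting the NuScenes autonomous driving dataset. Supports several modes, including exact scene splitting, range splitting, keyword search, and batch processing. Suited to setups with limited hardware resources or strict requirements on how the dataset is split.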
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
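🚀 Production-Ready Multi-Agent System for WeChat Article Generation - Automated pipeline that actually works! | An AI article generation system for WeChat Official Accounts that actually works: multi-agent collaboration, automated processing, easy to understand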
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge) (CoRL 2024)
OpenMMLab Detection Toolbox and Benchmark
(ICLR2025) Enhancing End-to-End Autonomous Driving with Latent World Model
[NeurIPS 2025] Future-Aware End-to-End Driving: Bidirectional Modeling of Trajectory Planning and Scene Evolution
Model-based design and verification for robotics.
Labs for MIT 6.S184/6.S975, IAP 2025
[CVPR2024] NeuRAD: Neural Rendering for Autonomous Driving
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Unified framework for robot learning built on NVIDIA Isaac Sim
Robot bimanual manipulation / dual-arm manipulation
A powerful tool for creating fine-tuning datasets for LLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
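🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports multi-turn conversation with ChatGPT and may be the first open-source smart speaker project to support brain-computer interaction.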
Wan: Open and Advanced Large-Scale Video Generative Models
Tien Kung-Lab: Direct IsaacLab Workflow for Legged Robots