36 results for source starred repositories written in Python

Open-Sora: Democratizing Efficient Video Production for All

Python 27,786 2,755 Updated Apr 30, 2025

State-of-the-art 2D and 3D Face Analysis Project

Python 26,965 5,812 Updated Sep 27, 2025

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,193 1,664 Updated Sep 24, 2025

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,102 1,210 Updated Nov 4, 2025

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 9,700 618 Updated Feb 21, 2025

BoxMOT: Pluggable SOTA multi-object tracking modules for segmentation, object detection and pose estimation models

Python 7,776 1,855 Updated Oct 31, 2025

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,990 481 Updated Mar 18, 2025

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 6,968 693 Updated Jan 22, 2025

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 6,643 543 Updated Jul 11, 2024

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷

Python 5,484 286 Updated Nov 7, 2025

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,639 304 Updated Oct 20, 2025

SAPIEN Manipulation Skill Framework, an open source GPU parallelized robotics simulator and benchmark, led by Hillbot, Inc.

Python 2,227 381 Updated Nov 5, 2025

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,782 107 Updated Sep 27, 2024

RetinaFace: Deep Face Detection Library for Python

Python 1,763 180 Updated Aug 11, 2025

[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,763 76 Updated Oct 22, 2025

RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

Python 1,517 145 Updated Sep 28, 2025

SEED-Voken: A Series of Powerful Visual Tokenizers

Python 971 35 Updated Oct 22, 2025

Official Algorithm Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"

Python 831 96 Updated Apr 18, 2024

[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions

Python 820 49 Updated Nov 6, 2025

RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter. 🎉🎉🎉

Python 680 57 Updated Sep 30, 2025

A-MEM: Agentic Memory for LLM Agents

Python 668 80 Updated Oct 21, 2025

Official repo and evaluation implementation of VSI-Bench

Python 618 37 Updated Aug 5, 2025

Low-level locomotion policy training in Isaac Lab

Python 350 30 Updated Mar 7, 2025

[RSS 2024 & RSS 2025] VLN-CE evaluation code of NaVid and Uni-NaVid

Python 298 20 Updated Oct 15, 2025

PyTorch implementation of paper "ARTrack" and "ARTrackV2"

Python 292 35 Updated Oct 20, 2025
Python 280 34 Updated Mar 17, 2025

[CoRL 2025] Repository relating to "TrackVLA: Embodied Visual Tracking in the Wild"

Python 267 19 Updated Oct 16, 2025

Vision-Language Navigation Benchmark in Isaac Lab

Python 263 25 Updated Aug 28, 2025

🤖 RoboOS: A Universal Embodied Operating System for Cross-Embodied and Multi-Robot Collaboration

Python 243 29 Updated Sep 4, 2025

Embodied Reasoning Question Answer (ERQA) Benchmark

Python 240 12 Updated Mar 12, 2025