minghsuanwu
  • Taipei


Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.

Python 533 32 Updated Dec 24, 2025

RealSee3D: A multi-view RGB-D dataset combining real-world captures and procedurally generated scenes, with extensible annotations for diverse 3D vision research.

Python 211 8 Updated Dec 18, 2025

🛜 ESPectre 👻 - Motion detection system based on Wi-Fi spectre analysis (CSI), with Home Assistant integration.

C 4,000 279 Updated Dec 24, 2025
Python 989 69 Updated Mar 24, 2025

Official implementation for "Android in the Zoo: Chain-of-Action-Thought for GUI Agents" (Findings of EMNLP 2024)

Python 96 5 Updated Oct 14, 2024

Sharp Monocular View Synthesis in Less Than a Second

Python 5,079 323 Updated Dec 19, 2025

Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"

Python 229 12 Updated Dec 9, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,354 3,243 Updated Dec 24, 2025
Python 39 2 Updated Dec 23, 2025

[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Python 464 18 Updated Dec 19, 2025

A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using autoregressive diffusion.

Python 253 36 Updated Dec 15, 2025

[AAAI'26 Oral] DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping

Python 453 33 Updated Aug 10, 2025

HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 739 45 Updated Dec 24, 2025

Native and Compact Structured Latents for 3D Generation

Python 2,333 162 Updated Dec 23, 2025

[3DV 2026] "SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass"

Jupyter Notebook 233 13 Updated Dec 15, 2025

Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Python 352 15 Updated Dec 15, 2025

The official repository of "Astra: General Interactive World Model with Autoregressive Denoising"

Python 172 3 Updated Dec 24, 2025

DDGS | Dux Distributed Global Search. A metasearch library that aggregates results from diverse web search services

Python 2,033 196 Updated Dec 19, 2025

Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

Python 402 29 Updated Apr 22, 2025

🎈🎈🎈🎈 Annual party lottery program: a dynamic 3D sphere lottery application built with threejs + vue3.

Vue 2,071 437 Updated Dec 24, 2025
Python 43 3 Updated Dec 23, 2025

A next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…

TypeScript 14,988 1,543 Updated Dec 25, 2025

✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,465 180 Updated Mar 28, 2025

Open-source mobile vehicle model recognition (Mobile Platform Vehicle Identification Model)

C++ 454 135 Updated Jul 9, 2020

Agent Zero AI framework

Python 12,671 2,478 Updated Dec 22, 2025

[arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

Python 67 Updated Dec 23, 2025

An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone

Python 19,369 3,072 Updated Dec 22, 2025
Jupyter Notebook 34 4 Updated Dec 13, 2023

Open-Source Frontier Voice AI

Python 19,019 2,100 Updated Dec 17, 2025