ghx2757

4 followers · 1 following

VibeVoice-ghx Public
Forked from microsoft/VibeVoice

Open-Source Frontier Voice AI

Python MIT License Updated Apr 20, 2026
pyannote-audio-ghx Public
Forked from pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook MIT License Updated Mar 26, 2026
ms-swift-ghx Public
Forked from modelscope/ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…

Python Apache License 2.0 Updated Mar 24, 2026
FunASR-ghx Public
Forked from modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python MIT License Updated Mar 17, 2026
Qwen-Agent-ghx Public
Forked from QwenLM/Qwen-Agent

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python Apache License 2.0 Updated Mar 4, 2026
Qwen3.5-ghx Public
Forked from QwenLM/Qwen3.6

Qwen3.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Apache License 2.0 Updated Mar 2, 2026
DisCo-ghx Public
Forked from Wangt-CN/DisCo

[CVPR2024] DisCo: Referring Human Dance Generation in Real World

Python Apache License 2.0 Updated Mar 2, 2026
DiffSynth-Studio-ghx Public
Forked from modelscope/DiffSynth-Studio

Enjoy the magic of Diffusion models!

Python Apache License 2.0 Updated Feb 6, 2026
sam-body4d-ghx Public
Forked from gaomingqi/sam-body4d

🏂 Training-Free Human Mesh Recovery from Videos, based on SAM-3, Diffusion-VAS, and SAM-3D-Body.

Python MIT License Updated Feb 3, 2026
Qwen3-VL-ghx Public
Forked from QwenLM/Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook Apache License 2.0 Updated Jan 30, 2026
Wan2.2-ghx Public
Forked from Wan-Video/Wan2.2

Wan: Open and Advanced Large-Scale Video Generative Models

Python Apache License 2.0 Updated Jan 28, 2026
TurboDiffusion-ghx Public
Forked from thu-ml/TurboDiffusion

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python Apache License 2.0 Updated Jan 26, 2026
SoulX-FlashTalk-ghx Public
Forked from Soul-AILab/SoulX-FlashTalk

SoulX-FlashTalk is the first 14B model to achieve a sub-second start-up latency (0.87s) while sustaining a real-time throughput of 32 FPS

Python Apache License 2.0 Updated Jan 23, 2026
echomimic_v3-ghx Public
Forked from antgroup/echomimic_v3

[AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation

Python Apache License 2.0 Updated Jan 6, 2026
rcm-ghx Public
Forked from NVlabs/rcm

rCM: SOTA Diffusion Distillation & Few-Step Video Generation based on sCM/MeanFlow

Python Apache License 2.0 Updated Jan 6, 2026
SenseVoice-ghx Public
Forked from FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

Python Other Updated Dec 30, 2025
LiveAvatar-ghx Public
Forked from Alibaba-Quark/LiveAvatar

Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length"

Python Apache License 2.0 Updated Dec 19, 2025
realtime-video-ghx Public
Forked from krea-ai/realtime-video

Krea Realtime 14B. An open-source realtime AI video model.

Python Other Updated Nov 13, 2025
InfiniteTalk-ghx Public
Forked from MeiGen-AI/InfiniteTalk

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Python Apache License 2.0 Updated Nov 4, 2025
lookahead-anchoring Public
Forked from j0seo/lookahead-anchoring

Updated Oct 27, 2025
LstmSync-ghx Public
Forked from oneCodeSuperman/LstmSync

The open-source LstmSync digital-human generalization model: building only the best generalization model!

Python Updated Oct 26, 2025
Wan2.1-ghx Public
Forked from Wan-Video/Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python Apache License 2.0 Updated Oct 20, 2025
Awesome-Human-Motion-Video-Generation Public
Forked from Winn1y/Awesome-Human-Motion-Video-Generation

【Accepted by TPAMI】Human Motion Video Generation: A Survey (https://ieeexplore.ieee.org/document/11106267)

MIT License Updated Oct 14, 2025
OpenS2V-Nexus-ghx Public
Forked from PKU-YuanGroup/OpenS2V-Nexus

[NeurIPS 2025 D&B🔥] OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

Jupyter Notebook Apache License 2.0 Updated Oct 14, 2025
HuMo-ghx Public
Forked from Phantom-video/HuMo

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Python Apache License 2.0 Updated Oct 11, 2025
StableAvatar-ghx Public
Forked from Francis-Rings/StableAvatar

We present StableAvatar, the first end-to-end video diffusion transformer, which synthesizes infinite-length high-quality audio-driven avatar videos without any post-processing, conditioned on a re…

Python 1 MIT License Updated Sep 19, 2025
LatentSync-ghx Public
Forked from bytedance/LatentSync

Taming Stable Diffusion for Lip Sync!

Python Apache License 2.0 Updated Jul 9, 2025
TANGO-ghx Public
Forked from CyberAgentAILab/TANGO

[ICLR 2025 Oral] TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation

Python Other Updated Jun 26, 2025
hallo3_ghx Public
Forked from fudan-generative-vision/hallo3

[CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Python MIT License Updated Jun 11, 2025
gaussian-splatting Public
Forked from graphdeco-inria/gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Python Other Updated Mar 19, 2025
