The Chinese University of Hong Kong
Hong Kong
https://harryhsing.github.io/
in/xingzhenghao
@onehsing
Stars
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
SGLang is a high-performance serving framework for large language models and multimodal models.
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
[ICLR 2026] UniVideo: Unified Understanding, Generation, and Editing for Videos
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO across 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
slime is an LLM post-training framework for RL Scaling.
A curated collection of papers on thinking with videos.
Thinking with Videos from Open-Source Priors: reproduces chain-of-frames visual reasoning by fine-tuning open-source video models.
Scaling Long-Horizon LLM Agent via Context-Folding
[ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
Code for "AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs"
Official repo for paper "EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning"
[NeurIPS 2025] PyTorch implementation of ThinkSound, a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Tongyi Deep Research, the leading open-source deep research agent.
A Survey of Reinforcement Learning for Large Reasoning Models
A community-driven registry service for Model Context Protocol (MCP) servers.
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
[ICLR 2026] TraceRL & TraDo-8B: A Reinforcement Learning Framework for Diffusion Large Language Models
Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120)
The most open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.
A version of verl to support diverse tool use