Rongjiehuang
🎯
Focusing. I may be slow to reply.

Organizations

@AIGC-Audio


A brief and partial summary of RLHF algorithms.

151 3 Updated Mar 4, 2025

[ICLR 2026 Oral] ScaleCUA is an open-source computer use agent that can operate across platforms (Windows, macOS, Ubuntu, Android).

Python 1,115 79 Updated Jan 7, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,498 595 Updated May 23, 2026

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,823 177 Updated Apr 20, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 21,985 4,079 Updated Jun 15, 2026

GroundCUA

Python 125 14 Updated Mar 24, 2026

EvoCUA: Evolving Computer Use Agent

Python 325 24 Updated Mar 31, 2026

runs anywhere. uses anything

TypeScript 28,943 8,756 Updated Jun 15, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,847 109,959 Updated Jun 8, 2026

This repository contains the code and metadata of the How2 dataset

Python 193 19 Updated Dec 30, 2024

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 3,533 319 Updated May 26, 2026

MAGI-1: Autoregressive Video Generation at Scale

Python 3,706 238 Updated Jun 17, 2025

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,838 254 Updated Dec 30, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 82,958 18,090 Updated Jun 15, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 5,014 373 Updated Apr 6, 2026
Python 23 7 Updated Nov 26, 2025

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 492 32 Updated May 9, 2026

Large Concept Models: Language modeling in a sentence representation space

Python 2,363 210 Updated Jan 29, 2025

Code for the book "Machine Reading Comprehension: Algorithms and Practice"

Python 157 59 Updated Jul 25, 2024

Build local voice agents with open-source models

Python 4,884 582 Updated Jun 15, 2026

Official implementation of "HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment"

Python 118 2 Updated Apr 15, 2025

An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation

Python 1,608 83 Updated Oct 16, 2025

Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.

Python 883 71 Updated Jun 11, 2026

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 2,299 157 Updated Apr 13, 2026

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,761 273 Updated Jul 18, 2025

An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.

Python 1,914 217 Updated May 26, 2026

[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

Python 1,621 127 Updated Jan 26, 2026

Scalable and memory-optimized training of diffusion models

Python 1,360 140 Updated May 26, 2026