Rongjiehuang

Organizations

@AIGC-Audio

A simple SWE style browser agent framework that achieves SOTA results on long horizon web tasks.

Python 5,538 350 Updated Jun 3, 2026

Mobile-Agent: The Powerful GUI Agent Family

Python 8,859 889 Updated May 14, 2026

Qwen3-Coder is the code version of Qwen3, the large language model series developed by the Qwen team.

Python 16,645 1,206 Updated Mar 24, 2026

A brief and partial summary of RLHF algorithms.

151 3 Updated Mar 4, 2025

[ICLR 2026 Oral] ScaleCUA provides open-source computer use agents that operate across platforms (Windows, macOS, Ubuntu, Android).

Python 1,116 79 Updated Jan 7, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,516 597 Updated May 23, 2026

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,821 177 Updated Apr 20, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 22,083 4,109 Updated Jun 22, 2026

GroundCUA

Python 125 14 Updated Mar 24, 2026

EvoCUA: Evolving Computer Use Agent

Python 325 24 Updated Mar 31, 2026

runs anywhere. uses anything

TypeScript 29,276 8,802 Updated Jun 22, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 194,168 109,908 Updated Jun 8, 2026

This repository contains the code and metadata for the How2 dataset.

Python 193 19 Updated Dec 30, 2024

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 3,533 319 Updated May 26, 2026

MAGI-1: Autoregressive Video Generation at Scale

Python 3,709 238 Updated Jun 17, 2026

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,838 254 Updated Dec 30, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,577 18,329 Updated Jun 22, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 5,024 372 Updated Apr 6, 2026

Python 24 7 Updated Nov 26, 2025

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 491 32 Updated May 9, 2026

Large Concept Models: Language modeling in a sentence representation space

Python 2,365 210 Updated Jan 29, 2025

Code for the book "Machine Reading Comprehension: Algorithms and Practice"

Python 157 59 Updated Jul 25, 2024

Build local voice agents with open-source models

Python 4,894 584 Updated Jun 22, 2026

Official implementation of "HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment"

Python 119 2 Updated Apr 15, 2025

An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation

Python 1,615 83 Updated Oct 16, 2025

Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.

Python 881 71 Updated Jun 11, 2026

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 2,302 156 Updated Apr 13, 2026

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,759 273 Updated Jul 18, 2025

An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.

Python 1,919 219 Updated May 26, 2026