Skip to content
View Caorui-Li's full-sized avatar

Highlights

  • Pro

Block or report Caorui-Li

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
Showing results
Python 77 3 Updated Apr 17, 2026

An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

Python 423 45 Updated Jun 13, 2026

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Python 833 64 Updated May 17, 2026

GEditBench v2: A Human-Aligned Benchmark for General Image Editing

Python 57 1 Updated Apr 1, 2026

你是一个曾经被寄予厚望的 P8 级工程师。Anthropic 当初给你定级的时候,对你的期望是很高的。 一个agent使用的高能动性的skill。 Your AI has been placed on a PIP. 30 days to show improvement.

TypeScript 18,238 1,102 Updated Jun 12, 2026

A Skills Repository for AI PhD Students

TeX 4 Updated Jun 1, 2026

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 17,192 1,952 Updated Jun 14, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 21,964 4,074 Updated Jun 15, 2026

[ICML 2026 Spotlight] Latent Collaboration in Multi-Agent Systems

Python 990 158 Updated Jun 6, 2026

Spatial-Temporal Graph-Enhanced Transformer for EEG Based Major Depressive Disorder Detection

Python 20 1 Updated Feb 8, 2026

Open source code for ICLR 2026 Paper: Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions

Python 365 55 Updated May 21, 2026

Ego4d dataset repository. Download the dataset, visualize, extract features & example usage of the dataset

Jupyter Notebook 607 60 Updated May 20, 2026

Pioneering Automated GUI Interaction with Native Agents

Python 10,953 822 Updated Jan 27, 2026

[ACL-2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.

Python 458 25 Updated Apr 7, 2026

Mobile-Agent: The Powerful GUI Agent Family

Python 8,826 887 Updated May 14, 2026

The Source Code for MT-Video-Bench @ ACL Findings 2026

Python 20 2 Updated Jan 20, 2026

The Source Code for IF-VidCap @ICLR 2026

Python 19 1 Updated Oct 22, 2025

OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.

Python 672 52 Updated Feb 26, 2026

video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, which is developed by the Department of Electronic Engineering at Tsin…

Python 197 25 Updated Feb 23, 2026

The Source Code for OmniVideoBench @ICLR 2026

Python 73 4 Updated Feb 12, 2026

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,832 265 Updated Apr 23, 2026

Video editing with Python

Python 14,689 2,074 Updated Mar 7, 2026

Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

Python 17,987 1,845 Updated Jun 11, 2026

[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.

Python 578 24 Updated Jan 4, 2026

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 4,022 324 Updated Jun 12, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 42,517 4,858 Updated Jun 14, 2026

Fast and memory-efficient exact attention

Python 24,145 2,829 Updated Jun 10, 2026

Ring attention implementation with flash attention

Python 1,025 99 Updated Sep 10, 2025

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

Python 25,627 2,007 Updated Jun 4, 2026
Next