VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,705 311 Updated Nov 28, 2025

LEAKED SYSTEM PROMPTS FOR CHATGPT, GEMINI, GROK, CLAUDE, PERPLEXITY, CURSOR, DEVIN, REPLIT, AND MORE! - AI SYSTEMS TRANSPARENCY FOR ALL! 👐

12,326 2,432 Updated Nov 24, 2025

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

Python 234 26 Updated Dec 25, 2025

"AI-Trader: Can AI Beat the Market?" Live Trading Bench: https://ai4trade.ai Tech Report Link: https://arxiv.org/abs/2512.10971

Python 10,261 1,639 Updated Dec 19, 2025

Mobile-Agent: The Powerful GUI Agent Family

Python 6,820 698 Updated Dec 2, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,851 1,086 Updated Dec 25, 2025

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…

946 23 Updated Dec 24, 2025

The official implementation of VITA, VITA15, LongVITA, VITA-Audio, VITA-VLA, and VITA-E.

Python 135 2 Updated Oct 28, 2025

Fully Open Framework for Democratized Multimodal Training

Python 663 53 Updated Dec 15, 2025

SpatialVID: A Large-Scale Video Dataset with Spatial Annotations

Python 452 14 Updated Dec 15, 2025

PDF Parsing for RAG — Convert to Markdown & JSON, Fast, Local, No GPU

Java 811 42 Updated Dec 24, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,197 121 Updated Nov 9, 2025

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Python 4,030 672 Updated Dec 18, 2025

🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines

Python 3,399 217 Updated Dec 23, 2025

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,164 193 Updated Oct 9, 2025

E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with dedicated parsers and converters, supporting custom configs. E2…

Jupyter Notebook 1,246 68 Updated Sep 8, 2024

An open-source implementation for fine-tuning Qwen-VL series by Alibaba Cloud.

Python 1,504 190 Updated Dec 19, 2025

Repair invalid JSON documents

TypeScript 2,139 76 Updated Dec 10, 2025

Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]

Python 783 40 Updated Dec 14, 2025

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,289 103 Updated Oct 29, 2025

R1-onevision, a visual language model capable of deep CoT reasoning.

Python 574 16 Updated Apr 13, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,774 376 Updated Oct 21, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 834 54 Updated May 14, 2025

A fork to add multimodal model training to open-r1

Python 1,434 70 Updated Feb 8, 2025

Fully open reproduction of DeepSeek-R1

Python 25,756 2,407 Updated Nov 24, 2025

A library for advanced large language model reasoning

Python 2,321 204 Updated Jun 10, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,786 100 Updated Mar 18, 2025

The Next Step Forward in Multimodal LLM Alignment

Python 193 8 Updated May 1, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,655 840 Updated Dec 18, 2025

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 1,294 159 Updated Jan 4, 2025