yiyexy

🎯

Focusing

19 followers · 27 following

Shanghai, China

Stars

Gen-Verse / MMaDA

MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 1,542 78 Updated Nov 16, 2025

EvolvingLMMs-Lab / OneVision-Encoder

Python 20 Updated Dec 25, 2025

EvolvingLMMs-Lab / LLaVA-OneVision-1.5-RL

Fully Open Framework for Democratized Multimodal Reinforcement Learning.

Python 28 2 Updated Dec 19, 2025

Luodian / nano-hevc

A minimal, educational HEVC (H.265) encoder written in Python.

Python 27 Updated Dec 10, 2025

inclusionAI / AReaL

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,280 258 Updated Dec 25, 2025

EvolvingLMMs-Lab / OpenMMReasoner

Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Python 129 5 Updated Dec 17, 2025

VisionXLab / ProCLIP

Official PyTorch implementation of ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

Python 17 2 Updated Dec 4, 2025

ML-GSAI / LLaDA

Official PyTorch implementation for "Large Language Diffusion Models"

Python 3,428 231 Updated Nov 12, 2025

GaryGuTC / UniME-v2

[AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"

Python 51 Updated Dec 8, 2025

PrLeung / GAR

Forked from volcengine/verl

Python 3 Updated Nov 20, 2025

QwenLM / Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,164 193 Updated Oct 9, 2025

Anning01 / AIMedia

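AIMedia is an integrated tool that automatically scrapes trending topics, writes articles with AI, and publishes them automatically. It supports Toutiao, Xiaohongshu, WeChat Official Accounts, and other platforms.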

Python 768 151 Updated Dec 25, 2025

deepinsight / insightface

State-of-the-art 2D and 3D Face Analysis Project

Python 27,411 5,872 Updated Nov 25, 2025

EvolvingLMMs-Lab / LLaVA-OneVision-1.5

Fully Open Framework for Democratized Multimodal Training

Python 663 53 Updated Dec 15, 2025

rednote-hilab / dots.ocr

Multilingual Document Layout Parsing in a Single Vision-Language Model

Python 5,933 579 Updated Oct 31, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,851 1,087 Updated Dec 25, 2025