Stars
This repository is the official implementation for our paper "Beyond Next-Token Alignment: Distilling Multimodal Large Language Models via Token Interactions."
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
Enjoy the magic of Diffusion models!
[NeurIPS 2025] UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions
[T-PAMI 2025] EMOv2: Pushing 5M Vision Model Frontier
A family of highly capable yet efficient large multimodal models
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.