RainBowLuoCS/README.md

Hi, I am Run Luo 👋

I am an incoming Ph.D. student at the National University of Singapore. I received my Master’s degree in Computer Technology from the University of Chinese Academy of Sciences (UCAS) in 2026. Prior to this, I received my Bachelor’s degree in Software Engineering from Huazhong University of Science and Technology (HUST) in 2023. My research interests focus on Visual Tracking, Diffusion Models, Multi-modal Learning, and Large Language Models. I am currently exploring a unified omnimodal foundation model spanning vision, audio, and text modalities. I hope to see generation and understanding reinforce each other in such a model, thereby extending the intelligence boundaries of existing models. I firmly believe that it can unify the paradigms of world models and vision-language-action models and, through this, benefit interactions across different physical devices and the real world.

Pinned

  1. DiffusionTrack Public

    [AAAI 2024] DiffusionTrack: Diffusion Model For Multi-Object Tracking. DiffusionTrack is the first work to employ the diffusion model for multi-object tracking by formulating it as a generative noi…

    Python 207 12

  2. DEEM Public

    (ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.

    Python 51 6

  3. MMEvol Public

    (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"

    Jupyter Notebook 22 2

  4. OpenOmni Public

    (NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis

    Python 140 7

  5. GUI-R1 Public

    Forked from ritzz-ai/GUI-R1

    Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents

    Python 3