MartinXM

🎯

Focusing

Martin MartinXM

Struggle and have some fun.

33 followers · 55 following

Tongyi Lab, Alibaba
Hangzhou

Stars

LJungang / Awesome-Video-Reasoning-Landscape

🔥An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.

108 4 Updated Dec 20, 2025

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 2,956 358 Updated Dec 23, 2025

danieladdisonorg / DeepSeek-R1-Voice-Agent

An interactive AI voice agent that can capture and transcribe speech in real-time, generate intelligent responses using the DeepSeek R1 (7B model) AI, and convert the responses back to natural spee…

Python 22 5 Updated Jun 20, 2025

magenta / magenta-realtime

Python 943 97 Updated Dec 17, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 10,025 1,254 Updated Nov 3, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,500 481 Updated Oct 27, 2025

OpenHands / OpenHands

🙌 OpenHands: AI-Driven Development

Python 65,866 8,103 Updated Dec 23, 2025

PKUFlyingPig / cs-self-learning

A self-study guide to computer science

HTML 70,219 7,794 Updated Nov 28, 2025

PicoTrex / GPT-ImgEval

GPT-ImgEval: Evaluating GPT-4o’s state-of-the-art image generation capabilities

Python 305 8 Updated May 3, 2025

ByteByteGoHq / ml-bytebytego

993 202 Updated May 13, 2025

dome272 / Flow-Matching

My take on Flow Matching

Jupyter Notebook 86 12 Updated Jan 11, 2025

comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 97,813 11,080 Updated Dec 23, 2025

xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.

Jupyter Notebook 3,530 316 Updated Feb 18, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,855 303 Updated Jun 12, 2025

TIGER-AI-Lab / TheoremExplainAgent

Official Repo for "TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding" [ACL 2025 oral]

Python 1,441 188 Updated Jul 27, 2025

alirezadir / Machine-Learning-Interviews

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 7,356 1,332 Updated Nov 28, 2025

wenqsun / DimensionX

[ICCV'25] DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Python 1,320 75 Updated Oct 17, 2025

Doubiiu / ToonCrafter

[SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation

Python 5,929 528 Updated Mar 19, 2025

diff-usion / Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

HTML 12,211 1,011 Updated Aug 1, 2024

wangkai930418 / awesome-diffusion-categorized

collection of diffusion model papers categorized by their subareas

2,090 95 Updated Dec 22, 2025

Cominclip / BoxDiff-XL

Extend BoxDiff to SDXL (SDXL-based layout-to-image generation)

Python 25 2 Updated May 23, 2024

ali-vilab / Ranni

Python 238 16 Updated Apr 10, 2024

kongzhecn / OMG

[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models

Python 700 46 Updated Jul 2, 2024

lifeisboringsoprogramming / sd-webui-lora-masks

Apply unlimited masks to unlimited LoRA models

Python 50 4 Updated Jul 24, 2023

wooyeolbaek / attention-map-diffusers

🚀 Cross attention map tools for huggingface/diffusers

Python 374 27 Updated Jan 18, 2025

FireRedTeam / StoryMaker

StoryMaker: Towards consistent characters in text-to-image generation

Python 717 61 Updated Dec 2, 2024

levihsu / OOTDiffusion

[AAAI 2025] Official implementation of "OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on"

Python 6,487 939 Updated May 13, 2024

baaivision / Emu3

Next-Token Prediction is All You Need

Python 2,270 91 Updated Nov 19, 2025

kousw / experimental-consistory

Python 112 5 Updated Mar 3, 2024

suchot / DevConsiStory

experimental implementation of Consistory

Python 20 2 Updated Jul 15, 2024