
AI agents running research on single-GPU nanochat training automatically

Python 81,532 11,854 Updated Mar 26, 2026

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…

Python 9,697 936 Updated May 17, 2026
Python 6 Updated Mar 7, 2026

The code implementation for UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings (ICLR 2026).

Python 59 3 Updated Feb 25, 2026

FireRed-Image-Edit is a powerful image editing foundation model achieving open-source state-of-the-art performance with precise instruction following, high-fidelity generation, superior identity co…

Python 1,213 74 Updated Apr 3, 2026
Python 36 4 Updated Apr 9, 2026

Statistical Learning course at USTC. Review materials for the USTC Statistical Learning course (taught by Liu Dong).

TeX 62 10 Updated Jan 9, 2024

Collection of papers about video-audio understanding

25 1 Updated Dec 26, 2025

Official implementation of RLFR: Extending Reinforcement Learning for LLMs with Flow Environment

Python 47 1 Updated Nov 15, 2025

A Comprehensive Dataset for Advanced Image Generation and Editing

32 2 Updated Oct 2, 2025
Python 1,928 122 Updated Sep 30, 2025

[ICLR-2026] Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning

Python 149 7 Updated Jun 30, 2025
Python 1,211 76 Updated Nov 20, 2025

Latest open-source "Thinking with images" (O3/O4-mini) papers, covering training-free, SFT-based, and RL-enhanced methods for "fine-grained visual understanding".

113 2 Updated Aug 21, 2025

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,456 45 Updated Mar 9, 2026
Python 24 1 Updated Oct 16, 2025

New generation of CLIP with strong fine-grained discrimination capability (ICML 2026 and ICML 2025)

Python 754 36 Updated May 8, 2026

LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning

Python 77 3 Updated May 23, 2025

[Extended version, ICLR 2025 Blog Track] Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Python 838 50 Updated Jun 16, 2025

Official implementation of BLIP3o-Series

Python 1,654 78 Updated Nov 29, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,574 66 Updated Jun 14, 2025

[ICCV 2025] Code release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

Python 189 6 Updated May 21, 2025
Python 2,503 245 Updated Jul 16, 2025

This is a repo to track the latest autoregressive visual generation papers.

431 6 Updated Jun 25, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 1,426 93 Updated May 15, 2026

GPT-ImgEval: Evaluating GPT-4o’s state-of-the-art image generation capabilities

Python 306 8 Updated May 3, 2025

Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"

Jupyter Notebook 316 11 Updated Sep 28, 2025

A collection of awesome text-to-image generation studies.

TeX 757 40 Updated Apr 25, 2026

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,435 205 Updated Feb 7, 2026