JamesHujy
🎯 Focusing

Jupyter Notebook 213 4 Updated Dec 19, 2025
Python 2 1 Updated Oct 16, 2025
Python 37 5 Updated Dec 26, 2025

Official implementation of "Figure It Out: Improve the Frontier of Reasoning with Active Visual Thinking"

Python 13 Updated Jan 13, 2026
Python 521 51 Updated Jan 28, 2026

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,758 66 Updated Jan 20, 2026

A General, Accurate, Long-Horizon, and Efficient Mobile Agent driven by Multimodal Foundation Models

Python 340 10 Updated Nov 18, 2025

LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence https://arxiv.org/abs/2509.03505

Python 3,092 285 Updated Dec 18, 2025
Python 869 45 Updated Sep 15, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,155 3,232 Updated Feb 11, 2026

Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team.

Python 15,470 1,079 Updated Feb 3, 2026
Python 81 Updated Oct 18, 2025

FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI…

114,212 29,558 Updated Feb 6, 2026

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,324 40 Updated Feb 3, 2026

OmniGen2: Exploration to Advanced Multimodal Generation. https://arxiv.org/abs/2506.18871

Jupyter Notebook 4,025 17 Updated Dec 2, 2025

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.

Python 3,078 272 Updated Jul 7, 2025

Official implementation of BLIP3o-Series

Python 1,637 78 Updated Nov 29, 2025

[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Python 955 66 Updated Jul 10, 2025

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Python 840 25 Updated Dec 23, 2025

The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning

Python 331 18 Updated May 31, 2025

Open-source unified multimodal model

Python 5,659 501 Updated Oct 27, 2025

Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"

Python 273 27 Updated Oct 16, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,805 103 Updated Mar 18, 2025

PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]

Python 62 8 Updated Oct 2, 2025

Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]

Python 819 42 Updated Dec 14, 2025

Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.

Python 909 82 Updated Jan 6, 2026

This package contains the original 2012 AlexNet code.

Cuda 2,826 365 Updated Mar 12, 2025

A series of technical reports on Slow Thinking with LLM

Python 759 41 Updated Aug 13, 2025