Organizations

@PyTorchKR @PyTorchKorea

Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)

Python 43 3 Updated Nov 24, 2025

RotBench: Evaluating Multimodal Large Language Models on Identifying Image Rotation (arxiv preprint)

Python 5 Updated Aug 25, 2025

🌟A curated list of DUSt3R-related papers and resources, tracking recent advancements using this geometric foundation model.

775 23 Updated Nov 5, 2025

Open-source unified multimodal model

Python 5,478 481 Updated Oct 27, 2025

FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI…

101,159 26,998 Updated Dec 2, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 17,271 1,447 Updated Nov 28, 2025
Python 57 5 Updated May 19, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 27,817 2,569 Updated Dec 19, 2025

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 644 41 Updated Oct 16, 2024

PhD Dissertation Template for UNC Computer Science

TeX 5 3 Updated Feb 7, 2023

Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"

Python 96 12 Updated Jan 21, 2024

Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks

Python 3,555 593 Updated Dec 17, 2025

Jupyter notebook server extension to proxy web services.

Python 385 150 Updated Dec 8, 2025

PyTorch code for System-1.x: Learning to Balance Fast and Slow Planning with Language Models

Python 24 2 Updated Jul 22, 2024

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Python 3,798 262 Updated May 17, 2025

A reading list of video generation

642 41 Updated Dec 18, 2025

LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR team.

JavaScript 503 46 Updated Feb 11, 2025
Python 3,890 255 Updated Mar 15, 2024

Official implementation of SEED-LLaMA (ICLR 2024).

Python 638 33 Updated Sep 21, 2024

Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model (ICLR 2025 Oral)

Python 462 15 Updated Feb 11, 2025

4M: Massively Multimodal Masked Modeling

Python 1,779 111 Updated Jun 2, 2025

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,561 551 Updated Nov 10, 2025

Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts (CVPR 2024)

Python 101 13 Updated Aug 6, 2025

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,330 279 Updated May 4, 2024

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 18,989 1,299 Updated Oct 21, 2025

Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024)

Python 39 1 Updated Jul 13, 2024

Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data

Python 35 1 Updated Mar 12, 2024

Code for the paper "pix2gestalt: Amodal Segmentation by Synthesizing Wholes" (CVPR 2024)

Python 191 11 Updated Jun 26, 2025