NTU - Singapore - https://jiequancui.github.io/
Stars
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
Official Repository of Absolute Zero Reasoner
Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"
LLaVA-Mini is a unified large multimodal model (LMM) that efficiently supports understanding of images, high-resolution images, and videos.
openvla / openvla (forked from TRI-ML/prismatic-vlms): An open-source vision-language-action model for robotic manipulation.
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
Generative Agents: Interactive Simulacra of Human Behavior
12 Lessons to Get Started Building AI Agents
A curated list of reinforcement learning with human feedback resources (continually updated)
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Official repository of the paper “IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer”
Official PyTorch implementation for "Large Language Diffusion Models"
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
CODA: Repurposing Continuous VAEs for Discrete Tokenization
[ICLR 2025] Halton Scheduler for Masked Generative Image Transformer
High-Resolution Image Synthesis with Latent Diffusion Models
NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
[NeurIPS 2025 & ICML 2025 Workshop on Reliable and Responsible Foundation Models] A simple baseline achieving an over 90% success rate against strong black-box models (GPT-4.5/4o/o1). Paper at: https:/…
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
The Next Step Forward in Multimodal LLM Alignment
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"