Stars
Demo code for Horikawa, T. (2025) Mind captioning: Evolving descriptive text of mental content from human brain activity. Science Advances https://doi.org/10.1126/sciadv.adw1464
Zhejiang University Graduation Thesis LaTeX Template
BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity (ICLR 2024)
Learning Deep Representations of Data Distributions
This repo contains the code for a 1D tokenizer and generator
Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training"
Emu Series: Generative Multimodal Models from BAAI
[CVPR 2022] Official PyTorch Implementation for DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models
Official implementations of our LaZSL (ICCV'25)
Official PyTorch Code for Anchor Token Guided Prompt Learning Methods: [ICCV 2025] ATPrompt and [arXiv 2511.21188] AnchorOPT
[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
[ICCV 2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024)
🦎 Yo'Chameleon: Your Personalized Chameleon (CVPR 2025)
(SRA) No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Official Implementation of weights2weights
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
PyTorch code and models for the DINOv2 self-supervised learning method.
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
NeurIPS 2025 Spotlight; ICLR2024 Spotlight; CVPR 2024; EMNLP 2024
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
✔ (Completed) The most comprehensive deep learning notes [Tudui: PyTorch] [Mu Li: Dive into Deep Learning] [Andrew Ng: Deep Learning] [Dafei: Large-Model Agents]