kentang-mit
🎯 Focusing


VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Python 96 2 Updated Oct 24, 2024

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

457 9 Updated Oct 25, 2024

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 271 8 Updated Oct 16, 2024

A sparse attention kernel supporting mixed sparse patterns

C++ 49 Updated Oct 15, 2024

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 951 50 Updated Sep 27, 2024
Python 104 6 Updated Jul 12, 2024

[ICML 2024] LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

Python 55 5 Updated May 31, 2024

Code for the paper DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents, ICML 2024

Python 71 2 Updated Jun 12, 2024

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 189 18 Updated Oct 21, 2024

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 674 27 Updated Sep 27, 2024

Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning

Python 65 3 Updated Jun 16, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 268 5 Updated Oct 25, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,282 53 Updated Aug 15, 2024

Tile primitives for speedy kernels

Cuda 1,579 60 Updated Oct 30, 2024

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 490 27 Updated Sep 27, 2024

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Jupyter Notebook 515 29 Updated Oct 6, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Python 1,416 147 Updated Oct 28, 2024
Python 114 10 Updated Jun 12, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python 425 21 Updated Sep 5, 2024

PyTorch emulation library for Microscaling (MX)-compatible data formats

Python 158 20 Updated Sep 23, 2024
Jupyter Notebook 887 101 Updated Apr 29, 2024

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 4,196 309 Updated Oct 6, 2024

Code for QuaRot, an end-to-end 4-bit inference of large language models.

Python 270 20 Updated Jul 22, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,202 277 Updated May 4, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,947 157 Updated Oct 24, 2024

Microsoft Collective Communication Library

C++ 314 31 Updated Sep 20, 2023

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Python 724 44 Updated Jul 29, 2024

Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.

Python 1,743 123 Updated Feb 23, 2024

ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models

Python 161 6 Updated Oct 8, 2024