SenseNova-U series: Native Unified Paradigm with NEO-unify from the First Principles

Python 3,249 284 Updated Jun 15, 2026

The official implementation of T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation with Direct Discriminative Optimization

Python 24 1 Updated May 20, 2026

[ICML 2026] Code & model for the arXiv paper "Autoregressive Image Generation with Masked Bit Modeling"

Python 57 1 Updated May 1, 2026

Nano vLLM

Python 14,088 2,231 Updated Apr 26, 2026

[NeurIPS' 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

Python 828 37 Updated Apr 4, 2026

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,635 321 Updated Jun 18, 2026

Tile primitives for speedy kernels

Cuda 3,454 297 Updated Jun 15, 2026

DFloat11 [NeurIPS '25]: Lossless Compression of LLMs and DiTs for Efficient GPU Inference

Python 639 38 Updated Nov 24, 2025

[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding

Python 527 12 Updated Nov 14, 2025

[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Python 1,497 57 Updated Dec 16, 2025

🔥 How to efficiently and effectively compress CoTs, or directly generate concise CoTs during inference, while maintaining reasoning performance is an important topic!

65 2 Updated May 22, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 845 53 Updated May 14, 2025

Paper list for Efficient Reasoning.

889 45 Updated May 29, 2026

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,480 4,790 Updated May 1, 2026

Strong and Open Vision Language Assistant for Mobile Devices

Python 1,358 88 Updated Apr 15, 2024

Sky-T1: Train your own O1 preview model within $450

Python 3,391 345 Updated Jul 12, 2025

📚 Collection of token-level model compression resources.

198 9 Updated Sep 3, 2025

[ICLR2025] Accelerating Diffusion Transformers with Token-wise Feature Caching

Python 220 10 Updated Mar 14, 2025

Material for gpu-mode lectures

Jupyter Notebook 6,190 623 Updated Jun 15, 2026

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,821 324 Updated Mar 12, 2026

More relighting!

Python 8,443 524 Updated Feb 20, 2025

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI

Python 1,372 73 Updated Jan 27, 2026

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 1,376 71 Updated Mar 5, 2025

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,759 273 Updated Jul 18, 2025

SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training

Python 189 7 Updated Jan 27, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,872 2,773 Updated Aug 12, 2024

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Python 977 66 Updated Mar 24, 2026

Mamba SSM architecture

Python 18,455 1,758 Updated Jun 15, 2026

Official inference repo for FLUX.1 models

Python 25,641 1,895 Updated Jul 31, 2025

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 650 90 Updated Sep 11, 2024