- Palo Alto, CA
- zhaoyue-zephyrus.github.io
Stars
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Official implementation of "E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training".
Curate, Annotate, and Manage Your Data in LightlyStudio.
Matplotlib styles for scientific plotting
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
A Python library for self-supervised learning on images.
Official repository for "VideoPrism: A Foundational Visual Encoder for Video Understanding" (ICML 2024)
Interactive Post-Training for Vision-Language-Action Models
Official training and inference code for the bitwise tokenizer
Benchmarking Knowledge Transfer in Lifelong Robot Learning
Code for ICCV 2025 (Best Student Paper Honorable Mention) "RayZer: A Self-supervised Large View Synthesis Model"
MAGI-1: Autoregressive Video Generation at Scale
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Official PyTorch implementation of One-Minute Video Generation with Test-Time Training
Mastering Diverse Domains through World Models
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
This package contains the original 2012 AlexNet code.
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
FlashMLA: Efficient Multi-head Latent Attention Kernels
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Official code release for the paper "RGB no more: Minimally Decoded JPEG Vision Transformers"
This repo contains the code for the 1D tokenizer and generator