- Palo Alto, CA
- UTC -08:00
- zhaoyue-zephyrus.github.io
- @__yuezhao__
Stars
A latent text-to-image diffusion model
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding). Translations: 🇺🇸 🇨🇳 🇯🇵 🇮🇹 🇰🇷 🇷🇺 🇧🇷 🇪🇸
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
Taming Transformers for High-Resolution Image Synthesis
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
Simple tutorials using Google's TensorFlow Framework
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Benchmarking Knowledge Transfer in Lifelong Robot Learning
This repo contains the code for a 1D tokenizer and generator
[NeurIPS 2024] Code release for "Segment Anything without Supervision"
Code for Ditto: Building Digital Twins of Articulated Objects from Interaction
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation