Stars
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
SEED-Voken: A Series of Powerful Visual Tokenizers
TextGen: Implementation of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, UDA, SongNet, and others, with training and prediction that work out of the box.
Unitree Go2, Unitree G1 support for Nvidia Isaac Lab (Isaac Gym / Isaac Sim)
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
High-quality, training-free inpainting for every Stable Diffusion model. Supports ComfyUI.
Implementation of MagViT2 Tokenizer in PyTorch
⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation (AAAI 2025 Oral)
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
An MBTI Exploration of Large Language Models
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
MineWorld: A real-time interactive world model on Minecraft
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).
Benchmarking Legal Knowledge of Large Language Models
[ICCV 2025] VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE
GPT-ImgEval: Evaluating GPT-4o’s state-of-the-art image generation capabilities
[ICCV 2025] SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
MoH: Multi-Head Attention as Mixture-of-Head Attention