Great Bay University
- Beijing, China
- 18281128hyh@gmail.com
Stars
GPT-Image-2 PPT Generator Skill for Creating Image-Based PowerPoint Presentations in Codex and Other Skill-Compatible Agents
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
[RSS 2026] Causal video-action world model for generalist robot control
Official repo for "Let ViT Speak: Generative Language-Image Pre-training"
Unified Codebase for Advanced World Models.
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.
Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals
Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons
Official code of Motus: A Unified Latent Action World Model
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forcing++
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
Light Image Video Generation Inference Framework
Easy Data Preparation with the latest LLM-based Operators and Pipelines.
ThinkGen: Generalized Thinking for Visual Generation
[CVPR 2026] Efficient Long Video Generation via Next-Frame-Rate Prediction
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation
Official JAX implementation of End-to-End Test-Time Training for Long Context
Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation, in PyTorch
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
[DEIMv2] Real Time Object Detection Meets DINOv3
You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos.
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation' (CVPR 2026 Highlight)