Stars
A user-friendly, beautiful comment area for your blog
Pixio: a capable vision encoder dedicated to dense tasks, simply by pixel reconstruction
A paper list for spatial reasoning
CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model
[ArXiv 2025] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
WorldPlay: Interactive World Modeling with Real-Time Latency and Geometric Consistency
Native and Compact Structured Latents for 3D Generation
🌐 WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Open-source release accompanying Gao et al. 2025
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards.
Deepagents is an agent harness built on langchain and langgraph. Deep agents are equipped with a planning tool, a filesystem backend, and the ability to spawn subagents - making them well-equipped …
Taming large-scale full-parameter few-step training with self-adversarial flows! 👏🏻
A Definition of Scientific General Intelligence
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
A toolkit for developing and comparing reinforcement learning algorithms.
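An AI-native PPT generator based on nano banana pro🍌, moving toward a true "Vibe PPT"; supports uploading any template image; uploading any source material with intelligent parsing; automatic PPT generation from a single sentence, an outline, or page descriptions; spoken edits to a specified region; and one-click export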
🚀 An awesome list of curated Nano Banana Pro prompts and examples. Your go-to resource for mastering prompt engineering and exploring the creative potential of the Nano Banana Pro (Nano Banana 2) AI…
ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation
LATTICE: Democratize High-Fidelity 3D Generation at Scale
This repository catalogs cutting-edge research papers, practical tools, datasets, and learning materials for AI-powered SVG generation, processing, and manipulation.
Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints.
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation