Central South University, China
Lists (30)
3d dataset
3d photo
3dgs
AI_isp
AIGC_Engineering
chatgpt
dataset
denoise
depth completion
depth diffusion
depth sr
diffusion
engineering - deep learning
hdr
image fusion related
image restoration
inpainting
Matting
mono depth
normal
novel view synthesis
object detection
optical flow
relighting
segmentation
stable diffusion and related
stereo matching
stitching
super resolution
video stabilization
Stars
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.
[CVPR 2026] PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.
📹 A more flexible framework that can generate videos at any resolution and creates videos from images.
We introduce BabyVision, a benchmark revealing the infancy of AI vision.
omo/lazycodex: The coding agent for tokenmaxxers; the one and only agent harness for complex codebases. For your Codex, for your OpenCode
Official implementation of "WorDepth: Variational Language Prior for Monocular Depth Estimation"
Get 10X more out of Claude Code, Codex or any coding agent
[ECCV 2026] Towards Scalable Pre-training of Visual Tokenizers for Generation
A Cross-Platform Backend for High-Performance Sparse Convolutions
A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing
[ArXiv 2025] MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
Kandinsky 5.0: A family of diffusion models for Video & Image generation
Lightweight Image Video Action Generation Inference Framework
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
[NeurIPS 2025] Official implementation of ScaleDiff: Higher-Resolution Image Synthesis via Efficient and Model-Agnostic Diffusion
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
[CVPR 2026] Scaling Spatial Intelligence with Multimodal Foundation Models
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
Automatically crawl arXiv papers daily, summarize them using AI, and present them on GitHub Pages.
Cambrian-S: Towards Spatial Supersensing in Video
[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation
Interactive, editable docs designed for coding agents
Collection of diffusion model papers categorized by their subareas
Open-source AI hackers to find and fix your app’s vulnerabilities.
[WACV'25 Oral] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
[ICCV 25] OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting