Stars
The most cited deep learning papers
Refine high-quality datasets and visual AI models
Awesome curated collection of images and prompts generated by gemini-2.5-flash-image (aka Nano Banana) state-of-the-art image generation and editing model. Explore AI generated visuals created with…
TensorFlow 2.x tutorials and examples, including CNN, RNN, GAN, Auto-Encoders, FasterRCNN, GPT, BERT examples, etc. TF 2.0 introductory example code and hands-on tutorials.
[ICCV 2019] Monocular depth estimation from a single image
Sequential model-based optimization with a `scipy.optimize` interface
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
An implementation of CycleGAN using TensorFlow
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
PyTorch Extension Library of Optimized Graph Cluster Algorithms
👉 CARLA resources such as tutorials, blogs, and code: https://github.com/carla-simulator/carla
A summary of related works on flow matching and stochastic interpolants
📓 Notes and summaries of various ML, Computer Vision & NLP papers.
Mask Transfiner for High-Quality Instance Segmentation, CVPR 2022
PyTorch implementation of MeanFlow
Autonomous navigation of UAVs using reinforcement learning algorithms.
[ICRA 2024 Oral] Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation
CEDNet: A Cascade Encoder-Decoder Network for Dense Prediction (Pattern Recognition 2024)
clintonjwang / ControlNet
Forked from lllyasviel/ControlNet
Generate videos that interpolate between two given images
AIOZ-GDANCE: a large-scale dataset & baseline for music-driven group dance generation. (CVPR 2023)
[ISBI 2024] An implementation of SAM3D which adapts Segment Anything Model for Volumetric Medical Image Segmentation
Notes and comments about Deep Reinforcement Learning papers
[CVPR 2025] h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform
[Remote Sensing] AerialFormer: Multi-resolution Transformer for Aerial Image Segmentation
[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning