Lists (32)
books
Calibration
cpp
Dataset
DenseReconstruction
DepthEstimation
Feature&Matching
hdmap/vecmap
ImageEnhance
ImagePreprocessing
ImageRetrieval
LLM/VLM
Localization
MVS
Nerf/GS
Object Detection
OpticalFlow
Paper
Perception
ProgramLanguage
python
python_tools
RL
SfM/SLAM
SurfaceReconstruction
Tools
Tracking
tutorial
Vectorization
VIO
Visualization
VO
Stars
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding). Translations: 🇺🇸 🇨🇳 🇯🇵 🇮🇹 🇰🇷 🇷🇺 🇧🇷 🇪🇸
PyTorch code and models for the DINOv2 self-supervised learning method.
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
A course on aligning smol models.
COCO API - Dataset @ http://cocodataset.org/
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Efficient Image Captioning code in Torch, runs on GPU
CoTracker is a model for tracking any point (pixel) on a video.
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
GTSAM is a library of C++ classes that implement smoothing and mapping (SAM) in robotics and vision, using factor graphs and Bayes networks as the underlying computing paradigm rather than sparse m…
Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021, T-PAMI 2022
Metric depth estimation from a single image
🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.
Efficient neural feature detector and descriptor
This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov
A Modular Framework for 3D Gaussian Splatting and Beyond
Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Assistant tools for attention visualization in deep learning
[ECCV'24 & ICLR'25] CityGaussian Series for High-quality Large-Scale Scene Reconstruction with Gaussians
Code for "Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed", CVPR 2024
Superpoint Implemented in PyTorch: https://arxiv.org/abs/1712.07629
[CVPR 2024 Oral] Rethinking Inductive Biases for Surface Normal Estimation
Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral
D2-Net: A Trainable CNN for Joint Description and Detection of Local Features