Stars
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Code and data for paper "Deep Painterly Harmonization": https://arxiv.org/abs/1804.03189
How to optimize some algorithms in CUDA.
Deformable ConvNets V2 (DCNv2) in PyTorch
Fast k nearest neighbor search using GPU
Parallel GPU implementation of connected-component labeling (CCL). Connected-component labeling is used in computer vision to detect connected regions in binary digital images.
The repository holds several custom network layers, some of which were used in my recent optical flow project, Learning Energy Based Inpainting for Optical Flow.