AceCoooool
😴 lazy.

8 stars written in Cuda

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda · 10,302 stars · 1,050 forks · Updated Apr 18, 2026

Code and data for paper "Deep Painterly Harmonization": https://arxiv.org/abs/1804.03189

Cuda · 6,053 stars · 613 forks · Updated Aug 2, 2021

How to optimize some algorithms in CUDA.

Cuda · 2,928 stars · 270 forks · Updated Apr 16, 2026

Deformable ConvNets V2 (DCNv2) in PyTorch

Cuda · 1,487 stars · 231 forks · Updated Nov 18, 2022

Fast k nearest neighbor search using GPU

Cuda · 546 stars · 111 forks · Updated Aug 6, 2018

CUDA Data Parallel Primitives Library

Cuda · 438 stars · 97 forks · Updated Nov 9, 2018

Parallel GPU implementation of connected-component labeling (CCL), which is used in computer vision to detect connected regions in binary digital images.

Cuda · 54 stars · 16 forks · Updated Feb 11, 2018

The repository holds several custom network layers, some of which were used in my recent optical flow project, "Learning Energy Based Inpainting for Optical Flow".

Cuda · 7 stars · 1 fork · Updated Nov 5, 2018