Stars
7 stars written in CUDA
DeepEP: an efficient expert-parallel communication library
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Deformable ConvNets V2 (DCNv2) in PyTorch
A personal depthwise convolution layer implementation for Caffe by liuhao (GPU only).
deformable convolution 2D 3D DeformableConvolution DeformConv Modulated Pytorch CUDA
A repository with several custom NMS ops for TensorFlow.