Fast inference engine for Transformer models (C++, updated Nov 29, 2025)
Tuned OpenCL BLAS
Serial and parallel implementations of matrix multiplication
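A minimal sketch of what serial and parallel matrix-multiply implementations typically look like (this is an illustrative example, not code from the listed repository): a triple-loop serial kernel in i-k-j order for locality, and a parallel variant that splits the rows of C across `std::thread` workers.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Serial triple-loop multiply: C += A * B, row-major, all n x n.
// The i-k-j loop order streams rows of B, which is friendlier to the cache
// than the textbook i-j-k order.
void matmul_serial(const std::vector<float>& A, const std::vector<float>& B,
                   std::vector<float>& C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            float a = A[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[k * n + j];
        }
}

// Parallel variant: each thread computes a strided subset of C's rows,
// so no two threads ever write the same output element.
void matmul_parallel(const std::vector<float>& A, const std::vector<float>& B,
                     std::vector<float>& C, std::size_t n) {
    unsigned nthreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t)
        pool.emplace_back([&, t] {
            for (std::size_t i = t; i < n; i += nthreads)
                for (std::size_t k = 0; k < n; ++k) {
                    float a = A[i * n + k];
                    for (std::size_t j = 0; j < n; ++j)
                        C[i * n + j] += a * B[k * n + j];
                }
        });
    for (auto& th : pool) th.join();
}
```

Row partitioning is the simplest safe decomposition; real libraries instead partition into 2-D tiles so each thread's working set fits in its private cache.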
Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceleration.
Multiple GEMM operators built with CUTLASS to support LLM inference.

DGEMM on KNL, achieving 75% of MKL's performance
My GEMM optimization on a Raspberry Pi (ARM) achieved a 170x speedup, running faster than Eigen and close to OpenBLAS.
Manually optimizing the GEMM (GEneral Matrix Multiply) operation; there is still a long way to go.
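One of the first steps in manual GEMM optimization is cache blocking. The sketch below (an illustrative example, not the repository's code; the tile size `BS` is a made-up default) restructures the triple loop into tiles so that each tile of B is reused from cache many times before being evicted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Tile edge length; in practice this is tuned so a BS x BS tile of each
// operand fits comfortably in L1/L2 cache.
constexpr std::size_t BS = 32;

// Cache-blocked GEMM: C += A * B, row-major n x n matrices.
// std::min guards handle edge tiles when n is not a multiple of BS.
void gemm_blocked(const std::vector<double>& A, const std::vector<double>& B,
                  std::vector<double>& C, std::size_t n) {
    for (std::size_t ii = 0; ii < n; ii += BS)
        for (std::size_t kk = 0; kk < n; kk += BS)
            for (std::size_t jj = 0; jj < n; jj += BS)
                // Micro-kernel over one tile; i-k-j order streams rows of B.
                for (std::size_t i = ii; i < std::min(ii + BS, n); ++i)
                    for (std::size_t k = kk; k < std::min(kk + BS, n); ++k) {
                        double a = A[i * n + k];
                        for (std::size_t j = jj; j < std::min(jj + BS, n); ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

Blocking alone typically closes only part of the gap to vendor BLAS; the remaining distance comes from register tiling, SIMD micro-kernels, and operand packing.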
CUDA GEMM convolution implementation
mixed-precision GEMM library
Development of deep learning inference code using OpenCL kernel functions.
Low Precision Arithmetic for Convolutional Neural Network Inference
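The core idea behind low-precision inference is to quantize activations and weights to int8, accumulate products in int32 to avoid overflow, and rescale the result back to floating point. A minimal sketch of that pattern (illustrative only; the function name and the symmetric per-tensor scaling scheme are assumptions, not taken from the listed project):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// int8 dot product with int32 accumulation, the inner kernel of a
// low-precision GEMM. Inputs are symmetrically quantized: real_value
// is approximately int8_value * scale. Each int8 * int8 product fits in
// int16, and summing them in int32 avoids overflow for realistic lengths.
float quantized_dot(const std::vector<int8_t>& a, const std::vector<int8_t>& b,
                    float scale_a, float scale_b) {
    int32_t acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        acc += int32_t(a[i]) * int32_t(b[i]);
    // Dequantize the accumulator: one multiply by the product of scales.
    return float(acc) * scale_a * scale_b;
}
```

Hardware int8 paths (e.g. dot-product instructions on recent CPUs and GPUs) implement exactly this multiply-accumulate shape, which is why quantized GEMM can be several times faster than fp32.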
WMMA GEMM in ROCm for RDNA GPUs
My experiments with convolution
Yet Another Machine Inference framework
OpenMP Matrix Multiplication Offloading Playground