mkolod
  • San Francisco Bay Area, CA

Cuda 33 2 Updated Jul 19, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 59,829 10,325 Updated Nov 12, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…

Python 3,397 751 Updated Jun 17, 2026
C 287 27 Updated May 26, 2024

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 3,810 931 Updated Jun 18, 2026

Starlark implementation of bazel rules for CUDA.

Starlark 119 66 Updated Jun 17, 2026

This repo holds the extended examples for rules_cuda.

Starlark 7 2 Updated Dec 18, 2025

Spatial Sparse Convolution Library

Python 2,280 419 Updated Dec 15, 2024

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

Python 2,935 475 Updated Mar 5, 2024

Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook, Xin…

CMake 285 86 Updated May 11, 2026

Dire Wolf is a software "soundcard" AX.25 packet modem/TNC and APRS encoder/decoder. It can be used stand-alone to observe APRS traffic, as a tracker, digipeater, APRStt gateway, or Internet Gatewa…

C 1,990 343 Updated May 27, 2026

Various scripts written for ham radio pi

Shell 130 43 Updated Mar 21, 2026
Shell 342 71 Updated Apr 18, 2024

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.

C++ 3,155 818 Updated Jun 18, 2026

A curated list of projects related to the reMarkable tablet

7,498 258 Updated Jun 5, 2026

Make huge neural nets fit in memory

Python 2,840 279 Updated Apr 26, 2020

Deprecated - see our other repos for Bazel examples

Java 10 4 Updated Mar 22, 2022

For publishing the source for UG1352 "Get Moving with Alveo"

C++ 51 16 Updated Jun 17, 2020
C++ 9 17 Updated Sep 28, 2025

Vitis_Accel_Examples

Makefile 597 230 Updated Jun 15, 2026

Vitis In-Depth Tutorials

C 1,586 621 Updated May 22, 2026

Avnet Board Definition Files

Tcl 145 75 Updated Jan 12, 2026

The code for the ebook Ray Tracing in One Weekend by Peter Shirley translated to CUDA by Roger Allen. This work is in the public domain.

C++ 393 98 Updated Jan 29, 2021

Brevitas: neural network quantization in PyTorch

Python 1,540 244 Updated Jun 17, 2026

Performance writing to GPIO with CPU and DMA on the Raspberry Pi

C 210 27 Updated Jul 15, 2024

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 13,083 2,378 Updated Jun 3, 2026

Documentation of NVIDIA chip/hardware interfaces

C 1,341 104 Updated Jun 10, 2026

Source code examples from the Parallel Forall Blog

HTML 1,330 642 Updated Sep 23, 2025