gakadam


RDMA RoCE Lab

HTML 4 Updated Jan 8, 2026

Friction Graphics

C++ 1,513 50 Updated Mar 16, 2026

The lcc retargetable ANSI C compiler

C 2,528 486 Updated Oct 6, 2024

Tutorials for NVIDIA CUPTI samples

C++ 61 12 Updated Nov 3, 2025

Autotuning NVCC Compiler Parameters, published @ CCPE Journal

C 10 2 Updated Apr 2, 2021

"Deep Learning Crash Course" is a comprehensive and up-to-date guide that takes you from simple neural networks all the way to cutting-edge deep learning architectures-no advanced math and programm…

Jupyter Notebook 120 45 Updated Feb 8, 2026

A collection of machine learning examples and tutorials.

Python 8,844 6,444 Updated Feb 19, 2026

Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101

Jupyter Notebook 5,490 5,334 Updated Jun 27, 2023

Repo for the Deep Learning Nanodegree Foundations program.

Jupyter Notebook 4,060 4,434 Updated Oct 2, 2023

This repo demonstrates the creation of a thread-safe Lazy Singleton in CPP using shared pointers.

C++ 2 Updated Oct 17, 2022

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,240 132 Updated Mar 24, 2026

A Quirky Assortment of CuTe Kernels

Python 863 99 Updated Mar 24, 2026

Google's Operations Research tools:

C++ 13,261 2,372 Updated Mar 24, 2026

"Deep Generative Modeling": Introductory Examples

Jupyter Notebook 1,298 204 Updated Mar 9, 2026

High-performance, GPU-aware communication library

C++ 88 23 Updated Dec 16, 2025

A simple Python Boolean library that can parse and manipulate dimacs as well as a custom language. Try some of the features out online here: http://formal.cs.utah.edu:8080/pbl/PBL.php

Python 10 4 Updated Jun 21, 2015

Implementation and Evaluation of Barrier Synchronization in OpenMP and MPI

C 8 2 Updated Dec 18, 2015

A compute shader wrapper for Godot

GDScript 943 15 Updated Mar 5, 2025

A suite of GShade shaders for Final Fantasy XIV

HLSL 1,222 49 Updated Jun 29, 2024

A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

Python 467 28 Updated Mar 10, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 928 293 Updated Mar 24, 2026

Material for the SC22 Deep Learning at Scale Tutorial

Python 41 9 Updated Jul 14, 2023

Perplexity GPU Kernels

C++ 564 80 Updated Nov 7, 2025
C++ 72 14 Updated Jun 23, 2020
Python 72 16 Updated Feb 11, 2025

Userspace tool to map virtual page addresses to physical addresses.

C 198 56 Updated Jul 9, 2019

TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.

Cuda 106 6 Updated Jun 28, 2025

Official implementation of Half-Quadratic Quantization (HQQ)

Python 919 90 Updated Feb 26, 2026

Code to reproduce some of the figures in the paper "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"

Python 146 23 Updated Apr 24, 2017

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

77,281 8,930 Updated Feb 5, 2026