Stars
friction2d / friction
Forked from MaurycyLiebner/enve
Friction Graphics
Autotuning NVCC Compiler Parameters, published @ CCPE Journal
"Deep Learning Crash Course" is a comprehensive and up-to-date guide that takes you from simple neural networks all the way to cutting-edge deep learning architectures, no advanced math and programm…
A collection of machine learning examples and tutorials.
Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101
Repo for the Deep Learning Nanodegree Foundations program.
This repo demonstrates the creation of a thread-safe Lazy Singleton in CPP using shared pointers.
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)
"Deep Generative Modeling": Introductory Examples
A simple Python Boolean library that can parse and manipulate DIMACS as well as a custom language. Try some of the features out online here: http://formal.cs.utah.edu:8080/pbl/PBL.php
Implementation and Evaluation of Barrier Synchronization in OpenMP and MPI
A compute shader wrapper for Godot
A suite of GShade shaders for Final Fantasy XIV
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
FlagGems is an operator library for large language models implemented in the Triton Language.
Material for the SC22 Deep Learning at Scale Tutorial
Userspace tool to map virtual page addresses to physical addresses.
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
Official implementation of Half-Quadratic Quantization (HQQ)
Code to reproduce some of the figures in the paper "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.