scatyf3

Study, study, and study again

77 followers · 296 following

New York University
New York
14:38 (UTC -05:00)

Starred repositories

18 results for source starred repositories written in Cuda

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 28,460 stars · 3,338 forks · Updated Jun 26, 2025

xlite-dev / LeetCUDA

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda · 9,045 stars · 889 forks · Updated Dec 24, 2025

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda · 5,996 stars · 778 forks · Updated Dec 23, 2025

Tony-Tan / CUDA_Freshman

Cuda · 2,643 stars · 500 forks · Updated Jan 16, 2024

Liu-xiandong / How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda · 1,209 stars · 178 forks · Updated Jul 29, 2023

yassa9 / qwen600

Static suckless single batch CUDA-only qwen3-0.6B mini inference engine

Cuda · 536 stars · 46 forks · Updated Sep 8, 2025

wangzyon / NVIDIA_SGEMM_PRACTICE

Step-by-step optimization of CUDA SGEMM

Cuda · 416 stars · 54 forks · Updated Mar 30, 2022

godweiyang / GrabGPU

A handy script for grabbing and holding GPUs

Cuda · 387 stars · 40 forks · Updated Dec 15, 2025

leimao / CUDA-GEMM-Optimization

CUDA Matrix Multiplication Optimization

Cuda · 247 stars · 24 forks · Updated Jul 19, 2024

puttsk / cuda-tutorial

A set of hands-on tutorials for CUDA programming

Cuda · 243 stars · 35 forks · Updated Apr 8, 2024

nicolaswilde / cuda-tensorcore-hgemm

Cuda · 156 stars · 25 forks · Updated Dec 26, 2024

xgqdut2016 / cuda_code

easy cuda code

Cuda · 92 stars · 45 forks · Updated Dec 24, 2024

lzyrapx / LeetGPU

🌈 Solutions of LeetGPU

Cuda · 59 stars · 9 forks · Updated Nov 12, 2025

xiexi51 / MaxK-GNN

Official implementation of "MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training"

Cuda · 43 stars · 9 forks · Updated Mar 4, 2024

NVIDIA / cuEmbed

CUDA Embedding Lookup Kernel Library

Cuda · 40 stars · 5 forks · Updated Oct 21, 2025

xgqdut2016 / hpc_project

some hpc project for learning

Cuda · 26 stars · 4 forks · Updated Aug 28, 2024

Phoenix8215 / BuildCudaNeuralNetworkFromScratch

Build CUDA Neural Network From Scratch

Cuda · 22 stars · 1 fork · Updated Aug 28, 2024

double-flower / TaiChi

source code for TaiChi (A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU)

Cuda · 8 stars · 1 fork · Updated Mar 20, 2023

Lists (14)

2024 Institute of Software Open Source Summer

😻2024GSoC

for 24sp

🌟fun

🏋️hands-on CS

📚learn CS

🤖minecraft

NYU Tandon Course info

🌲study

💰success_of_life

🔧Tools

📕Existing notes

Computational social science

Mainland-undergrad studies (陆本学)
