PingCAP
- Shanghai, China
- https://www.hawkingrei.com/
Starred repositories: 12 stars written in CUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Introduction to Parallel Programming class code
Source code that accompanies The CUDA Handbook.
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine
A simple GPU hash table implemented in CUDA using lock-free techniques
Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search
CUDA implementation of the Floyd-Warshall all-pairs shortest path graph algorithm (with path reconstruction)
Python wrappers for fast NMF training using CUDA, MKL, and ATLAS