Shanghai Jiao Tong University
Shanghai, China
Stars
Quark Cloud Drive file management CLI tool
A review of automated kernel generation in the era of LLMs
eBPF for GPU UVM offloading and scheduling in the Linux kernel
NVIDIA Linux open GPU kernel module source
Three-finger trackpad gestures for middle-click and middle-drag on macOS
Predict the performance of LLM inference services
Simulator code of the paper "Dissecting and Modeling the Architecture of Modern GPU Cores"
Source code of the simulator used in the Mosaic paper from MICRO 2017: "Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes" https://people.inf.ethz.ch/omutlu/…
Translation project for A Primer on Memory Consistency and Cache Coherence (Second Edition)
NVIDIA Linux open GPU kernel modules with P2P support
LLM inference via Triton (flexible and modular), focused on kernel optimization using CUBIN binaries, starting from the gpt-oss model
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Allow torch tensor memory to be released and resumed later
Development repository for the Triton language and compiler
A simple technical explainer project focused on interesting, cutting-edge technology concepts and principles. Each article is designed to be read in under 5 minutes.
Set the color of files/folders for OSX Finder from the command line.
AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the software layers that support training and inference of large AI models.
FlagGems is an operator library for large language models implemented in the Triton Language.
NVIDIA curated collection of educational resources related to general purpose GPU programming.
A Taichi implementation of a fast and differentiable stroke renderer
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.