- Nanjing
Starred repositories
AI-LJ / Computer_Science_Parallel_Computing_Textbooks
Forked from DevMTech/Computer_Science_Parallel_Computing_Textbooks
Collect some CS textbooks for learning.
Splits a single NVIDIA GPU into multiple partitions with complete compute and memory isolation (with respect to performance) between the partitions
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
A modern GUI client based on Tauri, designed to run on Windows, macOS, and Linux for a tailored proxy experience
First open-source KVTC implementation (NVIDIA, ICLR 2026) -- 8-32x KV cache compression via PCA + adaptive quantization + entropy coding
UniCNet is a cycle-accurate simulator supporting efficient simulation of composable chiplet networks.
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
Cost-efficient and pluggable Infrastructure components for GenAI inference
NVLeak: Off-Chip Side-Channel Attacks via Non-Volatile Memory Systems [USENIX Security '23]
A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson
https://github.com/eunomia-bpf homepage, documents and blogs
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
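"Open-Source LLM Usage Guide" (《开源大模型食用指南》): a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment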
A comprehensive toolkit for GPU Communications Libraries performance testing and data analysis.
Cluster data collected from production clusters at Alibaba for cluster management research
vArmor is a cloud native container sandbox system based on AppArmor/BPF/Seccomp. It also includes multiple built-in protection rules that are ready to use out of the box.
A book for Learning the Foundations of LLMs
Machine Learning Engineering Open Book
HPC tutorials covering collective communication (MPI, NCCL), CUDA programming, SIMD vectorization, RDMA communication, and more
Using the Persistent Memory Region in NVMe SSDs to speed up KVStore access
SMDK, the Scalable Memory Development Kit, is developed for the Samsung CXL (Compute Express Link) Memory Expander to enable a full-stack Software-Defined Memory system