- Sun Yat-sen University
- Guangzhou, China
- https://www.zhihu.com/people/liang-de-peng/posts
Stars
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning
Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton
A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
ArcticInference: vLLM plugin for high-throughput, low-latency inference
A debugging and profiling tool that can trace and visualize python code execution
Getting Started with Triton: A Tutorial for Python Beginners
🎒 Token-Oriented Object Notation (TOON) – compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.
verl: Volcano Engine Reinforcement Learning for LLMs
slime is an LLM post-training framework for RL Scaling.
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
Verify Precision of all Kimi K2 API Vendor
A high-performance inference engine for LLMs, optimized for diverse AI accelerators.
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
A tiny demo of interfacing CUDA via nanobind with a pytorch tensor
🥢 Cooking the Laoxiangji (老乡鸡) way. The main part was completed in 2024; not an official Laoxiangji repository. Text is taken from the "Laoxiangji Dish Traceability Report" (《老乡鸡菜品溯源报告》), then organized, edited, and compiled. CookLikeHOC.
Tongyi Deep Research, the Leading Open-source Deep Research Agent