- University of Texas at Austin
- Austin, Texas
- https://www.bodunhu.com
- @BodunHu
Starred repositories
An agent for CUDA compute-communication kernel co-design
Distributed Compiler based on Triton for Parallel Systems
Perplexity open source garden for inference technology
Triton-based Symmetric Memory operators and examples
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
A minimal command-line utility written in Rust for querying GPU status
Efficient Triton Kernels for LLM Training
Tips and resources to prepare for behavioral interviews
[NeurIPS 2025] ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
The slightly more awesome standard unix password manager for teams
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
FlashInfer: Kernel Library for LLM Serving
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
MSCCL++: A GPU-driven communication stack for scalable AI applications
Read-only mirror of https://git.zx2c4.com/cgit/about . Pull requests and issues on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes is via the mailing li…
Altis-SYCL: a SYCL-based implementation of the Altis GPGPU benchmark suite for CPUs, GPUs, and FPGAs.
A list of Free Software network services and web applications which can be hosted on your own servers
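A good project for campus recruiting (fall and spring hiring) and internships! Walks you through implementing a high-performance deep learning inference library from scratch, step by step, with support for inference on models such as llama2, Unet, Yolov5, and Resnet.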
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
MMSA is a unified framework for Multimodal Sentiment Analysis.
A composable and fully extensible C++ execution engine library for data management systems.
Google Research