Organizations

@utcs-scea @Rust-sys @UT-InfraAI

Starred repositories

An agent for CUDA compute-communication kernel co-design

Cuda 32 2 Updated Mar 11, 2026

Distributed Compiler based on Triton for Parallel Systems

Python 1,394 132 Updated Mar 11, 2026

Perplexity open source garden for inference technology

Rust 382 35 Updated Dec 25, 2025
Python 163 16 Updated Dec 27, 2024

Triton-based Symmetric Memory operators and examples

Python 94 13 Updated Jan 15, 2026

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 2,162 184 Updated Mar 23, 2026

A minimal command-line utility written in Rust for querying GPU status

Rust 23 4 Updated Dec 21, 2025

Efficient Triton Kernels for LLM Training

Python 6,233 504 Updated Mar 23, 2026

A minimal RAG/agent orchestration framework

Python 2 Updated Jan 14, 2026

Tips and resources to prepare for Behavioral interviews.

7,992 1,625 Updated Aug 19, 2025
C++ 8 2 Updated Dec 13, 2024

[NeurIPS 2025] ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Python 21 2 Updated Oct 13, 2025

Open Agentic Schema Framework

Elixir 298 35 Updated Mar 23, 2026

The slightly more awesome standard unix password manager for teams

Go 6,753 527 Updated Mar 22, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,074 349 Updated Mar 19, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,205 823 Updated Mar 24, 2026

CUDA on non-NVIDIA GPUs

Rust 14,036 900 Updated Mar 24, 2026

NumPy & SciPy for GPU

Python 10,859 1,006 Updated Mar 23, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,171 2,211 Updated Mar 24, 2026

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 493 90 Updated Mar 24, 2026

Read-only mirror of https://git.zx2c4.com/cgit/about . Pull requests and issues on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes is via the mailing li…

C 211 27 Updated Mar 10, 2026
Cuda 1 Updated Dec 28, 2023

Inference Llama 2 in one file of pure C

C 19,313 2,472 Updated Aug 6, 2024

Altis-SYCL: a SYCL-based implementation of the Altis GPGPU benchmark suite for CPUs, GPUs, and FPGAs.

C++ 1 3 Updated Dec 22, 2023

A list of Free Software network services and web applications which can be hosted on your own servers

281,740 12,965 Updated Mar 23, 2026

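A good project for campus recruiting (fall and spring hiring) and internships! Build a high-performance deep learning inference library from scratch, supporting inference for models such as llama2, Unet, Yolov5, and Resnet. Implement a high-performance deep learning inference library step by step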

C++ 3,374 359 Updated Jun 22, 2025

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 9,440 396 Updated Feb 20, 2026

MMSA is a unified framework for Multimodal Sentiment Analysis.

Python 982 138 Updated Jan 15, 2025

A composable and fully extensible C++ execution engine library for data management systems.

C++ 4,081 1,475 Updated Mar 23, 2026

Google Research

Jupyter Notebook 37,523 8,365 Updated Mar 24, 2026