Organizations

@utcs-scea @Rust-sys @UT-InfraAI

Starred repositories

A tiny and modular agent written in Rust

Rust 1 Updated Jun 10, 2026

AI agent toolkit: unified LLM API, agent loop, TUI, coding agent CLI

TypeScript 62,450 7,565 Updated Jun 14, 2026

NCCL device-init API binding for CuTe Python DSL

Python 1 Updated Mar 26, 2026

An agent for CUDA compute-communication kernel co-design

Cuda 35 4 Updated May 7, 2026

Distributed Compiler based on Triton for Parallel Systems

Python 1,459 151 Updated Apr 22, 2026

Perplexity open source garden for inference technology

Rust 578 56 Updated May 27, 2026
Python 169 19 Updated Dec 27, 2024

Triton-based Symmetric Memory operators and examples

Python 100 14 Updated May 15, 2026

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

Cuda 2,308 219 Updated Jun 13, 2026

A minimal command-line utility written in Rust for querying GPU status

Rust 24 4 Updated Dec 21, 2025

Efficient Triton Kernels for LLM Training

Python 6,430 540 Updated Jun 12, 2026

A minimal RAG/agent orchestration framework

Python 2 Updated Jan 14, 2026

Tips and resources to prepare for Behavioral interviews.

8,355 1,708 Updated Aug 19, 2025
C++ 9 4 Updated Dec 13, 2024

[NeurIPS 2025] ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Python 23 2 Updated Apr 20, 2026

Open Agentic Schema Framework

Elixir 316 43 Updated Jun 2, 2026

The slightly more awesome standard unix password manager for teams

Go 6,927 541 Updated Jun 11, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,301 392 Updated Apr 20, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,789 1,048 Updated Jun 14, 2026

CUDA on non-NVIDIA GPUs

Rust 14,282 912 Updated Jun 13, 2026

NumPy & SciPy for GPU

Python 10,998 1,039 Updated Jun 11, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,866 2,465 Updated Jun 14, 2026

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 532 100 Updated Jun 13, 2026

Read-only mirror of https://git.zx2c4.com/cgit/about . Pull requests and issues on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes is via the mailing li…

C 225 27 Updated May 21, 2026
Cuda 1 Updated Dec 28, 2023

Inference Llama 2 in one file of pure C

C 19,628 2,561 Updated Aug 6, 2024

Altis-SYCL: a SYCL-based implementation of the Altis GPGPU benchmark suite for CPUs, GPUs, and FPGAs.

C++ 1 3 Updated Dec 22, 2023

A list of Free Software network services and web applications which can be hosted on your own servers

299,048 13,942 Updated Jun 14, 2026

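A good project for campus recruiting (autumn and spring hiring) and internships! Build a high-performance deep learning inference library from scratch, with support for inference on models such as llama2, Unet, Yolov5, and Resnet. Implement a high-performance deep learning inference library step by step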

C++ 3,435 367 Updated Jun 22, 2025

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 9,509 399 Updated May 31, 2026