Organizations

@utcs-scea @Rust-sys @UT-InfraAI

Starred repositories

NCCL device-init API binding for CuTe Python DSL

Python 1 Updated Mar 26, 2026

An agent for CUDA compute-communication kernel co-design

Cuda 33 2 Updated Mar 24, 2026

Distributed Compiler based on Triton for Parallel Systems

Python 1,396 134 Updated Mar 11, 2026

Perplexity open source garden for inference technology

Rust 383 36 Updated Dec 25, 2025
Python 164 16 Updated Dec 27, 2024

Triton-based Symmetric Memory operators and examples

Python 94 13 Updated Jan 15, 2026

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 2,174 185 Updated Mar 26, 2026

A minimal command-line utility written in Rust for querying GPU status

Rust 24 4 Updated Dec 21, 2025

Efficient Triton Kernels for LLM Training

Python 6,240 506 Updated Mar 27, 2026

A minimal RAG/agent orchestration framework

Python 2 Updated Jan 14, 2026

Tips and resources to prepare for Behavioral interviews.

8,009 1,628 Updated Aug 19, 2025
C++ 8 2 Updated Dec 13, 2024

[NeurIPS 2025] ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Python 21 2 Updated Oct 13, 2025

Open Agentic Schema Framework

Elixir 299 35 Updated Mar 26, 2026

The slightly more awesome standard unix password manager for teams

Go 6,763 527 Updated Mar 22, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,085 352 Updated Mar 26, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,223 829 Updated Mar 26, 2026

CUDA on non-NVIDIA GPUs

Rust 14,043 900 Updated Mar 26, 2026

NumPy & SciPy for GPU

Python 10,869 1,009 Updated Mar 23, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,198 2,218 Updated Mar 27, 2026

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 494 90 Updated Mar 27, 2026

Read-only mirror of https://git.zx2c4.com/cgit/about . Pull requests and issues on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes is via the mailing li…

C 211 26 Updated Mar 10, 2026
Cuda 1 Updated Dec 28, 2023

Inference Llama 2 in one file of pure C

C 19,321 2,475 Updated Aug 6, 2024

Altis-SYCL: a SYCL-based implementation of the Altis GPGPU benchmark suite for CPUs, GPUs, and FPGAs.

C++ 1 3 Updated Dec 22, 2023

A list of Free Software network services and web applications which can be hosted on your own servers

282,315 12,991 Updated Mar 25, 2026

A good project for campus, fall, and spring recruiting and for internships! Implement a high-performance deep learning inference library from scratch, step by step, supporting inference for models such as llama2, Unet, Yolov5, and Resnet.

C++ 3,380 360 Updated Jun 22, 2025

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 9,444 397 Updated Feb 20, 2026

MMSA is a unified framework for Multimodal Sentiment Analysis.

Python 982 138 Updated Jan 15, 2025

A composable and fully extensible C++ execution engine library for data management systems.

C++ 4,081 1,476 Updated Mar 26, 2026