
An experimental tile-based, Pythonic programming language for writing GPU kernels designed for LLM agents

C++ 6 3 Updated Jun 12, 2026

Orchestrate multiple coding agents from desktop and mobile

TypeScript 8,918 849 Updated Jun 20, 2026

Optimized FP16/BF16 x FP4 GPU kernels for AMD GPUs

C++ 58 7 Updated May 29, 2026

A simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, roc wmma), mainly used for Stable Diffusion (ComfyUI) in Windows ZLUDA environments.

Python 52 7 Updated Aug 25, 2024

collection of benchmarks to measure basic GPU capabilities

C++ 529 84 Updated Oct 24, 2025

Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

Jupyter Notebook 1,923 75 Updated May 13, 2024

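fastllm is a high-performance large-model inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and hybrid-mode inference for MoE models; any GPU with 10 GB or more of VRAM can run the full-size DeepSeek model. A dual-socket 9004/9005 server with a single GPU can serve the original full-size, full-precision DeepSeek model at 20 tps for a single request; the INT4-quantized model reaches 30 tps for a single request and 60+ tps with multiple concurrent requests.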

C++ 4,797 471 Updated Jun 14, 2026

Interact with your documents using the power of GPT, 100% privately, no data leaks

Python 57,285 7,607 Updated Jun 18, 2026

A platform for building proxies to bypass network restrictions.

Go 46,892 8,845 Updated Apr 24, 2026

Open Hardware Monitor

C# 6,418 1,312 Updated Jul 13, 2024

OpenSource tool for monitoring, configuring and overclocking NVIDIA GPUs

C 2 Updated Feb 21, 2020

GPUVerify: a Verifier for GPU Kernels

C# 82 18 Updated Jul 28, 2022

An unofficial cuda assembler, for all generations of SASS, hopefully :)

Python 596 108 Updated Apr 20, 2023

A C++ remake of the game 《金庸群侠传》, now complete.

C++ 2,934 404 Updated Jun 19, 2026

Radeon reverse engineering tools

Python 153 17 Updated Mar 29, 2020

Tools for people envious of nvidia's blob driver.

C 523 102 Updated Oct 26, 2023
C++ 9 Updated Aug 23, 2019

Assembler for NVIDIA Volta and Turing GPUs

Python 245 41 Updated Jan 13, 2022

Mythril is a symbolic-execution-based security analysis tool for EVM bytecode. It detects security vulnerabilities in smart contracts built for Ethereum and other EVM-compatible blockchains.

Python 4,252 816 Updated Apr 27, 2026

SQL-based streaming analytics platform at scale

Java 1,223 282 Updated Jun 21, 2020

C++ library for zkSNARKs

C++ 1,927 590 Updated Jun 12, 2025

Official repository of the AWS EC2 FPGA Hardware and Software Development Kit

SystemVerilog 1,664 538 Updated Jun 19, 2026

Beringei is a high performance, in-memory storage engine for time series data.

C++ 3,154 287 Updated Jul 11, 2018

Equihash miner for NiceHash

C++ 763 573 Updated Dec 27, 2018

A curated list of Deep Learning hardware, cycle/memory optimisation techniques

45 13 Updated Aug 9, 2016

Firmware Analysis Tool

Rust 14,060 1,809 Updated Jun 17, 2026

The ExpressOS kernel

C# 17 5 Updated Jun 7, 2013

A pure front-end web UI for you-know-which bbs.

JavaScript 26 10 Updated Mar 7, 2016