TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration

Python 1,088 125 Updated Mar 27, 2026

A from-scratch Prefill/Decode disaggregation inference engine for LLMs

Python 152 17 Updated Apr 15, 2026
Python 480 40 Updated Apr 18, 2026

🚀🚀 "Large Model": Train a 64M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 47,256 5,890 Updated Apr 10, 2026

Bash is all you need - A nano Claude Code–like "agent harness", built from 0 to 1

TypeScript 54,538 8,967 Updated Apr 14, 2026

[NeurIPS2025] "AI-Researcher: Autonomous Scientific Innovation" -- A production-ready version: https://novix.science/chat

Python 5,155 645 Updated Oct 16, 2025

AI agents running research on single-GPU nanochat training automatically

Python 73,996 10,788 Updated Mar 26, 2026

Code repo for efficient quantized MoE inference with mixture of low-rank compensators

Python 36 Updated Apr 14, 2025

Official implementation of Half-Quadratic Quantization (HQQ)

Python 929 90 Updated Feb 26, 2026

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 1,057 86 Updated Sep 4, 2024

A machine model for line-rate programmable switches

C++ 27 2 Updated Oct 8, 2016
Python 21 5 Updated Jan 7, 2026

Leetcode for Pytorch

Jupyter Notebook 2,018 257 Updated Jan 19, 2026
Python 57 4 Updated Aug 19, 2025
C++ 190 31 Updated Apr 8, 2026

This repository contains the source code for P4TG, a 1 Tb/s traffic generator for Ethernet/IP networks

Rust 60 15 Updated Apr 17, 2026

BSP for X-T Programmable Bare Metal Switches

C 7 2 Updated Dec 16, 2025

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,506 779 Updated May 31, 2024

"Your Fully-Automated Personal AI Assistant"

Python 1,491 211 Updated Oct 16, 2025

Your personal AI trading assistant. Any market. Any model. Pay with USDC, not API keys.

Go 11,818 2,956 Updated Apr 17, 2026

An open-source AI trading platform for real markets

TypeScript 802 191 Updated Apr 1, 2026

Nix packaging of the Intel Tofino SDE

Nix 29 4 Updated Oct 14, 2025

Barefoot platform BSP

Shell 2 2 Updated Sep 1, 2022

Hands-on tutorial to learn the building blocks of the Next-Gen SDN architecture

Java 357 189 Updated Jul 17, 2022

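CRS: a self-hosted Claude Code mirror and one-stop open-source relay service that unifies access to Claude, OpenAI, Gemini, and Droid subscriptions, supports shared ("carpool") use to split costs more efficiently, and works seamlessly with native tools.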

JavaScript 11,139 1,664 Updated Apr 16, 2026

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,307 137 Updated Apr 18, 2026

Stratum is an open source silicon-independent switch operating system for software defined networks.

C++ 410 143 Updated Jul 9, 2024

Guide to installing V2ray on Ubuntu

169 29 Updated Aug 7, 2023

Nano vLLM

Python 12,975 1,948 Updated Apr 13, 2026