Starred repositories

HIP: C++ Heterogeneous-Compute Interface for Portability

C++ 4,327 582 Updated May 6, 2026

super repo for rocm libraries

Assembly 337 289 Updated May 15, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 80,068 16,822 Updated May 15, 2026

RDNA-native LLM inference engine in Rust.

Rust 378 40 Updated May 14, 2026
C++ 53 1 Updated Apr 29, 2026

The main repository for building Pascal-compatible versions of ML applications and libraries.

Shell 197 32 Updated Aug 23, 2025

A fast high-compression read-only file system for Linux, FreeBSD, macOS and Windows

C++ 2,548 86 Updated May 2, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,184 474 Updated May 15, 2026

Kimi Code CLI is your next CLI agent.

Python 8,588 1,033 Updated May 13, 2026

Jobs scraper library for LinkedIn, Indeed, Glassdoor, Google, ZipRecruiter & more

Python 3,379 681 Updated Feb 18, 2026

Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team.

Python 16,518 1,196 Updated Mar 24, 2026

An open-source AI agent that lives in your terminal.

TypeScript 24,400 2,374 Updated May 15, 2026

Flexible I/O Tester

C 6,223 1,406 Updated May 12, 2026

A fast JSON parser/generator for C++ with both SAX/DOM style API

C++ 15,058 3,642 Updated Feb 5, 2025

NVIDIA Inference Xfer Library (NIXL)

C++ 1,031 318 Updated May 14, 2026

Supercharge Your LLM with the Fastest KV Cache Layer

Python 8,271 1,172 Updated May 15, 2026

Distributed reliable key-value store for the most critical data of a distributed system

Go 51,701 10,352 Updated May 15, 2026

The etcd-cpp-apiv3 is a C++ library for etcd's v3 client APIs, i.e., ETCDCTL_API=3.

C++ 392 150 Updated Mar 28, 2025

IOR and mdtest

C 479 196 Updated Apr 3, 2026

Magnum IO community repo

C++ 116 19 Updated May 7, 2026

NVIDIA GPUDirect Storage Driver

C 351 57 Updated Apr 24, 2026

llama.cpp fork with additional SOTA quants and improved performance

C++ 2,447 310 Updated May 15, 2026

Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++

C++ 6,010 618 Updated May 14, 2026

Minimal CLI coding agent by Mistral

Python 4,190 487 Updated May 11, 2026

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 960 138 Updated May 13, 2026

CUDA Library Samples

Cuda 2,395 457 Updated May 12, 2026

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C++ 9,174 2,335 Updated May 13, 2026

Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

TypeScript 34,010 2,765 Updated Mar 4, 2026

LLM inference in C/C++

C++ 110,215 18,210 Updated May 15, 2026

A lightweight terminal chat interface for the llama.cpp server, written in C++, with many features and Windows/Linux support.

C++ 27 5 Updated Mar 31, 2026