Sar Malik (sar)

SaaS founder, Cloud Architect, Full Stack Developer

20 followers · 84 following

Highlights

Developer Program Member

Starred repositories

tukl-msd / DRAMPower

Fast and accurate DRAM power and energy estimation tool

C++ 210 61 Updated Jul 22, 2026

sar / GPT2.Training.Google.Colaboratory

Train the GPT-2 series of large language models (150M to 774M) on Google Colab using K80 GPUs

Jupyter Notebook 4 Updated Jan 4, 2020

sar / v620

AMD ROCm AI Inference & Training Solutions Library for RDNA2/GFX1030/Radeon Pro V620. Includes compiling vLLM from source, llama.cpp, HIPFire, and fine-tuning paths.

Dockerfile 1 Updated Jul 14, 2026

sar / nebius

Slurm on Kubernetes architecture solution for fine-tuning LLMs, inference, and eval across a distributed 8x NVIDIA L40S GPU cluster on Nebius AI Cloud

HCL 1 Updated Dec 17, 2025

nebius / soperator

Run Slurm in Kubernetes

Go 403 60 Updated Jul 23, 2026

ROCm / hip

HIP: C++ Heterogeneous-Compute Interface for Portability

C++ 4,380 588 Updated Jul 8, 2026

ROCm / rocm-libraries

Super repo for ROCm libraries

Assembly 390 349 Updated Jul 24, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 87,031 19,788 Updated Jul 24, 2026

Kaden-Schutt / hipfire

RDNA-native LLM inference engine in Rust.

Rust 489 50 Updated Jul 24, 2026

ROCm / hipThreads

C++ 51 1 Updated Jul 1, 2026

sasha0552 / pascal-pkgs-ci

The main repository for building Pascal-compatible versions of ML applications and libraries.

Shell 214 33 Updated Aug 23, 2025

mhx / dwarfs

A fast high-compression read-only file system for Linux, FreeBSD, macOS and Windows

C++ 2,585 88 Updated Jul 23, 2026

llm-d / llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,870 633 Updated Jul 24, 2026

MoonshotAI / kimi-cli

Kimi Code CLI is your next CLI agent.

Python 10,729 1,251 Updated Jul 16, 2026

speedyapply / JobSpy

Jobs scraper library for LinkedIn, Indeed, Glassdoor, Google, ZipRecruiter & more

Python 3,939 781 Updated Feb 18, 2026

QwenLM / Qwen3-Coder

Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team.

Python 16,742 1,230 Updated Mar 24, 2026

QwenLM / qwen-code

An open-source AI coding agent that lives in your terminal.

TypeScript 26,275 2,713 Updated Jul 24, 2026

axboe / fio

Flexible I/O Tester

C 6,302 1,415 Updated Jul 23, 2026

Tencent / rapidjson

A fast JSON parser/generator for C++ with both SAX/DOM style API

C++ 15,108 3,648 Updated Feb 5, 2025

ai-dynamo / nixl

NVIDIA Inference Xfer Library (NIXL)

C++ 1,148 376 Updated Jul 24, 2026

LMCache / LMCache

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python 10,853 1,611 Updated Jul 24, 2026

etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system

Go 52,023 10,427 Updated Jul 23, 2026

etcd-cpp-apiv3 / etcd-cpp-apiv3

The etcd-cpp-apiv3 is a C++ library for etcd's v3 client APIs, i.e., ETCDCTL_API=3.

C++ 389 152 Updated Mar 28, 2025

hpc / ior

IOR and mdtest

C 482 199 Updated Apr 3, 2026

NVIDIA / MagnumIO

Magnum IO community repo

C++ 117 23 Updated Jun 22, 2026

NVIDIA / gds-nvidia-fs

NVIDIA GPUDirect Storage Driver

C 367 65 Updated Jun 1, 2026

ikawrakow / ik_llama.cpp

llama.cpp fork with additional SOTA quants and improved performance

C++ 2,959 386 Updated Jul 23, 2026

leejet / stable-diffusion.cpp

Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++

C++ 6,583 705 Updated Jul 23, 2026

mistralai / mistral-vibe

Minimal CLI coding agent by Mistral

Python 4,730 608 Updated Jul 23, 2026

jax-ml / scaling-book

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 1,292 183 Updated Jul 13, 2026

TypeScript

Terminal

Tensorflow

Serverless

React

Linux

Deep learning

C#

Cryptocurrency

Terraform
