Organizations

@cacheMon

Starred repositories

🕳 bore is a simple CLI tool for making tunnels to localhost

Rust 10,533 451 Updated Jun 9, 2025

OpenAI API-compatible wrapper for Claude Code

Python 281 38 Updated Dec 16, 2025

DedupBench is a benchmarking tool for content-defined chunking techniques used in data deduplication. It currently supports eleven unique CDC techniques and five different vector instruction sets.

C++ 19 1 Updated Oct 27, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,918 352 Updated Dec 19, 2025

DAOS Storage Stack (client libraries, storage engine, control plane)

C 903 338 Updated Dec 19, 2025

⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.

Rust 3,503 261 Updated Dec 19, 2025

This is the user space repo for famfs, the fabric-attached memory file system

C 85 3 Updated Dec 18, 2025

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,296 180 Updated Dec 17, 2025

Python bindings for libCacheSim, designed for rapid experimentation with cache simulation models.

Python 5 2 Updated Oct 23, 2025

A framework for generating realistic LLM serving workloads

Python 93 6 Updated Oct 9, 2025

A single interface to use and evaluate different agent frameworks

Python 1,052 78 Updated Dec 17, 2025

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

Python 32,697 5,087 Updated Dec 21, 2025

Zero-instrumentation LLM and AI agent (e.g. claude code, gemini-cli) observability in eBPF

C 163 24 Updated Nov 21, 2025

A comprehensive open-source cache trace dataset

Jupyter Notebook 18 1 Updated Aug 23, 2025

Lossless codec for numerical data

Rust 444 28 Updated Dec 20, 2025

A high-performance library for building cache simulators

C++ 276 76 Updated Nov 29, 2025

Nano vLLM

Python 9,851 1,238 Updated Nov 3, 2025

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

Python 396 58 Updated Jun 10, 2025

Huawei Cloud datasets

Jupyter Notebook 82 13 Updated Nov 20, 2025

A tool for bandwidth measurements on NVIDIA GPUs.

C++ 584 65 Updated Apr 15, 2025

A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.

Kotlin 14,539 1,224 Updated Dec 18, 2025

Simple high-throughput inference library

Python 153 10 Updated May 14, 2025

PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for evaluation of training and inference platforms.

Python 154 66 Updated Dec 10, 2025

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 300 34 Updated Jun 10, 2025

Composable building blocks to build LLM Apps

Python 8,201 1,227 Updated Dec 20, 2025

New file format for storage of large columnar datasets.

C++ 661 59 Updated Dec 20, 2025

Ollama Python library

Python 9,040 872 Updated Dec 11, 2025

A C implementation of the SIEVE cache eviction algorithm, based on the research paper (https://junchengyang.com/publication/nsdi24-SIEVE.pdf)

Makefile 3 Updated Jan 22, 2025