Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

Python 28,043 1,897 Updated Jun 15, 2026

A massively parallel, high-level programming language

Rust 19,461 479 Updated Jun 3, 2025

KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.

Python 397 22 Updated Jun 15, 2026

🏔 Calculating total viewsheds for geographic terrain using a cache-efficient and highly parallel algorithm

Rust 81 4 Updated Jun 8, 2026

An Xposed module for downloading AI models from alternative sources

Kotlin 112 6 Updated Jun 14, 2026

Fast, lossless LLM inference via dual-view diffusion decoding.

Python 423 17 Updated May 18, 2026

Libraries for executing federated programs and computations.

C++ 108 27 Updated Jun 15, 2026

Visualize, query, and stream to train on multimodal robotics data.

Rust 10,948 766 Updated Jun 15, 2026

Row-Bot - Personal AI Sovereignty. A local-first AI assistant with integrated tools, a personal knowledge graph, voice, vision, shell, browser automation, scheduled tasks, health tracking, and mess…

Python 1,274 151 Updated Jun 13, 2026

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.

Python 5,563 420 Updated Apr 21, 2025

v1 of Asimov, an open-source humanoid robot

Python 909 142 Updated May 18, 2026

Re-Connectable secure remote shell

C++ 3,725 214 Updated Jun 14, 2026

a browser frontend for codex desktop, running on a machine you control.

TypeScript 150 13 Updated Jun 8, 2026

Generate hard iron offsets and soft iron matrix from raw magnetometer samples

C 5 Updated Nov 18, 2024

An exploration of arancini

C++ 11 3 Updated Jan 15, 2026

low level drivers for various xr glasses

Rust 31 2 Updated Jun 5, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 17,283 1,313 Updated Jun 7, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 29,029 6,537 Updated Jun 15, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 82,896 18,077 Updated Jun 15, 2026

Open source cost intelligence proxy for AI agents. Cut costs ~80% with smart model routing. Dashboard, policy engine, 11 providers. MIT licensed.

TypeScript 180 29 Updated May 19, 2026

A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code by TÂCHES.

JavaScript 64,232 5,466 Updated May 31, 2026

Ultra-Sparse Adaptation of 1-Bit LLMs via XOR Patches

Python 77 9 Updated Apr 10, 2026
Cuda 39 1 Updated Mar 31, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,822 109,964 Updated Jun 8, 2026

An AI co-worker with its own computer. Self-evolving, persistent memory, MCP server, secure credential collection, email identity. Built on the Claude Agent SDK.

TypeScript 1,433 190 Updated May 2, 2026

AI agents running research on single-GPU nanochat training automatically

Python 86,847 12,577 Updated Mar 26, 2026

llama.cpp fork with additional SOTA quants and improved performance

C++ 2,738 350 Updated Jun 15, 2026

KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.

Python 1,015 87 Updated Apr 23, 2026

Train the smallest LM you can that fits in 16MB. Best model wins!

Python 5,128 3,340 Updated May 4, 2026

Krasis is a hybrid LLM runtime that focuses on running larger models efficiently on consumer-grade, VRAM-limited hardware

C++ 473 26 Updated Jun 10, 2026