kklemon
  • Munich 🍺

Supercharge Your LLM with the Fastest KV Cache Layer

Python 7,941 1,077 Updated Apr 9, 2026

Ultra-Sparse Adaptation of 1-Bit LLMs via XOR Patches

Python 61 5 Updated Apr 2, 2026

NUMA-Aware Contention-Free Dynamically-Auto-Tuning Bash-Native Streaming Parallelization Engine

Shell 346 7 Updated Apr 9, 2026

AMD ROCm™ Software - GitHub Home

Shell 6,345 535 Updated Apr 9, 2026

The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm

Python 917 221 Updated Apr 9, 2026

Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦙 Alternative to projects like llm-d, Docker Model Runner, etc., but with fewer moving parts and simple deployments b…

Rust 1,515 82 Updated Apr 3, 2026

Tensor library for machine learning

C++ 14,391 1,545 Updated Apr 9, 2026

Official code base for LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels

Python 2,138 239 Updated Mar 27, 2026
Python 53 1 Updated Mar 23, 2026

I replicated Ng's RYS method and found that duplicating 3 specific layers in Qwen2.5-32B boosts reasoning by 17% and duplicating layers 12-14 in Devstral-24B improves logical deduction from 0.22→0.…

Python 224 16 Updated Mar 20, 2026

Magical utilities for your Svelte applications.

TypeScript 1,777 80 Updated Mar 4, 2026

Reduce Claude Code, Codex, OpenCode wall clock and token use by 50% with open source, local semantic search. Works for small and large codebases and monorepos! Enterprise-ready and fully compliant …

Go 142 12 Updated Apr 9, 2026

dLLM: Simple Diffusion Language Modeling

Python 2,334 234 Updated Feb 27, 2026
Dockerfile 1 Updated Feb 9, 2026

🍄 Give your Codex CLI an extra life

TypeScript 425 18 Updated Feb 19, 2026

Tiny, Fast, and Deployable anywhere — automate the mundane, unleash your creativity

Go 27,932 3,957 Updated Apr 9, 2026

Ready-to-use and customizable users management for FastAPI

Python 6,070 502 Updated Mar 30, 2026
TypeScript 1 Updated Feb 6, 2026

Pure C inference of Mistral Voxtral Realtime 4B speech to text model

C 1,604 114 Updated Feb 15, 2026

A new kind of Progress Bar, with real-time throughput, ETA, and very cool animations!

Python 6,264 232 Updated Oct 10, 2025

Compiler-based i18n library that emits tree-shakable translations, leading to up to 70% smaller bundle sizes.

TypeScript 340 10 Updated Apr 8, 2026

AI Agent Framework, the Pydantic way

Python 16,204 1,892 Updated Apr 9, 2026

Svelte AI Elements is a custom registry built on top of shadcn-svelte to help you build AI-native applications faster.

Svelte 254 11 Updated Apr 9, 2026

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Python 6,637 540 Updated Apr 7, 2026

Turso is an in-process SQL database, compatible with SQLite.

Rust 18,167 827 Updated Apr 9, 2026

The most comprehensive authentication framework for TypeScript

TypeScript 27,712 2,435 Updated Apr 9, 2026

PersonaPlex code.

Python 8,730 1,235 Updated Mar 2, 2026

Go bindings for WebRTC AudioProcessing module (echo cancellation, noise suppression, AGC, VAD)

Go 2 Updated Jan 20, 2026

stfu

HTML 1,113 56 Updated Jan 27, 2026