Organizations

@1000Memories @nko2 @wealthsimple


🎙️ Give your apps, CLIs, and agents a voice. VoiPi is a universal, zero-dependency, free text-to-speech library for JavaScript.

TypeScript 181 12 Updated Apr 26, 2026

ArtifactFS is a filesystem driver designed to mount large git repos as quickly as possible, hydrating file contents on-the-fly instead of blocking on the initial clone. It's ideal for agents, sandb…

Go 742 28 Updated Apr 23, 2026

Agent Skill to help convert transformer LLMs to mlx-lm

Python 17 1 Updated Apr 16, 2026

KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.

Python 921 78 Updated Apr 23, 2026
Python 6,572 880 Updated Apr 25, 2026

LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar

Python 11,647 1,011 Updated Apr 24, 2026

Official implementation of the paper "System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving"

Shell 13 1 Updated Apr 17, 2026

Lucebox optimization hub: hand-tuned LLM inference, built for specific consumer hardware.

C++ 1,009 80 Updated Apr 26, 2026

Lossless DFlash speculative decoding for MLX on Apple Silicon

Python 585 33 Updated Apr 24, 2026
Python 55 8 Updated Apr 25, 2026

RTX 6000 Pro Wiki — Running Large LLMs (Qwen3.5-397B, Kimi-K2.5, GLM-5) on PCIe GPUs without NVLink

Python 220 19 Updated Apr 27, 2026

:octocat: Static checker for GitHub Actions workflow files

Go 3,818 215 Updated Apr 19, 2026

Create stunning demos for free. Open-source, no subscriptions, no watermarks, and free for commercial use. An alternative to Screen Studio.

TypeScript 33,038 2,224 Updated Apr 27, 2026

The Modular Platform (includes MAX & Mojo)

Mojo 25,910 2,802 Updated Apr 27, 2026

nono - a capability-based, multiplexing sandbox tool, built for developers - lift'n'shift seamless path to prod. Run agents securely without needing any additional infra, zero setup, zero latency.

Rust 2,129 150 Updated Apr 25, 2026

Full-screen TUI worktree manager in Rust

Rust 1 1 Updated Apr 1, 2026
Go 3 Updated Mar 29, 2026

Replace port numbers with stable, named local URLs. For humans and agents.

TypeScript 7,490 237 Updated Apr 27, 2026

Dark mode PDFs without destroying your images.

JavaScript 123 4 Updated Apr 3, 2026

CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies

Rust 36,502 2,213 Updated Apr 26, 2026

Flash weight streaming for MLX: run massive models larger than your RAM on Apple Silicon.

Python 103 8 Updated Apr 1, 2026

Running a big model on a small laptop

Objective-C 3,761 464 Updated Mar 19, 2026

Train the smallest LM you can that fits in 16MB. Best model wins!

Python 4,968 3,306 Updated Apr 27, 2026

Sub-millisecond VM sandboxes for AI agents via copy-on-write forking

Rust 2,249 96 Updated Mar 21, 2026

agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.

Go 1,949 221 Updated Apr 25, 2026

Lean 4 programming language and theorem prover

Lean 7,909 826 Updated Apr 27, 2026

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 42,198 2,811 Updated Apr 27, 2026

sub-500ms latency phone agent orchestration

Python 641 65 Updated Mar 6, 2026

Secure and fast microVMs for serverless computing.

Rust 33,977 2,361 Updated Apr 24, 2026