Stars
Learn it. Build it. Ship it for others.
Give Claude Code a memory that evolves with your codebase. Hooks automatically capture sessions, the Claude Agent SDK extracts key decisions and lessons, and an LLM compiler organizes everything in…
Minimal and readable coding agent harness implementation in Python to explain the core components of coding agents.
Portable multi-agent AI developer setup for Claude Code + Ollama. Role-based local LLM orchestration via Bash — plan, code, review, commit. Zero Dependency. Works with any language stack.
Stop Claude Code from burning through your quota in 20 minutes. Auto-rotates oversized sessions and preserves context.
A ClaudeOps framework for using the bypass-permissions / skip-permissions-dangerously mode more safely.
Never lose context between Claude Code sessions again. Auto-checkpoint, resume, EOD summaries — all local.
Adaptive Precision for EXpert Models: MoE-aware mixed-precision quantization
Reference implementation with llama.cpp, extended to distributed inference across machines, with a real end-to-end demo.
Fork of https://github.com/elastic/supply-chain-monitor with local AI backend (vLLM/llama.cpp)
Turbo1Bit: Combining 1-bit LLM weights (Bonsai) with TurboQuant KV cache compression for maximum inference efficiency. 4.2x KV cache compression + 16x weight compression = ~10x total memory reduction.
AI agent dreaming system — replay, consolidation, creative synthesis during quiet hours
OpenShell is the safe, private runtime for autonomous AI agents.
Beads - A memory upgrade for your coding agent
LLM inference in C/C++ with changes from Prism-ML to support 1Bit models
A fast, helpful, and open-source document parser
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
A fast, secure CLI tool for managing Kubernetes kubeconfig files.
Linux GPU Configuration And Monitoring Tool
Every meeting, every idea, every voice note — searchable by your AI. Open-source, privacy-first conversation memory layer.
OCR model that handles complex tables, forms, handwriting with full layout.
InfraLens is a next-generation observability tool that uses eBPF to automatically discover and visualize service-to-service communication in Kubernetes clusters—without requiring any code changes o…
TheTom / llama-cpp-turboquant
Forked from ggml-org/llama.cpp. LLM inference in C/C++.
Madreag / turbo3-cuda
Forked from TheTom/llama-cpp-turboquant. LLM inference in C/C++.