View bkataru's full-sized avatar
🌩️
boomin at da speed of c ⚡🔧

Organizations

@oakblr @KREA-Science-Club @impulsesproject @planckeon @godsfromthemachine @thezaptrack @dirmacs @BK-Modding

Starred repositories

Forge: Swarm Agents That Turn Slow PyTorch Into Fast CUDA/Triton Kernels

TypeScript 13 2 Updated Jan 30, 2026

Minimal TPU implementation with 8x8 systolic array and PyTorch integration

Python 59 5 Updated Jan 26, 2026

Pure Triton kernels for Qwen3.5-27B inference on NVIDIA B200

Python 97 9 Updated Feb 28, 2026

Run a 1-billion parameter LLM on a $10 board with 256MB RAM

C 1,483 183 Updated Feb 22, 2026

Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.

Python 1,138 107 Updated Mar 19, 2026

Development stack for AI-assisted multi-repo work. Persistent memory, quality gates, deployment automation. CLI + Claude Code plugin.

Rust 1 Updated Apr 9, 2026

🔷 Zig implementation of TOON (Token-Oriented Object Notation) — Compact, human-readable JSON encoding for LLM prompts with 30-60% token reduction. Spec-compliant v3.0 implementation.

Zig 1 Updated Jan 20, 2026

Zig implementation of TOON (Token-Oriented Object Notation).

Zig 1 Updated Mar 25, 2026

Efficient Universal Perception Encoder: a single on-device vision encoder with versatile representations that match or exceed specialized experts across multiple task domains.

Python 434 24 Updated Apr 6, 2026

The highest-scoring AI memory system ever benchmarked. And it's free.

Python 28,520 3,588 Updated Apr 8, 2026

Understand any codebase instantly. System intelligence for codebases, built for humans and AI.

TypeScript 115 11 Updated Apr 9, 2026

Using LLMs for iteratively exploring the solution search space at scale.

TypeScript 395 35 Updated Apr 6, 2026

Customize Claude Code's system prompts, create custom toolsets, input pattern highlighters, themes/thinking verbs/spinners, customize input box & user message styling, support AGENTS.md, unlock pri…

TypeScript 1,612 125 Updated Apr 9, 2026

All parts of Claude Code's system prompt, 24 builtin tool descriptions, sub agent prompts (Plan/Explore/Task), utility prompts (CLAUDE.md, compact, statusline, magic docs, WebFetch, Bash cmd, secur…

JavaScript 8,488 1,571 Updated Apr 9, 2026

Model parallel transformers in JAX and Haiku

Python 6,366 884 Updated Jan 21, 2023

Code for GPT-4chan

Python 637 78 Updated Jun 3, 2022

A modern model graph visualizer and debugger

JavaScript 1,428 146 Updated Apr 8, 2026

A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.

Kotlin 19,604 1,832 Updated Apr 8, 2026

Cross-platform, customizable ML solutions for live and streaming media.

C++ 34,625 5,892 Updated Apr 8, 2026

LiteRT, successor to TensorFlow Lite, is Google's on-device framework for high-performance ML & GenAI deployment on edge platforms, via efficient conversion, runtime, and optimization.

C++ 2,136 268 Updated Apr 9, 2026

The open-source memory operating system for AI agents. Persistent memory, semantic search, loop detection, agent messaging, crash recovery, and real-time observability.

Python 93 18 Updated Apr 8, 2026

Asynchronous Zoned Unified Runtime Environment for Agentic LLMs

Rust 1 Updated Apr 7, 2026

PokeClaw (PocketClaw) — first on-device AI that controls your Android phone. Gemma 4, no cloud, no API key. Poke is short for Pocket.

Kotlin 329 44 Updated Apr 8, 2026

LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.

Rust 1,280 82 Updated Apr 9, 2026

[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 10,754 943 Updated Apr 4, 2026

A new chunking strategy developed by ZeroEntropy for general semantic chunking using Llama-70B.

Python 256 20 Updated Jan 28, 2025

AI-powered penetration testing assistant using a local LLM on Linux (Parrot OS)

Python 1,909 382 Updated Apr 8, 2026

JiuwenClaw is an intelligent AI Agent built on openJiuwen. It extends the powerful capabilities of large language models directly to your fingertips through various communication apps you use daily.

Python 378 69 Updated Apr 9, 2026