slime is an LLM post-training framework for RL Scaling.

Python 6,680 963 Updated Jun 22, 2026

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 1,456 91 Updated Jun 8, 2026

Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).

HTML 18,644 1,207 Updated Jun 22, 2026

Open source Ghostty-based macOS terminal with vertical tabs and notifications. Built for AI coding agents and programmability.

Swift 22,678 1,784 Updated Jun 22, 2026

Official source code of FreeCAD, a free and open-source multiplatform 3D parametric modeler.

C++ 31,679 5,687 Updated Jun 22, 2026

extract all your personal data history from cursor, codex, claude-code, windsurf, and trae

Python 811 78 Updated Jan 21, 2026

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 60,014 10,338 Updated Nov 12, 2025

AI agent toolkit: unified LLM API, agent loop, TUI, coding agent CLI

TypeScript 64,723 7,889 Updated Jun 22, 2026

cuda-oxide is an experimental Rust-to-CUDA compiler that lets you write (SIMT) GPU kernels in safe(ish), idiomatic Rust. It compiles standard Rust code directly to PTX — no DSLs, no foreign languag…

Rust 2,804 193 Updated Jun 22, 2026

Fully featured mobile and web client for Codex and Claude Code, with realtime voice and encryption

TypeScript 22,137 1,843 Updated Jun 22, 2026

Universal AI coding proxy. Use Claude Code, Codex CLI, or any tool with DeepSeek, GLM, MiniMax, and more—without rate limits breaking your flow.

Rust 22 4 Updated May 25, 2026

Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.

Python 359 82 Updated Jun 22, 2026

RTX 6000 Pro Wiki — Running Large LLMs (Qwen3.5-397B, Kimi-K2.5, GLM-5) on PCIe GPUs without NVLink

Python 459 31 Updated Jun 22, 2026

Control panel for vLLM, SGLang, llama.cpp, exllamav3

TypeScript 1,180 94 Updated Jun 18, 2026

A unified library of SOTA model optimization techniques like quantization, distillation, pruning, neural architecture search, speculative decoding, etc. It compresses deep learning models for downs…

Python 2,967 453 Updated Jun 22, 2026

If tinygrad wasn't small enough for you...

Python 811 100 Updated Mar 9, 2024

Overworld's local world client interface to run Waypoint world models

TypeScript 126 17 Updated Jun 17, 2026

Repository for the CUDA H100 Course

65 6 Updated Apr 12, 2026

SpectralQuant: Calibrated Eigenbasis Rotation and Water-Filled Bit Allocation for KV-Cache Compression

Python 195 22 Updated May 15, 2026

Autonomous GPU Kernel Generation & Optimization via Deep Agents

Python 455 76 Updated Jun 6, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 194,165 109,909 Updated Jun 8, 2026

Our first fully AI generated deep learning system

Python 630 48 Updated Feb 2, 2026

LLM inference in C/C++

C++ 1,922 326 Updated Jun 20, 2026

Production-grade client-side tracing, profiling, and analysis for complex software systems.

C++ 6,126 811 Updated Jun 22, 2026

Show usage stats for OpenAI Codex and Claude Code, without having to login.

Swift 15,229 1,257 Updated Jun 22, 2026

TypeScript 73 4 Updated Jun 16, 2026

NanoGPT (124M) in 90 seconds

Python 5,434 816 Updated Jun 21, 2026

DFlash: Block Diffusion for Flash Speculative Decoding

Python 5,194 373 Updated May 10, 2026

CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…

C++ 985 83 Updated May 28, 2026

Voice-to-text with push-to-talk for Wayland compositors

Rust 866 63 Updated Jun 17, 2026