Starred repositories
Makes your AI agent think like the laziest senior dev in the room. The best code is the code you never wrote.
Analyzing token cost and time for the available ant vs oai trajectories.
Open-source observability tool that uses AI agents to self-heal your software
Clearing the nanoGPT speedrun's 3.28 val-loss target on one H200 by stacking the Aurora optimizer and Token Superposition Training (TST) on an untouched Transformer.
Local Responses-API shim that exposes Factory BYOK models (and optional ChatGPT GPT-5.5 passthrough) to Codex Desktop.
+3M Downloads! Repair invalid LLM JSON, commonly used to parse the output of LLMs — Parsing ChatGPT and llm JSON stream response — Partial and incomplete JSON parser python library for OpenAI | rep…
Validate, repair, and retry LLM structured outputs. 13 repair strategies for common JSON malformations, JSON Schema validation, and retry-with-feedback prompts.
Rust-backed repair of malformed JSON for LLM-style outputs
Permanent memory for AI agents. Single binary, zero dependencies, MCP native.
Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
(HAM) Memory system for AI coding agents. Cut token usage by 80% by scoping context to directories.
The Context OS for Autonomous AI Agents. Distill terminal noise into pure semantic signal, stop agent hallucinations, and cut token costs by up to 90%.
Never stop coding. Free AI gateway: one endpoint, 160+ providers (50+ free), connect Claude Code, Codex, Cursor, Cline & Copilot to FREE Claude/GPT/Gemini. RTK+Caveman stacked compression saves 15-…
The batteries-included agent harness.
An ongoing, collaborative meta-analysis about Human-AI-Interactions. We aggregate data and knowledge to build a non-abrasive, user-friendly prompting framework tailored to LLM mechanics, ensuring r…
Official implementation of the paper "ACON: Optimizing Context Compression for Long-horizon LLM Agents"
🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
Parallax Engine plugin for OpenCode -- friction-loop verification, mode switching (plan/build/debug), multi-perspective reasoning, and the 4 invariants framework
vLLM fork for Tesla V100 (SM70) with AWQ 4-bit support, CUDA 12.8 build flow, and validated Qwen3.5 27B/35B deployment on multi-GPU V100.
ADHD — a skill for coding agents. Tree-of-thought with pruning, built on the Claude & Codex Agent SDK. Fans out parallel divergent thoughts under different cognitive frames, scores, prunes traps, d…
Fast, lossless LLM inference via dual-view diffusion decoding.
Tasks for planning - enhanced with Hiveminds for multi-model reviews and Swarms for long-running autonomous tasks. Nurse to keep things running!
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
[ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
DFlash: Block Diffusion for Flash Speculative Decoding
Live-SWE-agent: live, runtime self-evolving software engineering agent