Stars
A retargetable MLIR-based machine learning compiler and runtime toolkit.
Development repository for the Triton language and compiler
TokenSpeed is a speed-of-light LLM inference engine.
FlyDSL is the Python front-end of the project: Flexible LaYout DSL.
High-performance, light-weight C++ LLM and VLM Inference Software for Physical AI
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
All information and news about the Falcon-H1 series
Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.
Interactive 3D visualization of dense decoder-only LLM inference. Companion to the AI Inference Engineer 2026 course.
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
Turn your PC, Mac, or Linux box into an AI server. LLM inference, chat UI, voice, agents, workflows, RAG, and image generation.
Rust home automation runtime for Genie: local device graph, deterministic actuation safety, audit logs, and AI-native home-control APIs.
Jetson Orin-tuned LLM inference runtime for GenieClaw: memory-first, power-aware, zero-allocation. C++17 + CUDA.
Token weight loss. Lean output compaction for terminal-heavy agent workflows. Works as a native CLI tool or as an extension to popular coding and agent frameworks.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
ai-hpc / jetson-esp-hosted
Forked from espressif/esp-hosted
Hosted Solution (Jetson Linux) with ESP32 (Wi-Fi + BT + BLE)
GeniePod Home V1 hardware: MVP testing build, wiring, BOM, and planned interface-board/enclosure docs.
Low-latency, limited-context AI harness for private on-device homes.
Install and run NemoClaw on NVIDIA Jetson Orin with a patched OpenShell cluster image and streamlined onboarding.
High-performance C++/CUDA GPU-accelerated STARK prover for Triton VM
Master AI inference, AI agent harness systems, and hardware engineering, then design a physical AI chip. That is the goal.
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
High-performance C++/CUDA GPU-accelerated XNT Miner