Starred repositories
DFlash: Block Diffusion for Flash Speculative Decoding
An enhanced top program to monitor your NVIDIA DGX Spark's hardware
High-performance interactive system monitor for NVIDIA DGX systems — GPU, CPU, memory, disk, network in a beautiful TUI
Bidirectional Telegram bot plugin for Paperclip - push notifications, bot commands, inline approve/reject buttons, reply routing
Bidirectional Discord integration for Paperclip: notifications, slash commands, and community intelligence
A benchmark for LLMs on complicated tasks in the terminal
A coding agent optimized for smaller LLMs
sparkrun - launch, manage, and stop LLM inference workloads on NVIDIA DGX Spark systems
Docker configuration for running vLLM on dual DGX Sparks
The open-source app everyone uses to manage agents at work
🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman
DFlash vLLM for DGX Spark — Plug & Play Block-Diffusion Speculative Decoding
Qwen3.6-35B-A3B-heretic NVFP4 + DFlash speculative decoding on DGX Spark (GB10/sm_121a). Source-built vLLM image + 7 patches + comprehensive deployment guide.
Lossless abliteration of Qwen3.6-27B with NVFP4 hardware quantization for DGX Spark / Blackwell. BF16 (51 GB) + NVFP4 (26 GB) deployment guide, docker-compose, and QuickStart.
A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …
Menubar Tool to set Charge Limits and Prolong Battery Lifespan
A high-throughput and memory-efficient inference and serving engine for LLMs
Community recipes for serving LLMs on RTX 3090. Multi-engine (vLLM, llama.cpp, SGLang) and model-agnostic. Currently shipping Qwen3.6-27B configs for 1× and 2× cards.
Composio powers 1000+ toolkits, tool search, context management, authentication, and a sandboxed workbench to help you build AI agents that turn intent into action.
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
From the team behind Gatsby, Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.
An extension suite that turns Pi into a multi-agent orchestration platform
AGENTS.md — a simple, open format for guiding coding agents
The agent that grows with you
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
This project demonstrates Agent-to-Agent (A2A) communication between different agent frameworks, enabling distributed tracing and conversation across multiple…