Stars
Open-source, community-driven agent harness
A high-throughput and memory-efficient inference and serving engine for LLMs
Experimental implementation of DeepSeek v4 Flash in llama.cpp
DeepSeek 4 Flash and PRO local inference engine for Metal, CUDA and ROCm
Lossless DFlash speculative decoding for MLX on Apple Silicon
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
FlashMLA: Efficient Multi-head Latent Attention Kernels
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
DeepSeek Coder: Let the Code Write Itself
A kernel library written in tilelang
Docker configuration for running vLLM on dual DGX Sparks
The agent that grows with you
Fast LLM speculative inference server for consumer hardware.
ClashFX — macOS proxy tool with Enhanced Mode (TUN)
Chinese translation of Real-Time Rendering, 4th Edition (RTR4)
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Your Personal AI Assistant; easy to install, deploy on your own machine or on the cloud; supports multiple chat apps with easily extensible capabilities.
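A close reading of the HarmonyOS kernel source code, with over a million Chinese characters of annotations and analysis; a hundred blog posts dissect the kernel in depth, down to its foundations. Annotations stay in sync with the official source, tools and documentation are complete, and the content is published on multiple sites. weharmonyos.com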
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
Run compilers interactively from your web browser and interact with the assembly
Stop renting your intelligence. Own it with AnythingLLM. Everything you need for a powerful local-first agent experience
Lightweight Lenovo Vantage and Hotkeys replacement for Lenovo Legion laptops.
Guides, Tricks, and Tips to get the Legion Go running best on Linux