moe

Here are 14 public repositories matching this topic...

lucienhuangfu / eLLM

eLLM runs LLM inference on CPUs faster than on GPUs

inference transformer moe llama minimax cpu-inference qwen llm-infernece rust-llm

Updated Jun 10, 2026
Rust

xigh / herbert-rs

Local LLM inference engine written from scratch in Rust — hand-written AVX-512 assembly kernels, Metal & Vulkan compute shaders. Supports Qwen3, Mistral3, ... Q4/INT8/BF16 quantization.

opensource inference moe ia gemma inference-engine llm qwen openweight

Updated Mar 18, 2026
Rust

noobping / listenmoe

Listen to J-POP and K-POP, or pause and resume the live stream. Stream and metadata provided by LISTEN.moe.

radio music rust app listen moe listen-moe adwaita gtk4 listenmoe gtk4-rs adw-gtk4

Updated May 20, 2026
Rust

DPRK-KCC / catbox

Rust wrapper for the Catbox.moe API

moe catbox litterbox

Updated May 14, 2026
Rust

AIdevsmartdata / chimere

Rust-native MoE inference runtime with custom CUDA kernels for Blackwell GPUs. Includes DFlash speculative decoding, multi-tier Engram memory, and entropy-adaptive routing. Targets Qwen3.5-35B-A3B on a single RTX 5060 Ti 16GB.

rust ffi cuda inference moe quantization mamba state-space-models deltanet blackwell engram llm qwen speculative-decoding sm120 mamba2 nemotron-h hybrid-ssm

Updated Apr 25, 2026
Rust

musshiyaki / sebas

Run Qwen3.5-122B-A10B on a 16 GB MacBook Air via SSD-streamed MoE expert weights.

macos metal inference moe on-device-ai apple-silicon llm local-llm qwen qwen3 ssd-streaming

Updated Jun 3, 2026
Rust

GOBA-AI-Labs / moe-stream

SSD-streaming MoE inference engine for consumer hardware. Run 80B parameter models on a 24GB Mac.

rust metal inference moe pruning mixture-of-experts apple-silicon gguf ssd-streaming

Updated Feb 22, 2026
Rust

faisalmumtaz89 / Lumen

LLM inference in Rust - Metal & CUDA

rust metal cuda inference nvidia transformer openai moe llm llm-serving anthropic llm-inference qwen

Updated Jun 5, 2026
Rust

NAME0x0 / OMNI

PERSPECTIVE v2 — A 1.05 trillion parameter sparse Mixture-of-Experts language model that runs on consumer hardware (4 GB VRAM + 32 GB RAM). Features O(1) perspective decay recurrence, 3D torus manifold routing, native ternary {-1,0,+1} weights, holographic distributed memory, and hard geometric safety constraints. Built in Rust.

Updated Feb 20, 2026
Rust

chunix64 / novel-rs

Multipage novel crawler and scraper written in Rust

rust cli scraper downloader moe novels

Updated Jun 17, 2025
Rust

Edg6183 / chimere

Rust inference server for hybrid State-Space and MoE language models, with fast GPU throughput on consumer hardware

rust ffi cuda moe quantization mamba state-space-models github-config deltanet blackwell engram llm qwen speculative-decoding mamba2 hybrid-ssm

Updated Jun 10, 2026
Rust

iahuang / cosmoe

Enabling inference of large mixture-of-experts (MoE) models on Apple Silicon using dynamic offloading.

rust ml moe large-language-models

Updated Mar 11, 2026
Rust

julianmb / hcc-edge-moe

Heterogeneous Compute Cascade (HCC) — distributed 400B-parameter MoE inference across dual AMD Ryzen AI MAX+ 395 'Strix Halo' workstations via USB4.

rust amd inference moe rocm usb4 llm speculative-decoding strix-halo

Updated May 17, 2026
Rust

mrunalpendem123 / meshthatworks

Frontier AI on the Macs you already own. Treats your SSD as memory and splits models across paired devices, so 18 GB models run on 8 GB Macs. Local, private, OpenAI-compatible — works with Claude Code, Cursor, any agent. Built on iroh + SwiftLM.

rust ai peer-to-peer moe mlx local-first apple-silicon llm distributed-inference openai-compatible

Updated May 26, 2026
Rust

