# moe
Here are 8 public repositories matching this topic...
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Topics: bloom, falcon, moe, gemma, mistral, mixture-of-experts, model-quantization, multi-gpu-inference, m2m100, llamacpp, llm-inference, internlm, llama2, qwen, baichuan2, mixtral, phi-2, deepseek, minicpm
Updated Mar 15, 2024 · C++
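The repositories under this topic serve mixture-of-experts (MoE) models, which route each token through a small subset of experts chosen by a learned gate. Below is a minimal sketch of top-k gating in isolation; every name and shape in it is an illustrative assumption, not code from any listed repository.

```python
# Illustrative sketch: top-k gating as used in mixture-of-experts (MoE) layers.
# Each token is routed to the k experts with the highest gate scores, and the
# selected expert outputs are combined with softmax-normalized gate weights.
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """x: (d,) token vector; gate_w: (n_experts, d); experts: list of callables."""
    logits = gate_w @ x                          # one gate score per expert
    top = np.argsort(logits)[-k:]                # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                     # softmax over the selected experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: 4 experts, each a random linear map over 8-dimensional tokens.
rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda x, W=rng.normal(size=(d, d)): W @ x for _ in range(n_experts)]
gate_w = rng.normal(size=(n_experts, d))
print(moe_forward(rng.normal(size=d), gate_w, experts))
```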
⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API, SSD streaming for 100B+ MoE models, TurboQuant KV cache compression, + iOS iPhone app.
Updated Apr 3, 2026 · C++
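The entry above advertises an OpenAI-compatible API. A minimal client sketch follows, assuming the server is reachable at http://localhost:8080/v1 and serves a model named my-moe-model; both values are illustrative assumptions, not details taken from the repository.

```python
# Minimal sketch: POST a chat completion request to an OpenAI-compatible
# endpoint. The base URL, port, and model name are assumptions for
# illustration, not values taken from the repository.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"   # assumed local server address
payload = {
    "model": "my-moe-model",             # assumed model identifier
    "messages": [
        {"role": "user", "content": "Explain mixture-of-experts in one sentence."}
    ],
    "max_tokens": 128,
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])
```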
🚀 Run any LLM on any hardware. 130% faster MoE inference with ExpertFlow + TurboQuant KV compression. Ollama-compatible API. Built on llama.cpp.
Topics: machine-learning, ai, cpp, gpu, optimization, cuda, inference, moe, quantization, rocm, nvidia-gpu, amd-gpu, mixture-of-experts, openai-api, llm, llama-cpp, local-llm, llm-inference, ollama
Updated Apr 1, 2026 · C++
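This entry advertises an Ollama-compatible API built on llama.cpp. Below is a minimal sketch of a non-streaming request to the standard Ollama generate endpoint, assuming the server listens on the default Ollama port 11434 and hosts a model named mixtral; the port and model name are assumptions, not taken from the repository.

```python
# Minimal sketch: call an Ollama-compatible /api/generate endpoint.
# The host, port, and model name are assumptions for illustration;
# 11434 is the default Ollama port, which a compatible server may also use.
import json
import urllib.request

payload = {
    "model": "mixtral",        # assumed model name
    "prompt": "Briefly describe mixture-of-experts routing.",
    "stream": False,           # request a single JSON response instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```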