# moe
Here are 6 public repositories matching this topic...
[ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Updated Oct 5, 2025 - Shell
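The repository above accompanies the Drop-Upcycling paper, whose title describes training a sparse Mixture of Experts by upcycling a dense model with partial re-initialization of the experts. The sketch below is only an illustration of that general idea in PyTorch, not the paper's actual code: the function name, the `reinit_ratio` parameter, and the init scheme are all assumptions.

```python
# Hypothetical sketch of upcycling with partial re-initialization:
# copy a dense FFN into each expert, then re-initialize a random fraction
# of each expert's weights so the experts can diversify during training.
# Names and reinit_ratio are illustrative, not taken from the paper's code.
import copy
import torch
import torch.nn as nn

def upcycle_with_partial_reinit(dense_ffn: nn.Module, num_experts: int,
                                reinit_ratio: float = 0.5) -> nn.ModuleList:
    experts = nn.ModuleList()
    for _ in range(num_experts):
        expert = copy.deepcopy(dense_ffn)              # start from the dense weights
        with torch.no_grad():
            for p in expert.parameters():
                mask = torch.rand_like(p) < reinit_ratio   # entries to reset
                fresh = torch.empty_like(p)
                nn.init.normal_(fresh, std=0.02)           # re-initialize those entries
                p.copy_(torch.where(mask, fresh, p))
        experts.append(expert)
    return experts
```

The intuition, as suggested by the title, is that experts copied verbatim from one dense FFN start out identical; resetting part of their weights breaks that symmetry so they can specialize.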
HydraNet is a state-of-the-art transformer architecture that combines Multi-Query Attention (MQA), Mixture of Experts (MoE), and continuous learning capabilities.
Updated Nov 3, 2025 - Shell
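HydraNet's description mentions a Mixture of Experts component, but its implementation is not shown on this page. The block below is a generic top-k-routed MoE layer in PyTorch, included only to illustrate the technique; the class name `TopKMoE` and all hyperparameters are assumptions, not HydraNet's API.

```python
# Minimal top-k MoE routing sketch (illustrative only). Each token is routed
# to its k highest-scoring experts and the expert outputs are combined with
# softmax weights over the selected router logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); flatten batch and sequence dims before calling.
        logits = self.router(x)                        # (tokens, num_experts)
        weights, idx = logits.topk(self.k, dim=-1)     # pick k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                sel = idx[:, slot] == e                # tokens routed to expert e
                if sel.any():
                    out[sel] += weights[sel, slot, None] * expert(x[sel])
        return out
```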
Instructions for building our Käpsele
docker education benchmarking tools university mcp chatbot inference multi-agent moe tutoring prompts rag llm code-interpreter openwebui hfwu kaepsele
Updated Oct 17, 2025 - Shell