MobileLLM
MobileLLM: Optimizing Sub-billion Parameter Language Models
MobileLLM is a lightweight large language model (LLM) framework developed by Facebook Research and optimized for on-device deployment, where computational and memory efficiency are critical. Introduced in the ICML 2024 paper “MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases”, it targets strong reasoning and generalization in models with fewer than one billion parameters. The framework combines several architectural choices: SwiGLU activation, a deep and thin network design, embedding sharing, and grouped-query attention (GQA). Together these improve the trade-off between model size, inference speed, and accuracy. The 125M and 350M variants outperform previous state-of-the-art models of the same scale by up to 4.3% on zero-shot commonsense reasoning tasks.
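To make the effect of these design choices on model size concrete, the following is a minimal sketch that estimates the weight count of a decoder-only transformer using SwiGLU feed-forward layers, grouped-query attention, and optional embedding sharing. It is not MobileLLM's code: the `TinyLMConfig` class, the `count_params` function, and every default value (vocabulary size, width, depth, head counts) are hypothetical, chosen only to land near the 125M scale, and the estimate ignores biases and normalization weights.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TinyLMConfig:
    """Hypothetical configuration; names and values are illustrative assumptions."""
    vocab_size: int = 32000
    dim: int = 576            # model width (thin)
    n_layers: int = 30        # depth (deep)
    n_heads: int = 9          # query heads
    n_kv_heads: int = 3       # key/value heads shared across query heads (GQA)
    ffn_hidden: int = 1536    # hidden width of the SwiGLU feed-forward block
    share_embeddings: bool = True  # reuse input embedding as the output projection


def count_params(cfg: TinyLMConfig) -> int:
    """Approximate weight count, ignoring biases and normalization weights."""
    if cfg.dim % cfg.n_heads != 0:
        raise ValueError("dim must be divisible by n_heads")
    if cfg.n_heads % cfg.n_kv_heads != 0:
        raise ValueError("n_heads must be divisible by n_kv_heads")

    head_dim = cfg.dim // cfg.n_heads
    kv_dim = cfg.n_kv_heads * head_dim

    # Attention: query and output projections are dim x dim;
    # key and value projections shrink to dim x kv_dim under GQA.
    attention = 2 * cfg.dim * cfg.dim + 2 * cfg.dim * kv_dim

    # SwiGLU uses three matrices: gate, up, and down projections.
    feed_forward = 3 * cfg.dim * cfg.ffn_hidden

    # With sharing, one vocab x dim matrix serves as both input and output layer.
    embedding = cfg.vocab_size * cfg.dim
    if not cfg.share_embeddings:
        embedding *= 2

    return cfg.n_layers * (attention + feed_forward) + embedding


base = TinyLMConfig()
shared_gqa = count_params(base)
unshared = count_params(replace(base, share_embeddings=False))
full_mha = count_params(replace(base, n_kv_heads=base.n_heads))

print(f"shared embeddings + GQA: {shared_gqa / 1e6:.1f}M")
print(f"without embedding sharing: {unshared / 1e6:.1f}M")
print(f"without GQA (one KV head per query head): {full_mha / 1e6:.1f}M")
```

Under these assumed values, the embedding table is a noticeable fraction of the total, which is why sharing it matters far more at this scale than it would in a multi-billion-parameter model. The parameters saved by sharing and GQA can then be spent on additional layers at the same overall budget.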