MiniLLM

轻量级 LLM 训练、对齐、部署一体化项目，面向从 0 到 1 的学习与 DeepSeek-V3.2 架构复现。

本仓库由 MiniMind 项目重构而来，保留“从零实现轻量级 LLM”的教学目标，并补全数据、训练、评估与部署流程。

✨ 特性

DeepSeek-V3.2 架构复现：MLA + MoE + MTP + DSA（MLX/Torch 对照实现）
端到端训练链路：预训练 → SFT → 偏好对齐（DPO/GRPO/PPO/SPO）→ 蒸馏
训练与推理：原生 PyTorch + DeepSpeed + MLX（Apple Silicon）
数据管线：清洗、去重、质量评估、RustBPE 分词
部署方式：Streamlit WebUI、OpenAI 协议 API、llama.cpp/vLLM/Ollama 导出
评估工具：C-Eval、CMMLU、OpenBookQA 等基准评测

🚀 快速开始

1) 环境准备

conda create -n minillm python=3.10 -y
conda activate minillm
pip install -r requirements.txt

如果下载较慢，可使用清华源：

python -m pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt

2) 数据准备

将原始语料放在 dataset/ 或自定义目录
运行 scripts/prepare_data.sh 完成去重、分词、过滤
处理后的数据会同步到 data/ 供训练脚本使用

3) 一键训练

# 预训练 → SFT → DPO
scripts/run.sh

# 跳过预训练，仅执行 SFT + DPO
scripts/run.sh --skip-pretrain

# 烟雾测试（CPU + 小数据）
scripts/run.sh --smoke-test

4) WebUI

python -m streamlit run scripts/web_demo.py

训练日志、权重与评估输出默认保存在 out/。

🍎 MLX（Apple Silicon）

# 自动跑通下载数据 → 预训练 → SFT
bash scripts/run_mlx.sh

# Smoke Test
bash scripts/run_mlx.sh --smoke-test

MLX 产物默认写入 out/mlx，WebUI 会自动解析最新 step_ checkpoint。

🧪 蒸馏（可选）

MLX 一键蒸馏（Ollama 教师模型）

# 需要先启动 ollama serve，并拉取教师模型（如 qwen3:0.6b）
bash scripts/run_mlx_distill_ollama.sh

可通过环境变量调整：

OLLAMA_MODEL=qwen3:0.6b DATA_JSONL=out/distill_ollama_qwen3_0.6b/synth.jsonl OUT_DIR=out/mlx_distill/qwen3_0.6b_sft \
  bash scripts/run_mlx_distill_ollama.sh

MTP 投机解码（MiniLLM / DeepSeek-V3.2 架构）

MTP 使用模型自带的 multi-token prediction 头，不再需要单独训练 speculator。

--spec_len 控制最多采纳的 draft 长度（上限 = 1 + num_nextn_predict_layers）。

MiniLLM（Torch）

python speculator/infer/torch/bench.py \
  --target_arch minillm \
  --minillm_ckpt out/pretrain_512.pth \
  --minillm_tokenizer ./model \
  --spec_len 2

MiniLLM（MLX）

python speculator/infer/mlx/bench.py \
  --target_arch minillm \
  --minillm_ckpt_dir out/mlx/sft/checkpoints/step_00000050 \
  --minillm_tokenizer ./model \
  --spec_len 2

MLX 推理/训练依赖 mlx-lm（当前与 transformers==5.0.0rc1 绑定），建议使用独立虚拟环境。

兼容：EAGLE-3 speculator（Qwen3）

如需对 Qwen3 使用 EAGLE-3 draft 模型，可继续使用 speculator/train/* 与 speculator/infer/*（参数与旧版保持一致）。

PyTorch 蒸馏训练

# 默认读取 out/ 中的 full_sft_512.pth（学生）与 full_sft_768.pth（教师）
python trainer/train_distillation.py --data_path dataset/sft_xxx.jsonl --out_dir out

🧪 推理与部署

OpenAI 兼容 API：python scripts/serve_openai_api.py（默认端口 8998）
评测/推理脚本：python eval_model.py --model_mode 1
训练监控面板：python -m scripts.dashboard.app --host 0.0.0.0 --port 8008

🧭 仓库结构

.
├── apps/                # 服务与 UI（OpenAI API / WebUI / Dashboard）
├── data/                # 数据缓存目录
├── dataset/             # 公开数据集示例与脚本
├── docs/                # 文档与指南
├── speculator/          # Speculator/MTP 解码与推理入口（torch/mlx）
├── mlx_train/           # MLX 训练与推理
├── model/               # MiniLLM Dense/MoE 实现
├── pipelines/           # 一键训练/推理流水线脚本（主逻辑）
├── scripts/             # 脚本与工具
├── tokenizer/           # RustBPE 分词与词表
├── trainer/             # 训练/对齐/蒸馏脚本
├── tools/               # 数据/评测/转换/分词等工具脚本
└── utils/               # 公共工具与评估脚本

📚 资源与文档

docs/README.md：文档入口与导航
docs/deepseek_v3_2_mlx_cn.md：DeepSeek-V3.2 MLX 架构说明
docs/deepseek_v3_2_mlx.md：DeepSeek-V3.2 MLX 架构说明（EN）
docs/booklet_cn.md：完整中文小册子
docs/changelog/CHANGELOG.md：版本记录
ModelScope: MiniMind-Reasoning
ModelScope: MiniMind
Bilibili 视频介绍

🤝 贡献指南

欢迎通过 Issue 或 Pull Request 反馈问题和改进建议。请先阅读 docs/CODE_OF_CONDUCT.md，并参考 AGENTS.md 了解项目约定。

📄 许可协议

本项目采用 MIT License。在引用或再发布模型与数据时，请遵守相应许可证要求。

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MiniLLM

✨ 特性

🚀 快速开始

1) 环境准备

2) 数据准备

3) 一键训练

4) WebUI

🍎 MLX（Apple Silicon）

🧪 蒸馏（可选）

MLX 一键蒸馏（Ollama 教师模型）

MTP 投机解码（MiniLLM / DeepSeek-V3.2 架构）

MiniLLM（Torch）

MiniLLM（MLX）

兼容：EAGLE-3 speculator（Qwen3）

PyTorch 蒸馏训练

🧪 推理与部署

🧭 仓库结构

📚 资源与文档

🤝 贡献指南

📄 许可协议

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 327 Commits
apps		apps
assets		assets
configs/dashboard		configs/dashboard
data/chinese		data/chinese
dataset		dataset
docs		docs
mlx_train		mlx_train
model		model
pipelines		pipelines
rustbpe		rustbpe
scripts		scripts
speculator		speculator
tests		tests
tokenizer		tokenizer
tools		tools
trainer		trainer
utils		utils
.editorconfig		.editorconfig
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
README.md		README.md
README_en.md		README_en.md
analyze_cleaned_data.py		analyze_cleaned_data.py
eval_model.py		eval_model.py
eval_vlm.py		eval_vlm.py
pyproject.toml		pyproject.toml
pyrightconfig.json		pyrightconfig.json
requirements.txt		requirements.txt

License

ai-clarify/mini-llm

Folders and files

Latest commit

History

Repository files navigation

MiniLLM

✨ 特性

🚀 快速开始

1) 环境准备

2) 数据准备

3) 一键训练

4) WebUI

🍎 MLX（Apple Silicon）

🧪 蒸馏（可选）

MLX 一键蒸馏（Ollama 教师模型）

MTP 投机解码（MiniLLM / DeepSeek-V3.2 架构）

MiniLLM（Torch）

MiniLLM（MLX）

兼容：EAGLE-3 speculator（Qwen3）

PyTorch 蒸馏训练

🧪 推理与部署

🧭 仓库结构

📚 资源与文档

🤝 贡献指南

📄 许可协议

About

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages