- chapter2: Training your first network
- chapter3: Building a language model step by step
- chapter4: ViT + ConvNeXt
- moe: study notes at https://jhqxxx.github.io/moe.html
- LLMs-from-scratch: https://github.com/rasbt/LLMs-from-scratch/
- Llama: https://zhuanlan.zhihu.com/p/636784644
- Transformer: https://arxiv.org/pdf/1706.03762
- ConvNeXt: https://arxiv.org/pdf/2201.03545
- ViT: https://arxiv.org/pdf/2010.11929