Stars
Use Garry Tan's exact Claude Code setup: 23 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA
An Open Foundation Model and Benchmark to Accelerate Generative Recommendation
[PyTorch] The repo contains the code for "FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets"
Eedi - Mining Misconceptions in Mathematics 5th place solution
Solution for the Kaggle competition: MAP - Charting Student Math Misunderstandings
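A collection of industry practice articles on search, recommendation, advertising, user growth, and related topics (sources: Zhihu, DataFunTalk, technical WeChat official accounts)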
A hands-on, step-by-step Huggingface Transformers course, with videos updated in sync on Bilibili and YouTube
Train your own BitBrain (a mini LLM) 🧠 with as little as an RTX 3090 (work in progress).
🚀🚀 [LLM] Train a 64M-parameter GPT entirely from scratch in just 2 hours! 🌏
From a nobody to a large language model (LLM) hero~ Stay tuned for more!!!
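"A Guide to Using Open-Source LLMs" (《开源大模型食用指南》): tutorials tailored for beginners in China on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source LLMs and multimodal LLMs (MLLMs) in a Linux environment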
A review of the top solutions from NLP competitions; covers NLP competitions only and is continuously updated!
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
We want to create a repo to illustrate the usage of Transformers in Chinese
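PyTorch model training code covering single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across methods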
Chinese NLP solutions (large models, data, models, training, inference)
Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.
2023 Kaggle LECR gold medal (Top 3) training code
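Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.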
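AISystem refers mainly to AI systems: full-stack, low-level AI technologies including AI chips, AI compilers, and AI inference and training frameworks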
1st Place Solution for LLM - Detect AI Generated Text Kaggle Competition