🐒
Making AI Safer
Making AI Safer. Focus on LLM, RL, Infra
-
NSA-pytorch Public
PyTorch implementation of DeepSeek Native Sparse Attention
-
easy-dualpipe Public
Pipeline-Parallel Lecture: Simplest Dualpipe Implementation.
-
X-R1 Public
Minimal-cost training of a 0.5B R1-Zero model
-
DeepSpeed Public
Forked from deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
-
online-softmax Public
Simplest online-softmax notebook for explaining Flash Attention
-
cut-cross-entropy-pytorch Public
PyTorch notebook implementing cut-cross-entropy for LLM training.
-