Pinned
- llm_causal_reasoning (Public, Jupyter Notebook): Code for synthetic data generation, GRPO/DAPO/SFT training, and reasoning trace analysis to study algorithmic generalization of RL post-training.
- annotation (Public, Python): A library for collaborative prompt engineering to annotate social media posts.
- batched_vocabulary_optimization (Public, Shell): Training UnigramLM-style tokenizers jointly with a Transformer task model.
- dblm (Public, Python): Language models that condition on joint probability distributions and interleave probabilistic inference with next-token prediction.
- regression-gradient-estimator (Public, Python): Demo of using regression over perturbations to estimate gradients.
- cslm (Public, Shell): Synthesizing code-switching data from a language model trained only on parallel or separate monolingual corpora in two languages.