Stars
Efficient implementations for emerging model architectures
Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Official repository for CMU Machine Learning Department's 10717: "The Art of the Paper".
An awesome repository and a comprehensive survey on the interpretability of LLM attention heads.
Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.
Chat Templates for HuggingFace Large Language Models
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models)
A playbook for systematically maximizing the performance of deep learning models.
A quick guide (especially) for trending instruction finetuning datasets
Reading list for research topics in state-space models
Metric learning (npair loss & angular loss) on MNIST, with t-SNE visualization
A list of contrastive learning papers
PyTorch implementation for the ICLR 2020 paper "Understanding the Limitations of Variational Mutual Information Estimators"