Stars
Compile programs directly into transformer weights. Includes a 2D convex-hull KV cache with O(log n) inference.
A non-saturating, open-ended environment for evaluating LLMs in Factorio
Code repository for the paper "Mission: Impossible Language Models."
Code repository for the UltraFastBERT paper
Merge Transformers language models using gradient parameters.
SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.
Interpretability for sequence generation models 🐛 🔍
Scripts that were used to scrape and process data from Yandex.Q
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
A prize for finding tasks that cause large language models to show inverse scaling
Russian coreference resolution made as simple and accessible as could be
Official code for LEWIS, from: "LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer", ACL-IJCNLP 2021 Findings by Machel Reid and Victor Zhong
Library for Russian rap generation.
Structured state space sequence models
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
1st place solution for RuSimpleSentEval
Probing suite for evaluation of Russian embedding and language models
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
Telegram Data Clustering Contest (Bossy Gnu's submission)
Winning entry for Telegram Data Clustering competition