Stars
[EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"
Project to generate POS tag dictionary for Ukrainian language
FastAPI Best Practices and Conventions we used at our startup
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
super fast cpp implementation of longest common subsequence/substring
Scikit-learn compatible library for molecular fingerprints and chemoinformatics
A sophisticated data pipeline that creates meaningful collections of ENS names.
Help your users discover ENS names they love with NameGraph.
The new multichain indexer for ENSv2, powered by Ponder.
Hackable and optimized Transformers building blocks, supporting a composable construction.
A library for language transfer methods and algorithms.
A one stop repository for generative AI research updates, interview resources, notebooks and much more!
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Code and sources for the eval-UA-tion benchmark and paper
LLM papers I'm reading, mostly on inference and model compression
GPT-2 Metadata Pretraining Towards Instruction Finetuning for Ukrainian
розмічений руками морфо’, синт’, кореф’ корпус української мови
Natural language processing course thought at AGH University of Science and Technology
Metadata for graphemes that may appear in ENS names
Detailed inspection of labels in ENS names
Easily integrate rich ENS user journeys into your wallet, app, or game.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Unsupervised text tokenizer for Neural Network-based text generation.