Stars
LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
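Minimal-effort CLIs derived from type hints and parsed from the command line, config files, and environment variables
GitHub repository for the books "Introduction to Large Language Models" (2023) and "Introduction to Large Language Models II: Implementation and Evaluation of Generative LLMs" (2024)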
A tool to perform sentence segmentation on Japanese text
A robust text-processing pipeline framework that enables customizable, efficient, and metric-logged text preprocessing.
An open-source NLP research library, built on PyTorch.
Repository for yohei's lecture PDF from the 2018 Cookpad Summer Internship 5 DAY R&D.
Preprocessed data for the Workshop on Statistical Machine Translation (WMT), collected from papers and other projects
Tutorial material for a natural language processing seminar for humanities students. Covers GitHub, virtual environments (Anaconda), Jupyter Notebook, and deep learning code for natural language processing.