Efficient, Flexible, Multi-task Batch SFT Data Labeling Tool
Updated Aug 18, 2025 - Python
Pretrain and finetune ANY AI model of ANY size on multiple GPUs or TPUs with zero code changes.
A modern Python-based MySQL backup tool with flexible archiving, multi-channel notifications (Telegram, Email, SMS, etc.), remote uploads (SFTP, FTP, SCP), and robust configuration validation.
🚀🚀 Train a small 26M-parameter GPT entirely from scratch in just 2 hours!
DICE: Detecting In-distribution Data Contamination with LLM's Internal State
[EMNLP 2025 Findings] Robust Knowledge Editing via Explicit Reasoning Chains for Distractor-Resilient Multi-Hop QA
Custom trl.SFTTrainer that adds a KL divergence loss between a LoRA-adapted model and its base model.
Production-grade LLMOps Framework
EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets
🖼️ Extract and process text from images with an OCR AI agent, featuring tools for image preprocessing and unit conversions.
Advancing Prompt Evolution through Hybridization