An Activation Offloading Framework to SSDs for Faster Large Language Model Training
4th_sem_Paraphrase_Project_(Mainly_trainning phase)_(py_and_dataset)
ismail is a from-scratch Turkish language model implementation designed for low-end hardware, built and trained on a single RTX 5070 (12GB).
Restaurant Name and Menu Generator is a web app that uses Streamlit, Hugging Face Transformers, and LangChain to generate unique restaurant names and customized menus based on different cuisines. Enter a country's name to receive creative restaurant ideas and menu options.
Virtual assistant project exploring LLM tool use.
AsymCheck: Asymmetric Partitioned Checkpointing for Efficient Large Language Model Training
Train transformer language models with reinforcement learning.
Latency and Memory Analysis of Transformer Models for Training and Inference
An implementation of a part-of-speech (POS) tagger using Hidden Markov Models. It takes English text as input and tags each word as a noun, verb, adjective, etc., based on the sequence of words around it.
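The idea behind an HMM tagger can be sketched with Viterbi decoding over toy probabilities. This is a minimal illustration, not the repository's code; the tag set, probability tables, and the `1e-6` unknown-word smoothing are all illustrative assumptions.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most likely tag sequence for `words` under a toy HMM."""
    # V[t][tag] = probability of the best tag path ending in `tag` at step t
    V = [{tag: start_p[tag] * emit_p[tag].get(words[0], 1e-6) for tag in tags}]
    back = [{}]  # back[t][tag] = previous tag on that best path
    for t in range(1, len(words)):
        V.append({})
        back.append({})
        for tag in tags:
            best_prev = max(tags, key=lambda p: V[t - 1][p] * trans_p[p][tag])
            V[t][tag] = (V[t - 1][best_prev] * trans_p[best_prev][tag]
                         * emit_p[tag].get(words[t], 1e-6))
            back[t][tag] = best_prev
    # Trace back from the most probable final tag
    last = max(tags, key=lambda tag: V[-1][tag])
    path = [last]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Hand-set toy model: "dogs walks" should decode as NOUN VERB.
tags = ["NOUN", "VERB"]
start_p = {"NOUN": 0.7, "VERB": 0.3}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.8, "VERB": 0.2}}
emit_p = {"NOUN": {"dogs": 0.6, "walks": 0.1},
          "VERB": {"dogs": 0.05, "walks": 0.5}}

print(viterbi(["dogs", "walks"], tags, start_p, trans_p, emit_p))
# → ['NOUN', 'VERB']
```

In a real tagger the tables would be estimated from a tagged corpus rather than set by hand, and probabilities would typically be kept in log space to avoid underflow on long sentences.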
Doing devious stuff with AI
A hands-on lab to train, fine-tune, serve, and evaluate LLMs — from scratch to deployment.
A production-ready pipeline for fine-tuning large language models using Unsloth with YAML-based configuration, advanced training features, and web-based model serving.
Kurdish Kurmanji LLM train script
Train (fine-tune) a small large language model with 24 GB of VRAM! (training pipeline built by hand)
Multi-agent KTO: Reinforcing Strategic Interactions of Large Language Model in Language Game