This is a modified version of LLaMA-Factory that supports training ChatTS models.
- 2025/11/04: LoRA and Qwen3 training are now supported.
- 2025/08/01: We have updated the data preprocessing code. There is no need to preprocess the dataset before training anymore!
  - Please download the latest datasets (we have updated them) from ChatTS-Training-Dataset.
  - If you want to generate the datasets yourself, use `no` encoding instead of `sp` encoding when generating the data.
Follow the installation steps in LLaMA-Factory, and make sure that flash-attention and DeepSpeed are installed.
- Download the base models (Qwen3-8B or Qwen2.5-14B-Instruct) from Hugging Face.
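  For example, with the `huggingface_hub` Python package (a minimal sketch; the repo IDs are the public Qwen repos, and the `local_dir` path is a placeholder):

  ```python
  from huggingface_hub import snapshot_download

  # Download the base model weights; pick the repo that matches your setup.
  snapshot_download(
      repo_id="Qwen/Qwen2.5-14B-Instruct",  # or "Qwen/Qwen3-8B"
      local_dir="models/Qwen2.5-14B-Instruct",  # placeholder path
  )
  ```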
- Replace `*.py`, `added_tokens.json`, `config.json`, `special_tokens_map.json`, and `tokenizer_config.json` in the base model folder with the files in the ChatTS-8B or ChatTS-14B repo, according to the type of your base model.
- Initialization: To ensure training stability, we strongly recommend using Xavier normal initialization for the parameters of `ts_encoder`. You can first load the model created in the previous steps using `AutoModelForCausalLM.from_pretrained` in Python, then apply Xavier normal initialization to the `model.ts_encoder` part, and finally save the model using `save_pretrained`. For detailed API usage, please refer to the official Transformers documentation.
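  A minimal sketch of this step, assuming the patched model exposes the encoder as `model.ts_encoder` as described above (paths are placeholders):

  ```python
  import torch
  from transformers import AutoModelForCausalLM

  # Load the model assembled in the previous steps; trust_remote_code is
  # needed because the ChatTS *.py files were copied into the model folder.
  model = AutoModelForCausalLM.from_pretrained(
      "path/to/patched-base-model",  # placeholder path
      torch_dtype=torch.bfloat16,
      trust_remote_code=True,
  )

  # Apply Xavier normal initialization to every weight matrix in the
  # time-series encoder; 1-D tensors (biases, norms) are left as-is,
  # since xavier_normal_ requires at least two dimensions.
  for param in model.ts_encoder.parameters():
      if param.dim() > 1:
          torch.nn.init.xavier_normal_(param)

  # Save the initialized model (tokenizer files can be copied over separately).
  model.save_pretrained("path/to/initialized-model")  # placeholder path
  ```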
- Download the training datasets from ChatTS-Training-Dataset and put the folders under `data/` (e.g., `data/align_256/`, `data/align_random/`, etc.).
- Configure your base model (see the instructions below), output model, training datasets, and training parameters in `scripts/full/train_stage1.sh` and `train_stage2.sh`.
- Run `bash scripts/train_stage1.sh` and `bash scripts/train_stage2.sh`.
- If you want to use your own datasets, put your training data in `data/`. An example of the dataset format is shown in `chatts_dev.jsonl`. Set your training data path in `data/dataset_info.json` (see the sketch after this list).
- Configure your base model (see the instructions below), output model, training datasets, and training parameters in `scripts/full/dev.sh` (for full SFT) or `scripts/lora/dev.sh`.
- Run `bash scripts/train_chatts.sh` for full SFT. Run `bash scripts/train_lora.sh` for LoRA SFT.
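A minimal sketch of registering a custom dataset in `data/dataset_info.json`. The dataset name and file name here are hypothetical, and the sketch assumes `dataset_info.json` already contains an entry (e.g., for `chatts_dev`) whose field layout your data follows; cloning that entry avoids guessing field names:

```python
import json
from pathlib import Path

info_path = Path("data/dataset_info.json")
info = json.loads(info_path.read_text())

# Clone an existing entry (assumed here to be "chatts_dev") so the field
# layout matches what the repo already uses, then point it at your file.
entry = dict(info["chatts_dev"])
entry["file_name"] = "my_chatts_data.jsonl"  # hypothetical file under data/
info["my_chatts_data"] = entry               # hypothetical dataset name

info_path.write_text(json.dumps(info, indent=2))
```

You can then reference `my_chatts_data` as the dataset name in your training script.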