ChatTS-Training

This is a modified version of LLaMA-Factory that supports training ChatTS models.

News

  • 2025/11/04: LoRA and Qwen3 training are now supported.
  • 2025/08/01: We have updated the code for data preprocessing. No need to preprocess the dataset before training now!
    • Please download the latest datasets (we have updated them) from ChatTS-Training-Dataset.
    • If you want to generate the datasets yourself, please use the no encoding option instead of sp encoding when generating the data.

Requirements

Follow the installation steps in LLaMA-Factory. Make sure that flash-attention and DeepSpeed are installed.
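
To verify the environment before launching training, a quick import check like the one below can help. This is an optional sketch; flash_attn and deepspeed are the usual module names of those two packages.

    # Optional environment check: confirm flash-attention and DeepSpeed import cleanly.
    import importlib

    for pkg in ("flash_attn", "deepspeed"):
        try:
            mod = importlib.import_module(pkg)
            print(f"{pkg}: OK (version {getattr(mod, '__version__', 'unknown')})")
        except ImportError as exc:
            print(f"{pkg}: MISSING ({exc})")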

Instructions for converting Qwen base models to ChatTS format

  1. Download the base models (Qwen3-8B or Qwen2.5-14B-Instruct) from Hugging Face.
  2. Replace *.py, added_tokens.json, config.json, special_tokens_map.json, and tokenizer_config.json in the base model folder with the files from the ChatTS-8B or ChatTS-14B repo, according to the type of your base model.
  3. Initialization: To ensure training stability, we strongly recommend using Xavier normal initialization for the parameters of ts_encoder. You can first load the model created in the previous steps with AutoModelForCausalLM.from_pretrained in Python, then apply Xavier normal initialization to the parameters of model.ts_encoder, and finally save the model with save_pretrained. For detailed API usage, please refer to the official Transformers documentation.
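
The snippet below is a minimal sketch of this initialization step. The model paths are placeholders, and the trust_remote_code flag is an assumption; the model.ts_encoder attribute name follows the description above.

    # Sketch: load the converted model, Xavier-initialize ts_encoder, and save it.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "path/to/converted-base-model",   # folder prepared in steps 1-2 (placeholder path)
        trust_remote_code=True,           # assumption: the ChatTS config ships custom modeling code
        torch_dtype=torch.float32,        # initialize in full precision
    )

    # Xavier (Glorot) normal initialization for all ts_encoder weight matrices.
    # Tensors with fewer than 2 dims (biases, norm scales) are left at their defaults,
    # since Xavier initialization needs a fan-in and fan-out to be computed.
    for param in model.ts_encoder.parameters():
        if param.dim() > 1:
            torch.nn.init.xavier_normal_(param)

    # Save the re-initialized weights; if you save to a new folder, copy the
    # tokenizer and config files from step 2 into it as well.
    model.save_pretrained("path/to/chatts-initialized-model")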

Steps to reproduce

  1. Download the training datasets from ChatTS-Training-Dataset. Put the folders under data/ (e.g., data/align_256/, data/align_random/, etc.).
  2. Configure your base model (see the conversion instructions above), output model, training datasets, and training parameters in scripts/full/train_stage1.sh and train_stage2.sh.
  3. Run bash scripts/train_stage1.sh and bash scripts/train_stage2.sh.

Use your own datasets

  1. If you want to use your own datasets, put your training data in data/. An example of the dataset format is shown in chatts_dev.jsonl. Set your training data path in data/dataset_info.json (see the sketch after this list).
  2. Configure your base model (see the conversion instructions above), output model, training datasets, and training parameters in scripts/full/dev.sh (for full SFT) or scripts/lora/dev.sh (for LoRA SFT).
  3. Run bash scripts/train_chatts.sh for full SFT. Run bash scripts/train_lora.sh for LoRA SFT.
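
The sketch below illustrates step 1: inspect the bundled example file to see which fields a record uses, then register your own JSONL file in data/dataset_info.json. The path data/chatts_dev.jsonl and the dataset key and file name my_chatts_data are hypothetical placeholders; copy any loader-related keys for the new entry from the existing chatts_dev entry in dataset_info.json.

    # Sketch: learn the record structure from the example file, then register a new dataset.
    import json

    # Print the field names of the first example record so you can mirror them in your own data.
    with open("data/chatts_dev.jsonl") as f:
        example = json.loads(f.readline())
    print("Fields used by chatts_dev.jsonl:", list(example.keys()))

    # Add an entry for your own file (placed under data/) so the training scripts
    # can reference it by name in their dataset argument.
    with open("data/dataset_info.json") as f:
        dataset_info = json.load(f)

    dataset_info["my_chatts_data"] = {
        "file_name": "my_chatts_data.jsonl",  # hypothetical file name under data/
        # Copy the remaining keys (e.g., columns/formatting) from the chatts_dev entry.
    }

    with open("data/dataset_info.json", "w") as f:
        json.dump(dataset_info, f, indent=2, ensure_ascii=False)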

Credit

ChatTS-Training is based on LLaMA-Factory.