Suggestion: Use TRL Chat template utilities to resolve chat-template issues

Since some models do not preserve exact conversation prefix, it causes tokenizer drift and discrepancy.
For example, Qwen3 drops the `<think>` context of past turns, which causes discrepancies in RL training.
I've seen that you have uploaded your own Qwen3 model which manually fixes this issue.

I found that TRL tackled it as well and provide a structured fix for many models, 
including assistant token masks and fixing such issues. 

I suggest checking out: [TRL Template Utils](https://huggingface.co/docs/trl/chat_template_utils), as it provides some very useful helper functions and also template patches. 
Some useful functions I encountered:
- `is_chat_template_prefix_preserving()`
- `get_training_chat_template()`

and so on. 

I think installing trl without dependencies is enough to get the functionality from this module.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Suggestion: Use TRL Chat template utilities to resolve chat-template issues #726

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Suggestion: Use TRL Chat template utilities to resolve chat-template issues #726

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions