[docs] Add chat templates page to web docs#5581
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
albertvillanova
left a comment
There was a problem hiding this comment.
Thanks! Just a small suggestion due to recent code changes...
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
|
|
||
| TRL ships a small collection of Jinja2 chat templates under [`trl/chat_templates/`](https://github.com/huggingface/trl/tree/main/trl/chat_templates). They serve two purposes: | ||
|
|
||
| 1. **Identity comparison**: detecting which model is being used (by comparing `processing_class.chat_template` against known templates) to add the appropriate response schema ([`add_response_schema`]) or swap in a training template ([`get_training_chat_template`]). |
There was a problem hiding this comment.
- ([`add_response_schema`])
+ (`add_response_schema`)now that it's not in the doc anymore
There was a problem hiding this comment.
Nice!
I'd suggest a few modifications, flagging the rationale:
Reading the original intro cold, I felt it opened mid-thought: it jumped straight into "they serve two purposes: identity comparison and training patches", which is framed from the implementer's side (ie, why these files exist internally) rather than the reader's (=user) side: "why do I, a TRL user, care?". I reworked it along this arc:
- Define what a chat template is (+ link to transformers docs, tiny
apply_chat_templateexample). - Reassure: most users never touch them; TRL handles it transparently.
- Motivate the page with the two user-facing scenarios (SFT -assistant_only_loss=True`, GRPO tool calls).
- Say what to do: TRL auto-patches supported families; for others, you patch yourself.
Collapsed "Original templates" into "Supported model families". The per-file stubs ("Original Qwen3 chat template.", ...) weren't user-facing: those originals exist for TRL's internal identity-comparison; a user would never use them directly -> collapsed to a one-line family list
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
What does this PR do?
Follow-up to #5545, adding the content to the web docs
Before submitting
AI writing disclosure
We welcome the use of AI tools to help with contributions. For transparency and to help us improve our review process, please indicate the level of AI involvement in this PR.
Who can review?
@albertvillanova @qgallouedec
Note
Low Risk
Documentation-only changes (new page, navigation entry, and links) with no runtime or API impact.
Overview
Adds a new
Chat Templatesconceptual guide documenting TRL’s bundled/patched training chat templates (prefix-preserving +generationmarkers) and which model families are supported.Wires the page into the docs nav (
_toctree.yml) and adds cross-references fromchat_template_utils.md, plus contextual notes ingrpo_trainer.md(tools require prefix-preserving templates) andsft_trainer.md(assistant-only loss requiresgenerationmarkers and points to the bundled template list).Reviewed by Cursor Bugbot for commit 6099d3c. Bugbot is set up for automated code reviews on this repo. Configure here.