Migrate tests to Qwen3.5 Think/NoThink fixtures #5821
Merged
qgallouedec merged 1 commit on May 22, 2026
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 1544c50.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
Migrates the test suite from the legacy `tiny-Qwen3_5ForConditionalGeneration` fixture to the sibling `-Think` / `-NoThink` variants introduced in #5819.

Chat-template tests now exercise both variants, so the `qwen3_5_4b_and_above.jinja` template gains coverage it didn't have before (no fixture was sourced from the 4B checkpoint until now). Trainer/utility tests, which only need a tiny Qwen3.5 model and don't care about default thinking behavior, are mechanically pointed at `-NoThink` (matching the 0.8B source of the legacy fixture, so behavior is unchanged for those tests).

Note: legacy Hub repo can be deleted
After this PR merges, `trl-internal-testing/tiny-Qwen3_5ForConditionalGeneration` (the unsuffixed repo) will have no remaining references in the codebase and can be deleted by a maintainer if desired.

Changes
`tests/test_chat_template_utils.py`: three parametrize sites split into `qwen35-nothink` + `qwen35-think` pairs:
- `TestAddResponseSchema.test_add_response_schema_vlm`
- `TestSupportsToolCalling.test_supports_tool_calling` (`transformers>=5.0.0` `skipif` mark duplicated onto the Think entry)
- `TestParseResponse` class-level parametrize

Mechanical rename of `tiny-Qwen3_5ForConditionalGeneration` to `tiny-Qwen3_5ForConditionalGeneration-NoThink` at 8 sites:
- `tests/test_sft_trainer.py`
- `tests/test_grpo_trainer.py`
- `tests/test_rloo_trainer.py`
- `tests/test_dpo_trainer.py`
- `tests/test_data_utils.py`

These trainer tests don't exercise thinking behavior; they only need a small Qwen3.5 checkpoint. `-NoThink` preserves the original 0.8B sourcing, so test behavior is unchanged.

Part of #5471
Follows #5819
cc: @qgallouedec
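For readers unfamiliar with the pattern, the split parametrize sites described under Changes might look roughly like the sketch below. This is only an illustration, not the code from the PR: the test function, the `QWEN35_PARAMS` list, and the `TRANSFORMERS_LT_5` flag are hypothetical stand-ins, and the real suite presumably derives the skip condition from the installed `transformers` version. Only the repo id, the `-NoThink` / `-Think` suffixes, and the `qwen35-nothink` / `qwen35-think` ids come from the description.

```python
import pytest

# Repo id of the legacy fixture as named in the PR description; the two
# variants append "-NoThink" / "-Think" to it.
BASE = "trl-internal-testing/tiny-Qwen3_5ForConditionalGeneration"

# Hypothetical stand-in for a real version check against the installed
# transformers package (assumption: not the flag used in the actual tests).
TRANSFORMERS_LT_5 = False

requires_transformers_5 = pytest.mark.skipif(
    TRANSFORMERS_LT_5, reason="requires transformers>=5.0.0"
)

# One entry per variant; the skip mark is attached to each entry separately,
# so neither variant silently loses the version guard.
QWEN35_PARAMS = [
    pytest.param(f"{BASE}-NoThink", id="qwen35-nothink", marks=requires_transformers_5),
    pytest.param(f"{BASE}-Think", id="qwen35-think", marks=requires_transformers_5),
]


@pytest.mark.parametrize("model_id", QWEN35_PARAMS)
def test_model_id_has_variant_suffix(model_id):
    # Placeholder assertion; a real test would load the tokenizer or processor.
    assert model_id.endswith(("-NoThink", "-Think"))
```

Attaching the mark per `pytest.param` entry, rather than once on the test function, keeps each variant independently skippable if their requirements ever diverge.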
Note
Low Risk
Low risk: changes are limited to test fixture/model IDs and parametrization, with no production code or training logic modifications.
Overview
Updates the test suite to stop using the legacy `tiny-Qwen3_5ForConditionalGeneration` fixture and instead target the new Qwen3.5 `-NoThink` and `-Think` variants.

Chat-template/response-parsing coverage is expanded to run against both Qwen3.5 variants (including duplicating `transformers>=5.0.0` skip marks where needed), while trainer/data utility tests are mechanically switched to `-NoThink` to preserve prior default behavior.

Reviewed by Cursor Bugbot for commit 1544c50. Bugbot is set up for automated code reviews on this repo.
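One way a maintainer could confirm that the unsuffixed fixture has no remaining references before deleting the Hub repo is a small scan like the sketch below. It is an assumption-laden illustration rather than tooling from this PR: the helper names `find_legacy_references` and `scan_tree` are hypothetical, and the scan only looks at `*.py` files.

```python
import re
from pathlib import Path

LEGACY = "tiny-Qwen3_5ForConditionalGeneration"

# Match the legacy name only when it is NOT followed by "-Think" or "-NoThink".
LEGACY_RE = re.compile(re.escape(LEGACY) + r"(?!-(?:No)?Think)")


def find_legacy_references(text):
    """Return 1-based line numbers that still mention the unsuffixed fixture."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if LEGACY_RE.search(line)
    ]


def scan_tree(root):
    """Map each Python file under `root` to the lines that reference the legacy id."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        lines = find_legacy_references(path.read_text(encoding="utf-8"))
        if lines:
            hits[str(path)] = lines
    return hits


sample = "\n".join(
    [
        'model_id = "trl-internal-testing/tiny-Qwen3_5ForConditionalGeneration"',
        'model_id = "trl-internal-testing/tiny-Qwen3_5ForConditionalGeneration-NoThink"',
        'model_id = "trl-internal-testing/tiny-Qwen3_5ForConditionalGeneration-Think"',
    ]
)
print(find_legacy_references(sample))
```

Calling `scan_tree("tests")` from the repository root would then be expected to return an empty dict once the migration is complete.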