
Align KTO with DPO: Align precompute_ref_logps #5850

Merged
albertvillanova merged 4 commits into main from align-kto-dpo-precompute_ref_logps
May 26, 2026

@albertvillanova (Member) commented May 26, 2026

Align KTO with DPO: Align precompute_ref_logps.

Part of:

Changes

Variable renaming for clarity:

  • Renamed all instances of precompute_ref_log_probs to precompute_ref_logps in KTOTrainer init.
  • Replaced if "reference_logps" in batch with if self.precompute_ref_logps in _compute_loss.

Note

Low Risk
Small refactor in the KTO loss path. Behavior changes only when precompute is disabled but a batch still carries reference_logps; in normal use, that key should be present only when precompute mode is enabled.
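The risk note can be checked by enumerating the two conditions side by side. The following is a minimal sketch, not TRL code: `old_uses_batch_ref`, `new_uses_batch_ref`, and `differing_cases` are hypothetical helpers written only to compare the key-presence check with the flag check.

```python
from itertools import product


def old_uses_batch_ref(batch: dict) -> bool:
    # Pre-PR condition: the presence of the key decides.
    return "reference_logps" in batch


def new_uses_batch_ref(precompute_ref_logps: bool) -> bool:
    # Post-PR condition: the trainer flag decides.
    return precompute_ref_logps


def differing_cases() -> list[tuple[bool, bool]]:
    """Return (flag, key_in_batch) pairs where the two conditions disagree."""
    out = []
    for flag, has_key in product((False, True), repeat=2):
        batch = {"reference_logps": [0.0]} if has_key else {}
        if old_uses_batch_ref(batch) != new_uses_batch_ref(flag):
            out.append((flag, has_key))
    return out


if __name__ == "__main__":
    for flag, has_key in differing_cases():
        print(f"precompute_ref_logps={flag}, key_in_batch={has_key}")
```

The conditions disagree in two combinations. The first (flag off, key present) is the case the note describes. The second (flag on, key absent) presumably does not arise if precomputation always writes the key into the dataset, which is an assumption of this sketch rather than something the PR states.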

Overview
Aligns KTO with DPO by storing the trainer flag as self.precompute_ref_logps (set from args.precompute_ref_log_probs), replacing the old self.precompute_ref_log_probs attribute everywhere that path is checked, including Liger validation and dataset precomputation.

In _compute_loss, reference log-probs are taken from the batch only when self.precompute_ref_logps is true, replacing the previous "reference_logps" in batch check so precomputed references are tied to the config, not accidental batch keys.
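As a rough illustration of the pattern described in this overview, the sketch below uses toy stand-ins rather than the real TRL classes. `ToyKTOConfig`, `ToyKTOTrainer`, `_compute_ref_logps`, and `_reference_logps` are hypothetical names; only `precompute_ref_log_probs`, `precompute_ref_logps`, and the `reference_logps` batch key come from the PR text.

```python
from dataclasses import dataclass


@dataclass
class ToyKTOConfig:
    # The public argument keeps the longer name, per the PR overview.
    precompute_ref_log_probs: bool = False


class ToyKTOTrainer:
    def __init__(self, args: ToyKTOConfig):
        # The trainer stores the flag under the shorter, DPO-aligned name.
        self.precompute_ref_logps = args.precompute_ref_log_probs

    def _compute_ref_logps(self, batch: dict) -> list[float]:
        # Hypothetical stand-in for a reference-model forward pass.
        return [-1.0 for _ in batch["input_ids"]]

    def _reference_logps(self, batch: dict) -> list[float]:
        # The config flag, not the presence of a batch key, picks the source.
        if self.precompute_ref_logps:
            return batch["reference_logps"]
        return self._compute_ref_logps(batch)
```

With the flag off, a stray `reference_logps` key in the batch is ignored and the reference values are computed on the fly; with the flag on, the batch values are used directly.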

Reviewed by Cursor Bugbot for commit 34b8124.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@albertvillanova merged commit 2ffaabd into main on May 26, 2026
5 checks passed
@albertvillanova deleted the align-kto-dpo-precompute_ref_logps branch on May 26, 2026 at 20:50
3 participants