Skip to content

feat(inkless): ensure unclean leader election is disabled on classic-to-diskless switch [KC-129]#647

Open
EelisK wants to merge 2 commits into
mainfrom
EelisK/KC-129-unclean-leader-election
Open

feat(inkless): ensure unclean leader election is disabled on classic-to-diskless switch [KC-129]#647
EelisK wants to merge 2 commits into
mainfrom
EelisK/KC-129-unclean-leader-election

Conversation

@EelisK

@EelisK EelisK commented Jun 15, 2026

Copy link
Copy Markdown
Member

No description provided.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR strengthens the classic-to-diskless topic switch preconditions by ensuring unclean leader election is not enabled (either already effective via defaults/overrides or being enabled by the same AlterConfigs request), and adds unit/integration coverage around the new behavior.

Changes:

  • Add controller-side validation that rejects classic→diskless switches when unclean.leader.election.enable is enabled (or being enabled) for the topic.
  • Update existing switch-precondition unit tests to use the per-resource validation API and add new tests for unclean-leader-election cases (including legacy AlterConfigs behavior).
  • Add an integration test verifying rejection when unclean leader election is enabled at the cluster/broker-default level, plus a test that unclean-leader-election enabling is rejected while a switch is pending.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java Adds unclean-leader-election checks to classic→diskless switch preconditions (incremental + legacy).
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java Refactors precondition tests to new API and adds unit tests for unclean-leader-election switch rejection/allow cases (including legacy).
core/src/test/java/kafka/server/InklessTopicTypeSwitcherClusterTest.java Adds cluster-level integration coverage for switch rejection when unclean leader election is enabled and for rejecting unclean enable while switch is pending.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java Outdated
Comment thread metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java Outdated
@EelisK EelisK force-pushed the EelisK/KC-129-unclean-leader-election branch from 4d1ee75 to d284e21 Compare June 15, 2026 09:56
@EelisK EelisK force-pushed the EelisK/KC-129-unclean-leader-election branch from d284e21 to 5cfb85c Compare June 15, 2026 12:40
@EelisK EelisK marked this pull request as ready for review June 16, 2026 06:50
"Cannot enable unclean leader election for topic " + resource.name() +
": topic has a pending classic-to-diskless switch.");
}
if (!isSettingConfigToTrue(changes, DISKLESS_ENABLE_CONFIG)) return ApiError.NONE;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this moved later? Can we move it back up so it returns earlier for these cases?

"Expected 'unclean leader election' in error, got: " + ex.getCause().getMessage());
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

nit: remove double new line

&& !isDisklessTopic(resource.name())
&& "false".equals(configurationControl.currentTopicConfig(resource.name()).get(DISKLESS_ENABLE_CONFIG));
if (!explicitlyEnabling && !implicitlyEnabling) return ApiError.NONE;
if (isDisklessTopic(resource.name())) return ApiError.NONE;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not guard against the case where the topic is already diskless but the switch is still pending. The incremental alter configs logic blocks this instead.

}

@Test
public void testSwitchRejectedWhenUncleanLeaderElectionAlreadyEnabled() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test that verifies this: reject a classic→diskless switch when unclean leader election would be effectively enabled (incremental + legacy paths) ?

E.g. mark pending via markClassicToDisklessSwitchStarted + replay, then call validateClassicToDisklessSwitchPrecondition with unclean=SET=true and assert rejection

Comment on lines 3005 to 3008
ApiError validateClassicToDisklessSwitchPrecondition(
ConfigResource resource,
Map<ConfigResource, Map<String, Entry<OpType, String>>> configChanges
) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this check is missing: an incremental DELETE of an unclean=false override on a pending-switch topic, where the cluster default is true. This is now allowed but it should not be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants