feat(metadata:diskless): implement managed replicas for diskless topics#492
feat(metadata:diskless): implement managed replicas for diskless topics#492
Conversation
259d178 to
d64e450
Compare
There was a problem hiding this comment.
Pull request overview
Implements Phase 1 of “Diskless Managed Replicas”: when enabled via a new server config, newly-created diskless topics use standard KRaft replica placement with replication factor derived from rack cardinality (instead of legacy RF=1), and manual replica assignments are allowed for diskless topics in managed mode.
Changes:
- Add
diskless.managed.rf.enableserver config and plumb it through broker/controller config wiring. - Update
ReplicationControlManager.createTopicto compute diskless RF from rack cardinality and usereplicaPlacer.place()in managed mode; allow manual assignments in managed mode. - Extend unit tests to cover managed vs unmanaged behavior across no-rack / with-rack / invalid input / fencing & unregister scenarios.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java |
Introduces new diskless.managed.rf.enable config definition and documentation. |
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java |
Adds managed-replica enable flag, rack-cardinality RF computation, and managed-mode placement path in topic creation. |
metadata/src/main/java/org/apache/kafka/controller/QuorumController.java |
Wires the managed-replica flag from controller builder into ReplicationControlManager. |
core/src/main/scala/kafka/server/KafkaConfig.scala |
Exposes disklessManagedReplicasEnabled from server properties. |
core/src/main/scala/kafka/server/ControllerServer.scala |
Passes the new config into the QuorumController.Builder. |
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java |
Adds/updates tests and test context plumbing for managed vs unmanaged diskless behavior. |
checkstyle/suppressions.xml |
Adds a MethodLength suppression for ReplicationControlManager.java. |
Comments suppressed due to low confidence (1)
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java:792
- The INVALID_REPLICATION_FACTOR message hard-codes the “default value (1)”, but when diskless managed replicas are enabled the effective RF is derived from rack cardinality (can be > 1). Update the error message (and/or the preceding comment) to describe the actual allowed inputs (1 or -1) and what -1/managed mode means, without implying the default is always 1.
// Diskless RF: only -1 (system-computed from rack count) or 1 (backward compat) allowed.
// Explicit RF > 1 rejected: users shouldn't need to know rack topology.
if (Math.abs(topic.replicationFactor()) != 1) {
return new ApiError(Errors.INVALID_REPLICATION_FACTOR,
"Replication factor for diskless topics must be 1 or -1 to use the default value (1).");
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java
Show resolved
Hide resolved
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Outdated
Show resolved
Hide resolved
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Outdated
Show resolved
Hide resolved
… unregister scenarios - Add _noRacks and _withRacks test variants for consistent coverage - Fix tests that assumed broker 0 was always the leader - Get actual leader from partition registration before fencing/unregistering - Use dynamic assertions based on actual partition state - Improve assertion error messages for clarity
d64e450 to
3e7a42b
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Outdated
Show resolved
Hide resolved
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Show resolved
Hide resolved
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Show resolved
Hide resolved
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Show resolved
Hide resolved
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
3e7a42b to
6fba0c9
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Show resolved
Hide resolved
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
6fba0c9 to
1f3c7e1
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java
Show resolved
Hide resolved
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
1f3c7e1 to
6f01be4
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
6f01be4 to
82dbca5
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Show resolved
Hide resolved
metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java
Show resolved
Hide resolved
metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Outdated
Show resolved
Hide resolved
Add diskless.managed.rf.enable config (default: false) to control whether diskless topics use managed replicas with RF=rack_count or legacy RF=1. This config only affects topic creation. When enabled, new diskless topics will be created with one replica per rack using standard KRaft placement. Part of Phase 1: Diskless Managed Replicas (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)
When diskless.managed.rf.enable=true, new diskless topics are created with RF=rack_count using standard KRaft replica placement instead of legacy RF=1. Changes: - Compute RF from rack cardinality via rackCardinality() - Use standard replicaPlacer.place() for rack-aware assignment - Allow manual replica assignments when managed replicas enabled - Add checkstyle suppression for extended createTopic method Phase 1 limitations: - Add Partitions inherits RF from existing partitions (Phase 3) - Transformer not updated, uses legacy routing (Phase 2) - Integration tests deferred to Phase 2 (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)
82dbca5 to
5f41dde
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Implements Phase 1 of Diskless Managed Replicas (see #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md).
When
diskless.managed.rf.enable=true, new diskless topics are created with RF = rack_count (one replica per rack) using standard KRaft replica placement, instead of legacy RF=1.Changes
diskless.managed.rf.enableserver config (default: false)replicaPlacer.place()for rack-aware assignmentKnown Limitations (Phase 1)
This feature is opt-in only via
diskless.managed.rf.enable=true.Existing clusters and topics are unaffected unless explicitly enabled.
When enabled:
Partition Reassignment: Not yet supported for managed replica topics. Reassignment will hang waiting for replica sync. Will be fixed in follow-up PR.
Transformer: Not updated - uses legacy routing (Phase 2). KRaft metadata shows RF=3, but clients are still routed to any alive broker. Doesn't affect correctness for DISKLESS_ONLY topics.
Add Partitions: Works correctly - inherits RF and uses rack-aware placement.
Observability metrics: Deferred to Phase 2.
Safe to merge because:
falseby defaultFollow-up PRs Sequence
Testing
Configuration
diskless.managed.rf.enablefalse