
rdkafka doesn't honor the same broker ID when moved to new host #343

Closed
liljenback opened this issue Aug 5, 2015 · 4 comments
@liljenback

When migrating an existing broker to a new host and re-using its broker ID (from the host that is decommissioned), rdkafka still remembers the old hostname/broker-ID combination even when it is no longer available (brought offline). The client does not start using the new host, which results in messages being lost for partitions that are reassigned to it. The Kafka documentation seems to imply that clients should be able to start using the new host with the same ID without having to be restarted.
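The scenario can be reduced to a question of what keys the client's broker cache. A toy sketch (illustrative names only, not librdkafka's actual API): if cached brokers are matched by node id, the next metadata refresh repoints a reused id at its new hostname; if they are matched by hostname, the client stays stuck on the decommissioned host.

```python
# Toy model of the reported bug (hypothetical code, not librdkafka internals).
# The cache maps node id -> current "host:port"; applying a metadata refresh
# by id automatically moves traffic for a reused id to the new host.

def apply_metadata(cache_by_id, metadata_brokers):
    """Update cached broker addresses, matching entries by node id."""
    for broker_id, host in metadata_brokers:
        cache_by_id[broker_id] = host
    return cache_by_id

cache = {1: "old-host:9092"}          # broker id 1 on the decommissioned host
refreshed = apply_metadata(cache, [(1, "new-host:9092")])
print(refreshed[1])                   # new-host:9092 -- traffic follows the id
```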

@edenhill edenhill added the bug label Aug 5, 2015
edenhill (Contributor) commented Aug 5, 2015

By "messages lost" you mean they are successfully produced to the old broker, right?

liljenback (Author)

Right, the messages will be produced to the old broker, but in this case it is offline.

edenhill added a commit that referenced this issue Aug 11, 2015
This also migrates bootstrap brokers to proper brokers
if they can be exactly matched by name.
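The bootstrap-migration part of that commit can be sketched as follows (hypothetical names and sentinel value, not librdkafka's internals): a broker learned from `bootstrap.servers` has no node id yet, and when a Metadata response lists a broker whose name exactly matches the bootstrap address, the cached entry adopts that id and becomes a proper broker.

```python
# Illustrative sketch of migrating a bootstrap broker to a proper broker
# by exact name match (assumed, simplified model of the commit's behaviour).

BOOTSTRAP_ID = -1  # sentinel: broker came from bootstrap.servers, id unknown

def promote_bootstrap(cache, metadata_brokers):
    """cache: list of {"id", "host"}; metadata_brokers: list of (id, host)."""
    for broker_id, host in metadata_brokers:
        for b in cache:
            if b["id"] == BOOTSTRAP_ID and b["host"] == host:
                b["id"] = broker_id  # exact name match: promote to real broker
    return cache

cache = [{"id": BOOTSTRAP_ID, "host": "kafka1:9092"}]
promote_bootstrap(cache, [(1, "kafka1:9092")])
print(cache[0]["id"])  # 1
```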
edenhill (Contributor)

Should be fixed now.

edenhill (Contributor)

If possible, please try verifying on your end.

mfleming added a commit to mfleming/librdkafka that referenced this issue Nov 2, 2023
The Kafka protocol allows brokers to have multiple host:port pairs for a given node Id; see, e.g., the UpdateMetadata request, which contains a live_brokers list where each broker Id has a list of host:port pairs. It follows that what uniquely identifies a broker is its Id, not its host:port.

The behaviour right now is that if we have multiple brokers with the
same host:port but different Ids, the first broker in the list will be
updated to have the Id of whatever broker we're looking at as we iterate
through the brokers in the Metadata response in
rd_kafka_parse_Metadata0(), e.g.

 Step 1. Broker[0] = Metadata.brokers[0]
 Step 2. Broker[0] = Metadata.brokers[1]
 Step 3. Broker[0] = Metadata.brokers[2]

A typical situation where brokers have the same host:port pair but
differ in their Id is if the brokers are behind a load balancer.

The NODE_UPDATE mechanism responsible for this was originally added in
b09ff60 ("Handle broker name and nodeid updates (issue confluentinc#343)") as a way
to forcibly update a broker hostname if an Id is reused with a new host
after the original one was decommissioned. But this isn't how the Java
Kafka client works, so let's use the Metadata response as the source of
truth instead of updating brokers if we can only match by their
host:port.
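The three steps above can be sketched as a toy model (hypothetical code, not the real rd_kafka_parse_Metadata0()): with three brokers behind one load balancer, metadata reports the same host:port under three distinct Ids. Matching by host:port makes every pass rewrite the same cached entry's Id, so the client ends up with one broker; matching by Id keeps three.

```python
# Sketch of the misbehaviour described above vs. the id-based alternative
# (illustrative only; not librdkafka's actual code).

def apply_by_name(cache, metadata):
    """Match cached brokers by host:port; the first match wins each time."""
    for md in metadata:
        for b in cache:
            if b["host"] == md["host"]:
                b["id"] = md["id"]    # Broker[0] takes each Id in turn
                break
        else:
            cache.append(dict(md))
    return cache

def apply_by_id(cache, metadata):
    """Match cached brokers by node id, the unique identifier."""
    by_id = {b["id"]: b for b in cache}
    for md in metadata:
        by_id.setdefault(md["id"], dict(md))["host"] = md["host"]
    return list(by_id.values())

metadata = [{"id": i, "host": "lb:9092"} for i in (0, 1, 2)]

print(len(apply_by_name([], metadata)))  # 1 -- ids 0 and 1 were overwritten
print(len(apply_by_id([], metadata)))    # 3 -- one broker per id
```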
emasab pushed a commit to mfleming/librdkafka that referenced this issue Jun 10, 2024