KAFKA-16017: Checkpoint restored offsets instead of written offsets #15044

cadonna · 2023-12-18T21:13:40Z

Kafka Streams checkpoints the wrong offset when a task is closed during restoration. Under exactly-once processing guarantees, when a TaskCorruptedException occurs, the affected task is closed dirty, its state is wiped out, and the task is re-initialized. If the task is then closed cleanly during the subsequent restoration, it writes the offsets stored in its record collector to the checkpoint file. Those offsets are the offsets that the task wrote to the changelog topics; in other words, the task writes the end offsets of its changelog topics to the checkpoint file. Consequently, when the task is initialized again on the same Streams client, it reads the checkpoint file and assumes it is fully restored, although the records between the last offsets the task restored before closing clean and the end offsets of the changelog topics are missing locally.

The fix is to clear the offsets in the record collector on close.
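
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is a toy model, not the Kafka Streams implementation: `ToyCollector`, `checkpointAfterInterruptedRestoration`, the partition name, and the offsets 100 and 40 are all hypothetical. The model also assumes that, on a clean close, offsets held by the collector take precedence over restored offsets in the checkpoint.

```java
import java.util.HashMap;
import java.util.Map;

public class CheckpointSketch {

    /** Hypothetical stand-in for the record collector: remembers the last written offset per changelog partition. */
    static class ToyCollector {
        private final Map<String, Long> offsets = new HashMap<>();
        private final boolean clearOffsetsOnClose;

        ToyCollector(final boolean clearOffsetsOnClose) {
            this.clearOffsetsOnClose = clearOffsetsOnClose;
        }

        void send(final String changelogPartition, final long offset) {
            offsets.put(changelogPartition, offset);
        }

        Map<String, Long> offsets() {
            return new HashMap<>(offsets);
        }

        void close() {
            // The fix in this model: forget written offsets whenever the collector is closed.
            if (clearOffsetsOnClose) {
                offsets.clear();
            }
        }
    }

    /** Replays the scenario and returns the offset that would end up in the checkpoint. */
    static long checkpointAfterInterruptedRestoration(final boolean clearOffsetsOnClose) {
        final String partition = "store-changelog-0"; // hypothetical name
        final ToyCollector collector = new ToyCollector(clearOffsetsOnClose);

        // Normal processing: the task has written up to offset 100 to the changelog.
        collector.send(partition, 100L);

        // Task corruption: the task is closed dirty, state is wiped, and the task is re-initialized.
        collector.close();

        // Restoration only reaches offset 40 before the task is closed cleanly.
        final Map<String, Long> checkpoint = new HashMap<>();
        checkpoint.put(partition, 40L);

        // Clean close (assumption of this model): collector offsets override restored offsets.
        checkpoint.putAll(collector.offsets());
        collector.close();

        return checkpoint.get(partition);
    }

    public static void main(final String[] args) {
        System.out.println("without clearing: " + checkpointAfterInterruptedRestoration(false));
        System.out.println("with clearing:    " + checkpointAfterInterruptedRestoration(true));
    }
}
```

In this model, the run without clearing checkpoints the stale written offset, so a later initialization would skip the records between 40 and 100. The run with clearing checkpoints the restored offset.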

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)

cadonna · 2023-12-18T21:17:49Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/RecordCollectorTest.java

@@ -1228,6 +1290,7 @@ public void shouldNotThrowStreamsExceptionOnSubsequentCallIfASendFailsWithContin

        try (final LogCaptureAppender logCaptureAppender =
                 LogCaptureAppender.createAndRegister(RecordCollectorImpl.class)) {
+            logCaptureAppender.setThreshold(Level.INFO);


I added this because otherwise the result of the test depends on the configured log level.
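
As a rough illustration of why a log-capturing test can depend on the configured level, here is an analogous sketch. It deliberately uses `java.util.logging` from the JDK rather than log4j or `LogCaptureAppender`, so it only mirrors the idea; the class name, logger name, and message are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LogCaptureSketch {

    /** Logs one INFO message and returns what a capturing handler saw at the given logger level. */
    static List<String> capture(final Level loggerLevel) {
        final Logger logger = Logger.getLogger("log-capture-sketch-" + loggerLevel);
        logger.setUseParentHandlers(false); // keep the sketch quiet on the console
        logger.setLevel(loggerLevel);       // the "configured log level"

        final List<String> messages = new ArrayList<>();
        final Handler handler = new Handler() {
            @Override
            public void publish(final LogRecord record) {
                messages.add(record.getMessage());
            }

            @Override
            public void flush() { }

            @Override
            public void close() { }
        };
        logger.addHandler(handler);

        // An assertion on this message only holds if INFO records reach the handler.
        logger.info("send failed, continuing");

        logger.removeHandler(handler);
        return messages;
    }

    public static void main(final String[] args) {
        System.out.println("level WARNING captured: " + capture(Level.WARNING));
        System.out.println("level INFO captured:    " + capture(Level.INFO));
    }
}
```

Pinning the level inside the test, as the change above does with `setThreshold(Level.INFO)`, removes that dependency on the environment.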

cadonna · 2023-12-18T21:18:16Z

checkstyle/import-control.xml

@@ -408,6 +408,7 @@
        <allow pkg="com.fasterxml.jackson" />
        <allow pkg="kafka.utils" />
        <allow pkg="org.apache.zookeeper" />
+        <allow pkg="org.apache.log4j" />


I needed to add this to enable https://github.com/apache/kafka/pull/15044/files#r1430656200

cadonna · 2023-12-18T21:18:49Z

streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java

+    }
+
+    private void close() {
+        offsets.clear();


This is the actual fix.

I like having common logic for close clean / close dirty. Should we also move removeAllProducedSensors here?

I am afraid I do not understand. The method removeAllProducedSensors() is called only in closeClean(), not in closeDirty().
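
For readers following this thread, the shape under discussion can be sketched as below. This is a hypothetical toy class, not `RecordCollectorImpl`: only the facts stated in the thread are modeled (a shared private `close()` that clears the offsets, and `removeAllProducedSensors()` being called from `closeClean()` only). The order of the two calls inside `closeClean()` and the `steps` list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedCloseSketch {

    static class ToyCollector {
        final Map<String, Long> offsets = new HashMap<>();
        final List<String> steps = new ArrayList<>(); // records which steps ran, for illustration only

        void closeClean() {
            removeAllProducedSensors(); // clean close only, as noted in the thread
            close();
        }

        void closeDirty() {
            close();
        }

        private void removeAllProducedSensors() {
            steps.add("removeAllProducedSensors");
        }

        /** Logic shared by both close paths. */
        private void close() {
            offsets.clear();
            steps.add("close");
        }
    }

    public static void main(final String[] args) {
        final ToyCollector collector = new ToyCollector();
        collector.offsets.put("store-changelog-0", 100L);
        collector.closeDirty();
        System.out.println(collector.offsets.isEmpty() + " " + collector.steps);
    }
}
```

Because the sensor removal happens on only one of the two paths, moving it into the shared helper would change the behavior of the dirty close.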

lucasbru

Thanks a lot for the important fix, @cadonna. I left some comments, but overall it looks good to me.

lucasbru · 2023-12-19T10:06:30Z

streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordCollectorImpl.java

+    }
+
+    private void close() {
+        offsets.clear();


I like having common logic for close clean / close dirty. Should we also move removeAllProducedSensors here?

lucasbru · 2023-12-19T10:13:43Z

streams/src/test/java/org/apache/kafka/streams/integration/EosIntegrationTest.java

+        final int endKey = 30001;
+        final int valueSize = 1000;
+        final StringBuilder value1 = new StringBuilder(valueSize);
+        for (int i = 0; i < valueSize; ++i) {


Is the size of the value essential to the test? In other words, "why?"

No, it is not essential. Actually it is just to put some load on restoration.

OK, I removed the string value and put an integer value in its place.

lucasbru

LGTM, thanks!

cadonna · 2023-12-20T11:08:45Z

streams/src/test/java/org/apache/kafka/streams/integration/utils/IntegrationTestUtils.java

@@ -1313,7 +1313,6 @@ private static <K, V> List<ConsumerRecord<K, V>> readRecords(final String topic,
                                                                 final int maxMessages) {
        final List<ConsumerRecord<K, V>> consumerRecords;
        consumer.subscribe(singletonList(topic));
-        System.out.println("Got assignment:" + consumer.assignment());


I discovered this and removed it because I guess it is a leftover from some other PR.

cadonna · 2023-12-20T11:11:44Z

streams/src/test/java/org/apache/kafka/streams/integration/EosIntegrationTest.java

            topic,
            numberOfRecords
        );
    }

+    private <K, V> void ensureCommittedRecordsInTopicPartition(final String topic,


I added this method to verify specifically that the partition under test contains committed records, because I saw flaky test failures where the changelog topic of that partition was empty. If the changelog topic is empty, the latch never counts down, the Streams client never closes, and the test runs into the test timeout.
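
The hang described above can be sketched in isolation. This is a hypothetical model, not the code in `EosIntegrationTest`: it assumes the latch is counted down once per restored record, and `awaitFirstRestoredRecord` and `ensureRecordsPresent` are illustrative names. The real `ensureCommittedRecordsInTopicPartition` reads from the topic, which this sketch does not attempt.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchPreconditionSketch {

    /** Returns true if at least one record was "restored" before the timeout elapsed. */
    static boolean awaitFirstRestoredRecord(final List<String> changelogRecords, final long timeoutMs) {
        final CountDownLatch latch = new CountDownLatch(1);
        for (final String ignored : changelogRecords) {
            latch.countDown(); // assumption: one count-down per restored record
        }
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Fails fast with a clear message instead of letting the test run into its timeout. */
    static void ensureRecordsPresent(final List<String> changelogRecords) {
        if (changelogRecords.isEmpty()) {
            throw new IllegalStateException("partition to verify contains no committed records");
        }
    }

    public static void main(final String[] args) {
        System.out.println(awaitFirstRestoredRecord(Arrays.asList("k1", "k2"), 50L));
        System.out.println(awaitFirstRestoredRecord(Collections.<String>emptyList(), 50L));
    }
}
```

Checking the precondition up front turns an opaque timeout into an immediate, descriptive failure.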

lucasbru · 2023-12-20T12:43:02Z

LGTM, thanks!

cadonna · 2023-12-20T13:30:38Z

checkstyle/suppressions.xml

@@ -230,7 +230,7 @@

    <!-- Streams tests -->
    <suppress checks="ClassFanOutComplexity"
-              files="(RecordCollectorTest|StreamsPartitionAssignorTest|StreamThreadTest|StreamTaskTest|TaskManagerTest|TopologyTestDriverTest|KafkaStreamsTest).java"/>
+              files="(RecordCollectorTest|StreamsPartitionAssignorTest|StreamThreadTest|StreamTaskTest|TaskManagerTest|TopologyTestDriverTest|KafkaStreamsTest|EosIntegrationTest).java"/>


I could probably have solved this checkstyle issue by moving the test to a separate file, but I think it makes sense to keep it in EosIntegrationTest to avoid starting an additional embedded Kafka.

…15044) Kafka Streams checkpoints the wrong offset when a task is closed during restoration. If under exactly-once processing guarantees a TaskCorruptedException happens, the affected task is closed dirty, its state content is wiped out and the task is re-initialized. If during the following restoration the task is closed cleanly, the task writes the offsets that it stores in its record collector to the checkpoint file. Those offsets are the offsets that the task wrote to the changelog topics. In other words, the task writes the end offsets of its changelog topics to the checkpoint file. Consequently, when the task is initialized again on the same Streams client, the checkpoint file is read and the task assumes it is fully restored although the records between the last offsets the task restored before closing clean and the end offset of the changelog topics are missing locally. The fix is to clear the offsets in the record collector on close. Reviewer: Lucas Brutschy <lbrutschy@confluent.io>

cadonna added the streams label Dec 18, 2023

cadonna requested review from lucasbru and mjsax December 18, 2023 21:13

cadonna commented Dec 18, 2023

cadonna added 2 commits December 18, 2023 22:31

Use more meaningful name

a3fb436

Delete System.outs and try improve robustness of integration test

6be0405

lucasbru reviewed Dec 19, 2023

Simplify record value and ensure exception is thrown

97e4227

lucasbru approved these changes Dec 19, 2023

cadonna added 2 commits December 19, 2023 13:15

Fix checkstyle error

e082655

Ensure partition to verify contains committed records

e69abfb

cadonna commented Dec 20, 2023

cadonna added 2 commits December 20, 2023 13:56

Abort the verification for committed records regarding some conditions

db151a5

Fix checkstyle issues

75ada37

cadonna commented Dec 20, 2023

cadonna merged commit 19727f8 into apache:trunk Dec 21, 2023
1 check failed
