[FLINK-35464] Fixes operator state backwards compatibility from CDC 3.0.x #3369

Merged
merged 3 commits into from
Jun 6, 2024

Contributor

@yuxiqian yuxiqian commented May 28, 2024

This closes FLINK-35441 and FLINK-35464.

Flink CDC 3.1 changes how SchemaRegistry [de]serializes state data, so checkpoint state saved with earlier versions cannot be restored in version 3.1.0. This PR adds serialization versioning for state payloads and ensures that 3.0.x state can be restored successfully.

Unfortunately, 3.1.0 introduced breaking changes without bumping the serialization version, so that release is excluded from the state compatibility guarantee.
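
For readers unfamiliar with the idea, serialization versioning for a state payload can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the class `VersionedStateCodec`, the `MAGIC` marker, and the header layout are all hypothetical, and the real SchemaRegistry change may detect legacy state differently.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a versioned state envelope; not the actual Flink CDC classes. */
final class VersionedStateCodec {
    // Assumed marker that an old, unversioned payload is presumed never to start with.
    static final int MAGIC = 0xCDC057A7;
    static final int LEGACY_VERSION = 0;   // state written before versioning existed
    static final int CURRENT_VERSION = 1;  // state written by encode() below
    private static final int HEADER_BYTES = 2 * Integer.BYTES;

    /** Result of decoding: which format version was found, plus the raw payload. */
    static final class Decoded {
        final int version;
        final byte[] payload;

        Decoded(int version, byte[] payload) {
            this.version = version;
            this.payload = payload;
        }
    }

    /** Prefixes the payload with the marker and the current version number. */
    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(HEADER_BYTES + payload.length)
                .putInt(MAGIC)
                .putInt(CURRENT_VERSION)
                .put(payload)
                .array();
    }

    /** Reads the header if present; otherwise treats the bytes as legacy state. */
    static Decoded decode(byte[] state) {
        ByteBuffer buffer = ByteBuffer.wrap(state);
        if (state.length < HEADER_BYTES || buffer.getInt() != MAGIC) {
            // No recognizable header: hand the bytes back untouched as version 0.
            return new Decoded(LEGACY_VERSION, state);
        }
        int version = buffer.getInt();
        if (version > CURRENT_VERSION) {
            throw new IllegalStateException("Unsupported state version: " + version);
        }
        byte[] payload = new byte[buffer.remaining()];
        buffer.get(payload);
        return new Decoded(version, payload);
    }
}
```

A restore path would then branch on `Decoded.version` and pick the matching deserializer. State written by a release that changed the format without any header (as described for 3.1.0 above) cannot be told apart from older state by a scheme like this, which is consistent with that release being left out of the guarantee.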


Running some real savepoint restore tests with https://github.com/yuxiqian/migration-test reveals that:

Before this change: (Snapshots compiled from release-3.1 and master)

From \ To 3.0.0 3.0.1 3.1.0 3.1-SNAPSHOT 3.2-SNAPSHOT
3.0.0
3.0.1
3.1.0
3.1-SNAPSHOT
3.2-SNAPSHOT

✅ - Compatible, ❌ - Not compatible, ❓ - Target version doesn't support --from-savepoint

After this change: (Snapshots compiled from FLINK-35464-BP-3.1 and FLINK-35464)

From \ To 3.0.0 3.0.1 3.1.0 3.1-SNAPSHOT 3.2-SNAPSHOT
3.0.0
3.0.1
3.1.0
3.1-SNAPSHOT
3.2-SNAPSHOT

@yuxiqian
Contributor Author

@leonardBang @PatrickRen PTAL

Contributor

@PatrickRen PatrickRen left a comment

@yuxiqian Thanks for the PR! The patch and tests themselves look good to me, but I'm a bit concerned about introducing a new module for each version, even patch versions (like 3.0.1). As the project evolves there will be many more versions, so the number of modules could grow very quickly. I'm not sure whether this might burden the Maven build process.

@yuxiqian
Contributor Author

yuxiqian commented Jun 3, 2024

Thanks for @PatrickRen's comments! I agree that it's not very elegant to package all historical versions just for compatibility tests, but it seems to be inevitable since Java has nothing like C's macros or conditional compilation.

I've moved flink-cdc-migration-test out of flink-cdc-parent so it won't slow down other modules' compilation.

Contributor

@PatrickRen PatrickRen left a comment

@yuxiqian Thanks for the update! I left some comments.

flink-cdc-migration-tests/pom.xml (outdated, resolved)
pom.xml (outdated, resolved)
@yuxiqian
Contributor Author

yuxiqian commented Jun 4, 2024

Thanks for the tips! Addressed your comments to simplify pom files.

Contributor

@PatrickRen PatrickRen left a comment

@yuxiqian Thanks for the update! LGTM

@PatrickRen PatrickRen merged commit 414720d into apache:master Jun 6, 2024
15 checks passed
wuzhenhua01 pushed a commit to wuzhenhua01/flink-cdc-connectors that referenced this pull request Aug 4, 2024
2 participants