Twitter: @morsapaes
I’ve been working with data, big and small, for the last 10 years: first as a Data Engineer slogging through ETL pipelines and big migration projects, and then in Product-shaped roles helping build Apache Flink (before it was cool) and Materialize. I’m currently a Product Manager at ClickHouse, focusing on our real-time data ingestion service.
The idea of incremental reads from data lakes has been cooking for years, but few are serving it up. As a user, you must wrangle change feeds, snapshots, time travel, and that one corrupted manifest file. Do you need to be a “Big Data Engineer” to get it right? In this lightning talk, we’ll explore what’s broken, what’s just hard, and why making data lake CDC accessible is a problem worth solving.