Marta Paes

I’ve been working with data, big and small, for the last 10 years: first as a Data Engineer slogging through ETL pipelines and big migration projects, and then in product-shaped roles helping build Apache Flink (before it was cool) and Materialize. I’m currently a Product Manager at ClickHouse, focusing on our real-time data ingestion service.

Recent Presentations

AI Council

Data Lake CDC: are we there yet?

The idea of incremental reads from data lakes has been cooking for years, but few are serving it up. As a user, you must wrangle change feeds, snapshots, time travel, that one corrupted manifest file. Do you need to be a “Big Data Engineer” to get it right? In this lightning talk, we’ll explore what’s broken, what’s just hard, and why making data lake CDC accessible is a problem worth solving.

DataEngBytes 2026

From transactions to analytics: where do we go from here?

Current

Debezium vs. the world: an overview of the CDC ecosystem

Older presentations

Debezium vs. the world: an overview of the CDC ecosystem	Kafka Summit 2024	March 2024
Billion Dollar data streams: how real-time data accelerates revenue	Census tech demo	February 2024
CI/CD patterns for dbt projects	Current	September 2023
CI/CD workflows for dbt+Materialize	Materialize Tech Demo	March 2023
Streaming with dbt: the Jaffle Shop don’t stop!	Coalesce	October 2022
SQL, Streams, Action! Up & Running with Materialize	Devoxx UK	November 2021
An Introduction to Streaming SQL with Materialize	Big Data Conference Europe	September 2021
Materialize: streaming SQL for the rest of us	The Developer’s Conference (TDC)	August 2021
Select Star: Flink SQL for Pulsar Folks	Uber Apache Flink x Apache Pulsar Meetup 2021	March 2021
The Streaming Mindset	Bristech Meetup	January 2021
Building an end-to-end Analytics Pipeline with PyFlink	Netflix Apache Flink Meetup 2021	January 2021
Introduction to Stream Processing with Apache Flink	Open Source Summit Japan	December 2020
Building an End-to-End Analytics Pipeline with PyFlink	Data Science UA	November 2020
Building an End-to-End Analytics Pipeline with PyFlink	Flink Forward Global 2020	October 2020
Snakes on a Plane: Interactive Data Exploration with PyFlink and Zeppelin Notebooks	ApacheCon	September 2020
Change Data Capture with Flink SQL and Debezium	ApacheCon	September 2020
Snakes on a Plane: Interactive Data Exploration with PyFlink and Zeppelin Notebooks	ODSC Europe	September 2020
Change Data Capture with Flink SQL and Debezium	DataEngBytes	August 2020
Introduction to Stream Processing with Apache Flink	WeAreDevelopers Live	May 2020
7 Reasons to use Apache Flink for your IoT Project: How We Built a Real-time Asset Tracking System	ApacheCon Europe 2019	October 2019
