
[Bug]: Lineage metrics in Dataflow streaming are being reported as cumulative rather than delta #34052

@rohitsinha54

Description


What happened?

Dataflow streaming metrics are delta metrics, unlike batch metrics, which are cumulative. This means that in every periodic update, Dataflow workers send only the delta of each metric since the last report.
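For illustration, here is a minimal sketch of the difference between the two reporting styles. This is not the actual Dataflow worker code; the class and method names (`DeltaCounterCell`, `extractDelta`, `extractCumulative`) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical metric cell showing delta vs. cumulative extraction.
public class DeltaCounterCell {
  private final AtomicLong value = new AtomicLong();        // running total
  private final AtomicLong lastReported = new AtomicLong(); // total at last report

  public void inc(long n) {
    value.addAndGet(n);
  }

  // Streaming-style report: send only what changed since the last report.
  public long extractDelta() {
    long current = value.get();
    return current - lastReported.getAndSet(current);
  }

  // Batch-style report: send the full running total every time.
  public long extractCumulative() {
    return value.get();
  }
}
```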

StringSet metrics (used for lineage tracking) are being reported as cumulative metrics in streaming, which causes the following issues:

  • Every periodic report (every 10 seconds) takes the full cumulative value and reports it again, so every report includes the metric. This is unlike batch job reporting, which filters to report only the metrics that have changed (tracked by a dirty bit); see the sketch after this list.
  • Because the metrics are never reset, they remain in worker memory forever, increasing memory usage.
  • In the backend, this leads to large memory consumption when tracking active work item counters.
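A sketch of the expected delta/dirty-bit behavior for a string-set metric, assuming a design similar to what the batch path does. This is not the actual Beam `StringSetCell`; the names here are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical string-set metric cell that reports only the delta since the
// last report and skips reporting entirely when nothing has changed.
public class DeltaStringSetCell {
  private final Set<String> all = new HashSet<>();         // everything ever added
  private final Set<String> unreported = new HashSet<>();  // added since last report
  private boolean dirty = false;

  public synchronized void add(String value) {
    if (all.add(value)) {   // only newly seen strings make the cell dirty
      unreported.add(value);
      dirty = true;
    }
  }

  // Periodic (e.g. every 10 s) streaming report: skip clean cells,
  // otherwise return just the delta and reset the dirty state.
  public synchronized Set<String> extractDeltaIfDirty() {
    if (!dirty) {
      return Set.of();      // nothing changed; nothing to report
    }
    Set<String> delta = new HashSet<>(unreported);
    unreported.clear();
    dirty = false;
    return delta;
  }
}
```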

Reporting them as cumulative also resets the counter's timestamp in the backend, because the counter is overwritten in every report. This is troublesome because when counters are polled in the backend to be dumped to the monitoring state store, this timestamp is used to determine whether the counter has changed, so the counters get dumped more often than they should be.
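A rough sketch of the backend-side effect described above, under the assumption that change detection is timestamp-based as stated; the types and names here are hypothetical and not the actual backend code.

```java
import java.time.Instant;

// Hypothetical backend counter: overwriting the value on every cumulative
// report refreshes the timestamp, so the counter looks "changed" on every
// polling cycle even when its contents are identical.
public class TrackedCounter {
  private Object value;
  private Instant lastUpdated = Instant.EPOCH;

  public void update(Object newValue) {
    this.value = newValue;
    this.lastUpdated = Instant.now();
  }

  // Polling loop: dump only counters that changed since the last poll.
  public boolean needsDump(Instant lastPoll) {
    return lastUpdated.isAfter(lastPoll);
  }
}
```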

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
