Self Checks
Is your feature request related to a problem?
Describe the feature you'd like
Currently, the data transform node in the data pipeline is very rigid: its only allowed downstream operator is indexing/embedding. We need it to be orchestratable within the pipeline, for example to interleave summaries with the original chunks.
For that purpose, we need to add components for manipulating variables, such as data operations, a variable aggregator/assigner, and conversation variable creation.
Scenarios Example
According to https://arxiv.org/abs/2510.06999, attaching the document-level summary to each chunk (Summary-Augmented Chunking) helps resolve the Document-Level Retrieval Mismatch issue in legal retrieval.
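To make the scenario concrete, here is a rough sketch of what the pipeline would need to do between the transform step and indexing. It is not taken from the paper or from any existing node: `Chunk`, `split_into_chunks`, and `augment_with_summary` are hypothetical names, the splitter is a naive fixed-size placeholder, and the summary is assumed to come from an upstream LLM/summary node.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    index: int
    text: str


def split_into_chunks(doc_id: str, text: str, chunk_size: int = 200) -> list[Chunk]:
    """Naive fixed-size character splitter (stand-in for the real chunker)."""
    return [
        Chunk(doc_id, i, text[start:start + chunk_size])
        for i, start in enumerate(range(0, len(text), chunk_size))
    ]


def augment_with_summary(chunks: list[Chunk], summary: str) -> list[Chunk]:
    """Prepend the same document-level summary to every chunk before embedding."""
    return [
        Chunk(c.doc_id, c.index, f"{summary}\n\n{c.text}")
        for c in chunks
    ]


# Hypothetical usage: the summary would be produced by an upstream node.
document_text = "This lease agreement is made between the parties. " * 10
summary = "Summary: residential lease agreement."
chunks = split_into_chunks("doc-1", document_text)
augmented = augment_with_summary(chunks, summary)
# `augmented` is what would then be passed on to indexing/embedding.
```

The point is that the summary and the chunks come from different upstream steps and have to be combined by an intermediate variable-manipulation step, which the pipeline does not currently allow.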
Describe implementation you've considered
No response
Documentation, adoption, use case
Additional information
No response