Skip to content

Conversation

@franciscovalentecastro
Copy link
Contributor

@franciscovalentecastro franciscovalentecastro commented Jul 31, 2025

Description

Add support for transformation testing 3P app receivers that are registered with RegisterLoggingFilesProcessorMacro. This is required to test 3P app receivers that have MultineRules defined in a LoggingReceiverFilesMixin, for example zookeeper_general.

This is an improved version of #1968.

Some details :

  • Creates testLoggingFilesProcessorMacroAdapter which "adpats" the "Macros" for a transformation test environment.
  • Created logging_receiver-redis transformation test.

Related issue

b/429241517

How has this been tested?

Checklist:

  • Unit tests
    • [] Unit tests do not apply.
    • Unit tests have been added/modified and passed for this PR.
  • Integration tests
    • Integration tests do not apply.
    • Integration tests have been added/modified and passed for this PR.
  • Documentation
    • This PR introduces no user visible changes.
    • This PR introduces user visible changes and the corresponding documentation change has been made.
  • Minor version bump
    • This PR introduces no new features.
    • This PR introduces new features, and there is a separate PR to bump the minor version since the last release already.
    • This PR bumps the version.

@franciscovalentecastro franciscovalentecastro changed the title [Draft] Add support for testing LoggingReceiverFiles like receivers. [transformation_test] Support testing LoggingFilesProcessorMacro receivers. Aug 6, 2025
@franciscovalentecastro franciscovalentecastro marked this pull request as ready for review August 6, 2025 19:13
@franciscovalentecastro franciscovalentecastro force-pushed the fcovalente-add-receivers-transfom-tests branch from b15bdd1 to 1476598 Compare August 7, 2025 15:17
Copy link
Member

@quentinmit quentinmit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't these apps just use LoggingProcessorParseMultilineRegex instead of making all these changes to hack together support for manipulating file receivers?

@franciscovalentecastro
Copy link
Contributor Author

franciscovalentecastro commented Aug 7, 2025

Why MultilineRules were kept in INPUT ?

The main reason i find for why the MultilineRules were kept in the INPUT component was what it is described in this comment This is necessary, since using the multiline filter will not work if a multiline message spans between two chunks..
I found some sources that back up this issue :

if len(r.MultilineRules) > 0 {
// Configure multiline in the input component;
// This is necessary, since using the multiline filter will not work
// if a multiline message spans between two chunks.
rules := [][2]string{}
for _, rule := range r.MultilineRules {
rules = append(rules, [2]string{"rule", rule.AsString()})
}
parserName := fmt.Sprintf("multiline.%s", tag)
c = append(c, fluentbit.Component{
Kind: "MULTILINE_PARSER",
Config: map[string]string{
"name": parserName,
"type": "regex",
"flush_timeout": "5000",
},
OrderedConfig: rules,
})

Answer

Re @quentinmit :

Why can't these apps just use LoggingProcessorParseMultilineRegex instead of making all these changes to hack together support for manipulating file receivers?

I haven't considered so far (though it is indeed possible) to change / refactor the main 3P app processor logic at all. This implies work on "3p app receivers fluent-bit logic" rather than "refactoring 3P receivers for Otel Logging", so it would be additional effort.

Currently, I was considering it would be better to support 3P receivers for transformation testing which extends our testing possibilities, rather than considering working on Fluent-bit specific questions like "Does moving 'multiline.parsing' out of the input will still handle long-multiline logs correctly through multiple chunks and same performance ?".

I'm weighing as valuable to keep the same fluent-bit processor logic (same user feature expectations) and proposing this PR (we could improve) as a way to achieve that.

With this considerations, do you think its better to work on refactoring fluent-bit multiline parsing instead of the approach of this PR ?

@quentinmit
Copy link
Member

Why MultilineRules were kept in INPUT ?

The main reason i find for why the MultilineRules were kept in the INPUT component was what it is described in this comment This is necessary, since using the multiline filter will not work if a multiline message spans between two chunks.. I found some sources that back up this issue :

if len(r.MultilineRules) > 0 {
// Configure multiline in the input component;
// This is necessary, since using the multiline filter will not work
// if a multiline message spans between two chunks.
rules := [][2]string{}
for _, rule := range r.MultilineRules {
rules = append(rules, [2]string{"rule", rule.AsString()})
}
parserName := fmt.Sprintf("multiline.%s", tag)
c = append(c, fluentbit.Component{
Kind: "MULTILINE_PARSER",
Config: map[string]string{
"name": parserName,
"type": "regex",
"flush_timeout": "5000",
},
OrderedConfig: rules,
})

Yeah, I know about that bug. But we already have a pattern to resolve that:

type MetricsProcessorMerger interface {
// MergeMetricsProcessor attempts to merge p into the current receiver.
// It returns the new receiver; and true if the processor has been merged
// into the receiver completely
MergeMetricsProcessor(p MetricsProcessor) (MetricsReceiver, bool)
}

We can apply the same logic to logging receivers, which would allow the multiline parser to be configured as part of the file receiver to work around the fluent-bit bug.

Answer

Re @quentinmit :

Why can't these apps just use LoggingProcessorParseMultilineRegex instead of making all these changes to hack together support for manipulating file receivers?

I haven't considered so far (though it is indeed possible) to change / refactor the main 3P app processor logic at all. This implies work on "3p app receivers fluent-bit logic" rather than "refactoring 3P receivers for Otel Logging", so it would be additional effort.

Currently, I was considering it would be better to support 3P receivers for transformation testing which extends our testing possibilities, rather than considering working on Fluent-bit specific questions like "Does moving 'multiline.parsing' out of the input will still handle long-multiline logs correctly through multiple chunks and same performance ?".

I'm weighing as valuable to keep the same fluent-bit processor logic (same user feature expectations) and proposing this PR (we could improve) as a way to achieve that.

I don't believe my suggestion would change any of the user-visible processor logic.

With this considerations, do you think its better to work on refactoring fluent-bit multiline parsing instead of the approach of this PR ?

I think it's worth preserving our abstraction barriers. I don't actually think it would be much of a refactoring - you would keep MultilineRules as it is in LoggingReceiverFilesMixin, but you would add a MergeLoggingProcessor() method that takes a LoggingProcessorParseMultilineRegex and turns it into the MultilineRules field.

The transformation tests would use the processor, but the full config would configure the same parser directly on the input to avoid the chunk size bug.

Incidentally, I think this is kinda moot for now because all the receivers that use multiline are going to need additional changes to deal with the differences between otel and fluent-bit (I suspect by manually modifying all of the multiline regexes). So that's a further argument against adding abstraction-breaking complexity.

@franciscovalentecastro
Copy link
Contributor Author

franciscovalentecastro commented Aug 8, 2025

Re @quentinmit :

I think it's worth preserving our abstraction barriers. I don't actually think it would be much of a refactoring - you would keep MultilineRules as it is in LoggingReceiverFilesMixin, but you would add a MergeLoggingProcessor() method that takes a LoggingProcessorParseMultilineRegex and turns it into the MultilineRules field.

The transformation tests would use the processor, but the full config would configure the same parser directly on the input to avoid the chunk size bug.

Thank you for the suggestion of an alternative solution on how to test receivers with MultilineRules in LoggingReceiverFilesMixin.

I implemented the alternative solution in #2025 following the guidelines of MetricsProcessorMerger, though with a main difference is the layer of abstraction (ops agent pipeline vs within a receiver) at which this "merges" happen. More details on #2025.

I don't believe my suggestion would change any of the user-visible processor logic.

Yes, i can confirm this implemented solution didn't have any user-visible changes in the processor logic.

@franciscovalentecastro
Copy link
Contributor Author

Closing in favour of #2025.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants