pipeline disconnect is now initialized by the beater #50721
khushijain21 wants to merge 14 commits
TL;DR
Most Buildkite failures in this run trace to the shutdown refactor.
Remediation
Investigation details
Root Cause
The PR changes pipeline shutdown contracts and call sites in ways that introduce two failure modes:
A related lifecycle change removes central publisher close hooks:
Evidence
Verification
Follow-up
If you want, I can post a minimal patch suggestion scoped to shutdown semantics only (pipeline + filebeat timeout default) to reduce risk and quickly confirm causality.
From workflow: PR Buildkite Detective
	// and shut down at this point, so all events that will be acknowledged
	// already have been. However for the "once" option supported by the
	// log input, events may still be active.
	if *once {
We don't need special handling for the "once" case because pipeline disconnection is now initialized by the beater instead of libbeat.
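To illustrate the ownership change described in this comment, here is a minimal sketch, not the code from this PR. The `pipeline` and `beater` types and the `runOnce` helper are hypothetical stand-ins; the sketch only assumes that the beater stops its inputs first and then disconnects the pipeline itself, so no separate branch is needed for a one-shot run.

```go
package main

import "fmt"

// pipeline is a hypothetical stand-in for the publisher pipeline.
type pipeline struct{ steps *[]string }

// Disconnect records that the pipeline was disconnected.
func (p *pipeline) Disconnect() {
	*p.steps = append(*p.steps, "pipeline disconnected")
}

// beater is a hypothetical beater that owns the pipeline lifecycle.
type beater struct {
	pipe  *pipeline
	steps *[]string
}

// Stop shuts down the inputs first, then disconnects the pipeline itself
// instead of relying on the framework to do it afterwards.
func (b *beater) Stop() {
	*b.steps = append(*b.steps, "inputs stopped")
	b.pipe.Disconnect()
}

// runOnce builds a beater, stops it, and returns the recorded shutdown order.
func runOnce() []string {
	var steps []string
	b := &beater{pipe: &pipeline{steps: &steps}, steps: &steps}
	b.Stop()
	return steps
}

func main() {
	for _, step := range runOnce() {
		fmt.Println(step)
	}
}
```

Because the same `Stop` path runs in every mode, the shutdown order is the same whether or not the process was started for a single pass.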
 			Bbolt: bboltst.DefaultConfig(),
 		},
-		ShutdownTimeout: 0,
+		ShutdownTimeout: 1 * time.Second,
1s was the default time the pipeline always waited before shutting down. See beats/libbeat/cmd/instance/beat.go, lines 386 to 393 in 712d64f.
TL;DR
Remediation
Investigation details
Root Cause
The run fails in step
Evidence
Validation
Follow-up
From workflow: PR Actions Detective
Proposed commit message
Checklist
stresstest.sh script to run them under stress conditions and race detector to verify their stability.
./changelog/fragments using the changelog tool.
Disruptive User Impact
How to test this PR locally
Related issues
Use cases
Screenshots
Logs