Skip to content

Conversation

@fbs
Copy link
Contributor

@fbs fbs commented Apr 17, 2025

The SIGINT(control-c) is sent to both the parent and child, as they're in the same group. The vdev process quits immediately, as its just waiting on the child. However the child will go through its shutdown process, which by default takes a minute if it cannot cleanly shutdown. During this minute your controlling terminal gets flooded with shutdown messages.
The way to stop it is to kill vector again, e.g. pkill vector but its annoying and could kill other vector instances.

This lets the vdev ignore the SIGINT and just rely on the parent to terminate after the child did.

Its currently only implemented on unix using the libc crate. Vector uses the nix crate, but that seemed a bit overkill for 2 lines.

Summary

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Tested in terminal.

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • The CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
      • ./scripts/check_changelog_fragments.sh
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run cargo vdev build licenses to regenerate the license inventory and commit the changes (if any). More details here.

References

The SIGINT(control-c) is sent to both the parent and child, as they're
in the same group. The `vdev` process quits immediately, as its just
waiting on the child. However the child will go through its shutdown
process, which by default takes a minute if it cannot cleanly shutdown.
During this minute your controlling terminal gets flooded with shutdown
messages.

The way to stop it is to `kill` vector again, e.g. `pkill vector` but
its annoying and could kill other vector instances.

This lets the vdev ignore the SIGINT and just rely on the parent to
terminate after the child did.

Its currently only implemented on unix using the libc crate. Vector uses
the nix crate, but that seemed a bit overkill for 2 lines.
@fbs fbs requested a review from a team as a code owner April 17, 2025 23:36
@github-actions github-actions bot added the domain: vdev Anything related to the vdev tooling label Apr 17, 2025
@pront
Copy link
Member

pront commented Apr 18, 2025

Can you share an example of how you are using vdev and the messages that flood your terminal?

@fbs
Copy link
Contributor Author

fbs commented Apr 19, 2025

Let me try to explain in a bit more detail (hope i got the details of processgroups and terminals right)

Starting point is an unreachable sink destination:

sources:
  demo:
    type: demo_logs
    format: json

sinks:
  vector:
    type: vector
    inputs:
    - demo
    address: http://127.0.0.1:4545

If you run this and want to quickly exit you need to send 2 SIGINTS. The first one enters the 'clean shutdown', the 2nd forces it.

$ vector -c broken.yaml
2025-04-19T08:32:34.295774Z ERROR vector::topology::builder: msg="Healthcheck failed." error=Request failed: status: Unavailable, message: "error trying to connect: ...
^C
025-04-19T08:32:35.371397Z  INFO vector::signal: Signal received. signal="SIGINT"
2025-04-19T08:32:35.371693Z  INFO vector: Vector has stopped.
2025-04-19T08:32:35.372855Z  INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="demo, vector" time_remaining="59 seconds left"
2025-04-19T08:32:35.406564Z  WARN sink{component_kind="sink" component_id=vector component_type=vector}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] is being suppressed to avoid flooding.
^C
2025-04-19T08:32:36.483711Z  INFO vector::signal: Signal received. signal="SIGINT"
2025-04-19T08:32:36.483843Z  INFO vector: Vector has quit.

When running with vdev run broken.yaml vdev will span the child target/debug/vector -c broken.yaml and will block on it until it terminates.

if you look at the process hierarchy you'll see:

USER    PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
bsmit 78347 74486 78347      0    1 S+   s000    0:00.01 vdev run broken.yaml
bsmit 78359 78347 78347      0    1 S+   s000    0:02.63 target/debug/vector --config broken.yaml

as they're in the same PGID/process group the SIGINT that the terminal sends on ^c is sent to both and both are asked to terminate. However vdev will terminate right away, as it has no SIGINT handler, while vector will enter its shutdown loop. With vdev gone vector will reparent, but is still connected to the terminal and running.

this can be seen with ps in the 3 stages:

# before the target is running, notice cargo is a child of vdev
 $ ps -j |grep -E 'PP|vector|vdev'
USER    PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
bsmit 79012 74486 79012      0    1 S+   s000    0:00.01 vdev run broken.yaml
bsmit 79024 79012 79012      0    1 S+   s000    0:00.91 cargo run --no-default-features --features sinks-vector,sources-demo_logs -- --config broken.yaml
bsmit 79165 79025 79164      0    2 R+   s001    0:00.00 ggrep -E PP|vector|vdev

# here, cargo `exec`d into `vector`, 
github/vectordotdev/vector vdevsignal[$✗?] $ ps -j |grep -E 'PP|vector|vdev'
USER    PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
bsmit 79012 74486 79012      0    1 S+   s000    0:00.01 vdev run broken.yaml
bsmit 79024 79012 79012      0    1 S+   s000    0:02.17 target/debug/vector --config broken.yaml
bsmit 79188 79025 79187      0    2 S+   s001    0:00.00 ggrep -E PP|vector|vdev

# SIGINT was sent, `vdev` is gone and `vector` has a new parent (PPID)
github/vectordotdev/vector vdevsignal[$✗?] $ ps -j |grep -E 'PP|vector|vdev'
USER    PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
bsmit 79024     1 79012      0    0 S    s000    0:02.25 target/debug/vector --config broken.yaml
bsmit 79313 79025 79312      0    2 S+   s001    0:00.00 ggrep -E PP|vector|vdev

the effect in the terminal:

Screenshot 2025-04-19 at 10 48 45

An alternative is to exec without fork, but then you loose the option to run code post execution, it also seemed like a bigger structural change.

@fbs
Copy link
Contributor Author

fbs commented Apr 19, 2025

actually typing this out made me realized I miss one case:

If you kill vdev by sending a signal instead of through the terminal, the signal isn't propagated from vdev to the child, so vdev will exit but vector will stay running.

(os error 61)", details: [], metadata: MetadataMap { headers: {} } internal_log_rate_limit=true
zsh: terminated  vdev run broken.yaml

❌143 github/vectordotdev/vector vdevsignal[$✗?] $
2025-04-19T10:07:00.417440Z  WARN sink{component_kind="sink" component_id=vector component_type=vector}:request{request_id=1}: vector::sinks::util::retries: Retrying after error. error=Request failed: status: Unavailable, message: "error trying to connect: tcp connect error: Connection refused (os error 61)", details: [], metadata: MetadataMap { headers: {} } internal_log_rate_limit=true

github/vectordotdev/vector vdevsignal[$✗?] $ echo test
test

github/vectordotdev/vector vdevsignal[$✗?] $ ps aux | grep vector
bsmit            81680   0.1  0.3 411589888  51168 s000  S    12:05PM   0:01.35 target/debug/vector --config broken.yaml

To fix that vdev needs a signal handler that propagates signals to the child. Not sure thats worth the effort as afaik vdev is meant for interactive use

@pront pront force-pushed the master branch 4 times, most recently from 1720078 to ffe54be Compare July 10, 2025 15:43
@pront pront added the meta: awaiting author Pull requests that are awaiting their author. label Jul 11, 2025
@fbs
Copy link
Contributor Author

fbs commented Jul 19, 2025

@pront I see you added a label to this. What can i do for this one? Imo its good to go in, it fixes the early termination I run into

@pront
Copy link
Member

pront commented Jul 21, 2025

Hi @fbs, I appreciate the effort to reduce the need for manual intervention with pkill vector. However, I have some concerns about the current approach that I think we should address before moving forward. This is a unix only approaching and also it introduces non-standard SIGINT handling. Could we do the std thing and let the parent forward the signal to the child? Depending on the implementation, maybe we can use nix to abstract the low level details away.

@pront
Copy link
Member

pront commented Jul 21, 2025

Also, Vector supports a shutdown timeout, there might be opportunity to reduce the shutdown message flood though.

@fbs
Copy link
Contributor Author

fbs commented Jul 21, 2025

. Could we do the std thing and let the parent forward the signal to the child?

What do you mean by forwarding it? Both processes get the sigint, its just that the parent terminates before the child and then ends up messing up the terminal. Instead of terminating right away vdev now ignores it and waits on the child

Depending on the implementation, maybe we can use nix to abstract the low level details away.

I'll have a look at nix. I see its already in cargo.lock as dependency

Also, Vector supports a shutdown timeout, there might be opportunity to reduce the shutdown message flood though.

Yes I do use that when I remember to put it in :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain: vdev Anything related to the vdev tooling meta: awaiting author Pull requests that are awaiting their author.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants