An extensible monitoring tool, sending alerts via Signal messenger, and responding to requests for information (status, metrics values, plots) or other commands from administrators.
signal-gateway receives alerts and log messages from various sources and forwards them to Signal
messenger via signal-cli. It supports:
- Alertmanager webhooks - Receive Prometheus alerts and forward them to Signal
- JSON log streams - Accept JSON logs over TCP/UDP
- Syslog (RFC 5424) - Accept syslog messages over TCP/UDP
signal-gateway also allows you to define filtering and rate limiting schemes to decide if and when an
error log should be escalated to an alert and forwarded, while avoiding alert fatigue.
It also retains a buffer of recent logs to send as context.
Beyond simple forwarding, it enables admins to query the system interactively.
- Prometheus querying - With access to the prometheus query API, you can query metrics and generate plots directly from signal.
- AI integration - Can't remember PromQL syntax or the names of your metrics? Connect it to claude, and ask claude to generate plots for you. Claude also sees the log messages, retains context on the system, and can help you troubleshoot.
Commands:
/log
/query
/plot
/series
/labels
/alerts
/claude-stop
/claude-compact
/claude-debug
/help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
Additionally, admins can send "commands" with semantics interpreted by your services, if support is configured.
-
Secured by Signal - Signal messages are a form of authenticated encryption, tied to your device. You can take the safety numbers from the app and put them in the
signal-gatewayconfig. Then, even if your phone number is simjacked, and the attacker bypasses registration lock somehow, they won't be able to send messages that are accepted by thesignal-gateway, without physical access to your device. -
Extensible
The project is designed as both a library and a binary. You can either use the configurable binary (signal-gateway-bin) that is offered as a default, or use the library signal-gateway
and customize it for your needs. This allows you to add custom handling for admin commands, expose additional tools to the AI integration, and so on.
It's actually a workspace with multiple libraries, so that you can mix and match what features you want without pulling in unnecessary stuff, or easily swap in alternative implementations of different parts.
- Free
Created to simplify devops for projects on a shoestring budget. This project will remain free and open-source.
- signal-cli running in JSON-RPC daemon mode
- You can configure it to listen on TCP, or on a unix domain socket
- A registered Signal account.
- It's best to use a new number that you aren't already using with signal, such as a google voice number.
- For security, you should enable registration lock on this number.
-
Start signal-cli in JSON-RPC mode:
signal-cli -a +15551234567 daemon --tcp 127.0.0.1:7583
-
Run signal-gateway:
signal-gateway \ --signal-cli-tcp-addr 127.0.0.1:7583 \ --signal-account +15551234567 \ --signal-admins '["your-uuid-here"]' -
Configure Alertmanager to send webhooks to
http://signal-gateway:8000/alert -
(Optional) Configure
syslogorjsonlistener, and configure your app(s) to send logs over UDP (or TCP) tosignal-gateway.Note:
signal-gatewayonly buffers your logs temporarily in memory, it doesn't provide long term storage. -
(Optional) Configure
signal-gatewayto have access to prometheus query API, e.g.http://prometheus:9090This allows
/plotcommand and friends to work in the signal chat. -
(Optional) Configure
signal-gatewayto use conversational AI (add a[claude]section toconfig.toml).This allows you to ask for new plots in plain language, ask for help in triaging alerts, making sense of logs, etc.
signal-gateway supports hierarchical config, and can read config values from CLI arguments, environment variables, or a TOML config file,
or combinations thereof. See --help for details.
Use --config-file path/to/config.toml to load from a file.
Example TOML configuration:
http_listen_addr = "0.0.0.0:8000"
signal_account = "+15551234567"
signal_cli_tcp_addr = "127.0.0.1:7583"
# Admin UUIDs mapped to their safety numbers (empty list means no verification)
# Find the safety numbers in the Signal app, in your conversation with `signal_account`.
[signal_admins]
"12345678-1234-1234-1234-123456789abc" = []
# Optional: send alerts to a group instead of individual admins
# alert_group_id = "base64-encoded-group-id"
# Optional: Prometheus for /query, /plot, /alerts commands
[prometheus]
prometheus_url = "http://172.31.10.138:9090"
[prometheus.plot]
timezone = "US/Mountain"
# Syslog listener (optional)
[syslog]
listen_addr = "0.0.0.0:1514"
# JSON log listener (optional)
[json]
listen_addr = "0.0.0.0:5000"Run signal-gateway --help for all available options.
NOTE: This example is incomplete, you should refer to signal-gateway-bin/src/main.rs for the Config object
for exhaustive documentation of the options.
For rust projects, I had success using tracing-rfc-5424
to send logs in syslog format over UDP to signal-gateway-bin. It worked pretty much out of the box even if the
log messages contain \n, because it expects 1 log message per UDP packet.
The log handler controls (not exhaustive):
- How many log messages are cached from each source (
log_handler.log_buffer_size) - How we format log messages to be sent in signal (
log_handler.log_format) - When a log message can lead to an alert (by configuring one or more "routes")
- Overall limits on alerting (applies to all routes)
See docs for LogHandlerConfig for more specifics.
High level:
- A
Routeconsists of analert_level, aLogFilter, and a series ofLimit's. If a message is at the alert level or higher, and it passes the filter, then we test eachLimitin the route.- A
LogFilteris a test against the fields of the log message. It is stateless. - A
Limitcontains its ownLogFilter, and a rate threshold.- A threshold of the form
> n / timeperforms "burst detection", and is useful for suppressing transient errors, so that they only lead to alerts if they happen in rapid succession. - A threshold of the form
< n / timeperforms "rate limiting", and is useful for suppressing spam. Only the first few events will pass the limit, and anything beyond that is suppressed. - A
Limitmay also apply in aby_source_locationfashion. This means that there is a separate rate limit counter for eachfile:linenopair. This allows you to be more surgical in what you choose to suppress.
- A threshold of the form
route.limitscontains limits that apply per source (app + hostname pair).route.global_limitscontains limits that apply to all sources.- A message passes a route if it passed the filter, and each limit and global limit.
- A
- In order to trigger an alert, a log message must pass at least one route, and then pass the
overall_limits, if any are configured. This is an additional sequence ofLimit's.- If it passes these then it leads to an alert -- a signal message being sent to admins or to the group, containing this log message and then all recent log messages in the log buffer from this source.
Example TOML section:
# Log handler configuration
[log_handler]
# Overall rate limit: max 1 alert per 10m from same source location
overall_limits = [
{ threshold = "< 2 / 10m", by_source_location = true }
]
# Log formatting
[log_handler.log_format]
format_module = true
format_source_location = true
# Single route matching errors, with burst detection for noisy patterns
# (only alert if pattern occurs 2+ times in 10m)
[[log_handler.route]]
alert_level = "error"
limits = [
{ threshold = ">= 2 / 10m", module_equals = "ws", msg_contains = "WebSocket protocol error: Connection reset without closing handshake" },
{ threshold = ">= 2 / 10m", module_equals = "ws", msg_contains = "did not respond to ping, closing stream" },
{ threshold = ">= 2 / 10m", module_equals = "ws", msg_contains = "IO error: peer closed connection without sending TLS close_notify" },
{ threshold = ">= 2 / 10m", module_equals = "main", msg_contains = "error sending request for url" },
]For a complete list of LogFilter keys, see docs for LogFilter.
signal-gateway-bin has a claude integration.
- Set a path to an anthropic api key.
- Configure one or more system prompts (1) explain its role (2) summarize what metrics are available
- Choose what model to use in conversation
- Configure how compaction works
- What prompt to use when compacting, how large of a response to allow
- What model to use for compacting
- How many characters in the conversation should trigger compaction.
[claude]
api_key_file = "creds/anthropic_api_key"
system_prompt_files = ["system_prompt.md", "metrics_prompt.md"]
claude_model = "claude-sonnet-4-5-20250929"
[claude.compaction]
prompt_file = "compaction_prompt.md"
model = "claude-sonnet-4-5-20250929"
max_tokens = 2048
trigger_chars = 10000This integration is still a work in progress -- it's useful as it is and can generate complicated plots on demand and help figure out what might be wrong in the system.
But:
- It's not as sophisticated as some agent frameworks like
rig. It's possible that we'll switch to something like that ifrigbecomes more mature. For the moment I decided to just make the simplest thing that would meet my immediate needs. - It isn't using very sophisticated compression techniques. Compressing information before generating a prompt can result in less tokens for a similar result. This would make it cost less to use it for a similar amount of log data. For systems that aren't very chatty it's pretty cost effective as is.
YMMV, contributions are welcome!
MIT or Apache 2 at your option.