logsaw

logsaw is my practical CLI for fast, local log analytics.

I built it for real operational debugging: when you need answers from logs in minutes, without standing up heavy infrastructure like ELK, Loki, or a data warehouse.

Why I built this

Most incidents start the same way: you have a large log file, a hypothesis, and no time.

I do not want to:

  • provision infrastructure for one investigation
  • ship sensitive logs to external systems
  • wait for indexing pipelines

I do want to:

  • point to a file (or stdin)
  • run one command
  • get immediate output for both humans and scripts

That is exactly what logsaw is designed for.

Who this is for

logsaw is aimed at:

  • backend engineers debugging production or staging incidents
  • SRE / DevOps engineers doing fast drill-downs on access and app logs
  • technical leads who prefer decisions based on measurable data

Core engineering principles

  • Streaming-first: process line by line, never load full files into memory.
  • Predictable output contract: table to stderr, JSON to stdout.
  • Conservative time semantics: invalid timestamps do not leak into time-based results.
  • Tool observability: JSON includes stats so you can see what was skipped and why.
  • Fail fast config: unknown --ts-format is an explicit error, never a silent fallback.
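
For example, the fail-fast rule means a typo in --ts-format aborts the run instead of silently falling back to auto. A sketch (the value rfc3339x is deliberately invalid; the exact error text may differ):

# misspelled format: logsaw exits with an explicit error instead of guessing
logsaw hist --format jsonl --by status --bucket 1m --ts-field ts --ts-format rfc3339x -i app.jsonl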

How it works internally

For each line, the pipeline is:

  1. Resolve input format (auto | jsonl | nginx | plain).
  2. Extract:
     • aggregation value (--by)
     • timestamp (when needed)
  3. Apply command logic:
     • top: frequency aggregation
     • hist: frequency by time bucket
     • grep: regex + context

For --last, the time window stays correct even when log lines arrive out of timestamp order.
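
A concrete case where this matters is concatenating rotated files, which often interleaves timestamps:

# rotated files interleave out of order; the 30-minute window is still correct
cat access.log.1 access.log | logsaw hist --format nginx --by status --bucket 1m --last 30m -i -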

Quick start

Build

git clone https://github.com/etherinus/logsaw
cd logsaw
cargo build --release
./target/release/logsaw --help

Typical flows

# top nginx statuses
logsaw top --format nginx --by status -i access.log

# error search with context
logsaw grep --re "panic|timeout|error" --context 2 -i app.log

# status histogram for the last 30 minutes
logsaw hist --format nginx --by status --bucket 1m --last 30m -i access.log

Commands

top

Returns top-N values for a field.

logsaw top --by <field> [--last <duration>] [--limit N] [--out table|json|both] [--format ...] [--fast-json] -i <file>

Examples:

logsaw top --format nginx --by status -i access.log
logsaw top --format nginx --nginx-preset combined --by request.method -i access.log
logsaw top --format jsonl --by user.id -i app.jsonl
logsaw top --format jsonl --by user_id --fast-json -i app.jsonl
logsaw top --format plain --by a.b -i app.log

hist

Returns frequencies by time bucket and field value.

logsaw hist --by <field> --bucket <duration> [--last <duration>] [--top N] [--out table|json|both] [--format ...] [--fast-json] -i <file>

Examples:

logsaw hist --format nginx --by status --bucket 1m -i access.log
logsaw hist --format nginx --by status --bucket 1m --last 30m -i access.log
logsaw hist --format jsonl --by status --bucket 1m --ts-field @timestamp --ts-format rfc3339 -i app.jsonl

grep

Regex search with before/after context.

logsaw grep --re <regex> [--context N | --before N --after N] [--json] -i <file>

Examples:

logsaw grep --re "timeout|panic" --context 2 -i app.log
logsaw grep --re "error" --before 10 --after 5 -i app.log
logsaw grep --re "panic" --context 2 --json -i app.log

Input formats

--format auto|jsonl|nginx|plain

  • auto (default): tries jsonl -> nginx -> plain
  • jsonl: JSON Lines
  • nginx: built-in parser or your template/preset
  • plain: key=value style logs + timestamp heuristics

Multiple inputs are supported:

logsaw top --format nginx --by status -i a.log -i b.log

Stdin is supported via -:

tail -f access.log | logsaw top --format nginx --by status -i -

Fields and dotted paths

jsonl

Nested paths are supported:

logsaw top --format jsonl --by user.id -i app.jsonl

plain

Supported:

  • dotted keys: a.b=c
  • JSON expansion: a={"b":1} with --by a.b
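
Both cases are quick to verify from stdin (a minimal sketch using the - input form shown above):

# dotted key
printf 'a.b=c level=info\n' | logsaw top --format plain --by a.b -i -

# embedded JSON expanded into the dotted path
printf 'a={"b":1} level=info\n' | logsaw top --format plain --by a.b -i -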

Working with timestamps

Critical for:

  • --last (sliding time window)
  • hist (bucketization)

--ts-field

Explicit timestamp field.

Examples:

logsaw hist --format jsonl --by status --bucket 1m --ts-field @timestamp --ts-format rfc3339 -i app.jsonl
logsaw hist --format plain --by status --bucket 1m --ts-field ts --ts-format rfc3339 -i app.log

--ts-format

Supported values:

  • auto
  • epoch
  • epoch_ms
  • rfc3339
  • ymd_hms
  • nginx_time_local
  • chrono:<fmt>

Note: unknown format values fail explicitly.
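
chrono:<fmt> covers everything else. Assuming <fmt> takes standard chrono strftime specifiers, a custom pattern would look like:

# timestamps such as "2024-05-01 12:30:00" via a custom chrono pattern
logsaw hist --format plain --by status --bucket 1m \
  --ts-field ts --ts-format 'chrono:%Y-%m-%d %H:%M:%S' -i app.log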

Nginx support

Presets

logsaw top --format nginx --nginx-preset combined --by status -i access.log
logsaw top --format nginx --nginx-preset common --by remote_addr -i access.log
logsaw top --format nginx --nginx-preset json-ish --by http_user_agent -i access.log

Custom template

logsaw top \
  --format nginx \
  --nginx-format '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"' \
  --by status \
  -i access.log

Derived request fields:

  • request.method
  • request.path
  • request.proto
  • request.raw
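
These derived fields work anywhere a --by field is accepted, for example:

# top requested paths
logsaw top --format nginx --nginx-preset combined --by request.path -i access.log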

Output contract

  • Table output always goes to stderr.
  • JSON output always goes to stdout.

This keeps terminal output readable for humans and stable for pipelines.
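
In practice this means the table can stay visible on the terminal while a tool consumes the JSON (jq assumed installed):

# table prints to stderr; jq reads clean JSON from stdout
logsaw top --format nginx --by status --out both -i access.log | jq '.items[0]'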

JSON example: top

{
  "command": "top",
  "by": "status",
  "format": "nginx",
  "fast_json": false,
  "ts_field": null,
  "ts_format": "auto",
  "nginx_preset": "combined",
  "nginx_format_used": "...",
  "last_ms": 900000,
  "total": 12345,
  "stats": {
    "lines_read": 13000,
    "parsed_lines": 12900,
    "used_lines": 12345,
    "skipped_unparsed": 100,
    "skipped_no_value": 400,
    "skipped_no_ts": 80,
    "skipped_out_of_window": 75
  },
  "items": [
    {"value": "200", "count": 10000},
    {"value": "500", "count": 234}
  ]
}

JSON example: hist

{
  "command": "hist",
  "by": "status",
  "format": "nginx",
  "fast_json": false,
  "ts_field": null,
  "ts_format": "auto",
  "nginx_preset": "combined",
  "nginx_format_used": "...",
  "bucket_ms": 60000,
  "last_ms": 1800000,
  "stats": {
    "lines_read": 13000,
    "parsed_lines": 12900,
    "used_lines": 7600,
    "skipped_unparsed": 100,
    "skipped_no_value": 3200,
    "skipped_no_ts": 900,
    "skipped_out_of_window": 1100
  },
  "items": [
    {"bucket_ms": 1730000000000, "value": "200", "count": 120},
    {"bucket_ms": 1730000000000, "value": "500", "count": 2}
  ]
}

JSON example: grep --json

{"source":"app.log","line":42,"kind":"match","text":"panic: ..."}

What stats tells you

I added stats because in incident work the result is not enough; sample quality matters.

  • lines_read: total lines read
  • parsed_lines: lines successfully parsed in the selected format
  • used_lines: lines included in the aggregation
  • skipped_unparsed: the parser could not parse the line
  • skipped_no_value: the --by field was missing
  • skipped_no_ts: the timestamp was missing or invalid for time-based logic
  • skipped_out_of_window: the line was excluded by --last
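
These counters make sample quality scriptable. For instance, a quick parse-rate check (jq assumed installed):

# how much of the file actually informed the result?
logsaw top --format nginx --by status -i access.log \
  | jq '.stats | {parse_rate: (.parsed_lines / .lines_read), used: .used_lines}'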

Performance notes

  • Streaming I/O, single pass over input.
  • HashMap-based aggregation; memory depends on key cardinality.
  • --last uses an ordered time window and remains correct for out-of-order logs.
  • --fast-json accelerates common JSONL top-level field cases.
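
The cardinality point is the one to watch: memory tracks distinct keys, not file size. As a rule of thumb (not a measured benchmark):

# a handful of distinct keys: small footprint
logsaw top --format nginx --by status -i access.log

# potentially one key per unique URL: footprint grows with cardinality
logsaw top --format nginx --by request.raw -i access.log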

Limitations

  • --fast-json supports top-level keys only.
  • hist and --last require parseable timestamps.
  • nginx template parsing expects reasonably structured log_format lines.

Practical recommendations

  • If you know the format, set --format explicitly.
  • For nginx, start with --nginx-preset combined, then move to custom --nginx-format.
  • If hist output looks off, check stats first, then validate --ts-field and --ts-format.
