
Request for proxy latency metrics #1201

@sergmour

Description


The proxy is supposed to be an interface between the applications and the backends. It is important that it reports objective latency metrics for request duration: the time between receiving a request and sending the response back to the application.

Ideally, the proxy should:

  • Report a latency metric per route handle (a rough sketch of what per-route-handle aggregation could look like follows this list)
  • (maybe) report a latency metric per command (this might be overkill; reporting per route handle might be good enough)
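
For illustration only, here is a small Python sketch of timing each request and aggregating durations per route handle. It is not tied to any existing proxy API; the bucket bounds, class, and function names are assumptions made up for this example.

```python
# Purely illustrative sketch: time each request and aggregate per route handle.
import time
from collections import defaultdict


class RouteLatencyRecorder:
    """Simple latency histograms keyed by route handle name."""

    # Microsecond bucket upper bounds; a real implementation would tune these.
    BUCKETS_US = (100, 250, 500, 1000, 2500, 5000, 10000, 50000)

    def __init__(self):
        self._histograms = defaultdict(lambda: [0] * (len(self.BUCKETS_US) + 1))

    def record(self, route_handle: str, duration_us: float) -> None:
        hist = self._histograms[route_handle]
        for i, bound in enumerate(self.BUCKETS_US):
            if duration_us <= bound:
                hist[i] += 1
                return
        hist[-1] += 1  # overflow bucket

    def snapshot(self):
        return {route: list(hist) for route, hist in self._histograms.items()}


recorder = RouteLatencyRecorder()


def handle_request(route_handle: str, forward_to_backend):
    """Measures the time from receiving a request to producing its response."""
    start = time.monotonic()
    response = forward_to_backend()
    recorder.record(route_handle, (time.monotonic() - start) * 1e6)
    return response


# Hypothetical usage: "main_pool" is an invented route handle name.
handle_request("main_pool", lambda: "VALUE foo 0 3\r\nbar\r\nEND\r\n")
```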

While end-to-end latency metrics can be collected on the application side, these metrics are often skewed by the round-trip duration and by the applications themselves (GC pauses, CPU contention, etc.). These application-side factors are in most cases larger than the cache request/response latencies. Having reliable and independent latency metrics/SLIs reported by the server-side proxy is helpful, especially for organizations where applications and services like Memcached are operated by separate teams.

In the case of Mcrouter, it reports just two metrics, duration_get_us and duration_update_us (disregarding the older duration_us). I believe these are weighted-average metrics. Both metrics, especially duration_get_us, help a lot in debugging whether the application or the cache service is responsible for latency spikes.
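
For reference, a weighted-average duration metric like duration_get_us could be maintained roughly as in the sketch below, assuming an exponentially weighted moving average. The exact weighting Mcrouter actually uses is not documented here, so the alpha parameter and class name are illustrative assumptions.

```python
# Hedged sketch: one possible way to keep a weighted-average duration metric.
class WeightedDuration:
    """Exponentially weighted moving average of request durations (microseconds)."""

    def __init__(self, alpha: float = 0.1):
        self.alpha = alpha        # weight given to the newest sample (assumed value)
        self.value_us = 0.0
        self._initialized = False

    def update(self, sample_us: float) -> None:
        if not self._initialized:
            self.value_us = sample_us
            self._initialized = True
        else:
            self.value_us = self.alpha * sample_us + (1 - self.alpha) * self.value_us


duration_get_us = WeightedDuration()
duration_update_us = WeightedDuration()

# e.g. after timing a GET that took 420 microseconds:
duration_get_us.update(420.0)
```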
