High availability #624

@turt2live

With room version 12, it's possible for rooms to have multiple creators. We intend to operate our rooms such that our moderation bot holds creatorship everywhere, and for redundancy we intend to add one or more other bot-shaped accounts.

When the bot's server goes offline, we'd ideally be able to switch over to using another creator, limiting the amount of permissions fixing we'd need to do to spin up a second (temporary) bot. It might even be nice if the bot did this on its own, especially to ensure the code works without explicit disaster testing. Ideally, we'd also have a way to disable or change functionality of the bot when the server it runs on is offline. This is a problem for Mjolnir at the moment because it uses account data to track enabled protections, and those protections can cause bans when the server recovers if not disabled in time.

Options for this include:

  1. Finally, set the bot up with a real database where it can store "instance config" like enabled protections, protected rooms, etc. Sync tokens, encryption keys, etc. would be "client config/data", which would be stored either in the existing JSON files or in a dedicated table away from the instance data. The idea here is that a new Matrix client can be configured and the bot will pick up the protection config (and other instance config) from the database without needing to migrate account data.
  2. Possibly in combination with the above, the bot's config supports specifying multiple clients. The bot would maintain sync streams for all of them, and use the first account which isn't failing /sync as "primary" (the one which triggers protections, bans, and sends messages). If a client were to start failing for whatever reason, the bot would fail over to the next client automatically. There might need to be some tuning to avoid flapping between accounts when servers are trying to recover (or are just sad), but this might be solved by considering clients down for at least 30 minutes from the most recent error, regardless of actual conditions.
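
To make the failover idea in option 2 more concrete, here's a rough sketch of how the "primary" selection with a 30 minute cooldown might look. Everything here is hypothetical (`FailoverSelector`, `recordSyncError`, `primary`, the example user IDs); none of it is existing Mjolnir code, and it assumes the caller reports /sync failures for each configured client and passes in the current time.

```typescript
// Hypothetical sketch only: these names are illustrative and not part of Mjolnir.

// A client is treated as down for this long after its most recent /sync error.
const COOLDOWN_MS = 30 * 60 * 1000;

interface ClientHealth {
  userId: string;
  // Epoch milliseconds of the most recent /sync error, or null if none seen.
  lastErrorAt: number | null;
}

class FailoverSelector {
  private readonly clients: ClientHealth[];
  private readonly cooldownMs: number;

  // `userIds` are in priority order: the first healthy one becomes primary.
  constructor(userIds: string[], cooldownMs: number = COOLDOWN_MS) {
    this.clients = userIds.map((userId) => ({ userId, lastErrorAt: null }));
    this.cooldownMs = cooldownMs;
  }

  // Called whenever a client's /sync request fails.
  recordSyncError(userId: string, now: number): void {
    const client = this.clients.find((c) => c.userId === userId);
    if (client) {
      client.lastErrorAt = now;
    }
  }

  // A successful sync deliberately does NOT clear the error: the client stays
  // "down" until the cooldown has elapsed, which is what avoids flapping.
  private isDown(client: ClientHealth, now: number): boolean {
    return client.lastErrorAt !== null && now - client.lastErrorAt < this.cooldownMs;
  }

  // The first configured client that isn't considered down, or null if all are.
  primary(now: number): string | null {
    const healthy = this.clients.find((c) => !this.isDown(c, now));
    return healthy ? healthy.userId : null;
  }
}
```

Only the account returned by `primary(Date.now())` would run protections, issue bans, and send messages; the other clients would keep syncing but stay passive. When every client is down, `primary` returns `null`, which is roughly the point where layered protection outside the bot would have to take over.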

Relatedly, it feels valuable to duplicate protections in particular into standalone policy servers for layered protection when the bot simply can't fail over enough to a working account.

Metadata

Assignees

No one assigned

    Labels

    T-Thought: Documentation for an idea without commitment to a feature or task.
