Skip to content

refactor(scheduler): decouple from orchestrator#6276

Merged
kaposke merged 7 commits into
masterfrom
gui/NAN-5413/refac-decouple-scheduler-from-orchestrator
May 29, 2026
Merged

refactor(scheduler): decouple from orchestrator#6276
kaposke merged 7 commits into
masterfrom
gui/NAN-5413/refac-decouple-scheduler-from-orchestrator

Conversation

@kaposke

@kaposke kaposke commented May 28, 2026

Copy link
Copy Markdown
Contributor

The scheduler package currently has a some orchestrator coupling:

  • Uses ORCHESTRATOR_* env vars
  • Emits ORCH_* metrics

This PR removes that coupling by:

  • Replacing the inline env vars with constructor configuration.
  • Creates an event hook for consumers to use for metrics.
  • Makes logger injectable

And adapts orchestrator to consume this with no behavior change.

Known design tradeoffs (per cubic review)

  • Loggerless construction is now silent. logger defaults to a noopLogger rather than getLogger('scheduler') so the package stays generic. The orchestrator passes a logger so prod is unchanged; anyone constructing a Scheduler elsewhere should pass logger: getLogger('scheduler') (or any StrictLogger) if they want logs.
  • Multiple Scheduler instances in one process share the module-level logger (last set wins). Acceptable today (one Scheduler per process); will revisit when the server-side use case actually needs distinct loggers per instance.

@linear

linear Bot commented May 28, 2026

Copy link
Copy Markdown

NAN-5413

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7 issues found across 20 files

Confidence score: 2/5

  • Several high-confidence issues are user-impacting (7–8/10), so merge risk is elevated until key guards/validation are added.
  • Most severe: packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts takes public olderThanDays into raw SQL intervals without validation, which can break cleanup queries or delete unintended rows.
  • Daemon resilience is at risk in packages/scheduler/lib/daemons/monitoring/backpressure-monitoring.daemon.ts and packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts: unguarded onEvent listener/telemetry failures can crash processing or roll back scheduling work.
  • Pay close attention to packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts, packages/scheduler/lib/daemons/monitoring/backpressure-monitoring.daemon.ts, packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts, and packages/orchestrator/lib/app.ts - input validation, hook isolation, and SSL fallback handling are the main stability risks.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/scheduler/lib/utils/logger.ts">

<violation number="1" location="packages/scheduler/lib/utils/logger.ts:5">
P2: Defaulting the package logger to `noopLogger` silently drops scheduler and migration logs for callers that do not inject a logger.</violation>
</file>

<file name="packages/scheduler/lib/scheduler.ts">

<violation number="1" location="packages/scheduler/lib/scheduler.ts:69">
P2: Logger injection is not instance-safe here because this overwrites the package-global logger for every Scheduler in the process.</violation>
</file>

<file name="packages/orchestrator/lib/app.ts">

<violation number="1" location="packages/orchestrator/lib/app.ts:37">
P1: Use the shared DB SSL flag when `databaseUrl` falls back to `NANGO_DATABASE_URL`/`NANGO_DB_*`; otherwise SSL-required deployments can stop connecting.</violation>
</file>

<file name="packages/orchestrator/lib/scheduler-config.ts">

<violation number="1" location="packages/orchestrator/lib/scheduler-config.ts:37">
P2: Avoid tagging this metric with the full `groupKey`; it creates one time series per queue and can explode monitoring cardinality.

(Based on your team's feedback about limiting high-cardinality monitoring dimensions to essentials.) [FEEDBACK_USED]</violation>
</file>

Tip: instead of fixing issues one by one fix them all with cubic

Re-trigger cubic

Comment thread packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts Outdated
Comment thread packages/orchestrator/lib/app.ts Outdated
Comment thread packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts
Comment thread packages/scheduler/lib/utils/logger.ts
Comment thread packages/scheduler/lib/scheduler.ts
Comment thread packages/orchestrator/lib/scheduler-config.ts

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/scheduler/lib/utils/logger.ts">

<violation number="1" location="packages/scheduler/lib/utils/logger.ts:5">
P2: Defaulting the package logger to `noopLogger` silently drops scheduler and migration logs for callers that do not inject a logger.</violation>
</file>

<file name="packages/scheduler/lib/scheduler.ts">

<violation number="1" location="packages/scheduler/lib/scheduler.ts:69">
P2: Logger injection is not instance-safe here because this overwrites the package-global logger for every Scheduler in the process.</violation>
</file>

<file name="packages/orchestrator/lib/app.ts">

<violation number="1" location="packages/orchestrator/lib/app.ts:37">
P1: Use the shared DB SSL flag when `databaseUrl` falls back to `NANGO_DATABASE_URL`/`NANGO_DB_*`; otherwise SSL-required deployments can stop connecting.</violation>
</file>

<file name="packages/orchestrator/lib/scheduler-config.ts">

<violation number="1" location="packages/orchestrator/lib/scheduler-config.ts:37">
P2: Avoid tagging this metric with the full `groupKey`; it creates one time series per queue and can explode monitoring cardinality.

(Based on your team's feedback about limiting high-cardinality monitoring dimensions to essentials.) [FEEDBACK_USED]</violation>
</file>

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

Comment thread packages/scheduler/lib/scheduler.integration.test.ts Outdated
@kaposke kaposke requested a review from a team May 28, 2026 17:53

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 issues found across 21 files

Confidence score: 2/5

  • Merge risk is elevated because packages/scheduler/lib/db/client.ts maps ssl: true to rejectUnauthorized: false, which can silently disable certificate verification when callers expect standard TLS validation.
  • There are concrete behavior/accuracy risks in scheduling metrics: packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts can repeatedly emit task_dropped for still-due schedules, and packages/scheduler/lib/scheduler.ts may emit the same event before a transaction commits.
  • Configuration and query-safety concerns add additional risk: packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts may spin in a near-tight loop with invalid tickIntervalMs, packages/scheduler/lib/models/tasks.ts interpolates batchSize into raw SQL, and packages/orchestrator/lib/scheduler-config.ts can create high-cardinality metric tags.
  • Pay close attention to packages/scheduler/lib/db/client.ts, packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts, packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts, packages/scheduler/lib/models/tasks.ts, packages/orchestrator/lib/scheduler-config.ts, packages/scheduler/lib/scheduler.ts - TLS verification semantics, metric/event correctness, config validation, and SQL parameterization need fixes before relying on this behavior.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/scheduler/lib/utils/logger.ts">

<violation number="1" location="packages/scheduler/lib/utils/logger.ts:5">
P2: Defaulting the package logger to `noopLogger` silently drops scheduler and migration logs for callers that do not inject a logger.</violation>
</file>

<file name="packages/scheduler/lib/scheduler.ts">

<violation number="1" location="packages/scheduler/lib/scheduler.ts:69">
P2: Logger injection is not instance-safe here because this overwrites the package-global logger for every Scheduler in the process.</violation>
</file>

<file name="packages/orchestrator/lib/app.ts">

<violation number="1" location="packages/orchestrator/lib/app.ts:37">
P1: Use the shared DB SSL flag when `databaseUrl` falls back to `NANGO_DATABASE_URL`/`NANGO_DB_*`; otherwise SSL-required deployments can stop connecting.</violation>
</file>

<file name="packages/orchestrator/lib/scheduler-config.ts">

<violation number="1" location="packages/orchestrator/lib/scheduler-config.ts:37">
P2: Avoid tagging this metric with the full `groupKey`; it creates one time series per queue and can explode monitoring cardinality.

(Based on your team's feedback about limiting high-cardinality monitoring dimensions to essentials.) [FEEDBACK_USED]</violation>

<violation number="2" location="packages/orchestrator/lib/scheduler-config.ts:37">
P2: Don't tag `ORCH_QUEUE_BACKPRESSURE` with the raw `groupKey`; it creates one time series per queue and will drive high-cardinality metrics costs/noise. Keep only the bounded `primitive` tag here and log the full key separately if you need it for debugging.

(Based on your team's feedback about limiting high-cardinality monitoring dimensions to essentials.) [FEEDBACK_USED]</violation>
</file>

<file name="packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts">

<violation number="1" location="packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts:97">
P2: This `task_dropped` event is emitted for schedules that stay due, so one capped schedule will be counted again on every scheduling tick until the backlog clears.</violation>
</file>

<file name="packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts">

<violation number="1" location="packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts:33">
P2: Validate `tickIntervalMs` here as well. Since it is now caller-supplied config, negative/NaN values turn the cleanup daemon into a near-tight loop.</violation>
</file>

Tip: instead of fixing issues one by one fix them all with cubic

Re-trigger cubic

Comment thread packages/scheduler/lib/db/client.ts Outdated
Comment thread packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts Outdated
Comment thread packages/orchestrator/lib/scheduler-config.ts
Comment thread packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts
Comment thread packages/scheduler/lib/models/tasks.ts Outdated
Comment thread packages/scheduler/lib/scheduler.ts Outdated
Base automatically changed from gui/NAN-5413/tests-scheduler to master May 28, 2026 18:12
@kaposke kaposke force-pushed the gui/NAN-5413/refac-decouple-scheduler-from-orchestrator branch from 747cd33 to d376a72 Compare May 28, 2026 19:18
@kaposke kaposke removed the request for review from a team May 28, 2026 19:18

@TBonnin TBonnin left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️
I would love to see the backpressure monitoring move outside of scheduler since it is a Nango functions executions / orchesrator problem, not a task scheduling problem. I am fine if you want to do it in next PR

Comment thread packages/orchestrator/lib/app.ts Outdated
cleaningTickIntervalMs: envs.ORCHESTRATOR_CLEANING_TICK_INTERVAL_MS,
monitoringTickIntervalMs: envs.ORCHESTRATOR_BACKPRESSURE_MONITORING_TICK_INTERVAL_MS,
cleaningOlderThanDays: envs.ORCHESTRATOR_CLEANING_OLDER_THAN_DAYS,
monitoringTopN: envs.ORCHESTRATOR_BACKPRESSURE_MONITORING_TOP_N

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move the entire backpressure monitoring out of scheduler and into orchestrator? Fine if you want to do it in a separate PR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like it makes sense, butI haven't looked too much at it's workings. Is it too specific to the orchestrator?

Comment thread packages/scheduler/lib/daemons/cleaning/cleaning.daemon.ts
Comment thread packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts Outdated
Comment thread packages/scheduler/lib/db/client.ts Outdated
Comment thread packages/scheduler/lib/daemons/scheduling/scheduling.daemon.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 4 files (changes from recent commits).

Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic

Comment thread packages/orchestrator/lib/app.ts
Comment thread packages/scheduler/lib/db/client.ts
@kaposke kaposke added this pull request to the merge queue May 29, 2026
Merged via the queue into master with commit 5b121bd May 29, 2026
28 checks passed
@kaposke kaposke deleted the gui/NAN-5413/refac-decouple-scheduler-from-orchestrator branch May 29, 2026 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants