Tags: multica-ai/multica
fix(lark): hide only the Lark (international) connect entry; keep Feishu (#3835)

Mainland Feishu binding works; only the newly-added Lark (international, open.larksuite.com) install path is unreliable: some Lark installs complete on Lark's side but never persist a lark_installation row (no WS, no inbound, no task).

Hide just the "Bind to Lark" CTA behind a single LARK_INTL_CONNECT_ENABLED flag and leave the "Bind to Feishu" entry, the settings panel, and all existing-installation management untouched. Flip LARK_INTL_CONNECT_ENABLED back to true to restore the Lark CTA; nothing else changes. Temporary measure while the Lark install-landing bug is investigated.

- LarkAgentBindButton: the Lark button is gated by the flag; the Feishu button and the Connected badge / Manage / Disconnect are unchanged.
- Tests: the CTA tests assert Feishu shown + Lark hidden; the Feishu click-to-begin (region=feishu) test stays; the Lark click test was removed (no button) and noted for restore; the dialog polling-error tests open via the Feishu CTA.

MUL-3083

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
docs: add June 4 changelog entry (#3762)

* docs: add June 4 changelog entry

Co-authored-by: multica-agent <github@multica.ai>

* docs: refine June 4 changelog copy

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
feat(metrics): BusinessSamplerCollector for active users / queued / runtime gauges (MUL-2947) (#3706)

* feat(metrics): scrape-time BusinessSamplerCollector for active users / queued / runtime gauges (MUL-2947)

Adds an opt-in prometheus.Collector that runs a fixed set of read-only SQL queries on every /metrics scrape and exposes the results as gauges:

- multica_active_users{window=5m|1h|24h}
- multica_active_workspaces{window=...}
- multica_agent_task_queued{source}
- multica_agent_task_running{source,runtime_mode}
- multica_agent_task_stuck_total{source}
- multica_runtime_online{runtime_mode,provider}
- multica_runtime_heartbeat_age_seconds{runtime_mode} (histogram)
- multica_workspace_total

Plus a self-introspection histogram multica_business_sampler_query_seconds{name=...} and a counter multica_business_sampler_query_errors_total{name=...} so the sampler's own behaviour is observable on /metrics.

Production-safety contract per the PR4 brief:

- every query runs in its own BEGIN READ ONLY tx with SET LOCAL statement_timeout = '500ms' (configurable)
- the sampler takes a dedicated *pgxpool.Pool option so operators can isolate it from business traffic
- successful results are cached for 5–10s (default 8s) to absorb concurrent scrapes from multiple Prometheus replicas
- every SQL has a hard LIMIT 100 fallback
- all label values flow through the existing BusinessMetrics NormalizeTaskSource / NormalizeRuntimeMode / NormalizeRuntimeProvider whitelists, so a misbehaving runtime cannot inflate cardinality
- sampler is OPT-IN via RegistryOptions.BusinessSampler: existing callers that only pass Pool keep their current behaviour and never start hitting the DB on /metrics

Tests cover: emit shape, TTL cache (one DB call per N scrapes), bounded cardinality under malicious labels, opt-out (no leakage), and DB-hang isolation (unreachable host -> /metrics returns within 5s, query_errors_total advances).

Refs MUL-2947 (depends on PR2 / MUL-2948, merged in #3695).

Co-authored-by: multica-agent <github@multica.ai>

* fix(metrics): address PR4 review: wire sampler in main.go, fix LIMIT bug, add live-DB statement_timeout test

Three fixes from 大彪's review on #3706:

1. main.go was building NewRegistry without the BusinessSampler option, so the collector was effectively dead code in prod. Now constructs a dedicated 2-conn pgxpool (newSamplerDBPool) from the same DATABASE_URL when METRICS_ADDR is set, plumbs it into RegistryOptions.BusinessSampler, and defers Close() at shutdown. A pool-build failure logs and disables the sampler instead of taking down the server.

2. queryActiveUsers / queryActiveWorkspaces previously wrapped the distinct-user/workspace subquery in a 'LIMIT 100', then COUNT(*)'d the result, capping the active-user gauge at 100 regardless of reality. Removed the inner LIMIT; the COUNT scalar is one row anyway, and metric cardinality is bounded by the fixed samplerWindows allow-list, not by the SQL shape.

3. The previous DB-hang test only exercised the acquire-fails path. Added business_sampler_pgsleep_test.go which connects to a live Postgres (skips cleanly when DATABASE_URL is not set), runs SELECT pg_sleep(2) inside a sampler-style tx with SET LOCAL statement_timeout = '500ms', and asserts:
   - the call returns in well under 1.5 s (proving the server-side cancellation, not just our caller-side context)
   - query_errors_total{name=pg_sleep_canary} advances
   - the duration histogram records the cancellation

Verified locally: 550 ms, SQLSTATE 57014 'canceling statement due to statement timeout', exactly the safety net the PR claims.

Refs MUL-2947 / PR #3706.

Co-authored-by: multica-agent <github@multica.ai>

* test(metrics): assert SQLSTATE 57014 on pg_sleep cancellation

The previous assertion only checked that the query was cut off in well under the sleep duration, which a caller-side context cancellation would also satisfy. Capturing the inner pgconn.PgError and asserting Code == "57014" ("query_canceled") nails down that Postgres itself cancelled the statement because of the SET LOCAL statement_timeout, so a regression that drops the SET LOCAL line fails this test loudly instead of silently passing on context cancellation.

Refs MUL-2947 / PR #3706 review nit.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
docs: add Skill search changelog (#3609)

* docs: add 2026-06-01 changelog

Co-authored-by: multica-agent <github@multica.ai>

* docs: refine Skill Command changelog copy

Co-authored-by: multica-agent <github@multica.ai>

* docs: correct Skill search changelog wording

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
fix(comments): revert since-delta to issue-wide, steer to parent thread first (#3535)

#3509/#3523 scoped the comment-trigger since-delta count to the triggering thread, so an agent resuming a busy issue only saw "+N in this thread" and lost visibility of new comments in other threads.

Revert the count to issue-wide (every thread), keeping the trigger-comment + agent-own exclusions, and reshape the warm-path hint to:

- report the issue-wide new-comment volume,
- steer the agent to read the triggering (parent) thread FIRST (`--thread <trigger> --since`, or `--tail 30` for full context),
- demote the issue-wide `--since` catch-up to an only-if-needed fallback ("don't read them all blindly").

Also fixes the now-stale "scoped to the triggering thread" wording in the resumed-session no-delta hint (it's issue-wide zero now).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
docs(changelog): add v0.3.11 release notes (#3449)

* docs(changelog): add v0.3.11 release notes

Co-authored-by: multica-agent <github@multica.ai>

* docs(changelog): refine v0.3.11 release notes

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>