enableRGWDashboard reuses RGW_API_ACCESS_KEY dict-string as the new store's dashboard-admin access_key, breaking multi-store dashboards #17553

@UmanShahzad

What happened

When deploying a second CephObjectStore alongside an existing one, the operator's enableRGWDashboard() flow creates the dashboard-admin user in the new store's realm, but with malformed credentials. The dashboard UI subsequently fails for the new store with 403 InvalidAccessKeyId.

How to reproduce

  1. Cluster with one CephObjectStore (storeA); dashboard configured normally; ceph dashboard get-rgw-api-access-key returns the access key as a Python-dict-style string like {'storeA': '<key>'}.
  2. Add a second CephObjectStore (storeB).
  3. Operator log (ceph-object-controller) shows three consecutive messages: "enabling rgw dashboard", "setting the dashboard api secret key", "done setting the dashboard api secret key".
  4. Inspect the new realm's dashboard-admin user:
    radosgw-admin user info --uid=dashboard-admin --rgw-realm=storeB --rgw-zonegroup=storeB --rgw-zone=storeB
    
    access_key is literally "{'storeA': '<existing-key>'}" (the dict-string, not a real S3 key). Same for secret_key.
  5. Dashboard UI: select storeB from the top-right dropdown → 403 InvalidAccessKeyId from RGW.

Root cause

In pkg/operator/ceph/object/objectstore.go, retrieveDashboardAPICredentials() runs ceph dashboard get-rgw-api-access-key and blindly assigns the entire stdout to user.AccessKey:

// retrieveDashboardAPICredentials
if string(out) != "" {
    accessKey := string(out)
    user.AccessKey = &accessKey
}

In multi-store setups the dashboard module stores per-store credentials as a Python-dict-style string (e.g. {'storeA': '<key>'}). retrieveDashboardAPICredentials does not parse this — it returns the whole dict-string. enableRGWDashboard() then passes it to CreateOrRecreateUserIfExists, which creates dashboard-admin in the new realm with that literal string as both the access and secret key. The subsequent ceph dashboard set-rgw-api-access-key -i <file> writes the same dict-string back, so the dashboard config remains the old dict and the new store is not represented at all.
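To illustrate the missing parsing step, here is a minimal sketch of how the command output could be split into per-store entries before anything is assigned to user.AccessKey. parseDashboardKeys and dictEntry are hypothetical names, not existing Rook code, and the regular expression assumes that store names and keys never contain single quotes or escape sequences.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dictEntry matches one 'name': 'value' pair in a Python-dict-style string.
// Assumption: neither names nor values contain single quotes or escapes.
var dictEntry = regexp.MustCompile(`'([^']*)'\s*:\s*'([^']*)'`)

// parseDashboardKeys is a hypothetical helper, not part of Rook.
// It returns the per-store keys and reports whether the input looked like
// a dict at all; a plain key (single-store setup) yields (nil, false).
func parseDashboardKeys(out string) (map[string]string, bool) {
	s := strings.TrimSpace(out)
	if !strings.HasPrefix(s, "{") || !strings.HasSuffix(s, "}") {
		return nil, false
	}
	keys := map[string]string{}
	for _, m := range dictEntry.FindAllStringSubmatch(s, -1) {
		keys[m[1]] = m[2]
	}
	return keys, true
}

func main() {
	// Example output in the shape described above; the key is a placeholder.
	keys, isDict := parseDashboardKeys("{'storeA': 'EXAMPLEKEYA'}\n")
	_, hasStoreB := keys["storeB"]
	fmt.Println(isDict, keys["storeA"], hasStoreB)
}
```

With a check like this, the caller could tell that storeB has no entry yet and generate fresh keys instead of reusing the raw string.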

Expected behavior

Either:

  • For each CephObjectStore, create the dashboard-admin user in its realm with freshly generated keys, then merge an entry for that store into the RGW_API_ACCESS_KEY / RGW_API_SECRET_KEY dicts.
  • Or, if the dashboard module already manages per-store credentials, don't try to set them from the operator at all for non-first stores.
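The first option could look roughly like the sketch below: add the new store's freshly generated key to the existing entries and render the result in the same Python-dict-style format. withStoreKey and formatDashboardKeys are hypothetical helpers, not existing Rook functions, and the sketch assumes keys contain no single quotes. The same logic would apply to RGW_API_SECRET_KEY.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// withStoreKey returns a copy of keys with an entry for store added or
// replaced. Hypothetical helper; the input map is left unmodified.
func withStoreKey(keys map[string]string, store, key string) map[string]string {
	merged := make(map[string]string, len(keys)+1)
	for k, v := range keys {
		merged[k] = v
	}
	merged[store] = key
	return merged
}

// formatDashboardKeys renders the map as a Python-dict-style string.
// Store names are sorted so the output is deterministic.
// Assumption: names and keys contain no single quotes.
func formatDashboardKeys(keys map[string]string) string {
	names := make([]string, 0, len(keys))
	for name := range keys {
		names = append(names, name)
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, fmt.Sprintf("'%s': '%s'", name, keys[name]))
	}
	return "{" + strings.Join(parts, ", ") + "}"
}

func main() {
	existing := map[string]string{"storeA": "KEYA"} // placeholder key
	merged := withStoreKey(existing, "storeB", "KEYB")
	fmt.Println(formatDashboardKeys(merged))
	// prints {'storeA': 'KEYA', 'storeB': 'KEYB'}
}
```

Copying the map rather than mutating it keeps the previously stored credentials intact if creating the new user fails partway through.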

Workaround

Manually for each additional store:

  1. radosgw-admin user create --uid=dashboard --display-name="Ceph Dashboard" --system --rgw-realm=<store> --rgw-zonegroup=<store> --rgw-zone=<store>
  2. Capture the generated access/secret keys.
  3. ceph config set mgr mgr/dashboard/RGW_API_ACCESS_KEY "{'storeA': '<keyA>', 'storeB': '<keyB>'}" (and the same for RGW_API_SECRET_KEY).
  4. ceph mgr fail <active> to reload.
  5. Delete the broken dashboard-admin user from the new realm to clean up.

Environment

  • Rook v1.18.8
  • Ceph v19.2.3 (Squid)
  • Two CephObjectStores using sharedPools.poolPlacements
