Before reporting an issue
Area
admin/api
Describe the bug
We use a single realm with one client where client roles represent fine-grained permissions tied to individual resources (e.g., per-site or per-entity access controls). As resources are onboarded, new client roles are created — we currently have ~2,400 client roles and this count grows continuously. Users are assigned ~20 direct roles, roughly a dozen of which are composite roles each bundling ~16-17 leaf roles, resulting in ~212 effective roles per user. Our services rely on the Admin API endpoint GET /admin/realms/{realm}/users/{user-id}/role-mappings/clients/{client-id}/composite to resolve a user's effective client role mappings.
The getCompositeClientRoleMappings() implementation iterates over all client roles (not just the user's assigned roles) and calls user.hasRole() on each one. hasRole() in turn delegates to KeycloakModelUtils.searchFor(), which recursively expands composite roles using a fresh HashSet on every invocation — there is no memoization across calls. This produces O(C × M × D) complexity, where C is the total number of client roles, M is the number of the user's direct role mappings, and D is the composite expansion depth. In our case, this means roughly 800,000 recursive role-containment checks per single API call.
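To make the cost concrete, here is a minimal toy model of the per-role scan described above. It is a sketch, not Keycloak code: roles are plain strings, and the class name, the `synthetic()` layout (2,400 leaves, 12 composites of 16 leaves each, 8 directly assigned leaves), and the `visits` counter are all assumptions chosen to land near the numbers in this report.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy cost model of the per-role scan. Roles are plain strings; this is not Keycloak code. */
final class NaiveRoleScanSketch {
    /** Composite role -> direct children. Leaf roles have no entry. */
    final Map<String, List<String>> children = new HashMap<>();
    final List<String> allClientRoles = new ArrayList<>();
    final List<String> directRoles = new ArrayList<>();
    /** Number of role nodes touched across all containment checks. */
    long visits = 0;

    /** Synthetic layout (assumed): 2,400 leaves, 12 composites of 16 leaves, 8 direct leaves. */
    static NaiveRoleScanSketch synthetic() {
        NaiveRoleScanSketch s = new NaiveRoleScanSketch();
        for (int i = 0; i < 2400; i++) {
            s.allClientRoles.add("leaf-" + i);
        }
        for (int c = 0; c < 12; c++) {
            List<String> bundle = new ArrayList<>();
            for (int j = 0; j < 16; j++) {
                bundle.add("leaf-" + (c * 16 + j));
            }
            s.children.put("comp-" + c, bundle);
            s.allClientRoles.add("comp-" + c);
            s.directRoles.add("comp-" + c);
        }
        for (int i = 2000; i < 2008; i++) {
            s.directRoles.add("leaf-" + i);
        }
        return s;
    }

    /** Depth-first containment check; the caller supplies the visited set. */
    boolean contains(String from, String target, Set<String> visited) {
        visits++;
        if (from.equals(target)) {
            return true;
        }
        if (!visited.add(from)) {
            return false;
        }
        for (String child : children.getOrDefault(from, List.of())) {
            if (contains(child, target, visited)) {
                return true;
            }
        }
        return false;
    }

    /** One full search of the user's mappings per client role, with a fresh visited set each time. */
    List<String> effectiveByScanningAll() {
        List<String> result = new ArrayList<>();
        for (String candidate : allClientRoles) {
            Set<String> visited = new HashSet<>(); // nothing is remembered between candidates
            for (String direct : directRoles) {
                if (contains(direct, candidate, visited)) {
                    result.add(candidate);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NaiveRoleScanSketch s = synthetic();
        List<String> effective = s.effectiveByScanningAll();
        System.out.println(effective.size() + " effective roles, " + s.visits + " role visits");
    }
}
```

In this model, every client role the user does not hold forces a walk over all 212 roles reachable from the user's mappings, so the 2,200 unheld roles alone account for 2,200 × 212 = 466,400 visits. That is the same order of magnitude as the figure quoted above, and it grows linearly with every client role added to the realm.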
Under concurrency, this causes severe latency degradation. A single request completes in ~1s, but under moderate load (10-60 parallel requests), response times spike to 7-23s due to CPU saturation and GC pressure from the large number of short-lived HashSet allocations. The database is not the bottleneck — we observe zero DB queries during these requests and a 99.96% Infinispan cache hit ratio. The problem is purely algorithmic: the work is proportional to the total number of client roles in the realm rather than the number of roles assigned to the user, and it will get progressively worse as more client roles are added.
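Related issues:
- […] slow /composite endpoint (11,000 roles, 15-22s). Closed after the PR "23404 improve client role listing" (#24012) improved role listing, but the user.hasRole() per-role loop in getCompositeClientRoleMappings() was never addressed.
- […] expandCompositeRoles() utility but never applied it to this endpoint.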
Version
26.5.5
Regression
Expected behavior
The endpoint should return effective client role mappings in time proportional to the user's role count, not the total number of client roles in the realm. RoleUtils.expandCompositeRoles() already exists in the codebase and performs BFS expansion in O(M × D) — it should be used here instead of iterating all client roles with user.hasRole().
Actual behavior
ClientRoleMappingsResource.getCompositeClientRoleMappings() (lines 130-145 in services/src/main/java/org/keycloak/services/resources/admin/ClientRoleMappingsResource.java) iterates over all client roles via client.getRolesStream() and filters them with user.hasRole(). Each hasRole() call triggers recursive composite expansion through KeycloakModelUtils.searchFor() with a new HashSet per invocation.
Current code:
```java
Stream<RoleModel> roles = client.getRolesStream(); // ALL client roles (2,400+)
return roles.filter(user::hasRole).map(toBriefRepresentation); // hasRole() per role
```
Proposed fix using existing RoleUtils.expandCompositeRoles():
```java
Set<RoleModel> directRoles = user.getRoleMappingsStream().collect(Collectors.toSet());
Set<RoleModel> effectiveRoles = RoleUtils.expandCompositeRoles(directRoles);
return effectiveRoles.stream()
        .filter(r -> r.isClientRole() && r.getContainerId().equals(client.getId()))
        .map(toBriefRepresentation);
```
This changes the algorithm from O(C × M × D) to O(M × D + C), eliminating ~800,000 recursive checks per request.
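The expand-once idea can be illustrated outside Keycloak with a small self-contained sketch. It is a stand-in under stated assumptions, not the implementation of RoleUtils.expandCompositeRoles(): roles are modeled as "container/name" strings, and `ExpandOnceSketch`, `sampleGraph()`, and `effectiveForClient()` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Breadth-first expansion of composite roles, modeled on plain strings. Not Keycloak code. */
final class ExpandOnceSketch {
    /** Hypothetical composite graph, including a cycle (app/write -> app/editor). */
    static Map<String, List<String>> sampleGraph() {
        return Map.of(
                "app/admin", List.of("app/editor", "app/delete", "other/audit"),
                "app/editor", List.of("app/read", "app/write"),
                "app/write", List.of("app/editor"));
    }

    /** Expands the direct roles once; each reachable role is enqueued at most once. */
    static Set<String> expand(Collection<String> directRoles, Map<String, List<String>> children) {
        Set<String> seen = new LinkedHashSet<>(directRoles);
        Deque<String> queue = new ArrayDeque<>(seen);
        while (!queue.isEmpty()) {
            String role = queue.poll();
            for (String child : children.getOrDefault(role, List.of())) {
                if (seen.add(child)) { // false for roles already reached, which also breaks cycles
                    queue.add(child);
                }
            }
        }
        return seen;
    }

    /** Keeps only the expanded roles that belong to the given container ("clientId/..."). */
    static Set<String> effectiveForClient(Collection<String> directRoles,
                                          Map<String, List<String>> children,
                                          String clientId) {
        return expand(directRoles, children).stream()
                .filter(role -> role.startsWith(clientId + "/"))
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        List<String> direct = List.of("app/admin", "realm/offline_access");
        System.out.println(effectiveForClient(direct, sampleGraph(), "app"));
    }
}
```

The work here depends only on the roles reachable from the user's direct mappings. Roles that exist on the client but are unrelated to the user are never touched, which is the property the proposed fix relies on.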
Validated results (patched image deployed to production-like environment, 6 replicas):
| Scenario | Before | After | Improvement |
| --- | --- | --- | --- |
| Sequential (warm cache) | 1.009s | 0.115s | 8.8x |
| 60 concurrent avg | 6.822s | 0.604s | 11x |
| 60 concurrent p95 | 8.794s | 0.663s | 13x |
| Sustained 60 rps / 30s avg | ~7-23s | 0.144s | 50-160x |
Responses are byte-for-byte identical between patched and unpatched versions (212 roles, same IDs and content).
How to Reproduce?
- Create a realm with a client containing ~2,400+ roles
- Create ~12 composite roles, each bundling ~16-17 leaf client roles
- Assign ~20 direct roles (mix of composite and leaf) to a user, resulting in ~212 effective roles
- Call GET /admin/realms/{realm}/users/{user-id}/role-mappings/clients/{client-id}/composite
- Observe ~1s latency for a single request
- Send 60 concurrent requests to the same endpoint
- Observe average latency of 6-7s, p95 of 8-9s
The latency scales with the total number of client roles, not the user's role count. Adding more client roles (even unrelated to the user) increases per-request latency.
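The concurrent step can be driven with a small probe like the one below. This is a rough sketch rather than the harness used for the measurements in this report: the class name, the command-line arguments (full endpoint URL and an admin bearer token), and the nearest-rank percentile are assumptions. It uses only the JDK's java.net.http client.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Fires N requests in parallel and reports average and p95 latency. Illustrative only. */
final class ConcurrentLatencyProbe {
    /** Runs {@code request} once per thread and returns latencies in ms, sorted ascending. */
    static long[] measure(int parallelism, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Callable<Long>> calls = new ArrayList<>();
            for (int i = 0; i < parallelism; i++) {
                calls.add(() -> {
                    long start = System.nanoTime();
                    request.run();
                    return (System.nanoTime() - start) / 1_000_000;
                });
            }
            long[] latencies = new long[parallelism];
            int i = 0;
            for (Future<Long> f : pool.invokeAll(calls)) { // blocks until every call has finished
                latencies[i++] = f.get();
            }
            Arrays.sort(latencies);
            return latencies;
        } catch (Exception e) {
            throw new IllegalStateException("probe failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Nearest-rank percentile over an ascending array. */
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    static double average(long[] values) {
        return Arrays.stream(values).average().orElse(0);
    }

    public static void main(String[] args) {
        String url = args[0];   // full URL of the .../composite endpoint (assumed argument)
        String token = args[1]; // admin access token (assumed argument)
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest req = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();
        long[] ms = measure(60, () -> {
            try {
                client.send(req, HttpResponse.BodyHandlers.discarding());
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        });
        System.out.printf("avg=%.0f ms, p95=%d ms%n", average(ms), percentile(ms, 95));
    }
}
```

Running the probe before and after adding unrelated client roles to the realm is one way to check the scaling claim: with the current implementation the reported latencies should rise even though the user's role assignments are unchanged.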
Anything else?
We have a patched image deployed and validated in our environment. We plan to submit a PR with the fix and integration tests. The fix uses RoleUtils.expandCompositeRoles() which already exists in the codebase (introduced via work on composite role optimization) but was never applied to this specific endpoint.
Note: this issue was investigated and the fix developed with assistance from Claude (Anthropic AI). The algorithmic analysis, patching, deployment, and load test validation were performed by a human engineer.