Description
As organizations adopt Octo STS, automation comes and goes, and it is easy to forget to clean up a trust policy. This is especially true when the automation is managed by IaC but removing the policy requires a follow-up pull request.
Currently we track (as is plainly visible in iac/) 90d of exchange events in BigQuery, which has helped us identify when a particular identity is exhausting an organization's shared quota.
With this data, we can see what policies (identity) have been active at what scope for each organization (installation_id):
SELECT o.installation_id, o.scope, o.identity, COUNT(DISTINCT o.actor.sub) AS actors, COUNT(*) AS c
FROM `octo-sts.cloudevents_octo_sts_recorder.dev_octo-sts_exchange` AS o
WHERE o.error IS NULL
GROUP BY o.installation_id, o.scope, o.identity
ORDER BY c DESC
... but we "don't know what we don't know". We cannot see beyond those 90d to know what policies were used and dropped off. Moreover, if a policy was created, but NEVER used, then the logs are an insufficient mechanism for identifying candidates for policy cleanup.
Proposal
My proposal is as follows...
Allow organizations to opt in to a monthly policy cleanup cron by creating a policy in the {org}/.github repository under .github/chainguard/octo-sts-monthly-cleanup.sts.yaml (happy to :bikeshed: here, the name isn't of consequence), with something like:
issuer: https://accounts.google.com
# The unique ID of the service account that octo-sts-monthly-cleanup runs as (auditable in the public actions logs for deploying octo-sts.dev)
subject: "1234567"
permissions:
  # Required to push a branch to the repository containing defunct policies
  contents: write
  # Required to turn the branch ☝️ into a pull request.
  pull_requests: write
# Clean up policies across the entire organization.
repositories: []
The rough flow for folks opted-in will look something like this:
- Monthly, octo-sts will enumerate all installations,
- For each installation, it will attempt to assume the opt-in identity -- 🚨 if folks don't have a VALID identity, it stops here
- Using the assumed identity, search for org:{org} path:/^.github\/chainguard\/.*.sts.yaml/ to identify all candidate repos and policies,
- Walk through the full repo and file list eliminating any policy used in the last 90d,
- For each repository with unused policies, open a pull request to remove all of the unused policies.
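The filtering step in the flow above could be sketched roughly as follows. This is an illustrative Python sketch, not the Octo STS implementation: the function names and the `(repo, path)` tuple shape are hypothetical, and it assumes that the `identity` recorded in the exchange logs equals the policy file name without the `.sts.yaml` suffix and that the logged `scope` is the `org/repo` string. Policies in `{org}/.github` that are exchanged with an organization-level scope may need extra handling that is not shown here.

```python
import re
from collections import defaultdict

# Assumed layout: policies live at .github/chainguard/{identity}.sts.yaml
POLICY_PATH = re.compile(r"^\.github/chainguard/(?P<identity>.+)\.sts\.yaml$")


def identity_from_path(path):
    """Return the identity encoded in a policy path, or None if it is not a policy."""
    match = POLICY_PATH.match(path)
    return match.group("identity") if match else None


def unused_policies(candidates, active):
    """Group policy paths with no recent successful exchange by repository.

    candidates: iterable of (repo, path) pairs, e.g. from the code search.
    active: set of (scope, identity) pairs seen in the last 90d of logs
            (assumption: scope is the "org/repo" string).
    """
    stale = defaultdict(list)
    for repo, path in candidates:
        identity = identity_from_path(path)
        if identity is None:
            continue  # not a trust policy file; ignore it
        if (repo, identity) not in active:
            stale[repo].append(path)
    return {repo: sorted(paths) for repo, paths in stale.items()}


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    candidates = [
        ("acme/api", ".github/chainguard/deploy.sts.yaml"),
        ("acme/api", ".github/chainguard/old-bot.sts.yaml"),
        ("acme/web", ".github/chainguard/release.sts.yaml"),
    ]
    active = {("acme/api", "deploy"), ("acme/web", "release")}
    print(unused_policies(candidates, active))
```

Because the result is keyed by repository, each entry maps directly onto one cleanup pull request. A policy that was created but never used has no row in the logs, so it falls out of the set difference as a cleanup candidate.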
We should have 0-1 of these pull requests per repository, so I believe that we should adopt a similar convention to dependabot and use a branch named octo-sts/{identity}, so for our strawperson policy name we would use the branch: octo-sts/octo-sts-monthly-cleanup.
Each month the cleanup will force push to this branch, and open a PR if there is not one. So if a month goes by and the PR isn't merged, it will update in-place.
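The "0-1 pull requests per repository" convention could be sketched as a small planning function. This is a hedged Python sketch with hypothetical names (`cleanup_branch`, `plan_actions`) and action labels; it only decides what to do and leaves the actual git push and GitHub API calls out.

```python
def cleanup_branch(identity):
    """Branch name for the cleanup PR, following the octo-sts/{identity} convention."""
    return f"octo-sts/{identity}"


def plan_actions(identity, stale_paths, open_pr_branches):
    """Return the ordered actions for one repository in this month's run.

    stale_paths: policy files selected for removal in this repository.
    open_pr_branches: head branch names of pull requests currently open.
    """
    if not stale_paths:
        # Nothing to remove. What to do with a leftover open PR is not
        # specified in the proposal, so this sketch does nothing.
        return []
    branch = cleanup_branch(identity)
    # Always force push so an unmerged PR from a previous month updates in place.
    actions = [("force_push", branch)]
    if branch not in open_pr_branches:
        actions.append(("open_pull_request", branch))
    return actions


if __name__ == "__main__":
    stale = [".github/chainguard/old-bot.sts.yaml"]
    print(plan_actions("octo-sts-monthly-cleanup", stale, set()))
    print(plan_actions("octo-sts-monthly-cleanup", stale,
                       {"octo-sts/octo-sts-monthly-cleanup"}))
```

Using one fixed branch per identity is what keeps the count at 0-1: a second run can only ever update the existing branch, never create a sibling.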
I have not yet prototyped any of this, so there may be some nuanced divergences as it's implemented (e.g. IDK whether contents: read is sufficient permission to search 🤷♂️), but I believe that this rough outline should allow us to achieve the goal, while being OPT IN, and minimizing the impact on the organization's app quotas.