CEP XXXX: Build provenance metadata #113
I'm excited to see this being formalized!
cep-9999.md
Outdated
- `remote_url`: Required on CI. Repository URL of the feedstock being built.
- `flow_run_id`: Optional. CI-specific identifier for the workflow run.

For local workflows such as those specified by CFEP 03, `remote_url` MAY be omitted, but authors strongly recommend providing the adequate value manually if necessary.
Suggested change:
- For local workflows such as those specified by CFEP 03, `remote_url` MAY be omitted, but authors strongly recommend providing the adequate value manually if necessary.
+ For local workflows such as those specified by [CFEP-03](https://github.com/conda-forge/cfep/blob/main/cfep-03.md), `remote_url` MAY be omitted, but authors strongly recommend providing the adequate value manually if necessary.
Also, if `remote_url` is omitted, should `sha` also be omitted?
Hm, I had assumed users interested in provenance are already using some type of version control, but maybe we can't force that either.
About the dash in CFEP 03, see line 35 in 971cb23:
name: CEPs must be referred to with 'CEP N' (no dash)
Please keep `sha` as mandatory. Some packages may omit `remote_url` so as not to expose private repositories to the public, but the `sha` is still helpful for internal audits and attestations, e.g. to check that the `sha` actually exists, was reviewed as part of a PR, triggered a CI run, etc.
I am writing this to describe the current conventions, not to impose on how packages should be built. I think that should be decided by packaging organizations separately. I plan to submit a CFEP for conda-forge where we do "require these fields in CI pipelines, as recommended by CEP XYZ".
Someone using conda-build to share an artifact with their research lab internally may not need to care about whether the recipe is version controlled or what a git hash is.
cep-9999.md
Outdated
- `sha`: Required. Commit hash of the feedstock being built.
- `remote_url`: Required on CI. Repository URL of the feedstock being built.
- `flow_run_id`: Optional. CI-specific identifier for the workflow run.
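For readers following the thread, the field rules quoted in this hunk can be sketched as a small check. This is only an illustration, assuming the provenance metadata is available as a plain mapping; the function name `check_provenance` and the `on_ci` flag are hypothetical and are not part of the CEP draft or of any conda tool.

```python
from typing import List, Mapping, Optional


def check_provenance(meta: Mapping[str, Optional[str]], on_ci: bool) -> List[str]:
    """Return the problems found in a provenance mapping (empty list if none).

    Hypothetical helper: it mirrors the wording of the quoted draft only.
    """
    problems = []
    # `sha` is listed as "Required" in the quoted draft.
    if not meta.get("sha"):
        problems.append("sha is required")
    # `remote_url` is "Required on CI"; local workflows MAY omit it.
    if on_ci and not meta.get("remote_url"):
        problems.append("remote_url is required on CI")
    # `flow_run_id` is "Optional", so its absence is never reported.
    return problems
```

Under these assumptions, `check_provenance({"sha": "abc123"}, on_ci=False)` reports no problems, while the same mapping with `on_ci=True` reports the missing `remote_url`.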
The name `flow_run_id` originates from Anaconda's current build system. I wish that we could use something more generic and meaningful, but I guess that ship has sailed already? Or do you foresee a future where CF would be willing to change this to something else? (On our side at Anaconda, this is very easy to do.)
We can change it easily too, but then we will have to maintain two ways of accessing this metadata because the already stamped artifacts won't be rebuilt. I think `flow_run_id` is sufficiently generic. I always read it as "(work)flow run ID".
`flow_run_id` is used consistently for defaults and conda-forge, and the naming was accepted on both sides in the past. Have a look at this PR: you can see there that every CI system (Azure, Travis, GitHub, ...) names the variable containing the ID differently. `flow_run_id` was and is meant to be agnostic, and the value prefix tells which automation/CI/flow/workflow system was used.
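To make the "value prefix" idea concrete, here is a minimal sketch of reading the CI system back out of a `flow_run_id`. It assumes, purely for illustration, a `<prefix>_<id>` layout with an underscore separator; the thread does not state the exact layout, and both the helper name `split_flow_run_id` and the sample value are hypothetical.

```python
from typing import Tuple


def split_flow_run_id(flow_run_id: str) -> Tuple[str, str]:
    """Split a flow_run_id into (ci_system, run_identifier).

    Assumption for illustration only: the value looks like '<prefix>_<id>'.
    """
    # Split on the first underscore so the identifier may itself contain underscores.
    prefix, sep, rest = flow_run_id.partition("_")
    if not sep or not prefix or not rest:
        raise ValueError(f"no CI prefix found in {flow_run_id!r}")
    return prefix, rest
```

With that assumed layout, a made-up value such as `"azure_20240101.1"` would split into `("azure", "20240101.1")`, so consumers could branch on the prefix without knowing each CI system's own variable name.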
Is this an intermediate step towards generating SLSA provenance attestations? If so, should we just go straight in that direction, or would trying to implement SLSA delay the gains we want to get now?
thanks @jaimergp for the initiative here. some background:
@chenghlee yes, it was meant as an intermediate step to enable (multiple different) actual attestations via an attestation worker using the data to look things up:
Though if and how those attestations can be stored via e.g. Sigstore in the context of SLSA is a different matter. tl;dr: this enables attestations that would be hard otherwise without any provenance data.
I'm not personally aiming for that; I only wanted to standardise what otherwise was an undocumented convention. I think that SLSA provenance can be iterated on later, and this can just reflect the current state. This way we can refer to non-SLSA provenance as "CEP XYZ metadata".
Checklist for submitter
- `cep-0000.md` named `cep-XXXX.md` in the root level.

Checklist for CEP approvals
- `${greatest-number-in-main} + 1`.
- `cep-XXXX.md` file has been renamed accordingly.
- `# CEP XXXX -` header has been edited accordingly.
- `pre-commit` checks are passing.

`defaults` too? More details in "Burn in flow_run_id, remote_url, sha into package meta data" (conda-forge/conda-smithy#1577).