@beckermr
Contributor

@beckermr beckermr commented Mar 11, 2025

This draft CEP has an updated specification for storing conda packages as OCI artifacts. It is an updated form of the specification in PR #70, given the feedback on the previous PR.

Rendered CEP

@beckermr beckermr changed the title from "[CEP XYZ] Storing conda Packages as OCI Artifacts" to "[CEP XYZZ] OCI Storage of Conda Artifacts" Mar 11, 2025
Co-authored-by: jaimergp <jaimergp@users.noreply.github.com>
Contributor

@jaimergp jaimergp left a comment

A few cosmetic comments, but unless I've misunderstood, there's a problem with the _ OCI-encoding.

@beckermr beckermr requested a review from jaimergp March 11, 2025 14:56
@beckermr
Contributor Author

Good catch @jaimergp! I reran my notebooks with _ going to _U and everything works great.

Interestingly enough, we apparently have no build strings with a double underscore in either defaults or conda-forge! All OCI-encoded conda artifact names I produced using _ -> __ from both of those channels passed the OCI regexes from the Distribution Spec. Fun!
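
For readers following along, here is a minimal sketch of why a `_` -> `__` mapping can pass the checks only as long as no input already contains a double underscore. It tests strings against the repository `<name>` regex quoted in this PR's diff. The helper names and the sample build strings are made up for illustration, and this is not the CEP's actual encoder. It also assumes the encoded string is checked against the `<name>` regex; which regex applies to which part of an artifact name is defined by the CEP, not by this sketch.

```python
import re

# Repository <name> regex as quoted in the cep-XXXX.md diff in this PR.
OCI_NAME = re.compile(
    r"^[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*"
    r"(\/[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*)*$"
)


def encode_double_underscore(text: str) -> str:
    """Hypothetical encoder: map every '_' to '__'."""
    return text.replace("_", "__")


def is_valid_oci_name(text: str) -> bool:
    """True if the whole string matches the repository <name> regex."""
    return OCI_NAME.fullmatch(text) is not None


# Made-up build strings: one with a single underscore, one with a double.
for build in ["py312h_0", "py312h__0"]:
    encoded = encode_double_underscore(build)
    print(build, "->", encoded, is_valid_oci_name(encoded))
```

The regex accepts separator runs of one or two underscores only, so a source string that already has `__` becomes `____` and is rejected.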

@beckermr beckermr changed the title from "[CEP XYZZ] OCI Storage of Conda Artifacts" to "[CEP 21] OCI Storage of Conda Artifacts" Mar 11, 2025
@beckermr beckermr requested a review from jaimergp March 11, 2025 18:54
@beckermr
Contributor Author

Thanks for the additional comments @jaimergp! Anything else you can see?

@jaimergp
Contributor

Nothing too big, just a couple of observations:

  • The copyright statement needs to be put back in.
  • A References section that compiles the different URLs mentioned would be welcome.
  • I would wait until this CEP is approved to assign a number. There are a few ongoing PRs that might get voted on before this one, and then it would be confusing. For example, there's a PR named "CEP 17", but CEP 17 ended up being this one. This might reflect a problem in how we mint CEP numbers. Happy to discuss further!

@beckermr
Contributor Author

pre-commit.ci autofix

cep-XXXX.md Outdated

- `<OCI-compatible channel path>`: `^[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*(\/[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*)*$` (same as the regex for an OCI repository `<name>`)
- `<OCI-compatible label>`: `^[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*(\/[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*)*$` (same as the regex for an OCI repository `<name>`)
- `<OCI-compatible subdir>`: `^[a-z0-9]+(-[a-z0-9]+)*$`
Contributor

This is the same as the one proposed in #116, right? We should update this part to refer to that one once accepted.
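
As a quick way to sanity-check candidate values against the three patterns quoted in the diff above, a small sketch follows. The regexes are copied from the quoted lines; the `PATTERNS` table, the `is_oci_compatible` helper, and the sample values are illustrative assumptions and not part of the CEP.

```python
import re

# Same regex for channel path and label, as stated in the quoted diff lines.
NAME_RE = (
    r"^[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*"
    r"(\/[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*)*$"
)

PATTERNS = {
    "channel_path": re.compile(NAME_RE),
    "label": re.compile(NAME_RE),
    "subdir": re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$"),
}


def is_oci_compatible(kind: str, value: str) -> bool:
    """Check one value against the pattern for its kind (hypothetical helper)."""
    return PATTERNS[kind].fullmatch(value) is not None


# Sample values chosen for illustration only.
checks = [
    ("channel_path", "conda-forge"),
    ("channel_path", "Conda-Forge"),  # uppercase is not allowed
    ("subdir", "linux-64"),
    ("subdir", "linux_64"),           # subdir allows '-' separators only
]
for kind, value in checks:
    print(kind, value, is_oci_compatible(kind, value))
```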

beckermr and others added 4 commits March 18, 2025 20:19
Co-authored-by: jaimergp <jaimergp@users.noreply.github.com>
Co-authored-by: jaimergp <jaimergp@users.noreply.github.com>
Co-authored-by: jaimergp <jaimergp@users.noreply.github.com>
@beckermr
Contributor Author

I was thinking about this more, and I think we should not use the m prefix and should instead disallow current repodata on OCI channels. My reasons are:

  • Older conda clients that had performance improvements from current repodata won't be able to read OCI channels directly anyways. They can fall back to repodata and the client should be upgraded anyways.
  • Older conda clients that access an OCI channel via a web proxy could request current_repodata.json and the web proxy could translate to repodata_current.json for use with an OCI channel if we wanted.
  • The CEP in #116 ([CEP 26] Identifying Packages and Channels in the conda Ecosystem) specifies that the URL of a conda channel needs to have <channel base URL>/noarch/repodata.json as a valid address. As long as we stick to the tag "latest" being the most recent image, we can meet this spec.
  • We will likely save ourselves some pain by only having to deal with prefixing package names with c as opposed to having to prefix everything.

Thoughts @jaimergp?

@jaimergp
Contributor

All of that is currently valid and sound. I'm just worried that we are being lucky now with no conflicts in the filename prefixes (I do second dropping current_repodata.json, we don't need it these days, and we can consider it an Anaconda.org implementation detail if there's such a need).

I just don't see the pain in prefixing metadata files with m. If we don't, we might run into a situation where we want to add a new type of file, and the only way out would be to add OCI-specific sub-subdirs like conda-forge/linux-64/packages/ and conda-forge/linux-64/metadata/, and that seems more painful.

But if you think the burden of prefixing m to everything is not worth it, I won't block it. I'd be happy to hear what others think too.
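
To make the trade-off concrete, here is a small sketch of one reading of the collision concern. It assumes a prefix is prepended directly to the name with no separator, which this thread does not confirm. The `repo_component` helper and the package name `hanneldata.json` are contrived for illustration only.

```python
def repo_component(name: str, prefix: str = "") -> str:
    """Hypothetical helper: build the last path component of a repository name."""
    return f"{prefix}{name}"


# Option A: only packages get the "c" prefix; metadata files are stored as-is.
package_a = repo_component("hanneldata.json", prefix="c")  # contrived package name
metadata_a = repo_component("channeldata.json")
print(package_a == metadata_a)  # both map to the same component

# Option B: packages get "c" and metadata files get "m".
package_b = repo_component("hanneldata.json", prefix="c")
metadata_b = repo_component("channeldata.json", prefix="m")
print(package_b == metadata_b)  # first characters differ, so they cannot collide
```

With both kinds prefixed, the first character alone separates the two namespaces, so adding a new file type later would only need a new prefix letter rather than new sub-subdirs.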

@beckermr
Contributor Author

Hmmm. Maybe the right thing is to distinguish between the abstract URL given to conda and the storage location on disk. We can specify that more clearly in the CEP.
