Merged
68 commits
5672f6e
Standardize names in the conda ecosystem
jaimergp Mar 11, 2025
b91b868
Add versions and categorise a bit better
jaimergp Mar 11, 2025
64584b5
Descope versions, fix build strings
jaimergp Mar 12, 2025
563d9f4
Update cep-9999.md
jaimergp Mar 12, 2025
8d85088
Amend package names
jaimergp Mar 12, 2025
3df5157
do not recommend whitespace in labels
jaimergp Mar 12, 2025
732cf77
add backwards compatibility section
jaimergp Mar 12, 2025
ff68855
Merge branch 'main' into naming
beckermr Mar 12, 2025
e1c043f
Apply suggestions from code review
jaimergp Mar 12, 2025
90c4867
Add length limits
jaimergp Mar 12, 2025
1b9f46b
label -> channel
jaimergp Mar 12, 2025
809aa05
Define what a channel is
jaimergp Mar 12, 2025
cb227fc
Restrict subdirs to alphanumeric and single dash separators
jaimergp Mar 13, 2025
bcd3e31
Update cep-9999.md
beckermr Mar 14, 2025
b2d733a
Allow plus symbols in build strings
jaimergp Mar 14, 2025
3a09b36
Merge branch 'naming' of github.com:jaimergp/ceps into naming
jaimergp Mar 14, 2025
10c9265
Further restrict subdirs
jaimergp Mar 18, 2025
358b48f
rename
jaimergp Mar 18, 2025
dd0a28d
Rework channel names as base URLs
jaimergp Mar 18, 2025
8f2221b
Merge branch 'main' of github.com:conda/ceps into naming
jaimergp Mar 18, 2025
1185921
pre-commit
jaimergp Mar 18, 2025
5e3c86c
add lang to fences
jaimergp Mar 18, 2025
567ff85
add stats from anaconda.org
jaimergp Mar 19, 2025
4b3a39e
pre-commit
jaimergp Mar 19, 2025
e81f84f
add discussion link
jaimergp Mar 19, 2025
b07d334
Reword abstract
jaimergp Mar 19, 2025
42c33dd
Refer to RFC2119
jaimergp Mar 19, 2025
c2b25a6
Change table of contents a bit
jaimergp Mar 19, 2025
88d58dc
Add distribution strings
jaimergp Mar 19, 2025
501bfc1
clarify dist str != match spec
jaimergp Mar 19, 2025
ba6b09f
clarify scheme omission
jaimergp Mar 19, 2025
e9e6c06
Update cep-XXXX.md
jaimergp Mar 19, 2025
e79f5bf
pre-commit
jaimergp Mar 19, 2025
73c8607
Update cep-XXXX.md
beckermr Mar 19, 2025
a59522f
update path component regex
jaimergp Mar 19, 2025
a6c119a
Merge branch 'main' into naming
beckermr Mar 19, 2025
c473235
refine lengths
jaimergp Mar 29, 2025
5291902
add extensions
jaimergp Mar 29, 2025
d93720b
Merge branch 'main' into naming
beckermr Mar 30, 2025
a54e326
Reformat CEP document for readability
beckermr Mar 30, 2025
3961b88
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 30, 2025
9686874
Add additional references to comments in CEP
beckermr Mar 30, 2025
21a8e3a
style: correct spacing
beckermr Mar 30, 2025
8b6e3c3
update date
jaimergp Mar 30, 2025
913279e
Update cep-XXXX.md
beckermr Apr 1, 2025
f2d5480
add cheng to authors
jaimergp Apr 1, 2025
1a70702
add exception for file:// channels
jaimergp Apr 1, 2025
3793b1d
reword channel names
jaimergp Apr 1, 2025
14aae11
Merge branch 'naming' of github.com:jaimergp/ceps into naming
jaimergp Apr 1, 2025
3f7c94f
Update cep-XXXX.md
beckermr Apr 1, 2025
80a1225
bye \s
jaimergp Apr 1, 2025
162c57c
Merge branch 'naming' of github.com:jaimergp/ceps into naming
jaimergp Apr 1, 2025
4737ce9
Apply suggestions from code review
jaimergp Apr 2, 2025
de486d2
Swap dist str and filenames
jaimergp Apr 2, 2025
de7a36f
wrap paragraphs
jaimergp Apr 2, 2025
15bcd90
Apply suggestions from code review
beckermr Apr 2, 2025
e9afe15
Reword channel name use in absence of scheme and authority
jaimergp Apr 3, 2025
7eb2fbc
update date
jaimergp Apr 3, 2025
74f7156
Update cep-XXXX.md
jaimergp Apr 4, 2025
897a9a2
clarify subdir
jaimergp Apr 4, 2025
bcf6112
add restriction about channel names and label names wrt subdirs
jaimergp Apr 14, 2025
7ae6b5b
Recommend against subdir ambiguity
jaimergp Apr 15, 2025
858fb6f
Merge branch 'main' into naming
beckermr Apr 15, 2025
a9d809d
Update cep-XXXX.md
beckermr Apr 15, 2025
edfbe6e
Fix line breaks in channel base URL section
beckermr Apr 16, 2025
9f824e7
add note about __anaconda_core_depends deprecation
jaimergp Apr 17, 2025
f629100
Merge branch 'main' of github.com:conda/ceps into naming
jaimergp May 7, 2025
0794754
Mint as CEP 26
jaimergp May 7, 2025
1 change: 1 addition & 0 deletions README.md
@@ -36,6 +36,7 @@ for conda's implementation, all major changes should be submitted as
| [0023](cep-0023.md) | Text spec input files |
| [0024](cep-0024.md) | Specification of <code>environment.yml</code> input files |
| [0025](cep-0025.md) | Versioning of Existing conda Standards |
| [0026](cep-0026.md) | Identifying Packages and Channels in the conda Ecosystem |

## References

239 changes: 239 additions & 0 deletions cep-0026.md
@@ -0,0 +1,239 @@
# CEP 26 - Identifying Packages and Channels in the conda Ecosystem

<table>
<tr><td> Title </td><td> CEP 26 - Identifying Packages and Channels in the conda Ecosystem </td></tr>
<tr><td> Status </td><td> Approved </td></tr>
<tr><td> Author(s) </td><td>
Jaime Rodríguez-Guerra &lt;jaime.rogue@gmail.com&gt; <br />
Matthew R. Becker &lt;becker.mr@gmail.com&gt; <br />
Cheng H. Lee &lt;clee@anaconda.com&gt;
</td></tr>
<tr><td> Created </td><td> Mar 11, 2025</td></tr>
<tr><td> Updated </td><td> Apr 17, 2025</td></tr>
<tr><td> Discussion </td><td> https://github.com/conda/ceps/pull/116 </td></tr>
<tr><td> Implementation </td><td> N/A </td></tr>
</table>

## Abstract

This CEP aims to standardize names and other strings used to identify packages, artifacts and
channels in the conda ecosystem.

## Specification

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as
described in [RFC2119][RFC2119] when, and only when, they appear in all capitals, as shown here.

More specifically, violations of a MUST or MUST NOT rule MUST result in an error. Violations of the
rules specified by any of the other all-capital terms MAY result in a warning, at the discretion of
the implementation.

### Identifying package artifacts

The conda ecosystem distinguishes between two types of packages:

- Distributable packages: represented by a concrete, downloadable, extractable conda artifact.
- Virtual packages: not backed by any concrete artifact. They only exist on the client side.

#### Package names

A distributable package name MUST only consist of lowercase ASCII letters, numbers, hyphens,
periods and underscores. It MUST start with a letter, a number, or a single underscore. It MUST NOT
include two consecutive separators (hyphen, period, underscore).

Virtual package names MUST only consist of lowercase ASCII letters, numbers, hyphens, periods and
underscores. They MUST NOT use two consecutive separators, with one exception: they MUST start with
two underscores.

Distributable package names MUST match the following case-insensitive regex:
`^(([a-z0-9])|([a-z0-9_](?!_)))[._-]?([a-z0-9]+(\.|-|_|$))*$`.

Virtual package names MUST follow this regex: `^__[a-z0-9][._-]?([a-z0-9]+(\.|-|_|$))*$`.

In all cases, the maximum length of a package name MUST NOT exceed 64 characters.
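
The rules above can be checked mechanically. The following is a minimal sketch, not a reference implementation: the function name `is_valid_package_name` and its `virtual` flag are hypothetical. It assumes that the lowercase requirement from the prose applies in addition to the case-insensitive regex, so it checks both.

```python
import re

# Regexes copied verbatim from the text above.
# The text describes the first one as case-insensitive.
DISTRIBUTABLE_RE = re.compile(
    r"^(([a-z0-9])|([a-z0-9_](?!_)))[._-]?([a-z0-9]+(\.|-|_|$))*$",
    re.IGNORECASE,
)
VIRTUAL_RE = re.compile(r"^__[a-z0-9][._-]?([a-z0-9]+(\.|-|_|$))*$")
MAX_NAME_LENGTH = 64


def is_valid_package_name(name: str, virtual: bool = False) -> bool:
    """Return True if `name` passes the length, lowercase and regex checks."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    # Assumption: the prose rule (lowercase only) is enforced on top of the regex.
    if name != name.lower():
        return False
    pattern = VIRTUAL_RE if virtual else DISTRIBUTABLE_RE
    return pattern.fullmatch(name) is not None
```

With this sketch, `scikit-learn` and `_libgcc_mutex` are accepted as distributable names, `foo--bar` is rejected, and `__glibc` is only accepted when `virtual=True`.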

#### Version strings

Version strings MUST only consist of digits, periods, lowercase ASCII letters, underscores, plus
symbols, and exclamation marks. Additional rules apply but are out of scope in this CEP and will be
discussed separately.

The maximum length of a version string MUST NOT exceed 64 characters.
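
As an illustration only, the character-set and length rules could be checked as follows. The helper name `has_valid_version_charset` is hypothetical, rejecting the empty string is an assumption, and the sketch deliberately does not attempt the additional version grammar that this CEP leaves out of scope.

```python
import string

# Characters allowed by the rule above: digits, lowercase ASCII letters,
# periods, underscores, plus symbols and exclamation marks.
ALLOWED_VERSION_CHARS = set(string.digits + string.ascii_lowercase + "._+!")
MAX_VERSION_LENGTH = 64


def has_valid_version_charset(version: str) -> bool:
    """Check only the character set and length, not the full version grammar."""
    # Assumption: an empty version string is not meaningful, so it is rejected.
    return (
        0 < len(version) <= MAX_VERSION_LENGTH
        and set(version) <= ALLOWED_VERSION_CHARS
    )
```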

#### Build strings

Build strings MUST only consist of ASCII letters, numbers, periods, plus symbols, and underscores.
They MUST match this regex: `^[a-zA-Z0-9_\.+]+$`.

The maximum length of a build string MUST NOT exceed 64 characters.
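
A possible check for build strings, sketched with the regex and limit given above. The function name `is_valid_build_string` is hypothetical, and the sample build strings are made up for illustration.

```python
import re

# Regex copied verbatim from the text above.
BUILD_RE = re.compile(r"^[a-zA-Z0-9_\.+]+$")
MAX_BUILD_LENGTH = 64


def is_valid_build_string(build: str) -> bool:
    """Return True if `build` matches the regex and the length limit."""
    return len(build) <= MAX_BUILD_LENGTH and BUILD_RE.fullmatch(build) is not None
```

Note that hyphens are not allowed, which is what keeps the `-` separators in distribution strings and filenames unambiguous.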

#### Artifact extensions

Artifact extensions MUST only consist of lowercase ASCII letters, numbers and periods. They MUST
start and end with a letter or a number. They MUST NOT include two consecutive periods. They MUST
match this regex `^[a-z0-9](\.?[a-z0-9])*$`.

The maximum length of an artifact extension MUST NOT exceed 16 characters.

> The conda ecosystem currently recognizes two artifact extensions: `tar.bz2` and `conda`,
> versioned `v1` and `v2` respectively.
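
The extension rules can be sketched in the same way. The function name `is_valid_extension` is hypothetical; the regex and the 16-character limit come from the text above.

```python
import re

# Regex copied verbatim from the text above.
EXTENSION_RE = re.compile(r"^[a-z0-9](\.?[a-z0-9])*$")
MAX_EXTENSION_LENGTH = 16


def is_valid_extension(extension: str) -> bool:
    """Return True if `extension` (without a leading period) is well-formed."""
    return (
        len(extension) <= MAX_EXTENSION_LENGTH
        and EXTENSION_RE.fullmatch(extension) is not None
    )
```

Both currently recognized extensions, `tar.bz2` and `conda`, pass this check. The leading period is a separator in the filename, not part of the extension, so `.conda` is rejected.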

#### Distribution strings

A "distribution string" MAY be used to identify a package artifact, without specifying the
extension or channel. It MUST match the following syntax:

```text
[<subdir>/]<package name>-<version string>-<build string>
```

Distribution strings apply to distributable packages. They are used as the name of
the directories where artifacts are extracted in the package cache, for example.

Virtual packages MAY also be identified by a distribution string, but in those cases a subdir MUST
NOT be present.
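
Because version and build strings cannot contain hyphens, a distribution string can be split from the right without ambiguity, even when the package name itself contains hyphens. The following sketch illustrates this; the `Dist` type and `parse_dist_str` function are hypothetical names, and the sample values are made up.

```python
from typing import NamedTuple, Optional


class Dist(NamedTuple):
    subdir: Optional[str]
    name: str
    version: str
    build: str


def parse_dist_str(dist: str) -> Dist:
    """Split '[<subdir>/]<package name>-<version string>-<build string>'."""
    subdir, sep, rest = dist.rpartition("/")
    # Split on the last two hyphens only: anything before them is the
    # package name, which may itself contain hyphens.
    parts = rest.rsplit("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a distribution string: {dist!r}")
    name, version, build = parts
    return Dist(subdir if sep else None, name, version, build)
```

For example, `linux-64/scikit-learn-1.6.1-py312h1234567_0` splits into the subdir `linux-64`, the name `scikit-learn`, the version `1.6.1` and the build `py312h1234567_0`. The sketch does not validate the individual fields.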

> Note: Despite the similarity, distribution strings are not `MatchSpec`-like specifiers and MUST
> NOT be used as such.

#### Filenames

The filename of distributable conda artifacts is obtained by adding the artifact extension to its
distribution string (without the subdir, if present). It MUST match this syntax:

```text
<package name>-<version string>-<build string>.<extension>
```

The maximum length of a filename MUST NOT exceed 211 characters.

Virtual conda packages do not exist on disk and SHOULD NOT need filename standardization.
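
The 211-character limit follows from the per-field limits: three 64-character fields, a 16-character extension, and three separator characters. A small sketch that builds a filename and applies the limit is shown below; the function name `artifact_filename` is hypothetical and the fields are assumed to have been validated already.

```python
MAX_FIELD_LENGTH = 64      # package name, version string, build string
MAX_EXTENSION_LENGTH = 16
# Three fields, one extension, two hyphens and one period.
MAX_FILENAME_LENGTH = 3 * MAX_FIELD_LENGTH + MAX_EXTENSION_LENGTH + 3


def artifact_filename(name: str, version: str, build: str, extension: str) -> str:
    """Build '<package name>-<version string>-<build string>.<extension>'."""
    filename = f"{name}-{version}-{build}.{extension}"
    if len(filename) > MAX_FILENAME_LENGTH:
        raise ValueError(f"filename exceeds {MAX_FILENAME_LENGTH} characters")
    return filename
```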

### Identifying channels

A conda channel is defined as a URL where one can find one or more `repodata.json` files arranged
in one subdirectory (_subdir_) each. `noarch/repodata.json` MUST be present to consider the parent
location a channel.

#### Channel base URLs

The URL of any repodata file in a channel is defined as:

```text
<scheme>://[<authority>][/<path>/][/label/<label name>]/<subdir>/repodata.json
```

with `<scheme>`, `<authority>` and `<path>` defined by [RFC
3986](https://datatracker.ietf.org/doc/html/rfc3986#section-3.2).

Given the channel definition above, the base URL without trailing slashes is thus:

```text
<scheme>://[<authority>][/<path>/][/label/<label name>]
```

For example, given `https://conda.anaconda.org/conda-forge/noarch/repodata.json`, the part leading
to `noarch/repodata.json`, and thus the base URL, is `https://conda.anaconda.org/conda-forge`. For local
repodata such as `file:///home/username/channel/noarch/repodata.json`, the channel base URL is
`file:///home/username/channel`.
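
The two examples above can be reproduced with plain string handling. This is only a sketch, with a hypothetical function name `channel_base_url`; it assumes the input is a well-formed repodata URL and does not validate the subdir.

```python
def channel_base_url(repodata_url: str) -> str:
    """Strip the trailing '/<subdir>/repodata.json' from a repodata URL."""
    # Split off the last two '/'-separated segments: subdir and filename.
    base, subdir, filename = repodata_url.rsplit("/", 2)
    if filename != "repodata.json" or not subdir:
        raise ValueError(f"not a repodata URL: {repodata_url!r}")
    return base
```

Any `/label/<label name>` part stays in the result, since the definition above makes it part of the base URL.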

When present, each path component MUST only contain lowercase ASCII letters, numbers, underscores,
periods, and dashes. They MUST NOT start with a period or a dash. They SHOULD start and end with a
letter or a number. If present, each path component MUST match this regex:

```re
^[a-z0-9_][a-z0-9_.-]*$
```

For `file://`-based channel URLs, the path component rules MAY be understood as recommendations
only.

The maximum length of an individual path component in a channel base URL MUST NOT exceed 128
characters. The maximum length of a channel base URL SHOULD NOT exceed 256 characters.
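
One way to apply the MUST rules for path components is sketched below, using `urllib.parse.urlsplit` from the Python standard library. The function name `invalid_path_components` is hypothetical. The sketch assumes the URL carries no `/label/<label name>` part, and it chooses to skip `file://` URLs entirely, which is only one possible reading of the exception above.

```python
import re
from urllib.parse import urlsplit

# Regex copied verbatim from the text above.
PATH_COMPONENT_RE = re.compile(r"^[a-z0-9_][a-z0-9_.-]*$")
MAX_COMPONENT_LENGTH = 128


def invalid_path_components(base_url: str) -> list[str]:
    """Return the path components of `base_url` that break the MUST rules."""
    parts = urlsplit(base_url)
    if parts.scheme == "file":
        # Assumption: for file:// channels the rules are recommendations only,
        # so nothing is reported.
        return []
    components = [c for c in parts.path.split("/") if c]
    return [
        c
        for c in components
        if len(c) > MAX_COMPONENT_LENGTH or not PATH_COMPONENT_RE.fullmatch(c)
    ]
```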

To avoid ambiguous `MatchSpec` grammar, the last path component of a channel base URL SHOULD NOT
match any `subdir` identifiers. If it does, the behavior in this ambiguous case is not defined
and implementation dependent.

#### Channel names

For convenience, the channel _name_ is defined as the concatenation of `scheme`, `authority` and
`path` components of a channel URL. At least one of `authority` or `path` SHOULD be present. In
their absence, the channel name MUST be considered empty, regardless of the scheme. Empty channel
names SHOULD NOT be used.

When the scheme and authority fields are missing, the full URL can be inferred with these rules:

- If the channel name matches the regex `^\.{0,2}[/\\].*$`, or if it matches the regex
`^[A-Z]:([\\/].*)?$` (for Windows drives), it SHOULD be understood as the path component of a
`file://` URL.
- Otherwise, the tool SHOULD provide a user-configurable mechanism to use a default scheme and
authority, with the provided channel name taken as the rest of the path component. At the time of
this CEP's writing, most tools assume the default URL scheme and authority to be
`https://conda.anaconda.org`.
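
The inference rules above could be sketched as follows. The function name `resolve_channel_name` and the `("path", ...)` / `("url", ...)` return convention are hypothetical, and the `DEFAULT_BASE` constant stands in for what would be a user-configurable setting. Treating any input containing `://` as a complete URL is an extra assumption not stated in the text.

```python
import re

# Regexes copied verbatim from the text above.
LOCAL_PATH_RE = re.compile(r"^\.{0,2}[/\\].*$")
WINDOWS_DRIVE_RE = re.compile(r"^[A-Z]:([\\/].*)?$")
DEFAULT_BASE = "https://conda.anaconda.org"  # stand-in for a configurable default


def resolve_channel_name(name: str, default_base: str = DEFAULT_BASE) -> tuple[str, str]:
    """Return ('path', name) for local paths, otherwise ('url', full URL)."""
    if "://" in name:
        # Assumption: scheme already present, nothing to infer.
        return ("url", name)
    if LOCAL_PATH_RE.match(name) or WINDOWS_DRIVE_RE.match(name):
        # To be understood as the path component of a file:// URL.
        return ("path", name)
    return ("url", f"{default_base}/{name}")
```

Converting a local path into an actual `file://` URL is platform-dependent and is left out of the sketch.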

#### Subdir names

Channel subdir names MUST either be the literal `noarch` or a string following the syntax
`{os}-{arch}`, where `{os}` and `{arch}` MUST only consist of lowercase ASCII letters and numbers.
Non-`noarch` subdirs MUST match this regex: `^[a-z0-9]+-[a-z0-9]+$`.

The maximum length of a subdir name MUST NOT exceed 32 characters.
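
A minimal check for subdir names, using the regex and limit above. The function name `is_valid_subdir` is hypothetical.

```python
import re

# Regex for non-noarch subdirs, copied verbatim from the text above.
SUBDIR_RE = re.compile(r"^[a-z0-9]+-[a-z0-9]+$")
MAX_SUBDIR_LENGTH = 32


def is_valid_subdir(subdir: str) -> bool:
    """Return True for 'noarch' or a well-formed '{os}-{arch}' string."""
    if len(subdir) > MAX_SUBDIR_LENGTH:
        return False
    return subdir == "noarch" or SUBDIR_RE.fullmatch(subdir) is not None
```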

#### Label names

Channel label names MUST only consist of ASCII letters, digits, underscores, hyphens, forward
slashes, and periods. They MUST start with a letter. They MUST match this regex:
`^[a-zA-Z][0-9a-zA-Z_\-\./]*$`. The last `/`-delimited component of a label
SHOULD NOT match any `subdir` identifier. If it does, the behavior in this ambiguous
case is undefined and implementation dependent.

The label `nolabel` is reserved and MUST only be used for conda packages which have no other
labels. In other words, in the space of labels, the empty set is represented by the label
`nolabel`.

A URL for a package, repodata, etc. without a label component MUST be assumed to have the default
label `main`.

The maximum length of a label name MUST NOT exceed 128 characters.
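
The label rules, including the `main` default, can be sketched as below. Both function names are hypothetical. `label_from_base_url` assumes that the first `/label/` segment in the URL introduces the label, which would misfire for a channel whose path happens to contain a component named `label`.

```python
import re

# Regex copied verbatim from the text above.
LABEL_RE = re.compile(r"^[a-zA-Z][0-9a-zA-Z_\-\./]*$")
MAX_LABEL_LENGTH = 128
DEFAULT_LABEL = "main"


def is_valid_label(label: str) -> bool:
    """Return True if `label` matches the regex and the length limit."""
    return len(label) <= MAX_LABEL_LENGTH and LABEL_RE.fullmatch(label) is not None


def label_from_base_url(base_url: str) -> str:
    """Return the label in a channel base URL, or 'main' if there is none."""
    _, sep, label = base_url.partition("/label/")
    return label if sep and label else DEFAULT_LABEL
```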

## Backwards compatibility

The conda subdir and package name regexes are backwards compatible with the current `conda`
implementation (25.3) and all existing packages on the `defaults` and `conda-forge` channels,
except for the `__anaconda_core_depends` package on the `defaults` channel, which was [deprecated
in April 2025][anaconda-core-depends-depr]. See [this
comment](https://github.com/conda/ceps/pull/116#discussion_r1992234677).

The regex for labels was pulled from an anaconda.org error message describing the set of valid
labels.

As of 2025-03-12T19:00Z, of the ~1.9M channel names on anaconda.org:

- 7,219 violate the regex `^[a-z0-9]+((-|_|.)[a-z0-9]+)*$`;
- 98 violate the regex `^[a-z0-9][a-z0-9_.-]*$` (allowing channel names to end with `_`, `.`, or
`-`); and
- 6 violate `^[a-z0-9_][a-z0-9_.-]*$` (allowing channel names to start with `_`). Of those six,
five start with `.`, and the other starts with `~`.

See [this comment](https://github.com/conda/ceps/pull/116#discussion_r1992154574) for more details.
The authors have excluded the channel names in the last case that start with `.` or `~` given
possible security implications. A low percentage, ~0.4%, of channels do not match the
recommendations for channel names above, but are allowed.

The maximum lengths allowed for the different fields have been chosen so the resulting path
components (directory names, filenames) comfortably fit within the 255-character limit that some
filesystems impose. As of 2025-03-01T13:00Z, there are no violations of these limits in any of the
packages published for `conda-forge`, `bioconda` and `defaults`. See [this
comment](https://github.com/conda/ceps/pull/116#issuecomment-2763392999) and [this
comment](https://github.com/conda/ceps/pull/116#issuecomment-2759130187) for more details.

## Copyright

All CEPs are explicitly [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/).

<!-- links -->

[RFC2119]: https://www.ietf.org/rfc/rfc2119.txt

[anaconda-core-depends-depr]: https://conda.zulipchat.com/#narrow/channel/457607-general/topic/conda-oci.20incubation/near/512865665