anaconda-ident

Simple user identification for conda

The anaconda-ident package builds upon the anaconda-anon-usage package to enhance the telemetry data that is delivered by conda when performing package management operations and, optionally, when activating conda environments.

Unlike anaconda-anon-usage, which is designed to respect anonymity, this package can be configured to include actual hostnames, usernames, and environment names, so that administrators can track usage more precisely. This information can optionally be scrambled using a one-way hash function, which obscures the actual values in upstream logs while still allowing administrators to match them to known users or machines.

This plugin is not shipped by default in freely available Anaconda offerings. Instead, it is offered as an add-on for customers who seek to better track the usage of packages for governance and security purposes.

Quickstart

Installation

To use anaconda-ident, simply install it in your base environment:

conda install -n base anaconda-ident

This package has no additional dependencies other than conda and anaconda-anon-usage. It employs a combination of conda's plugin mechanism and post-link and activation scripts to modify the default behavior of conda.

This plugin builds upon the user-agent telemetry mechanisms built into anaconda-anon-usage, embedding telemetry "tokens" into the standard User-Agent header included with every package or index request made by conda. Additionally, if activation heartbeats are enabled in anaconda-anon-usage, these tokens will also be delivered when conda environments are activated. For a full introduction to this approach, including information about activation heartbeats, please consult the anaconda-anon-usage README.

Additional tokens

In addition to the tokens created by the anaconda-anon-usage package, anaconda-ident provides the following additional options. Each token takes the form X/<value>, where X represents one of the single characters below:

  • u: username, as determined by the getpass.getuser function.
  • h: hostname, as determined by the platform.node function.
  • n: environment name. This is the name of the environment directory (not the full path), or base for the root environment.
  • o: organization token. This is a custom identifier for your organization, specified in the configuration string.
  • U, H, and N: these are hashed versions of the username, hostname, and environment name (see the section "Hashed identifier tokens" below).
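The unhashed tokens above can be illustrated with a minimal Python sketch. It is not the package's actual implementation: the function names `environment_name` and `ident_tokens`, and the way the root environment is detected (by comparing the prefix against a base prefix passed in by the caller), are assumptions made for illustration. Only `getpass.getuser` and `platform.node` are taken from the list above.

```python
import getpass
import os
import platform


def environment_name(prefix, base_prefix):
    """Return 'base' for the root environment, else the directory name."""
    # Assumption: the root environment is recognized by comparing paths.
    if os.path.normpath(prefix) == os.path.normpath(base_prefix):
        return "base"
    return os.path.basename(os.path.normpath(prefix))


def ident_tokens(flags, prefix, base_prefix, org=None):
    """Build X/<value> tokens for the unhashed flags u, h, and n."""
    # Each value is computed lazily, only when its flag is requested.
    values = {
        "u": getpass.getuser,
        "h": platform.node,
        "n": lambda: environment_name(prefix, base_prefix),
    }
    tokens = [f"{flag}/{values[flag]()}" for flag in flags if flag in values]
    if org:
        tokens.append(f"o/{org}")
    return tokens


# Hypothetical prefixes, chosen only for the example.
print(ident_tokens("n", "/opt/conda/envs/ml", "/opt/conda", org="myorg"))
```

The hashed U, H, and N variants are deliberately left out of this sketch, since their computation depends on details not described here.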

The configuration string

A standard configuration string is simply a combination of one or more of the characters uhnUHN. An organization string may also be included by appending it to this configuration with a leading colon :. Here are some examples:

  • uh:finance: username, hostname, and a finance organization.
  • uhn:eng: username, hostname, and environment name, plus an eng organization.

No matter what configuration is chosen, the standard set of anaconda-anon-usage tokens will be delivered as well.

For convenience, a number of special keywords are also available, all of which can be combined with the organization string:

  • username: equivalent to u.
  • hostname: equivalent to h.
  • userhost: equivalent to uh.
  • userenv: equivalent to un.
  • hostenv: equivalent to hn.
  • full: equivalent to uhn.
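As a rough illustration of the format described above, the following Python sketch splits a configuration string into its flags and organization, expanding the keywords from the list. The function name `parse_config` is hypothetical, and the package's real parser may validate or normalize input differently. The optional third field is the pepper value covered in the section on hashed identifier tokens.

```python
# Keyword shortcuts, taken from the list above.
KEYWORDS = {
    "username": "u",
    "hostname": "h",
    "userhost": "uh",
    "userenv": "un",
    "hostenv": "hn",
    "full": "uhn",
}
VALID_FLAGS = set("uhnUHN")


def parse_config(config):
    """Split '<flags-or-keyword>[:<org>[:<pepper>]]' into its parts."""
    head, _, rest = config.partition(":")
    org, _, pepper = rest.partition(":")
    # Expand a keyword if one was given; otherwise treat it as raw flags.
    flags = KEYWORDS.get(head, head)
    unknown = set(flags) - VALID_FLAGS
    if unknown:
        raise ValueError(f"unknown flag(s): {''.join(sorted(unknown))}")
    return flags, org or None, pepper or None


print(parse_config("full:myorg"))
print(parse_config("uh:finance"))
```

Rejecting unknown characters is a design choice of this sketch, not documented behavior of anaconda-ident.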

Here is an example set of tokens for the configuration full:myorg:

c/NIedulQP s/SsYPna-z e/Tfgp_cYz u/mgrant h/m1mbp.local n/base o/myorg

The c, s, and e tokens are generated by anaconda-anon-usage, while the remaining tokens are generated by anaconda-ident.

Local configuration

There are two approaches to setting the configuration for anaconda-ident. The first is to set the anaconda_ident parameter using conda's standard configuration mechanisms. For instance, you can use the conda config command:

conda config --set anaconda_ident userhost:my_org

Alternatively, you can manually edit your ~/.condarc configuration file and insert a line such as:

anaconda_ident: userhost:my_org

Activation heartbeats

By default, anaconda-ident tokens are only sent when conda performs package operations (install, search, etc.). However, anaconda-anon-usage supports an opt-in "activation heartbeat" feature that sends telemetry (including all anaconda-ident tokens) when a conda environment is activated.

To enable activation heartbeats, add this to your system condarc file:

anaconda_heartbeat: true

When enabled, a lightweight HEAD request is sent to the upstream repository each time you run conda activate, providing visibility into environment usage patterns beyond just package installations.
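To make the shape of such a request concrete, here is a small Python sketch that prepares (but does not send) a HEAD request with the tokens in its User-Agent header. The URL, the `conda/0.0.0` prefix, and the function name `build_heartbeat_request` are placeholders chosen for illustration; the actual endpoint and header layout are determined by conda and anaconda-anon-usage.

```python
import urllib.request


def build_heartbeat_request(url, user_agent):
    """Prepare (but do not send) a HEAD request carrying the telemetry tokens."""
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )


request = build_heartbeat_request(
    "https://repo.example.com/pkgs/main/noarch/",  # hypothetical URL
    # Placeholder client prefix followed by the example tokens from this README.
    "conda/0.0.0 c/NIedulQP s/SsYPna-z e/Tfgp_cYz "
    "u/mgrant h/m1mbp.local n/base o/myorg",
)
# urllib stores header names capitalized, hence "User-agent" here.
print(request.get_method(), request.get_header("User-agent"))
```

A HEAD request asks only for response headers, so no package data is transferred.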

Configuration package creation

A key feature of the anaconda-ident package is the ability to create a sidecar conda package containing any combination of the following:

  • The anaconda_ident configuration string
  • A custom default_channels value to point conda's defaults metachannel to an alternative repository
  • A standard conda authentication token for a repository

The typical use case is to host this package on an internal package repository, to add it to custom Miniconda / Anaconda installers, or both.

The command to build this package is called anaconda-keymgr. Running anaconda-keymgr --help will provide all of the configuration options. Here is a typical call:

anaconda-keymgr \
    --version <VERSION_NUMBER> --build-string <ORGANIZATION> \
    --config-string <CONFIG_STRING> \
    --default-channel <REPO_URL> \
    --repo-token <REPO_TOKEN> \
    --org-token <ORG_TOKEN>

The above command will create a package named

anaconda-ident-config-<VERSION_NUMBER>-<ORGANIZATION>_0.tar.bz2

If this package is installed into a root conda environment, it will automatically activate anaconda-ident and configure it according to the settings provided.

Note: By default, anaconda-keymgr enables activation heartbeats. Use --no-heartbeat if you want to disable this feature in the generated configuration package.

Advanced: hashed identifier tokens

The hashed username, environment, and hostname tokens provide a measure of privacy preservation by applying a hash function to the original values. While this approach is not cryptographically secure, it is considered impractical for someone to extract the original identifying data from a hashed token. At the same time, someone with access to the configuration data can readily compute these hashes and use them to, for example, filter logs for records that match particular hosts, users, or environments.

The security of this approach can be improved by supplying a pepper value in the config string. The pepper is 16 bytes of random data, base64-encoded and appended to the end of the config string following a second colon; for instance:

anaconda_ident: userhost:my_org:ugQzhEX5Fs45/iOonikPXA

For simplicity, a --pepper option has been added to the anaconda-keymgr command to randomly generate a pepper value. To reuse an existing pepper value, simply supply it as part of the --config-string argument.
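For readers who want to see what a value of this shape looks like, the following Python sketch generates 16 random bytes and base64-encodes them. Stripping the trailing `=` padding is an assumption inferred from the 22-character example above, and the function names are hypothetical; the --pepper option of anaconda-keymgr remains the supported way to generate a pepper.

```python
import base64
import secrets


def generate_pepper():
    """Return 16 random bytes, base64-encoded with trailing '=' padding removed."""
    raw = secrets.token_bytes(16)
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def decode_pepper(pepper):
    """Restore the padding and decode a pepper back to its raw bytes."""
    padded = pepper + "=" * (-len(pepper) % 4)
    return base64.b64decode(padded)


pepper = generate_pepper()
print(f"anaconda_ident: userhost:my_org:{pepper}")
print(len(decode_pepper("ugQzhEX5Fs45/iOonikPXA")))  # the example pepper above
```

The `secrets` module is used rather than `random` because the pepper is meant to be unpredictable.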

A command-line utility anaconda-ident-hash has been provided to enable the hash values to be computed for filtering uses:

anaconda-ident-hash <environment|username|hostname> <value>

To obtain results that match the logs, run this command in a conda environment with a matching organization string and pepper value. For example,

anaconda-ident-hash hostname mgrant-mbp

would return the token generated for the hostname mgrant-mbp.

Distributing anaconda-ident

If you are an Anaconda customer interested in deploying anaconda-ident within your organization, please feel free to reach out to Anaconda Support. We can offer the following custom builds:

  • A set of anaconda-ident packages containing your preferred configuration.
  • A set of conda packages with metadata patched to include an anaconda-ident dependency.
  • Builds of the latest Miniconda and Anaconda installers with anaconda-ident added to them.

By hosting these builds in your internal package repository and software store, you can greatly simplify the distribution of this tool throughout your organization.
