The anaconda-ident package builds upon the
anaconda-anon-usage
package to enhance the telemetry data that is delivered by
conda when performing package management
operations and, optionally, when activating conda environments.
Unlike anaconda-anon-usage, which is designed to respect anonymity,
this package can be configured to include actual hostnames, usernames,
and environment names, to allow administrators to track usage more
precisely. This information can optionally be scrambled using a
one-way hash function, obscuring the actual values in upstream logs,
while still allowing administrators to match to known persons or machines.
This plugin is not shipped by default in freely available Anaconda offerings. Instead, it is offered as an add-on for customers who seek to better track the usage of packages for governance and security purposes.
To use anaconda-ident, simply install it in your base environment:
conda install -n base anaconda-ident
This package has no dependencies other than conda and
anaconda-anon-usage. It employs a combination of conda's plugin
mechanism and post-link and activation scripts to modify the default
behavior of conda.
This plugin builds upon the user-agent telemetry mechanisms built
into anaconda-anon-usage, embedding telemetry "tokens" into the
standard User-Agent header included with every package or index
request made by conda. Additionally, if activation heartbeats are
enabled in anaconda-anon-usage, these tokens will also be delivered
when conda environments are activated. For a full introduction to this
approach, including information about activation heartbeats, please
consult the anaconda-anon-usage README.
In addition to the tokens created by the anaconda-anon-usage package,
anaconda-ident provides the following tokens.
Each token takes the form X/<value>, where X represents one
of the single characters below:
- u: username, as determined by the getpass.getuser method.
- h: hostname, as determined by the platform.node method.
- n: environment name. This is the name of the environment directory (not the full path), or base for the root environment.
- o: organization token. This is a custom identifier for your organization, specified in the configuration string.
- U, H, and N: these are hashed versions of the username, hostname, and environment name (see the section "Hashed identifier tokens" below).
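As a rough illustration of where the raw values come from, the sketch below gathers them in Python. getpass.getuser and platform.node are the standard-library calls named above; the environment_name and raw_identifiers helpers, their parameters, and the rule of comparing the prefix against the root prefix are assumptions made for illustration, not the package's actual code.

```python
import getpass
import os
import platform


def environment_name(prefix: str, root_prefix: str) -> str:
    """Return the environment directory name, or "base" for the root environment.

    Hypothetical helper: both arguments are filesystem paths.
    """
    prefix = os.path.normpath(prefix)
    if prefix == os.path.normpath(root_prefix):
        return "base"
    # Only the final path component is reported, never the full path.
    return os.path.basename(prefix)


def raw_identifiers(prefix: str, root_prefix: str) -> dict:
    """Collect the unhashed u, h, and n values for one environment."""
    return {
        "u": getpass.getuser(),   # username
        "h": platform.node(),     # hostname
        "n": environment_name(prefix, root_prefix),
    }
```

Calling raw_identifiers with the active environment's path and the root installation path would yield the three values that the u, h, and n tokens carry.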
A standard configuration string is simply a combination of one
or more of the characters uhnUHN. An organization string may also
be included by appending it to this configuration with a leading colon :.
Here are some examples:
- uh:finance: username, hostname, and a finance organization.
- uhn:eng: all the tokens, including an eng organization.
No matter what configuration is chosen, the standard set of
anaconda-anon-usage tokens will be delivered as well.
For convenience, a number of special keywords are also available, all of which can be combined with the organization string:
- username: equivalent to u.
- hostname: equivalent to h.
- userhost: equivalent to uh.
- userenv: equivalent to un.
- hostenv: equivalent to hn.
- full: equivalent to uhn.
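The rules above can be sketched as a small parser. This is a minimal illustration of the documented syntax, not the package's own implementation; the parse_config name, its return shape, and its error handling are assumptions.

```python
# Keyword shortcuts from the list above, mapped to their single-character forms.
KEYWORDS = {
    "username": "u",
    "hostname": "h",
    "userhost": "uh",
    "userenv": "un",
    "hostenv": "hn",
    "full": "uhn",
}
VALID_FLAGS = set("uhnUHN")


def parse_config(config: str):
    """Split a configuration string into (flags, organization).

    Hypothetical helper: organization is None when no colon suffix is given.
    """
    spec, _, rest = config.partition(":")
    flags = KEYWORDS.get(spec, spec)
    unknown = set(flags) - VALID_FLAGS
    if unknown:
        raise ValueError(f"unknown token characters: {sorted(unknown)}")
    # Anything after a second colon is ignored in this sketch.
    organization = rest.split(":", 1)[0] or None
    return flags, organization


print(parse_config("full:myorg"))   # ('uhn', 'myorg')
print(parse_config("uh:finance"))   # ('uh', 'finance')
```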
Here is an example set of tokens for the configuration full:myorg:
c/NIedulQP s/SsYPna-z e/Tfgp_cYz u/mgrant h/m1mbp.local n/base o/myorg
The c, s, and e tokens are generated by anaconda-anon-usage,
while the remaining tokens are generated by anaconda-ident.
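To make the token format concrete, the following sketch reassembles the example line above from its parts. The ident_tokens function and the fixed u, h, n, o ordering are assumptions inferred from the example; hashed tokens are left out, and the c, s, and e values are simply copied from the example rather than computed.

```python
def ident_tokens(flags, organization, username, hostname, env_name):
    """Render unhashed identity tokens in X/<value> form (hypothetical helper)."""
    values = {"u": username, "h": hostname, "n": env_name}
    # Emit tokens in a fixed order, including only the requested ones.
    tokens = [f"{flag}/{values[flag]}" for flag in "uhn" if flag in flags]
    if organization:
        tokens.append(f"o/{organization}")
    return tokens


# Tokens produced by anaconda-anon-usage, copied from the example above.
anon_tokens = ["c/NIedulQP", "s/SsYPna-z", "e/Tfgp_cYz"]

suffix = " ".join(
    anon_tokens + ident_tokens("uhn", "myorg", "mgrant", "m1mbp.local", "base")
)
print(suffix)
# c/NIedulQP s/SsYPna-z e/Tfgp_cYz u/mgrant h/m1mbp.local n/base o/myorg
```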
There are two approaches to setting the configuration for
anaconda-ident. The first is to set the anaconda_ident parameter using conda's standard configuration mechanisms.
For instance, you can use the conda config command:
conda config --set anaconda_ident userhost:my_org
Alternatively, you can manually edit your ~/.condarc configuration file and
insert a line; e.g.:
anaconda_ident: userhost:my_org
By default, anaconda-ident tokens are only sent when conda performs package
operations (install, search, etc.). However, anaconda-anon-usage supports an
opt-in "activation heartbeat" feature that sends telemetry (including all
anaconda-ident tokens) when a conda environment is activated.
To enable activation heartbeats, add this to your system condarc file:
anaconda_heartbeat: true
When enabled, a lightweight HEAD request is sent to the upstream repository
each time you run conda activate, providing visibility into environment
usage patterns beyond just package installations.
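For readers unfamiliar with this kind of request, here is a minimal sketch of a HEAD request that carries telemetry tokens in its User-Agent header. It uses only Python's standard urllib; the function names, the URL, the timeout, and the User-Agent string are placeholders chosen for illustration, and the actual request issued by anaconda-anon-usage may differ in every detail.

```python
import urllib.request


def build_heartbeat_request(url: str, user_agent: str) -> urllib.request.Request:
    """Prepare a body-less HEAD request with the given User-Agent string."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="HEAD"
    )


def send_heartbeat(request: urllib.request.Request, timeout: float = 1.0) -> bool:
    """Send the request, treating any network failure as non-fatal."""
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


# Hypothetical URL and User-Agent value, for illustration only.
request = build_heartbeat_request(
    "https://repo.example.com/", "conda/0.0 u/mgrant h/m1mbp.local n/base"
)
print(request.get_method())  # HEAD
```

A HEAD request transfers headers only, which is why it suits a lightweight heartbeat: the upstream server can log the User-Agent without sending any package data back.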
A key feature of the anaconda-ident package is the ability
to create a sidecar conda package containing any combination
of the following:
- The anaconda_ident configuration string
- A custom default_channels value to point conda's defaults metachannel to an alternative repository
- A standard Conda authentication token for a repository
The typical use case for this is to host this conda package on an internal package repository, and/or add it into custom Miniconda / Anaconda installers.
The command to build this package is called anaconda-keymgr.
Running anaconda-keymgr --help will provide all of the
configuration options. Here is a typical call:
anaconda-keymgr \
--version <VERSION_NUMBER> --build-string <ORGANIZATION> \
--config-string <CONFIG_STRING> \
--default-channel <REPO_URL> \
--repo-token <REPO_TOKEN> \
--org-token <ORG_TOKEN>
The above command will create a package called
anaconda-ident-config-<VERSION>-<ORGANIZATION>_0.tar.bz2.
If this package is installed into a root conda environment,
it will automatically activate anaconda-ident and configure
it according to the settings provided.
Note: By default, anaconda-keymgr enables activation heartbeats.
Use --no-heartbeat if you want to disable this feature in the
generated configuration package.
The hashed username, environment, and hostname tokens provide a measure of privacy preservation by applying a hash function to the original values. While this approach is not cryptographically secure, it is considered impractical for someone to extract the original identifying data from a hashed token. At the same time, someone with access to the configuration data can readily compute these hashes and use them to, for example, filter logs for records that match particular hosts, users, or environments.
The security of this approach can be improved by supplying a pepper value in the config string. The pepper is 16 bytes of random data, base64-encoded and appended to the end of the config string following a second colon; for instance:
anaconda_ident: userhost:my_org:ugQzhEX5Fs45/iOonikPXA
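If you want to produce a value of this shape by hand, the sketch below generates 16 random bytes and base64-encodes them. Dropping the trailing = padding is an assumption inferred from the 22-character example above, and both function names are hypothetical.

```python
import base64
import secrets


def generate_pepper() -> str:
    """Return 16 random bytes as base64 text without trailing padding."""
    raw = secrets.token_bytes(16)
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def decode_pepper(text: str) -> bytes:
    """Recover the 16 raw bytes from an unpadded base64 pepper string."""
    padded = text + "=" * (-len(text) % 4)  # restore the stripped padding
    raw = base64.b64decode(padded)
    if len(raw) != 16:
        raise ValueError("a pepper must decode to exactly 16 bytes")
    return raw


pepper = generate_pepper()
print(f"anaconda_ident: userhost:my_org:{pepper}")
```

The secrets module is used rather than random because the pepper is meant to be unguessable.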
For simplicity, a --pepper option has been added to the
anaconda-keymgr command to randomly generate a pepper value.
To reuse an existing pepper value, simply supply it as part
of the --config-string argument.
A command-line utility, anaconda-ident-hash, is provided
to compute these hash values for filtering purposes:
anaconda-ident-hash <environment|username|hostname> <value>
To obtain results that match the logs, run this command in a conda environment configured with the same organization string and pepper value. For example, the command
anaconda-ident-hash hostname mgrant-mbp
would return the token generated for the hostname mgrant-mbp.
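The filtering workflow can be sketched as follows. The hash function here is a stand-in (a keyed BLAKE2b digest from Python's hashlib) chosen only to show how an organization string and pepper feed into a one-way token. It is not the algorithm used by anaconda-ident-hash, so its output will not match real logs; use the command-line utility for that. The function names, log lines, and pepper bytes are all made up for the example.

```python
import base64
import hashlib


def illustrative_hash(kind: str, value: str, organization: str, pepper: bytes) -> str:
    """Stand-in one-way hash; NOT the algorithm anaconda-ident actually uses."""
    message = f"{kind}:{organization}:{value}".encode("utf-8")
    digest = hashlib.blake2b(message, key=pepper, digest_size=6).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def matching_lines(log_lines, token: str):
    """Keep only the log lines whose space-separated fields include the token."""
    return [line for line in log_lines if token in line.split()]


pepper = bytes(range(16))  # placeholder pepper, for illustration only
token = "H/" + illustrative_hash("hostname", "mgrant-mbp", "my_org", pepper)

# Fabricated log lines, present only to exercise the filter.
logs = [
    f"GET /pkgs/main conda/0.0 {token} o/my_org",
    "GET /pkgs/main conda/0.0 H/unrelated o/my_org",
]
print(matching_lines(logs, token))
```

Because the pepper and organization string are inputs to the hash, someone who holds the logs but not the configuration cannot recompute tokens for guessed hostnames.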
If you are an Anaconda customer interested in deploying
anaconda-ident within your organization, please feel free to
reach out to Anaconda Support.
We can offer the following custom builds:
- A set of anaconda-ident packages containing your preferred configuration.
- A set of conda packages with metadata patched to include an anaconda-ident dependency.
- Builds of the latest Miniconda and Anaconda installers with anaconda-ident added to them.
By hosting these builds in your internal package repository and software store, you can greatly simplify the distribution of this tool throughout your organization.