wurst Documentation
Release 0.1
Chris Mutel
Nov 18, 2022
Contents
1 Installation
2 Documents versus matrices
3 Searching and filtering
4 Transformations
5 Unlinking and Re-linking
6 Spatial relationships
7 Brightway2 IO
8 Built-in models
9 Technical documentation
10 Indices and tables
Index
Show how the sausage is made!
Wurst is a Python package for linking and modifying industrial ecology models, with a focus on sparse matrices in
life cycle assessment. Current development focuses on modifying the ecoinvent LCI database with scenario data from
various data sources, using Brightway2 as the data backend.
See also the separate wurst examples repository.
A wurst model run will typically consist of the following steps:
• Load data from several sources
• Modify the LCI data
• Write the modified LCI data to some storage mechanism
Wurst supports the following generic modification types:
• Change the input material efficiency and associated emissions (change_exchanges_by_constant_factor)
• Change specific emissions separate from general efficiency improvements
• Change the relative shares of inputs (including adding new inputs) into markets
• Separate a global dataset into separate regions
In general, a modification function will include the following steps:
• Filter the LCI database by name, unit, location, etc. to get the subset of activities to modify
• Filter the external data source to get the relevant data used for modifications
• Change the values of some of the exchanges in the filtered LCI database using the filtered external data
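The three steps above can be sketched in plain Python on a toy database. This is only an illustration under stated assumptions: the two datasets, the efficiency_gain table, and the helper apply_efficiency are hypothetical and are not part of wurst.

```python
# Toy LCI database: each activity is a document with metadata and exchanges.
lci_database = [
    {
        "name": "electricity production, hard coal",
        "unit": "kilowatt hour",
        "location": "DE",
        "exchanges": [
            {"name": "hard coal", "type": "technosphere", "amount": 0.4},
            {"name": "Carbon dioxide, fossil", "type": "biosphere", "amount": 1.0},
        ],
    },
    {
        "name": "transport, freight, lorry",
        "unit": "ton kilometer",
        "location": "DE",
        "exchanges": [
            {"name": "diesel", "type": "technosphere", "amount": 0.02},
        ],
    },
]

# Hypothetical external scenario data: one scaling factor per location.
efficiency_gain = {"DE": 0.8, "FR": 0.9}


def apply_efficiency(database, factors):
    """Hypothetical helper that follows the three steps listed above."""
    # Step 1: filter the LCI database to the activities to modify.
    selected = (
        ds for ds in database
        if "electricity" in ds["name"] and ds["unit"] == "kilowatt hour"
    )
    for ds in selected:
        # Step 2: filter the external data to the value relevant here.
        factor = factors.get(ds["location"])
        if factor is None:
            continue
        # Step 3: change the exchange values in place.
        for exc in ds["exchanges"]:
            if exc["type"] in ("technosphere", "biosphere"):
                exc["amount"] *= factor
    return database


apply_efficiency(lci_database, efficiency_gain)
```

Only the electricity dataset is touched; the transport dataset keeps its original amounts because it does not pass the filter in step 1.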
CHAPTER 1
Installation
Download and install miniconda, create and activate a new environment, and then install:
conda install -y -q -c conda-forge -c cmutel -c haasad -c konstantinstadler brightway2 jupyter wurst
CHAPTER 2
Documents versus matrices
Inventory matrices can be modified by multiplying or adding vectors, as in the Themis methodology paper. Wurst
takes a different approach - it treats each activity (column in the technosphere matrix) as a document with metadata
and a list of exchanges which can be modified as desired. This approach allows for both flexibility (e.g. the number
of rows and columns is not fixed) and simpler code (no need for an indirection layer to row and column indices). So,
instead of constructing a vector and using it directly, with wurst you would write a function like:
import wurst as w

def scale_biosphere_exchanges_by_delta(ds, delta):
    # Not directly related to fuel inputs
    exclude_list = [
        'Methane, fossil', 'Sulfur dioxide',
        'Carbon monoxide, fossil',
        'Nitrogen oxides', 'Dinitrogen monoxide', 'Particulates'
    ]
    for exc in w.biosphere(ds, w.doesnt_contain_any('name', exclude_list)):
        # Modifies in place
        w.rescale_exchange(exc, delta)
2.1 Internal data format
The internal data format for Wurst is a subset of the implied internal format for Brightway2.
{
    'database': str,
    'code': str,
    'name': str,
    'reference product': str,
    'location': str,
    'unit': str,
    'classifications': [tuple],
    'comment': str,
    'parameters': {'parameter name (str)': float},
    'exchanges': [
        {
            'amount': float,
            'categories': list,  # only for biosphere flows
            'type': str,  # biosphere, technosphere, production
            'name': str,
            'database': str,
            'product': str,
            'unit': str,
            'location': str,
            'input': tuple,  # only if from external database
            'uncertainty type': int,  # optional
            'loc': float,  # optional
            'scale': float,  # optional
            'shape': float,  # optional
            'minimum': float,  # optional
            'maximum': float,  # optional
            'production volume': float,  # optional
            'pedigree': {  # optional
                'completeness': int,
                'further technological correlation': int,
                'geographical correlation': int,
                'reliability': int,
                'temporal correlation': int
            },
        }
    ]
}
An example classification:
('ISIC rev.4 ecoinvent', '1050:Manufacture of dairy products')
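To make the schema concrete, here is a minimal sketch of one dataset document written as a Python literal. The activity, codes, and amounts are invented for illustration. The helper missing_fields and its choice of "required" exchange fields are assumptions for this example, not a validation routine provided by wurst.

```python
# A made-up dataset document following the schema above.
dataset = {
    "database": "example db",
    "code": "abc123",
    "name": "cheese production",
    "reference product": "cheese",
    "location": "CH",
    "unit": "kilogram",
    "classifications": [
        ("ISIC rev.4 ecoinvent", "1050:Manufacture of dairy products"),
    ],
    "comment": "",
    "parameters": {},
    "exchanges": [
        {
            "amount": 1.0,
            "type": "production",
            "name": "cheese production",
            "database": "example db",
            "product": "cheese",
            "unit": "kilogram",
            "location": "CH",
        },
        {
            "amount": 0.002,
            "type": "biosphere",
            "categories": ["air"],
            "name": "Methane, fossil",
            "database": "example biosphere",
            "unit": "kilogram",
            # Link to a flow in an external database
            "input": ("example biosphere", "methane-air"),
        },
    ],
}

# Assumption for this sketch: the fields every exchange should carry.
REQUIRED_EXCHANGE_FIELDS = ("amount", "type", "name", "unit")


def missing_fields(ds):
    """Return (exchange index, field) pairs for absent required fields."""
    return [
        (index, field)
        for index, exc in enumerate(ds["exchanges"])
        for field in REQUIRED_EXCHANGE_FIELDS
        if field not in exc
    ]


problems = missing_fields(dataset)
```

A quick structural check like this can catch malformed documents before they reach a transformation function.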
CHAPTER 3
Searching and filtering
Wurst provides helper functions to make searching and filtering easier. These filter functions are designed to be used
with get_many and get_one; here is an example:
nuclear_generation = get_many(
    lci_database,
    contains('name', 'nuclear'),
    contains('name', 'electricity'),
    equals('unit', 'kilowatt hour'),
    exclude(contains('name', 'aluminium')),
    exclude(contains('name', 'import'))
)
It is also OK to write a generator function that does the same thing:
nuclear_generation = (
    ds for ds in lci_database
    if 'nuclear' in ds['name']
    and 'electricity' in ds['name']
    and ds['unit'] == 'kilowatt hour'
    and 'aluminium' not in ds['name']
    and 'import' not in ds['name']
)
The difference between the styles is ultimately a question of personal preference. For many people, list and generator
expressions are more pythonic; in the specific case of wurst, using helper functions that are composable and reusable
may allow you to not repeat yourself as often. There will also be times when the helper functions in wurst are not
good enough for a specific search. In any case bear in mind the following general guidelines:
• Always manually check the results of your filtering functions before using them! The world is a complicated
place, and our data sources reflect that complexity with unexpected or inconsistent elements.
• It is strongly recommended to use generator expressions instead of list comprehensions, i.e. (x for x in foo)
instead of [x for x in foo].
For more information, see the introduction notebook, the API documentation for searching, and the itertools,
functools, and toolz libraries.
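To show why composable helpers reduce repetition, the sketch below defines simplified stand-ins for equals, contains, exclude, and get_many. They share names with the wurst helpers but are assumptions written for this example, not the library's implementation, and the toy database is invented.

```python
def equals(field, value):
    # Filter factory: dataset field must equal value
    return lambda ds: ds.get(field) == value


def contains(field, value):
    # Filter factory: value must be a substring of the dataset field
    return lambda ds: value in ds.get(field, "")


def exclude(func):
    # Invert another filter
    return lambda ds: not func(ds)


def get_many(data, *funcs):
    # Lazily yield the datasets that pass every filter
    return (ds for ds in data if all(func(ds) for func in funcs))


lci_database = [
    {"name": "electricity production, nuclear", "unit": "kilowatt hour"},
    {"name": "electricity production, nuclear, import", "unit": "kilowatt hour"},
    {"name": "electricity production, wind", "unit": "kilowatt hour"},
    {"name": "heat production, nuclear", "unit": "megajoule"},
]

# Define the shared filters once and reuse them in several searches.
electricity_kwh = (
    contains("name", "electricity"),
    equals("unit", "kilowatt hour"),
)

nuclear = list(get_many(
    lci_database,
    contains("name", "nuclear"),
    *electricity_kwh,
    exclude(contains("name", "import")),
))
wind = list(get_many(lci_database, contains("name", "wind"), *electricity_kwh))
```

Because get_many returns a generator, nothing is evaluated until the result is consumed, here by list().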
3.1 Exchange iterators
The technosphere, biosphere, and production functions will return generators for exchanges with their
respective exchange types.
Wurst also provides reference_product(dataset), which will return the single reference product for a
dataset. If zero or multiple products are available, it will raise an error.
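The behaviour described above can be sketched as follows. The functions and exception classes below are simplified stand-ins written for this example; they reuse the names from the wurst API but are not its implementation.

```python
class NoResults(Exception):
    """Stand-in for the error raised when nothing matches."""


class MultipleResults(Exception):
    """Stand-in for the error raised when more than one item matches."""


def production(ds):
    # Generator over the production exchanges of a dataset
    return (exc for exc in ds["exchanges"] if exc["type"] == "production")


def reference_product(ds):
    # Materialize the generator so the candidates can be counted
    candidates = list(production(ds))
    if not candidates:
        raise NoResults("No production exchange found")
    if len(candidates) > 1:
        raise MultipleResults("More than one production exchange found")
    return candidates[0]


dataset = {
    "name": "cheese production",
    "exchanges": [
        {"type": "production", "name": "cheese", "amount": 1.0},
        {"type": "technosphere", "name": "milk", "amount": 10.0},
    ],
}
product = reference_product(dataset)
```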
CHAPTER 4
Transformations
wurst.transformations.activity.change_exchanges_by_constant_factor(ds, value, technosphere_filters=None, biosphere_filters=None)
Change some or all inputs and biosphere flows by a constant factor.
• ds is a dataset document.
• value is a number. Existing exchange amounts will be multiplied by this number.
• technosphere_filters is an iterable of filter functions. Optional.
• biosphere_filters is an iterable of filter functions. Optional.
Returns the altered dataset. The dataset is also modified in place, so the return value can be ignored.
Example: Changing coal dataset to reflect increased fuel efficiency
import wurst as w
from wurst.transformations.activity import change_exchanges_by_constant_factor

apct_products = w.either(
    w.equals('name', 'market for NOx retained'),
    w.equals('name', 'market for SOx retained'),
)
generation_filters = [
    w.either(w.contains('name', 'coal'), w.contains('name', 'lignite')),
    w.contains('name', 'electricity'),
    w.equals('unit', 'kilowatt hour'),
    w.doesnt_contain_any('name', [
        'market', 'aluminium industry',
        'coal, carbon capture and storage'
    ])
]
fuel_independent = w.doesnt_contain_any('name', (
    'Methane, fossil', 'Sulfur dioxide', 'Carbon monoxide, fossil',
    'Nitrogen oxides', 'Dinitrogen monoxide', 'Particulates'
))
# get_many takes the filters as separate arguments, so unpack the list
for ds in w.get_many(data, *generation_filters):
    change_exchanges_by_constant_factor(
        ds,
        0.8,  # Or whatever from input data
        [w.exclude(apct_products)],
        [fuel_independent]
    )
CHAPTER 5
Unlinking and Re-linking
Exchanges are considered “linked” if their input flows are already resolved to point to a certain producing activity. In
Brightway2, this link is the field “input”, whose value takes the form ('database name', 'unique code').
Wurst uses the same convention - the input field is used to uniquely identify an activity that produces the exchange
flow (biosphere flows are also considered activities).
The output field is not needed - this is the activity in question, which consumes the input flow. Production exchanges
will have the same value in input and output.
The default Brightway2 importer will remove the input field for exchanges which are provided by another activity
in the same set of input datasets. Instead of an input field, the exchange will have an activity name, a flow name, a
location, and a unit. This metadata is useful if you want to edit or create new exchange links.
The Brightway2 exporter will automatically re-link (i.e. find the correct input values) exchanges when writing a
new database. You can also manually create input values - no input value will be overwritten. In the database
component of the input field, you can either use the name of the new database to be written, or the name of one of
the input databases (it will be updated automatically).
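As a rough illustration of re-linking, the sketch below matches unlinked exchanges to producing activities by activity name, product, location, and unit, and leaves existing input values untouched. The function link_by_metadata and the two datasets are hypothetical; this is not the Brightway2 exporter or wurst's own linking code.

```python
def activity_key(ds):
    # Attributes that identify a producing activity
    return (ds["name"], ds["reference product"], ds["location"], ds["unit"])


def exchange_key(exc):
    # The same attributes as recorded on an unlinked exchange
    return (exc.get("name"), exc.get("product"),
            exc.get("location"), exc.get("unit"))


def link_by_metadata(data):
    """Hypothetical helper: fill in missing 'input' fields in place."""
    providers = {activity_key(ds): (ds["database"], ds["code"]) for ds in data}
    for ds in data:
        for exc in ds["exchanges"]:
            if "input" in exc:
                continue  # existing links are never overwritten
            key = exchange_key(exc)
            if key in providers:
                exc["input"] = providers[key]
    return data


data = [
    {
        "database": "example", "code": "steel-de",
        "name": "steel production", "reference product": "steel",
        "location": "DE", "unit": "kilogram",
        "exchanges": [],
    },
    {
        "database": "example", "code": "car-de",
        "name": "car production", "reference product": "car",
        "location": "DE", "unit": "unit",
        "exchanges": [
            # Unlinked: described only by metadata
            {"type": "technosphere", "amount": 900.0,
             "name": "steel production", "product": "steel",
             "location": "DE", "unit": "kilogram"},
            # Linked manually beforehand: must be left alone
            {"type": "technosphere", "amount": 5.0,
             "name": "paint production", "product": "paint",
             "location": "DE", "unit": "kilogram",
             "input": ("other database", "paint-1")},
        ],
    },
]
link_by_metadata(data)
```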
CHAPTER 6
Spatial relationships
Fig. 1: Topological faces in Northeastern Canada, showing both political and geographical divisions.
Wurst uses the constructive_geometries library to make spatial calculations easy. As shown above,
constructive_geometries splits the world into a consistent set of topological faces, identified by integer ID
values. This means that we can skip GIS functions like intersects, overlaps, etc. and instead use set algebra.
constructive_geometries is based on the Natural Earth database, and includes all countries, UN regions and
subregions, some disputed areas, and a number of ecoinvent-specific regions; see the constructive_geometries
documentation for details and the ecoinvent report for a complete list of regions. Countries are identified by their
two-letter ISO 3166-1 alpha-2 codes.
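The set-algebra idea can be illustrated with plain Python sets. The face IDs and the region "Region A" below are invented for this sketch; real face IDs come from constructive_geometries, whose API is not shown here.

```python
# Invented mapping from location code to topological face IDs.
faces = {
    "CA-QC": {1, 2, 3},
    "CA": {1, 2, 3, 4, 5},
    "US": {6, 7},
    "Region A": {1, 2, 3, 4, 5, 6, 7, 8},
}


def contained(inner, outer):
    # "Within" becomes a subset test
    return faces[inner] <= faces[outer]


def intersects(first, second):
    # "Intersects" becomes a non-empty set intersection
    return bool(faces[first] & faces[second])


def uncovered(location, providers):
    # Faces of `location` not covered by any provider location
    covered = set()
    for provider in providers:
        covered |= faces[provider]
    return faces[location] - covered


quebec_in_canada = contained("CA-QC", "CA")
canada_meets_us = intersects("CA", "US")
gap = uncovered("Region A", ["CA", "US"])
```

A non-empty gap like this one is the situation in which a RoW provider would be needed to cover the remaining faces.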
We recommend using the function relink_technosphere_exchanges, as this should be flexible enough for almost all
cases, and has been tested to avoid common corner cases and possible hidden errors. See also the Matching and
linking datasets in space example notebook.
wurst.transformations.geo.relink_technosphere_exchanges(ds, data, exclusive=True, drop_invalid=False, biggest_first=False, contained=True)
Find new technosphere providers based on the location of the dataset.
Designed to be used when the dataset’s location changes, or when new datasets are added.
Uses the name, reference product, and unit of the exchange to filter possible inputs. These must match exactly.
Searches in the list of datasets data.
Will only search for providers contained within the location of ds, unless contained is set to False, in which
case all providers whose location intersects the location of ds will be used.
A RoW provider will be added if there is even a single topological face in the location of ds which isn’t covered by
the location of any providing activity.
If no providers can be found, relink_technosphere_exchanges will try to add a RoW or GLO provider, in
that order, if available. If there are still no valid providers, an InvalidLink exception is raised, unless
drop_invalid is True, in which case the exchange will be deleted.
Allocation between providers is done using allocate_inputs; results may look strange if
contained=False, as production volumes for large regions would be used as allocation factors.
Input arguments:
• ds: The dataset whose technosphere exchanges will be modified.
• data: The list of datasets to search for technosphere product providers.
• exclusive: Bool, default is True. Don’t allow overlapping locations in input providers.
• drop_invalid: Bool, default is False. Delete exchanges for which no valid provider is available.
• biggest_first: Bool, default is False. Determines search order when selecting provider locations.
Only relevant if exclusive is True.
• contained: Bool, default is True. If true, only use providers whose location is completely within the
ds location; otherwise use all intersecting locations.
Modifies the dataset in place; returns the modified dataset.
CHAPTER 7
Brightway2 IO
wurst.brightway.extract_database.extract_brightway2_databases(database_names,
add_properties=False,
add_identifiers=False)
Extract a Brightway2 SQLiteBackend database to the Wurst internal format.
database_names is a list of database names. You should already be in the correct project.
Returns a list of dataset documents.
wurst.brightway.write_database.write_brightway2_database(data, name)
Write a new database as a new Brightway2 database named name.
You should be in the correct project already.
This function will do the following:
• Change the database name for all activities and internal exchanges to name. All activities will have the
new database name, even if the original data came from multiple databases.
• Relink exchanges using the default fields: ('name', 'product', 'location', 'unit').
• Check that all internal links resolve to actual activities. If the input value is ('name', 'bar'), there
must be an activity with the code bar.
• Check to make sure that all activity codes are unique
• Write the data to a new Brightway2 SQLite database
Will raise an assertion error if name already exists.
Doesn’t return anything.
CHAPTER 8
Built-in models
8.1 Marginal electricity mixes
This model is based on the work of Laurent Vandepaer, and changes the electricity mixes (market for
electricity, low/medium/high voltage) in the consequential version of ecoinvent. Input data is gathered
from a number of different sources, and processed to an Excel sheet that lists the absolute generation values for a
number of ecoinvent electricity generators. This model is illustrated in the notebook marginal-mixes.ipynb.
This model needs to do the following:
• Import the input data
• Load ecoinvent, consequential system model
• Remove generators that aren’t in the current version of ecoinvent
• Normalize production values to sum to one kilowatt hour
• Remove all generators from low voltage production mix, replace completely with electricity voltage
transformation from medium to low voltage
• Remove all generators from medium voltage production mix, replace completely with electricity
voltage transformation from high to medium voltage
• Remove all generators from high voltage production mix, replace with new exchanges linked to our new gener-
ation technologies and amounts
• Relink all exchanges in the extracted database
• Write the database
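The normalization step in the list above amounts to dividing each absolute generation value by the total. A minimal sketch, in which the generator names, the values, and the helper normalize_mix are all invented for illustration:

```python
def normalize_mix(generation):
    """Scale absolute generation values so that they sum to one."""
    total = sum(generation.values())
    if total <= 0:
        raise ValueError("Total generation must be positive")
    return {name: amount / total for name, amount in generation.items()}


# Invented absolute generation values, e.g. in GWh
generation = {"wind, onshore": 30.0, "solar PV": 50.0, "natural gas": 20.0}
mix = normalize_mix(generation)
```

Each value in mix can then be used as the amount of a new exchange in a market that supplies one kilowatt hour.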
Note that we move solar production from the low voltage mix to the high voltage mix, as new solar PV can be a
large fraction of the marginal increase in production, more than 50% in some countries, but if it was stuck in the low
voltage mix it would only be consumed by a few activities. Most activities consume electricity from the high voltage
production mix.
The step removing some potential production technologies is needed because we are developing against ecoinvent 3.3,
but our external import data is also linked against some technologies that will be included in 3.4. We skip these for
now.
When we insert new production exchanges, we need to use a geo-matching function that finds the appropriate gener-
ation technology. Sometimes there are country-specific generators, but other times we will need to use a regional or
even global producer.
CHAPTER 9
Technical documentation
9.1 Technical Reference
9.1.1 Searching
wurst.searching.equals(field, value)
Return function where input field value is equal to value
wurst.searching.contains(field, value)
wurst.searching.startswith(field, value)
wurst.searching.either(*funcs)
Return True if any of the functions evaluates to true
wurst.searching.exclude(func)
Return the opposite of func (i.e. False instead of True)
wurst.searching.doesnt_contain_any(field, values)
Exclude all datasets whose field contains any of values
wurst.searching.get_many(data, *funcs)
Apply all filter functions funcs to data
wurst.searching.get_one(data, *funcs)
Apply filter functions funcs to data, and return exactly one result.
Raises wurst.errors.NoResults or wurst.errors.MultipleResults if zero or multiple results
are returned.
9.1.2 Exchange iterators
wurst.searching.technosphere(ds, *funcs)
Get all technosphere exchanges in ds that pass filtering functions funcs
wurst.searching.biosphere(ds, *funcs)
Get all biosphere exchanges in ds that pass filtering functions funcs
wurst.searching.production(ds, *funcs)
Get all production exchanges in ds that pass filtering functions funcs
wurst.searching.reference_product(ds)
Get single reference product exchange from a dataset.
Raises wurst.errors.NoResults or wurst.errors.MultipleResults if zero or multiple results
are returned.
9.1.3 Geo functions
wurst.transformations.geo.copy_to_new_location(ds, location)
Copy dataset and substitute new location.
Doesn’t change exchange locations, except for production exchanges.
Returns the new dataset.
wurst.transformations.geo.relink_technosphere_exchanges(ds, data, exclusive=True, drop_invalid=False, biggest_first=False, contained=True)
Find new technosphere providers based on the location of the dataset.
Designed to be used when the dataset’s location changes, or when new datasets are added.
Uses the name, reference product, and unit of the exchange to filter possible inputs. These must match exactly.
Searches in the list of datasets data.
Will only search for providers contained within the location of ds, unless contained is set to False, in which
case all providers whose location intersects the location of ds will be used.
A RoW provider will be added if there is even a single topological face in the location of ds which isn’t covered by
the location of any providing activity.
If no providers can be found, relink_technosphere_exchanges will try to add a RoW or GLO provider, in
that order, if available. If there are still no valid providers, an InvalidLink exception is raised, unless
drop_invalid is True, in which case the exchange will be deleted.
Allocation between providers is done using allocate_inputs; results may look strange if
contained=False, as production volumes for large regions would be used as allocation factors.
Input arguments:
• ds: The dataset whose technosphere exchanges will be modified.
• data: The list of datasets to search for technosphere product providers.
• exclusive: Bool, default is True. Don’t allow overlapping locations in input providers.
• drop_invalid: Bool, default is False. Delete exchanges for which no valid provider is available.
• biggest_first: Bool, default is False. Determines search order when selecting provider locations.
Only relevant if exclusive is True.
• contained: Bool, default is True. If true, only use providers whose location is completely within the
ds location; otherwise use all intersecting locations.
Modifies the dataset in place; returns the modified dataset.
wurst.transformations.geo.allocate_inputs(exc, lst)
Allocate the input exchanges in lst to exc, using production volumes where possible, and equal splitting
otherwise.
Always uses equal splitting if RoW is present.
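The allocation rule can be sketched as below. The helper allocation_factors is hypothetical, and it interprets "where possible" as "every provider has a positive production volume"; that interpretation is an assumption and may differ from what allocate_inputs actually does.

```python
def allocation_factors(providers):
    """Return one share per provider; the shares sum to one.

    `providers` is a list of dicts with a 'location' key and an
    optional 'production volume' key (hypothetical input shape).
    """
    if not providers:
        return []
    has_row = any(p["location"] == "RoW" for p in providers)
    volumes = [p.get("production volume") for p in providers]
    # Assumption: volumes are usable only if all are present and positive
    usable = all(v is not None and v > 0 for v in volumes)
    if has_row or not usable:
        # Equal splitting
        return [1 / len(providers)] * len(providers)
    total = sum(volumes)
    return [v / total for v in volumes]


by_volume = allocation_factors([
    {"location": "DE", "production volume": 30.0},
    {"location": "FR", "production volume": 10.0},
])
with_row = allocation_factors([
    {"location": "DE", "production volume": 30.0},
    {"location": "RoW", "production volume": 10.0},
])
```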
wurst.transformations.default_global_location(database)
Set missing locations to `GLO` for datasets in database.
Changes location if location is missing or None. Will add key location if missing.
9.1.4 Linking
wurst.linking.link_internal(data, fields=('name', 'product', 'location', 'unit'))
Link internal exchanges by fields. Creates input field in newly-linked exchanges.
wurst.linking.check_internal_linking(data)
Check that each internal link is to an actual activity
wurst.linking.change_db_name(data, name)
Change the database of all datasets in data to name.
Raises an error if any dataset does not have exactly one reference production exchange.
wurst.linking.check_duplicate_codes(data)
Check that there won’t be duplicate codes when activities are merged to new, common database
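The two checks in this section can be sketched in plain Python. The helper names find_duplicate_codes and find_unresolved_links and the toy data are invented for illustration; these helpers return the offending items, which may differ from how the wurst functions report problems.

```python
from collections import Counter


def find_duplicate_codes(data):
    # Codes that appear on more than one dataset
    counts = Counter(ds["code"] for ds in data)
    return sorted(code for code, count in counts.items() if count > 1)


def find_unresolved_links(data, database_name):
    # Internal links whose code matches no dataset
    codes = {ds["code"] for ds in data}
    unresolved = []
    for ds in data:
        for exc in ds["exchanges"]:
            link = exc.get("input")
            # Only links into the database being written are internal
            if link and link[0] == database_name and link[1] not in codes:
                unresolved.append(link)
    return unresolved


data = [
    {"code": "foo", "exchanges": [{"input": ("new db", "bar")}]},
    {"code": "bar", "exchanges": [{"input": ("new db", "missing")}]},
    {"code": "bar", "exchanges": [{"input": ("biosphere", "co2")}]},
]
duplicates = find_duplicate_codes(data)
unresolved = find_unresolved_links(data, "new db")
```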
9.1.5 Transformations
wurst.transformations.activity.change_exchanges_by_constant_factor(ds, value, technosphere_filters=None, biosphere_filters=None)
Change some or all inputs and biosphere flows by a constant factor.
• ds is a dataset document.
• value is a number. Existing exchange amounts will be multiplied by this number.
• technosphere_filters is an iterable of filter functions. Optional.
• biosphere_filters is an iterable of filter functions. Optional.
Returns the altered dataset. The dataset is also modified in place, so the return value can be ignored.
Example: Changing coal dataset to reflect increased fuel efficiency
import wurst as w
from wurst.transformations.activity import change_exchanges_by_constant_factor

apct_products = w.either(
    w.equals('name', 'market for NOx retained'),
    w.equals('name', 'market for SOx retained'),
)
generation_filters = [
    w.either(w.contains('name', 'coal'), w.contains('name', 'lignite')),
    w.contains('name', 'electricity'),
    w.equals('unit', 'kilowatt hour'),
    w.doesnt_contain_any('name', [
        'market', 'aluminium industry',
        'coal, carbon capture and storage'
    ])
]
fuel_independent = w.doesnt_contain_any('name', (
    'Methane, fossil', 'Sulfur dioxide', 'Carbon monoxide, fossil',
    'Nitrogen oxides', 'Dinitrogen monoxide', 'Particulates'
))
# get_many takes the filters as separate arguments, so unpack the list
for ds in w.get_many(data, *generation_filters):
    change_exchanges_by_constant_factor(
        ds,
        0.8,  # Or whatever from input data
        [w.exclude(apct_products)],
        [fuel_independent]
    )
wurst.transformations.delete_zero_amount_exchanges(data, drop_types=None)
Drop all zero value exchanges from a list of datasets.
drop_types is an optional list of strings, giving the type of exchanges to drop; default is to drop all types.
Returns the modified data.
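The documented behaviour can be sketched with a simplified stand-in. The function drop_zero_exchanges below is written for this example and is not the wurst implementation; the dataset is invented.

```python
def drop_zero_exchanges(data, drop_types=None):
    """Remove zero-amount exchanges, optionally only for some types."""
    for ds in data:
        ds["exchanges"] = [
            exc for exc in ds["exchanges"]
            if not (
                exc["amount"] == 0
                and (drop_types is None or exc["type"] in drop_types)
            )
        ]
    return data


data = [{
    "name": "example activity",
    "exchanges": [
        {"name": "product", "type": "production", "amount": 1.0},
        {"name": "unused input", "type": "technosphere", "amount": 0.0},
        {"name": "unused emission", "type": "biosphere", "amount": 0.0},
    ],
}]
# Drop only zero-amount technosphere exchanges
drop_zero_exchanges(data, drop_types=["technosphere"])
remaining = [exc["name"] for exc in data[0]["exchanges"]]
```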
wurst.transformations.rescale_exchange(exc, value, remove_uncertainty=True)
Dummy function to rescale exchange amount and uncertainty.
This depends on some code being separated from Ocelot, which will take a bit of time.
• exc is an exchange dataset.
• value is a number, to be multiplied by the existing amount.
• remove_uncertainty: Remove (unscaled) uncertainty data, default is True.
Returns the modified exchange.
wurst.transformations.empty_market_dataset(ds, exclude=None)
Remove input exchanges from a market dataset, in preparation for input exchanges defined by an external data
source.
Removes all exchanges which have the same flow as the reference product of the exchange. exclude is an
iterable of activity names to exclude.
9.1.6 Brightway IO
wurst.brightway.extract_database.extract_brightway2_databases(database_names,
add_properties=False,
add_identifiers=False)
Extract a Brightway2 SQLiteBackend database to the Wurst internal format.
database_names is a list of database names. You should already be in the correct project.
Returns a list of dataset documents.
wurst.brightway.write_database.write_brightway2_database(data, name)
Write a new database as a new Brightway2 database named name.
You should be in the correct project already.
This function will do the following:
• Change the database name for all activities and internal exchanges to name. All activities will have the
new database name, even if the original data came from multiple databases.
• Relink exchanges using the default fields: ('name', 'product', 'location', 'unit').
• Check that all internal links resolve to actual activities. If the input value is ('name', 'bar'), there
must be an activity with the code bar.
• Check to make sure that all activity codes are unique
• Write the data to a new Brightway2 SQLite database
Will raise an assertion error if name already exists.
Doesn’t return anything.
CHAPTER 10
Indices and tables
• genindex
• modindex
• search
Index
A
allocate_inputs() (in module wurst.transformations.geo)
B
biosphere() (in module wurst.searching)
C
change_db_name() (in module wurst.linking)
change_exchanges_by_constant_factor() (in module wurst.transformations.activity)
check_duplicate_codes() (in module wurst.linking)
check_internal_linking() (in module wurst.linking)
contains() (in module wurst.searching)
copy_to_new_location() (in module wurst.transformations.geo)
D
default_global_location() (in module wurst.transformations)
delete_zero_amount_exchanges() (in module wurst.transformations)
doesnt_contain_any() (in module wurst.searching)
E
either() (in module wurst.searching)
empty_market_dataset() (in module wurst.transformations)
equals() (in module wurst.searching)
exclude() (in module wurst.searching)
extract_brightway2_databases() (in module wurst.brightway.extract_database)
G
get_many() (in module wurst.searching)
get_one() (in module wurst.searching)
L
link_internal() (in module wurst.linking)
P
production() (in module wurst.searching)
R
reference_product() (in module wurst.searching)
relink_technosphere_exchanges() (in module wurst.transformations.geo)
rescale_exchange() (in module wurst.transformations)
S
startswith() (in module wurst.searching)
T
technosphere() (in module wurst.searching)
W
write_brightway2_database() (in module wurst.brightway.write_database)