Technical reference

Presample packages

Presamples packages are directories that must minimally contain a datapackage.json file, based on the datapackage standard. They may contain other files, depending on the types of resources contained in the presamples package. The file contents are described under “Presample package contents”.

presamples.packaging.create_presamples_package(matrix_data=None, parameter_data=None, name=None, id_=None, overwrite=False, dirpath=None, seed=None, collapse_repeated_indices=True)

Create and populate a new presamples package

The presamples package minimally contains a datapackage file with metadata on the datapackage itself and its associated resources (stored presample arrays and identification of what the values in the arrays represent).

matrix_data: list, optional
list of tuples containing raw matrix data (presamples array, indices, matrix label)
parameter_data: list, optional
list of tuples containing raw parameter data (presamples array, names, label)
name: str, optional
A human-readable name for these samples.
id_: str, optional
Unique id for this collection of presamples. Optional, generated automatically if not set.
overwrite: bool, default=False
If True, replace an existing presamples package with the same id_ if it exists.
dirpath: str, optional
An optional directory path where presamples can be created. If None, a subdirectory of the project folder is used.
seed: {None, int, “sequential”}, optional
Seed used by indexer to return array columns in random order. Can be an integer, “sequential” or None.
collapse_repeated_indices: bool, default=True
Indicates whether samples for the same matrix cell in a given array should be summed. If False then only the last sample values are used.

Notes

Both matrix_data and parameter_data are optional, but at least one must be passed. The format of these two arguments is described in the sections on each argument below.

Matrix and parameter data should have the same number of possible values (i.e. the same number of samples).

Returns:
  • id_ (str) – The unique id_ of the presamples package
  • dirpath (str) – The absolute path of the created directory.
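
The effect of collapse_repeated_indices can be illustrated with a small sketch. This is not the library's implementation: collapse_rows is a hypothetical helper that only mimics the behaviour described above (sum the rows that target the same matrix cell, or keep only the last one), and the index tuples are placeholders.

```python
import numpy as np

def collapse_rows(samples, indices, sum_repeats=True):
    """Merge rows of `samples` that share the same entry in `indices`.

    Hypothetical helper: if sum_repeats is True, rows with the same index
    are summed; otherwise only the last row for each index is kept.
    """
    merged = {}  # index -> row; dicts keep the order in which keys first appear
    for row, index in zip(samples, indices):
        if sum_repeats and index in merged:
            merged[index] = merged[index] + row
        else:
            merged[index] = row.copy()
    unique_indices = list(merged)
    return np.vstack([merged[i] for i in unique_indices]), unique_indices

# Three rows, two of which refer to the same (placeholder) matrix cell.
samples = np.array([[1.0, 2.0], [10.0, 20.0], [0.5, 0.5]])
indices = [("a", "b"), ("c", "d"), ("a", "b")]

summed, summed_indices = collapse_rows(samples, indices, sum_repeats=True)
last, last_indices = collapse_rows(samples, indices, sum_repeats=False)
```

With sum_repeats=True the two rows for ("a", "b") are added together; with sum_repeats=False the third row simply replaces the first.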

Description of the parameter_data argument

The parameter_data argument in create_presamples_package is a list (or other iterable) of tuples containing the following three objects:

  • samples: a two-dimensional NumPy array, where each row contains values for a specific named parameter, and columns represent possible values for these parameters. It is possible to have samples arrays with only one column (i.e. only one observation for the named parameters) or only one row (data on only one named parameter).
  • names: list of parameter names, as strings. The order of the names should be the same as the rows in samples, i.e. the first name corresponds to data in the first row of samples.
  • label: a string which will be used to name the resource. The presamples package does not presently use this label.

Important

There must be exactly as many named parameters in names as there are rows in samples.

Note

It is possible to pass an arbitrary number of (samples, names, label) tuples in parameter_data. Each will be contained in a distinct resource of the presamples package. However,

  1. the names in each tuple must be unique, and
  2. the number of columns in each samples must be identical.

Hint

While there are no restrictions on the strings used as parameter names, using names that are valid identifiers in AST evaluators can prevent problems down the line.
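
The constraints on parameter_data can be checked before a package is created. The sketch below is only an illustration: check_parameter_data is a hypothetical helper, not part of presamples, and the parameter names and labels are invented. It also assumes that "unique" means no name may appear more than once across all tuples.

```python
import numpy as np

def check_parameter_data(parameter_data):
    """Raise ValueError if the tuples break the rules listed above.

    Hypothetical helper; returns the common number of columns (samples).
    """
    seen_names = set()
    n_columns = None
    for samples, names, label in parameter_data:
        samples = np.asarray(samples)
        if samples.ndim != 2:
            raise ValueError(f"{label}: samples must be two-dimensional")
        if samples.shape[0] != len(names):
            raise ValueError(
                f"{label}: {samples.shape[0]} rows but {len(names)} names"
            )
        # Assumption: names must not repeat within or across tuples.
        if len(set(names)) != len(names) or seen_names.intersection(names):
            raise ValueError(f"{label}: repeated parameter names")
        seen_names.update(names)
        if n_columns is None:
            n_columns = samples.shape[1]
        elif samples.shape[1] != n_columns:
            raise ValueError(f"{label}: expected {n_columns} columns")
    return n_columns

rng = np.random.default_rng(42)
parameter_data = [
    # Two named parameters, 100 possible values each (invented names).
    (rng.normal(size=(2, 100)), ["yield_per_ha", "fuel_use"], "agri"),
    # A single named parameter with the same number of columns.
    (rng.uniform(size=(1, 100)), ["transport_km"], "logistics"),
]
n_samples = check_parameter_data(parameter_data)
```

A list that passes this check could then be given to create_presamples_package through its parameter_data argument.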

Description of the matrix_data argument

The matrix_data argument in create_presamples_package is a list (or other iterable) of tuples containing the following three objects:

  • samples: a two-dimensional NumPy array, where each row contains values for a specific matrix element that will be replaced and each column contains values for a given realization of the LCA model. It is possible to have samples arrays with only one column (i.e. only one observation for the matrix elements) or only one row (data on only one matrix element).
  • indices: an iterable with row and (usually) column indices. The ith element of indices refers to the ith row of samples. The exact format of indices depends on the matrix to be constructed. These indices specify exactly where to insert the samples in a specific matrix.
  • matrix label: a string giving the name of the matrix to be modified in the LCA class. The strings currently supported are ‘technosphere’, ‘biosphere’ and ‘cf’.

Important

The number of rows in samples and indices must be identical.

Note

It is possible to pass an arbitrary number of (samples, indices, matrix label) tuples in matrix_data. Each will be contained in a distinct resource of the presamples package. However, the number of columns in each samples must be identical.
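
The same kind of pre-flight check can be sketched for matrix_data. check_matrix_data is a hypothetical helper, and the index tuples below are opaque placeholders chosen for illustration only; the exact format expected for each matrix is not reproduced here.

```python
import numpy as np

# Matrix labels listed as supported in the description above.
SUPPORTED_LABELS = {"technosphere", "biosphere", "cf"}

def check_matrix_data(matrix_data):
    """Raise ValueError if the tuples break the rules listed above.

    Hypothetical helper; returns the common number of columns (samples).
    """
    n_columns = None
    for samples, indices, matrix_label in matrix_data:
        samples = np.asarray(samples)
        indices = list(indices)
        if matrix_label not in SUPPORTED_LABELS:
            raise ValueError(f"unsupported matrix label: {matrix_label!r}")
        if samples.ndim != 2:
            raise ValueError(f"{matrix_label}: samples must be two-dimensional")
        if samples.shape[0] != len(indices):
            raise ValueError(
                f"{matrix_label}: {samples.shape[0]} rows but {len(indices)} indices"
            )
        if n_columns is None:
            n_columns = samples.shape[1]
        elif samples.shape[1] != n_columns:
            raise ValueError(f"{matrix_label}: expected {n_columns} columns")
    return n_columns

# Placeholder indices: (row id, column id) pairs for the technosphere matrix,
# row ids only for characterization factors. Illustrative, not the real format.
technosphere_indices = [("steel", "car"), ("electricity", "steel")]
cf_indices = [("co2",)]

matrix_data = [
    (np.full((2, 50), 1.5), technosphere_indices, "technosphere"),
    (np.full((1, 50), 25.0), cf_indices, "cf"),
]
n_realizations = check_matrix_data(matrix_data)
```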

Presample package contents

datapackage.json

Presample packages are directories that must minimally contain a datapackage.json file, based on the datapackage standard. They may contain other files, depending on the types of resources contained in the presamples package.

The format for the datapackage.json file is:

There can be an arbitrary number of resources. Each resource is represented in the datapackage by a dictionary.
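
As a rough illustration of how the resource list could be inspected, the sketch below writes a heavily simplified datapackage.json to a temporary directory and reads it back. The "resources" key comes from the datapackage standard; every other key and value, and the list_resources helper itself, are assumptions made for the example and do not reproduce the actual presamples format.

```python
import json
import tempfile
from pathlib import Path

def list_resources(dirpath):
    """Return (position, resource dict) pairs from a package's datapackage.json."""
    path = Path(dirpath) / "datapackage.json"
    datapackage = json.loads(path.read_text())
    return list(enumerate(datapackage.get("resources", [])))

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical metadata: only the "resources" list is taken from the
    # datapackage standard; the remaining fields are invented.
    metadata = {
        "name": "example",
        "id": "abc123",
        "resources": [{"label": "params"}, {"label": "technosphere"}],
    }
    (Path(tmp) / "datapackage.json").write_text(json.dumps(metadata))
    resources = list_resources(tmp)
```

The position returned alongside each dictionary corresponds to the index of the resource in the list of resources.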

Named parameters

Resources for named parameters have the following format:

Where:
  • the id is based on the id_ argument passed to create_presamples_package
  • the data package index indicates the position (index) of the resource in the list of resources

Matrices

The last elements (“row from label”, “row to label”, etc.) are used by the presamples.loader.PackagesDataLoader.index_arrays() method to map the resource elements to the LCA matrices.

Loading multiple presample packages

Loading multiple presample packages for use in models requiring one value at a time is done using the PackagesDataLoader class.

class presamples.loader.PackagesDataLoader(dirpaths, seed=None, lca=None)

Load set of presample packages and ready underlying data for use

Named parameters found in presample packages will be assembled into a ConsolidatedIndexedParameterMapping, accessed through the parameters property.

Matrix data found in presample packages are readied to be mapped and inserted into LCA matrices using the corresponding methods.

In both cases, elements (named parameter or matrix element) repeated in multiple presample packages will take the value (and the Indexer) of the last presample package in the list that contains data on the element.

Parameters:
  • dirpaths (iterable of paths to presample packages) – See notes below for information on expected contents of directories.
  • seed ({None, int, array_like, "sequential"}, optional) – Seed value to use for index RNGs. Default is to use the seed values in each package, only specify this if you want to override the default.
  • lca (Brightway2 LCA object) – Used when PackagesDataLoader instantiated from LCA (or MonteCarloLCA) object.
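
To make the role of the seed more concrete, here is a minimal sketch of a column indexer. ColumnIndexer is a hypothetical stand-in, not the Indexer class used by presamples: it assumes that an integer seed gives a reproducible random sequence of column indices, that None gives an unseeded one, and that "sequential" walks through the columns in order and wraps around at the end.

```python
import random

class ColumnIndexer:
    """Hypothetical indexer that yields column indices for a samples array."""

    def __init__(self, ncols, seed=None):
        self.ncols = ncols
        self.sequential = seed == "sequential"
        # A private RNG keeps each indexer independent of global random state.
        self.rng = None if self.sequential else random.Random(seed)
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.sequential:
            # Assumption: wrap around once every column has been used.
            index = self.count % self.ncols
        else:
            index = self.rng.randrange(self.ncols)
        self.count += 1
        return index

sequential = ColumnIndexer(ncols=3, seed="sequential")
first_five = [next(sequential) for _ in range(5)]

seeded_a = ColumnIndexer(ncols=10, seed=7)
seeded_b = ColumnIndexer(ncols=10, seed=7)
draws_a = [next(seeded_a) for _ in range(20)]
draws_b = [next(seeded_b) for _ in range(20)]
```

Two indexers created with the same integer seed return the same sequence of columns, which is what makes a seeded run repeatable.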

Notes

  1. Accessing and using loaded named parameters

The returned PackagesDataLoader instance allows access to loaded parameter data via the parameters property.

  2. Using loaded matrix data in LCA

When used for LCA within the Brightway2 framework, the PackagesDataLoader instance is an attribute of the LCA (or MonteCarloLCA) instance. The LCA instance will call the method index_arrays in order to identify the matrix indices of values that will be overwritten, and the update_matrices method to update values.

Warning

The order of the passed presample package dirpaths is key! Every matrix element or named parameter that is repeated in multiple presample packages will take the value from the last presample package that is passed. Values from earlier packages are not used.

Warning

Note that we currently assume that all index values in matrix data will be present in the built matrices of the LCA instance. Silent errors or losses in efficiency could happen if this assumption does not hold.
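
One way to guard against this assumption is to check the indices against the model's row and column mappings before running a calculation. The sketch below is only an illustration: find_missing_indices is a hypothetical helper, and row_dict and col_dict are plain dictionaries standing in for whatever label-to-position mappings the LCA instance provides.

```python
def find_missing_indices(indices, row_dict, col_dict):
    """Return the (row, col) pairs whose labels are absent from the mappings."""
    return [
        (row, col)
        for row, col in indices
        if row not in row_dict or col not in col_dict
    ]

# Stand-in mappings from labels to matrix positions (invented labels).
row_dict = {"steel": 0, "electricity": 1}
col_dict = {"car": 0}

indices = [("steel", "car"), ("electricity", "truck"), ("coal", "car")]
missing = find_missing_indices(indices, row_dict, col_dict)
```

An empty result means every index can be mapped; anything else points to matrix data that targets cells not present in the built matrices.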

When a PackagesDataLoader instance is created, parameter and matrix data are loaded automatically by invoking the load_data method on each path in dirpaths:

classmethod PackagesDataLoader.load_data(dirpath, seed=None)

Load data and metadata from a directory containing a presamples package

Parameters:
  • dirpath (str or Pathlike object) – path to a presamples package
  • seed ({None, int, array_like, "sequential"}, optional) – Only specify this if you want to override seed value in presamples package.
Returns: Dictionary with loaded data
Return type: dict

Loaded data can then be parsed for accessing consolidated parameters or for injecting data in LCA matrices.

Using named parameters in PackagesDataLoader

With load_data, all presample packages are loaded. However, to ensure that only the last presample package with data on a specific named parameter is used, parameters are consolidated. The consolidated parameters are available via the parameters property. parameters points to a ConsolidatedIndexedParameterMapping object.

class presamples.loader.ConsolidatedIndexedParameterMapping(list_IPM)

Interface for consolidated named parameters in set of presample packages

Map all named parameters in a list of IndexedParameterMapping objects to presample arrays and Indexers identified in the last presample package that contains data on the named parameter.

This allows named parameters to be overwritten by successive presample packages.

Typically called directly from a PackagesDataLoader instance.

Parameters:
  • list_IPM (list) – List of IndexedParameterMapping objects. The IndexedParameterMapping (IPM) objects are typically created by a PackagesDataLoader instance.

Important

The order of the IPMs is crucial, as named parameters in later IPMs overwrite data from earlier IPMs.
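
The "last one wins" rule can be sketched with plain dictionaries. This is not how ConsolidatedIndexedParameterMapping is implemented: consolidate is a hypothetical function, and each package is reduced to an identifier plus a {name: value} dictionary.

```python
def consolidate(packages):
    """Consolidate named parameters, letting later packages overwrite earlier ones.

    `packages` is a list of (package_id, {name: value}) tuples in load order.
    Returns three dicts: the retained values, the package each value came
    from, and the packages whose values were overwritten.
    """
    values, source, replaced = {}, {}, {}
    for package_id, parameters in packages:
        for name, value in parameters.items():
            if name in source:
                # Remember which earlier package is being overwritten.
                replaced.setdefault(name, []).append(source[name])
            values[name] = value
            source[name] = package_id
    return values, source, replaced

packages = [
    ("pkg_1", {"a": 1.0, "b": 2.0}),
    ("pkg_2", {"b": 20.0, "c": 30.0}),
    ("pkg_3", {"b": 200.0}),
]
values, source, replaced = consolidate(packages)
```

Here "b" appears in all three packages, so only the value from pkg_3 is retained, and pkg_1 and pkg_2 are recorded as replaced for that name.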

Notes

The ConsolidatedIndexedParameterMapping (CIPM) instance can be used to access the following properties:

  • names: names of all n named parameters
  • ipm_mapper: dict {parameter name: IndexedParameterMapping}, identifying the IndexedParameterMapping used for a given named parameter.
  • consolidated_array: array of shape (n,), giving access to the values for the n named parameters
  • consolidated_indices: array of shape (n,), giving access to the index values for the n named parameters in their respective IndexedParameterMapping
  • ids: dict of format {named parameters: ids}, where ids are tuples of (presamples package path, presamples package name, name of parameter). ids only contains information about the last presamples package with data on the named parameter.
  • replaced: dict of format {named parameters: ids of presample packages that were overwritten}

consolidated_array

Array of values for the named parameters

Each value is taken from the last IndexedParameterMapping object that contains data on the named parameter. The used IndexedParameterMapping contains information about the path to the presamples array, the corresponding mapping for the named parameter and the current Indexer value.

consolidated_indices

Return the index value for the IndexedParameterMapping used for each name

Using PackagesDataLoader with LCA

Brightway LCA (and MonteCarloLCA) objects can seamlessly integrate presample packages.

>>> from brightway2 import LCA
>>> lca = LCA(demand={product:1}, presamples=[pp_path1, pp_path2, pp_path3])

This instantiates a PackagesDataLoader as described above.

It then indexes arrays:

PackagesDataLoader.index_arrays(lca)

Add row and column values to the indices.

As this function can be called multiple times, we check for each element whether it has already been indexed, and whether the required mapping dictionary is present.

Finally, data from the correct columns in the presamples arrays are inserted in the LCA matrices:

PackagesDataLoader.update_matrices(lca=None, matrices=None, advance_indices=True)

Update the LCA instance matrices from presamples
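
The two steps above (indexing, then updating) can be sketched on a toy example. The functions below are hypothetical and only illustrate the idea: labels are translated to row and column positions once, then one column of the samples array is written into the matrix. A dense NumPy array is used to keep the example dependency-free; the matrices of an actual LCA instance may be stored differently.

```python
import numpy as np

def index_rows_cols(indices, row_dict, col_dict):
    """Translate (row label, column label) pairs into integer positions."""
    rows = np.array([row_dict[row] for row, _ in indices])
    cols = np.array([col_dict[col] for _, col in indices])
    return rows, cols

def update_matrix(matrix, samples, rows, cols, column):
    """Overwrite the targeted cells with one column of the samples array."""
    matrix[rows, cols] = samples[:, column]

# Stand-in mappings and data (invented labels and values).
row_dict = {"steel": 0, "electricity": 1}
col_dict = {"car": 0, "steel": 1}
indices = [("steel", "car"), ("electricity", "steel")]
samples = np.array([[900.0, 950.0, 1000.0], [0.5, 0.6, 0.7]])

matrix = np.zeros((2, 2))
rows, cols = index_rows_cols(indices, row_dict, col_dict)
update_matrix(matrix, samples, rows, cols, column=1)
```

Because the positions are computed once and reused, moving to another column of samples only requires repeating the second step with a different column value.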