Technical reference¶
Presample packages¶
Presamples packages are directories that must minimally contain a datapackage.json
file, based on the
datapackage standard.
They may contain other files, depending on the types of resources contained in the presamples package.
The file contents are described in Presample package contents.
presamples.packaging.create_presamples_package(matrix_data=None, parameter_data=None, name=None, id_=None, overwrite=False, dirpath=None, seed=None, collapse_repeated_indices=True)¶
Create and populate a new presamples package.
The presamples package minimally contains a datapackage file with metadata on the datapackage itself and its associated resources (stored presample arrays and identification of what the values in the arrays represent).
- matrix_data: list, optional
- list of tuples containing raw matrix data (presamples array, indices, matrix label)
- parameter_data: list, optional
- list of tuples containing raw parameter data (presamples array, names, label)
- name: str, optional
- A human-readable name for these samples.
- id_: str, optional
- Unique id for this collection of presamples. Optional, generated automatically if not set.
- overwrite: bool, default=False
- If True, replace an existing presamples package with the same id_ if it exists.
- dirpath: str, optional
- An optional directory path where presamples can be created. If None, a subdirectory in the project folder is used.
- seed: {None, int, “sequential”}, optional
- Seed used by indexer to return array columns in random order. Can be an integer, “sequential” or None.
- collapse_repeated_indices: bool, default=True
- Indicates whether samples for the same matrix cell in a given array should be summed. If False then only the last sample values are used.
Notes
Both matrix_data and parameter_data are optional, but at least one needs to be passed. Both matrix and parameter data should have the same number of possible values (i.e. the same number of samples).
The documentation provides more information on the format of these two arguments.
Returns:
- id_ (str) – The unique id_ of the presamples package
- dirpath (str) – The absolute path of the created directory.
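The effect of collapse_repeated_indices described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's implementation: the helper name collapse_rows is hypothetical, nested lists stand in for the two-dimensional array, and the ordering of the result is an assumption.

```python
def collapse_rows(samples, indices, collapse_repeated_indices=True):
    """Combine rows of `samples` that share the same entry in `indices`.

    `samples` is a list of rows (a stand-in for a 2-D array) and `indices`
    holds one hashable index per row. Hypothetical helper for illustration.
    """
    combined = {}  # dicts keep insertion order, so first-seen order is kept
    for index, row in zip(indices, samples):
        if collapse_repeated_indices and index in combined:
            # Sum column by column with the rows already seen for this cell.
            combined[index] = [a + b for a, b in zip(combined[index], row)]
        else:
            # First occurrence, or "last row wins" when collapsing is disabled.
            combined[index] = list(row)
    return list(combined.keys()), list(combined.values())


samples = [[1.0, 2.0], [10.0, 20.0], [0.5, 0.5]]
indices = [("a", "b"), ("c", "d"), ("a", "b")]  # first and third rows repeat

summed_indices, summed = collapse_rows(samples, indices, True)
last_indices, last = collapse_rows(samples, indices, False)
```

With collapsing enabled, the two rows for the repeated cell are added together; with it disabled, only the later row survives.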
Description of the parameter_data argument¶
The parameter_data argument in create_presamples_package is a list (or other iterable) of tuples containing the following three objects:
- samples: a two-dimensional NumPy array, where each row contains values for a specific named parameter, and columns represent possible values for these parameters. It is possible to have samples arrays with only one column (i.e. only one observation for the named parameters) or with only one row (data on only one named parameter).
- names: list of parameter names, as strings. The order of the names should be the same as the rows in samples, i.e. the first name corresponds to data in the first row of samples.
- label: a string which will be used to name the resource. The presamples package does not presently use this label.
Important
There must be as many named parameters in names as there are rows in samples.
Note
It is possible to pass an arbitrary number of (samples, names, label) tuples in parameter_data. Each will be contained in a distinct resource of the presamples package. However,
- the names in each tuple must be unique, and
- the number of columns in each samples array must be identical.
Hint
While there are no restrictions on the strings used as parameter names, using strings that are valid names in AST evaluators can prevent problems down the line.
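The rules above can be checked before a package is created. The following is a minimal sketch, not part of presamples: the function name check_parameter_data and the example names are hypothetical, nested lists stand in for NumPy arrays (a real 2-D array would work the same way here), and name uniqueness is checked across all tuples, which is a conservative reading of the note. Names that are not valid Python identifiers are only reported, in line with the hint.

```python
def check_parameter_data(parameter_data):
    """Validate (samples, names, label) tuples; hypothetical helper.

    Returns (number of columns, names that are not valid identifiers).
    Raises ValueError when one of the documented rules is broken.
    """
    seen_names = set()
    advisory = []
    n_columns = None
    for samples, names, label in parameter_data:
        if len(samples) != len(names):
            raise ValueError(f"{label}: {len(names)} names for {len(samples)} rows")
        widths = {len(row) for row in samples}
        if len(widths) != 1:
            raise ValueError(f"{label}: rows must be non-empty and of equal length")
        width = widths.pop()
        if n_columns is None:
            n_columns = width
        elif width != n_columns:
            raise ValueError(f"{label}: expected {n_columns} columns, got {width}")
        for name in names:
            if name in seen_names:
                raise ValueError(f"{label}: duplicate parameter name {name!r}")
            seen_names.add(name)
            if not name.isidentifier():
                advisory.append(name)  # allowed, but may cause trouble later
    return n_columns, advisory


parameter_data = [
    ([[1.0, 1.1, 1.2], [5.0, 5.5, 6.0]], ["crop_yield", "loss_rate"], "agronomy"),
    ([[0.3, 0.4, 0.5]], ["share of diesel"], "energy"),
]
n_columns, advisory = check_parameter_data(parameter_data)
```

Here both resources have three columns, and "share of diesel" is flagged because it contains spaces.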
Description of the matrix_data argument¶
The matrix_data argument in create_presamples_package is a list (or other iterable) of tuples containing the following three objects:
- samples: a two-dimensional NumPy array, where each row contains values for a specific matrix element that will be replaced and each column contains values for a given realization of the LCA model. It is possible to have samples arrays with only one column (i.e. only one observation for the matrix elements) or with only one row (data on only one matrix element).
- indices: an iterable with row and (usually) column indices. The ith element of indices refers to the ith row of samples. The exact format of indices depends on the matrix to be constructed. These indices specify exactly where to insert the samples into a specific matrix.
- matrix label: a string giving the name of the matrix to be modified in the LCA class. Strings that are currently supported are ‘technosphere’, ‘biosphere’ and ‘cf’.
Important
The number of rows in samples and indices must be identical.
Note
It is possible to pass an arbitrary number of (samples, indices, matrix label) tuples in matrix_data. Each will be contained in a distinct resource of the presamples package. However, the number of columns in each samples array must be identical.
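A short sketch of these constraints follows. It is not part of presamples: check_matrix_data is a hypothetical helper, nested lists stand in for NumPy arrays, and the index tuples are placeholders, since the exact format of indices depends on the matrix.

```python
SUPPORTED_LABELS = {"technosphere", "biosphere", "cf"}  # as listed above


def check_matrix_data(matrix_data):
    """Validate (samples, indices, matrix label) tuples; hypothetical helper.

    Returns the common number of columns, or raises ValueError.
    """
    column_counts = set()
    for samples, indices, matrix_label in matrix_data:
        if matrix_label not in SUPPORTED_LABELS:
            raise ValueError(f"unsupported matrix label: {matrix_label!r}")
        indices = list(indices)  # indices may be any iterable
        if len(samples) != len(indices):
            raise ValueError(
                f"{matrix_label}: {len(indices)} indices for {len(samples)} rows"
            )
        column_counts.update(len(row) for row in samples)
    if len(column_counts) > 1:
        raise ValueError(f"inconsistent column counts: {sorted(column_counts)}")
    return column_counts.pop() if column_counts else 0


matrix_data = [
    # Placeholder indices: one entry per row of samples.
    ([[0.9, 1.1], [2.0, 2.2]], [("in_1", "out_1"), ("in_2", "out_1")], "technosphere"),
    ([[0.01, 0.02]], [("flow_1", "out_1")], "biosphere"),
]
n_columns = check_matrix_data(matrix_data)
```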
Presample package contents¶
datapackage.json¶
Presample packages are directories that must minimally contain a datapackage.json
file, based on the
datapackage standard.
They may contain other files, depending on the types of resources contained in the presamples package.
The format for the datapackage.json file is:
There can be an arbitrary number of resources. Each resource is represented in the datapackage by a dictionary.
Named parameters¶
Resources for named parameters have the following format:
Where:
- the id is based on the id_ argument passed to create_presamples_package
- the data package index indicates the position (index) of the resource in the list of resources
Matrices¶
The last elements (“row from label”, “row to label”, etc.) are used by the presamples.loader.PackagesDataLoader.index_arrays()
method to map the resource elements to the LCA matrices.
Loading multiple presample packages¶
Loading multiple presample packages for use in models requiring one value at a time is done using the
PackagesDataLoader
class.
class presamples.loader.PackagesDataLoader(dirpaths, seed=None, lca=None)¶
Load a set of presample packages and ready the underlying data for use.
Named parameters found in presample packages will be assembled into a ConsolidatedIndexedParameterMapping, accessed through the parameters property.
Matrix data found in presample packages are readied to be mapped and inserted into LCA matrices using the corresponding methods.
In both cases, elements (named parameter or matrix element) repeated in multiple presample packages will take the value (and the Indexer) of the last presample package in the list that contains data on the element.
Parameters:
- dirpaths (iterable of paths to presample packages) – See notes below for information on expected contents of directories.
- seed ({None, int, array_like, "sequential"}, optional) – Seed value to use for index RNGs. Default is to use the seed values in each package; only specify this if you want to override the default.
- lca (Brightway2 LCA object) – Used when PackagesDataLoader is instantiated from an LCA (or MonteCarloLCA) object.
Notes
- Accessing and using loaded named parameters
The returned PackagesDataLoader instance allows access to loaded parameter data via the parameters property.
- Using loaded matrix data in LCA
When used for LCA within the Brightway2 framework, the PackagesDataLoader instance is an attribute of the LCA (or MonteCarloLCA) instance. The LCA instance will call the index_arrays method to identify the matrix indices of values that will be overwritten, and the update_matrices method to update values.
Warning
The order of the passed presample package dirpaths is key! Every matrix element or named parameter that is repeated in multiple presample packages will take the value of the last presample package that is passed. All earlier values are ignored.
Warning
We currently assume that all index values in matrix data will be present in the built matrices of the LCA instance. Silent errors or losses in efficiency could happen if this assumption does not hold.
When creating a PackagesDataLoader instance, parameter and matrix data are automatically loaded by invoking the load_data method on each path in dirpaths:
classmethod PackagesDataLoader.load_data(dirpath, seed=None)¶
Load data and metadata from a directory containing a presamples package.
Parameters:
- dirpath (str or Pathlike object) – path to a presamples package
- seed ({None, int, array_like, "sequential"}, optional) – Only specify this if you want to override the seed value in the presamples package.
Returns: Dictionary with loaded data
Return type: dict
Loaded data can then be parsed for accessing consolidated parameters or for injecting data in LCA matrices.
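The seed options accepted above (None, an integer, or "sequential") decide which column of a presamples array is used at each draw. The sketch below illustrates one way such an indexer could behave. It is not the library's Indexer class: ColumnIndexer is a hypothetical name, and starting at column 0 and wrapping around in sequential mode are assumptions.

```python
import random


class ColumnIndexer:
    """Hypothetical helper yielding column positions for `ncols` columns."""

    def __init__(self, ncols, seed=None):
        self.ncols = ncols
        self.sequential = seed == "sequential"
        self.count = 0
        # An int seed gives a reproducible stream; None seeds from the OS.
        self.rng = None if self.sequential else random.Random(seed)

    def __next__(self):
        if self.sequential:
            index = self.count % self.ncols  # wrap around after the last column
        else:
            index = self.rng.randrange(self.ncols)
        self.count += 1
        return index


sequential = ColumnIndexer(ncols=3, seed="sequential")
first_pass = [next(sequential) for _ in range(5)]

# Two indexers built with the same integer seed produce the same columns.
a = ColumnIndexer(ncols=100, seed=42)
b = ColumnIndexer(ncols=100, seed=42)
same_stream = [next(a) for _ in range(10)] == [next(b) for _ in range(10)]
```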
Using named parameters in PackagesDataLoader¶
With load_data, all presample packages are loaded. However, to ensure that only the last presample package with data on a specific named parameter is used, parameters are consolidated. The consolidated parameters are available via the parameters property, which points to a ConsolidatedIndexedParameterMapping object.
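As a hedged illustration of reading consolidated parameters, the helper below pairs the names and consolidated_array properties documented for ConsolidatedIndexedParameterMapping. The function name parameters_as_dict is hypothetical, and it assumes that both properties list the parameters in the same order. A stand-in object is used so that the sketch runs without presamples installed.

```python
from types import SimpleNamespace


def parameters_as_dict(loader):
    """Return {name: current value} from a loader's consolidated parameters.

    Assumes `loader.parameters` exposes `names` and `consolidated_array`
    in matching order (an assumption, not confirmed by the text).
    """
    params = loader.parameters
    return dict(zip(params.names, params.consolidated_array))


# Stand-in mimicking the documented attributes; not a real PackagesDataLoader.
fake_loader = SimpleNamespace(
    parameters=SimpleNamespace(names=["a", "b"], consolidated_array=[1.5, 20.0])
)
current = parameters_as_dict(fake_loader)
```

With the library installed, the same function would presumably accept an instance created as PackagesDataLoader(dirpaths), following the class signature given earlier.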
class presamples.loader.ConsolidatedIndexedParameterMapping(list_IPM)¶
Interface for consolidated named parameters in a set of presample packages.
Map all named parameters in a list of IndexedParameterMapping objects to presample arrays and Indexers identified in the last presample package that contains data on the named parameter.
This allows named parameters to be overwritten by successive presample packages.
Typically called directly from a PackagesDataLoader instance.
Parameters: list_IPM (list) – List of IndexedParameterMapping objects. The IndexedParameterMapping (IPM) objects are typically created by a PackagesDataLoader instance.
Important
The order of the IPMs is crucial, as named parameters in later IPMs overwrite data from earlier IPMs.
Notes
The CIPM instance can be used to access the following properties:
- names: names of all n named parameters
- ipm_mapper: dict {parameter name: IndexedParameterMapping}, identifying the IndexedParameterMapping used for a given named parameter.
- consolidated_array: array of shape (n,), giving access to the values for the n named parameters
- consolidated_indices: array of shape (n,), giving access to the index values for the n named parameters in their respective IndexedParameterMapping
- ids: dict of format {named parameters: ids}, where ids are tuples of (presamples package path, presamples package name, name of parameter). ids only contains information about the last presamples package with data on the named parameter.
- replaced: dict of format {named parameters: ids of presample packages that were overwritten}
consolidated_array¶
Array of values for the named parameters.
Each value is taken from the last IndexedParameterMapping object that contains data on the named parameter. The IndexedParameterMapping used contains information about the path to the presamples array, the corresponding mapping for the named parameter and the current Indexer value.
consolidated_indices¶
Return the index value for the IndexedParameterMapping used for each name.
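The "last package wins" rule behind this class can be sketched in plain Python. This is a simplified illustration, not the library's code: consolidate is a hypothetical function, each package is reduced to an identifier plus a {name: value} dict, and the shapes of ids and replaced are simplified compared with the properties described above.

```python
def consolidate(packages):
    """Apply the overwrite rule to `packages`, given in loading order.

    `packages` is a list of (package_id, {name: value}) pairs.
    Returns (values, ids, replaced), loosely mirroring the CIPM properties.
    """
    values, ids, replaced = {}, {}, {}
    for package_id, parameters in packages:
        for name, value in parameters.items():
            if name in ids:
                # Remember which earlier package is being overwritten.
                replaced.setdefault(name, []).append(ids[name])
            values[name] = value
            ids[name] = package_id
    return values, ids, replaced


packages = [
    ("pkg_1", {"a": 1.0, "b": 2.0}),
    ("pkg_2", {"b": 20.0, "c": 30.0}),
    ("pkg_3", {"b": 200.0}),
]
values, ids, replaced = consolidate(packages)
```

Parameter b appears in all three packages, so its value comes from pkg_3, and pkg_1 and pkg_2 are recorded as replaced.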
Using PackagesDataLoader with LCA¶
Brightway LCA (and MonteCarloLCA) objects can seamlessly integrate presample packages.
>>> from brightway2 import LCA
>>> lca = LCA(demand={product:1}, presamples=[pp_path1, pp_path2, pp_path3])
This instantiates a PackagesDataLoader as described above.
It then indexes arrays:
PackagesDataLoader.index_arrays(lca)¶
Add row and column values to the indices.
As this function can be called multiple times, we check for each element if it has already been called, and whether the required mapping dictionary is present.
Finally, data from the correct columns in the presamples arrays are inserted in the LCA matrices:
PackagesDataLoader.update_matrices(lca=None, matrices=None, advance_indices=True)¶
Update the LCA instance matrices from presamples.
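Conceptually, updating a matrix means taking one column of a presamples array and writing each row's value into the matrix cell identified for that row. The sketch below shows this idea under stated assumptions. It is not the library's update_matrices method: update_matrix is a hypothetical function, a dict keyed by (row, column) stands in for a sparse matrix, and the column is chosen by hand rather than by an Indexer.

```python
def update_matrix(matrix, samples, positions, column):
    """Overwrite matrix cells with one column of `samples`.

    `positions[i]` is the (row, col) cell targeted by `samples[i]`.
    Hypothetical helper; the input dict is modified and returned.
    """
    for (row, col), sample_row in zip(positions, samples):
        matrix[(row, col)] = sample_row[column]
    return matrix


# Dict stand-in for a small sparse matrix.
matrix = {(0, 0): 1.0, (0, 1): -0.5, (1, 1): 1.0}
samples = [[-0.4, -0.6, -0.7], [0.9, 1.1, 1.2]]  # 2 elements, 3 realizations
positions = [(0, 1), (1, 1)]

# Work on a copy so the original matrix is left untouched.
updated = update_matrix(dict(matrix), samples, positions, column=2)
```

Cells that have no presamples, such as (0, 0) here, keep their original values.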