By default, the configuration file is expected to be named config.yaml and placed at the root of the repository. Set the environment variable ODAP_CONFIG_PATH to use a different file.
This makes it possible to keep multiple configurations in the same directory, each used by a different computation job.
Example: ODAP_CONFIG_PATH="_config/config-customer.yaml"
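
The lookup described above can be sketched roughly as follows. This is only an illustration, not the framework's actual implementation: the function name resolve_config_path and the choice to resolve a relative ODAP_CONFIG_PATH against the repository root are assumptions.

```python
import os
from pathlib import Path

DEFAULT_CONFIG_NAME = "config.yaml"


def resolve_config_path(repo_root, env=None):
    """Return the config file path, preferring ODAP_CONFIG_PATH when it is set.

    Hypothetical helper: the real framework may resolve the path differently.
    """
    env = os.environ if env is None else env
    override = env.get("ODAP_CONFIG_PATH")
    if override:
        # Assumption: a relative override is resolved against the repository root.
        return Path(repo_root) / override
    # Default: config.yaml at the root of the repository.
    return Path(repo_root) / DEFAULT_CONFIG_NAME
```

Passing env explicitly makes the helper easy to try out without touching the real environment, for example resolve_config_path("/repo", env={"ODAP_CONFIG_PATH": "_config/config-customer.yaml"}).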
The source directories containing the feature notebooks to be calculated must be configured.
feature_sources takes a list of objects with the keys path (mandatory) and prefix (optional).
path is a path to one of the feature source directories, relative to the location of the orchestration notebook from which the calculation is called.
prefix is an optional string which, if given, is prepended to each feature name from the given feature source. For example, if path contains notebooks producing features called gender and transaction_count_30d, then with the prefix global these features will be called global_gender and global_transaction_count_30d.
    featurefactory:
      feature_sources:
        - path: "../../odap-test/odap_framework_demo/notebooks"
          prefix: "global" # all feature names are prefixed
        - path: "notebooks"
          # prefix can be omitted
        - path: "hidden/confidential/features"
          prefix: "top_secret"
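
The effect of prefix on feature names can be sketched as below. The helper names are hypothetical and the code only mirrors the naming rule from the example (prefix, underscore, feature name); it does not reproduce how the framework applies it internally.

```python
def prefixed_feature_name(feature_name, prefix=None):
    """Prepend the prefix and an underscore, or return the name unchanged."""
    if not prefix:
        return feature_name
    return f"{prefix}_{feature_name}"


def feature_names_for_source(source, feature_names):
    """Apply one feature source's optional prefix to the features it produces.

    `source` is a dict shaped like one entry of feature_sources.
    """
    if "path" not in source:
        # path is the only mandatory key of a feature source object.
        raise ValueError("each feature source needs a 'path' key")
    prefix = source.get("prefix")
    return [prefixed_feature_name(name, prefix) for name in feature_names]
```

With a source of {"path": "notebooks", "prefix": "global"}, the features gender and transaction_count_30d are renamed as in the example above; a source without prefix leaves the names untouched.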
It is possible to use only some of the feature notebooks for calculations.
To white-list particular notebooks, an include_notebooks section can be specified for each feature source object. It takes a list of the individual notebooks that should be included in the computation. With it, the example above would look as follows:
    featurefactory:
      feature_sources:
        - path: "../../odap-test/odap_framework_demo/notebooks"
          prefix: "global" # all feature names are prefixed
          include_notebooks:
            - "feature_notebook_1"
            - "feature_notebook_3"
            - "feature_notebook_4"
        - path: "notebooks"
          # prefix can be omitted
        - path: "hidden/confidential/features"
          prefix: "top_secret"
To black-list notebooks, there is also an exclude_notebooks section for each feature source object. exclude_notebooks cannot be used without defining the include_notebooks section, so that the inclusion is always stated explicitly when black-listing. To include all notebooks, the wildcard symbol "*" can be used instead of individual notebook names. An example that excludes only some of the notebooks looks as follows:
    featurefactory:
      feature_sources:
        - path: "../../odap-test/odap_framework_demo/notebooks"
          prefix: "global" # all feature names are prefixed
          include_notebooks:
            - "*"
          exclude_notebooks:
            - "feature_notebook_1"
            - "feature_notebook_3"
            - "feature_notebook_4"
        - path: "notebooks"
          # prefix can be omitted
        - path: "hidden/confidential/features"
          prefix: "top_secret"
Both the include_notebooks and exclude_notebooks sections are optional. If neither is specified, all notebooks are included. Notebooks are specified by name, not by their path relative to the root of the feature source directory. It is therefore recommended to use unique, descriptive names for individual feature notebooks: if multiple notebooks share the same name, the configuration affects all of them at once.
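
The selection rules above can be summarised in a short sketch. The function select_notebooks is hypothetical and only illustrates the documented behaviour (matching by notebook name, the "*" wildcard, and exclude_notebooks requiring include_notebooks); the framework's own implementation may differ.

```python
def select_notebooks(notebook_names, include_notebooks=None, exclude_notebooks=None):
    """Return the notebook names that remain after include/exclude filtering."""
    if exclude_notebooks and include_notebooks is None:
        # Black-listing is only allowed together with an explicit inclusion.
        raise ValueError("exclude_notebooks requires include_notebooks")

    if include_notebooks is None or "*" in include_notebooks:
        # No white-list, or the wildcard: every notebook is included.
        selected = list(notebook_names)
    else:
        wanted = set(include_notebooks)
        # Matching is by name only, so same-named notebooks are all affected.
        selected = [name for name in notebook_names if name in wanted]

    excluded = set(exclude_notebooks or [])
    return [name for name in selected if name not in excluded]
```

For instance, given the hypothetical notebooks feature_notebook_1 through feature_notebook_4, calling the function with include_notebooks=["*"] and the three excluded names from the example above leaves only feature_notebook_2.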