Common configurations like segmentation logging and export destinations are in the main configuration file config.yaml
located in the project root.
Additional use-case-specific configurations are in config.yaml files located in custom-named folders under the /use_cases folder.
├── use_cases
│   └── upsell_uc
│       ├── segments
│       │   ├── customer_loan_interest.sql
│       │   └── customer_mortgage_interest.sql
│       └── config.yaml
├── exporters
│   └── azure_blob.py
└── config.yaml
config.yaml
- main configuration file
use_cases/<use_case_name>/config.yaml
- configuration file of a specific use case
use_cases/<use_case_name>/segments/<segment_name>.sql
- SQL or PySpark notebook with a specific segment
exporters/<exporter_name>.py
- custom exporter

Each segmentation run writes a log and segments to Hive tables. The tables must be configured as shown below.
catalog
- catalog where all the segments databases will be stored
database
- database where all the segments tables will be stored
table
- the name of the Hive table
path
- the physical location of the data; path is optional, and when it is not present a managed table using either DBFS or Unity Catalog settings is used
# /config.yaml
segmentfactory:
catalog: "hive_metastore"
database: "odap_segments"
log:
table: "export_logs"
path: "dbfs:/odap_segments/export_logs.delta"
segment:
table: "segments"
path: "dbfs:/odap_segments/segments.delta"
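To make the structure above concrete, here is a minimal sketch of working with these settings once parsed; the framework reads the file for you, so this is for illustration only, and the fully qualified `<catalog>.<database>.<table>` name format is an assumption based on the keys above:

```python
# The settings from /config.yaml above, shown as the parsed Python structure
# (i.e. what a YAML loader would return for the "segmentfactory" section).
config = {
    "segmentfactory": {
        "catalog": "hive_metastore",
        "database": "odap_segments",
        "log": {"table": "export_logs", "path": "dbfs:/odap_segments/export_logs.delta"},
        "segment": {"table": "segments", "path": "dbfs:/odap_segments/segments.delta"},
    }
}

factory = config["segmentfactory"]

# Fully qualified Hive table name: <catalog>.<database>.<table> (assumed format)
log_table = f"{factory['catalog']}.{factory['database']}.{factory['log']['table']}"
print(log_table)  # hive_metastore.odap_segments.export_logs

# "path" is optional; when it is absent, a managed table is used instead.
segment_path = factory["segment"].get("path")
print(segment_path)  # dbfs:/odap_segments/segments.delta
```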
my_azure_blob_export
- the name of the export (as shown below)
type
- the value must match the name of some predefined or custom exporter (see Creating a custom exporter)
attributes
- attributes selected from the Feature store and included in the final segment dataframe; attributes is a dictionary whose keys are entity names and values are lists of attributes
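Putting these keys together, a use-case configuration might look like the following sketch. The exact schema is not specified here, so the surrounding keys (exports, the entity name customer, and the attribute names) are illustrative assumptions; type matches the azure_blob exporter from the folder structure above:

```yaml
# /use_cases/upsell_uc/config.yaml (illustrative sketch; surrounding keys are assumptions)
exports:
  my_azure_blob_export:
    # must match the name of a predefined or custom exporter,
    # here exporters/azure_blob.py
    type: "azure_blob"
    # dictionary of entity name -> list of Feature store attributes
    # to include in the final segment dataframe
    attributes:
      customer:
        - "age"
        - "loan_interest_score"
```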