The basic recommendation is to group features with the same computation period into the same table.
For example, suppose I have three feature notebooks:
customer_gender
monthly_repayments
daily_transactions
Each of these is computed with a different period, so they should go into three separate tables.
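The grouping rule above can be sketched in a few lines of Python. This is only an illustration: the period labels ("static", "monthly", "daily") and the `features_<period>` table naming scheme are assumptions made for the example, not names defined by any particular feature store.

```python
from collections import defaultdict

# Hypothetical mapping of feature notebook -> computation period.
# The period labels are assumed for illustration only.
NOTEBOOK_PERIODS = {
    "customer_gender": "static",
    "monthly_repayments": "monthly",
    "daily_transactions": "daily",
}


def group_by_period(notebook_periods):
    """Assign notebooks that share a computation period to the same table."""
    tables = defaultdict(list)
    for notebook, period in notebook_periods.items():
        # Hypothetical naming scheme: one table per computation period.
        tables[f"features_{period}"].append(notebook)
    return dict(tables)


tables = group_by_period(NOTEBOOK_PERIODS)
for table_name, notebooks in tables.items():
    print(table_name, notebooks)
```

With these three notebooks every period is distinct, so each notebook ends up in its own table. A fourth notebook with a daily period would join `daily_transactions` in the same table.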
Beyond the computation period, it is hard to give a general rule for grouping features into tables.
It depends on how many feature columns each notebook calculates, whether different input data arrive at different times, and so on.
Our recommendation for a project with N notebooks is to have approximately N/10 tables, where each table corresponds to a particular dataset, e.g. web data, card transactions, account transactions, etc. When one of these tables grows too large in number of columns, we recommend splitting it into two, keeping roughly 150-300 columns in each table.
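As a rough sketch of the sizing guidance above, the following Python helpers estimate a table count from the number of notebooks and split an oversized table in two. The function names and the choice of 300 columns as the split threshold are assumptions for illustration; the text only gives 150-300 columns per table as a rough target.

```python
import math


def suggested_table_count(n_notebooks):
    """Approximate N/10 rule of thumb, with at least one table."""
    return max(1, round(n_notebooks / 10))


def split_columns(columns, max_columns=300):
    """Split a table's column list into two halves once it exceeds max_columns.

    max_columns=300 is an assumed threshold taken from the upper end of the
    150-300 range; tables at or below it are returned unchanged.
    """
    if len(columns) <= max_columns:
        return [columns]
    half = math.ceil(len(columns) / 2)
    return [columns[:half], columns[half:]]


# Example: a project with 40 notebooks and one table that grew to 400 columns.
n_tables = suggested_table_count(40)
wide_table = [f"feature_{i}" for i in range(400)]
parts = split_columns(wide_table)
print(n_tables, [len(part) for part in parts])
```

A single split keeps both halves within the target range only while the original table has at most about 600 columns. A wider table would need the split applied again to each half.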