Deep Feature Synthesis (DFS) is a technique that automatically generates new features from existing ones in a dataset, based on the relationships within the data. It applies mathematical functions across features, and stacks those functions, to create "deep" derived features. The resulting increase in dimensionality is handled by reducing the feature space with SVD. The authors then tune hyperparameters with a model-based approach: sample parameters randomly, assess each model through cross-validation, and fit a Gaussian copula model to predict the most promising neighborhood of parameters to sample from next.
3. FEATURE ENGINEERING IS:
the act of transforming your data
into a format that better represents
the underlying problem
4. More data beats clever algorithms,
but better data beats more data.
- Peter Norvig, Director of Research, Google
5. Deep Feature Synthesis…
Generates new features based on relationships
within the data
Applies mathematical functions across the feature
space, depending on data types
Can stack functions to create ‘deep’ features
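The stacking idea can be sketched in a few lines of pandas (a minimal illustration with toy data, not the paper's implementation): count items per order, then aggregate that count again per customer to get a depth-2 feature.

```python
import pandas as pd

# Toy data mirroring the talk's schema (values are illustrative)
orders = pd.DataFrame({"order_id": [1, 2, 3], "customer_id": [1, 1, 2]})
items = pd.DataFrame({"order_id": [1, 1, 2, 3], "quantity": [1, 12, 3, 5]})

# Depth 1: aggregate items up to orders -> COUNT(items) per order
items_per_order = items.groupby("order_id").size().rename("n_items")

# Depth 2: stack another function on top -> MEAN(COUNT(items)) per customer
deep = (orders.join(items_per_order, on="order_id")
              .groupby("customer_id")["n_items"].mean()
              .rename("mean_items_per_order"))
print(deep)
```

Each extra level of stacking is what makes the synthesized feature "deeper".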
6. Nominal Data…
Labels without any quantitative value
• Gender: male / female
• Customer type: active / churned
• Blood type: A, B, AB, O
Quantitative operations are not meaningful;
we are limited to
• Frequency counts
• Mode
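For nominal labels, the valid summaries reduce to counting. A quick pandas illustration (sample values are made up):

```python
import pandas as pd

# Nominal labels: only counting-style summaries are meaningful
blood = pd.Series(["A", "O", "O", "AB", "B", "O"])
print(blood.value_counts())   # frequency counts per label
print(blood.mode())           # most frequent label
```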
7. Ordinal Data…
Nominal data with a natural order
• Exam grades: A, B, C, D, E, F
• Likert scale: 1-10*
• Customer ratings: 1-5 stars*
Inherits all the properties of nominal data,
plus ordering gives us
• Medians
• Quantiles
*It may look numeric, but it isn’t
8. Interval Data…
•Numeric data where the distance between
values is meaningful (but no ‘true’ zero)
• Temperature (Celsius, Fahrenheit)
• Time on a clock
•Quantitative data, meaning we have many
more options
• Add / Subtract
• Mean
• Standard Deviation
9. Ratio Data…
• Numeric data with a ‘true’ zero
  • Height
  • Number of orders
  • Revenue
• Quantitative data, meaning we have many
more options
  • Multiply / divide
  • Ratios
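The interval-vs-ratio distinction is easy to demonstrate numerically (a small numpy sketch with made-up values): means work for both, but ratios are only meaningful when there is a true zero.

```python
import numpy as np

# Interval data (Celsius): differences and means are meaningful...
temps_c = np.array([10.0, 20.0, 30.0])
print(temps_c.mean())           # fine: an average temperature

# ...but ratios are not: 20°C is not "twice as hot" as 10°C
# (the same readings in Fahrenheit give a different ratio)
temps_f = temps_c * 9 / 5 + 32
print(temps_f[1] / temps_f[0])  # != temps_c[1] / temps_c[0]

# Ratio data (revenue) has a true zero, so ratios ARE meaningful
revenue = np.array([100.0, 200.0])
print(revenue[1] / revenue[0])  # "twice the revenue" is a real statement
```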
10. An Example Dataset…

Customers:
customer_id | gender | age
1           | m      | 28
2           | f      | 45
…           | …      | …

Orders:
customer_id | order_id | date
1           | 1        | 01/05/2018
4           | 2        | 12/05/2018
…           | …        | …

Products:
product_id | price  | colour
1          | 100.00 | red
2          | 49.99  | white
…          | …      | …

Items Ordered:
product_id | order_id | quantity
1          | 1        | 1
5          | 1        | 12
…          | …        | …
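The relationships between these tables are ordinary key joins, which is what DFS traverses. A minimal pandas sketch using the first two tables (only the rows shown on the slide):

```python
import pandas as pd

# Two of the slide's tables, truncated rows omitted
customers = pd.DataFrame({"customer_id": [1, 2], "gender": ["m", "f"],
                          "age": [28, 45]})
orders = pd.DataFrame({"customer_id": [1, 4], "order_id": [1, 2],
                       "date": ["01/05/2018", "12/05/2018"]})

# Orders -> Customers is a simple foreign-key join
merged = orders.merge(customers, on="customer_id", how="left")
print(merged)
```

Order 2 belongs to customer 4, who is not in the truncated customers table, so its customer columns come back as NaN.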
11. Feature Abstraction…
• Entity Features (EFEAT)
  • Function applied element-wise to existing features, e.g. convert a
date to day of week, normalize a feature’s scale to 0-1
• Relational Features (RFEAT)
  • Function applied to a group of values via backward relationships,
e.g. min, max, average, count
• Direct Features (DFEAT)
  • Features transferred directly via forward relationships, e.g. copy a
customer’s attributes onto each of their orders
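The three feature types can be sketched with pandas on the example schema (a toy illustration, not the paper's code):

```python
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2], "gender": ["m", "f"]})
orders = pd.DataFrame({"order_id": [1, 2, 3], "customer_id": [1, 1, 2],
                       "date": pd.to_datetime(["2018-05-01", "2018-05-12",
                                               "2018-05-20"])})

# EFEAT: element-wise transform of an existing feature
orders["day_of_week"] = orders["date"].dt.dayofweek

# RFEAT: aggregate over a backward relationship (orders -> customers)
customers["n_orders"] = customers["customer_id"].map(
    orders.groupby("customer_id").size())

# DFEAT: carry a feature along a forward relationship (customers -> orders)
orders = orders.merge(customers[["customer_id", "gender"]], on="customer_id")
print(orders)
```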
14. Handling the Increased Dimensionality…
• The process creates a lot of features
  • Slows model training
  • More expensive hardware required
  • Increased risk of overfitting
  • Reduced model performance, e.g. for clustering
15. Dimension Reduction…
•Authors use SVD to reduce dimensionality
• Create new features called components, which contain
linear combinations of original features
• Compresses data into a smaller feature space
• Select top n% features based on feature importance
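A minimal sketch of the compression step with numpy's SVD (random data stands in for the wide DFS feature matrix; the component count is arbitrary here):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))       # stand-in for a wide DFS feature matrix
Xc = X - X.mean(axis=0)              # center before decomposing

# Truncated SVD: keep the top-k components
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 10
X_small = U[:, :k] * s[:k]           # each column is a linear combination
                                     # of all 50 original features
print(X_small.shape)                 # (100, 10)
```

The 50-column matrix is compressed to 10 components; a further importance-based selection step would then keep only the top n% of those.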
17. Tuning Hyperparameters…
6 × 490 × 90 × 10 × 450 × 20 × 100 = 2,381,400,000,000
(roughly 2.4 trillion combinations)

Parameter               Range
Clusters                [1-6]
SVD Components          [10-500]
% Components Selected   [10-100]
Oversampling Ratio      [1-10]
Trees in Random Forest  [50-500]
Decision Tree Depth     [1-20]
% Features in Tree      [1-100]
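The size of the grid is just the product of the range sizes, which makes exhaustive search hopeless:

```python
import math

# Number of discrete settings in each hyperparameter range from the slide
sizes = [6, 490, 90, 10, 450, 20, 100]
total = math.prod(sizes)
print(f"{total:,}")   # 2,381,400,000,000 -> ~2.4 trillion grid points
```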
18. Using a Model to Tune a Model…
• Tune parameters using a Gaussian copula
  • Sample hyperparameters randomly
  • Assess each model using cross-validation
  • Model the non-linear relationship between parameters and
performance using a Gaussian copula
  • Predict the neighborhood of parameters to sample from next
  • Repeat…
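The loop above can be sketched as a generic model-based search. This is a toy illustration only: a one-dimensional made-up objective stands in for the cross-validated score, and scikit-learn's Gaussian process regressor stands in for the paper's Gaussian copula model.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy stand-in for "CV score as a function of one hyperparameter"
def cv_score(x):
    return -(x - 0.3) ** 2   # pretend the optimum is at x = 0.3

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))           # 1. sample parameters randomly
y = np.array([cv_score(x[0]) for x in X])    # 2. assess via "cross-validation"

for _ in range(10):
    # 3. model score vs. parameters (GP regressor as a copula stand-in)
    gp = GaussianProcessRegressor().fit(X, y)

    # 4. predict a promising neighborhood and pick the next sample from it
    cand = rng.uniform(0, 1, size=(50, 1))
    x_next = cand[np.argmax(gp.predict(cand))]

    # 5. evaluate and repeat
    X = np.vstack([X, x_next])
    y = np.append(y, cv_score(x_next[0]))

print(X[np.argmax(y), 0])   # should land near the optimum at 0.3
```

The surrogate model concentrates later samples where the predicted score is high, so far fewer evaluations are needed than with random or grid search.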
19. Conclusions…
• DFS automatically synthesizes new features
based on relationships in the data
• Use SVD to control the size of the feature
space and keep it manageable
• Optimize parameter tuning by modeling the
parameter space