17. MODEL SELECTION
Choosing a Machine Learning Algorithm
Deep Learning:
- Very powerful
- Needs large amounts of data (often millions of examples)
Classic ML:
- Random Forest
- Gradient Boosting
Chosen algorithm: CatBoost
18. HYPERPARAMETERS
- Number of trees
- Depth of trees
- Learning rate
Example from Titanic data (source: Wikipedia)
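A minimal sketch of how these three hyperparameters could be passed to CatBoost, assuming the `catboost` Python package. The argument names `iterations`, `depth` and `learning_rate` are CatBoost's own. The values are illustrative assumptions, not the settings used in this project.

```python
# Illustrative values only: the slides do not state which settings were used.
params = {
    "iterations": 500,      # number of trees
    "depth": 6,             # depth of each tree
    "learning_rate": 0.05,  # how strongly each new tree corrects the previous ones
}

try:
    from catboost import CatBoostClassifier

    # Building the model does not train it; training happens in model.fit(...).
    model = CatBoostClassifier(**params, verbose=False)
except ImportError:
    # catboost is not installed; the dict above still documents the settings.
    model = None
```

More trees and deeper trees let the model fit more complex patterns but raise the risk of overfitting. A smaller learning rate usually needs more trees to reach the same fit.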
19. CATBOOST
CatBoost:
- Is powerful and open source
- Handles categorical data very well
- Detects overfitting automatically
- (Runs on GPU)
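A hedged sketch of two of these points: categorical columns passed as raw strings through `cat_features`, and the overfitting detector enabled through `early_stopping_rounds` on a validation set. The toy rows, the column meanings and the helper name `train_with_early_stopping` are hypothetical, and the settings are assumptions rather than the project's configuration.

```python
# Toy rows, all made up: (category, country, amount).
X_train = [
    ["grocery", "FR", 12.5],
    ["electronics", "US", 899.0],
    ["travel", "DE", 430.0],
    ["grocery", "FR", 8.9],
    ["electronics", "FR", 1250.0],
    ["travel", "US", 95.0],
]
y_train = [0, 1, 0, 0, 1, 0]
X_val = [["grocery", "DE", 15.0], ["electronics", "US", 990.0]]
y_val = [0, 1]

# Columns 0 and 1 hold strings; CatBoost encodes them itself,
# so no manual one-hot encoding is needed.
cat_feature_indices = [0, 1]


def train_with_early_stopping(X_tr, y_tr, X_va, y_va):
    """Hypothetical helper: fit CatBoost and stop when validation stops improving."""
    from catboost import CatBoostClassifier  # imported here so the sketch loads without catboost

    # Adding task_type="GPU" would request GPU training where one is available.
    model = CatBoostClassifier(iterations=200, verbose=False)
    model.fit(
        X_tr,
        y_tr,
        cat_features=cat_feature_indices,
        eval_set=(X_va, y_va),
        early_stopping_rounds=20,  # stop after 20 rounds without validation improvement
    )
    return model
```

Calling `train_with_early_stopping(X_train, y_train, X_val, y_val)` requires `catboost` to be installed. A real dataset would be far larger than these few rows.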
21. EVALUATING PERFORMANCE
Avoiding Overfitting
Hyperparameter Tuning
Trade-off between being too severe and being too lenient:
- False positives: innocent people wrongly flagged
- False negatives: fraudsters not detected
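The trade-off between false positives and false negatives can be sketched by counting both at different decision thresholds. The labels and scores below are made up for illustration (1 = fraudster, 0 = innocent), and the function name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, scores, threshold):
    """Count outcomes when every score >= threshold is flagged as fraud."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and label == 1:
            counts["tp"] += 1  # fraudster correctly flagged
        elif flagged and label == 0:
            counts["fp"] += 1  # innocent wrongly flagged
        elif label == 1:
            counts["fn"] += 1  # fraudster not detected
        else:
            counts["tn"] += 1  # innocent correctly left alone
    return counts


# Made-up labels and model scores, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.05, 0.20, 0.40, 0.65, 0.55, 0.80, 0.10, 0.30]

for threshold in (0.25, 0.50, 0.75):
    print(threshold, confusion_counts(y_true, scores, threshold))
# 0.25 {'tp': 3, 'fp': 2, 'tn': 3, 'fn': 0}   severe: no fraud missed, 2 innocents flagged
# 0.5 {'tp': 2, 'fp': 1, 'tn': 4, 'fn': 1}
# 0.75 {'tp': 1, 'fp': 0, 'tn': 5, 'fn': 2}   lenient: no innocents flagged, 2 frauds missed
```

Lowering the threshold makes the model more severe (more false positives), while raising it makes the model more lenient (more false negatives).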