23. How to be awesome
Listen and don’t be an a-hole
Business value comes first
If you can solve without ML, go for it
24. Intro to Machine Learning - Terms you got to know
Term Explanation
Feature
Engineering
Feature engineering is a key step in applying ML. A “feature” is an attribute shared by all the data units
used by the ML model. Finding the right features is key to creating an effective ML model, but it’s also
difficult and time-consuming. Learn more about feature engineering here.
SKLearn and
TensorFlow
SKLearn - also called SciKit Learn - and TensorFlow are free, open-source software libraries for ML and
deep learning workflows, models, and algorithms. SKLearn and TensorFlow also work together as SciKit
Flow, a simplified interface to help people get started on predictive analytics and ML. Learn more about
SKLearn and TensorFlow here.
Hyperparameters
Optimization
Hyperparameters are the rules that govern training algorithms. The success of a training model - and thus
the entire ML model - depends on optimizing the hyperparameters to the correct settings. Learn more
about hyperparameter optimization here.
Data splitting Before data analysts can use data for machine learning models, they need to split the data. Usually, data
is split into three datasets: a training dataset, a validation dataset, and a test dataset. Learn more about
datasets and data splitting here.
Leakage and
Overfitting
Overfitting is when an ML model produces training results that are too accurate, so much so that it can’t
produce reliable results with real world data. Leakage means that unexpected information leaked in to the
training dataset, causing unrealistic results. Leakage has a number of different causes, and it is itself one
of many causes of overfitting. Learn more about leakage and overfitting here and here.