This document discusses using machine learning and AI techniques with the IBM DB2 database. It outlines six steps for building machine learning models in DB2: 1) collecting data from trucks into the DB2 database, 2) data exploration, 3) data transformation, 4) model building, 5) model evaluation, and 6) model inference/deployment. It lists the DB2 stored procedures used in each step, covering machine learning algorithms such as decision trees, linear regression, naive Bayes, and k-means clustering. Finally, it lists opportunities to apply predictive modeling to tasks such as truck maintenance optimization and driver retention.
5. STEP 2: Data exploration
• Use these stored procedures to evaluate the content of the given data.
IDAX.SUMMARY1000 - Summary of up to 1000 columns
IDAX.COLUMN_PROPERTIES - Create a column properties table
IDAX.GET_COLUMN_LIST - Get a list of columns
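The kind of per-column summary produced during data exploration can be illustrated outside the database. The following is a minimal Python sketch of the idea, not the DB2 implementation; the function name `summarize_columns` and the truck telemetry rows are hypothetical.

```python
from statistics import mean, pstdev


def summarize_columns(rows):
    """Return per-column counts plus min/max/mean/std for numeric columns."""
    summary = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        # None marks a missing value in this sketch.
        values = [r[col] for r in rows if r[col] is not None]
        stats = {"count": len(values), "missing": len(rows) - len(values)}
        if values and all(isinstance(v, (int, float)) for v in values):
            stats.update(
                min=min(values),
                max=max(values),
                mean=mean(values),
                std=pstdev(values),  # population standard deviation
            )
        summary[col] = stats
    return summary


# Hypothetical truck telemetry rows.
trucks = [
    {"truck_id": 1, "mileage": 120000, "engine_temp": 92.0},
    {"truck_id": 2, "mileage": 80000, "engine_temp": None},
    {"truck_id": 3, "mileage": 100000, "engine_temp": 88.0},
]
summary = summarize_columns(trucks)
print(summary["engine_temp"])
```

Columns with many missing values or a very narrow range are usually the first things to look for before moving on to transformation.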
6. STEP 3: Data transformation
• Use the following stored procedures to transform the data before passing it to a machine learning algorithm.
IDAX.IMPUTE_DATA - Impute missing data
IDAX.SPLIT_DATA - Split data into training data and test data
IDAX.STD_NORM - Standardize or normalize columns of the input table
IDAX.EFDISC - Discretization bins of equal frequency
IDAX.APPLY_DISC - Discretize data by using limits for discretization bins
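The transformations named above (imputation, train/test split, standardization, equal-frequency discretization) can be sketched in plain Python to show what each one does to the data. This is an illustration under simplifying assumptions (mean imputation, a random 70/30 split, population standard deviation), not the DB2 implementation; all function names are hypothetical.

```python
import bisect
import random
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing values (None) with the mean of the known values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def split_rows(rows, fraction=0.7, seed=42):
    """Shuffle the rows and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


def standardize(values):
    """Rescale values to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def equal_frequency_limits(values, bins):
    """Compute bin limits so each bin holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // bins] for i in range(1, bins)]


def apply_limits(values, limits):
    """Map each value to the index of the bin it falls into."""
    return [bisect.bisect_right(limits, v) for v in values]


limits = equal_frequency_limits([1, 2, 3, 4, 5, 6, 7, 8], bins=4)
print(limits)                                   # [3, 5, 7]
print(apply_limits([1, 2, 3, 4, 5, 6, 7, 8], limits))  # [0, 0, 1, 1, 2, 2, 3, 3]
```

Computing the bin limits and applying them are kept as two separate functions so that limits derived from the training data can be reused unchanged on the test data.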
7. STEP 4: Model building
• Use these stored procedures to build machine learning models.
Decision trees - IDAX.GROW_DECTREE - A decision tree is a hierarchical, graphical structure that classifies records by applying a sequence of tests to their attributes.
Linear regression - IDAX.LINEAR_REGRESSION - Linear regression is one of the most commonly used methods of predictive analysis.
Naive Bayes - IDAX.NAIVEBAYES - The Naive Bayes classification algorithm is a probabilistic classifier.
K-means clustering - IDAX.KMEANS - The K-means algorithm is one of the most widely used clustering algorithms.
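As a rough sketch of how a model-building procedure might be invoked, the helper below assembles the text of a CALL statement. It assumes the procedures take a single string of comma-separated name=value pairs; the parameter names used here (`model`, `intable`, `id`, `target`, `k`) and the table, column, and model names are assumptions for illustration and should be checked against the IBM documentation for your DB2 version. The helper `idax_call` is hypothetical and only builds the statement text; it does not connect to a database.

```python
def idax_call(procedure, **params):
    """Build the text of a CALL statement for an IDAX stored procedure.

    Assumption: the procedure accepts one string of name=value pairs.
    """
    for value in params.values():
        if "'" in str(value):
            raise ValueError("parameter values must not contain single quotes")
    args = ", ".join(f"{name}={value}" for name, value in params.items())
    return f"CALL IDAX.{procedure}('{args}')"


# Hypothetical table, column, and model names.
print(idax_call("KMEANS", model="TRUCK_CLUSTERS", intable="TRUCK_TRAIN",
                id="TRUCK_ID", k=3))
# CALL IDAX.KMEANS('model=TRUCK_CLUSTERS, intable=TRUCK_TRAIN, id=TRUCK_ID, k=3')

print(idax_call("GROW_DECTREE", model="TRUCK_FAILURE_TREE",
                intable="TRUCK_TRAIN", id="TRUCK_ID", target="FAILED"))
```

The generated statement can then be run through any SQL client connected to the database.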
8. STEP 5: Model evaluation
• Use these stored procedures to evaluate the performance of your model by comparing predictions to the true values.
IDAX.CMATRIX_STATS - Calculate classification quality factors from a confusion matrix
IDAX.CONFUSION_MATRIX - Build a confusion matrix
IDAX.MAE - Calculate the mean absolute error of regression predictions
IDAX.MSE - Calculate the mean squared error of regression predictions
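The evaluation measures listed above are simple to state precisely. The sketch below computes a confusion matrix, the accuracy derived from it, and the mean absolute and mean squared errors in plain Python; it illustrates the definitions only and is not the DB2 implementation. The labels and numbers are made-up example values.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Fraction of records whose predicted label equals the actual label."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total


def mae(actual, predicted):
    """Mean absolute error of regression predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error of regression predictions."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up classification labels: did the truck fail or not?
actual = ["fail", "ok", "ok", "fail", "ok"]
predicted = ["fail", "ok", "fail", "ok", "ok"]
matrix = confusion_matrix(actual, predicted)
print(matrix[("fail", "fail")], matrix[("ok", "fail")])  # 1 1
print(accuracy(matrix))                                   # 0.6

# Made-up regression values.
print(mae([10, 20, 30], [12, 18, 33]))
print(mse([10, 20, 30], [12, 18, 33]))
```

MSE penalizes large errors more heavily than MAE because each error is squared before averaging.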
9. STEP 6: Model inference/deployment
• Use these stored procedures to make predictions with your trained machine learning model.
IDAX.PREDICT_DECTREE - Apply a decision tree model
IDAX.PREDICT_KMEANS - Apply a K-means clustering model to new data
IDAX.PREDICT_LINEAR_REGRESSION - Apply a linear regression model to a target
IDAX.PREDICT_NAIVEBAYES - Apply a Naive Bayes model to new data
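Applying a trained model to new data amounts to evaluating the learned parameters on each new row. The sketch below shows this for two of the model types above: a linear regression (intercept plus weighted sum of inputs) and a K-means model (assign the point to the nearest centroid, here by Euclidean distance). The coefficients, centroids, and column names are hypothetical values for illustration, not output from DB2.

```python
import math


def predict_linear(intercept, coefficients, row):
    """Apply a linear regression model: intercept + sum(coef * value)."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)


def predict_cluster(centroids, point):
    """Return the index of the centroid closest to the point."""
    distances = [math.dist(c, point) for c in centroids]
    return distances.index(min(distances))


# Hypothetical model: days until the next maintenance is needed.
coefficients = {"mileage_k": 0.5, "age_years": 2.0}
new_truck = {"mileage_k": 100, "age_years": 3}
print(predict_linear(5.0, coefficients, new_truck))  # 61.0

# Hypothetical cluster centroids in a two-dimensional feature space.
centroids = [(0.0, 0.0), (10.0, 10.0)]
print(predict_cluster(centroids, (8.0, 9.0)))  # 1
```

Any transformation applied to the training data, such as standardization or discretization, has to be applied to the new rows in the same way before prediction.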
16. Opportunities for predictive modeling
How long can a truck run without maintenance?
How many drivers are likely to quit in the next 30 days?
What will company turnover be in the future?
Accident prevention
Battery life of the trucks
Weather alerts to drivers (k-means algorithm?)
Run the machine learning setup in an AIX/OpenShift environment (testing in progress)
17. Q & A
srikamani@gmail.com
jssivakumar@hotmail.com