Machine Learning for Java Developers - Nasser Ebrahim

© 2018 IBM Corporation
Eclipse Day
Machine Learning for Java Developers
Nasser Ebrahim (enasser@in.ibm.com)
Software Architect, IBM Software Lab

Agenda
• Introduction
• Algorithms
• Frameworks in Java
• Demo with Jupyter Notebook & Weka
• Demo with Eclipse & DL4J
• Q & A

Machine Learning
Machine learning is a field of
computer science that gives
computers the ability to learn
without being explicitly
programmed.
• an application of artificial intelligence
• ability to automatically learn and improve from experience
without being explicitly programmed
• development of computer programs that can access data and
use it to learn for themselves

Machine Learning – Why now?
• Availability of Data
• Storage cost
• Computational power

Machine Learning – Applications

Machine Learning – Workflow

Machine Learning – Features & Label
Features are the relevant, independent variables in the data (X)
The label is the dependent variable that we need to predict (y)

Types of Machine Learning

Supervised Learning
Labeled training data
• Classification
• Regression
Unsupervised Learning
Unlabeled training data
• Clustering
• Association

Reinforcement Learning

Linear Regression
In statistics, linear regression is a linear approach for
modelling the relationship between a scalar dependent
variable y and one or more explanatory variables (or
independent variables) denoted X.
y = b0 + b1*X
The best-fit line is the line that
describes the dataset with the
least squared error
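The formula y = b0 + b1*X can be fitted in a few lines of plain Java. The following is a minimal sketch of ordinary least squares for a single feature; the class name, method names, and sample data are made up for illustration and are not from the talk.

```java
// Minimal sketch: ordinary least squares for y = b0 + b1*x with one feature.
class SimpleLinearRegression {

    /** Returns {b0, b1}, the intercept and slope that minimize the sum of squared errors. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs of equal length");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double b1 = sxy / sxx;          // slope
        double b0 = meanY - b1 * meanX; // intercept
        return new double[] {b0, b1};
    }

    static double predict(double[] coefficients, double x) {
        return coefficients[0] + coefficients[1] * x;
    }

    public static void main(String[] args) {
        // Hypothetical data lying exactly on y = 1 + 2x.
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};
        double[] b = fit(x, y);
        System.out.printf("b0 = %.2f, b1 = %.2f%n", b[0], b[1]);
        System.out.printf("prediction for x = 5: %.2f%n", predict(b, 5));
    }
}
```

With more than one feature, a library such as Weka is usually a better choice than hand-written arithmetic.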

Logistic Regression
Used when the dependent variable is binary (only two possible outcomes)
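The slide gives no formula, so the following is only a sketch of the standard textbook form: the linear term b0 + b1*x is passed through the logistic (sigmoid) function to get a probability, and a 0.5 threshold turns it into a binary label. The coefficients below are hypothetical values chosen for illustration, not fitted from data.

```java
// Minimal sketch: prediction step of logistic regression with one feature.
class LogisticSketch {

    /** Logistic (sigmoid) function: maps any real number into the interval (0, 1). */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Estimated probability that the label is 1 for feature value x. */
    static double probability(double b0, double b1, double x) {
        return sigmoid(b0 + b1 * x);
    }

    /** Binary prediction using the conventional 0.5 threshold. */
    static int classify(double b0, double b1, double x) {
        return probability(b0, b1, x) >= 0.5 ? 1 : 0;
    }

    public static void main(String[] args) {
        // Hypothetical coefficients; a real model would learn them from training data.
        double b0 = -4.0, b1 = 1.0;
        for (double x : new double[] {2.0, 4.0, 6.0}) {
            System.out.printf("x = %.1f -> p = %.3f, class = %d%n",
                    x, probability(b0, b1, x), classify(b0, b1, x));
        }
    }
}
```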

K Nearest Neighbors (Classification)
• Based on the Euclidean distance to
the K nearest points
• K should be greater than the
number of classes
• An odd K is preferred, since it
avoids tied votes between two classes
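As a rough illustration of the idea, the sketch below sorts the training points by Euclidean distance to a query point and takes a majority vote among the K nearest. The class name, data, and labels are invented for this example; ties are broken in favour of the label that reached the winning count first.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: K nearest neighbors classification with Euclidean distance.
class KnnSketch {

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Predicts the label of query by majority vote among its k nearest training points. */
    static String classify(double[][] points, String[] labels, double[] query, int k) {
        Integer[] order = new Integer[points.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort training indices by distance to the query point, nearest first.
        Arrays.sort(order, Comparator.comparingDouble(i -> euclidean(points[i], query)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        for (int i = 0; i < Math.min(k, order.length); i++) {
            String label = labels[order[i]];
            int count = votes.merge(label, 1, Integer::sum);
            if (best == null || count > votes.get(best)) {
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical two-class training set.
        double[][] points = {{1, 1}, {1, 2}, {2, 1}, {6, 6}, {7, 6}, {6, 7}};
        String[] labels = {"A", "A", "A", "B", "B", "B"};
        System.out.println(classify(points, labels, new double[] {2, 2}, 3));
        System.out.println(classify(points, labels, new double[] {6, 5}, 3));
    }
}
```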

K Means clustering
• Works with very large datasets
• Start by picking k, the number
of clusters
• Choose k random points as the
initial centroids
• Populate the clusters by assigning
each point to its nearest centroid
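The steps above can be sketched in plain Java as follows. The slide stops at populating the clusters; recomputing each centroid as the mean of its cluster and repeating until assignments stop changing is the usual continuation and is included here as an assumption. The class name, seed parameter, and data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Minimal sketch: k-means clustering with randomly chosen initial centroids.
class KMeansSketch {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return sum;
    }

    /** Returns the cluster index (0..k-1) assigned to each point. */
    static int[] cluster(double[][] points, int k, int maxIterations, long seed) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int dims = points[0].length;

        // Choose k distinct random points as the initial centroids.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[indices.get(c)].clone();
        }

        int[] assignment = new int[n];
        for (int iter = 0; iter < maxIterations; iter++) {
            // Populate clusters: assign each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int nearest = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points[i], centroids[c])
                            < squaredDistance(points[i], centroids[nearest])) {
                        nearest = c;
                    }
                }
                if (iter == 0 || assignment[i] != nearest) {
                    changed = true;
                }
                assignment[i] = nearest;
            }
            if (!changed) {
                break; // assignments are stable, so the centroids would not move
            }

            // Move each centroid to the mean of the points assigned to it.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dims; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep the old centroid for an empty cluster
                }
                for (int d = 0; d < dims; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical data: two well-separated groups of three points each.
        double[][] points = {{1, 1}, {1, 2}, {2, 1}, {8, 8}, {9, 8}, {8, 9}};
        System.out.println(Arrays.toString(cluster(points, 2, 100, 42L)));
    }
}
```

Which group ends up as cluster 0 or 1 depends on the random initial centroids, so only the grouping is meaningful, not the index values.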

K Means clustering

Apriori Algorithm - Illustration
• Uses a level-wise search, where frequent k-itemsets are used to explore
(k+1)-itemsets
• Candidate generation: frequent itemsets are extended one item at a
time
• Determines frequent itemsets, which are then used to derive
association rules that highlight general trends in the database
• Provides insight into which products tend to be purchased together and
which are most amenable to promotion
• Trivial pattern: people who buy chalk pieces also buy a duster
• Inexplicable pattern: people who buy a mobile phone also buy a bag

Apriori Algorithm - Example
Min Support = 2

Transactions
TID   Items
100   1 3 4
200   2 3 5
300   1 2 3 5
400   2 5
500   1 3 5

CL1 (candidate 1-itemsets)
Itemset   Support
1         3
2         3
3         4
4         1
5         4

FL1 (frequent 1-itemsets)
Itemset   Support
1         3
2         3
3         4
5         4

CL2 (candidate 2-itemsets)
Itemset   Support
1 2       1
1 3       3
1 5       2
2 3       2
2 5       3
3 5       3

FL2 (frequent 2-itemsets)
Itemset   Support
1 3       3
1 5       2
2 3       2
2 5       3
3 5       3
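The two levels of this example can be reproduced with a short Java sketch. It counts support by scanning the five transactions, keeps the itemsets that meet Min Support = 2, and joins frequent k-itemsets into (k+1)-item candidates. This is a simplified illustration: the class and method names are hypothetical, and the subset-pruning step of the full Apriori algorithm is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Minimal sketch: level-wise frequent itemset search in the style of Apriori.
class AprioriSketch {

    /** The five transactions (TID 100 to 500) from the example. */
    static List<Set<Integer>> exampleTransactions() {
        List<Set<Integer>> tx = new ArrayList<>();
        tx.add(new TreeSet<>(Arrays.asList(1, 3, 4)));
        tx.add(new TreeSet<>(Arrays.asList(2, 3, 5)));
        tx.add(new TreeSet<>(Arrays.asList(1, 2, 3, 5)));
        tx.add(new TreeSet<>(Arrays.asList(2, 5)));
        tx.add(new TreeSet<>(Arrays.asList(1, 3, 5)));
        return tx;
    }

    /** Candidate 1-itemsets: one single-item set per distinct item. */
    static List<Set<Integer>> singletons(List<Set<Integer>> transactions) {
        Set<Integer> items = new TreeSet<>();
        for (Set<Integer> t : transactions) {
            items.addAll(t);
        }
        List<Set<Integer>> result = new ArrayList<>();
        for (int item : items) {
            Set<Integer> single = new TreeSet<>();
            single.add(item);
            result.add(single);
        }
        return result;
    }

    /** Counts the transactions that contain every item of the itemset. */
    static int support(List<Set<Integer>> transactions, Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> t : transactions) {
            if (t.containsAll(itemset)) {
                count++;
            }
        }
        return count;
    }

    /** Keeps only the candidates whose support is at least minSupport. */
    static Map<Set<Integer>, Integer> frequent(List<Set<Integer>> transactions,
                                               Collection<Set<Integer>> candidates,
                                               int minSupport) {
        Map<Set<Integer>, Integer> result = new LinkedHashMap<>();
        for (Set<Integer> candidate : candidates) {
            int s = support(transactions, candidate);
            if (s >= minSupport) {
                result.put(candidate, s);
            }
        }
        return result;
    }

    /** Joins frequent k-itemsets pairwise into candidate (k+1)-itemsets (no subset pruning). */
    static Set<Set<Integer>> nextCandidates(Set<Set<Integer>> frequentItemsets) {
        Set<Set<Integer>> candidates = new LinkedHashSet<>();
        for (Set<Integer> a : frequentItemsets) {
            for (Set<Integer> b : frequentItemsets) {
                Set<Integer> union = new TreeSet<>(a);
                union.addAll(b);
                if (union.size() == a.size() + 1) {
                    candidates.add(union);
                }
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<Set<Integer>> tx = exampleTransactions();
        int minSupport = 2;
        Map<Set<Integer>, Integer> fl1 = frequent(tx, singletons(tx), minSupport);
        Map<Set<Integer>, Integer> fl2 = frequent(tx, nextCandidates(fl1.keySet()), minSupport);
        System.out.println("FL1 = " + fl1);
        System.out.println("FL2 = " + fl2);
    }
}
```

Item 4 drops out at the first level (support 1) and the pair 1 2 at the second (support 1), which matches the difference between the candidate and frequent lists in the example.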

Java ML Libraries & Frameworks
ADAMS
ELKI
Java-ML
JSAT
Encog

Waikato Environment for Knowledge Analysis (Weka)
• Machine learning/data mining software written in Java (distributed
under the GNU General Public License)
• Developed at University of Waikato, New Zealand
• Comprehensive set of data pre-processing tools, learning algorithms
and evaluation methods
• Graphical user interfaces (incl. data visualization)
• Environment for comparing learning algorithms

Deeplearning4j - DL4J

Jupyter Notebook
• An open-source web application that allows you to create and
share documents that contain live code, equations, visualizations
and narrative text.
• Install Jupyter using Anaconda
• http://jupyter.org/install
• Different kernels to work with
different languages
• Python, R, Scala, Java, Spark

Jupyter Notebook with Java Kernel
• IJava - Jupyter kernel for executing Java code
• https://github.com/SpencerPark/IJava
• Install IJava using the archive from
https://github.com/SpencerPark/IJava/releases/download/v1.1.2/ijava-1.1.2.zip
• The kernel executes code via the JShell tool introduced in Java 9.

Eclipse with Maven (M2Eclipse)
• Launching Maven builds from within Eclipse
• Dependency management for the Eclipse build
• Resolving Maven dependencies from the Eclipse workspace
• Wizards for creating new Maven projects, pom.xml, etc.
Install Maven in Eclipse
• Eclipse -> Help -> Install New Software
• Enter http://download.eclipse.org/technology/m2e/releases/
• Select “Maven Integration for Eclipse”
• Click Next, accept the license agreement, and click Finish

Q & A
