Correlation Analysis Modeling Use Case - IBM Power Systems

Agenda
• Introduction
• Current Affairs
• Automated Modeling methodology of H2O.ai and their association with IBM
• Integrated Systems Approach
• PowerAI Overview
• Model Development, Management, Deployment using IBM Data Science Experience on Local
• NVIDIA in AI Space
• Short-Term Scaling Options
• Fast track time to product
• Hands-on Demo - Exercise – H2O.ai

Introductions
Network with the IBM team present at the event
Discuss a ‘Use Case’
Extend a LinkedIn invite
Connect on Twitter
• Technical team Leader
• Project head
• Speaker
• Speaker

Current Affairs
• PowerAI: The World’s Fastest Deep Learning Solution Among Leading Enterprise Servers
• Deep Learning Software Distro
• Evaluating Julia for Deep Learning on Power + NVIDIA Tesla

Automated Modeling methodology of H2O.ai and their association with IBM
• Technology leader with most completeness of vision
• H2O.ai customers gave the highest overall score

H2O Driverless AI Complements IBM Power AI & Vision
Integrated Systems Approach (On Premises SaaS Platform)
• IBM Power AI is Automatic Deep Learning
• H2O Driverless AI is Automatic Machine Learning
Data types: Sensor, Log, Transactional, Image

Model Development, Management, Deployment using IBM Data Science Experience on Local (DSX - Local)
Winning with Data Science
Key Features:
• Available for Anaconda 2.7, Anaconda 3.5 and Anaconda plus Spark
• Runs on Hortonworks Data Platform Hadoop
• SPSS for DSX - Visual Productivity for Data Science
• Decision optimization for DSX - Transform insights from ML into actions
• Model Management and Deployment - Accelerate time to production
• Deploy Python, R, & Spark Models online or batch
• Track model accuracy and schedule evaluations
• Load-balancing and auto-scaling support
DSX Local: a platform for enterprise data science teams
For individuals | For teams
SPSS for DSX | DO for DSX | Notebooks

IBM Data Science Experience Local – Leadership Position
Forrester BARC

Fast track time to product - Utility leverages different types of deployments, Options for Digital Transformation
Digital Transformation: Watson & AI, Analytics, IoT, Machine Learning, Multi-Cloud, Hybrid, ___aaS, DevOps, CDM, Agile Workflow, Automation & Orchestration, Modernization & Transformation, Containerization, Private Cloud

Artificial Intelligence on Power: flattening the deep learning time to value curve
• A single binary download of the most ubiquitous open source AI and major deep learning framework packages, all precompiled with all of the supporting software that they require to run.
Infrastructure
Enterprise Software Distribution
• PowerAI S/W
• Support
Tools for Ease of Development
• PowerAI VISION
• Apache-Spark Data ETL and Preparation Tool
• DL Insight
Automation & Modeling
• Distributed Deep Learning
• Large Model Support
Enterprise Deep Learning Distribution
PowerAI is the “Red Hat” of Deep Learning
• Simplified deployment, integrated support, excellent performance
Safe and Secure
• On Premises Installation
• Easy integration
• Supports Several APIs
• Embedded Safety & Security features

GoogLeNet – 1000 epochs (LOWER IS BETTER)
• 4x Tesla V100 GPUs, PCIe3: 9709 seconds
• 4x Tesla V100 GPUs, NVLink 2.0: 2622 seconds (3.8x faster)
• POWER9 delivers 3.8x reduction in AI training time with the same NVIDIA GPUs
Benchmark details in speaker notes.
• Critical capabilities (regression, nearest neighbor, recommendation systems, +++) operate on more than just the GPU memory
• Use Server and GPU memory to support higher resolution data by moving large amounts of data between the CPU and GPU
• PowerAI automatically enables seamless use of Server and GPU memory
• NVLink 2.0 and POWER9 significantly cut training times and boost performance (accuracy) of the model with higher resolution data
train more | build more | know more


Short Term Scaling Options
If you do not want to pay for on-premises systems, you can scale out using Nutanix components in the IBM Hyperconverged System.
Nutanix Acropolis - Pro
Turnkey infrastructure platform that converges compute, storage, networking and virtualization to run any application, at any scale.
A powerful scale-out data fabric for server, storage, virtualization and networking.
(Ultimate Edition available as add-on.)
Nutanix Prism - Starter
Comprehensive application and infrastructure management solution that radically simplifies datacenter operations.
Simplify infrastructure management with one-click operations.
(Pro Edition available sometime after GA … but soon.)

NVIDIA
Tech Chip Company
“A giant leap”: taking AI from video games to autonomous cars
• Partners with IBM; known for its graphics processing units (GPUs) on desktops. Key element: Pascal GPU
• Making an impact in the driverless/autonomous driving space
• Accelerated computing: NVIDIA GPUs with NVLink technology on Power provide a huge edge

EXAMPLE – BUILDING, IMPLEMENTING AND
DEPLOYING MODELS
CASE STUDY ON
CRIME & ITS IMPACT ON SOCIETY

Case Study Question
Do people with good financial standing, a higher education level, and a steady job commit fewer crimes, and do uneducated or poor people commit more crime?
• Data Source: the Communities and Crime Un-normalized Data Set
• Website: http://archive.ics.uci.edu/ml/machine-learning-databases/00211/CommViolPredUnnormalizedData.txt
• Total Observations: 2215
• Total Variables: 147
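A minimal sketch of how rows from such a text file could be read into numeric fields, written in Python with the standard library only. It assumes the file is comma-separated and marks missing values with "?"; the column positions and the two sample rows are hypothetical and are not taken from the data set.

```python
import csv
import io

MISSING = "?"  # assumption: missing values are written as "?"

def parse_rows(text, numeric_columns):
    """Parse comma-separated text; convert the chosen columns to float or None."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        row = {}
        for name, index in numeric_columns.items():
            raw = record[index].strip()
            row[name] = None if raw == MISSING else float(raw)
        rows.append(row)
    return rows

# Made-up sample rows and hypothetical column positions, for illustration only.
sample = "TownA,NJ,11980,89.24,?\nTownB,PA,23123,78.99,4.5\n"
columns = {"population": 2, "pctWWage": 3, "PctUnemployed": 4}
parsed = parse_rows(sample, columns)
```

Keeping missing values as None, rather than dropping rows at load time, leaves the later cleaning step free to decide which variables to omit.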

Extracted PctLess9thGrade and PctNotHSGrad, with exterior parameters related to employment and poverty
Variable         Format  Min    Max    Description
PctWWage         Num     31.68  96.76  Percentage of households with wage or salary income in 1989
PctPopUnderPov   Num     0.64   58     Percentage of people under the poverty level
PctLess9thGrade  Num     0.2    49.89  Percentage of people 25 and over with less than a 9th grade education
PctNotHSGrad     Num     1.46   73.66  Percentage of people 25 and over that are not high school graduates
PctUnemployed    Num     1.32   31.23  Percentage of people 16 and over, in the labor force, and unemployed
PctEmploy        Num     24.82  84.67  Percentage of people 16 and over who are employed

Quantitative Exploration
Variable         Mean       Median  Comments
PctLess9thGrade  9.186646   7.74    Will be used in the model to calculate the education level
PctNotHSGrad     22.30512   21.38   Will be used in the model to calculate the education level
PctUnemployed    6.045242   5.45    Exterior Element
PctEmploy        62.02161   62.44   Exterior Element
PctPopUnderPov   11.62054   9.33    Will be used as the dependent variable
Crime            1081.1729  150     Sum of the Crime
robberies        237.9521   19      Exterior Element
autoTheft        516.6926   75      Exterior Element
assaults         326.5282   56      Exterior Element
Community        65.58753   27      Removed from the analysis due to data scarcity
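The mean and median columns above can be reproduced with a few lines of Python. This is a sketch using only the standard library; the `summarize` helper and the sample counts are illustrative assumptions, not the values from the data set. A mean far above the median, as in the Crime row, points to a right-skewed variable.

```python
import statistics

def summarize(values):
    """Return count, mean and median, ignoring missing (None) entries."""
    clean = [v for v in values if v is not None]
    return {
        "n": len(clean),
        "mean": statistics.mean(clean),
        "median": statistics.median(clean),
    }

# Made-up crime counts with one very large community, for illustration only.
crime_counts = [12, 19, 40, 150, 75, None, 5200]
summary = summarize(crime_counts)

# A mean well above the median suggests a long right tail.
right_skewed = summary["mean"] > summary["median"]
```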

Outliers and Data Transformation
• Dropped 9 outliers from population
• Dropped 2 outliers where people > 50,000 for population and > 400 for communities.
• Examined the normality of data
• PctLess9thGrade and PctNotHSGrad were transformed to improve their normality
Normal distribution plots: pctWWage, MedOwnCostPctIncNoMtg, Violent Crimes, PctPopUnderPov
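The cleaning steps above can be sketched in Python as follows. The slides do not say which transformation was applied, so the log transform here is an assumption, and the threshold, helper names and sample values are hypothetical. Sample skewness serves as a rough check of whether the transform moved the variable closer to a normal shape.

```python
import math

def drop_above(rows, key, limit):
    """Keep only rows whose value for `key` does not exceed `limit`."""
    return [row for row in rows if row[key] <= limit]

def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values):
    """Assumed transform: log(1 + x), which compresses a long right tail."""
    return [math.log1p(v) for v in values]

# Made-up rows and a hypothetical population limit, for illustration only.
rows = [{"population": 12000}, {"population": 48000}, {"population": 7300000}]
kept = drop_above(rows, "population", 1000000)

# Made-up percentages with a long right tail.
pct_less_9th = [0.2, 1.5, 3.0, 5.0, 7.7, 12.0, 49.89]
transformed = log_transform(pct_less_9th)
skew_before = skewness(pct_less_9th)
skew_after = skewness(transformed)
```

A skewness closer to zero after the transform indicates a more symmetric distribution; it does not by itself establish normality.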

Correlation Analysis
• The only exterior factor highly correlated with various types of crime, such as burglary, is the percentage of people living under the poverty level
• The percentage of people with less than a 9th grade education, and the percentage who are not high school graduates, also correlate positively, at a low level, with various types of crime
• Wages and employment have a negative correlation with crime
                 population   pctWWage     PctEmploy    PctPopUnderPov  PctLess9thGrade  PctNotHSGrad
population       1.00000000   -0.01273487  -0.02246343  0.09537704      0.04318251       0.05564877
pctWWage         -0.01273487  1.00000000   0.87065844   -0.52248347     -0.43352533      -0.54723404
PctEmploy        -0.02246343  0.87065844   1.00000000   -0.70090092     -0.53131695      -0.61725106
PctPopUnderPov   0.09537704   -0.52248347  -0.70090092  1.00000000      0.64238395       0.66442624
PctLess9thGrade  0.04318251   -0.43352533  -0.53131695  0.64238395      1.00000000       0.92756033
PctNotHSGrad     0.05564877   -0.54723404  -0.61725106  0.66442624      0.92756033       1.00000000
robberies        0.24168197   -0.25706104  -0.30391282  0.48295302      0.32251613       0.41247479
burglaries       0.10177926   -0.25440430  -0.25898625  0.41750058      0.21035835       0.30832998
autoTheft        0.95461771   -0.03632701  -0.04477308  0.09044789      0.05807327       0.07301789
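A correlation matrix of this kind can be computed with the Pearson coefficient. The sketch below uses only the Python standard library and made-up values; it is not the code that produced the table, and the `pearson` and `correlation_matrix` helper names are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up values for illustration only, not taken from the data set.
data = {
    "PctPopUnderPov": [4.0, 9.0, 12.0, 20.0, 31.0],
    "PctEmploy": [70.0, 66.0, 61.0, 55.0, 43.0],
    "burglaries": [30.0, 55.0, 48.0, 90.0, 140.0],
}
matrix = correlation_matrix(data)
```

Each entry lies between -1 and 1, the diagonal is 1, and the matrix is symmetric, which matches the layout of the table above.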

Demographics & Crime Correlation
Positively correlated with the crime rate:
• People who live under the poverty level are more likely to be involved in burglaries
• A higher percentage of adults with less than a 9th grade education is associated with more crime
Negatively correlated with the crime rate:
• People who are employed
• People who make good wages
• Percentage of people 65 and older in the population
1) Correlation plots to visualize the relationship between each independent variable and the dependent variable

2) Crime rate plotted against each variable group to visualize the correlation
Plots: violent vs. population, NonViolentCrime vs. population
• No significant correlation between population and crime rate

Plots: violent vs. Education, NonViolentCrime vs. Education
• The more educated the population, the lower the crime rate

• The poorer the population, the higher the crime rate
Plots: violent vs. PercentageUnderPoverty, violent vs. PctUnderPoverty

Data Description
• Communities and Crime Un-normalized Dataset
• Over 2215 observations and 147 attributes collected between 1995 and 2011
• We selected 13 major independent variables for our project
• Dataset cleaning process:
• Plot each independent variable, eliminate outliers, and bring the data close to a normal distribution
• Omit “Crime Per Capita” because of missing data
• We eventually obtained 2011 observations for non-violent crime and 1894 observations for violent crime.

Methods
• Principal Component Analysis & Regression:
Use an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables
• Benefit:
1) Able to evaluate and summarize a dataset before implementing a statistical algorithm to evaluate the results
2) Able to decompose high-dimensional data to extract the main feature components of the data
This lm command consumes a lot of processing; Power infrastructure makes sure that your model runs and produces results in time.
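The slides do not show the lm call itself (lm is presumably R's linear model function). As a rough illustration of the two steps, the Python sketch below finds the first principal component of two education variables by power iteration and then fits a one-predictor least-squares line of poverty on the component score. All helper names and data values are assumptions for illustration, not the authors' code or data.

```python
import math

def center(column):
    """Subtract the column mean from every value."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]

def covariance_matrix(columns):
    """Sample covariance matrix of a list of equal-length columns."""
    centered = [center(c) for c in columns]
    n = len(centered[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]

def first_component(cov, iterations=200):
    """Power iteration: unit eigenvector for the largest eigenvalue of cov."""
    # Uneven start vector, so it is unlikely to be orthogonal to the answer.
    vec = [1.0 / (i + 1) for i in range(len(cov))]
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

def project(columns, component):
    """Score of each observation along the given component."""
    centered = [center(c) for c in columns]
    return [sum(w * col[i] for w, col in zip(component, centered))
            for i in range(len(centered[0]))]

def least_squares(xs, ys):
    """Slope and intercept of y = intercept + slope * x."""
    cx, cy = center(xs), center(ys)
    slope = sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)
    intercept = sum(ys) / len(ys) - slope * sum(xs) / len(xs)
    return slope, intercept

# Made-up values for illustration only, not taken from the data set.
less_9th = [2.0, 5.0, 9.0, 14.0, 20.0]
not_hs = [8.0, 14.0, 22.0, 30.0, 41.0]
poverty = [3.0, 6.0, 11.0, 15.0, 24.0]

component = first_component(covariance_matrix([less_9th, not_hs]))
education_score = project([less_9th, not_hs], component)
slope, intercept = least_squares(education_score, poverty)
```

Combining the two highly correlated education variables into one score before regressing avoids feeding near-duplicate predictors to the model.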

Conclusion
• Effective factors for crime: population, education, age, income, poverty
• Certain areas do have a greater rate of crime than the national average
• People who are poor commit more crime in certain regions
• Crime statistics in these towns differ from those of other cities
• Crime influences the legal system through new laws, laws that impact the lives of the poor, especially those in vulnerable areas
• One must be careful when using crime data as the basis of new laws
• There is further scope for this research, as the variables contributing to the crime attribute are vast

Brainstorming
Neural Networks
Capsule networks
• Neural Networks: In artificial intelligence, leverage neural networks to solve subsets of general intelligence problems, including perception, computer vision, predictive analysis, knowledge representation, intelligent agents, NLP, automated reasoning, AI planning (DFS is one approach to solve this), and interfaces for control systems.
• Model fit / Optimization / Analytics and Simulation: Real-time decision making; autonomous agents that make decisions in less time and provide support for ordering actions. It is hard to put a box around any particular problem, so it would be useful to know the areas that cannot be modeled accurately and that need individuals.
• Discussions on deployment and continuous learning: How to monitor and manage models that could go to production. Underlying the learning track for modern data platforms, big data analytics and open source.
• Illustrating descriptive and diagnostic approach: The value of data science lies not just in building models, but in supporting decision making and automating decisions with linear-programming-type approaches; the initial descriptive and diagnostic effort is tremendously valuable.
Capsule Network: When multiple predictions agree, a higher-level capsule becomes active.
