A New Era for Predictive Analytics
with SPSS
© 2012 IBM Corporation
The Mining Metaphor
● Gold Mining, Diamond Mining, Data Mining
What is Data Mining?
An early definition
Finding patterns in your data which you can use to do your business better
–It’s about patterns
–It’s about something you can use – practical things
–It’s about business
A recent definition
▪ Business-oriented discovery of patterns across all forms of data
▪ Produces insight and a predictive capability
▪ Deployment of predictions throughout the enterprise
What is Data Mining?
Information Retrieval + Information Extraction + Information Analysis
Discover new, previously unknown information
IBM SPSS Supports the Predictive Enterprise
Delivering Profitable Revenue Growth & Operational Efficiency
▪Capture a complete perspective
–Survey customers & constituents
–Leverage structured, semi-structured & unstructured data

▪Predict behavior and preferences
–Statistics for deeper insight
–Data & text mining for predictive modeling

▪Act on results
–Deploy scoring models for dynamic decisions
–Directly affect business process with event integration
IBM SPSS: Our core value proposition
SPSS’ goal is to apply analytics to optimize decisions at every contact point, made possible by
enabling pervasive, predictive real-time decisions at the point of impact
▪ SPSS Data Collection
– Collect additional attitudinal data for advanced analytics, typically gathered through surveys

▪ SPSS Statistics
– Expand analytics capabilities to the professional business user / statistician
– Add advanced statistical analysis to PM

▪ SPSS Modeler
– Provide predictive analytics using data mining & text mining methods for key parts of the business
– Predict future outcomes and understand what influences them

▪ SPSS Deployment & Collaboration Services
– Analytical asset management across multiple analysts
– Audit, security, refresh
– Provide a web service interface

▪ SPSS Analytic Server
– Provide Big Data connectivity to SPSS Modeler
– Translate SPSS Modeler Server requests into Hadoop jobs

▪ SPSS Analytical Decision Manager
– Business scenario analysis
– Complex rules for operational decision management

SPSS Predictive Analytic Platform
SPSS Modeler 16 Editions
• SPSS Modeler GOLD
- Enables organizations to build predictive models to improve business processes and help people or systems make the right decisions each time. It combines and integrates predictive analytics, rules, scoring, and optimization techniques to deliver recommended actions at the point of impact.
SPSS Modeler Premium + C&DS + Analytical Decision Management

• SPSS Modeler Premium
- Offers a range of advanced algorithms and capabilities including text analytics, entity analytics, social network analysis, and automated modeling and preparation techniques to address a multitude of business problems and analytic requirements on almost any type of data.
SPSS Modeler Professional + Text Analytics Workbench

• SPSS Modeler Professional
- Includes a range of advanced algorithms, data manipulation, and automated modeling and preparation techniques to build predictive models and uncover hidden patterns in structured data.
R is gaining in popularity. Do not walk away from R opportunities: it is not a competitor.
Are you ready?
▪ EMBRACE:
Integrate R algorithms (e.g. Random Forest)
Generate R charts
Use R functions for data preparation
Make R available for non-programmers

▪ EXTEND:
Scalability (e.g. database pushback)
Leverage R engines of other vendors like SAP HANA
Enterprise deployment
Big Data (Analytic Server)
Introducing CRISP-DM Methodology & SPSS Modeling Techniques
Modeler Interface
Stream Canvas
Stream, Outputs & Model Manager
Palettes
Nodes
Visual Programming with Modeler
-Visual programming
-Based on icons ("nodes")
-Pick nodes from palette & place them on the bench
-Edit their attributes
-Connect to specify flow of data ("streams")
Yellow Nugget or Yellow Diamond
Is the result of generating a predictive model
Can be exported to PMML to be reused outside of Modeler, for example in Java applications, SAS, or IBM InfoSphere Streams using the Data Mining Toolkit, …
CRoss-Industry Standard Process for Data Mining (CRISP-DM)
1. Business Understanding
Project objectives and requirements understanding, data mining problem definition

2. Data Understanding
Initial data collection and familiarization, data quality problems identification

3. Data Preparation
Table, record and attribute selection, data transformation and cleaning

4. Modeling
Modeling techniques selection and application, parameters calibration

5. Evaluation
Business objectives & issues achievement evaluation

6. Deployment
Result model deployment, repeatable data mining process implementation
2. Data Understanding
Initial data collection and familiarization, data quality problems identification
Reading Data
Modeler reads a variety of different file types, including data stored in spreadsheets and databases, using the nodes within the Sources palette.
Getting to Know your Data
Data Audit Node
Distribution Node
Histogram Node
…
3. Data Preparation
Table, record and attribute selection, data transformation and cleaning
Data Manipulation in Modeler
To prepare the data before analysis:
• Eliminate missing values
• Remove unwanted fields from analysis
• Derive new fields
• Merge and match data
Intermediate nodes in Modeler
• Record operation nodes
• Field operation nodes
▪CLEM language is case sensitive
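The preparation steps listed above can be sketched outside Modeler in plain Python. This is only an illustration of the ideas, not Modeler or CLEM code: the field names (`age`, `income`, `fax`), the `income_band` threshold, and the `orders` lookup are all hypothetical.

```python
def prepare(customers, orders):
    """Apply four typical preparation steps to a list of customer dicts."""
    # 1. Eliminate records with missing values (a record operation).
    complete = [c for c in customers if all(v is not None for v in c.values())]
    # 2. Remove an unwanted field from analysis (a field operation).
    trimmed = [{k: v for k, v in c.items() if k != "fax"} for c in complete]
    # 3. Derive a new field from an existing one (a field operation).
    derived = [
        dict(c, income_band="high" if c["income"] >= 60000 else "standard")
        for c in trimmed
    ]
    # 4. Merge with a second source by key, keeping matched records only.
    return [dict(c, orders=orders[c["id"]]) for c in derived if c["id"] in orders]


# Hypothetical input data
customers = [
    {"id": 1, "age": 34, "income": 52000, "fax": "n/a"},
    {"id": 2, "age": None, "income": 48000, "fax": "n/a"},
    {"id": 3, "age": 51, "income": 75000, "fax": "n/a"},
    {"id": 4, "age": 29, "income": 61000, "fax": "n/a"},
]
orders = {1: 3, 3: 7}  # hypothetical lookup: customer id -> order count

prepared = prepare(customers, orders)
for row in prepared:
    print(row)
```

Customer 2 is dropped because of the missing age, and customer 4 is dropped by the merge because it has no matching order record.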
CLEM language: The Expression Builder
4. Modeling
Modeling techniques selection and application, parameters calibration
Sampling or Partitioning your Data
• May not want to use all records
• Score your model with remaining Data
• May wish to examine a subgroup separately
• May assist us with building a predictive model (oversampling)
• Keep in mind that the sampling method must fit the problem at hand
-If the customers are similar and you want to reduce the size of the dataset for modelling, simple sampling is enough.
-But if you want to directly sample from a database with customers of
different types you may want to draw a complex sample.
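A minimal sketch of simple random sampling and of partitioning records into a part used to build the model and a remaining part used to score it. It uses only the Python standard library; the function names, the 70/30 split and the seed are assumptions made for illustration, and complex (for example stratified) sampling is not covered.

```python
import random


def simple_sample(records, n, seed=1):
    """Draw n records at random, without replacement."""
    return random.Random(seed).sample(records, n)


def partition(records, train_fraction=0.7, seed=1):
    """Shuffle a copy of the records and split it into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(100))  # stand-in for 100 customer records
sample = simple_sample(records, 10)
train, test = partition(records)
print(len(sample), len(train), len(test))
```

Fixing the seed makes the split repeatable, which helps when the same stream has to be rerun and compared.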
Matching Data to the Modeling Tool
• For example, if we want to use Rule Induction, we will need to think about:
-How the algorithm handles missing data
-The output that is created (binary versus larger splits)
-What we are trying to predict (numeric or binary target?)
-Which format the input predictors have to be in
Modeling Techniques in Modeler
• Supervised techniques (Predictive Models)
To model an output variable based on several input variables, to predict future cases where the outcome is unknown
-Neural Networks, Rule Induction (C5.0, CHAID, QUEST & C&RT)
-Decision List, Binary Classifier, Discriminant
-Linear Regression and Logistic Regression
-Generalized Linear Models
• Unsupervised Techniques (Clustering)
No field to predict, used to group similar records within the data
-Kohonen Networks, K-Means, Two Step, Anomaly
• Association Rules
To search for things that typically occur together
-APRIORI, CARMA, GRI and SLRM
• Data Reduction:
-PCA/Factor Analysis, Feature Selection
• Sequence Detection Models:
-Sequence
• Time Series
• Text Mining
SPSS Modeling Techniques
Association Models
Association Models
–Association rules search for things (events, purchases, attributes) that typically occur together in the data
–They find the patterns in data that you could manually find using visualization techniques such as the web node (yikes!) but can do so much faster and can explore more complex patterns.
–Used to answer questions such as:
• Do customers who buy fruit usually buy cheese?
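The fruit and cheese question can be made concrete with the three usual measures of an association rule: support, confidence and lift. The sketch below computes them directly for one rule on a tiny made-up set of baskets; it is not the Apriori or CARMA algorithm, which search over many candidate rules, and the basket contents are hypothetical.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(baskets)
    with_antecedent = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_antecedent if consequent in b]
    # Support: share of all baskets that contain both items.
    support = len(with_both) / n
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    # Lift: confidence relative to how common the consequent is overall.
    consequent_rate = sum(consequent in b for b in baskets) / n
    lift = confidence / consequent_rate if consequent_rate else 0.0
    return support, confidence, lift


# Hypothetical purchase baskets
baskets = [
    {"fruit", "cheese", "bread"},
    {"fruit", "cheese"},
    {"fruit", "milk"},
    {"cheese", "wine"},
    {"bread", "milk"},
]

support, confidence, lift = rule_metrics(baskets, "fruit", "cheese")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests that fruit buyers purchase cheese more often than customers in general; with only five baskets the numbers are purely illustrative.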
Output

SPSS Modeling Techniques
Segmentation Models
Segmentation or Clustering Models
–Clustering techniques segment data into groups of cases/records/customers that have similar patterns of input fields.
–Used in market segmentation studies whose aim is to find distinct types of customers so they can be targeted more effectively
–Used to answer questions such as:
• How can I group my customers to address the right marketing campaign?
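As a rough illustration of how a clustering technique groups similar records, here is a small K-Means sketch in plain Python. It is not Modeler's K-Means node: the start centroids are naively taken from the first k records, the fields are not rescaled, and the customer data (visits per month, average basket value) is invented.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, max_iterations=20):
    """Alternate between assigning records and recomputing centroids."""
    centroids = list(points[:k])  # naive start: the first k records
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # New centroid = mean of the cluster; an empty cluster keeps its old one.
        updated = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if updated == centroids:  # assignments have stabilised
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers: (visits per month, average basket value)
customers = [
    (1.0, 20.0), (2.0, 25.0), (1.5, 22.0),
    (9.0, 80.0), (10.0, 85.0), (9.5, 90.0),
]

centroids, labels = kmeans(customers, k=2)
print(labels)
print(centroids)
```

Each label is the segment a customer falls into; in practice the fields would usually be standardised first so that no single field dominates the distance.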
Clusters Output

SPSS Modeling Techniques
Classification & Statistical Models
Predictive or Classification Models
–Algorithms that are used to make predictions or forecasts based on historical data
–Automatic classification lets customers have the software determine the best algorithm, or customers can choose a specific algorithm such as Neural Networks, Logistic Regression, Time Series, etc.
–Used to answer questions such as:
• What predicts whether a customer will leave?
• What predicts whether this employee will be a super-star?
• How many umbrellas will I sell in the next three months in Chicago?
Output
5. Evaluation
Business objectives & issues achievement evaluation
6. Deployment
Result model deployment, repeatable data mining process implementation
Deployment Family: Products
▪IBM SPSS Collaboration and Deployment Services
– A foundation for managing and deploying analytics

▪IBM SPSS Analytical Decision Management
– Integrates analytics and business knowledge to deliver optimal outcomes
IBM SPSS Modeler Deployment Options
▪Client (Desktop)
–Access local files
–Connect to operational databases
–Connect to Cognos BI
–Processing performed on local installation

▪Client/Server
–Data operations/processing on server
–In-database data mining
–SQL pushback for PureData and Hadoop platforms
–Modeler Batch
–SuSE Linux Enterprise Server 10 (zLinux)
–Inclusion in Smart Analytics System for Power (AIX)

What’s New & Hot
Predictive Analytics for Big Data
Get more accurate models with a bigger volume and variety of data
- Read data from Hadoop
- Write back to Hadoop
- Export your models to Streams
- Prepare your data on Hadoop
- A few models can run on Hadoop
- R analytic capabilities in SPSS
Bring Analytics on Big Data for Everyone
Automatic Summarization
• Top findings in data ranked by “interestingness” and association strength
• Plain language synopsis

Automatic Exploration
• Guided presentation by selecting fields of interest
• Dynamic Visual Insights
• Users can refine auto generated parameters

Automatic Modeling
• Auto selection of best models and detection of strongest relationships: Decision Tree (CHAID) and Key Driver Reports (based on linear and logistic regression)

Sharing of Output
• Collaboration with peers
• Tablet optimization

SPSS Analytics Catalyst
CRISP-DM Methodology
Monte Carlo Simulation
Generate simulated data
Fit distributions from existing data
Evaluate the simulation
Example Use Cases:
- A retailer wants to simulate alternative sales scenarios to identify which strategy will make them most likely to hit their targets
- A parts manufacturer is interested in modeling storage costs based on simulating different scenarios for future part orders against stock supplies and excess order fees
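The retailer use case can be sketched as a small Monte Carlo simulation in Python: fit a distribution to past sales, generate many simulated quarters, and evaluate how often the target is reached. Everything specific here is an assumption for illustration: the monthly sales history, the choice of a normal distribution, the 5% promotion uplift and the target value. It does not reproduce Modeler's simulation nodes.

```python
import random
import statistics


def probability_of_hitting_target(history, target, uplift=1.0, months=3,
                                  trials=10_000, seed=42):
    """Estimate the chance that simulated sales over `months` reach `target`."""
    # Fit: estimate a normal distribution from the existing monthly sales.
    mu = statistics.mean(history) * uplift  # scenario shifts the mean
    sigma = statistics.stdev(history)
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Generate: one simulated period is the sum of `months` random draws.
        total = sum(rng.gauss(mu, sigma) for _ in range(months))
        if total >= target:
            hits += 1
    # Evaluate: share of simulated periods that reached the target.
    return hits / trials


# Hypothetical monthly sales history (units)
history = [100, 110, 95, 105, 120, 98, 102, 115]
target = 325

p_base = probability_of_hitting_target(history, target)
p_promo = probability_of_hitting_target(history, target, uplift=1.05)
print(f"baseline: {p_base:.1%}, with promotion: {p_promo:.1%}")
```

Running both scenarios with the same seed means they are compared on the same random draws, so the difference reflects the scenario and not sampling noise.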
Geospatial Data Mining – Understanding Geohashes
▪ Space-time Boxes use geohashes and timestamps to locate where
and when entities exist
▪ A geohash is a unique identifier that uses latitude and longitude to
create an alphanumeric string
▪ Its precision depends on its length; longer geohash = better
precision
▪ For example, geohash dr5ru7 is midtown Manhattan...but how do we
know?
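One way to answer "how do we know?" is to decode the geohash. The sketch below follows the standard public geohash scheme (base-32 characters, five bits each, bits alternating between longitude and latitude, each bit halving the current interval); the function names are my own and this is not the code Modeler uses.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_bounds(geohash):
    """Return (lat_min, lat_max, lon_min, lon_max) of the cell for a geohash."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    use_lon = True  # bits alternate, starting with longitude
    for char in geohash:
        value = BASE32.index(char)
        for shift in (4, 3, 2, 1, 0):  # five bits per character
            bit = (value >> shift) & 1
            interval = lon if use_lon else lat
            mid = (interval[0] + interval[1]) / 2
            if bit:
                interval[0] = mid  # keep the upper half
            else:
                interval[1] = mid  # keep the lower half
            use_lon = not use_lon
    return lat[0], lat[1], lon[0], lon[1]


def geohash_center(geohash):
    """Centre point (latitude, longitude) of the geohash cell."""
    lat_min, lat_max, lon_min, lon_max = geohash_bounds(geohash)
    return (lat_min + lat_max) / 2, (lon_min + lon_max) / 2


center_lat, center_lon = geohash_center("dr5ru7")
print(center_lat, center_lon)  # approximately 40.7565, -73.9874

# A longer geohash describes a smaller cell, hence better precision.
for prefix in ("dr5", "dr5ru", "dr5ru7"):
    _, _, lon_min, lon_max = geohash_bounds(prefix)
    print(prefix, "cell width in degrees of longitude:", lon_max - lon_min)
```

The decoded centre, roughly 40.7565 N and 73.9874 W, lies in midtown Manhattan, and each extra character shrinks the cell.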
What Exactly is a Space-Time Box?
▪ Space-time Boxes extend geohashes to include a third dimension: time
▪ Space-time Boxes ‘bin’ events in 3-D space and time
▪ Density (i.e. size) of the Space-time Box is a required input
▪ Can help analysts understand proximity between entities, verify relationships
dr5ru7|2013-01-01 00:00:00|2013-01-01 00:15:00
Geohash | Start timestamp | End timestamp
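A minimal sketch of how an event could be binned into a key of the form shown above. The `geohash|start|end` layout is taken from the slide; the function name, the 15-minute default and the choice to align time windows to midnight are assumptions, and the geohash is taken as given, not computed.

```python
from datetime import datetime, timedelta


def space_time_box(geohash, timestamp, minutes=15):
    """Build a 'geohash|start|end' key for the time window containing timestamp."""
    window = timedelta(minutes=minutes)
    day_start = timestamp.replace(hour=0, minute=0, second=0, microsecond=0)
    # Number of whole windows elapsed since midnight (assumed alignment).
    index = (timestamp - day_start) // window
    start = day_start + index * window
    end = start + window
    fmt = "%Y-%m-%d %H:%M:%S"
    return f"{geohash}|{start.strftime(fmt)}|{end.strftime(fmt)}"


# Two events in the same cell a few minutes apart fall into the same box.
box_a = space_time_box("dr5ru7", datetime(2013, 1, 1, 0, 7, 30))
box_b = space_time_box("dr5ru7", datetime(2013, 1, 1, 0, 12, 0))
print(box_a)
print(box_a == box_b)
```

Entities that share a box key were in roughly the same place at roughly the same time, which is what makes the key useful for checking proximity.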
IBM SPSS Modeler Embraces R
1. SPSS Modeler allows the user to build and score R models within the Modeler interface
2. SPSS Modeler allows the use of R functions for data preparation and chart/output creation
3. The Custom Dialog Builder for R allows the user to create custom nodes that run R algorithms, functions, or outputs
4. These custom nodes can be shared with other users and they do not require the end user to know any R code
Use R to build a custom node
The world of analytics made easy for everyone
Bouchra Denis Antoine Danil
I am Sandra, a data analyst.
USER
CODE
Sadly, SPSS Modeler cannot do EVERYTHING
SPSS Modeler Marketplace
App Store for Analytics
Spatial
Plot insightful interactive maps to explore your data
Visualize new patterns
Social
Enhance your client understanding with social data!
Analyse the public opinion!
Databases
Connect to noSQL databases!
Connect to Bluemix in 2 clicks!
Connect to bigSQL and Hadoop!
Models
For our Business Partner
Predict which customers will come back and how much they will spend
Implemented in a BI solution for a large retailer
and generate enterprise-grade reporting
And many more!
…
Come to our booth to try them out
More than 30 new functionalities
Potential growth
A lot of code already available in packages
R is a widely used language
Survey of use
[Bar chart; tools shown: R, IBM SPSS Statistics, Rapid Miner, SAS, Weka, Microsoft SQL Server, Matlab, IBM SPSS Modeler; axis from 0 % to 70 %]
Value
SPSS Modeler Marketplace
SPSS Modeler BRAND
SPSS Modeler USERS
IBM PARTNERS
NODE DEVELOPERS
Q&A

More Related Content

What's hot

Bi presentation Designing and Implementing Business Intelligence Systems
Bi presentation   Designing and Implementing Business Intelligence SystemsBi presentation   Designing and Implementing Business Intelligence Systems
Bi presentation Designing and Implementing Business Intelligence SystemsVispi Munshi
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business IntelligenceSukirti Garg
 
Building enterprise advance analytics platform
Building enterprise advance analytics platformBuilding enterprise advance analytics platform
Building enterprise advance analytics platformHaoran Du
 
Third Nature - Open Source Data Warehousing
Third Nature - Open Source Data WarehousingThird Nature - Open Source Data Warehousing
Third Nature - Open Source Data Warehousingmark madsen
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence ArchitecturePhilippe Julio
 
Advanced Analytics Platform for Big Data Analytics
Advanced Analytics Platform for Big Data AnalyticsAdvanced Analytics Platform for Big Data Analytics
Advanced Analytics Platform for Big Data AnalyticsArvind Sathi
 
SD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMSD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMBig Data Joe™ Rossi
 
Microsoft business intelligence
Microsoft business intelligenceMicrosoft business intelligence
Microsoft business intelligenceJawad Mohmand
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business IntelligenceAlmog Ramrajkar
 
Data-Ed Online Presents: Data Warehouse Strategies
Data-Ed Online Presents: Data Warehouse StrategiesData-Ed Online Presents: Data Warehouse Strategies
Data-Ed Online Presents: Data Warehouse StrategiesDATAVERSITY
 
Bi presentation to bkk
Bi presentation to bkkBi presentation to bkk
Bi presentation to bkkguest4e975e2
 
Business Intelligence Overview
Business Intelligence OverviewBusiness Intelligence Overview
Business Intelligence Overviewnetpeachteam
 
Microsoft Business Intelligence - Practical Approach & Overview
Microsoft Business Intelligence - Practical Approach & OverviewMicrosoft Business Intelligence - Practical Approach & Overview
Microsoft Business Intelligence - Practical Approach & OverviewLi Ken Chong
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryDomino Data Lab
 
Unified big data architecture
Unified big data architectureUnified big data architecture
Unified big data architectureDataWorks Summit
 
Webinar - Comparative Analysis of Cloud based Machine Learning Platforms
Webinar - Comparative Analysis of Cloud based Machine Learning PlatformsWebinar - Comparative Analysis of Cloud based Machine Learning Platforms
Webinar - Comparative Analysis of Cloud based Machine Learning PlatformsBigDataCloud
 
What exactly is Business Intelligence?
What exactly is Business Intelligence?What exactly is Business Intelligence?
What exactly is Business Intelligence?James Serra
 
Advanced Topics In Business Intelligence
Advanced Topics In Business IntelligenceAdvanced Topics In Business Intelligence
Advanced Topics In Business Intelligenceguest1a9ef2
 
Bi Architecture And Conceptual Framework
Bi Architecture And Conceptual FrameworkBi Architecture And Conceptual Framework
Bi Architecture And Conceptual FrameworkSlava Kokaev
 

What's hot (20)

Bi presentation Designing and Implementing Business Intelligence Systems
Bi presentation   Designing and Implementing Business Intelligence SystemsBi presentation   Designing and Implementing Business Intelligence Systems
Bi presentation Designing and Implementing Business Intelligence Systems
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Building enterprise advance analytics platform
Building enterprise advance analytics platformBuilding enterprise advance analytics platform
Building enterprise advance analytics platform
 
Third Nature - Open Source Data Warehousing
Third Nature - Open Source Data WarehousingThird Nature - Open Source Data Warehousing
Third Nature - Open Source Data Warehousing
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence Architecture
 
Advanced Analytics Platform for Big Data Analytics
Advanced Analytics Platform for Big Data AnalyticsAdvanced Analytics Platform for Big Data Analytics
Advanced Analytics Platform for Big Data Analytics
 
SD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBMSD Big Data Monthly Meetup #4 - Session 1 - IBM
SD Big Data Monthly Meetup #4 - Session 1 - IBM
 
Microsoft business intelligence
Microsoft business intelligenceMicrosoft business intelligence
Microsoft business intelligence
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business Intelligence
 
Spring 2017 Sage 300 (Accpac) Users Group
Spring 2017 Sage 300 (Accpac) Users GroupSpring 2017 Sage 300 (Accpac) Users Group
Spring 2017 Sage 300 (Accpac) Users Group
 
Data-Ed Online Presents: Data Warehouse Strategies
Data-Ed Online Presents: Data Warehouse StrategiesData-Ed Online Presents: Data Warehouse Strategies
Data-Ed Online Presents: Data Warehouse Strategies
 
Bi presentation to bkk
Bi presentation to bkkBi presentation to bkk
Bi presentation to bkk
 
Business Intelligence Overview
Business Intelligence OverviewBusiness Intelligence Overview
Business Intelligence Overview
 
Microsoft Business Intelligence - Practical Approach & Overview
Microsoft Business Intelligence - Practical Approach & OverviewMicrosoft Business Intelligence - Practical Approach & Overview
Microsoft Business Intelligence - Practical Approach & Overview
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
 
Unified big data architecture
Unified big data architectureUnified big data architecture
Unified big data architecture
 
Webinar - Comparative Analysis of Cloud based Machine Learning Platforms
Webinar - Comparative Analysis of Cloud based Machine Learning PlatformsWebinar - Comparative Analysis of Cloud based Machine Learning Platforms
Webinar - Comparative Analysis of Cloud based Machine Learning Platforms
 
What exactly is Business Intelligence?
What exactly is Business Intelligence?What exactly is Business Intelligence?
What exactly is Business Intelligence?
 
Advanced Topics In Business Intelligence
Advanced Topics In Business IntelligenceAdvanced Topics In Business Intelligence
Advanced Topics In Business Intelligence
 
Bi Architecture And Conceptual Framework
Bi Architecture And Conceptual FrameworkBi Architecture And Conceptual Framework
Bi Architecture And Conceptual Framework
 

Viewers also liked

Vondráková, A: The influence of applied cartographic methods on the map infor...
Vondráková, A: The influence of applied cartographic methods on the map infor...Vondráková, A: The influence of applied cartographic methods on the map infor...
Vondráková, A: The influence of applied cartographic methods on the map infor...indogpr
 
Popelka, S: Space-Time-Cube for Visualization of Eye-tracking data
Popelka, S: Space-Time-Cube for Visualization of Eye-tracking dataPopelka, S: Space-Time-Cube for Visualization of Eye-tracking data
Popelka, S: Space-Time-Cube for Visualization of Eye-tracking dataindogpr
 
Building a Spatial Database in PostgreSQL
Building a Spatial Database in PostgreSQLBuilding a Spatial Database in PostgreSQL
Building a Spatial Database in PostgreSQLKudos S.A.S
 
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger MarsenAdobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger MarsenAEM HUB
 
Open Architecture in the Adobe Marketing Cloud - Summit 2014
Open Architecture in the Adobe Marketing Cloud - Summit 2014Open Architecture in the Adobe Marketing Cloud - Summit 2014
Open Architecture in the Adobe Marketing Cloud - Summit 2014Paolo Mottadelli
 
O que é BIG DATA e como pode influenciar nossas vidas
O que é BIG DATA e como pode influenciar nossas vidasO que é BIG DATA e como pode influenciar nossas vidas
O que é BIG DATA e como pode influenciar nossas vidasElaine Naomi
 

Viewers also liked (7)

AAG_2011
AAG_2011AAG_2011
AAG_2011
 
Vondráková, A: The influence of applied cartographic methods on the map infor...
Vondráková, A: The influence of applied cartographic methods on the map infor...Vondráková, A: The influence of applied cartographic methods on the map infor...
Vondráková, A: The influence of applied cartographic methods on the map infor...
 
Popelka, S: Space-Time-Cube for Visualization of Eye-tracking data
Popelka, S: Space-Time-Cube for Visualization of Eye-tracking dataPopelka, S: Space-Time-Cube for Visualization of Eye-tracking data
Popelka, S: Space-Time-Cube for Visualization of Eye-tracking data
 
Building a Spatial Database in PostgreSQL
Building a Spatial Database in PostgreSQLBuilding a Spatial Database in PostgreSQL
Building a Spatial Database in PostgreSQL
 
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger MarsenAdobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
Adobe Marketing Cloud Integrations: Myth or Reality? by Holger Marsen
 
Open Architecture in the Adobe Marketing Cloud - Summit 2014
Open Architecture in the Adobe Marketing Cloud - Summit 2014Open Architecture in the Adobe Marketing Cloud - Summit 2014
Open Architecture in the Adobe Marketing Cloud - Summit 2014
 
O que é BIG DATA e como pode influenciar nossas vidas
O que é BIG DATA e como pode influenciar nossas vidasO que é BIG DATA e como pode influenciar nossas vidas
O que é BIG DATA e como pode influenciar nossas vidas
 

Similar to 05 predictive with spss

NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data FederationNRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data FederationNRB
 
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation NRB
 
IBM Industry Models and Data Lake
IBM Industry Models and Data Lake IBM Industry Models and Data Lake
IBM Industry Models and Data Lake Pat O'Sullivan
 
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...
Freddie Mac & KPMG Case Study – Advanced Machine Learning Data Integration wi...DataWorks Summit
 
Edr mds a less is more approach to MDM
Edr mds a less is more approach to MDMEdr mds a less is more approach to MDM
Edr mds a less is more approach to MDMThor Henning Hetland
 
Understanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityUnderstanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityDevOps.com
 
PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014
PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014
PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014Daniel Westzaan
 
Enterprise analytics journey from Helene Lyon
Enterprise analytics journey from Helene LyonEnterprise analytics journey from Helene Lyon
Enterprise analytics journey from Helene LyonHelene Lyon
 
Data modeling for the business 09282010
Data modeling for the business  09282010Data modeling for the business  09282010
Data modeling for the business 09282010ERwin Modeling
 
Visualisation and forecasting on IT capacity planning data
Visualisation and forecasting on IT capacity planning dataVisualisation and forecasting on IT capacity planning data
Visualisation and forecasting on IT capacity planning dataAndrew Gadsby
 
OC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMOC Big Data Monthly Meetup #6 - Session 1 - IBM
OC Big Data Monthly Meetup #6 - Session 1 - IBMBig Data Joe™ Rossi
 
LDM Webinar: Data Modeling & Business Intelligence
LDM Webinar: Data Modeling & Business IntelligenceLDM Webinar: Data Modeling & Business Intelligence
LDM Webinar: Data Modeling & Business IntelligenceDATAVERSITY
 
An AI Maturity Roadmap for Becoming a Data-Driven Organization
An AI Maturity Roadmap for Becoming a Data-Driven OrganizationAn AI Maturity Roadmap for Becoming a Data-Driven Organization
An AI Maturity Roadmap for Becoming a Data-Driven OrganizationDavid Solomon
 
Best practice for_agile_ds_projects
Best practice for_agile_ds_projectsBest practice for_agile_ds_projects
Best practice for_agile_ds_projectsKhalid Kahloot
 
Lecture 1.13 & 1.14 &1.15_Business Profiles in Big Data.pptx
Lecture 1.13 & 1.14 &1.15_Business Profiles in Big Data.pptxLecture 1.13 & 1.14 &1.15_Business Profiles in Big Data.pptx
Lecture 1.13 & 1.14 &1.15_Business Profiles in Big Data.pptxRATISHKUMAR32
 
Zementis hortonworks-webinar-2014-09
Zementis hortonworks-webinar-2014-09Zementis hortonworks-webinar-2014-09
Zementis hortonworks-webinar-2014-09Hortonworks
 
Analytics with IMS Assets - 2017
Analytics with IMS Assets - 2017Analytics with IMS Assets - 2017
Analytics with IMS Assets - 2017Helene Lyon
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
 
Hooduku - Big data analytics - case study
Hooduku - Big data analytics - case studyHooduku - Big data analytics - case study
Hooduku - Big data analytics - case studySudhi Seshachala
 
Latest corp big data and acme
Latest corp   big data and acmeLatest corp   big data and acme
Latest corp big data and acmehooduku
 

Similar to 05 predictive with spss (20)

NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data FederationNRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
 
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
 
IBM Industry Models and Data Lake
IBM Industry Models and Data Lake IBM Industry Models and Data Lake
05 predictive with spss

  • 1. A New Era for Predictive Analytics with SPSS
  • 2. © 2012 IBM Corporation. The Mining Metaphor: Gold Mining, Diamond Mining, Data Mining
  • 3. What is Data Mining? An early definition: finding patterns in your data which you can use to do your business better. –It’s about patterns –It’s about something you can use (practical things) –It’s about business. A recent definition: ▪ Business-oriented discovery of patterns across all forms of data ▪ Produces insight and a predictive capability ▪ Deployment of predictions throughout the enterprise
  • 4. What is Data Mining? Information Retrieval + Information Extraction + Information Analysis: discover new, previously unknown information
  • 5. IBM SPSS Supports the Predictive Enterprise: Delivering Profitable Revenue Growth & Operational Efficiency
 ▪Capture a complete perspective –Survey customers & constituents –Leverage structured, semi-structured & unstructured data
 ▪Predict behavior and preferences –Statistics for deeper insight –Data & text mining for predictive modeling
 ▪Act on results –Deploy scoring models for dynamic decisions –Directly affect business process with event integration
  • 6. IBM SPSS: Our core value proposition. SPSS’s goal is to apply analytics to optimize decisions at every contact point, made possible by enabling pervasive, predictive, real-time decisions at the point of impact

  • 7. SPSS Predictive Analytic Platform
 ▪ SPSS Data Collection – Collects additional attitudinal data for advanced analytics, typically through surveys
 ▪ SPSS Statistics – Expands analytics capabilities to the professional business user / statistician – Adds advanced statistical analysis to PM
 ▪ SPSS Modeler – Provides predictive analytics using data mining & text mining methods for key parts of the business – Predicts future outcomes and helps understand what influences them
 ▪ SPSS Deployment & Collaboration Services – Analytical asset management across multiple analysts – Audit, security, refresh – Provides a web service interface
 ▪ SPSS Analytic Server – Provides Big Data connectivity to SPSS Modeler – Translates SPSS Modeler Server requests into Hadoop jobs
 ▪ SPSS Analytical Decision Manager – Business scenario analysis – Complex rules for operational decision management
  • 8. SPSS Modeler 16 Editions
 • SPSS Modeler GOLD – Enables organizations to build predictive models to improve business processes and help people or systems make the right decisions each time. It combines and integrates predictive analytics, rules, scoring, and optimization techniques to deliver recommended actions at the point of impact. = SPSS Modeler Premium + C&DS + Analytical Decision Management
 • SPSS Modeler Premium – Offers a range of advanced algorithms and capabilities including text analytics, entity analytics, social network analysis, and automated modeling and preparation techniques to address a multitude of business problems and analytic requirements on almost any type of data. = SPSS Modeler Professional + Text Analytics Workbench
 • SPSS Modeler Professional – Includes a range of advanced algorithms, data manipulation, and automated modeling and preparation techniques to build predictive models and uncover hidden patterns in structured data.
  • 9. R is gaining in popularity. Do not walk away from R opportunities: it's not a competitor. Are you ready?
 ▪ EMBRACE: Integrate R algorithms (e.g. Random Forest); generate R charts; use R functions for data preparation; make R available to non-programmers
 ▪ EXTEND: Scalability (e.g. database pushback); leverage R engines of other vendors like SAP HANA; enterprise deployment; Big Data (Analytic Server)
  • 10. Introducing CRISP-DM Methodology & SPSS Modeling Techniques 

  • 11. Modeler Interface: Stream Canvas; Stream, Outputs & Model Manager; Palettes; Nodes
  • 12. Visual Programming with Modeler -Visual programming -Based on icons ("nodes") -Pick nodes from palette & place them on the bench -Edit their attributes -Connect to specify flow of data ("streams")
  • 13. Yellow Nugget or Yellow Diamond: the result of generating a predictive model. It can be exported to PMML to be reused outside of Modeler, for example in Java applications, SAS, or IBM InfoSphere Streams using the Data Mining Toolkit, …
  • 14. CRoss-Industry Standard Process for Data Mining (CRISP-DM)
 1. Business Understanding: understanding of project objectives and requirements, definition of the data mining problem
 2. Data Understanding: initial data collection and familiarization, identification of data quality problems
 3. Data Preparation: table, record and attribute selection, data transformation and cleaning
 4. Modeling: selection and application of modeling techniques, calibration of parameters
 5. Evaluation: evaluation of how well business objectives and issues are addressed
 6. Deployment: deployment of the resulting model, implementation of a repeatable data mining process
  • 15. 2. Data Understanding: initial data collection and familiarization, identification of data quality problems
  • 16. Reading Data: Modeler reads a variety of different file types, including data stored in spreadsheets and databases, using the nodes within the Sources palette.
  • 17. Getting to Know your Data: Data Audit Node, Distribution Node, Histogram Node, …
  • 18. 3. Data Preparation: table, record and attribute selection, data transformation and cleaning
  • 19. Data Manipulation in Modeler. To prepare the data before analysis: • Eliminate missing values • Remove unwanted fields from analysis • Derive new fields • Merge and match data. Intermediate nodes in Modeler: • Record operation nodes • Field operation nodes. ▪ The CLEM language is case sensitive
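The four preparation steps listed on this slide can be sketched outside Modeler in plain Python. This is only an illustration under stated assumptions: the customer and order records, the field names and the derived field are all hypothetical, and the code does not use Modeler nodes or CLEM.

```python
# Hypothetical customer and order records; every name and value is made up.
customers = [
    {"id": 1, "age": 34, "income": 52000, "notes": "vip"},
    {"id": 2, "age": None, "income": 41000, "notes": ""},
    {"id": 3, "age": 51, "income": 78000, "notes": ""},
]
orders = [
    {"id": 1, "total_spend": 1200.0},
    {"id": 3, "total_spend": 310.5},
]


def prepare(customers, orders):
    # 1. Eliminate records that contain missing values.
    complete = [c for c in customers if all(v is not None for v in c.values())]
    # 2. Remove a field that is not wanted in the analysis.
    trimmed = [{k: v for k, v in c.items() if k != "notes"} for c in complete]
    # 3. Derive a new field from existing ones.
    for c in trimmed:
        c["income_per_year_of_age"] = c["income"] / c["age"]
    # 4. Merge with a second table on the shared key (inner join on "id").
    spend_by_id = {o["id"]: o["total_spend"] for o in orders}
    return [
        dict(c, total_spend=spend_by_id[c["id"]])
        for c in trimmed
        if c["id"] in spend_by_id
    ]


prepared = prepare(customers, orders)
```

In Modeler the first and last steps would roughly correspond to record operations and the middle two to field operations, but that mapping is an interpretation of the slide, not something it states.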
  • 20. CLEM language: The Expression Builder
  • 21. 4. Modeling: selection and application of modeling techniques, calibration of parameters
  • 22. Sampling or Partitioning your Data • You may not want to use all records • Score your model with the remaining data • You may wish to examine a subgroup separately • Sampling may help with building a predictive model (oversampling) • Keep in mind that the sampling method must fit the problem at hand: -If the customers are similar and you only want to reduce the size of the dataset for modeling, simple sampling is enough. -If you sample directly from a database with customers of different types, you may want to draw a complex sample.
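A minimal sketch of the two ideas on this slide: partitioning records so that a model can be scored on the remaining data, and drawing a sample that respects customer types. Stratified sampling is used here as one simple stand-in for a "complex sample"; the record layout, the `segment` field and the fractions are hypothetical.

```python
import random


def partition(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into training and test sets."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def stratified_sample(records, key, fraction, seed=42):
    """Sample the same fraction from each group so every customer type is kept."""
    rng = random.Random(seed)
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: 80 retail customers and 20 corporate customers.
records = [{"id": i, "segment": "retail" if i % 5 else "corporate"} for i in range(100)]
train, test = partition(records)
sample = stratified_sample(records, "segment", 0.1)
```

With simple random sampling a small segment can vanish from the sample by chance; sampling per group avoids that at the cost of needing to know the grouping field in advance.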
  • 23. Matching Data to the Modeling Tool • For example, if we want to use Rule Induction, we will need to think about: -How the algorithm handles missing data -The output that is created (binary versus larger splits) -What we are trying to predict (numeric target or binary?) -Which format the input predictors have to be in
  • 24. Modeling Techniques in Modeler
 • Supervised techniques (Predictive Models): model an output variable based on several input variables, to predict future cases where the outcome is unknown -Neural Networks, Rule Induction (C5.0, CHAID, QUEST & C&RT) -Decision List, Binary Classifier -Linear Regression and Logistic Regression -Generalized Linear Models
 • Unsupervised techniques (Clustering): no field to predict; used to group similar records within the data -Kohonen Networks, K-Means, Two Step, Anomaly, Discriminant
 • Association Rules: search for things that typically occur together -APRIORI, CARMA, GRI and SLRM
 • Data Reduction: -PCA/Factor Analysis, Feature Selection
 • Sequence Detection Models: -Sequence
 • Time Series
 • Text Mining
  • 26. Association Models –Association rules search for things (events, purchases, attributes) that typically occur together in the data –They find the patterns in data that you could find manually using visualization techniques such as the Web node (yikes!), but they do so much faster and can explore more complex patterns –Used to answer questions such as: • Do customers who buy fruit usually buy cheese?
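The fruit-and-cheese question above is usually answered with two numbers per rule: support (how often both items occur together) and confidence (how often the consequent occurs given the antecedent). The sketch below computes both for a single rule on a handful of made-up baskets; it is not the APRIORI or CARMA implementation in Modeler, which additionally searches the space of candidate rules.

```python
# Hypothetical shopping baskets; each basket is the set of items bought together.
baskets = [
    {"fruit", "cheese", "bread"},
    {"fruit", "cheese"},
    {"fruit", "milk"},
    {"cheese", "wine"},
    {"fruit", "cheese", "wine"},
]


def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    with_antecedent = [b for b in baskets if antecedent <= b]
    with_both = [b for b in with_antecedent if consequent <= b]
    support = len(with_both) / len(baskets)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


support, confidence = rule_metrics(baskets, {"fruit"}, {"cheese"})
print(f"fruit -> cheese: support={support:.2f}, confidence={confidence:.2f}")
```

Here three of the five baskets contain both items, and three of the four fruit baskets also contain cheese, so the rule has support 0.60 and confidence 0.75.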
  • 27. Output
  • 29. Segmentation or Clustering Models –Clustering techniques segment data into groups of cases/records/customers that have similar patterns of input fields –Used in market segmentation studies whose aim is to find distinct types of customers so they can be targeted more effectively –Used to answer questions such as: • How can I group my customers so that each group gets the right marketing campaign?
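As an illustration of how a clustering technique groups records with similar input fields, here is a bare-bones k-means loop in plain Python. The customer data (age, yearly spend in thousands), the starting centroids and the iteration count are hypothetical, and the sketch skips steps a real analysis would need, such as scaling the fields.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Index of the centroid with the smallest squared Euclidean distance.
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (age, yearly spend in thousands).
customers = [(25, 1.0), (30, 1.5), (28, 1.2), (55, 9.0), (60, 8.5), (58, 9.5)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
```

On this toy data the two groups are the younger low spenders and the older high spenders; each resulting centroid can be read as the profile of a segment.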
  • 30. Clusters Output
  • 32. Predictive or Classification Models –Algorithms that are used to make predictions or forecasts based on historical data –Automatic classification lets the software determine the best algorithm, or customers can choose a specific algorithm such as Neural Networks, Logistic Regression, Time Series, etc. –Used to answer questions such as: • What predicts whether a customer will leave? • What predicts whether this employee will be a super-star? • How many umbrellas will I sell in the next three months in Chicago?
  • 33. Output
  • 34. 5. Evaluation: evaluation of how well business objectives and issues are addressed
  • 35. 6. Deployment: deployment of the resulting model, implementation of a repeatable data mining process
  • 36. Deployment Family: Products
 ▪IBM SPSS Collaboration and Deployment Services – A foundation for managing and deploying analytics
 ▪IBM SPSS Analytical Decision Management – Integrates analytics and business knowledge to deliver optimal outcomes
  • 37. IBM SPSS Modeler Deployment Options
 ▪Client (Desktop) –Access local files –Connect to operational databases –Connect to Cognos BI –Processing performed on local installation
 ▪Client/Server –Data operations/processing on server –In-database data mining –SQL pushback for PureData and Hadoop platforms –Modeler Batch –SuSE Linux Enterprise Server 10 (zLinux) –Inclusion in Smart Analytics System for Power (AIX)
  • 39. Predictive Analytics for Big Data
 Get more accurate models with a bigger volume and variety of data - Read data from Hadoop - Write back to Hadoop - Export your models to Streams - Prepare your data on Hadoop - Few models can run on Hadoop - R analytic capabilities in SPSS
  • 40. SPSS Analytics Catalyst: Bring Analytics on Big Data for Everyone (CRISP-DM methodology)
 Automatic Summarization • Top findings in data ranked by “interestingness” and association strength • Plain language synopsis
 Automatic Exploration • Guided presentation by selecting fields of interest • Dynamic Visual Insights • Users can refine auto-generated parameters
 Automatic Modeling • Auto selection of best models and detection of strongest relationships: Decision Tree (CHAID) and Key Driver Reports (based on linear and logistic regression)
 Sharing of Output • Collaboration with peers • Tablet optimization
  • 41. Monte Carlo Simulation: generate simulated data; fit distributions from existing data; evaluate the simulation. Example use cases: - A retailer wants to simulate alternative sales scenarios to identify which strategy will make them most likely to hit their targets - A parts manufacturer is interested in modeling storage costs based on simulating different scenarios for future part orders against stock supplies and excess order fees
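The three steps on this slide (fit a distribution from existing data, generate simulated data, evaluate the simulation) can be sketched for the retailer use case as follows. Everything specific here is an assumption for illustration: the historical sales figures are invented, a normal distribution is assumed to describe monthly sales, and months are treated as independent.

```python
import random
import statistics

# Hypothetical monthly sales history (in thousands); values are made up.
historical_monthly_sales = [98, 105, 110, 92, 101, 107, 95, 103, 99, 108, 111, 94]


def probability_of_hitting_target(history, months, target, runs=10_000, seed=1):
    """Estimate the chance that total sales over `months` reach `target`."""
    # Fit: estimate the parameters of the assumed normal distribution.
    mu = statistics.mean(history)
    sigma = statistics.stdev(history)
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        # Generate: one simulated scenario is a sum of independent monthly draws.
        total = sum(rng.gauss(mu, sigma) for _ in range(months))
        if total >= target:
            hits += 1
    # Evaluate: share of simulated scenarios that reach the target.
    return hits / runs


p_modest = probability_of_hitting_target(historical_monthly_sales, months=3, target=280)
p_ambitious = probability_of_hitting_target(historical_monthly_sales, months=3, target=330)
```

Comparing alternative strategies would mean running the same simulation with different fitted or assumed distributions per strategy and comparing the resulting probabilities.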
  • 42. Geospatial Data Mining: Understanding Geohashes ▪ Space-time Boxes use geohashes and timestamps to locate where and when entities exist ▪ A geohash is a unique identifier: an alphanumeric string created from latitude and longitude ▪ Its precision depends on its length; longer geohash = better precision ▪ For example, geohash dr5ru7 is midtown Manhattan...but how do we know?
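One way to answer "how do we know?" is to decode the string. The sketch below follows the standard public geohash scheme (base-32 characters, five bits each, bits alternating between longitude and latitude and halving the range each time); the function name is our own, and nothing here is taken from Modeler's implementation.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_bounds(geohash):
    """Return ((lat_min, lat_max), (lon_min, lon_max)) for a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    is_lon = True  # the first bit refines longitude, then bits alternate
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            rng = lon_range if is_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half
            else:
                rng[1] = mid  # keep the lower half
            is_lon = not is_lon
    return tuple(lat_range), tuple(lon_range)


(lat_min, lat_max), (lon_min, lon_max) = geohash_bounds("dr5ru7")
center = ((lat_min + lat_max) / 2, (lon_min + lon_max) / 2)
print(center)
```

The center of `dr5ru7` comes out at roughly latitude 40.757, longitude -73.987, which is in midtown Manhattan. Decoding the shorter prefix `dr5` gives a much larger box, which is the "longer geohash = better precision" point from the slide.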
  • 43. What Exactly is a Space-Time Box? ▪ Space-time Boxes extend geohashes to include a third dimension: time ▪ Space-time Boxes ‘bin’ events in 3-D space and time ▪ Density (i.e. size) of the Space-time Box is a required input ▪ Can help analysts understand proximity between entities and verify relationships. Example: dr5ru7|2013-01-01 00:00:00|2013-01-01 00:15:00 (geohash | start timestamp | end timestamp)
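A minimal sketch of how an event could be binned into a key of the form shown on the slide. The key layout mirrors the slide's example; the binning rule (fixed-length windows aligned to midnight) and the function name are assumptions made for illustration, not Modeler's documented behavior.

```python
from datetime import datetime, timedelta


def space_time_box(geohash, timestamp, window_minutes=15):
    """Build a 'geohash|start|end' key for the time window containing timestamp."""
    day_start = timestamp.replace(hour=0, minute=0, second=0, microsecond=0)
    minutes_into_day = (timestamp - day_start) // timedelta(minutes=1)
    # Round down to the start of the window the event falls into.
    window_start = (minutes_into_day // window_minutes) * window_minutes
    start = day_start + timedelta(minutes=window_start)
    end = start + timedelta(minutes=window_minutes)
    fmt = "%Y-%m-%d %H:%M:%S"
    return f"{geohash}|{start.strftime(fmt)}|{end.strftime(fmt)}"


# Two hypothetical events in the same place, a few minutes apart.
key_a = space_time_box("dr5ru7", datetime(2013, 1, 1, 0, 7, 30))
key_b = space_time_box("dr5ru7", datetime(2013, 1, 1, 0, 12, 5))
print(key_a)
```

Both events produce the key `dr5ru7|2013-01-01 00:00:00|2013-01-01 00:15:00`, so entities that were near each other in space and time can be found by grouping on the key. Shortening the geohash or lengthening the window makes the box larger.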
  • 44. IBM SPSS Modeler Embraces R 1. SPSS Modeler allows the user to build and score R models within the Modeler interface 2. SPSS Modeler allows the use of R functions for data preparation and chart/output creation 3. The Custom Dialog Builder for R allows the user to create custom nodes that run R algorithms, functions, or outputs 4. These custom nodes can be shared with other users and they do not require the end user to know any R code
  • 45. Use R to build a custom node
  • 46. The world of analytics made easy for everyone. Bouchra, Denis, Antoine, Danil
  • 47. I am Sandra, a data analyst. USER CODE
  • 48. Sadly, SPSS Modeler cannot do EVERYTHING
  • 49. SPSS Modeler Marketplace: App Store for Analytics
  • 52. Spatial: Plot insightful interactive maps to explore your data. Visualize new patterns
  • 53. Social: Enhance your client understanding with social data! Analyse public opinion!
  • 54. Databases: Connect to NoSQL databases! Connect to Bluemix in 2 clicks! Connect to BigSQL and Hadoop!
  • 55. Models: For our Business Partner. Predict which customers will come back and how much they will spend. Implemented in a BI solution for a large retailer. Generate enterprise-grade reporting
  • 56. And many more! More than 30 new functionalities … Come to our booth to try them out
  • 58. Potential growth: R is a widely used language; a lot of code is already available in packages. [Chart: survey of use, comparing R, IBM SPSS Statistics, Rapid Miner, SAS, Weka, Microsoft SQL Server, Matlab and IBM SPSS Modeler]
  • 59. Value: SPSS Modeler Marketplace, SPSS Modeler BRAND, SPSS Modeler USERS, IBM PARTNERS, NODE DEVELOPERS
  • 60. Q&A