This document summarizes a TIBCO Advanced Analytics meetup. It includes an agenda for presentations on TIBCO Analytics and data science, predictive analytics using TERR expressions, real-time analytics, APIs, and a question/answer wrap-up session. It also provides overviews of the Spotfire platform for data visualization and analytics, Spotfire capabilities for accessing and preparing data from various sources, and supported data sources.
Extending the Reach of R to the Enterprise with TERR and Spotfire (Lou Bajuk)
An overview of how TIBCO integrates dynamic, interactive visual applications in Spotfire with predictive and advanced analytics in the R language, using TIBCO Enterprise Runtime for R, our R-compatible, enterprise-grade platform for the R language.
Democratizing Data Science Using Spark, Hive and Druid (DataWorks Summit)
MZ is re-inventing how the entire world experiences data via our mobile games division MZ Games Studios, our digital marketing division Cognant, and our live data platform division Satori.
The growing need for data science capabilities across the organization requires an architecture that can democratize the building of these applications and disseminate the resulting insights to the wider organization.
Attend this session to learn how we built a data science platform using Spark, Hive, and Druid for our performance marketing division, Cognant. This platform powers several data science applications, such as fraud detection and bid optimization, at large scale.
We will share lessons learned over the past three years of building this platform, walking through some of the actual data science applications built on top of it.
Attendees with ML engineering or data science backgrounds can gain deep insight from our experience building this platform.
Speakers
Pushkar Priyadarshi, Director of Engineering, Machine Zone Inc.
Igor Yurinok, Staff Software Engineer, MZ
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg... (Databricks)
The modernization of the tobacco industry is resulting in a shift towards a more data-driven approach to trade, operations and the consumer. The need to scale while maintaining margins is paramount, and today’s consumer requires more personalized engagement and value at every interaction to drive sales and revenue.
At Altria, we’re at the forefront of this evolution, leveraging hundreds of terabytes of big data (such as point-of-sale, clickstream, mobile data, and more) and machine learning to improve our ability to make smarter decisions and outpace the competition. This talk recaps our big data journey from a legacy data infrastructure (Teradata), isolated data systems, and a lack of resources that prevented us from moving quickly and scaling, to our current state, where we have successfully implemented, architected, and on-boarded tools and processes across the stages of data acquisition, storage, preparation, and business intelligence with Azure Data Lake, Azure Databricks, Azure Data Factory, API Management, and streaming and hosting technologies, providing a data analytics platform.
We’ll discuss the roadblocks we came across, how we overcame them, and how we employed a unified approach to big data and analytics through the fully managed Azure Databricks platform and the Azure suite of tools, which allowed us to streamline workflows, improve operational performance, and ultimately introduce new customer experiences that drive engagement and revenue.
Databricks Whitelabel: Making Petabyte Scale Data Consumable to All Our Custo... (Databricks)
Wejo has the largest connected vehicle dataset in the world, processing 17 billion data points a day. Our data is of value to customers in multiple industries and of multiple sizes. By utilising the Databricks white-label offering, which allows controlled, secure access to our data, we have opened up the unique value of Wejo data to a whole new user base.
An Introduction to Graph: Database, Analytics, and Cloud Services (Jean Ihm)
Graph analysis employs powerful algorithms to explore and discover relationships in social network, IoT, big data, and complex transaction data. Learn how graph technologies are used in applications such as fraud detection for banking, customer 360, public safety, and manufacturing. This session will provide an overview and demos of graph technologies for Oracle Cloud Services, Oracle Database, NoSQL, Spark and Hadoop, including PGX analytics and the PGQL property graph query language.
Presented at Analytics and Data Summit, March 20, 2018
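The centrality algorithms such sessions demonstrate can be surprisingly compact. As an illustration only (not code from any Oracle product), here is a minimal pure-Python PageRank, the kind of measure used to surface influential nodes in fraud-detection graphs; the damping factor, iteration count, and toy graph are all assumptions for the sketch:

```python
def pagerank(graph, damping=0.85, iterations=50):
    """PageRank by power iteration. graph: dict node -> list of out-neighbours."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly across all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[node] / n
        rank = new_rank
    return rank

# Toy "transaction" graph: the account receiving from many sources ranks highest.
g = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(g)
print(max(ranks, key=ranks.get))  # "c"
```

In a real deployment this computation would run inside an engine like PGX rather than in application code; the sketch only shows the idea.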
Reaching scale limits on a Hadoop platform: issues and errors created by spee... (DataWorks Summit)
Santander UK’s Big Data journey began in 2014, using Hadoop to make the most of our data and generate value for customers. Within 9 months, we created a highly available, real-time, customer-facing application for customer analytics. We currently have 500 different people doing their own analysis and projects with this data, spanning a total of 50 different use cases. This data (consisting of over 40 million customer records with billions of transactions) provides our business with new insights that were inaccessible before.
Our business moves quickly, with several products and 20 use cases currently in production. We currently have a customer data lake and a technical data lake. Having a platform with very different workloads has proven to be challenging.
Our success in generating value created such growth in terms of data, use cases, analysts, and usage patterns that three years later we are finding scalability issues in HDFS, the Hive metastore, and Hadoop operations, and challenges with highly available architectures built on HBase, Flume, and Kafka. Going forward we are exploring alternative architectures, including a hybrid cloud model, and moving towards streaming.
Our goal with this session is to assist people in the early part of their journey by building a solid foundation. We hope that others can benefit from us sharing our experiences and lessons learned during our journey.
Speaker
Nicolette Bullivant, Head of Data Engineering at Santander UK Technology, Santander UK Technology
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition (Rittman Analytics)
Presentation at ODTUG KScope'18 on the data engineering and advanced analytics capabilities in Oracle Analytics Cloud Data Lake Edition, Oracle Big Data Cloud and Oracle Event Hub Cloud Service
What’s New with Databricks Machine Learning (Databricks)
In this session, the Databricks product team provides a deeper dive into the machine learning announcements. Join us for a detailed demo that gives you insights into the latest innovations that simplify the ML lifecycle — from preparing data, discovering features, and training and managing models in production.
Phar Data Platform: From the Lakehouse Paradigm to the Reality (Databricks)
Despite the increased availability of ready-to-use generic tools, more and more enterprises are deciding to build in-house data platforms. This practice, common for some time in research labs and digital-native companies, is now making waves across large enterprises that traditionally used proprietary solutions and outsourced most of their IT. The availability of large volumes of data, coupled with increasingly complex analytical use cases driven by innovations in data science, has rendered these traditional, on-premises architectures obsolete in favor of cloud architectures powered by open source technologies.
The idea of building an in-house platform at a larger enterprise comes with many challenges of its own: building an architecture that combines the best elements of data lakes and data warehouses to accommodate use cases from BI to ML; the need to interoperate with all the company’s data and technology, including legacy systems; and cultural transformation, including a commitment to adopt agile processes and data-driven approaches.
This presentation describes a success story on building a Lakehouse in an enterprise such as LIDL, a successful chain of grocery stores operating in 32 countries worldwide. We will dive into the cloud-based architecture for batch and streaming workloads based on many different source systems of the enterprise and how we applied security on architecture and data. We will detail the creation of a curated Data Lake comprising several layers from a raw ingesting layer up to a layer that presents cleansed and enriched data to the business units as a kind of Data Marketplace.
A lot of focus and effort went into building a semantic Data Lake as a sustainable and easy-to-use basis for the Lakehouse, as opposed to just dumping source data into it. The first use case applied to the Lakehouse is the Lidl Plus Loyalty Program. It is already deployed to production in 26 countries, with data from more than 30 million customers analyzed on a daily basis. In parallel to productionizing the Lakehouse, a cultural and organizational change process was undertaken to get all involved units to buy into the new data-driven approach.
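The layered flow the talk describes (raw ingestion up to a business-facing "Data Marketplace") can be sketched in miniature. The layer functions, record fields, and sample data below are invented for illustration, with in-memory lists standing in for storage zones; they are not LIDL's actual schema or pipeline:

```python
# Raw layer: data as ingested, untyped and possibly noisy.
raw_layer = [
    {"store": "de-001", "amount": "12.30", "ts": "2021-05-01"},
    {"store": "de-001", "amount": "bad",   "ts": "2021-05-01"},  # ingest noise
]

def to_cleansed(raw):
    """Cleansed layer: parse types and drop records that fail validation."""
    out = []
    for rec in raw:
        try:
            out.append({"store": rec["store"],
                        "amount": float(rec["amount"]),
                        "ts": rec["ts"]})
        except ValueError:
            pass  # a real pipeline would quarantine rejected records
    return out

def to_marketplace(cleansed):
    """Enriched layer: aggregate into a business-facing view."""
    totals = {}
    for rec in cleansed:
        totals[rec["store"]] = totals.get(rec["store"], 0.0) + rec["amount"]
    return totals

print(to_marketplace(to_cleansed(raw_layer)))  # {'de-001': 12.3}
```

The point of the layering is that each zone has a contract: downstream consumers see only validated, typed, aggregated data rather than raw source dumps.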
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...) (Revolution Analytics)
Presented by David Smith, Chief Community Officer, Revolution Analytics, at the Gartner Business Intelligence and Analytics Summit, April 2014.
In this presentation, I'll introduce the open source R language — the modern standard for Data Science — and the enhanced performance, scalability and ease-of-use capabilities of Revolution R Enterprise. Customer case studies will illustrate Revolution R Enterprise as a component of the real-time analytics deployment process, via integration with Hadoop, database warehousing systems and Cloud platforms, to implement data-driven end-user applications.
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016 (StampedeCon)
This session will detail best practices for architecting, building, operating and managing an Analytics Data Lake platform. Key topics will include:
1) Defining next-generation Data Lake architectures. The de facto standard has been commodity DAS servers with HDFS, but there are now multiple solutions aimed at separating compute and storage, virtualizing or containerizing Hadoop applications, and utilizing Hadoop-compatible or embedded HDFS filesystems. This portion will explore the options available, and the pros and cons of each.
2) Data Ingest. There are many ways to load data into a Data Lake, including standardized Apache tools (Sqoop, Flume, Kafka, Storm, Spark, NiFi), standard file and object protocols (SFTP, NFS, REST, WebHDFS), and proprietary tools (e.g., Zaloni Bedrock, DataTorrent). This section will explore these options in the context of best fit to workflows; it will also look at key gaps and challenges, particularly in the areas of data formats and integration with metadata/cataloging tools.
3) Metadata & Cataloguing. One of the biggest inhibitors of successful Data Lake deployments is Data Governance, particularly in the areas of indexing, cataloguing and metadata management. It is nearly impossible to run analytics on top of a Data Lake and get meaningful & timely results without solving these problems. This portion will explore both emerging open standards (Apache Atlas, HCatalog) and proprietary tools (Cloudera Navigator, Zaloni Bedrock/Mica, Informatica Metadata Manager), and balance the pros, cons and gaps of each.
4) Security & Access Controls. Solving these challenges is key for adoption in regulation-driven industries like Healthcare & Financial Services. There are multiple Apache projects and proprietary tools to address this, but the challenge is making security and access controls consistent across the entire application and infrastructure stack and over the data lifecycle, and being able to audit this in the face of legal challenges. This portion will explore available options and best practices.
5) Provisioning & Workflow Management. The real promise of the Data Lake is integrating analytics workflows and tools on converged infrastructure, with shared data, and building “As A Service” architectures oriented towards self-service data exploration and analytics for end users. This is an emerging and immature area, but this session will explore some potential concepts, tools and options to achieve this.
This will be a moderately technical session, with the above topics being illustrated by real world examples. Attendees should have basic familiarity with Hadoop and the associated Apache projects.
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integration (Cesare Cugnasco)
Data visualization can be a tricky problem, even more so when the dataset consists of several billion 3-dimensional particles moving over time. The talk will focus on some simple indexing and data-thinning techniques and how (and how not) to implement them with Cassandra and Spark.
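As a hedged illustration of what "data thinning" can mean here (the talk's actual techniques are not specified in the abstract), one of the simplest approaches is grid-based downsampling: keep one representative particle per 3-D cell. The cell size and sample coordinates below are invented:

```python
def thin(points, cell=1.0):
    """Keep the first point seen in each (x, y, z) grid cell of side `cell`."""
    seen = {}
    for p in points:
        key = tuple(int(c // cell) for c in p)  # quantize each coordinate
        if key not in seen:
            seen[key] = p
    return list(seen.values())

particles = [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6),   # same cell -> collapsed
             (1.5, 0.2, 0.3), (2.7, 3.1, 0.9)]
print(len(thin(particles, cell=1.0)))  # 3 representatives remain
```

In a Cassandra-backed design, a quantized cell key like this could plausibly double as a partition key so that nearby particles land together, though that is a design assumption, not something the abstract states.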
Delta Lake: Open Source Reliability w/ Apache Spark (George Chow)
As presented: Sajith Appukuttan, Solution Architect, Databricks
Sept 12, 2019 at Vancouver Spark Meetup
Abstract: Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake offers ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs.
Why would you want to use R capabilities within FME? It allows you to bring together the strengths of both applications. FME is capable of reading and writing a wide variety of formats, and has powerful data preparation and clean-up transformations. R offers a wide variety of statistical tools, including classical statistical analysis, data analysis, analytics, and plotting. We’ll show you how to get the best of both worlds.
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl... (Lucas Jellema)
Our technology has gotten smart and fast enough to make predictions and come up with recommendations in near real time. Machine Learning is the art of deriving models from our Big Data collections – harvesting historic patterns and trends – and applying those models to new data in order to rapidly and adequately respond to that data. This presentation will explain and demonstrate in simple, straightforward terms and using easy to understand practical examples what Machine Learning really is and how it can be useful in our world of applications, integrations and databases. Hadoop and Spark, real time and streaming analytics, Watson and Cloud Datalab, Jupyter Notebooks, Oracle Machine Learning CS and the Citizen Data Scientists all make their appearance, as does SQL.
Processing transactions is at the core of any bank’s business. Danske Bank’s journey started with recognising the value that could be gleaned from generating insights from the data to improve customer behaviour analytics. Today, the company streams large volumes of transactional data in near-real time onto its Hortonworks Data Platform to improve fraud detection and customer marketing. In this session, Nadeem will outline the bank’s vision, how it was socialised across the executive board team and the resulting sponsorship, the technological path, the challenges overcome, and the results, which have not only improved the customer experience but also delivered quantifiable fraud metrics and opened new revenue streams. Furthermore, Nadeem will cover future use cases around maintenance and operations.
Moving Targets: Harnessing Real-time Value from Data in Motion (Inside Analysis)
The Briefing Room with David Loshin and Datawatch
Live Webcast Feb. 17, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=4a053043c45cf0c2f6453dfb8577c72a
Patience may be a virtue, but when it comes to streaming analytics, waiting is not an option. Between Big Data and the Internet of Things, businesses are faced with more data and greater complexity than ever before. Traditional information architectures simply cannot support the kind of processing necessary to make use of this fast-moving resource. The modern context requires a shorter path to analytics, one that narrows the gap between governance and discovery.
Register for this episode of The Briefing Room to hear veteran Analyst David Loshin as he explains how the prevalence of streaming data is changing business pace and processes. He’ll be briefed by Dan Potter of Datawatch, who will tout his company’s real-time data discovery platform for data in motion. He will show how self-service data preparation can lead to faster insights, ultimately fostering the ability to make precise decisions at the right time.
Visit InsideAnalysis.com for more information.
An Introduction to Graph: Database, Analytics, and Cloud ServicesJean Ihm
Graph analysis employs powerful algorithms to explore and discover relationships in social network, IoT, big data, and complex transaction data. Learn how graph technologies are used in applications such as fraud detection for banking, customer 360, public safety, and manufacturing. This session will provide an overview and demos of graph technologies for Oracle Cloud Services, Oracle Database, NoSQL, Spark and Hadoop, including PGX analytics and PGQL property graph query language.
Presented at Analytics and Data Summit, March 20, 2018
Reaching scale limits on a Hadoop platform: issues and errors created by spee...DataWorks Summit
Santander UK’s Big Data journey began in 2014, using Hadoop to make the most of our data and generate value for customers. Within 9 months, we created a highly available real-time customer facing application for customer analytics. We currently have 500 different people doing their own analysis and projects with this data, spanning a total of 50 different use cases. This data, (consisting of over 40 million customer records with billions of transactions), provides our business new insights that were inaccessible before.
Our business moves quickly, with several products and 20 use cases currently in production. We currently have a customer data lake and a technical data lake. Having a platform with very different workloads has proven to be challenging.
Our success in generating value created such growth in terms of data, use cases, analysts and usage patterns that 3 years later we find issues with scalability in HDFS, Hive metastore and Hadoop operations and challenges with highly available architectures with Hbase, Flume and Kafka. Going forward we are exploring alternative architectures including a hybrid cloud model, and moving towards streaming.
Our goal with this session is to assist people in the early part of their journey by building a solid foundation. We hope that others can benefit from us sharing our experiences and lessons learned during our journey.
Speaker
Nicolette Bullivant, Head of Data Engineering at Santander UK Technology, Santander UK Technology
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionRittman Analytics
Presentation at ODTUG KScope'18 on the data engineering and advanced analytics capabilities in Oracle Analytics Cloud Data Lake Edition, Oracle Big Data Cloud and Oracle Event Hub Cloud Service
What’s New with Databricks Machine LearningDatabricks
In this session, the Databricks product team provides a deeper dive into the machine learning announcements. Join us for a detailed demo that gives you insights into the latest innovations that simplify the ML lifecycle — from preparing data, discovering features, and training and managing models in production.
Phar Data Platform: From the Lakehouse Paradigm to the RealityDatabricks
Despite the increased availability of ready-to-use generic tools, more and more enterprises are deciding to build in-house data platforms. This practice, common for some time in research labs and digital native companies, is now making its waves across large enterprises that traditionally used proprietary solutions and outsourced most of their IT. The availability of large volumes of data, coupled with more and more complex analytical use cases driven by innovations in data science have yielded these traditional and on premise architectures to become obsolete in favor of cloud architectures powered by open source technologies.
The idea of building an in-house platform at a larger enterprise comes with many challenges of its own: Build an Architecture that combines the best elements of data lakes and data warehouses to accommodate all kinds from BI to ML use cases. The need to interoperate with all the company’s data and technology, including legacy systems. Cultural transformation, including a commitment to adopt agile processes and data driven approaches.
This presentation describes a success story on building a Lakehouse in an enterprise such as LIDL, a successful chain of grocery stores operating in 32 countries worldwide. We will dive into the cloud-based architecture for batch and streaming workloads based on many different source systems of the enterprise and how we applied security on architecture and data. We will detail the creation of a curated Data Lake comprising several layers from a raw ingesting layer up to a layer that presents cleansed and enriched data to the business units as a kind of Data Marketplace.
A lot of focus and effort went into building a semantic Data Lake as a sustainable and easy to use basis for the Lakehouse as opposed to just dumping source data into it. The first use case being applied to the Lakehouse is the Lidl Plus Loyalty Program. It is already deployed to production in 26 countries with more than 30 millions of customers’ data being analyzed on a daily basis. In parallel to productionizing the Lakehouse, a cultural and organizational change process was undertaken to get all involved units to buy into the new data driven approach.
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...Revolution Analytics
Presented by David Smith, Chief Community Officer, Revolution Analytics at Garner Business Intelligence and Analytics Summit, April 2014.
In this presentation, I'll introduce the open source R language — the modern standard for Data Science — and the enhanced performance, scalability and ease-of-use capabilities of Revolution R Enterprise. Customer case studies will illustrate Revolution R Enterprise as a component of the real-time analytics deployment process, via integration with Hadoop, database warehousing systems and Cloud platforms, to implement data-driven end-user applications.
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016StampedeCon
This session will detail best practices for architecting, building, operating and managing an Analytics Data Lake platform. Key topics will include:
1) Defining next-generation Data Lake architectures. The defacto standard has been commodity DAS servers with HDFS, but there are now multiple solutions aimed at separating compute and storage, virtualizing or containerizing Hadoop applications, and utilizing Hadoop compatible or embedded HDFS filesystems. This portion will explore the options available, and the pros and cons of each.
2) Data Ingest. There are many ways to load data into a Data Lake, including standardized Apache tools (Sqoop, Flume, Kafka, Storm, Spark, NiFi), standard file and object protocols (SFTP, NFS, Rest, WebHDFS), and proprietary tools (eg, Zaloni Bedrock, DataTorrent). This section will explore these options in the context of best fit to workflows; it will also look at key gaps and challenges, particularly in the areas of data formats and integration with metadata/cataloging tools.
3) Metadata & Cataloguing. One of the biggest inhibitors of successful Data Lake deployments is Data Governance, particularly in the areas of indexing, cataloguing and metadata management. It is nearly impossible to run analytics on top of a Data Lake and get meaningful & timely results without solving these problems. This portion will explore both emerging open standards (Apache Atlas, HCatalog) and proprietary tools (Cloudera Navigator, Zaloni Bedrock/Mica, Informatica Metadata Manager), and balance the pros, cons and gaps of each.
4) Security & Access Controls. Solving these challenges are key for adoption in regulatory driven industries like Healthcare & Financial Services. There are multiple Apache projects and proprietary tools to address this, but the challenge is making security and access controls consistent across the entire application and infrastructure stack, and over the data lifecycle, and being able to audit this in the face of legal challenges. This portion will explore available options and best practices.
5) Provisioning & Workflow Management. The real promise of the Data Lake is integrating Analytics workflows and tools on converged infrastructure-with shared data-and build “As A Service” oriented architectures that are oriented towards self-service data exploration and Analytics for end users. This is an emerging and immature area, but this session will explore some potential concepts, tools and options to achieve this.
This will be a moderately technical session, with the above topics being illustrated by real world examples. Attendees should have basic familiarity with Hadoop and the associated Apache projects.
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationCesare Cugnasco
Data visualization can be a tricky problem, even more if the dataset is made of several billions of 3-dimensional particles moving along the time. The talk will focus on some simple indexing and data thinning techniques and how (and how do not) implement them with Cassandra and Spark.
Delta Lake: Open Source Reliability w/ Apache SparkGeorge Chow
As presented: Sajith Appukuttan, Solution Architect, Databricks
Sept 12, 2019 at Vancouver Spark Meetup
Abstract: Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake offers ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs.
Why would you want to use R capabilities within FME. It allows you to bring together the strengths of both applications. FME is capable of reading and writing to a wide variety of formats. FME has powerful data preparation and data clean-up transformations. R offers a wide range of wide variety of statistical tools including classical statistical a wide range data analysis, analytics and plotting tools. We’ll show you how to get the best of both worlds.
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...Lucas Jellema
Our technology has gotten smart and fast enough to make predictions and come up with recommendations in near real time. Machine Learning is the art of deriving models from our Big Data collections – harvesting historic patterns and trends – and applying those models to new data in order to rapidly and adequately respond to that data. This presentation will explain and demonstrate in simple, straightforward terms and using easy to understand practical examples what Machine Learning really is and how it can be useful in our world of applications, integrations and databases. Hadoop and Spark, real time and streaming analytics, Watson and Cloud Datalab, Jupyter Notebooks, Oracle Machine Learning CS and the Citizen Data Scientists all make their appearance, as does SQL.
Processing transactions is at the core of any bank’s business. Danske Bank’s journey started with recognising the value that could be gleaned from generating insights from the data to improve customer behaviour analytics. Today, the company streams large volumes of transactional data in near-real time onto its Hortonworks data Platform to improve fraud detection and customer marketing. In this session, Nadeem will outline the bank’s vision, how it was socialised across the executive board team and the resulting sponsorship, the technological path, challenges overcome and the results that have not only improved the customer experience but quantifiable metrics fraud and opening new revenue streams. Furthermore, Nadeem will cover future use cases around maintenance and operations.
Moving Targets: Harnessing Real-time Value from Data in Motion Inside Analysis
The Briefing Room with David Loshin and Datawatch
Live Webcast Feb. 17, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=4a053043c45cf0c2f6453dfb8577c72a
Patience may be a virtue, but when it comes to streaming analytics, waiting is not an option. Between Big Data and the Internet of Things, businesses are faced with more data and greater complexity than ever before. Traditional information architectures simply cannot support the kind of processing necessary to make use of this fast-moving resource. The modern context requires a shorter path to analytics, one that narrows the gap between governance and discovery.
Register for this episode of The Briefing Room to hear veteran Analyst David Loshin as he explains how the prevalence of streaming data is changing business pace and processes. He’ll be briefed by Dan Potter of Datawatch, who will tout his company’s real-time data discovery platform for data in motion. He will show how self-service data preparation can lead to faster insights, ultimately fostering the ability to make precise decisions at the right time.
Visit InsideAnalysis.com for more information.
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningKai Wähner
Comparison of Data Preparation vs. Data Wrangling Programming Languages, Frameworks and Tools in Machine Learning / Deep Learning Projects.
A key task to create appropriate analytic models in machine learning or deep learning is the integration and preparation of data sets from various sources like files, databases, big data storages, sensors or social networks. This step can take up to 80% of the whole project.
This session compares different alternative techniques to prepare data, including extract-transform-load (ETL) batch processing (like Talend, Pentaho), streaming analytics ingestion (like Apache Storm, Flink, Apex, TIBCO StreamBase, IBM Streams, Software AG Apama), and data wrangling (DataWrangler, Trifacta) within visual analytics. Various options and their trade-offs are shown in live demos using different advanced analytics technologies and open source frameworks such as R, Python, Apache Hadoop, Spark, KNIME or RapidMiner. The session also discusses how this is related to visual analytics tools (like TIBCO Spotfire), and best practices for how the data scientist and business user should work together to build good analytic models.
Key takeaways for the audience:
- Learn various options for preparing data sets to build analytic models
- Understand the pros and cons and the targeted persona for each option
- See different technologies and open source frameworks for data preparation
- Understand the relation to visual analytics and streaming analytics, and how these concepts are actually leveraged to build the analytic model after data preparation
Video Recording / Screencast of this Slide Deck: https://youtu.be/2MR5UynQocs
Big Data made easy in the era of the Cloud - Demi Ben-AriDemi Ben-Ari
Talking about the ease of use and handling Big Data technologies in the Cloud. Using Google Cloud Platform and Amazon Web Services and all of the tools around it.
Showing the problems and how we can solve them with simple tools.
Big data is an opportunity for communications service providers (CSPs) to create the intelligence for operating their infrastructures more efficiently, to analyze the success of their services, and to create a better personal experience for their customers.
CSP Top executives, Network and IT managers and Marketing, are eager to exploit the large amount of information to achieve better business decisions. They expect their Chief Technical Officer to provide end-to-end analytic solutions based on the data available in their IT and network infrastructure.
This presentation analyzes the complete value chain that can transform CSPs’ data to knowledge. It covers the sources of information, the data collection tools, the analytic platforms providing quick data access, and finally the business intelligence use cases with the presentation and visualization of the results and predictions.
How to Apply Big Data Analytics and Machine Learning to Real Time Processing ...Codemotion
The world gets connected more and more every year due to Mobile, Cloud and the Internet of Things. "Big Data" is currently a big hype. Large amounts of historical data are stored in Hadoop to find patterns, e.g. for predictive maintenance or cross-selling. But how can you increase revenue or reduce risk in new transactions? "Fast Data" via stream processing is the solution for embedding patterns into future actions in real time. This session discusses how machine learning and analytic models with R, Spark MLlib, H2O, etc. can be integrated into real-time event processing. A live demo concludes the session.
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Kai Wähner
Streaming Analytics Comparison of Open Source Frameworks, Products and Cloud Services. Includes Apache Storm, Flink, Spark, TIBCO, IBM, AWS Kinesis, Striim, Zoomdata, ...
This session discusses the technical concepts of stream processing / streaming analytics and how it is related to big data, mobile, cloud and internet of things. Different use cases such as predictive fault management or fraud detection are used to show and compare alternative frameworks and products for stream processing and streaming analytics.
The focus of the session lies on comparing
- different open source frameworks such as Apache Apex, Apache Flink or Apache Spark Streaming
- engines from software vendors such as IBM InfoSphere Streams, TIBCO StreamBase
- cloud offerings such as AWS Kinesis.
- real time streaming UIs such as Striim, Zoomdata or TIBCO Live Datamart.
Live demos will give the audience a good feeling about how to use these frameworks and tools.
The session will also discuss how stream processing is related to Apache Hadoop frameworks (such as MapReduce, Hive, Pig or Impala) and machine learning (such as R, Spark ML or H2O.ai).
The New Frontier: Optimizing Big Data ExplorationInside Analysis
The Briefing Room with Dr. Robin Bloor and Cirro
Live Webcast on February 11, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=0ec1fa381886313cc06d841015c65898
As information ecosystems continue to expand, businesses are searching for ways to combine traditional analytics with a new source of insight: Big Data. But with data flooding in from all kinds of sources, fast access and performance at scale can easily become an issue. One effective approach for solving this challenge is data federation, a method that involves taking the analytical processing to the data, allowing streamlined access to multiple data sources without the expensive ETL overhead or building of semantic layers.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor as he explains how the prevalence of distributed data calls for a new approach to Big Data. He will be briefed by Mark Theissen of Cirro, who will tout his company’s Data Hub, a data federation solution that provides a single point of access to all enterprise data assets without excessive data movements, preprocessing or staging. He will discuss how data federation differs from virtualization and ETL approaches, and demonstrate how a Cirro deployment solves the analytics challenge of integrating data silos across the data center – and the cloud – using the BI tools you already have on your desktop for real-time distributed analytics.
Visit InsideAnalysis.com for more information.
Horses for Courses: Database RoundtableEric Kavanagh
The blessing and curse of today's database market? So many choices! While relational databases still dominate the day-to-day business, a host of alternatives has evolved around very specific use cases: graph, document, NoSQL, hybrid (HTAP), column store, the list goes on. And the database tools market is teeming with activity as well. Register for this special Research Webcast to hear Dr. Robin Bloor share his early findings about the evolving database market. He'll be joined by Steve Sarsfield of HPE Vertica, and Robert Reeves of Datical in a roundtable discussion with Bloor Group CEO Eric Kavanagh. Send any questions to info@insideanalysis.com, or tweet with #DBSurvival.
Turn Data Into Actionable Insights - StampedeCon 2016StampedeCon
At Monsanto, emerging technologies such as IoT, advanced imaging and geo-spatial platforms; molecular breeding, ancestry and genomics data sets have made us rethink how we approach developing, deploying, scaling and distributing our software to accelerate predictive and prescriptive decisions. We created a Cloud based Data Science platform for the enterprise to address this need. Our primary goals were to perform analytics@scale and integrate analytics with our core product platforms.
As part of this talk, we will share our journey of transformation, showing how we enabled a collaborative discovery analytics environment for data science teams to perform model development; provisioned data through APIs and streams; deployed models to production through our auto-scaling big-data compute in the cloud to perform streaming, cognitive, predictive, prescriptive, historical and batch analytics@scale; and integrated analytics with our core product platforms to turn data into actionable insights.
Applying R in BI and Real Time applications EARL London 2015Lou Bajuk
Overview of the challenges of applying R in enterprise analytic applications, and TIBCO's approach to these challenges with Spotfire, TERR and Streambase.
Accelerate Self-Service Analytics with Data Virtualization and VisualizationDenodo
Watch full webinar here: https://bit.ly/39AhUB7
Enterprise organizations are shifting to self-service analytics as business users need real-time access to holistic and consistent views of data regardless of its location, source or type for arriving at critical decisions.
Data Virtualization and Data Visualization work together through a universal semantic layer. Learn how they enable self-service data discovery and improve performance of your reports and dashboards.
In this session, you will learn:
- Challenges faced by business users
- How data virtualization enables self-service analytics
- Use case and lessons from customer success
- Overview of the highlight features in Tableau
Over 90% of today’s data has been generated in the last two years, and growth rates continue to climb. In this session, we’ll step through challenges and best practices with data capturing, how to derive meaningful insights to help predict the future, and common pitfalls in data analysis.
Come discover how integrated solutions involving Amazon S3, AWS Glue, Amazon Redshift, Amazon Athena, Amazon EMR, Amazon Kinesis, and Amazon Machine Learning/Deep Learning result in effective data systems for data scientists and business users, alike.
Machine learning applications are typically stitched together from hopes and dreams, shell scripts, cron jobs, home-grown schedulers, snippets of configuration clipped from multiple blog posts, thousands of hard-coded business rules, a.k.a. "our SQL corpus," and a few lines of training and testing code. Organizing all the moving parts into something maintainable and supportive of ongoing development is a challenge most teams have on their TODO list, roadmap, or tech debt pile. Getting ahead of the day-to-day demands and settling into a sane architecture often seems like an unattainable goal. The past several years have seen an explosion of tool-building in the data engineering and analytics area, including in Apache projects spanning the areas of search and information retrieval, job orchestration, file and stream formats, and machine learning libraries. In this talk we will cover our product and development teams' choices of architecture and tools, from data ingestion and storage, through transformations and processing, to presentation of results and publishing to web services, reports, and applications.
Similar to TIBCO Advanced Analytics Meetup (TAAM) - June 2015 (20)
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...2023240532
Quantitative Data Analysis
Overview
Reliability Analysis (Cronbach Alpha)
Common Method Bias (Harman Single Factor Test)
Frequency Analysis (Demographic)
Descriptive Analysis
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
7. Spotfire Data Access
Data sources: RDBMS, XML, flat files, spreadsheets, cubes, Hadoop and big data stores, analytical DWs (e.g. Exadata), event data streams, ActiveSpaces.
• In-memory: load data from the source into memory
• In-database: leave the data in the database (SQL/MDX); dynamically load and discard data to visualize
• On-demand: dynamically swap data in and out of memory
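The in-memory vs in-database distinction above can be made concrete with a toy sketch. This assumes a throwaway sqlite3 table invented for illustration (Spotfire itself generates the SQL/MDX for you): either pull every row into the client and aggregate locally, or push the aggregation down to the database and fetch only the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("east", 50.0), ("west", 75.0)])

# In-memory style: load all rows from the source, then aggregate locally.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
in_memory = {}
for region, amount in rows:
    in_memory[region] = in_memory.get(region, 0.0) + amount

# In-database style: leave the data in the DB, push the SQL down, and
# fetch only the aggregated result needed for the visualization.
in_db = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall())
```

Both routes give the same answer; what differs is how much data moves, which is exactly the trade-off the three Spotfire access modes manage.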
15. TIBCO is the only analytics platform that can provide value to the organization across the full spectrum of use cases: self-service dashboards, event processing, and predictive and prescriptive analytics. The Analytics Maturity Model runs from immediate value to long-term competitive advantage: Measure, Diagnose, Predict, Optimize, Operationalize, Automate.
26. Analytics Maturity Model (recap of slide 15).
34. Spotfire-TERR Expression Functions
1. In-line expressions
– Type R code into the expression field in Spotfire, e.g. to color a graph by clusters or to smooth points on a graph
– Use the inbuilt TERR_* expression functions
– Many entry points for adding expressions
2. Expression functions
– Choose an expression function from the menu: inbuilt, or an extension (written by you or someone else) via R code
– Use it just like other expression functions in an expression
– Many entry points for adding expressions
45. Analytics Maturity Model (recap of slide 15).
48. Managing Industrial Equipment
Big Data
– Analysis of production
– Failure analytics
Fast Data
– Real-time sensor data
– Leading indicators for shutdowns
– Drilling: kick detection
– Flow monitoring
Benefits
– Reduced NPT: big $$s
– System reliability
– Efficient drilling
49. Managing Industrial Equipment
1. Study anomalies
2. Find leading indicators
3. Backtest rules / models
4. Push rules / models to the event server
52. Optimizing Manufacturing Processes
Big Data
– Analysis of product quality
– Models for yield
– Models for defects
Fast Data
– In-line QA/QC
Benefits
– Maximize productivity
– Improve quality
– Optimize machine operations
54. Customer Offers for Retailers
Big Data
– Customer propensity to purchase products
– Product affinity
– Customer segmentation
Fast Data
– In-line scoring on transactions
– Targeted offers to customers
Benefits
– Optimize inventory
– Enhance customer experience
57. Spotfire APIs
• IronPython controls the behavior of Spotfire
• We maintain a library of IronPython functions, e.g.
– toggling all zoom sliders
– adding marker layers to a map
– … and many more
58. Today's Presenters: Jagrata Minardi
Jagrata Minardi is a Staff Solutions Consultant with TIBCO Software, supporting Financial Services and other industries. Previously, he worked for Insightful Corporation, a provider of analytic software and solutions. Since 1997, he has supported customers in the areas of portfolio construction, portfolio management, asset price forecasting, risk modeling, and risk aggregation.
60. Today's Presenters: Ian Cook
Ian Cook is a Data Scientist at TIBCO focused on applying the R statistical programming language to rapidly solve business problems across industry verticals. Ian founded and organizes the R users group in the Raleigh, North Carolina area. Prior to his role at TIBCO, Ian worked as a statistical software developer for the semiconductor company Advanced Micro Devices.
65. Today's Presenters: Ujval Kamath
Ujval Kamath is a Data Scientist at TIBCO. He is focused on developing predictive models in R that are deployed in Spotfire and StreamBase for data in motion and data at rest. He has experience in a range of industries, including Oil and Gas/Energy, Consumer Packaged Goods, Manufacturing, and Computing.
66. Spotfire and StreamBase
• Spotfire is used to create and analyze customer segmentation and propensity
• StreamBase is used to score new transactions in real time
• Spotfire is used to understand the demographics of customers around stores
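The split between offline modeling and real-time scoring can be mimicked in plain Python: the coefficients stand in for a propensity model fit offline (the Spotfire/TERR side), and each incoming transaction is scored as it arrives (the StreamBase side). The model, field names and cutoff below are all invented for illustration.

```python
import math

# Hypothetical coefficients from an offline propensity model
# (in the real setup these would be estimated in Spotfire/TERR).
INTERCEPT = -4.0
WEIGHTS = {"basket_value": 0.02, "visits_last_month": 0.35}

def score(transaction):
    """Logistic score: probability the customer responds to an offer."""
    z = INTERCEPT + sum(WEIGHTS[k] * transaction[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def handle_stream(transactions, cutoff=0.5):
    """Emulates the event server: score each transaction as it arrives
    and emit a targeted offer when the score clears the cutoff."""
    offers = []
    for txn in transactions:
        if score(txn) >= cutoff:
            offers.append(txn["customer_id"])
    return offers

stream = [
    {"customer_id": "c1", "basket_value": 250, "visits_last_month": 2},
    {"customer_id": "c2", "basket_value": 30, "visits_last_month": 1},
]
offers = handle_stream(stream)
```

The design point the slide makes is that the model is built where the analyst works and deployed where the events flow; only the scoring function crosses the boundary.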
67. Today's Presenters: Andrew Berridge
Andrew Berridge is a Sr Solution Consultant at TIBCO. He joined the Spotfire data science team in 2011 and has 15 years' experience working in pharmaceuticals and other industries. Andrew specializes in developing tools, extensions and integrations with other technology platforms for Spotfire using IronPython, C#, Java and JavaScript.
68. Extending and Customizing Spotfire!
• Many ways of extending and customizing Spotfire platform
• All APIs are publicly documented, e.g.
– Spotfire .NET API: https://docs.tibco.com/pub/doc_remote/spotfire/7.0.0/doc/api/Index.aspx
• Extend functionality of desktop and web clients:
– TERR scripting
– Data functions
– IronPython scripting
– JavaScript in text areas for UI elements
– C# extensions (tools, transformations, calculations, etc.)
– JavaScript mashup API for embedding in web applications
• JavaScript Visualizations
– Use any JavaScript visualization framework
– e.g. D3, HighCharts
• Extend Automation Services
– Custom tasks
• Custom authentication/Single Sign-on (SSO)
69. Example: Write-back to Database from Spotfire
• Why
– Take action from within your analysis
– Comment on data points
– Update external systems
• How
– SQL within a Spotfire Information Link, with parameters
– Execute the Information Link with IronPython, passing in marked data as parameters
– Other methods can be used; this one is simple
70. SQL in the Information Link
• Must return data to Spotfire – we return the data table
• INSERT then SELECT

INSERT INTO [SimpleDemo].[dbo].[UserActions]
  ([State], [CoC], [Username], [Comment])
VALUES
  (?State, ?CoC, %CURRENT_USER%, ?Comment);
SELECT
  U1."id" AS "ID", U1."DateTime" AS "DATETIME", U1."State" AS "STATE",
  U1."CoC" AS "COC", U1."Username" AS "USERNAME",
  U1."Comment" AS "COMMENT"
FROM
  "SimpleDemo"."dbo"."UserActions" U1
WHERE
  <conditions>
71. IronPython Code
• Iterate over the marked rows in the data table:
– Set up the parameters for the Information Link (name and value)
– Call the Information Link for each marked row
• The Information Link is identified by its GUID in the Spotfire library
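The INSERT-then-SELECT pattern from slide 70 can be exercised standalone. A sketch using sqlite3 in place of the Information Link: the table and columns mirror the slide, and the ? parameters stand in for the Information Link prompts, which Spotfire would fill from marked rows via IronPython.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE UserActions (
    id INTEGER PRIMARY KEY,
    State TEXT, CoC TEXT, Username TEXT, Comment TEXT)""")

def write_back(marked_rows, username):
    """Emulates executing the Information Link once per marked row:
    INSERT the parameters, then SELECT the table back for Spotfire."""
    for state, coc, comment in marked_rows:
        conn.execute(
            "INSERT INTO UserActions (State, CoC, Username, Comment) "
            "VALUES (?, ?, ?, ?)",
            (state, coc, username, comment))
    # The trailing SELECT returns the data table, as the slide requires.
    return conn.execute(
        "SELECT State, CoC, Username, Comment FROM UserActions").fetchall()

rows = write_back([("TX", "A1", "needs review"), ("CA", "B2", "approved")],
                  username="demo_user")
```

The same shape holds in Spotfire: because an Information Link must return data, the write-back statement is paired with a SELECT so the updated table flows straight back into the analysis.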
72. Next Steps with Spotfire
spotfire.tibco.com/trial
spotfire.tibco.com/learn/spotfire-desktop-quickstart
spotfire.tibco.com/learn/spotfire-cloud-quickstart
Register for a live Spotfire demonstration: spotfire.tibco.com/learn/live-demo
77. Webcasts

Insight and Action - Analyzing Your OSIsoft PI System Data
Tuesday, July 7, 2015, 1 PM EST
Presenters: Michael O'Connell & Dave Leigh

Predictive Analytics in the Energy Sector: Asset Valuation
Tuesday, July 28, 2015, 1 PM EST
Presenters: Michael O'Connell & Peter Shaw with Haas Engineering and R Lacy

Seeing Stars: the Gartner BI Bakeoff
Recording, May 27, 2015
Presenters: Anna Nowakowska & Michael O'Connell

Events: spotfire.tibco.com/about-us/events
78. Fast Data: www.tibco.com
http://d2.tibco.com/fast-data-webinars#event-processing-ROI
79. useR!
Lou Bajuk-Yorgan – Spotfire Product Management
Ian Cook – Data Scientist
Difei Luo – Data Scientist
If you would like to set up a meeting, please contact Lou Bajuk-Yorgan at lbajuk@tibco.com or Lars Sveding at lsveding@tibco.com.