TIBCO Advanced Analytics Meetup (TAAM), November 2015 (Bipin Singh)
This document provides an agenda and overview for a TIBCO Advanced Analytics Meetup. The meetup will cover various topics related to TIBCO analytics products and data science, including data analysis pipelines, visual analytics/dashboards, predictive analytics, data access/APIs, and geoanalytics. Speakers will discuss TIBCO Analytics & Data Science, building dashboards, predictive modeling for customer analytics, IronPython, advanced geoanalytics, and resources/training. The meetup aims to increase productivity, grow revenue, and reduce risk through analytics.
TIBCO Advanced Analytics Meetup (TAAM), June 2015 (Bipin Singh)
This document summarizes a TIBCO Advanced Analytics meetup. It includes an agenda for presentations on TIBCO Analytics and data science, predictive analytics using TERR expressions, real-time analytics, APIs, and a question/answer wrap-up session. It also provides overviews of the Spotfire platform for data visualization and analytics, Spotfire capabilities for accessing and preparing data from various sources, and supported data sources.
Extending the Reach of R to the Enterprise with TERR and Spotfire (Lou Bajuk)
An overview of how TIBCO integrates dynamic, interactive visual applications in Spotfire with predictive and advanced analytics in the R language, using TIBCO Enterprise Runtime for R--our R-compatible, enterprise-grade platform for the R language.
1. The document discusses various techniques for getting the most out of TIBCO Spotfire software, including formatting visualizations, using custom expressions and functions, linking multiple data tables, and creating interactive structure viewers.
2. It provides examples of custom expressions, functions, and visualization techniques like details views, formatting options, and linking selections across visualizations.
3. The presentation aims to demonstrate how to apply advanced TIBCO Spotfire features to improve data analysis and visualization.
1) The document discusses enterprise optimization through analytics that go beyond traditional business intelligence (BI) and spreadsheets.
2) It promotes the benefits of TIBCO's analytics solutions, including clarity of visualization, freedom of spreadsheets, relevance of applications, and confidence in statistics.
3) TIBCO's analytics can help organizations better analyze processes and events in real-time to improve decision making and business outcomes.
The document discusses TIBCO Spotfire, an analytics platform. It shows how Spotfire connects various clients to data sources via servers. It provides visualizations, analytic engines, and automation services. Spotfire Application Data Services connects Spotfire to enterprise systems like SAP, Siebel, and Oracle by introspecting their data models and delivering the data using SQL. The rest of the document focuses on how Spotfire connects specifically to SAP Business Warehouse (BW) data, discussing the challenges of differing data structures and query languages between Spotfire and BW, and how Spotfire's adapter generates optimized queries and allows unified access to BW data in Spotfire.
Presented by: Hector Martinez, Staff Solution Consultant, TIBCO Spotfire
TIBCO Spotfire and Teradata: First to Insight, First to Action; Warehousing, Analytics and Visualizations for the High Tech Industry Conference
July 22, 2013 The Four Seasons Hotel Palo Alto, CA
TIBCO Spotfire: Data Science in the Enterprise (TIBCO Spotfire)
From Data to Insights in Internet Time
Eric Novik, Internal Analytics Group, TIBCO Spotfire
ANALYTICS AND VISUALIZATION FOR THE FINANCIAL ENTERPRISE CONFERENCE
June 25, 2013 The Langham Hotel Boston, MA
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg... (Databricks)
The modernization of the tobacco industry is resulting in a shift toward a more data-driven approach to trade, operations, and the consumer. The need to scale while maintaining margins is paramount, and today’s consumer requires more personalized engagement and value at every interaction to drive sales and revenue.
At Altria, we’re at the forefront of this evolution, leveraging hundreds of terabytes of big data (such as point-of-sale, clickstream, and mobile data) and machine learning to improve our ability to make smarter decisions and outpace the competition. This talk recaps our big data journey: from a legacy data infrastructure (Teradata), isolated data systems, and a lack of resources that prevented us from moving quickly and scaling, to our current state, where we have implemented, architected, and on-boarded tools and processes for data acquisition, storage, preparation, and business intelligence with Azure Data Lake, Azure Databricks, Azure Data Factory, API Management, and streaming and hosting technologies, and delivered a data analytics platform.
We’ll discuss the roadblocks we came across, how we overcame them, and how we employed a unified approach to big data and analytics through the fully managed Azure Databricks platform and the Azure suite of tools, which allowed us to streamline workflows, improve operational performance, and ultimately introduce new customer experiences that drive engagement and revenue.
The case of vehicle networking financial services accomplished by China Mobile (DataWorks Summit)
As the largest mobile telecom carrier in the world, China Mobile operates the world's largest wireless mobile network. Building on existing vehicle networking equipment (CAN bus, OBD, ADAS, fatigue warning systems, GPS, driving recorders, etc.), it provides vehicle networking services. Based on analysis of vehicle networking data, it offers user risk assessment and real-time vehicle risk monitoring, and provides financial institutions with the data to support differentiated financial services for vehicles.
The main contents include the following:
1. Vehicle and driver data collection: collecting information on the vehicle's mechanical status, driving behavior, and surrounding environment through OBD, ADAS, fatigue warning systems, GPS, and other equipment.
2. AI technology application: mainly identification of the driver's physical state, drunk driving, degree of fatigue, and so on.
3. Improving the accuracy and applicability of the risk assessment model through machine learning.
Speaker
Duan Yunfeng, Chief Designer of China Mobile's big data system, China Mobile Communications Corporation
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed... (Databricks)
In this talk, we will present how we used Spark, Databricks, Airflow, and MLflow to process big data and build a pipeline of both ML (XGBoost) and statistical models that maximizes our revenue in one of our core products, the “Offer Wall”. The Offer Wall is a mobile product integrated with existing apps that suggests users perform tasks in exchange for in-app currency. The problem gets even more interesting when you consider that some of the tasks users do take 15 minutes and some may take up to two weeks, forcing us to make revenue-determining decisions in an uncertain space all of the time. The solution we developed leverages Databricks’ and Spark’s strengths in machine learning and big data, together with MLflow and Airflow integrations, allowing us to deliver a production-grade solution with short development time between experiments.
Democratizing Data Science Using Spark, Hive, and Druid (DataWorks Summit)
MZ is re-inventing how the entire world experiences data via our mobile games division MZ Games Studios, our digital marketing division Cognant, and our live data platform division Satori.
The growing need for data science capabilities across the organization requires an architecture that can democratize building these applications and disseminate insight from their outcomes to the wider organization.
Attend this session to learn how we built a platform for data science using Spark, Hive, and Druid specifically for our performance marketing division, Cognant. This platform powers several data science applications, such as fraud detection and bid optimization, at large scale.
We will share lessons learned over the past three years of building this platform, walking through some of the actual data science applications built on top of it.
Attendees from ML engineering and data science backgrounds can gain deep insight from our experience of building this platform.
Speakers
Pushkar Priyadarshi, Director of Engineering, Machine Zone Inc
Igor Yurinok, Staff Software Engineer, MZ
Life occurs in real time, and not surprisingly, more solutions are being built using streaming technologies. Event-based architectures are becoming the norm, and customers expect immediate access to their data. This new world offers many exciting opportunities, but also some new challenges. What do you do when your streaming data is not complete? What if it relies on another data source? Does the dependent data exist yet, and does it come from a third party? How do we assemble a complete picture when data is arriving from multiple places at the same time? This is the new norm in the world of distributed services. Join us as we dive deep into the technical details of these scenarios and more. Expect to learn about stream-stream joins, enriching stream data using local or remote data, and ways to anticipate and correct errors within the stream. Leave with a better understanding of managing data dependencies within a Spark Structured Streaming application.
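The stream-stream join the session covers can be illustrated with a plain-Python toy model: events from both sides are buffered per key, and a watermark bounds how long state is kept before late partners can no longer match. This is a simplified sketch of the idea, not the Spark Structured Streaming API, and all names in it are illustrative.

```python
from collections import defaultdict

def stream_stream_join(left_events, right_events, watermark):
    """Toy watermarked stream-stream join.

    Each event is (event_time, key, payload). Events are buffered per
    key awaiting a partner from the other side; buffered state older
    than `watermark` is evicted, so a late partner no longer matches.
    """
    buffer = defaultdict(list)   # key -> [(event_time, side, payload)]
    results = []
    # Merge both streams into one time-ordered feed, tagging the side.
    merged = sorted(
        [(t, "L", k, p) for t, k, p in left_events] +
        [(t, "R", k, p) for t, k, p in right_events]
    )
    for now, side, key, payload in merged:
        # Evict buffered events that have aged past the watermark.
        buffer[key] = [(t, s, p) for t, s, p in buffer[key] if t >= now - watermark]
        # Emit a joined row for every buffered event from the other side.
        for t, s, p in buffer[key]:
            if s != side:
                left, right = (p, payload) if s == "L" else (payload, p)
                results.append((key, left, right))
        buffer[key].append((now, side, payload))
    return results
```

A click arriving within the watermark joins its impression; one arriving after the watermark has evicted the impression does not, which is exactly the "incomplete stream" trade-off the talk describes.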
Stream processing consists of ingesting and processing continuously generated data, often from end users in web applications or from more challenging settings where devices such as servers and sensors generate events at a high rate. Such scenarios often demand the use of a software stack that is able to scale and accommodate changes to the characteristics of the application.
One of the major challenges with processing data streams is adapting to workload variations (e.g., due to daily cycles or the growth of the population of sources). Systems to ingest stream data typically parallelize it by sharding the incoming messages and events according to a routing key. Having the ability to parallelize ingestion is very effective, but future changes to the workload (which are very often unknown beforehand) might make the initial choice for the degree of parallelism inadequate for even short-term spikes. Consequently, the ability to scale by adapting parallelism according to workload while preserving important API properties, such as per-key order, is highly desirable to handle mission-critical workloads.
In this presentation, we explain how to accommodate changes to workloads in and with Pravega, an open source stream store built to ingest and serve stream data. Pravega primarily manipulates and stores segments (append-only byte sequences), forming streams by creating and composing segments, which it uses to enable the scaling of streams. Stream scaling in Pravega is automatic and transparent to the application, but such a change to the ingestion volume might also require the application to follow and scale its resources downstream (e.g., the operators of an Apache Flink job) to accommodate the new ingestion volume. Pravega signals such changes to the application so that it can react accordingly. The cooperation between Pravega and the downstream application is crucial for building an effective stream data pipeline.
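The interplay between key-routed segments and scaling can be sketched with a toy model: a key always hashes to one open segment within an epoch, and a scale event seals the current epoch before a new one accepts writes, so readers that drain sealed segments first never observe a key's events out of order. This is an illustration of the idea described above, not Pravega's actual API; all class and method names are invented for the sketch.

```python
import hashlib

def shard_for(key, num_shards):
    """Route a key to a shard; stable for a fixed shard count."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

class ScalingStream:
    """Toy model of transparent stream scaling with per-key order."""

    def __init__(self, num_segments):
        # A list of epochs; only the last epoch accepts appends.
        self.epochs = [[[] for _ in range(num_segments)]]

    def append(self, key, event):
        open_epoch = self.epochs[-1]
        open_epoch[shard_for(key, len(open_epoch))].append((key, event))

    def scale(self, new_num_segments):
        # Seal the current epoch, open a new one with different parallelism.
        self.epochs.append([[] for _ in range(new_num_segments)])

    def read_all(self):
        # Drain epochs in order: sealed segments strictly before newer ones.
        out = []
        for epoch in self.epochs:
            for segment in epoch:
                out.extend(segment)
        return out
```

Because a key's events live in exactly one segment per epoch, and older epochs are read before newer ones, re-sharding changes throughput without breaking per-key order, which is the API property the abstract highlights.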
When it comes to dealing with large, complex, and disparate data sets, traditional database technologies are unable to keep pace with the rich analytics necessary to power today’s data-driven applications. Graph analytics databases are becoming the underlying infrastructure for AI and machine learning. These databases allow users to ask complex questions across complex data, which is not always practical or even possible at scale using other approaches. They also enable faster insights against massive data sets when combined with pattern recognition, statistical analysis, and AI/ machine learning. And in the case of standards-based graph databases, they connect with popular visualization tools like Graphileon, allowing users to easily explore their data stores and quickly build compelling graph-based applications.
Managing R&D Data on Parallel Compute Infrastructure (Databricks)
Clinical genomic analytics pipelines that use Databricks and Delta Lake to load individual reads from raw sequencing or base-call files have significant advantages over more traditional methods. Analysis pipelines that map genomic reads against purpose-built reference data artifacts persisted to tables achieve performance that is orders of magnitude greater than previous mapping methods. These scalable, reproducible, and potentially open-sourced methods have the ability to transform bioinformatics and R&D data management and governance.
Phar Data Platform: From the Lakehouse Paradigm to the Reality (Databricks)
Despite the increased availability of ready-to-use generic tools, more and more enterprises are deciding to build in-house data platforms. This practice, common for some time in research labs and digital native companies, is now making its waves across large enterprises that traditionally used proprietary solutions and outsourced most of their IT. The availability of large volumes of data, coupled with more and more complex analytical use cases driven by innovations in data science have yielded these traditional and on premise architectures to become obsolete in favor of cloud architectures powered by open source technologies.
The idea of building an in-house platform at a larger enterprise comes with many challenges of its own: building an architecture that combines the best elements of data lakes and data warehouses to accommodate all kinds of use cases, from BI to ML; the need to interoperate with all the company's data and technology, including legacy systems; and a cultural transformation, including a commitment to adopt agile processes and data-driven approaches.
This presentation describes a success story on building a Lakehouse in an enterprise such as LIDL, a successful chain of grocery stores operating in 32 countries worldwide. We will dive into the cloud-based architecture for batch and streaming workloads based on many different source systems of the enterprise and how we applied security on architecture and data. We will detail the creation of a curated Data Lake comprising several layers from a raw ingesting layer up to a layer that presents cleansed and enriched data to the business units as a kind of Data Marketplace.
A lot of focus and effort went into building a semantic Data Lake as a sustainable and easy-to-use basis for the Lakehouse, as opposed to just dumping source data into it. The first use case applied to the Lakehouse is the Lidl Plus loyalty program. It is already deployed to production in 26 countries, with data from more than 30 million customers analyzed on a daily basis. In parallel to productionizing the Lakehouse, a cultural and organizational change process was undertaken to get all involved units to buy into the new data-driven approach.
This document discusses using Pivotal's Big Data Suite to build a real-time analytics solution for processing taxi trip data streams. It presents an architecture that uses Spring XD for data ingestion, Spark Streaming for in-memory analytics on 10-second windows, Gemfire for fast data retrieval, and Pivotal HD for long-term storage. The solution demonstrates filtering inconsistent data, finding top traffic areas, and available taxis in real-time. The document highlights how the Big Data Suite provides a complete toolset for data-driven enterprises through its optimized Hadoop distribution, in-memory processing, stream processing, and low-latency data stores.
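The windowed analytics in that taxi pipeline, filtering inconsistent records and ranking top traffic areas per 10-second window, can be sketched in plain Python. This is a toy model of the computation the pipeline runs in Spark Streaming, not the Pivotal stack itself, and the record field names are illustrative.

```python
from collections import Counter, defaultdict

def top_areas_per_window(trips, window_sec=10, top_n=3):
    """Bucket taxi trips into tumbling windows and rank pickup areas.

    `trips` is a list of dicts with illustrative fields: `ts` (epoch
    seconds), `area` (pickup area name), and `fare`. Records with a
    missing area or non-positive fare are treated as inconsistent
    and dropped before aggregation.
    """
    windows = defaultdict(Counter)
    for trip in trips:
        if not trip.get("area") or trip.get("fare", 0) <= 0:
            continue  # filter inconsistent data
        bucket = trip["ts"] // window_sec * window_sec
        windows[bucket][trip["area"]] += 1
    # Per window, keep only the top-N areas by trip count.
    return {w: [a for a, _ in c.most_common(top_n)]
            for w, c in sorted(windows.items())}
```

The same filter-bucket-rank shape maps directly onto a streaming engine's windowed aggregation once `trips` becomes an unbounded source instead of a list.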
Polymorphic Table Functions: The Best Way to Integrate SQL and Apache Spark (Databricks)
Polymorphic Table Functions (PTFs) allow SQL queries to invoke Spark computations and integrate the results as relational tables. PTFs define a Spark job as a class that can be called from SQL like a table. The class implements methods for describing the table structure and executing the Spark logic. This provides a scalable way to leverage Spark's capabilities from SQL without needing intermediate data storage. Example use cases include integrating various data sources, complex ETL, and invoking machine learning models from SQL queries.
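The two-method shape described above, one method declaring the output table structure and one executing the logic, can be sketched as a hypothetical Python interface. This is an illustration of the PTF pattern as the summary describes it, not a real Spark or SQL-engine API; all class and method names are invented.

```python
class PolymorphicTableFunction:
    """Hypothetical PTF interface: `describe` tells the SQL engine what
    table the function produces; `execute` runs the job that yields rows."""

    def describe(self, input_schema):
        raise NotImplementedError

    def execute(self, rows):
        raise NotImplementedError


class Sessionize(PolymorphicTableFunction):
    """Example PTF: add a session_id column to (user, ts) click rows,
    starting a new session after a gap of more than `gap` seconds."""

    def __init__(self, gap):
        self.gap = gap

    def describe(self, input_schema):
        return input_schema + ["session_id"]

    def execute(self, rows):
        last_ts, session = {}, {}
        for user, ts in sorted(rows, key=lambda r: (r[0], r[1])):
            if user not in last_ts or ts - last_ts[user] > self.gap:
                session[user] = session.get(user, -1) + 1
            last_ts[user] = ts
            yield (user, ts, session[user])
```

In an engine supporting PTFs, a query could then treat `Sessionize` like a table, e.g. `SELECT * FROM TABLE(sessionize(clicks, gap => 5))`, without staging intermediate data.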
Presented by Jack Norris, SVP Data & Applications at Gartner Symposium 2016.
Jack presents how companies from TransUnion to Uber use event-driven processing to transform their business with agility, scale, robustness, and efficiency advantages.
More info: https://www.mapr.com/company/press-releases/mapr-present-gartner-symposiumitxpo-and-other-notable-industry-conferences
The Business Case for Semantic Web Ontology & Knowledge Graph (Cambridge Semantics)
This document discusses how semantic web ontologies and knowledge graphs can help reduce high IT costs by providing a common schema and linking data across systems. It introduces AnzoGraph DB, a graph database built on semantic web standards that can perform both analytics and graph algorithms on large datasets. The document demonstrates how public flight delay data can be converted to a knowledge graph and analyzed using techniques like PageRank, shortest paths, and querying for delayed flights. Overall, it argues that semantic technologies can help address the problem of data integration costs by enabling linked and standardized data.
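PageRank, one of the graph algorithms the document runs over the flight-delay knowledge graph, is compact enough to show in full. The sketch below is a minimal pure-Python implementation over a directed edge list, a toy illustration of the algorithm rather than AnzoGraph's implementation; the airport codes are made up for the example.

```python
def pagerank(edges, damping=0.85, iters=50):
    """Power-iteration PageRank over a directed edge list."""
    nodes = sorted({n for e in edges for n in e})
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            if out[n]:
                # Split this node's rank among its out-links.
                share = damping * rank[n] / len(out[n])
                for d in out[n]:
                    new[d] += share
            else:
                # Dangling node: spread its rank uniformly.
                for d in nodes:
                    new[d] += damping * rank[n] / len(nodes)
        rank = new
    return rank

# Toy route graph: three airports feed the hub "ATL".
edges = [("JFK", "ATL"), ("SFO", "ATL"), ("ORD", "ATL"), ("ATL", "JFK")]
```

On this toy graph the hub accumulates the highest rank, which is the intuition behind using PageRank to find the airports whose delays propagate most widely.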
Airline reservations and routing: a graph use case (DataWorks Summit)
We've all been there before... you hear the announcement that your flight is canceled. Fellow passengers race to the gate agent to rebook on the next available flight. How do they quickly determine the best route from Berlin to San Francisco? Ultimately, the flight route network is best solved as a graph problem. We will discuss our lessons learned from working with a major airline to solve this problem using the JanusGraph database. JanusGraph is an open source graph database designed for massive scale. It is compatible with several pieces of the open source big data stack: Apache TinkerPop (graph computing framework), HBase, Cassandra, and Solr. We will go into depth about our approach to benchmarking graph performance and discuss the utilities we developed. We will share our comparison results for evaluating which storage backend to use with JanusGraph. Whether you are productizing a new database or a frustrated traveler, a fast resolution is needed to satisfy everybody involved.
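The rebooking question above is a shortest-path query, and the classic answer is Dijkstra's algorithm. The sketch below is a self-contained toy version over an adjacency map, illustrating the graph formulation rather than the JanusGraph/TinkerPop traversal the talk actually uses; the airports and edge weights are made up.

```python
import heapq

def cheapest_route(graph, src, dst):
    """Dijkstra over a flight network.

    `graph` maps airport -> list of (neighbor, cost), where cost could
    be minutes, fare, or hops. Returns (total_cost, path).
    """
    pq = [(0, src, [src])]
    seen = set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in graph.get(node, []):
            if nbr not in seen:
                heapq.heappush(pq, (cost + w, nbr, path + [nbr]))
    return float("inf"), []  # no route exists

# Toy network: two Berlin -> San Francisco options via different hubs.
routes = {
    "BER": [("FRA", 1), ("LHR", 2)],
    "FRA": [("SFO", 11)],
    "LHR": [("SFO", 11)],
}
```

A graph database answers the same query with a weighted-path traversal, but the cost model, edges weighted by delay, fare, or connection time, is the same design decision in either representation.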
Speaker
Jason Plurad, Open Source Developer and Advocate, IBM
Chin Huang, Software Engineer, IBM
Building the Autodesk Design Graph (Yotto Koga, Autodesk; Spark Summit)
This document discusses building an Autodesk Design Graph using Apache Spark. It involves extracting parts from 3D design files, generating shape descriptors for each part using spherical harmonics and bigrams, clustering and labeling parts, generating a bill of process model and design graph, and storing the results in a graph database and search index for querying. The process is implemented as a Spark batch job with isolated tests and capability to handle incremental data streams.
IBM Cloud Native Day, April 2021: Serverless Data Lake (Torsten Steinbach)
- The document discusses serverless data analytics using IBM's cloud services, including a serverless data lake built on cloud object storage, serverless SQL queries using Spark, and serverless data processing functions.
- It provides an example of a COVID-19 data lake built on IBM Cloud that collects and integrates data from various sources, prepares and transforms the data, and makes it available for analytics and dashboards through serverless SQL queries.
Databricks CEO Ali Ghodsi introduces Databricks Delta, a new data management system that combines the scale and cost-efficiency of a data lake, the performance and reliability of a data warehouse, and the low latency of streaming.
In-Stream Processing Service Blueprint, Reference Architecture for Real-Time ... (Grid Dynamics)
What is it about? In-Stream Event Processing is a new approach for building near real-time big data systems, with a rapidly growing user base and applications like clickstream analytics, preventive maintenance, and fraud detection. The maturity of some open source projects makes it possible to build an enterprise-grade In-Stream Processing service in-house. However, the open source world comprises many competing projects of different maturity and different perspectives, so selecting effective and efficient projects is not straightforward. In this talk I'll present a blueprint of an In-Stream Processing Service: enterprise-grade, reliable and scalable, cloud-ready, and built from 100% open source components.
Vert.x is a toolkit for building reactive microservices applications on the JVM. It uses the reactor pattern with a single-threaded event loop to avoid the C10K problem. Verticles are lightweight concurrent units that communicate asynchronously via an event bus. This allows building scalable and reactive microservices. Vert.x supports websockets, clustering, reactive programming with RxJava, and can be deployed to production environments like AWS. It also integrates with Spring for dependency injection and configuration.
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Databricks
"The modernization of the tobacco industry is resulting in a shift towards a more data-driven approach to trade, operations and the consumer. The need to scale while maintaining margins is paramount, and today’s consumer requires more personalized engagement and value at every interaction to drive sales and revenue.
At Altria, we’re at the forefront of this evolution, leveraging hundreds of terabytes of big data (such as point-of-sale, clickstream, mobile data, and more) and machine learning to improve our ability to make smarter decisions and outpace the competition. This talk recaps our big data journey from a legacy data infrastructure (Teradata), isolated data systems, and the lack of resources which prevented our ability to move quickly and scale, to our current state where we’ve successfully implemented, architected and on-boarded tools and processes in stages of data acquisition, store, prepare, and business intelligence with Azure Data Lake, Azure Databricks, Azure Data factory, APIs Managements, Streaming and Hosting technologies and provided Data Analytics platform.
We’ll discuss the roadblocks we came across, how we overcame them, and how we employed a unified approach to big data and analytics through the fully managed Azure Databricks platform and the Azure suite of tools which allowed us to streamline workflows, improve operational performance, and ultimately introduce new customer experiences that drive engagement and revenue."
The case of vehicle networking financial services accomplished by China MobileDataWorks Summit
As the largest mobile telecom carrier in the world, China Mobile has the world's largest wireless mobile network, based on the existing vehicle networking equipment (CAN-bus, OBD, ADAS, equipment fatigue warning system, GPS, driving recorder, etc.), which can provide vehicle networking service, based on vehicle networking data analysis and provide users risk assessment, vehicle real-time risk monitoring, and comprehensive financial institutions for the vehicle and provide data support for differentiated financial services.
The main contents include the following:
1. Vehicle and drivers data collection: Collecting information of vehicle's mechanical status, driving behavior, and surrounding environment through OBD, ADAS, fatigue warning system, GPS, and other equipment.
2. AI technology application: mainly include the identification of the driver's body state, the wine driving, the fatigue degree, and so on.
3. To improve the accuracy and applicability of the risk assessment model through machine learning.
Speaker
Duan Yunfeng, Chief Designer of China Mobile's big data system, China Mobile Communications Corporation
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...Databricks
In this talk, we will present how we used Spark, Databricks, Airflow and MLflow to process big data, and build a pipeline of both ML(XGBoost) and statistical models that maximizes our revenues in one of our core products, called the “Offer Wall”. The “Offer wall” is a mobile phone product that is integrated with existing apps, suggesting users to perform tasks in exchange for in-app currency. The problem gets even more interesting when considering the fact that some of the tasks users do take 15 minutes and some may take up to 2 to weeks, forcing us to make revenue determining decisions in an uncertain space all of the time. The solution we developed utilizes Databricks and Spark’s strengths and diversity in machine learning, big data, MLflow and Airflow integrations, allowing us to deliver a production-grade solution with short development time between experiments.
Democratizing data science Using spark, hive and druidDataWorks Summit
MZ is re-inventing how the entire world experiences data via our mobile games division MZ Games Studios, our digital marketing division Cognant, and our live data platform division Satori.
Growing need of data science capabilities across the organization requires an architecture that can democratize building these applications and disseminating insight from the outcome of data science applications to the wider organization.
Attend this session to learn about how we built a platform for data science using spark, hive, and druid specifically for our performance marketing division cognant.This platform powers several data science application like fraud detection and bid optimization at large scale.
We will be sharing lessons learned over past 3 years in building this platform by also walking through some of the actual data science applications built on top of this platform.
Attendees from ML engineering and data science background can gain deep insight from our experience of building this platform.
Speakers
Pushkar Priyadarshi, Director of Engineer, Michaine Zone Inc
Igor Yurinok, Staff Software Engineer, MZ
Life occurs in real-time, and not surprisingly, more solutions are being built using streaming technologies. Event-based architectures are becoming the norm, and customers are expecting immediate access to their data. This new world offers many exciting opportunities, but also some new challenges. What do you do when your streaming data is not complete? What if it relies on another data source? Does the dependent data exist yet, and does it come from a 3rd party? How do we merge a complete picture of data when data is sourcing from multiple places at the same time? A new norm in the world of distributed services. Join us as we dive deep into the technical details around these scenarios and more. Expect to learn about stream-stream joins, enriching stream data using local or remote data, and ways to anticipate and correct errors within the stream. Leave with a better understanding of managing data dependencies within a Spark Structured Streaming application.
Stream processing consists of ingesting and processing continuously generated data, often from end users in web applications or from more challenging settings where devices such as servers and sensors generate events at a high rate. Such scenarios often demand the use of a software stack that is able to scale and accommodate changes to the characteristics of the application.
One of the major challenges with processing data streams is adapting to workload variations (e.g., due to daily cycles or the growth of the population of sources). Systems to ingest stream data typically parallelize it by sharding the incoming messages and events according to a routing key. Having the ability to parallelize ingestion is very effective, but future changes to the workload (which are very often unknown beforehand) might make the initial choice for the degree of parallelism inadequate for even short-term spikes. Consequently, the ability to scale by adapting parallelism according to workload while preserving important API properties, such as per-key order, is highly desirable to handle mission-critical workloads.
In this presentation, we explain how to accommodate changes to workloads in and with Pravega, an open source stream store built to ingest and serve stream data. Pravega primarily manipulates and stores segments (append-only byte sequences), forming streams by creating and composing segments, which it uses to enable the scaling of streams. Stream scaling in Pravega is automatic and transparent to the application, but such a change to the ingestion volume might also require the application to follow and scale its resources downstream (e.g., the operators of an Apache Flink job) to accommodate the new ingestion volume. Pravega signals such changes to the application so that it can react accordingly. The cooperation between Pravega and the downstream application is crucial for building an effective stream data pipeline.
When it comes to dealing with large, complex, and disparate data sets, traditional database technologies are unable to keep pace with the rich analytics necessary to power today’s data-driven applications. Graph analytics databases are becoming the underlying infrastructure for AI and machine learning. These databases allow users to ask complex questions across complex data, which is not always practical or even possible at scale using other approaches. They also enable faster insights against massive data sets when combined with pattern recognition, statistical analysis, and AI/machine learning. And in the case of standards-based graph databases, they connect with popular visualization tools like Graphileon, allowing users to easily explore their data stores and quickly build compelling graph-based applications.
Managing R&D Data on Parallel Compute Infrastructure (Databricks)
Clinical genomic analytics pipelines built on Databricks and Delta Lake, which load individual reads from raw sequencing or base-call files, have significant advantages over more traditional methods. Analysis pipelines that perform genomic mapping against purpose-built reference data artifacts persisted to tables achieve performance that is orders of magnitude greater than previous mapping methods. These scalable, reproducible, and potentially open-sourced methods have the ability to transform bioinformatics and R&D data management and governance.
Phar Data Platform: From the Lakehouse Paradigm to the Reality (Databricks)
Despite the increased availability of ready-to-use generic tools, more and more enterprises are deciding to build in-house data platforms. This practice, common for some time in research labs and digital-native companies, is now making waves across large enterprises that traditionally used proprietary solutions and outsourced most of their IT. The availability of large volumes of data, coupled with increasingly complex analytical use cases driven by innovations in data science, has rendered these traditional, on-premises architectures obsolete in favor of cloud architectures powered by open source technologies.
The idea of building an in-house platform at a larger enterprise comes with many challenges of its own: building an architecture that combines the best elements of data lakes and data warehouses to accommodate all kinds of use cases, from BI to ML; the need to interoperate with all the company’s data and technology, including legacy systems; and a cultural transformation, including a commitment to adopt agile processes and data-driven approaches.
This presentation describes a success story on building a Lakehouse in an enterprise such as LIDL, a successful chain of grocery stores operating in 32 countries worldwide. We will dive into the cloud-based architecture for batch and streaming workloads based on many different source systems of the enterprise, and how we applied security to the architecture and the data. We will detail the creation of a curated Data Lake comprising several layers, from a raw ingestion layer up to a layer that presents cleansed and enriched data to the business units as a kind of Data Marketplace.
A lot of focus and effort went into building a semantic Data Lake as a sustainable and easy-to-use basis for the Lakehouse, as opposed to just dumping source data into it. The first use case applied to the Lakehouse is the Lidl Plus loyalty program. It is already deployed to production in 26 countries, with data from more than 30 million customers analyzed on a daily basis. In parallel with productionizing the Lakehouse, a cultural and organizational change process was undertaken to get all involved units to buy into the new data-driven approach.
This document discusses using Pivotal's Big Data Suite to build a real-time analytics solution for processing taxi trip data streams. It presents an architecture that uses Spring XD for data ingestion, Spark Streaming for in-memory analytics on 10-second windows, Gemfire for fast data retrieval, and Pivotal HD for long-term storage. The solution demonstrates filtering inconsistent data, finding top traffic areas, and available taxis in real-time. The document highlights how the Big Data Suite provides a complete toolset for data-driven enterprises through its optimized Hadoop distribution, in-memory processing, stream processing, and low-latency data stores.
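The windowed aggregation described above can be sketched in plain Python. The field names (`ts`, `area`, `fare`) are hypothetical, and the actual solution runs this logic in Spark Streaming over 10-second windows rather than in a single process:

```python
from collections import Counter, defaultdict

def top_areas_per_window(trips, window_sec=10, top_n=3):
    """Group taxi trip events into fixed 10-second windows, drop
    inconsistent records, and count the busiest pickup areas per window.
    Each trip is a dict with 'ts' (seconds), 'area', and 'fare'."""
    windows = defaultdict(Counter)
    for trip in trips:
        # Filter inconsistent data: negative fares or a missing area.
        if trip.get("fare", -1) < 0 or not trip.get("area"):
            continue
        window_start = (trip["ts"] // window_sec) * window_sec
        windows[window_start][trip["area"]] += 1
    return {w: counts.most_common(top_n) for w, counts in windows.items()}
```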
Polymorphic Table Functions: The Best Way to Integrate SQL and Apache Spark (Databricks)
Polymorphic Table Functions (PTFs) allow SQL queries to invoke Spark computations and integrate the results as relational tables. PTFs define a Spark job as a class that can be called from SQL like a table. The class implements methods for describing the table structure and executing the Spark logic. This provides a scalable way to leverage Spark's capabilities from SQL without needing intermediate data storage. Example use cases include integrating various data sources, complex ETL, and invoking machine learning models from SQL queries.
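The PTF shape described above — one method reporting the output table's structure, another executing the logic and yielding rows — can be illustrated in plain Python. This is a sketch of the pattern only, not the actual SQL or Spark PTF API:

```python
class WordCountPTF:
    """A table function whose output schema depends on its input schema —
    hence "polymorphic". describe() reports the structure; execute() runs
    the (here trivial, Spark-free) computation and yields result rows."""

    def describe(self, input_columns):
        # Pass through the input columns and append a computed column.
        return input_columns + ["word_count"]

    def execute(self, rows):
        for row in rows:
            text = row[-1]  # assume the last column holds the text
            yield tuple(row) + (len(text.split()),)

ptf = WordCountPTF()
schema = ptf.describe(["id", "text"])
result = list(ptf.execute([(1, "hello world"), (2, "one two three")]))
```

In a real integration, a SQL engine would call `describe` at planning time to type-check the query, then stream table rows through `execute` — no intermediate storage needed.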
Presented by Jack Norris, SVP Data & Applications at Gartner Symposium 2016.
Jack presents how companies from TransUnion to Uber use event-driven processing to transform their business with agility, scale, robustness, and efficiency advantages.
More info: https://www.mapr.com/company/press-releases/mapr-present-gartner-symposiumitxpo-and-other-notable-industry-conferences
The Business Case for Semantic Web Ontology & Knowledge Graph (Cambridge Semantics)
This document discusses how semantic web ontologies and knowledge graphs can help reduce high IT costs by providing a common schema and linking data across systems. It introduces AnzoGraph DB, a graph database built on semantic web standards that can perform both analytics and graph algorithms on large datasets. The document demonstrates how public flight delay data can be converted to a knowledge graph and analyzed using techniques like PageRank, shortest paths, and querying for delayed flights. Overall, it argues that semantic technologies can help address the problem of data integration costs by enabling linked and standardized data.
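As a small illustration of the graph algorithms mentioned (textbook power iteration, not AnzoGraph's actual implementation), here is PageRank over a toy flight network in plain Python:

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as
    (src, dst) edge tuples. Returns a node -> score dict summing to 1."""
    nodes = {n for e in edges for n in e}
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or list(nodes)  # dangling nodes spread evenly
            share = damping * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank

# Hypothetical flight edges: JFK is the hub, so it should rank highest.
flights = [("SFO", "JFK"), ("LAX", "JFK"), ("JFK", "LHR"), ("LHR", "JFK")]
ranks = pagerank(flights)
```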
Airline reservations and routing: a graph use case (DataWorks Summit)
We've all been there before... you hear the announcement that your flight is canceled. Fellow passengers race to the gate agent to rebook on the next available flight. How do they quickly determine the best route from Berlin to San Francisco? Ultimately the flight route network is best solved as a graph problem. We will discuss our lessons learned from working with a major airline to solve this problem using the JanusGraph database. JanusGraph is an open source graph database designed for massive scale. It is compatible with several pieces of the open source big data stack: Apache TinkerPop (graph computing framework), HBase, Cassandra, and Solr. We will go into depth about our approach to benchmarking graph performance and discuss the utilities we developed. We will share our comparison results for evaluating which storage backend to use with JanusGraph. Whether you are productizing a new database or you are a frustrated traveler, a fast resolution is needed to satisfy everybody involved.
Speaker
Jason Plurad, Open Source Developer and Advocate, IBM
Chin Huang, Software Engineer, IBM
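The rebooking problem in the abstract above is, at its core, a shortest-path query. A minimal Dijkstra sketch in plain Python (not JanusGraph/TinkerPop code) over hypothetical flight edges, where the edge cost might be duration or a per-connection penalty:

```python
import heapq

def best_route(flights, origin, dest):
    """Dijkstra's shortest path over a flight network.
    Flights are (src, dst, cost) tuples; returns (total_cost, path)
    or None when no route exists."""
    graph = {}
    for src, dst, cost in flights:
        graph.setdefault(src, []).append((dst, cost))
    queue = [(0, origin, [origin])]  # (cost so far, airport, path)
    seen = set()
    while queue:
        cost, airport, path = heapq.heappop(queue)
        if airport == dest:
            return cost, path
        if airport in seen:
            continue
        seen.add(airport)
        for nxt, hop in graph.get(airport, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + hop, nxt, path + [nxt]))
    return None
```

A graph database answers the same question declaratively (e.g., a Gremlin traversal), but the underlying work is this kind of weighted search.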
Building the Autodesk Design Graph (Yotto Koga, Autodesk; Spark Summit)
This document discusses building an Autodesk Design Graph using Apache Spark. It involves extracting parts from 3D design files, generating shape descriptors for each part using spherical harmonics and bigrams, clustering and labeling parts, generating a bill of process model and design graph, and storing the results in a graph database and search index for querying. The process is implemented as a Spark batch job with isolated tests and capability to handle incremental data streams.
IBM Cloud Native Day April 2021: Serverless Data Lake (Torsten Steinbach)
- The document discusses serverless data analytics using IBM's cloud services, including a serverless data lake built on cloud object storage, serverless SQL queries using Spark, and serverless data processing functions.
- It provides an example of a COVID-19 data lake built on IBM Cloud that collects and integrates data from various sources, prepares and transforms the data, and makes it available for analytics and dashboards through serverless SQL queries.
Databricks CEO Ali Ghodsi introduces Databricks Delta, a new data management system that combines the scale and cost-efficiency of a data lake, the performance and reliability of a data warehouse, and the low latency of streaming.
In-Stream Processing Service Blueprint, Reference architecture for real-time ... (Grid Dynamics)
What is it about? In-Stream Event Processing is a new approach for building near-real-time big data systems with a rapidly growing user base and applications like clickstream analytics, preventive maintenance, or fraud detection. The maturity of some open source projects enables building an enterprise-grade In-Stream Processing service in-house. However, the open source world comprises many competing projects of different maturity and different perspectives, so the task of selecting effective and efficient projects is not straightforward. In this talk I’ll present a blueprint of an In-Stream Processing Service: enterprise-grade, reliable and scalable, cloud-ready, built from 100% open source components.
Vert.x is a toolkit for building reactive microservices applications on the JVM. It uses the reactor pattern with a single-threaded event loop to avoid the C10K problem. Verticles are lightweight concurrent units that communicate asynchronously via an event bus. This allows building scalable and reactive microservices. Vert.x supports websockets, clustering, reactive programming with RxJava, and can be deployed to production environments like AWS. It also integrates with Spring for dependency injection and configuration.
Generalized B2B Machine Learning by Andrew Waage (Data Con LA)
Abstract: In this talk, we propose a generalized machine learning framework for e-commerce businesses. The framework is responsible for over 30 different user-level predictions, including lifetime value, recommendations, churn prediction, engagement, and lead scoring. These predictions provide a vital layer of intelligence for a digital marketer. Kinesis is used to capture browsing information from over 120M users across 100 companies (both in-app and web). A data processing and feature engineering layer is built on Apache Spark. These features provide inputs to predictive models for business applications. Separate models for churn, lifetime value, product recommendation, and search are written on Spark. These models can be plugged into any marketing campaign for any integrated e-commerce company, leading to a generalized system. We finally present a monitoring system for machine learning called RS Sauron. This system provides more than 200 objective metrics measuring the health of predictive models and depicts KPIs for model accuracy in a continual setting.
The document summarizes several new features proposed for OpenStack at the OpenStack Summit Tokyo 2015, including:
1. Log Request ID mapping across OpenStack components like Nova, Cinder, Glance, and Neutron to improve logging and tracing of API calls.
2. Masakari, a new project providing VM high availability for OpenStack Compute.
3. Improving unshelve performance in Nova by caching VM images locally to avoid downloading during reboot.
BMC Engage 2015: IT Asset Management - An essential pillar for the digital en... (Jon Stevens-Hall)
This document discusses how IT asset management (ITAM) needs to evolve to support the digital enterprise. Assets are changing rapidly with virtualization, cloud computing, and the Internet of Things. Effective digital service management requires understanding both the services provided and the underlying assets. The document recommends aligning ITAM with digital services by using both traditional and new data collection methods, embedding ITAM into digital services, and taking a proactive approach to compliance and cost optimization. IT asset managers are well-positioned to provide oversight to CIOs in the new digital business environment.
(MBL303) Get Deeper Insights Using Amazon Mobile Analytics | AWS re:Invent 2014 (Amazon Web Services)
Choosing the right mobile analytics solution can help you understand user behavior, engage users, and maximize user lifetime value. After this session, you will understand how you can learn more about your users and their behavior quickly across platforms with just one line of code using Amazon Mobile Analytics.
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights... (DATAVERSITY)
Do you wonder how to process huge amounts of data in a short amount of time? If yes, this session is for you! You will learn why Apache Hadoop and Streams are the core frameworks that enable storing, managing, and analyzing vast amounts of data. You will learn the idea behind Hadoop's famous map-reduce algorithm and why it is at the heart of solutions that process massive amounts of data with flexible workloads and software-based scaling. We explore how to go beyond Hadoop with both real-time and batch analytics, usability, and manageability. For practical examples, we will use IBM InfoSphere BigInsights and Streams, which build on top of open source tooling when going beyond the basics and scaling up and out is needed.
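The map-reduce idea mentioned above is easiest to see in the canonical word-count example. A plain-Python sketch of the two phases — in a real Hadoop job, the map tasks run in parallel across input splits and the shuffle groups keys across the cluster:

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each input document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def reduce_phase(pairs):
    """Shuffle + reduce: group pairs by key and sum the counts."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

word_counts = reduce_phase(map_phase(["big data", "big hadoop"]))
```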
The document discusses computer and internet crime, including definitions of crime and different types of attacks such as viruses, worms, Trojan horses, denial-of-service attacks, and logic bombs. It also describes different types of perpetrators like hackers, crackers, insiders, industrial spies, cybercriminals, and cyberterrorists. Finally, it outlines some legal issues around fraud and recommendations for reducing internet vulnerabilities through risk assessment, security policies, education, and installing firewalls.
Cloud Camp: Infrastructure as a Service advanced workloads (Asaf Nakash)
This document contains information about Asaf Nakash, including his role as Cloud Valley CTO and Microsoft MVP. It provides an overview of Azure services and capabilities, including compute, networking, storage, security, analytics and more. It emphasizes how Azure provides security, visibility, control and integration capabilities to help customers gain visibility into their security posture and detect threats across subscriptions and resources.
People use water resources in four main ways - for daily use, recreation, energy, and agriculture. Water is used for drinking, washing, and other daily activities by individuals, with the average person in the US using 80-100 gallons per day. Recreational uses include swimming, boating, fishing, and tourism near bodies of water. Water power is harnessed through hydroelectric dams to generate electricity, providing over 20% of the electricity in some states. Agriculture relies on water for irrigation and to meet the needs of livestock, as water is essential for growing crops and raising animals.
I1 - Securing Office 365 and Microsoft Azure like a rockstar (or like a group... (SPS Paris)
Securing and maintaining a trustworthy Office 365 and Microsoft Azure deployment is not an easy task. In this session we'll take a look into how you can secure and control your cloud-based servers and services, data and users using Azure Active Directory, Azure Security Center, Privileged Identity Management and Advanced Security Management. In addition we’ll also take a look at how Operations Management Suite and Microsoft Advanced Threat Analytics can be used to provide better overall security for on-premises and hybrid deployments.
Big Data Expo 2015 - Trillium software Big Data and the Data Quality (BigDataExpo)
Successful Big Data initiatives rely on accurate, complete data, but the information they draw on is often not validated when it enters an organization. In this session we will look at the challenges big data brings to an organization, and how data quality principles are adapting to ensure business goals and return on investments in big data are realised. We will cover:
- Challenges of big data
- Turning data lakes into reservoirs
- How data quality tools are adapting
- Why data governance disciplines remain crucial
(SEC320) Leveraging the Power of AWS to Automate Security & Compliance (Amazon Web Services)
"You’ve made the move to AWS and are now reaping the benefits of decreased costs and increased business agility. How can you reap those same benefits for your cloud security and compliance operations? As building cloud-native applications requires different skill sets, architectures, integrations, and processes, implementing effective, scalable, and robust security for the cloud requires rethinking everything from your security tools to your team culture.
Attend this session to learn how to start down the path toward security and compliance automation and hear how DevSecOps leaders such as Intuit and Capital One are using AWS, DevOps, and automation to transform their security operations.
Session sponsored by evident.io"
TIBCO provides an analytics platform that delivers business value across the analytics spectrum from descriptive to predictive to prescriptive analytics. The platform includes Spotfire for visual analytics, predictive analytics using R scripting, and real-time event processing capabilities. It can consume and analyze various data sources including big data. The platform enables different types of users from data scientists to analysts to business users.
The document discusses using analytics to drive business actions and decisions. It covers topics like visual data discovery, predictive analytics, optimization, alerts and automation across batch, real-time and streaming analytics. Case studies are presented in areas like retail banking, industrial equipment management and customer analytics. Both quantitative and marketing analytics are discussed as well as how they can be used together to gain business insights.
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning (Kai Wähner)
Comparison of Data Preparation vs. Data Wrangling Programming Languages, Frameworks and Tools in Machine Learning / Deep Learning Projects.
A key task to create appropriate analytic models in machine learning or deep learning is the integration and preparation of data sets from various sources like files, databases, big data storages, sensors or social networks. This step can take up to 80% of the whole project.
This session compares different alternative techniques to prepare data, including extract-transform-load (ETL) batch processing (like Talend, Pentaho), streaming analytics ingestion (like Apache Storm, Flink, Apex, TIBCO StreamBase, IBM Streams, Software AG Apama), and data wrangling (DataWrangler, Trifacta) within visual analytics. Various options and their trade-offs are shown in live demos using different advanced analytics technologies and open source frameworks such as R, Python, Apache Hadoop, Spark, KNIME or RapidMiner. The session also discusses how this is related to visual analytics tools (like TIBCO Spotfire), and best practices for how the data scientist and business user should work together to build good analytic models.
Key takeaways for the audience:
- Learn various options for preparing data sets to build analytic models
- Understand the pros and cons and the targeted persona for each option
- See different technologies and open source frameworks for data preparation
- Understand the relation to visual analytics and streaming analytics, and how these concepts are actually leveraged to build the analytic model after data preparation
Video Recording / Screencast of this Slide Deck: https://youtu.be/2MR5UynQocs
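The kind of preparation step the session compares across tools can be sketched in plain Python. The record fields (`name`, `age`) and cleaning rules here are hypothetical, standing in for whatever an ETL tool, streaming ingester, or wrangling UI would apply:

```python
def prepare(records):
    """Standardize types, drop invalid rows, and derive a feature — the
    cleanup work that can consume ~80% of an ML project."""
    cleaned = []
    for rec in records:
        try:
            age = int(rec["age"])
        except (KeyError, TypeError, ValueError):
            continue  # drop rows with a missing or unparseable age
        if not (0 < age < 120):
            continue  # drop implausible values
        cleaned.append({
            "name": rec.get("name", "").strip().title(),
            "age": age,
            "age_bucket": "minor" if age < 18 else "adult",  # derived feature
        })
    return cleaned
```

The same rules could equally be expressed as a Talend job, a Storm bolt, or a Trifacta recipe — the talk's point is choosing the right tool and persona for this step, not the rules themselves.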
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl... (Matt Stubbs)
This document provides an overview of how workflows can help make big data insights more accessible. It discusses how workflows allow customers to benefit from cost reductions and faster deployment times. Examples are given of customers in healthcare and banking that have reduced surgical infection rates and cut model development time in half using workflows. The document also covers how to pull insights together and deploy predictive models to external systems using tools like Tibco Statistica. It provides a technical overview of building predictive analytics workflows for big data, including examples of workflow templates for Spark, H2O, and deep learning with CNTK.
Presentation given at the Joint Statistical Meetings in Boston in Aug. 2014, on applications of the R language using TERR, in Business Intelligence and Real Time applications
Applying R in BI and Real Time applications, EARL London 2015 (Lou Bajuk)
Overview of the challenges of applying R in enterprise analytic applications, and TIBCO's approach to these challenges with Spotfire, TERR and Streambase.
This document summarizes a presentation about Oracle Analytics Cloud (OAC) given by Mike Killeen of Edgewater Ranzal. The presentation provides an overview of OAC and its capabilities, including standard and enterprise editions. It demonstrates OAC's ability to integrate business analytics solutions like EPM, BI and big data technologies to help improve business performance. The document also discusses the growing need for business analytics and how OAC can help organizations better analyze data and gain actionable insights.
Deploying R in BI and Real time Applications (Lou Bajuk)
Overview of how Spotfire and TERR enables the deployment of R language analytics into Business Intelligence and Real time applications, including several examples. Presented at useR 2014 at UCLA on 7/2/14
How to Apply Big Data Analytics and Machine Learning to Real Time Processing ... (Codemotion)
The world gets connected more and more every year due to Mobile, Cloud and Internet of Things. "Big Data" is currently a big hype. Large amounts of historical data are stored in Hadoop to find patterns, e.g. for predictive maintenance or cross-selling. But how to increase revenue or reduce risks in new transactions? "Fast Data" via stream processing is the solution to embed patterns into future actions in real-time. This session discusses how machine learning and analytic models with R, Spark MLlib, H2O, etc. can be integrated into real-time event processing. A live demo concludes the session
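The integration pattern described — embedding an offline-trained model into real-time event processing — can be sketched as follows. The logistic-regression weights are hypothetical, standing in for a model exported from R, Spark MLlib, or H2O; this illustrates the pattern, not any specific product's API:

```python
import math

def make_scorer(weights, bias, threshold=0.5):
    """Wrap pre-trained model parameters as a scoring function that can
    be applied to each incoming event (a tuple of numeric features)."""
    def score(event):
        z = bias + sum(w * x for w, x in zip(weights, event))
        p = 1.0 / (1.0 + math.exp(-z))  # logistic function
        return "ALERT" if p >= threshold else "OK"
    return score

def process_stream(events, scorer):
    """The 'Fast Data' pattern: patterns are mined offline on historical
    data, then applied online to every new transaction in real time."""
    return [(event, scorer(event)) for event in events]
```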
Sensor Data Management & Analytics: Advanced Process Control (TIBCO_Software)
Michael O'Connell is the Chief Analytics Officer at TIBCO Software. He specializes in sensor data management and analytics, and advanced process control.
The document discusses sensor data analytics use cases including manufacturing process control, oil and gas production optimization, wind turbine operations optimization, and semiconductor manufacturing yield improvement. It highlights the challenges of analyzing large volumes of sensor data streams in real-time for anomaly detection and predictive maintenance.
TIBCO software solutions like Spotfire, Data Virtualization, and Data Science are presented as enabling technologies for collecting, integrating, modeling, and visualizing sensor data to drive insights and actions. Case studies demonstrate how sensor data analytics improves processes, reduces costs, and increases asset uptime in these industries.
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser... (Kai Wähner)
This document provides an overview of streaming analytics and compares different streaming analytics frameworks. It begins with real-world use cases in various industries and then defines what a data stream is. The core components of a streaming analytics processing pipeline are described, including ingestion, preprocessing, and real-time and batch processing. Popular open-source frameworks like Apache Storm and AWS Kinesis are highlighted. The document concludes by noting that both streaming analytics frameworks and products are growing significantly to enable real-time analytics on streaming data.
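The three pipeline stages named above (ingestion, preprocessing, real-time processing) can be sketched as chained Python generators — a single-process stand-in for work that Storm or Kinesis would distribute across a cluster:

```python
def ingest(raw_lines):
    """Ingestion stage: parse raw "device,value" messages into records."""
    for line in raw_lines:
        device, _, value = line.partition(",")
        yield {"device": device, "value": float(value)}

def preprocess(records, lo=0.0, hi=100.0):
    """Preprocessing stage: filter out-of-range sensor readings."""
    for rec in records:
        if lo <= rec["value"] <= hi:
            yield rec

def realtime_average(records):
    """Real-time processing stage: emit a running mean per device
    after each event."""
    totals, counts, out = {}, {}, []
    for rec in records:
        d = rec["device"]
        totals[d] = totals.get(d, 0.0) + rec["value"]
        counts[d] = counts.get(d, 0) + 1
        out.append((d, totals[d] / counts[d]))
    return out
```

Chaining generators mirrors the pipeline topology: each stage consumes the previous stage's output as it arrives, never materializing the whole stream.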
Big data is an opportunity for communications service providers (CSPs) to create the intelligence for operating their infrastructures more efficiently, to analyze the success of their services, and to create a better personal experience for their customers.
CSP Top executives, Network and IT managers and Marketing, are eager to exploit the large amount of information to achieve better business decisions. They expect their Chief Technical Officer to provide end-to-end analytic solutions based on the data available in their IT and network infrastructure.
This presentation analyzes the complete value chain that can transform CSPs’ data to knowledge. It covers the sources of information, the data collection tools, the analytic platforms providing quick data access, and finally the business intelligence use cases with the presentation and visualization of the results and predictions.
Accelerate Self-Service Analytics with Data Virtualization and Visualization (Denodo)
Watch full webinar here: https://bit.ly/39AhUB7
Enterprise organizations are shifting to self-service analytics as business users need real-time access to holistic and consistent views of data regardless of its location, source or type for arriving at critical decisions.
Data Virtualization and Data Visualization work together through a universal semantic layer. Learn how they enable self-service data discovery and improve performance of your reports and dashboards.
In this session, you will learn:
- Challenges faced by business users
- How data virtualization enables self-service analytics
- Use case and lessons from customer success
- Overview of the highlight features in Tableau
This presentation gives an overview of StreamCentral technology targeted for IT professionals. StreamCentral is software to model and build Big Data Solutions. StreamCentral consists of a Big Data Solutions Modeler that not only makes it easy to model traditional BI/DW and Big Data solutions but also auto deploys the model on the latest innovations in Big Data Management solutions (like HP Vertica and SQL Server Parallel Data Warehouse). StreamCentral Big Data Server executes the model definition in real-time. StreamCentral drastically reduces the time to market, risk and cost associated with building traditional BI/DW and Big Data solutions!
Spotfire is an analytics platform that provides value across use cases from data discovery to predictive analytics. It offers advanced analytics capabilities like predictive modeling, big data handling, and real-time monitoring. Manufacturing customers use Spotfire for applications such as quality control, reliability analysis, equipment monitoring, and supply chain optimization.
OpsRamp and Mystic River are joining forces to bring you this interactive webinar. Can we create gateways between IT and OT? IT/OT convergence has been defined as the integration of information technology (IT) systems used for data-centric computing with operational technology (OT) systems used to monitor events, processes, and devices and make adjustments in enterprise and industrial operations. But what will this look like, and what is the new role of IT operations management? What is changing in IT and OT to align these worlds? This Tech Talk features our partner Mystic River Consulting, a firm with proven, repeatable service delivery methodologies and a proprietary platform that drives transformational IT project results at game-changing speed. We’ll discuss the convergence of IT and OT and dive into a demonstration to show what’s possible with OpsRamp.
Watch the recording: https://www.brighttalk.com/webcast/17791/416457
Learn more at https://www.opsramp.com
Also, follow us on social media channels to learn about product highlights, news, announcements, events, conferences and more:
Twitter - https://www.twitter.com/OpsRamp
LinkedIn - https://www.linkedin.com/company/opsramp
Facebook - https://www.facebook.com/OpsRampHQ/
Similar to Houston Energy Data Science Meet up_TIBCO Slides (20)
May Marketo Masterclass, London MUG May 22 2024.pdf (Adele Miller)
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Artificial Intelligence and XPath Extension Functions (Octavian Nadolu)
The purpose of this presentation is to provide an overview of how you can use AI from XSLT, XQuery, Schematron, or XML Refactoring operations, the potential benefits of using AI, and some of the challenges we face.
SOCRadar's Aviation Industry Q1 Incident Report is out now!
The aviation industry has always been a prime target for cybercriminals due to its critical infrastructure and high stakes. In the first quarter of 2024, the sector faced an alarming surge in cybersecurity threats, revealing its vulnerabilities and the relentless sophistication of cyber attackers.
SOCRadar’s Aviation Industry, Quarterly Incident Report, provides an in-depth analysis of these threats, detected and examined through our extensive monitoring of hacker forums, Telegram channels, and dark web platforms.
Mobile App Development Company In Noida | Drona Infotech
Looking for a reliable mobile app development company in Noida? Look no further than Drona Infotech. We specialize in creating customized apps for your business needs.
Visit Us For : https://www.dronainfotech.com/mobile-application-development/
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris (Neo4j)
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Discover the latest innovations from Neo4j, including the latest cloud integrations and product improvements that make Neo4j an essential choice for developers building applications with interconnected data and generative AI.
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf (Undress Baby)
The quest for the best AI face swap solution is marked by an amalgamation of technological prowess and artistic finesse, where cutting-edge algorithms seamlessly replace faces in images or videos with striking realism. Leveraging advanced deep learning techniques, the best AI face swap tools meticulously analyze facial features, lighting conditions, and expressions to execute flawless transformations, ensuring natural-looking results that blur the line between reality and illusion, captivating users with their ingenuity and sophistication.
Web:- https://undressbaby.com/
GraphSummit Paris - The art of the possible with Graph Technology (Neo4j)
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
What is Augmented Reality Image Tracking (pavan998932)
Augmented Reality (AR) Image Tracking is a technology that enables AR applications to recognize and track images in the real world, overlaying digital content onto them. This enhances the user's interaction with their environment by providing additional information and interactive elements directly tied to physical images.
Software Engineering, Software Consulting, Tech Lead, Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Transaction, Spring MVC, OpenShift Cloud Platform, Kafka, REST, SOAP, LLD & HLD.
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...kalichargn70th171
A dynamic process unfolds in the intricate realm of software development, dedicated to crafting and sustaining products that effortlessly address user needs. Amidst vital stages like market analysis and requirement assessments, the heart of software development lies in the meticulous creation and upkeep of source code. Code alterations are inherent, challenging code quality, particularly under stringent deadlines.
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Takashi Kobayashi and Hironori Washizaki, "SWEBOK Guide and Future of SE Education," First International Symposium on the Future of Software Engineering (FUSE), June 3-6, 2024, Okinawa, Japan
WhatsApp offers simple, reliable, and private messaging and calling services for free worldwide. With end-to-end encryption, your personal messages and calls are secure, ensuring only you and the recipient can access them. Enjoy voice and video calls to stay connected with loved ones or colleagues. Express yourself using stickers, GIFs, or by sharing moments on Status. WhatsApp Business enables global customer outreach, facilitating sales growth and relationship building through showcasing products and services. Stay connected effortlessly with group chats for planning outings with friends or staying updated on family conversations.
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
Microservice Teams - How the cloud changes the way we workSven Peters
A lot of technical challenges and complexity come with building a cloud-native and distributed architecture. The way we develop backend software has fundamentally changed in the last ten years. Managing a microservices architecture demands a lot of us to ensure observability and operational resiliency. But did you also change the way you run your development teams?
Sven will talk about Atlassian’s journey from a monolith to a multi-tenanted architecture and how it affected the way the engineering teams work. You will learn how we shifted to service ownership, moved to more autonomous teams (and its challenges), and established platform and enablement teams.
7. TIBCO is the only analytics platform that provides business value across the Analytics Spectrum.
[Chart: Analytics Maturity (Measure, Diagnose, Predict, Optimize, Operationalize, Automate) vs. Value to the Organization (Immediate, Long-Term, Competitive Advantage), spanning Self-service Dashboards, Event Processing, and Predictive and Prescriptive Analytics]
30. Extensible Predictive Analytics – Analysis Workflows
Interactive Spotfire Analytics with R
- Data Function
- Robust Cluster Analysis
- Any Analysis in R / CRAN
Variables driving segments
- Random Forest
Revenue by product
- Color by segment
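The segmentation step on this slide runs as a Spotfire data function in R/TERR, drawing on robust clustering from CRAN. As a rough illustration of what such a segmentation computes (a minimal k-means sketch in Python on made-up customer data, not the deck's actual TERR code):

```python
# Minimal k-means sketch of customer segmentation (hypothetical data;
# the deck's workflow uses robust clustering in R/CRAN via a data function).
import random

def kmeans(points, k, iters=20, seed=0):
    """Cluster 2-D points into k segments; returns a label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, (x, y) in enumerate(points):
            labels[i] = min(range(k),
                            key=lambda c: (x - centers[c][0]) ** 2
                                        + (y - centers[c][1]) ** 2)
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return labels

# Two well-separated (revenue, frequency) segments.
customers = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8),
             (9.8, 10.1), (10.2, 9.9), (9.9, 10.3)]
labels = kmeans(customers, k=2)
print(labels)
```

In Spotfire the resulting segment labels would come back as a column, which is what drives the "color by segment" view of revenue by product.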
48. Managing Supply Chain
Big Data
– Analysis of production
– Analysis of contracts and product inventory
Fast Data
– Location data from ships and trains, weather and tides
– Manage product supply
– Optimize fuel use
Benefits
– Optimize product contracts
– Maximize product shipped
– Minimize logistics cost
50. Managing Industrial Equipment
Big Data
– Analysis of production
– Failure analytics
Fast Data
– Real-time sensor data
– Leading indicator for shutdowns
– Drilling: kick detection
– Flow monitoring
Benefits
– Reduced NPT: Big $$s
– System reliability
– Efficient drilling
51. Equipment Monitoring & Management
Data Monitoring
• Motor temperature
• Motor vibration
• Current
• Intake pressure
• Intake temperature
[Diagram: pump components – flow, electrical power cable, pump, intake, protector, ESP motor, pump monitoring unit]
Video: https://youtu.be/vIVepQRl5SY
52. Equipment Monitoring & Management
• Business Opportunities
• Pump health & performance surveillance
• Condition-based maintenance
• Analysis and Data
• Effects of operating conditions on performance
• Effects of suppliers on reliability
• Component faults and failure analysis
• Value and Financial Impact
• Prioritization of engineering and retrofit
• Supplier involvement in system reliability
• ID systems for Engineering focus
• Warranty cost recovery
Video: https://youtu.be/vIVepQRl5SY
54. Sensor Analytics
Methods: Trend Analysis, Combination of Rules, CUSUM Analysis, Statistical Analysis, Statistical Process Control, Machine Learning
Location Change
– Variable moves up or down
Slope Change
– Variable changes trend
Variance Change
– Variable becomes more/less volatile
Process Threshold
– Shewhart control chart
Failure Model
– y (0/1) = f(X, b) + e; f = logistic regression, trees, SVM, nnet, ...
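The CUSUM analysis named above can be sketched in a few lines. This is a generic one-sided tabular CUSUM on hypothetical motor-temperature readings, not TIBCO's implementation; `target`, `k`, and `h` are the usual in-control mean, allowance, and decision interval:

```python
# Illustrative one-sided tabular CUSUM that flags a sustained upward
# location change in a sensor stream (hypothetical data and parameters).
def cusum_upper(samples, target, k, h):
    """Return indices where the upper CUSUM statistic exceeds threshold h.

    target: in-control mean; k: allowance (slack) per sample;
    h: decision interval."""
    s, alarms = 0.0, []
    for i, x in enumerate(samples):
        s = max(0.0, s + (x - target - k))  # accumulate excess above target+k
        if s > h:
            alarms.append(i)
            s = 0.0  # restart the statistic after an alarm
    return alarms

# Motor temperature: stable around 70, then a sustained jump.
readings = [70, 71, 69, 70, 72, 70, 69, 71, 85, 86, 87, 88, 90]
print(cusum_upper(readings, target=70, k=2, h=10))  # → [8, 9, 10, 11, 12]
```

Unlike a simple threshold, the CUSUM accumulates small deviations, so it also catches gradual drifts that never breach a single-sample limit.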
55. Fast Data Analytics
1. Analytics models
2. Data streams
3. Calculations on live data
4. Analysis notifications
Video: https://youtu.be/vIVepQRl5SY
62. Energy Forum
September 1st – 2nd | Norris Conference Center | Houston, TX
Learn how some of the major players in the energy industry are using Spotfire to revolutionize their business:
• How to minimize risks by better understanding exposure to asset integrity issues
• Using analytics to control margins and conduct customer profiling
• Leveraging forensics to reduce NPT and monitor production
• Production optimization techniques
http://energyforum.tibco.com/
65. Webcasts
Insight and Action – Analyzing Your OSIsoft PI System Data
Tuesday, July 7, 2015, 1 PM EST
Presenters: Michael O'Connell & Dave Leigh
Predictive Analytics in the Energy Sector: Asset Valuation
Tuesday, July 28, 2015, 1 PM EST
Presenters: Michael O'Connell & Peter Shaw, with Haas Engineering and R Lacy
Seeing Stars: the Gartner BI Bakeoff
Recording, May 27, 2015
Presenters: Anna Nowakowska & Michael O'Connell
Events: spotfire.tibco.com/about-us/events
Visual Analytics: for exploratory analysis and publication reporting.
Finally, one of the most valuable initiatives, which builds on the previous one, is the ability to sense, respond to, and influence business moments.
Business moments are situations of interest: opportunities for the business to marry insights from big data with an understanding of the real-time context in order to take action.
Example: predictive maintenance. A machine is approaching its maintenance window but is not there yet. The production forecast is low right now but will soon intensify, so the system proposes that the operations team execute maintenance as soon as possible, since that is the scenario with the least impact on the forecast.
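The "least impact" choice in that example is a small optimization over the forecast. A minimal sketch, assuming a per-shift production forecast and a fixed maintenance duration (all numbers hypothetical):

```python
# Illustrative sketch: pick the maintenance window with the least impact
# on a production forecast, as in the predictive-maintenance example.
def best_window(forecast, duration):
    """Start index of the contiguous window with minimal forecast production."""
    costs = [sum(forecast[i:i + duration])
             for i in range(len(forecast) - duration + 1)]
    return min(range(len(costs)), key=costs.__getitem__)

# Forecast production per shift: low now, intense later.
forecast = [20, 15, 18, 60, 75, 80, 70]
print(best_window(forecast, duration=2))  # → 1: maintain in the low-demand shifts
```

A real system would add constraints (crew availability, how soon the maintenance deadline falls), but the core decision is this windowed minimum.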
Managing ships, trains, and vehicles, taking into account weather and business metrics.
Pit to Port: long train tracks (this operation runs the longest trains in the world) and ships waiting to be loaded; the tide must be managed while complying with SLAs.
Big Data provides
Michael O’Connell
Thresholds can include a change in location, slope, or variance, e.g. motor temperature jumping 20 degrees in an hour;
anomalies exceeding process control limits;
or an empirical machine-learning model.
The Event Server calculates the models on live data and provides notifications, including emails to engineers and/or logging to operational data stores or BPM systems.
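The "process control limits" rule in these notes can be sketched as a Shewhart check, assuming 3-sigma limits fitted from an in-control baseline (hypothetical readings; the actual Event Server wiring and notification hooks are not shown here):

```python
# Illustrative Shewhart-style check (hypothetical data, not the TIBCO
# Event Server API): flag live readings outside 3-sigma control limits
# computed from an in-control baseline.
import statistics

def control_limits(baseline):
    """Center the limits on the baseline mean, ±3 sample standard deviations."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return mu - 3 * sigma, mu + 3 * sigma

def out_of_control(stream, lcl, ucl):
    """Indices of readings breaching the control limits."""
    return [i for i, x in enumerate(stream) if not lcl <= x <= ucl]

baseline = [70, 71, 69, 70, 72, 70, 69, 71, 70, 71]  # stable motor temperature
lcl, ucl = control_limits(baseline)
live = [70, 71, 90, 69, 50]  # a 20-degree jump and a drop
print(out_of_control(live, lcl, ucl))  # → [2, 4]
```

In the architecture described above, each flagged index would trigger a notification (an email to an engineer, or a log entry in an operational data store or BPM system).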