Big Data, little data, whatever…
Making the world a little smarter
Matt Denesuk
Manager, Natural Resources Modeling and Social Analytics, IBM Research
Partner, IBM Venture Capital Group
Launch of SPE Technical Section, Petroleum
Data-Driven Analytics (PD2A), October 8, 2012
3 big things
• Physical-meets-Digital
• Data-driven approach
• Heterogeneity & integration (data & approaches)
Physical-meets-digital is driving highly physical industries toward
being more about moving & manipulating data.
3 key things:
INSTRUMENTED: meters, sensors, actuators, IP enablement, ...
+ INTERCONNECTED: transmitters, networks, taxonomies, ...
+ INTELLIGENT: reporting, visualization, predictive analytics & modeling, decision mgmnt, closed-loop automation, ...
= Physical-meets-Digital, Smarter Planet, Cyber-physical systems, …
Heavy, physical industries are increasingly infusing their operations
with information technology, and this will result in higher growth &
productivity trajectories.
[Charts: IT Spending / Revenue (%) by industry, plotted against Operating Margin (%); periods labeled 2009 and 2009 – 2010]
A 0.5pt increase in IT spend ratio would drive $31B in incremental IT spend.
Industries where value is generated by moving and manipulating data
have high IT-spend ratios (and high productivity growth).
Data-driven approach
How big the data are is just one factor…
[Chart: Analytical &/or Data Complexity vs. Data Size. Examples plotted: Watson, Computer Chess, Search Engines, Statistical Translation, Customer Churn]
But bigger data sets let us use a whole new set of
“dumb” tools that can deliver high value with
remarkable speed.
Example: Google & Statistical Translation
Regular Science approach
Use of language is infinitely complex, but you can teach a computer all the rules and content.
• Employ language experts to codify rules, exceptions, vocabulary mappings, etc.
• Apply transformation to user’s query.
• Costly, hard to scale
• Can translate nearly any statement (but accuracy variable)
• In theory, could be better than human.
Statistical (data-driven) approach
People say the same kind of things over and over. And somebody has already translated it.
• Gather and classify lots of translated docs (websites, UN, books, …)
• Identify & match patterns
• Map to user’s translation query.
• Incrementally low cost, highly scalable.
• Limited in scope to digitized docs that have been translated before
• Limited by skill of human translators
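The statistical approach can be illustrated with a toy sketch, assuming a tiny hand-made parallel corpus. The corpus, the function names, and the "most frequent translation wins" rule are all illustrative assumptions, not a description of how any production translation system works.

```python
from collections import Counter, defaultdict

# Hypothetical parallel corpus: (source phrase, human translation) pairs.
PARALLEL_CORPUS = [
    ("good morning", "bonjour"),
    ("good morning", "bonjour"),
    ("good morning", "bon matin"),
    ("thank you", "merci"),
]


def build_phrase_table(pairs):
    """Count how often each source phrase was translated each way."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source.lower()][target] += 1
    return table


def translate(phrase, table):
    """Return the most frequent prior translation, or None if unseen."""
    candidates = table.get(phrase.lower())
    if not candidates:
        return None  # scope is limited to what has been translated before
    return candidates.most_common(1)[0][0]


table = build_phrase_table(PARALLEL_CORPUS)
```

Calling `translate("Good morning", table)` picks the translation humans used most often, while an unseen phrase returns `None`, which mirrors the two limitations listed for this approach: scope and the skill of the human translators.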
Heterogeneity & Integration
Two ways of seeing a data set (and the world)
Computer Scientist – “get the knowledge locked in the data”
• The data set is a record of everything that happened, e.g.,
– All customer transactions last month
– All friendship links between members of a social networking site
• Goal is to find interesting patterns, rules, and/or associations.
Regular Scientist – “get the knowledge”
• The data set is a partial, and often very noisy, reflection of some underlying phenomenon, e.g.,
– Emission spectra from stars
– Battery voltage varying with current, time, and temperature
• Goal is better understanding or ability to predict, often through a mathematical model
(See D. Lambert, or R. Mahoney, e.g.)
But the approaches & skill sets can
be joined…
Examples of hybrid, integrated approaches
Computer Chess
• Simple, well-defined rules, but computationally impossible to solve (today)
• Relies on position evaluation function.
– Use human-derived chess theory to set up initially.
– But tune by comparing to the best games humans have played.
• Better than any human (1997)
• Issues
– Saturation, fatigue, psychology, …
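The idea of a position evaluation function that starts from chess theory and is then tuned against strong human play can be sketched as follows. This is a minimal illustration under stated assumptions: the feature set (material difference per piece type), the perceptron-style update, and the training pair are all hypothetical, and the slides do not say how any real chess program was tuned.

```python
# Feature vector: material difference (own minus opponent) per piece type,
# in the order pawn, knight, bishop, rook, queen.
THEORY_WEIGHTS = [1.0, 3.0, 3.0, 5.0, 9.0]  # classic textbook piece values


def evaluate(features, weights):
    """Linear position evaluation: weighted sum of the features."""
    return sum(w * f for w, f in zip(weights, features))


def tune(weights, examples, rate=0.1, epochs=20):
    """Nudge weights until the position a strong player chose outscores
    the alternative that player rejected (perceptron-style update)."""
    weights = list(weights)  # leave the theory-based starting point intact
    for _ in range(epochs):
        for chosen, rejected in examples:
            if evaluate(chosen, weights) <= evaluate(rejected, weights):
                for i in range(len(weights)):
                    weights[i] += rate * (chosen[i] - rejected[i])
    return weights


# Hypothetical training pair: the expert preferred ending up a bishop ahead
# over ending up a knight ahead.
examples = [([0, 0, 1, 0, 0], [0, 1, 0, 0, 0])]
tuned = tune(THEORY_WEIGHTS, examples)
```

With the theory weights alone the two positions tie; after tuning, the bishop weight rises slightly and the knight weight falls slightly, so the evaluation agrees with the expert's choice.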
Buzz & the CMO
• People’s opinions reflected in many digitized forms
– Articles, blogs, social media, playlists, …
• “Big Data” search & transform capabilities can generate buzz metrics (“ink”, sentiment, category, …)
• But what to do with them? Need to apply traditional, small-data modeling approaches.
• Examples
– Pre-launch promotion management for albums
– Movie trailer management
Hybrid example: “equipment health” models driving operational
optimization
Oil & Gas Scenario
Gas compressor showing signs of trouble
3 months before a scheduled turnaround.
The system indicates that lowering
pressure by 20% will extend health
enough to make it to turnaround.
– But then production levels will not be
sufficient to fulfill scheduled shipment.
The system identifies that another
platform can be run for 30 days at 115%
throughput without significant risk before
its next scheduled turnaround.
Coordinated actions taken, and $40M
production loss avoided.
Trying to combine 3 different kinds of modeling
• Data-driven / Machine-learning
– Early days, often not enough data
– Biased: only a limited region of parameter space explored (by management design)
• Knowledge-based
– Rule capture, experience
– Initial use to generate hypotheses for other approaches.
• Physics-based
– Difficult to scale
– Use for seed models
– Locked-up in OEMs?
Also simulation, for what-if analyses, and verification. See Peng et al.
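One way to picture a physics-based seed model combined with a data-driven correction is the sketch below. Everything in it is an assumption made for illustration: the battery example, the ohmic-drop formula used as the seed model, the parameter values, the observations, and the choice of a straight-line fit to the residuals.

```python
def physics_voltage(current, v_open=12.0, resistance=0.05):
    """Seed model: open-circuit voltage minus ohmic drop (V = V0 - I*R)."""
    return v_open - current * resistance


def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns slope, intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Hypothetical observations: (current in A, temperature in deg C, measured V).
observations = [(10.0, 0.0, 11.3), (10.0, 20.0, 11.5), (10.0, 40.0, 11.7)]

# The data-driven part learns only what the physics model misses.
temps = [t for _, t, _ in observations]
residuals = [v - physics_voltage(i) for i, _, v in observations]
slope, intercept = fit_line(temps, residuals)


def hybrid_voltage(current, temp):
    """Physics prediction plus the learned temperature correction."""
    return physics_voltage(current) + intercept + slope * temp
```

Because the learned part only has to explain the residual, it can work with far fewer observations than a purely data-driven model, which speaks to the "often not enough data" concern above.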
Example: Condition-based Management
[Diagram of an example process. Elements shown:]
• Multiple sensor data streams, environmental data, text data, image data
• Outcomes
• Higher-order “events” & measures
• Probabilistic models / rule mining
• Physical models
• Actionable rules, measures, & options
• Management system
– Maintenance optimization
– Use / output optimization
– Energy / comfort / safety balancing
Broad range of applications.
[Diagram. Categories shown: Heavy Infrastructure; Business Equipment / Consumer Products; Human Health?]
Examples shown: bridges, water infrastructure, railroads, aircraft, mining equipment, oil pipelines, oil platforms, steel manufacture, trucking, mobile computers, IT infrastructure, home appliances, buildings (HVAC, elevators, lighting, …), photocopiers, refrigeration
Business value requires both Modeling and Process Integration
• Many organizations not used to making data-driven decisions.
– Culturally
– Process-wise
• Mathematical proof of business value not initially compelling
• Example: CbM & false positives.
• Initial deployment very risky!
[Diagram: Modeling & Analytics vs. Process Integration, with capability & value growth. Stages shown: Models developed & tested; 1. Integration pilot & evaluation; 2. Deploy/scale]
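One plausible reading of the condition-based management (CbM) false-positives example is the base-rate effect: when real faults are rare, even a fairly accurate alarm is wrong most of the time, which makes the business case hard to prove early on. The sketch below works through that arithmetic; the prevalence, sensitivity, and false-positive rate are hypothetical values chosen only for illustration.

```python
def alarm_precision(prevalence, sensitivity, false_positive_rate):
    """Fraction of alarms that correspond to a real fault (Bayes' rule)."""
    true_alarms = sensitivity * prevalence
    false_alarms = false_positive_rate * (1.0 - prevalence)
    return true_alarms / (true_alarms + false_alarms)


# Hypothetical figures: 1% of inspected units are really degrading,
# the model catches 90% of them, and it flags 5% of healthy units.
precision = alarm_precision(
    prevalence=0.01, sensitivity=0.90, false_positive_rate=0.05
)
print(f"{precision:.1%} of alarms are real faults")
```

Under these assumed numbers, fewer than one alarm in six points to a real fault, so a maintenance crew would mostly be sent out for nothing unless the process around the model is designed to absorb that.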
Key points
• Physical-meets-Digital is happening
• This makes data-driven approaches much more
important
• But most real problems require integration of
very different approaches and data types
– Not easy to build these teams
• The realities of current culture & process must be
addressed early.

Big data, little data, whatever

  • 1. Big Data, little data, whatever… Making the world a little smarter Matt Denesuk Manager, Natural Resources Modeling and Social Analytics, IBM Research Partner, IBM Venture Capital Group Launch of SPE Technical Section, Petroleum Data-Driven Analytics (PD2A), October 8, 2012
  • 2. 3 big things • Physical-meets-Digital • Data-driven approach • Heterogeneity & integration (data & approaches)
  • 3. Physical-meets-digital is driving highly physical industries toward being more about moving & manipulating data. INSTRUMENTED (meters, sensors, actuators, IP enablement, ...) + INTERCONNECTED (transmitters, networks, taxonomies, ...) + INTELLIGENT (reporting, visualization, predictive analytics & modeling, decision mgmnt, closed-loop automation, ...) = Physical-meets-Digital, Smarter Planet, Cyber-physical systems, …
  • 4. Heavy, physical industries are increasingly infusing their operations with information technology, and this will result in higher growth & productivity trajectories. [Chart: IT Spending/Revenue (%) vs. Operating Margin (%), 2009 – 2010.] A 0.5pt increase in IT spend ratio would drive $31B in incremental IT spend. Industries where value is generated by moving and manipulating data have high IT-spend ratios (and high productivity growth).
  • 6. How Big the data are is just one factor… [Chart: Analytical &/or Data Complexity vs. Data Size, with points for Watson, Computer Chess, Search Engines, Statistical Translation, and Customer Churn.] But bigger data sets let us use a whole new set of “dumb” tools that can deliver high value with remarkable speed.
  • 7. Example: Google & Statistical Translation
    Regular Science approach: Use of language is infinitely complex, but you can teach a computer all the rules and content. • Employ language experts to codify rules, exceptions, vocabulary mappings, etc. • Apply transformation to user’s query. • Costly, hard to scale • Can translate nearly any statement (but accuracy variable) • In theory, could be better than human.
    Statistical (data-driven) approach: People say the same kind of things over and over. And somebody has already translated it. • Gather and classify lots of translated docs (websites, UN, books, …) • Identify & match patterns • Map to user’s translation query. • Incrementally low cost, highly scalable. • Limited in scope to digitized docs that have been translated before • Limited by skill of human translators
  • 9. Two ways of seeing a data set (and the world)
    Computer Scientist – “get the knowledge locked in the data” • The data set is a record of everything that happened, e.g., – All customer transactions last month – All friendship links between members of a social networking site • Goal is to find interesting patterns, rules, and/or associations.
    Regular Scientist – “get the knowledge” • The data set is a partial, and often very noisy, reflection of some underlying phenomenon, e.g., – Emission spectra from stars – Battery voltage varying with current, time, and temperature • Goal is better understanding or ability to predict, often through a mathematical model.
    (See D. Lambert, or R. Mahoney, e.g.) But the approaches & skill sets can be joined…
  • 10. Examples of hybrid, integrated approaches
    Computer Chess • Simple, well-defined rules, but computationally impossible to solve (today) • Relies on position evaluation function. – Use human-derived chess theory to set up initially. – But tune by comparing to the best games humans have played. • Better than any human (1997) • Issues – Saturation, fatigue, psychology, …
    Buzz & the CMO • People’s opinions reflected in many digitized forms • Articles, blogs, social media, playlists, … • “Big Data” search & transform capabilities can generate buzz metrics (“ink”, sentiment, category, …) • BUT WHAT TO DO WITH THEM? Need to apply traditional, small-data modeling approaches. • Examples • Pre-launch promotion management for albums • Movie trailer management
  • 11. Hybrid example: “equipment health” models driving operational optimization. Oil & Gas Scenario: Gas compressor showing signs of trouble 3 months before a scheduled turnaround. The system indicates that lowering pressure by 20% will extend health enough to make it to turnaround. But then production levels will not be sufficient to fulfill scheduled shipment. The system identifies that another platform can be run for 30 days at 115% throughput without significant risk before its next scheduled turnaround. Coordinated actions taken, and $40M production loss avoided.
  • 12. Trying to combine 3 different kinds of modeling • Data-driven / Machine-learning – Early days, often not enough data – Bias: limited region of parameter spaces explored (by management design) • Knowledge-based – Rule capture, experience – Initial use to generate hypotheses for other approaches. • Physics-based – Difficult to scale – Use for seed models – Locked-up in OEMs? Also simulation, for what-if analyses, and verification. See Peng et al.
  • 13. Example: Condition-based Management
    Example process (diagram elements): inputs (multiple sensor data streams, outcomes, environmental data, text data, image data); higher-order “Events” & measures; Probabilistic Models / Rule Mining; Physical Models; actionable rules, measures, & options; management system (• Maintenance optimization • Use / output optimization • Energy / comfort / safety balancing).
    Broad range of applications across Heavy Infrastructure, Business Equipment / Consumer Products, and Human Health?: Bridges, Water Infrastructure, Railroads, Aircraft, Mining Equipment, Oil Pipelines, Oil Platforms, Steel manufacture, Trucking, Mobile Computers, IT Infrastructure, Home Appliances, Buildings (HVAC, Elevators, Lighting, …), Photocopiers, Refrigeration.
  • 14. Business value requires both Modeling and Process Integration • Many organizations are not used to making data-driven decisions. – Culturally – Process-wise • Mathematical proof of business value not initially compelling • Example: CbM & false positives. • Initial deployment very risky! [Diagram: Modeling & Analytics vs. Process Integration, showing capability & value growth: Models developed & tested; 1. Integration pilot & evaluation; 2. Deploy/scale.]
  • 15. Key points • Physical-meets-Digital is happening • This makes data-driven approaches much more important • But most real problems require integration of very different approaches and data types – Not easy to build these teams • The realities of current culture & process must be addressed early.
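
The statistical (data-driven) translation approach on slide 7 can be illustrated with a minimal sketch. It assumes a tiny, invented parallel corpus and simply returns the translation seen most often for an exact phrase match. The corpus and the names `build_phrase_table` and `translate` are hypothetical, chosen for illustration only; this is not a description of how Google's system works.

```python
from collections import Counter, defaultdict

# Toy parallel corpus (hypothetical data, invented for illustration).
PARALLEL_CORPUS = [
    ("good morning", "buenos días"),
    ("good morning", "buenos días"),
    ("good morning", "buen día"),
    ("thank you", "gracias"),
    ("thank you very much", "muchas gracias"),
]


def build_phrase_table(pairs):
    """Gather and classify: count each translation seen for each source phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source.lower()][target] += 1
    return table


def translate(query, table):
    """Match the query against known phrases and return the most frequent
    prior translation, or None if the phrase has never been translated."""
    candidates = table.get(query.lower())
    if not candidates:
        return None  # scope is limited to what has been translated before
    return candidates.most_common(1)[0][0]


table = build_phrase_table(PARALLEL_CORPUS)
print(translate("Good morning", table))
print(translate("good evening", table))
```

The second lookup returns nothing because "good evening" never appears in the corpus, which mirrors the slide's point that this approach is limited to documents that have been translated before, and that its quality is bounded by the human translations it counts.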