Grab some coffee and enjoy 
the pre-show banter before 
the top of the hour!
“How Can Analytics Improve Business?”
TechWise Webcast | July 23, 2014
Guests
Host: Eric Kavanagh, CEO, The Bloor Group
Dr. Kirk Borne, Data Scientist, George Mason University
Dr. Robin Bloor, Chief Analyst, The Bloor Group
PLUS:
Will Gorman, Chief Architect, Pentaho
Steve Wilkes, CTO, WebAction
Frank Sanders, Technical Director, MarkLogic
Hannah Smalltree, Director, Treasure Data
Analytics Can Help a Business: 
• Streamline operations 
• Improve marketing 
• Raise revenue 
• Identify opportunities 
• Assess plans 
+ 
Executive Summary
Dr. Kirk Borne 
Data Scientist, George Mason University 
+
Big Data Analytics for 
Data-to-Decisions Support 
Kirk Borne 
George Mason University, Fairfax, VA ● www.kirkborne.net @KirkDBorne
Extracting Knowledge, Insights, and Data-to-Decisions (D2D) from Big Data is hard!
The D2D Challenge**
1. Characterize and Contextualize first.
2. Collect and Curate each entity’s features.
…then Come to the data-driven decision!
[Figure axes: time, flux]
• Data-to-Discoveries 
• Data-to-Decisions 
• Data-to-Dollars
Characterization & Contextualization
Feature & Context Detection and Extraction: 
• Identify and characterize features in the data: 
– Machine-generated 
– Human-generated 
– Crowdsourced? (= Tapping the Power of Human Cognition 
to find patterns and anomalies in massive data!) 
• Extract the context of the data: the source, the channel, 
the data user, the use cases, the value, the re-uses … 
where, when, who, how, what, why = Metadata! 
• Curate these features for search, re-use, and D2D! 
• Find other parameters and features from other data
sources and databases – integrate all information to
help characterize & contextualize (and ultimately make
a decision regarding) each new event.
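The characterize, contextualize, and integrate steps above can be pictured as building up one record per event. The sketch below is illustrative only: the `Event` class, the helper functions, the field names, and the store lookup table are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """Hypothetical container for one data event."""
    payload: dict
    features: dict = field(default_factory=dict)  # characterization
    context: dict = field(default_factory=dict)   # metadata: where, when, who, how, what, why


def characterize(event):
    """Derive simple machine-generated features from the raw payload."""
    numeric = [v for v in event.payload.values() if isinstance(v, (int, float))]
    event.features["n_numeric_fields"] = len(numeric)
    event.features["max_value"] = max(numeric) if numeric else None
    return event


def contextualize(event, source, channel, when):
    """Attach context metadata describing where and when the data arose."""
    event.context.update(
        {"source": source, "channel": channel, "when": when.isoformat()}
    )
    return event


def integrate(event, reference_data, key):
    """Merge features from another data source, matched on a payload key."""
    extra = reference_data.get(event.payload.get(key), {})
    event.features.update(extra)
    return event


# Example: a point-of-sale event enriched from a made-up store lookup table.
stores = {"A12": {"region": "east"}}
event = Event(payload={"amount": 120.5, "items": 3, "store": "A12"})
event = characterize(event)
event = contextualize(event, "pos", "in-store",
                      datetime(2014, 7, 23, tzinfo=timezone.utc))
event = integrate(event, stores, "store")
```

Keeping features and context in separate dictionaries mirrors the slide's distinction between what the data contains and the metadata about how it was produced.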
Characterization via Tagging & Annotation 
• Report entity’s features & characteristics back to the 
database for search, retrieval, sharing, and reuse 
• Individual (or groups of) entities (objects and/or 
events) are tagged and annotated ... 
– with new knowledge discovered 
– with related data/information of any kind 
– with common knowledge about those things 
– with inter-relationships between entities and their properties 
– with concepts 
– with context 
– i.e., assertions (e.g., classifications, interpretations, quality 
flags, relationships, references, common knowledge, 
learned knowledge, inter-connectivity with other entities) 
– with data collection parameters 
– with sensor channel descriptors 
Semantics! 
Data integration 
Provenance 
(for data curation)
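As a rough illustration of reporting tags and annotations back to a database for search and reuse, the sketch below keeps assertions in memory and indexes them by tag. `AnnotationStore` and its methods are hypothetical names invented for this example; a real system would use a persistent database.

```python
from collections import defaultdict


class AnnotationStore:
    """Hypothetical in-memory store of assertions about entities."""

    def __init__(self):
        self._by_entity = defaultdict(list)  # entity id -> list of assertions
        self._by_tag = defaultdict(set)      # (kind, value) -> entity ids

    def annotate(self, entity_id, kind, value, provenance):
        """Record one assertion and where it came from (its provenance)."""
        assertion = {"kind": kind, "value": value, "provenance": provenance}
        self._by_entity[entity_id].append(assertion)
        self._by_tag[(kind, value)].add(entity_id)
        return assertion

    def find(self, kind, value):
        """Return the ids of all entities carrying a given tag, sorted."""
        return sorted(self._by_tag.get((kind, value), set()))

    def describe(self, entity_id):
        """Return every assertion recorded for one entity."""
        return list(self._by_entity.get(entity_id, []))


# Example: tags supplied by a model and by a human analyst.
store = AnnotationStore()
store.annotate("evt-1", "classification", "outlier", provenance="ml-model")
store.annotate("evt-2", "classification", "outlier", provenance="analyst")
store.annotate("evt-1", "quality_flag", "verified", provenance="analyst")
```

Storing provenance with every assertion is what makes later curation possible: a reader of the tag can tell whether it was measured, inferred, or provided by a person.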
Then what?
Get down to business with the Curated 
Collection of Characterizations and 
Contextualizations: 
• Data Analytics: 
– Outlier / Anomaly / Novelty / Surprise detection 
– Clustering (= New Class discovery) 
– Correlation & Association discovery 
• D2D: 
– Data-to-Discoveries 
– Data-to-Decisions 
– Data-to-Dollars
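Two of the analytics tasks listed above, outlier detection and correlation discovery, can be sketched with the Python standard library. This is a minimal example under stated assumptions: the z-score threshold of 2.0 and the sample numbers are arbitrary choices for illustration, and the talk does not prescribe any particular method. Clustering is omitted to keep the sketch short.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return indices of values more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Example with made-up daily order counts; the last value is the surprise.
orders = [10, 11, 9, 10, 12, 10, 11, 50]
surprises = zscore_outliers(orders)
```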
The Business Analyst-in-the-Loop
Tags, annotations, features, and context:
– These can be …
• measured (by observation), or
• inferred through machine learning, or
• provided by human analysts, or
• all 3 of these processes simultaneously.
– The resulting synergy yields:
• improved training sets, more accurate predictive models, fewer false positives/negatives, active learning, efficient human interventions
– Combining machine learning on Big Data with the power of human cognition for discovery (e.g., using Data Visualization, Visual Analytics, Immersive Data Environments, or Crowdsourcing) therefore augments and accelerates discovery, insights, and D2D.
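One common way to put an analyst in the loop is to let the model decide the confident cases and send only the uncertain ones to a person, whose answers then become new training examples. The sketch below assumes a model that outputs a probability per item; the function names and the 0.3 and 0.7 cut-offs are hypothetical and not taken from the talk.

```python
def route_predictions(scored_items, low=0.3, high=0.7):
    """Split (item, probability) pairs into automatic decisions and a review queue.

    Items whose probability falls strictly between `low` and `high` are
    considered uncertain and are queued for a human analyst.
    """
    auto, review = [], []
    for item, prob in scored_items:
        if low < prob < high:
            review.append(item)
        else:
            auto.append((item, prob >= high))
    return auto, review


def incorporate_feedback(training_set, review_queue, ask_analyst):
    """Append analyst-provided labels for the uncertain items to the training set."""
    for item in review_queue:
        training_set.append((item, ask_analyst(item)))
    return training_set


# Example: three scored items; only the middle one needs a human decision.
auto, review = route_predictions([("a", 0.95), ("b", 0.5), ("c", 0.05)])
training_set = incorporate_feedback([], review, ask_analyst=lambda item: True)
```

Spending analyst time only on uncertain cases is one way to get the improved training sets and more efficient human interventions the slide mentions.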
Dr. Robin Bloor 
Chief Analyst, The Bloor Group 
+
The Data Scientist & The Business Analyst
Robin Bloor
The Data Analysis Budget 
u Data Analysis is 
Business R&D 
u The focus is on 
business process 
u The outcome of successful 
R&D is a changed process 
u Think of manufacturing for 
a useful example
Big Data Architecture
What is a Data Scientist? 
u Project manager 
u Qualified statistician 
u Domain Business expert 
u Experienced data 
architect 
u Software engineer 
(IT’S A TEAM)
The Impact of Machine Learning 
Machine learning is changing the process 
(for the BUSINESS ANALYST & the DATA SCIENTIST) 
BUT the analytics team needs to understand IT!!
Take Note! 
You can know more about a business from its data than by any other means
There are Two Issues for the Business 
Can you get the TECHNOLOGY right?
&
Can you get the PEOPLE right?
+ 
Will Gorman 
Chief Architect, Pentaho
© 2014, Pentaho. All Rights Reserved. pentaho.com. Worldwide 24 +1 (866) 660-7555 
July 2014 
Pentaho Business Analytics
Architected for the Future of Analytics
Will Gorman, Chief Architect
WHAT WE DO 
We enable the modern, big data-driven business 
Modern, cohesive data integration and business analytics platform 
• Full spectrum of advanced analytics for all key roles 
• Embeddable, cloud-ready analytics 
• Big data blending for analytics in real-time environments 
• Broadest and deepest big data integration 
Innovation through open source 
• Open, pluggable, purpose-built for the future 
• Early sustained leadership in big data 
ecosystem with technology innovation 
Critical mass achieved 
• Over 1,500 commercial customers 
• Over 10,000 production deployments 
Pentaho 5.1 Architected for the Future 
Simplified analytics @ scale for all users 
Evolving Big Data Architectures
[Diagram labels: Customer, Provisioning, Billing, Other; Existing ETL Tool or PDI; EDW; Data Marts; Analytics; BI Tools]
Evolving Big Data Architectures
[Diagram labels: Customer, Provisioning, Billing, Other; Network, Location, Web, Social Media; Existing ETL Tool or PDI; Existing Process or PDI; Hadoop Cluster; NoSQL; Analytic DB; PDI Just-in-Time Integration; EDW; Data Marts; Analytics; BI Tools]
The strength of Pentaho lies in the power of combination
Data integration + Business analytics
Big data + Any data
The IT department + Lines of business
Any data. Any environment. Any analytics.
Thank You 
JOIN THE CONVERSATION. YOU CAN FIND US ON: 
blog.pentaho.com 
@Pentaho 
Facebook.com/Pentaho 
Pentaho Business Analytics
Steve Wilkes 
CTO, WebAction 
+
The Future of Data Driven Apps 
July 2014
WebAction® delivers the leading 
Real-time App Platform 
enabling the next generation of 
Data Driven Apps 
for the Agile Enterprise
[Diagram, Batch / Reactive: Acquire → Store → Process. Inputs: Structured Data, Machine Data, Click Stream, Location. Also shown: RDBMS, EDW, BI / Analytics.]
REALTIME BARRIER
[Diagram, Real-time / Proactive: Acquire → Process in Memory → Store. Inputs: Structured Data, Machine Data, Click Stream, Location. Also shown: Data Driven Apps, RDBMS, Hadoop.]
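The slide contrasts storing data before processing it with processing it in memory before it is stored. A minimal sketch of the second ordering is below: each event updates an in-memory sliding window and can trigger an alert before it is written to storage. Everything here is an assumption made for illustration (the class and function names, the list standing in for a database, the threshold, and the requirement that events arrive in timestamp order); it is not WebAction's platform or API.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen within the last `window_seconds` (in memory only)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp):
        """Register an event and return how many events are in the window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


def process_stream(events, window_seconds, alert_threshold, store):
    """Acquire -> process in memory -> store, for (timestamp, payload) events."""
    counter = SlidingWindowCounter(window_seconds)
    alerts = []
    for timestamp, payload in events:
        if counter.add(timestamp) >= alert_threshold:
            alerts.append(timestamp)       # react before the data is stored
        store.append((timestamp, payload))  # persist afterwards
    return alerts


# Example: three clicks within 5 seconds trigger one alert.
stored = []
clicks = [(0, "click"), (1, "click"), (2, "click"), (10, "click"), (11, "click")]
alerts = process_stream(clicks, window_seconds=5, alert_threshold=3, store=stored)
```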
[Architecture diagram labels: High Speed Data Acquisition; Distributed DIM Processor; Distributed WAction Cache; Metadata; WActionStore; Transaction Data; Social Feeds; Device Data; Industry Data; System/IT Data; Enterprise Data Warehouse; Enterprise Applications; RDBMS; Big Data Infrastructure; Tungsten Visualization; Data Driven Apps]
• Security Event Processing
• Cloud Application Control
• Risk & Fraud Alerting
• Quality of Service Management
• Consumer Analytics
• DataCenter Management
Frank Sanders 
Technical Director, MarkLogic 
+
Data Centered Approach is More Flexible 
PDF 
SLIDE: 38 © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. 
Universal Index Powers Search & Analytics
[Diagram: a sample <SAR> document and the index types built from it.]
Index types shown: Unstructured full-text, Values, Geospatial, Triples.
Sample document fields: <title> Suspicious vehicle…; <date> 2012-11-12Z; <type>; <threat>; <category>; <description> A blue van…; <location> with <lat> 37.497075 and <long> -122.363319; <triple> elements with <subject>, <predicate>, and <object>.
Other values shown: suspicious activity, suspicious vehicle, observation/surveillance, IRIID, isa, value, license-plate, ABC 123.
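To make the idea of indexing one document several ways concrete, the sketch below parses a small XML record and builds four toy indexes: full-text, element values, geospatial points, and triples. It is a conceptual illustration only and does not use or represent MarkLogic's actual index or API. The sample document is loosely modeled on the slide; its nesting, and which value belongs to which element, are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed document shape, loosely based on the labels visible on the slide.
DOC = """<SAR>
  <title>Suspicious vehicle…</title>
  <date>2012-11-12Z</date>
  <type>observation/surveillance</type>
  <description>A blue van…</description>
  <location><lat>37.497075</lat><long>-122.363319</long></location>
  <triple>
    <subject>IRIID</subject><predicate>isa</predicate><object>license-plate</object>
  </triple>
</SAR>"""


def build_indexes(doc_id, xml_text):
    """Build toy full-text, value, geospatial, and triple indexes for one document."""
    root = ET.fromstring(xml_text)
    words = defaultdict(set)   # word -> doc ids
    values = defaultdict(set)  # (element name, text) -> doc ids
    points = []                # (lat, long, doc id)
    triples = []               # (subject, predicate, object, doc id)

    for elem in root.iter():
        text = (elem.text or "").strip()
        if text:
            values[(elem.tag, text)].add(doc_id)
            for word in text.lower().split():
                words[word].add(doc_id)

    for loc in root.iter("location"):
        points.append((float(loc.findtext("lat")), float(loc.findtext("long")), doc_id))

    for t in root.iter("triple"):
        triples.append((t.findtext("subject"), t.findtext("predicate"),
                        t.findtext("object"), doc_id))

    return words, values, points, triples


words, values, points, triples = build_indexes("doc-1", DOC)
```

With all four structures keyed back to the same document id, a single query can combine a word match, a date value, a location, and a relationship.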
Fairfax County Police Events Application 
OECD Better Life Index 
MarkMail: Search-powered Visualization 
Hannah Smalltree 
Director, Treasure Data 
+
The Treasure Data Cloud Service 
Collect
• Stream: Logs/Events in Real-time
• Bulk Import: from Most Sources
Store
• Cloud Storage: Managed, Monitored, Scalable, Secure
• Web Mgmt. Console: View/query data, Access controls
Analyze
• Query with SQL: Multiple Query Engines, Ad Hoc
• BI Tool Connectivity: Tableau, Most BI/Viz/Analytics Tools
• Export: Query Results or Datasets, Anytime
Copyright ©2014 Treasure Data. All Rights Reserved.
Cloud Managed Service (SaaS) || <2 Week Setup || Flat monthly rate
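The collect, store, and query-with-SQL flow can be tried locally on a small scale. The sketch below uses Python's built-in `sqlite3` module as a stand-in; it is not Treasure Data's service or client API, and the table name, columns, and sample events are invented for illustration.

```python
import sqlite3


def load_events(conn, events):
    """Collect + store: bulk-insert (ts, user_id, action) event tuples."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (ts INTEGER, user_id TEXT, action TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)
    conn.commit()


def actions_per_user(conn):
    """Analyze: an ad hoc SQL aggregate over the stored events."""
    cursor = conn.execute(
        "SELECT user_id, COUNT(*) FROM events GROUP BY user_id ORDER BY user_id"
    )
    return cursor.fetchall()


# Example with made-up clickstream events in an in-memory database.
conn = sqlite3.connect(":memory:")
load_events(conn, [(1, "u1", "view"), (2, "u1", "buy"), (3, "u2", "view")])
summary = actions_per_user(conn)
```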
Specializing in Streaming “BIG” Data 
Volume, Velocity, Variety
Examples: Clickstream, Web Access Logs, Mobile Data, App Logs, Event Logs, Sensors, Machine Data…
Big Data Analytics Use Cases 
Use Case | Key Data Sources | Results | Treasure Example
Website & Mobile App Behavior Analytics | Mobile App Clicks, Web Clickstream + eComm, POS | Increase sales and retail foot traffic within weeks | $216B Global Retailer
Mobile Application Analytics | Mobile Application Logs | Increase Engagement (=Sales) by Iterating Quickly | Video Games
Product Behavior & Sensor Analytics | Sensor Data | Improved Product Development, New Product/Service Development |
Treasure Data In Your Analytics Environment 
[Diagram: Collect → Store → Analyze. Your Server, Device, Gateway, etc. → Streaming → Treasure Data Service → SQL, Aggregates, Export/Integrate → Your BI, Visualization, Adv. Analytics; Your Data Mart, Data Warehouse, DBMS, etc.]
Resources
TreasureData.com
Datasheets, Case Studies, Whitepapers
TDWI, 451, Analyst Whitepapers
Gartner Report: Cool Vendors in Big Data
Try the Starter Service For Free
TreasureData.com/TryItNow
+ 
Questions? 
#TechWise or USE THE Q&A
+ 
THANK YOU!
FIND THE ARCHIVE AT 
InsideAnalysis.com & Techopedia.com

How Can Analytics Improve Business?

  • 1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
  • 2. “How Can Analy,cs Improve Business?” TechWise Webcast | July 23, 2014
  • 3. + Guests Host: Eric Kavanagh CEO, The Bloor Group Dr. Kirk Borne Data Scientist, George Mason University Dr. Robin Bloor Chief Analyst, The Bloor Group PLUS: Will Gorman Chief Architect, Pentaho Steve Wilkes CTO, WebAction Frank Sanders Technical Director, MarkLogic Hannah Smalltree Director, Treasure Data
  • 4. Analytics Can Help a Business: • Streamline operations • Improve marketing • Raise revenue • Identify opportunities • Assess plans + Executive Summary
  • 5. Dr. Kirk Borne Data Scientist, George Mason University +
  • 6. Big Data Analytics for Data-to-Decisions Support Kirk Borne George Mason University, Fairfax, VA ● www.kirkborne.net @KirkDBorne
  • 7. Extrac,ng Knowledge, Insights, and Data-­‐to-­‐Decisions (D2D) from Big Data is hard!
  • 8. The D2D Challenge** 1. Characterize and !me flux Contextualize first. 2. Collect and Curate each entity’s features. …then Come to the data-driven decision! • Data-to-Discoveries • Data-to-Decisions • Data-to-Dollars
  • 9. Characteriza,on & Contextualiza,on Feature & Context Detection and Extraction: • Identify and characterize features in the data: – Machine-generated – Human-generated – Crowdsourced? (= Tapping the Power of Human Cognition to find patterns and anomalies in massive data!) • Extract the context of the data: the source, the channel, the data user, the use cases, the value, the re-uses … where, when, who, how, what, why = Metadata! • Curate these features for search, re-use, and D2D! • Find other parameters and features from other data sources and databases – integrate all information to help characterize & contextualize (and ultimately make decision regarding) each new event.
  • 10. Characterization via Tagging & Annotation • Report entity’s features & characteristics back to the database for search, retrieval, sharing, and reuse • Individual (or groups of) entities (objects and/or events) are tagged and annotated ... – with new knowledge discovered – with related data/information of any kind – with common knowledge about those things – with inter-relationships between entities and their properties – with concepts – with context – i.e., assertions (e.g., classifications, interpretations, quality flags, relationships, references, common knowledge, learned knowledge, inter-connectivity with other entities) – with data collection parameters – with sensor channel descriptors Semantics! Data integration Provenance (for data curation)
  • 11. Characterization & Contextualization Feature & Context Detection and Extraction: • Identify and characterize features in the data: – Machine-generated – Human-generated – Crowdsourced? (= Tapping the Power of Human Cognition to find patterns and anomalies in massive data!) • Extract the context of the data: the source, the channel, the data user, the use cases, the value, the re-uses … where, when, who, how, what, why = Metadata! • Curate these features for search, re-use, and D2D! • Find other parameters and features from other data sources and databases – integrate all information to help characterize & contextualize (and ultimately make a decision regarding) each new event.
  • 13. Then what? Get down to business with the Curated Collection of Characterizations and Contextualizations: • Data Analytics: – Outlier / Anomaly / Novelty / Surprise detection – Clustering (= New Class discovery) – Correlation & Association discovery • D2D: – Data-to-Discoveries – Data-to-Decisions – Data-to-Dollars
  • 14. The Business Analyst-in-the-Loop Tags, annotations, features, and context – These can be … • measured (by observation), or • inferred through machine learning, or • provided by human analysts, or all 3 of these processes simultaneously. – The resulting synergy yields: • improved training sets, more accurate predictive models, fewer false positives/negatives, active learning, efficient human interventions – Combining machine learning on Big Data with the power of human cognition for discovery (e.g., using Data Visualization, Visual Analytics, Immersive Data Environments, or Crowdsourcing) therefore augments and accelerates discovery, insights, and D2D.
  • 15. Dr. Robin Bloor, Chief Analyst, The Bloor Group
  • 16. The Data Scientist & The Business Analyst Robin Bloor
  • 17. The Data Analysis Budget • Data Analysis is Business R&D • The focus is on business process • The outcome of successful R&D is a changed process • Think of manufacturing for a useful example
  • 19. What is a Data Scientist? • Project manager • Qualified statistician • Domain Business expert • Experienced data architect • Software engineer (IT’S A TEAM)
  • 20. The Impact of Machine Learning Machine learning is changing the process (for the BUSINESS ANALYST & the DATA SCIENTIST) BUT the analytics team needs to understand IT!!
  • 21. Take Note! You can know more about a business from its data than by any other means
  • 22. There are Two Issues for the Business: Can you get the TECHNOLOGY right? & Can you get the PEOPLE right?
  • 23. Will Gorman, Chief Architect, Pentaho
  • 24. Pentaho Business Analytics: Architected for the Future of Analytics. Will Gorman, Chief Architect. July 2014. © 2014, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
  • 25. WHAT WE DO We enable the modern, big data-driven business Modern, cohesive data integration and business analytics platform • Full spectrum of advanced analytics for all key roles • Embeddable, cloud-ready analytics • Big data blending for analytics in real-time environments • Broadest and deepest big data integration Innovation through open source • Open, pluggable, purpose-built for the future • Early sustained leadership in big data ecosystem with technology innovation Critical mass achieved • Over 1,500 commercial customers • Over 10,000 production deployments
  • 26. Pentaho 5.1 Architected for the Future Simplified analytics @ scale for all users
  • 27. Evolving Big Data Architectures [Diagram labels: Customer, Provisioning, Billing; Existing ETL Tool or PDI; EDW; Data Marts; Analytics; Other BI Tools]
  • 28. Evolving Big Data Architectures [Diagram labels: Customer, Provisioning, Billing, Network, Location, Web, Social Media; Existing ETL Tool or PDI; Existing Process or PDI; PDI Just-in-Time Integration; Hadoop Cluster; NoSQL; Analytic DB; EDW; Data Marts; Analytics; Other BI Tools]
  • 29. The strength of Pentaho lies in the power of combination • Data integration + Business analytics • Big data + Any data • The IT department + Lines of business. Any data. Any environment. Any analytics.
  • 30. Thank You JOIN THE CONVERSATION. YOU CAN FIND US ON: blog.pentaho.com • @Pentaho • Facebook.com/Pentaho • Pentaho Business Analytics
  • 31. Steve Wilkes, CTO, WebAction
  • 32. The Future of Data Driven Apps July 2014
  • 33. WebAction® delivers the leading Real-time App Platform enabling the next generation of Data Driven Apps for the Agile Enterprise
  • 34. [Diagram: two pipelines separated by the REALTIME BARRIER. Batch / Reactive: Acquire, Store, Process, with Structured Data, Machine Data, Click, Location, and Stream sources feeding RDBMS, EDW, and BI / Analytics. Real-time / Proactive: Acquire, Process in Memory, Store, with the same sources feeding Data Driven Apps, RDBMS, and Hadoop.]
  • 35. Distributed DIM Processor Distributed WAction Cache Metadata High Speed Data Acquisition WActionStore Transaction Data Social Feeds Tungsten Device Data Visualization RDBMS Big Data Infrastructure Industry Data Enterprise Applications Enterprise Data Warehouse Data Driven Apps System/ IT Data
  • 36. Security Event Processing • Cloud Application Control • Risk & Fraud Alerting • Quality of Service Management • Consumer Analytics • DataCenter Management
  • 37. Frank Sanders, Technical Director, MarkLogic
  • 38. Data Centered Approach is More Flexible PDF SLIDE: 38 © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Slide 38 Copyright © 2010 MarkLogic® Corporation. All 2011 rights reserved.
  • 39. Universal Index Powers Search & Analytics [Diagram: a sample <SAR> XML document. Elements shown: <title> Suspicious vehicle…; <date> 2012-11-12Z; <type>; <threat> suspicious activity; <category> suspicious vehicle; <description> A blue van…; <location> with <lat> 37.497075 and <long> -122.363319; <triple> elements with <subject>, <predicate>, <object>. Values shown: observation/surveillance, IRIID, isa, license-plate, ABC 123. Index callouts: Unstructured full-text, Geospatial, Values.]
  • 40. Fairfax County Police Events Application
  • 41. OECD Better Life Index
  • 42. MarkMail: Search-powered Visualization
  • 43. Hannah Smalltree, Director, Treasure Data
  • 44. The Treasure Data Cloud Service • Collect: Stream Logs/Events in Real-time; Bulk Import from Most Sources • Store: Cloud Storage (Managed, Monitored, Scalable, Secure); Web Mgmt. Console (View/query data, Access controls) • Analyze: Query with SQL (Multiple Query Engines, Ad Hoc) • BI Tool Connectivity: Tableau, Most BI/Viz/Analytics Tools • Export: Query Results or Datasets, Anytime. Cloud Managed Service (SaaS) || <2 Week Setup || Flat monthly rate. Copyright ©2014 Treasure Data. All Rights Reserved.
  • 45. Specializing in Streaming “BIG” Data: Volume, Velocity, Variety. Examples: Clickstream, Web Access Logs, Mobile Data, App Logs, Event Logs, Sensors, Machine Data…
  • 46. Big Data Analytics Use Cases (columns: Use Case | Key Data Sources | Results) • Website & Mobile App Behavior Analytics | Mobile App Clicks, Web Clickstream + eComm, POS | Increase sales and retail foot traffic within weeks • Mobile Application Analytics | Mobile Application Logs | Increase Engagement (=Sales) by Iterating Quickly • Product Behavior & Sensor Analytics | Sensor Data | Improved Product Development, New Product/Service Development. Treasure Examples shown: $216B Global Retailer; Video Games
  • 47. Treasure Data In Your Analytics Environment • Collect: Your Server, Device, Gateway, etc. (Streaming) • Store: Treasure Data Service • Analyze: SQL; Your BI, Visualization, Adv. Analytics; Your Data Mart, Data Warehouse, DBMS, etc. (Aggregates, Export/Integrate)
  • 48. Resources: TreasureData.com • Datasheets, Case Studies, Whitepapers • TDWI, 451, Analyst Whitepapers • Gartner Report: Cool Vendors in Big Data • Try the Starter Service For Free: TreasureData.com/TryItNow
  • 49. Questions? #TechWise or USE THE Q&A
  • 50. THANK YOU! FIND THE ARCHIVE AT InsideAnalysis.com & Techopedia.com
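
Slide 13 lists outlier / anomaly detection as the first analytics step to run on the curated collection. The deck does not say which method the speaker has in mind, so the following is only a minimal sketch of one common approach (flagging values by z-score). The function name `find_outliers`, the threshold of 2.0, and the `daily_orders` figures are all hypothetical and chosen for illustration.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold.

    Assumption: a simple z-score rule stands in for the slide's
    "Outlier / Anomaly / Novelty / Surprise detection"; the deck does not
    name a specific algorithm. Requires at least two values.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical daily order counts with one unusual day.
daily_orders = [102, 98, 105, 97, 101, 99, 103, 250, 100, 104]
print(find_outliers(daily_orders))  # prints [(7, 250)]
```

A flagged day would then go to a business analyst for tagging and annotation, in the spirit of the analyst-in-the-loop idea on slide 14. Note that a single extreme value inflates the standard deviation, so a more robust statistic may be preferable on real data.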