From Business Intelligence to Big Data:
The Evolution of Business Analytics
@hackreduce – Dec. 3, 2014
Adam Ferrari
@AJFerrari
(All opinions expressed are my own / I’m not here representing my employers)
Adams-MacBook-Pro:~ ferrari$ whoami
ferrari
2012- 2014-
BA ‘91 MS ‘94 PhD CS ’98
2000-2012
(CTO 2005-2012)
Endeca did this…
Which led us into this…
This talk
What did I learn as CTO of a BI product company as we
jumped into the BI market mid-stream, and then later as we
were acquired by one of the biggest “traditional BI” vendors?
Most Importantly:
Stay focused on real business value, not technology.
Note: My context is very “product provider” oriented, but I believe the lessons
are equally interesting to “product consumers” – after all, we’re all interested
in where the toolset is going and why
A note about scope
Analytics is a highly overloaded term
The vast majority of my experience, and the focus of this talk, is
around “BI-style” analytics, i.e.,
Delivering historical and aggregate views of data (e.g.,
charts, reports, dashboards, etc.) to business decision makers
There are many other important forms of “analytics”
E.g., Data mining, statistics, data science, etc.
These are very important and complementary,
but not in my scope here
Part 1 (of 3)
Some Ancient History
(or, a bunch of important stuff that happened before my time)
In the beginning…
…there was the cube
(well, there was a bunch of stuff before that – Hans Peter Luhn coins the term Business
Intelligence in 1958, Edgar Codd invents the relational data model in 1970, etc…
but we’ll start with the beginning of modern Business Intelligence, which is OLAP)
Image source: oracle.com
Research sponsored by Arbor Software in 1993
defined the “12 Rules for OLAP Products”
Rule #1 – “Multidimensional Conceptual View”
OLAP = Multidimensional Analysis
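To make "multidimensional analysis" concrete, here is a minimal sketch in Python. It is not how any of the products named in this deck work internally. The fact rows, the dimension names (region, product, quarter), and the helpers `rollup` and `slice_` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (region, product, quarter)
# and one measure (sales). All values are invented for this sketch.
FACTS = [
    {"region": "East", "product": "Widget", "quarter": "Q1", "sales": 100},
    {"region": "East", "product": "Gadget", "quarter": "Q1", "sales": 50},
    {"region": "West", "product": "Widget", "quarter": "Q1", "sales": 70},
    {"region": "West", "product": "Widget", "quarter": "Q2", "sales": 30},
]

def rollup(facts, dims, measure="sales"):
    """Sum the measure over every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)

def slice_(facts, **fixed):
    """Keep only the rows where the named dimensions have the given values."""
    return [r for r in facts if all(r[d] == v for d, v in fixed.items())]

# The same facts viewed along different dimensions.
by_region = rollup(FACTS, ["region"])
by_region_quarter = rollup(FACTS, ["region", "quarter"])
widgets_by_region = rollup(slice_(FACTS, product="Widget"), ["region"])
```

The point is only that one set of facts can be summarized along any combination of dimensions. Real OLAP engines add storage, indexing, and pre-aggregation on top of this idea.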
Notable “traditional”
OLAP Products
• Express
(IRI - Oracle)
• Essbase
(Arbor - Hyperion - Oracle)
• Microsoft Analysis Services
(Panorama - MS)
Image source: microsoft.com
[1995]
Notable “traditional”
ROLAP Products
• MicroStrategy
• Business Objects
(Business Objects - SAP)
• Cognos
(Cognos - IBM)
• OBIEE
(nQuire - Siebel - Oracle)
• Actuate, Birst, Pentaho
etc…
Image source: microstrategy.com
ROLAP Modeling
• Manage mapping between
physical data stores, “logical
view” (core dimensional model),
and “business view”
• Definition of metrics,
dimensions
• Management of pre-
computed aggregates
Image source: rittmanmead.com
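A rough sketch of the mapping idea above, in Python: a small model ties business-view names to physical columns, and a query is generated from it. The table name, column names, and the `build_query` helper are hypothetical and do not reflect any vendor's metadata format.

```python
# Hypothetical model: business-view names mapped to physical columns
# and aggregate expressions on a single fact table.
MODEL = {
    "table": "sales_fact",
    "dimensions": {"Region": "region_cd", "Quarter": "fiscal_qtr"},
    "metrics": {
        "Revenue": "SUM(amount_usd)",
        "Orders": "COUNT(DISTINCT order_id)",
    },
}

def build_query(model, dimensions, metrics):
    """Translate business-view names into SQL against the physical table."""
    dim_cols = [model["dimensions"][d] for d in dimensions]
    select = [f'{col} AS "{name}"' for name, col in zip(dimensions, dim_cols)]
    select += [f'{model["metrics"][m]} AS "{m}"' for m in metrics]
    sql = f'SELECT {", ".join(select)} FROM {model["table"]}'
    if dim_cols:
        sql += f' GROUP BY {", ".join(dim_cols)}'
    return sql

query = build_query(MODEL, ["Region"], ["Revenue"])
```

Because metric definitions live in one place, every report that asks for "Revenue" gets the same expression. That is the re-use and governance benefit that ROLAP modeling aims for.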
Data Warehousing: go big or go home
HW
• Teradata
• Netezza (IBM)
• Oracle Exadata
SW – Traditional DBMS
• Oracle
• MS SQL Server
• IBM DB2
SW – Analytical DBMS
• Vertica (HP)
• ParAccel (/ Redshift)
• SAP HANA
Image source: teradata.com
ETL - Extract/Transform/Load
Image source: informatica.com
Notable ETL Products
• Informatica PowerCenter
• Ascential DataStage (IBM)
• Ab Initio
• … numerous others
• Capture History
• Manage dimensions
– E.g., what happens if a
customer moves?
“slowly changing dimensions”
• Pre-compute aggregates
• Serve as the versionable
managed record of how the
dimensional model of the
warehouse is derived from
the raw data
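The "customer moves" question can be sketched in a few lines of Python. This assumes one common approach, often called a Type 2 slowly changing dimension, in which the old row is expired and a new version is appended. The row layout and the helpers `apply_move` and `city_as_of` are illustrative only and not taken from any ETL product.

```python
from datetime import date

def apply_move(dimension_rows, customer_id, new_city, effective):
    """Type 2 style update: expire the current row, then append a new version."""
    for row in dimension_rows:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            if row["city"] == new_city:
                return dimension_rows  # no change, keep the current version
            row["valid_to"] = effective
    dimension_rows.append(
        {"customer_id": customer_id, "city": new_city,
         "valid_from": effective, "valid_to": None}
    )
    return dimension_rows

def city_as_of(dimension_rows, customer_id, when):
    """Return the city that was in effect for the customer on a given date."""
    for row in dimension_rows:
        if row["customer_id"] != customer_id:
            continue
        still_open = row["valid_to"] is None or when < row["valid_to"]
        if row["valid_from"] <= when and still_open:
            return row["city"]
    return None

# Hypothetical customer dimension with a single current row.
customers = [
    {"customer_id": 1, "city": "Boston",
     "valid_from": date(2010, 1, 1), "valid_to": None},
]
apply_move(customers, 1, "Austin", date(2013, 6, 1))
```

Keeping both versions means that sales from before the move still roll up under the old city. This is the "capture history" job the warehouse takes on.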
Best Practices
[First Edition, 1992]
Image source: wiley.com
[Founded, 1995]
[First Edition, 1996]
“Business Analytics” 1.0 Architecture
Image source: ibm.com
Business Analytics 1.0 - Pros & Cons
• Governance, re-use, and quality
– “One Version of the Truth” – correct, agreed upon, reusable definitions of core
business metrics and dimensions
But…
• Poor Agility – development process requires:
– Creating or modifying a dimensional model
– Creating ETL to populate the new model
– Creating report or dashboard content on top of the model
– Iterating to make the model perform
• Lack of self-service for end users
• Historically, poor user experience for end consumers
• Cost and Complexity – large, complex stack of
components, code, and configuration to manage, scale,
troubleshoot, etc.
Part 2 (of 3)
Some Recent History
(or, where I joined the story already in progress)
Data Discovery & Visualization
Key Features
• Visual data presentation
• Interactive data exploration –
“facets,” “lassos,” etc.
• Simplified stack – DBMS and Server optional
• Self-service: data loading & content creation,
no dimensional modeling
Notable products:
• QlikView (QlikTech)
• Tableau
• Spotfire (TIBCO)
• Endeca Latitude
(now Oracle Information Discovery)
• EdgeSpring (now Salesforce.com Wave)
• Business Objects Explorer
Image source: tibco.com
Image source: sap.com
Source: http://www.tcsnycmarathon.org/analytics
Image source: community.qlik.com
QlikView configuration example…
Source: http://www.tableausoftware.com/learn/gallery
Image source: vizwiz.blogspot.com
Tableau configuration example…
Data Discovery Lessons
• Improved User Experience, Self-service
But…
• BI is still really hard
– Reading from raw, real-world operational schemas is messy and
complicated
– And the requisite history may not even be available
• The usability benefits of discovery tools come with significant
scalability limitations
• Additional data types – so-called “unstructured” data (logs,
text, etc.) is even harder, as discovery tools (generally) target
structured, tabular data (didn’t address “Big Data”)
And…
• Traditional BI tools are rapidly adding better UX, Visualization,
and Self-service
Part 3 (of 3) (woohoo!)
Future History
(or, stuff that’s still anyone’s guess)
Our analytics ambitions have only grown!
We want BIG, EASY, DEEP analytics
• [BIG] the headline grabber:
More data from more sources, aka: Big Data
• [EASY] the real issue (IMHO):
Faster time to value, at lower cost of ownership
• [DEEP] increasingly important:
Deeper intelligence from data…
not just data, but actions, predictions, etc…
… Can we solve these problems without creating an
ever larger mess of technology and products?
[BIG]: the Hadoop Solution
Posits that what we need is a better, more flexible and
scalable foundation for the Data Warehouse – more like a
“data operating system” than a DBMS
Image source: cloudera.com
[BIG] and [EASY] “On-Hadoop” Solutions
Image source: gigaom.com
Platfora Architecture
Posit that although Hadoop
is indeed a powerful
platform, its complexity
needs to be wrapped in a BI
/ analytics application
Notable Products
• Platfora
• Datameer
• Oracle Big Data Discovery
(based on Endeca)
[BIG+]: The Logical Data Warehouse*
Posits that what is needed is a variety of data stores to constitute the
“Data Warehouse,” along with integration to allow data to be stored
and processed where most appropriate with little or no additional
development effort or operational management overhead
Image source: teradata.com
* From Understanding the Logical Data Warehouse: The Emerging
Practice, 21 June 2012, Mark A. Beyer and Roxane Edjlali
[EASY] The Cloud Solution
• Agility via all of the traditional cloud benefits –
reduced setup, less customization, reduced
ongoing management, etc…
• SaaS-based BI tools, such as
– GoodData
– Domo
• SaaS-based BI applications, such as
– Numerify (IT analytics on ServiceNow, etc.)
– InsightSquared (Sales analytics on Salesforce)
Other notable examples
• [DEEP] and [EASY]: BeyondCore – data discovery
with automatic/algorithmic analysis of attribute
relationships
• [DEEP]: Ayasdi – deeper insight into data based
on novel topological data visualization
• [DEEP]: Alteryx – democratizing more complex
analytical workflows
• [EASY 2.0]: Looker – lightweight BI without
sacrificing modeling, yet avoiding the need for a
warehouse
• [BIG] and [EASY]: Tamr, Trifacta - curating and
wrangling data into usable forms
My guesses about the future?
• I voted with my feet. My beliefs:
– Fast time to real value is of paramount importance
• Zero-friction SaaS applications targeted to specific
business problems are an essential enabler – essential to
amortizing the cost of developing meaningful analytics
and quickly disseminating best practice updates – DIY just
doesn’t cut it any more in many cases.
– Our ability to do basic BI (dashboards, data
discovery, etc.) is mature, and the real action is in
deeper analysis of data
• Yet highly custom data science efforts are at odds with
fast time to value, and hard to advance in many cases
Crisply – quantified work for CRM
(Diagram labels: model & activity; activity; quantified work)
• Algorithmic quantification of the human effort behind each customer,
opportunity, support case, etc.
• Determine the true cost to acquire a specific customer or type of customer,
and understand the true profitability of that customer or segment over time
Thanks!
And stay focused on
the value that analytics creates
(the technology will follow from that)

More Related Content

What's hot

The Present - the History of Business Intelligence
The Present - the History of Business IntelligenceThe Present - the History of Business Intelligence
The Present - the History of Business Intelligence
Phocas Software
 
Blueprint for integrating big data analytics and bi
Blueprint for integrating big data analytics and biBlueprint for integrating big data analytics and bi
Blueprint for integrating big data analytics and biDataWorks Summit
 
Big Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered AccountantBig Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered Accountant
Bharath Rao
 
Data Services and the Modern Data Ecosystem (Middle East)
Data Services and the Modern Data Ecosystem (Middle East)Data Services and the Modern Data Ecosystem (Middle East)
Data Services and the Modern Data Ecosystem (Middle East)
Denodo
 
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATIONBig Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Matt Stubbs
 
Modern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | QuboleModern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | Qubole
Vasu S
 
Business intelligence tools
Business intelligence toolsBusiness intelligence tools
Business intelligence toolsBhavya01
 
Business intelligence, Data Analytics & Data Visualization
Business intelligence, Data Analytics & Data VisualizationBusiness intelligence, Data Analytics & Data Visualization
Business intelligence, Data Analytics & Data Visualization
Muthu Natarajan
 
SplunkLive! Splunk for Business Analytics
SplunkLive! Splunk for Business AnalyticsSplunkLive! Splunk for Business Analytics
SplunkLive! Splunk for Business AnalyticsSplunk
 
Three Big Data Case Studies
Three Big Data Case StudiesThree Big Data Case Studies
Three Big Data Case Studies
Atidan Technologies Pvt Ltd (India)
 
Four Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by ActuateFour Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by Actuate
Edgar Alejandro Villegas
 
Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...
Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...
Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...
Denodo
 
Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data Analytics
Vijay Rao
 
M365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #Governance
M365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #GovernanceM365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #Governance
M365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #Governance
Nicolas Georgeault
 
Real-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BIReal-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BI
ibi
 
IoT and Big Data
IoT and Big DataIoT and Big Data
IoT and Big Data
Musa Kalimullah
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
Arvind Sathi
 
Data Virtualization at Logitech = #Winning
Data Virtualization at Logitech = #WinningData Virtualization at Logitech = #Winning
Data Virtualization at Logitech = #Winning
Denodo
 
Big Data analytics best practices
Big Data analytics best practicesBig Data analytics best practices
Big Data analytics best practices
The Marketing Distillery
 
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
Denodo
 

What's hot (20)

The Present - the History of Business Intelligence
The Present - the History of Business IntelligenceThe Present - the History of Business Intelligence
The Present - the History of Business Intelligence
 
Blueprint for integrating big data analytics and bi
Blueprint for integrating big data analytics and biBlueprint for integrating big data analytics and bi
Blueprint for integrating big data analytics and bi
 
Big Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered AccountantBig Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered Accountant
 
Data Services and the Modern Data Ecosystem (Middle East)
Data Services and the Modern Data Ecosystem (Middle East)Data Services and the Modern Data Ecosystem (Middle East)
Data Services and the Modern Data Ecosystem (Middle East)
 
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATIONBig Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
 
Modern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | QuboleModern Integrated Data Environment - Whitepaper | Qubole
Modern Integrated Data Environment - Whitepaper | Qubole
 
Business intelligence tools
Business intelligence toolsBusiness intelligence tools
Business intelligence tools
 
Business intelligence, Data Analytics & Data Visualization
Business intelligence, Data Analytics & Data VisualizationBusiness intelligence, Data Analytics & Data Visualization
Business intelligence, Data Analytics & Data Visualization
 
SplunkLive! Splunk for Business Analytics
SplunkLive! Splunk for Business AnalyticsSplunkLive! Splunk for Business Analytics
SplunkLive! Splunk for Business Analytics
 
Three Big Data Case Studies
Three Big Data Case StudiesThree Big Data Case Studies
Three Big Data Case Studies
 
Four Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by ActuateFour Pillars of Business Analytics by Actuate
Four Pillars of Business Analytics by Actuate
 
Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...
Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...
Accelerating Data-Driven Enterprise Transformation in Banking, Financial Serv...
 
Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data Analytics
 
M365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #Governance
M365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #GovernanceM365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #Governance
M365 Saturday Saskatchewan 2020 - Build your #PowerPlatform #Governance
 
Real-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BIReal-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BI
 
IoT and Big Data
IoT and Big DataIoT and Big Data
IoT and Big Data
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
 
Data Virtualization at Logitech = #Winning
Data Virtualization at Logitech = #WinningData Virtualization at Logitech = #Winning
Data Virtualization at Logitech = #Winning
 
Big Data analytics best practices
Big Data analytics best practicesBig Data analytics best practices
Big Data analytics best practices
 
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
 

Viewers also liked

SQL Server Data Discovery with PowerPivot
SQL Server Data Discovery with PowerPivotSQL Server Data Discovery with PowerPivot
SQL Server Data Discovery with PowerPivot
Eduardo Castro
 
Business Analytics & Big Data Trends and Predictions 2014 - 2015
Business Analytics & Big Data Trends and Predictions 2014 - 2015Business Analytics & Big Data Trends and Predictions 2014 - 2015
Business Analytics & Big Data Trends and Predictions 2014 - 2015
Brad Culbert
 
Business Analytics System
Business Analytics SystemBusiness Analytics System
Business Analytics System
Mahesh Patwardhan
 
The Evolution of Big Data Analytics
The Evolution of Big Data AnalyticsThe Evolution of Big Data Analytics
The Evolution of Big Data Analytics
AYATA
 
Rd big data & analytics v1.0
Rd big data & analytics v1.0Rd big data & analytics v1.0
Rd big data & analytics v1.0
Yadu Balehosur
 
Why Data Vault?
Why Data Vault? Why Data Vault?
Why Data Vault?
Kent Graziano
 
Evolution of Data Analytics: the past, the present and the future
Evolution of Data Analytics: the past, the present and the futureEvolution of Data Analytics: the past, the present and the future
Evolution of Data Analytics: the past, the present and the future
Varun Nemmani
 

Viewers also liked (7)

SQL Server Data Discovery with PowerPivot
SQL Server Data Discovery with PowerPivotSQL Server Data Discovery with PowerPivot
SQL Server Data Discovery with PowerPivot
 
Business Analytics & Big Data Trends and Predictions 2014 - 2015
Business Analytics & Big Data Trends and Predictions 2014 - 2015Business Analytics & Big Data Trends and Predictions 2014 - 2015
Business Analytics & Big Data Trends and Predictions 2014 - 2015
 
Business Analytics System
Business Analytics SystemBusiness Analytics System
Business Analytics System
 
The Evolution of Big Data Analytics
The Evolution of Big Data AnalyticsThe Evolution of Big Data Analytics
The Evolution of Big Data Analytics
 
Rd big data & analytics v1.0
Rd big data & analytics v1.0Rd big data & analytics v1.0
Rd big data & analytics v1.0
 
Why Data Vault?
Why Data Vault? Why Data Vault?
Why Data Vault?
 
Evolution of Data Analytics: the past, the present and the future
Evolution of Data Analytics: the past, the present and the futureEvolution of Data Analytics: the past, the present and the future
Evolution of Data Analytics: the past, the present and the future
 

Similar to From Business Intelligence to Big Data - hack/reduce Dec 2014

IT webinar 2016
IT webinar 2016IT webinar 2016
IT webinar 2016
PR Cell, IIM Rohtak
 
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
BIWUG
 
How to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePointHow to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePoint
Joris Poelmans
 
Levelling up your data infrastructure
Levelling up your data infrastructureLevelling up your data infrastructure
Levelling up your data infrastructure
Simon Belak
 
BI Introduction
BI IntroductionBI Introduction
BI Introduction
Taras Panchenko
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
Analyzing Billions of Data Rows with Alteryx, Amazon Redshift, and Tableau
Analyzing Billions of Data Rows with Alteryx, Amazon Redshift, and TableauAnalyzing Billions of Data Rows with Alteryx, Amazon Redshift, and Tableau
Analyzing Billions of Data Rows with Alteryx, Amazon Redshift, and Tableau
DATAVERSITY
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
Toigo Critical Convergence
Toigo  Critical ConvergenceToigo  Critical Convergence
Toigo Critical Convergencehypknight
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
James Serra
 
Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2
Roland Bullivant
 
Data Vault Introduction
Data Vault IntroductionData Vault Introduction
Data Vault Introduction
Patrick Van Renterghem
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
James Serra
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
Dmitry Anoshin
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment Options
Caserta
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
James Serra
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
DATAVERSITY
 
Jan 2017 Investment Recommendation for Tableau
Jan 2017 Investment Recommendation for TableauJan 2017 Investment Recommendation for Tableau
Jan 2017 Investment Recommendation for Tableau
paulchenuva
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Rittman Analytics
 
Power BI - 2016 - Public
Power BI - 2016 - PublicPower BI - 2016 - Public
Power BI - 2016 - PublicJulian Payne
 

Similar to From Business Intelligence to Big Data - hack/reduce Dec 2014 (20)

IT webinar 2016
IT webinar 2016IT webinar 2016
IT webinar 2016
 
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
 
How to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePointHow to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePoint
 
Levelling up your data infrastructure
Levelling up your data infrastructureLevelling up your data infrastructure
Levelling up your data infrastructure
 
BI Introduction
BI IntroductionBI Introduction
BI Introduction
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Analyzing Billions of Data Rows with Alteryx, Amazon Redshift, and Tableau
Analyzing Billions of Data Rows with Alteryx, Amazon Redshift, and TableauAnalyzing Billions of Data Rows with Alteryx, Amazon Redshift, and Tableau
Analyzing Billions of Data Rows with Alteryx, Amazon Redshift, and Tableau
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Toigo Critical Convergence
Toigo  Critical ConvergenceToigo  Critical Convergence
Toigo Critical Convergence
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2
 
Data Vault Introduction
Data Vault IntroductionData Vault Introduction
Data Vault Introduction
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment Options
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
Jan 2017 Investment Recommendation for Tableau
Jan 2017 Investment Recommendation for TableauJan 2017 Investment Recommendation for Tableau
Jan 2017 Investment Recommendation for Tableau
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
 
Power BI - 2016 - Public
Power BI - 2016 - PublicPower BI - 2016 - Public
Power BI - 2016 - Public
 

Recently uploaded

State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 

From Business Intelligence to Big Data - hack/reduce Dec 2014

  • 1. From Business Intelligence to Big Data: The Evolution of Business Analytics @hackreduce – Dec. 3, 2014 Adam Ferrari @AJFerrari (All opinions expressed are my own / I’m not here representing my employers)
  • 2. Adams-MacBook-Pro:~ ferrari$ whoami ferrari 2012- 2014- BA ‘91 MS ‘94 PhD CS ’98 2000-2012 (CTO 2005-2012)
  • 4. Which led us into this…
  • 5. This talk What did I learn as CTO of a BI product company as we jumped into the BI market mid-stream, and then later as we were acquired by one of the biggest “traditional BI” vendors? Most Importantly: Stay focused on real business value, not technology. Note: My context is very “product provider” oriented, but I believe the lessons are equally interesting to “product consumers” – after all, we’re all interested in where the toolset is going and why
  • 6. A note about scope Analytics is a highly overloaded term The vast majority of my experience, and the focus of this talk, is around “BI-style” analytics, i.e., Delivering historical and aggregate views of data (e.g., charts, reports, dashboards, etc.) to business decision makers There are many other important forms of “analytics” E.g., Data mining, statistics, data science, etc. These are very important and complementary, but not in my scope here
  • 7. Part 1 (of 3) Some Ancient History (or, a bunch of important stuff that happened before my time)
  • 8. In the beginning… …there was the cube (well, there was a bunch of stuff before that – Hans Peter Luhn coins the term Business Intelligence in 1958, Edgar Codd invents the relational data model in 1970, etc… but we’ll start with the beginning of modern Business Intelligence, which is OLAP) Image source: oracle.com
  • 9. Research sponsored by Arbor Software in 1993, defined the “12 Rules for OLAP Products” Rule #1 – “Multidimensional Conceptual View”
  • 10. OLAP = Multidimensional Analysis Notable “traditional” OLAP Products • Express (IRI - Oracle) • Essbase (Arbor - Hyperion - Oracle) • Microsoft Analysis Services (Panorama - MS) Image source: microsoft.com
  • 11. [1995] Notable “traditional” ROLAP Products • MicroStrategy • Business Objects (Business Objects - SAP) • Cognos (Cognos - IBM) • OBIEE (nQuire - Siebel - Oracle) • Actuate, Birst, Pentaho etc… Image source: microstrategy.com
  • 12. ROLAP Modeling • Manage mapping between physical data stores, “logical view” (core dimensional model), and “business view” • Definition of metrics, dimensions • Management of pre- computed aggregates Image source: rittmanmead.com
  • 13. Data Warehousing: go big or go home HW • Teradata • Netezza (IBM) • Oracle Exadata SW – Traditional DBMS • Oracle • MS SQL Server • IBM DB2 SW – Analytical DBMS • Vertica (HP) • ParAccel (/ Redshift) • SAP HANA Image source: teradata.com
  • 14. ETL - Extract/Transform/Load Image source: informatica.com Notable ETL Products • Informatica PowerCenter • Ascential DataStage (IBM) • Ab Initio • … numerous others • Capture History • Manage dimensions – E.g., what happens if a customer moves? “slowly changing dimensions” • Pre-compute aggregates • Serve as the versionable managed record of how the dimensional model of the warehouse is derived from the raw data
  • 15. Best Practices [First Edition, 1992] Image source: wiley.com [Founded, 1995] [First Edition, 1996]
  • 16. “Business Analytics” 1.0 Architecture Image source: ibm.com
  • 17. Business Analytics 1.0 - Pros & Cons • Governance, re-use, and quality – “One Version of the Truth” – correct, agreed upon, reusable definitions of core business metrics and dimensions But… • Poor Agility – development process requires: – Creating or modifying a dimensional model – Creating ETL to populate the new model – Creating report or dashboard content on top of the model – Iterating to make the model perform • Lack of self-service for end users • Historically, poor user experience for end consumers • Cost and Complexity – large, complex stack of components, code, and configuration to manage, scale, troubleshoot, etc.
  • 18. Part 2 (of 3) Some Recent History (or, where I joined the story already in progress)
  • 19. Data Discovery & Visualization Key Features • Visual data presentation • Interactive data exploration – “facets,” “lassos,” etc. • Simplified stack – DBMS and Server optional • Self-service: data loading & content creation, no dimensional modeling Notable products: • QlikView (Qlik Tech) • Tableau • Spotfire (TIBCO) • Endeca Latitude (now Oracle Information Discovery) • EdgeSpring (now Salesforce.com Wave) • Business Objects Explorer Image source: tibco.com Image source: sap.com
  • 21. Image source: community.qlik.com QlikView configuration example…
  • 23. Image source: vizwiz.blogspot.com Tableau configuration example…
  • 24. Data Discovery Lessons • Improved User Experience, Self-service But… • BI is still really hard – Reading from raw, real-world operational schemas is messy and complicated – And the requisite history may not even be available • The usability benefits of discovery tools come with significant scalability limitations • Additional data types – so called “unstructured” data (logs, text, etc.) is even harder, as discovery tools (generally) target structured, tabular data (didn’t address “Big Data”) And… • Traditional BI tools are rapidly adding better UX, Visualization, and Self-service
  • 25. Part 3 (of 3) (woohoo!) Future History (or, stuff that’s still anyone’s guess)
  • 26. Our analytics ambitions have only grown! We want BIG, EASY, DEEP analytics • [BIG] the headline grabber: More data from more sources, aka: Big Data • [EASY] the real issue (IMHO): Faster time to value, at lower cost of ownership • [DEEP] increasingly important: Deeper intelligence from data… not just data, but actions, predictions, etc… … Can we solve these problems without creating an ever larger mess of technology and products?
  • 27. [BIG]: the Hadoop Solution Posits that what we need is a better, more flexible and scalable foundation for the Data Warehouse – more like a “data operating system” than a DBMS Image source: cloudera.com
  • 28. [BIG] and [EASY] “On-Hadoop” Solutions Image source: gigaom.com Platfora Architecture Posit that although Hadoop is indeed a powerful platform, its complexity needs to be wrapped in a BI / analytics application Notable Products • Platfora • Datameer • Oracle Big Data Discovery (based on Endeca)
  • 29. [BIG+]: The Logical Data Warehouse* Posits that what is needed is a variety of data stores to constitute the “Data Warehouse,” along with integration to allow data to be stored and processed where most appropriate with little or no additional development effort or operational management overhead Image source: teradata.com * From Understanding the Logical Data Warehouse: The Emerging Practice, 21 June 2012, Mark A. Beyer and Roxane Edjlali
  • 30. [EASY] The Cloud Solution • Agility via all of the traditional cloud benefits – reduced setup, less customization, reduced ongoing management, etc… • SaaS-based BI tools, such as – GoodData – Domo • SaaS-based BI applications, such as – Numerify (IT analytics on ServiceNow, etc.) – InsightSquared (Sales analytics on Salesforce)
  • 31. Other notable examples • [DEEP] and [EASY]: BeyondCore – data discovery with automatic/algorithmic analysis of attribute relationships • [DEEP]: Ayasdi – deeper insight into data based on novel topological data visualization • [DEEP] Alteryx – democratizing more complex analytical workflows • [EASY 2.0]: Looker – lightweight BI without sacrificing modeling, yet avoiding the need for a warehouse • [BIG] and [EASY]: Tamr, Trifacta - curating and wrangling data into usable forms
  • 32. My guesses about the future? • I voted with my feet. My beliefs: – Fast time to real value is of paramount importance • Zero-friction SaaS applications targeted to specific business problems are an essential enabler – essential to amortizing the cost of developing meaningful analytics and quickly disseminating best practice updates – DIY just doesn’t cut it any more in many cases. – Our ability to do basic BI (dashboards, data discovery, etc.) is mature, and the real action is in deeper analysis of data • Yet highly custom data science efforts are at odds with fast time to value, and hard to advance in many cases
  • 33. Crisply – quantified work for CRM model & activity activity quantified work • Algorithmic quantification of the human effort behind each customer, opportunity, support case, etc. • Determine the true cost to acquire a specific customer or type of customer, and understand the true profitability of that customer or segment over time
  • 34. Thanks! And stay focused on the value that analytics creates (the technology will follow from that)

Editor's Notes

  1. http://www.tcsnycmarathon.org/analytics
  2. Data Discovery tools improved agility and UX, and enabled more powerful self-service / DIY. But did these “model-less tools” truly advance Business Analytics, or just expand the toolset? How will their impact trend as traditional tools become more agile and visual, at the same time that more modern tools advance the functional envelope? Large organizations are still sorting out the impact of data discovery on their BI strategies, even as the picture changes quickly with new tools emerging and incumbent standards improving.