BI/Analytics for NoSQL:
Review of Architectures
What we'll answer in 50 minutes
•   Who is this guy?
•   How do I enable ad hoc,
    self-service reporting on NoSQL?
•   How do I improve the
    performance of dashboards
    on top of NoSQL?
•   How do I integrate NoSQL
    data with my other data
    outside NoSQL?
•   How do I make simple reports
    easy to build while
    preserving the ability to run
    rich NoSQL queries?
Nicholas Goodman

•    Open Source BI thought leader
       –    50+ Open Source BI customer projects
       –    Blogger, whitepapers, etc
•    Entrepreneur
       –    DynamoBI Corporation
       –    Bayon Technologies, Inc.
•    Data Geek, hacker, tinkerer, committer



    GOAL: Share perspectives,
    research, opinions.
    DISCLAIMER: Your Mileage ...
How do we answer those Q's?
Promise of “Big Data”
•   NoSQL/Hadoop/MapReduce Systems
     –   Keep more of it
     –   Cost effective analysis
     –   “Massive scale” data, now accessible to everyone (elastic)
     –   Not just SQL queries, more complex analysis




     ACCOMPLISHED: WEB SCALE, MASSIVE
     NEVER BEFORE SEEN SCALE OF DATA
     STORAGE AND PROCESSING
Reality Check!


•   Petabytes? Y                  •   Fast Queries? N
•   Cheap Storage? Y              •   Ad Hoc access? N
•   Raw Processing? Y             •   Accessibility to commodity BI tools? N
•   Rich Query Languages? Y       •   Easy report authoring? N
•   Flexible data structures? Y   •   Levels of Aggregation? N
•   Reliable, Fault Tolerant? Y   •   Integrated Data? N

     Big Data has solved the INFRASTRUCTURE of
     raw/core data storage but has provided less value
     to what BUSINESS users want for analytics.
Data Gaps too!



•   Code, Developers             •   Analysts w/ Excel, Dashboards
•   MR, Rich Graph/Access        •   Simple 2D (tables, charts)
•   Hierarchical, Unstructured   •   Filtering and easy analytics
Levels of Aggregation

SAME DATA AT VARIOUS
LEVELS OF AGGREGATION
HUGELY IMPORTANT IN REAL
LIFE IMPLEMENTATIONS!

1 ROW TO 1 BILLION ROWS
[Figure: aggregation levels labeled 10K, 1 MILLION, 100 MILLION, 100 BILLION]
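
A minimal sketch of what "same data at various levels of aggregation" can look like, assuming a hypothetical list of page-view events (the field names and values are illustrative and not from the talk). The same raw rows are rolled up to coarser and coarser levels, and each level has fewer rows than the one below it.

```python
from collections import Counter

# Hypothetical raw events: one row per page view (illustrative data only).
events = [
    {"day": "2011-08-01", "hour": 9,  "page": "/home"},
    {"day": "2011-08-01", "hour": 9,  "page": "/docs"},
    {"day": "2011-08-01", "hour": 14, "page": "/home"},
    {"day": "2011-08-02", "hour": 9,  "page": "/home"},
]

def rollup(rows, keys):
    """Count rows grouped by the given key fields."""
    return Counter(tuple(row[k] for k in keys) for row in rows)

# The same data at three levels, from finest to coarsest.
by_day_hour_page = rollup(events, ["day", "hour", "page"])
by_day_page = rollup(events, ["day", "page"])
by_day = rollup(events, ["day"])
```

A dashboard would typically read from the coarsest level that can answer the question, and drill down to finer levels only on demand.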
Architectures

•   NoSQL   reports
•   NoSQL   thru and thru
•   NoSQL   + MySQL
•   NoSQL   as ETL Source
•   NoSQL   programs in BI Tools
•   NoSQL   via BI Database (SQL)
NoSQL reports
•   Pay Developer to build applications for reports



[Diagram label: Apps]




•      100% Richness of NoSQL           •     $$, developer driven process
•      Up to date, current              •     No commodity BI tools
•      Excellent performance on         •     Managing rollups/summaries
       large datasets                   •     Schema-less = Harder!
•      Custom built, beautiful          •     Hard to integrate other
       reports/dashboards                     reporting information
•      Single system to manage
NoSQL thru and thru
•   Pay Developer to build FLEXIBLE applications for reports


[Diagram labels: Indices, Aggs; Advanced Apps]




•      All of NoSQL report              •     $$, developer driven process
       advantages                       •     $$, app required for aggs
•      Managed aggregations,            •     No commodity BI tools
       rollups                          •     Hard to integrate other
•      “Guided Adhoc” available               reporting information
       inside application               •     Limited AdHoc (only
•      Higher performance for                 developer built
       dashboards/summaries                   combinations)
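
One way to picture "managed aggregations" together with "limited AdHoc" is a store that keeps developer-defined rollups current on every write. This is a toy sketch under stated assumptions: the `RollupStore` class, its methods, and the sample fields are hypothetical and do not correspond to any particular NoSQL product.

```python
from collections import defaultdict

class RollupStore:
    """Toy document store that maintains developer-defined rollups on write."""

    def __init__(self, rollup_keys):
        # rollup_keys: tuples of field names, e.g. [("country",), ("country", "browser")]
        self.docs = []
        self.rollups = {keys: defaultdict(int) for keys in rollup_keys}

    def insert(self, doc):
        """Store the document and update every pre-defined rollup."""
        self.docs.append(doc)
        for keys, counts in self.rollups.items():
            counts[tuple(doc.get(k) for k in keys)] += 1

    def count(self, keys, values):
        """Answer a count query, but only for combinations a developer built."""
        if keys not in self.rollups:
            raise KeyError(f"no rollup defined for {keys}")
        return self.rollups[keys].get(tuple(values), 0)

store = RollupStore([("country",), ("country", "browser")])
store.insert({"country": "US", "browser": "Firefox"})
store.insert({"country": "US", "browser": "Chrome"})
store.insert({"country": "DE", "browser": "Firefox"})
us_total = store.count(("country",), ["US"])
```

Reads are cheap because the counts already exist, but a question nobody anticipated (here, counts by browser alone) needs a developer to add a new rollup.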
NoSQL + MySQL
•   Pay Developer to build FLEXIBLE applications for reports


[Diagram labels: ETL App, MySQL]




•      Less IT $$ since developers      •     Data freshness (24 hrs old)
       aren't “building reports”        •     Once into MySQL no rich
•      Rich, NoSQL analysis left in           NoSQL application use (M/R)
       place (ETL + NoSQL)              •     BI Tool can connect ONLY to
•       Easy, Ad Hoc reporting via            data in MySQL, not NoSQL
       commodity BI tools               •     Aggregations still self
•      Easier to understand data for          managed in MySQL
       self service reports
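
The ETL step in this architecture can be sketched as follows. The assumptions are loud ones: the documents and the `daily_events` table are hypothetical, and an in-memory SQLite database stands in for MySQL so the example is self-contained. The point is only the shape of the flow: summarize semi-structured records, then load a flat table that a commodity BI tool could query with plain SQL.

```python
import sqlite3
from collections import Counter

# Hypothetical semi-structured records as they might sit in the NoSQL store.
raw_docs = [
    {"user": {"country": "US"}, "event": "click", "day": "2011-08-01"},
    {"user": {"country": "US"}, "event": "view",  "day": "2011-08-01"},
    {"user": {"country": "DE"}, "event": "click", "day": "2011-08-01"},
    {"user": {}, "event": "click", "day": "2011-08-02"},  # country missing
]

def summarize(docs):
    """ETL app step: aggregate to (day, country, event) counts."""
    return Counter(
        (d["day"], d.get("user", {}).get("country", "unknown"), d["event"])
        for d in docs
    )

def load(conn, summary):
    """Full refresh of the summary table (SQLite standing in for MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_events "
        "(day TEXT, country TEXT, event TEXT, n INTEGER)"
    )
    conn.execute("DELETE FROM daily_events")
    conn.executemany(
        "INSERT INTO daily_events VALUES (?, ?, ?, ?)",
        [(*key, n) for key, n in summary.items()],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, summarize(raw_docs))

# The kind of ad hoc query a BI tool could now issue.
clicks_by_country = dict(conn.execute(
    "SELECT country, SUM(n) FROM daily_events "
    "WHERE event = 'click' GROUP BY country"
))
```

Because the table is rebuilt in batch, reports are only as fresh as the last load, which is the data freshness drawback listed on the slide.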
NoSQL as ETL Data Source
•   NoSQL treated like any other data source


[Diagram labels: Informatica, Teradata]




•   Allows use of consolidated      •     ETL Development Expense
    BI tool for AdHoc               •     Data Latency
•   Enables integrated              •     Loss of NoSQL language
    (combined) datasets for               richness
    reporting                       •     Traditional DW tools are $$
•   Aggregations often              •     Scaling issues with DW
    “managed”                             Database
•   Best of Breed tools
NoSQL programs in BI Tools
•   Write a program in BI tool that flattens data, output into report




•   Rich use of NoSQL native         •      Developer required to write
    language                                program ($$)
•   Direct, up to date access        •      Slow-er (aggs, summaries)
•   Access to 100% of dataset        •      Lacks integration with other
•   Leverage “guided” report                datasets
    parameter pages                  •      Still (usually) no AdHoc
•   Less expensive than apps                access
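
The "program that flattens data" could look roughly like the sketch below. It is written in plain Python rather than in any specific BI tool's scripting language, and the `flatten` and `explode` helpers and the sample order document are hypothetical. Nested fields become dotted column names, and a nested list becomes one report row per element.

```python
def flatten(doc, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted column names."""
    flat = {}
    for key, value in doc.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat

def explode(doc, list_field):
    """Emit one flat row per element of a nested list, repeating parent fields."""
    parent = {k: v for k, v in doc.items() if k != list_field}
    return [
        flatten({**parent, list_field: item})
        for item in doc.get(list_field, [])
    ]

# Hypothetical hierarchical document.
order = {
    "id": 7,
    "customer": {"name": "Ann", "city": "Oslo"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
}
rows = explode(order, "items")
```

Each element of `rows` is a simple 2D record that a table or chart component can consume directly.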
NoSQL via BI Database (SQL)
•   Enable NoSQL data access via SQL (gasp!)
[Diagram labels: Live Query; Cached, 24hr data]




•      Easy reports, easy (SQL)      •         Another system in between
•      Integration with other data   •         Still needs to be refreshed,
•      ETL is simple INSERT/MERGEs             nightly
•      Live, up to date access       •         Not all capabilities for NoSQL
                                               richness available via SQL
•      High performance, cached data
•      AdHoc access to Live + Cached
•      Aggregations/Summaries
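
A rough sketch of the "live + cached" idea, under several assumptions: SQLite stands in for the BI database, `INSERT OR REPLACE` stands in for a MERGE statement, and `fetch_live_counts` is a hypothetical placeholder for a live query against the NoSQL store. None of these names come from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cached_counts (day TEXT PRIMARY KEY, n INTEGER)")

def merge_counts(conn, rows):
    """Upsert (day, n) pairs; INSERT OR REPLACE plays the role of MERGE here."""
    conn.executemany(
        "INSERT OR REPLACE INTO cached_counts (day, n) VALUES (?, ?)", rows
    )
    conn.commit()

def fetch_live_counts():
    """Hypothetical placeholder for a live query against the NoSQL store."""
    return [("2011-08-03", 5)]

def report(conn):
    """Cached history overlaid with live rows; live values win on conflict."""
    result = dict(conn.execute("SELECT day, n FROM cached_counts"))
    for day, n in fetch_live_counts():
        result[day] = n
    return result

merge_counts(conn, [("2011-08-01", 10), ("2011-08-02", 12)])
merge_counts(conn, [("2011-08-02", 15)])  # a later refresh corrects one day
combined = report(conn)
```

The refresh stays a simple set of upserts, while the report combines fast cached history with whatever the live source returns for the current period.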
Mozilla: NoSQL thru and thru (DB)
•   Socorro Project: Crash reports, optionally sent to Mozilla
•   https://crash-stats.mozilla.com
X: NoSQL via SQL
•   Using “Splunk” (i.e., a commercial NoSQL-eee data aggregator, etc.)
•   Desire to use Tableau for advanced analytics/visualization
Meteor Solutions: NoSQL thru and thru
•   Using Cloudant BigCouch solution (SaaS)
•   High performance set of multi purpose indices on pre defined
    aggregations
•   Up to date aggregation/reports
•   Better fit for Social Media graph structures over relational DB
•   Custom built BI applications (dashboards/reports) providing a
    flexible guided view through data


[Diagram label: Advanced Apps]
A,B,C: NoSQL + MySQL
•   Many, many companies (3 we've worked with)
•   All “web related” companies (semi structured, some, mostly
    volume)
•   Heavy lifting and storage, and “ETL/Data preparation” inside
    Hadoop
•   Push summarized, aggregated data into MySQL for analysis by
    easy dashboarding/BI tools




[Diagram labels: ETL App, MySQL]


CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Modernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaModernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using Ballerina
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 

No sql now2011_review_of_adhoc_architectures

  • 2. What we'll answer in 50 minutes • Who is this guy? • How do I enable AdHoc, self service reporting on NoSQL? • How do I improve the performance of dashboards on top of NoSQL? • How do I integrate NoSQL data with my other data not inside NoSQL? • How do I enable, easy to build simple reports but also preserve the ability for rich NoSQL queries?
  • 3. Nicholas Goodman • Open Source BI thought leader – 50+ Open Source BI customer projects – Blogger, whitepapers, etc • Entrepreneur – DynamoBI Corporation – Bayon Technologies, Inc. • Data Geek, hacker, tinkerer, committer GOAL: Share perspectives, research, opinions. DISCLAIMER: Your Mileage ...
  • 4. How do we answer those Q's?
  • 5. Promise of “Big Data” • NoSQL/Hadoop/MapReduce Systems – Keep more of it – Cost effective analysis – “Massive scale” data, now accessible to everyone (elastic) – Not just SQL queries, more complex analysis ACCOMPLISHED: WEB SCALE, MASSIVE NEVER BEFORE SEEN SCALE OF DATA STORAGE AND PROCESSING
  • 6. Reality Check! • Petabytes? Y • Cheap Storage? Y • Raw Processing? Y • Rich Query Languages? Y • Flexible data structures? Y • Reliable, Fault Tolerant? Y • Fast Queries? N • Ad Hoc access? N • Accessibility to commodity BI tools? N • Easy report authoring? N • Levels of Aggregation? N • Integrated Data? N Big Data has solved the INFRASTRUCTURE of raw/core data storage but has provided less value to what BUSINESS users want for analytics.
  • 7. Data Gaps too! • Code, Developers • MR, Rich Graph/Access • Hierarchical, Unstructured • Analysts w/ Excel, Dashboards • Simple 2D (tables, charts) • Filtering and easy analytics
  • 8. Levels of Aggregation SAME DATA AT VARIOUS LEVELS OF AGGREGATION HUGELY IMPORTANT IN REAL LIFE IMPLEMENTATIONS! 10K 1 ROW 1 MILLION TO 100 MILLION 1 BILLION ROWS 100 BILLION
  • 9. Architectures • NoSQL reports • NoSQL thru and thru • NoSQL + MySQL • NoSQL as ETL Source • NoSQL programs in BI Tools • NoSQL via BI Database (SQL)
  • 10. NoSQL reports • Pay Developer to build applications for reports (diagram: Apps) Pros: • 100% Richness of NoSQL • Up to date, current • Excellent performance on large datasets • Custom built, beautiful reports/dashboards • Single system to manage Cons: • $$, developer driven process • No commodity BI tools • Managing rollups/summaries • Schema-less = Harder! • Hard to integrate other reporting information
  • 11. NoSQL thru and thru • Pay Developer to build FLEXIBLE applications for reports (diagram: Indices, Aggs, Advanced Apps) Pros: • All of NoSQL report advantages • Managed aggregations, rollups • “Guided Adhoc” available inside application • Higher performance for dashboards/summaries Cons: • $$, developer driven process • $$, app required for aggs • No commodity BI tools • Hard to integrate other reporting information • Limited AdHoc (only developer built combinations)
  • 12. NoSQL + MySQL • Pay Developer to build FLEXIBLE applications for reports (diagram: ETL App, MySQL) Pros: • Less IT $$ since developers aren't “building reports” • Rich, NoSQL analysis left in place (ETL + NoSQL) • Easy, Ad Hoc reporting via commodity BI tools • Easier to understand data for self service reports Cons: • Data freshness (24 hrs old) • Once into MySQL no rich NoSQL application use (M/R) • BI Tool can connect ONLY to data in MySQL, not NoSQL • Aggregations still self managed in MySQL
  • 13. NoSQL as ETL Data Source • NoSQL treated like any other data source (diagram: Informatica, Teradata) Pros: • Allows use of consolidated, BI tool for AdHoc • Enables integrated (combined) datasets for reporting • Aggregations Often “managed” • Best of Breed tools Cons: • ETL Development Expense • Data Latency • Loss of NoSQL language richness • Traditional DW tools are $$ • Scaling issues with DW Database
  • 14. NoSQL programs in BI Tools • Write a program in BI tool that flattens data, output into report Pros: • Rich use of NoSQL native language • Direct, up to date access • Access to 100% of dataset • Leverage “guided” report parameter pages • Less expensive than apps Cons: • Developer required to write program ($$) • Slow-er (aggs, summaries) • Lacks integration with other datasets • Still (usually) no AdHoc access
  • 15. NoSQL via BI Database (SQL) • Enable NoSQL data access via SQL (gasp!) (diagram: Live Query; Cached, 24hr data) Pros: • Easy reports, easy (SQL) • Integration with other data • ETL is simple INSERT/MERGEs • Live, up to date access • High performance, cached data • AdHoc access to Live + Cached • Aggregations/Summaries Cons: • Another system in between • Still needs to be refreshed, nightly • Not all capabilities for NoSQL richness available via SQL
  • 16. Mozilla: NoSQL thru and thru(DB) • Socorro Project: Crash reports, optionally sent to Mozilla • https://crash-stats.mozilla.com
  • 17. X: NoSQL via SQL • Using “Splunk” (ie, a commercial NoSQL-eee data aggregator/etc) • Desire to use Tableau for advanced analytics/visualization
  • 18. Meteor Solutions: NoSQL thru and thru • Using Cloudant BigCouch solution (SaaS) • High performance set of multi purpose indices on pre defined aggregations • Up to date aggregation/reports • Better fit for Social Media graph structures over relational DB • Custom built BI applications (dashboards/reports) providing a flexible guided view through data Advanced Apps
  • 19. A,B,C: NoSQL + MySQL • Many, many companies (3 we've worked with) • All “web related” companies (semi structured, some, mostly volume) • Heavy lifting, storage, and “ETL/Data preparation” inside Hadoop • Push summarized, aggregated data into MySQL for analysis by easy dashboarding/BI tools (diagram: ETL App, MySQL)
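
The NoSQL + MySQL pattern from slides 12 and 19 can be sketched roughly as follows. This is a minimal illustration, not the implementation any of the companies mentioned used: the event records, the `page_views_daily` table, and the function names are all hypothetical, an in-memory list stands in for data held in Hadoop/NoSQL, and Python's built-in `sqlite3` stands in for MySQL so the example runs without a server.

```python
import sqlite3
from collections import Counter

# Hypothetical semi-structured events; a stand-in for raw data kept in Hadoop/NoSQL.
RAW_EVENTS = [
    {"day": "2011-08-01", "page": "/home", "user": {"id": 1}},
    {"day": "2011-08-01", "page": "/home", "user": {"id": 2}},
    {"day": "2011-08-01", "page": "/buy", "user": {"id": 1}},
    {"day": "2011-08-02", "page": "/home"},  # optional fields may be absent
]


def summarize(events):
    """'Heavy lifting' step: roll raw events up to (day, page, views) rows."""
    counts = Counter((e["day"], e["page"]) for e in events)
    return [(day, page, n) for (day, page), n in sorted(counts.items())]


def load_summary(conn, rows):
    """ETL step: push the aggregated rows into a relational summary table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views_daily ("
        "day TEXT, page TEXT, views INTEGER, PRIMARY KEY (day, page))"
    )
    # Re-running the nightly load overwrites rows for the same (day, page).
    conn.executemany(
        "INSERT OR REPLACE INTO page_views_daily VALUES (?, ?, ?)", rows
    )
    conn.commit()


# sqlite3 is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
load_summary(conn, summarize(RAW_EVENTS))

# The kind of ad hoc query a commodity BI tool could issue against the summary.
totals = conn.execute(
    "SELECT page, SUM(views) FROM page_views_daily GROUP BY page ORDER BY page"
).fetchall()
print(totals)
```

The trade-offs listed on slide 12 show up directly: the BI tool sees only what was loaded into the summary table (here, no user detail), and the data is only as fresh as the last load.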