No sql now2011_review_of_adhoc_architectures

Nicholas Goodman, Advisor at Let's Catch Up
BI/Analytics for NoSQL:
Review of Architectures
What we'll answer in 50 minutes
•   Who is this guy?
•   How do I enable AdHoc, self
    service reporting on NoSQL?
•   How do I improve the
    performance of dashboards
    on top of NoSQL?
•   How do I integrate NoSQL
    data with my other data not
    inside NoSQL?
•   How do I enable easy-to-build
    simple reports while also
    preserving the ability to run
    rich NoSQL queries?
Nicholas Goodman

•    Open Source BI thought leader
       –    50+ Open Source BI customer projects
       –    Blogger, whitepapers, etc
•    Entrepreneur
       –    DynamoBI Corporation
       –    Bayon Technologies, Inc.
•    Data Geek, hacker, tinkerer, committer



    GOAL: Share perspectives,
    research, opinions.
    DISCLAIMER: Your Mileage ...
How do we answer those Q's?
Promise of “Big Data”
•   NoSQL/Hadoop/MapReduce Systems
     –   Keep more of it
     –   Cost effective analysis
     –   “Massive scale” data, now accessible to everyone (elastic)
     –   Not just SQL queries, more complex analysis




     ACCOMPLISHED: WEB-SCALE, MASSIVE,
     NEVER-BEFORE-SEEN SCALE OF DATA
     STORAGE AND PROCESSING
Reality Check!


•   Petabytes? Y                  •   Fast Queries? N
•   Cheap Storage? Y              •   Ad Hoc access? N
•   Raw Processing? Y             •   Accessibility to commodity BI tools? N
•   Rich Query Languages? Y       •   Easy report authoring? N
•   Flexible data structures? Y   •   Levels of Aggregation? N
•   Reliable, Fault Tolerant? Y   •   Integrated Data? N

     Big Data has solved the INFRASTRUCTURE of
     raw/core data storage but has provided less value
     to what BUSINESS users want for analytics.
Data Gaps too!



•   Code, Developers             •   Analysts w/ Excel, Dashboards
•   MR, Rich Graph/Access        •   Simple 2D (tables, charts)
•   Hierarchical, Unstructured   •   Filtering and easy analytics
Levels of Aggregation

SAME DATA AT VARIOUS
LEVELS OF AGGREGATION
HUGELY IMPORTANT IN REAL
LIFE IMPLEMENTATIONS!

1 ROW TO 1 BILLION ROWS
[Figure: aggregation levels labeled 10K, 1 MILLION, 100 MILLION, 100 BILLION]
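
A minimal sketch of what "same data at various levels of aggregation" can look like in practice. The event records and the field names `ts`, `country`, and `hits` are hypothetical, chosen only for illustration; the point is that each level is the same raw data grouped by a coarser key.

```python
from collections import defaultdict

# Hypothetical raw events; field names are illustrative assumptions.
events = [
    {"ts": "2011-08-23T10:15", "country": "US", "hits": 3},
    {"ts": "2011-08-23T10:45", "country": "US", "hits": 2},
    {"ts": "2011-08-23T11:05", "country": "DE", "hits": 5},
    {"ts": "2011-08-24T09:30", "country": "US", "hits": 1},
]

def rollup(rows, key_fn):
    """Sum `hits` per group, where key_fn maps a row to its group key."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_fn(row)] += row["hits"]
    return dict(totals)

# Finest to coarsest: per hour and country, per day, grand total.
by_hour_country = rollup(events, lambda r: (r["ts"][:13], r["country"]))
by_day = rollup(events, lambda r: r["ts"][:10])
grand_total = rollup(events, lambda r: "all")
```

A dashboard would typically read from the coarse levels, while a drill-down would read from the finer ones; storing each level separately trades space for query speed.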
Architectures

•   NoSQL reports
•   NoSQL thru and thru
•   NoSQL + MySQL
•   NoSQL as ETL Source
•   NoSQL programs in BI Tools
•   NoSQL via BI Database (SQL)
NoSQL reports
•   Pay Developer to build applications for reports



                                              Apps




Advantages:
•   100% Richness of NoSQL
•   Up to date, current
•   Excellent performance on large datasets
•   Custom built, beautiful reports/dashboards
•   Single system to manage

Disadvantages:
•   $$, developer driven process
•   No commodity BI tools
•   Managing rollups/summaries
•   Schema-less = Harder!
•   Hard to integrate other reporting information
NoSQL thru and thru
•   Pay Developer to build FLEXIBLE applications for reports


[Diagram labels: Indices, Aggs; Advanced Apps]




Advantages:
•   All of NoSQL report advantages
•   Managed aggregations, rollups
•   “Guided Adhoc” available inside application
•   Higher performance for dashboards/summaries

Disadvantages:
•   $$, developer driven process
•   $$, app required for aggs
•   No commodity BI tools
•   Hard to integrate other reporting information
•   Limited AdHoc (only developer built combinations)
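
One way to picture the "Indices / Aggs" idea is a pre-defined aggregation index that is updated incrementally as documents arrive. The sketch below is an in-memory illustration only; the `AggIndex` class, the `map_doc` function, and the document fields (`ts`, `product`, `amount`) are hypothetical and do not correspond to any particular NoSQL product's API.

```python
from collections import defaultdict

def map_doc(doc):
    """Emit (key, value) pairs for one document; keys define the rollups."""
    day = doc["ts"][:10]
    yield (day, doc["product"]), doc["amount"]  # per day and product
    yield (day,), doc["amount"]                 # per day

class AggIndex:
    """In-memory stand-in for a pre-defined aggregation index."""

    def __init__(self):
        self.totals = defaultdict(float)

    def add(self, doc):
        # Incremental update: only the new document is processed.
        for key, value in map_doc(doc):
            self.totals[key] += value

    def query(self, key):
        # Lookup of a precomputed total; unknown keys return 0.0.
        return self.totals.get(key, 0.0)

index = AggIndex()
for doc in [
    {"ts": "2011-08-23T10:15", "product": "a", "amount": 10.0},
    {"ts": "2011-08-23T12:00", "product": "b", "amount": 2.5},
    {"ts": "2011-08-24T08:00", "product": "a", "amount": 4.0},
]:
    index.add(doc)
```

Dashboard reads become key lookups, which is where the performance gain comes from. The limitation is also visible: only the key combinations a developer wrote into `map_doc` can be queried.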
NoSQL + MySQL
•   Pay Developer to build FLEXIBLE applications for reports


[Diagram labels: ETL App; MySQL]




Advantages:
•   Less IT $$ since developers aren't “building reports”
•   Rich, NoSQL analysis left in place (ETL + NoSQL)
•   Easy, Ad Hoc reporting via commodity BI tools
•   Easier to understand data for self service reports

Disadvantages:
•   Data freshness (24 hrs old)
•   Once into MySQL no rich NoSQL application use (M/R)
•   BI Tool can connect ONLY to data in MySQL, not NoSQL
•   Aggregations still self managed in MySQL
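
A rough sketch of the "ETL App" step: summarize documents outside the relational database, then load only the summary rows for BI tools to query. Python's built-in `sqlite3` is used purely as a stand-in for MySQL so the example runs anywhere; the table name `daily_page_views` and the document fields are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical documents as they might come out of the NoSQL store.
docs = [
    {"day": "2011-08-23", "page": "/home", "views": 120},
    {"day": "2011-08-23", "page": "/home", "views": 30},
    {"day": "2011-08-23", "page": "/docs", "views": 45},
]

def summarize(rows):
    """Aggregate views per (day, page) before loading into SQL."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["day"], row["page"])] += row["views"]
    return [(day, page, views) for (day, page), views in sorted(totals.items())]

# sqlite3 is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_page_views ("
    "day TEXT, page TEXT, views INTEGER, PRIMARY KEY (day, page))"
)
# INSERT OR REPLACE makes a re-run of the nightly load idempotent.
conn.executemany(
    "INSERT OR REPLACE INTO daily_page_views VALUES (?, ?, ?)",
    summarize(docs),
)
conn.commit()

# The kind of query a BI tool would issue against the summary table.
report = conn.execute(
    "SELECT page, SUM(views) FROM daily_page_views GROUP BY page ORDER BY page"
).fetchall()
```

The BI tool sees only `daily_page_views`, which illustrates both the ease of reporting and the drawback listed above: anything not summarized into the table is unreachable from the tool.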
NoSQL as ETL Data Source
•   NoSQL treated like any other data source


                    Informatica         Teradata




Advantages:
•   Allows use of consolidated, BI tool for AdHoc
•   Enables integrated (combined) datasets for reporting
•   Aggregations Often “managed”
•   Best of Breed tools

Disadvantages:
•   ETL Development Expense
•   Data Latency
•   Loss of NoSQL language richness
•   Traditional DW tools are $$
•   Scaling issues with DW Database
NoSQL programs in BI Tools
•   Write a program in BI tool that flattens data, output into report




Advantages:
•   Rich use of NoSQL native language
•   Direct, up to date access
•   Access to 100% of dataset
•   Leverage “guided” report parameter pages
•   Less expensive than apps

Disadvantages:
•   Developer required to write program ($$)
•   Slow-er (aggs, summaries)
•   Lacks integration with other datasets
•   Still (usually) no AdHoc access
NoSQL via BI Database (SQL)
•   Enable NoSQL data access via SQL (gasp!)
[Diagram labels: Live Query; Cached, 24hr data]




Advantages:
•   Easy reports, easy (SQL)
•   Integration with other data
•   ETL is simple INSERT/MERGEs
•   Live, up to date access
•   High performance, cached data
•   AdHoc access to Live + Cached
•   Aggregations/Summaries

Disadvantages:
•   Another system in between
•   Still needs to be refreshed, nightly
•   Not all capabilities for NoSQL richness available via SQL
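
To illustrate "AdHoc access to Live + Cached", the sketch below answers one SQL query from a nightly cached table plus rows fetched live for the current day. It is an assumption-laden illustration, not how any specific BI database implements this: `fetch_live`, the table names, and the data are hypothetical, and `sqlite3` stands in for the SQL layer.

```python
import sqlite3

def fetch_live(since_day):
    """Hypothetical stand-in for a live query against the NoSQL store."""
    live_docs = [
        {"day": "2011-08-24", "signups": 4},
        {"day": "2011-08-24", "signups": 3},
    ]
    return [(d["day"], d["signups"]) for d in live_docs if d["day"] >= since_day]

conn = sqlite3.connect(":memory:")
# Cached table: refreshed nightly, so it ends at yesterday.
conn.execute("CREATE TABLE signups_cached (day TEXT, signups INTEGER)")
conn.executemany(
    "INSERT INTO signups_cached VALUES (?, ?)",
    [("2011-08-22", 10), ("2011-08-23", 12)],
)
# Live rows are staged in a temp table so one SQL query can span both.
conn.execute("CREATE TEMP TABLE signups_live (day TEXT, signups INTEGER)")
conn.executemany("INSERT INTO signups_live VALUES (?, ?)", fetch_live("2011-08-24"))

totals = conn.execute(
    "SELECT day, SUM(signups) FROM ("
    "  SELECT day, signups FROM signups_cached"
    "  UNION ALL"
    "  SELECT day, signups FROM signups_live"
    ") GROUP BY day ORDER BY day"
).fetchall()
```

The report author writes plain SQL and does not need to know which rows came from the cache and which were fetched live.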
Mozilla: NoSQL thru and thru (DB)
•   Socorro Project: Crash reports, optionally sent to Mozilla
•   https://crash-stats.mozilla.com
X: NoSQL via SQL
•   Using “Splunk” (i.e., a commercial NoSQL-eee data aggregator, etc.)
•   Desire to use Tableau for advanced analytics/visualization
Meteor Solutions: NoSQL thru and thru
•   Using Cloudant BigCouch solution (SaaS)
•   High performance set of multi-purpose indices on predefined
    aggregations
•   Up to date aggregation/reports
•   Better fit for Social Media graph structures than a relational DB
•   Custom built BI applications (dashboards/reports) providing a
    flexible guided view through data

[Diagram label: Advanced Apps]
A,B,C: NoSQL + MySQL
•   Many, many companies (3 we've worked with)
•   All “web related” companies (semi structured, some, mostly
    volume)
•   Heavy lifting, storage, and “ETL/data preparation” inside
    Hadoop
•   Push summarized, aggregated data into MySQL for analysis with
    easy dashboarding/BI tools

[Diagram labels: ETL App; MySQL]
  • 2. What we'll answer in 50 minutes • Who is this guy? • How do I enable AdHoc, self service reporting on NoSQL? • How do I improve the performance of dashboards on top of NoSQL? • How do I integrate NoSQL data with my other data not inside NoSQL? • How do I enable, easy to build simple reports but also preserve the ability for rich NoSQL queries?
  • 3. Nicholas Goodman • Open Source BI thought leader – 50+ Open Source BI customer projects – Blogger, whitepapers, etc • Entrepreneur – DynamoBI Corporation – Bayon Technologies, Inc. • Data Geek, hacker, tinkerer, committer GOAL: Share perspectives, research, opinions. DISCLAIMER: Your Mileage ...
  • 4. How do we answer those Q's?
  • 5. Promise of “Big Data” • NoSQL/Hadoop/MapReduce Systems – Keep more of it – Cost effective analysis – “Massive scale” data, now accessible to everyone (elastic) – Not just SQL queries, more complex analysis ACCOMPLISHED: WEB SCALE, MASSIVE NEVER BEFORE SEEN SCALE OF DATA STORAGE AND PROCESSING
  • 6. Reality Check! • Petabytes? Y • Cheap Storage? Y • Raw Processing? Y • Rich Query Languages? Y • Flexible data structures? Y • Reliable, Fault Tolerant? Y • Fast Queries? N • Ad Hoc access? N • Accessibility to commodity BI tools? N • Easy report authoring? N • Levels of Aggregation? N • Integrated Data? N • Big Data has solved the INFRASTRUCTURE of raw/core data storage but has provided less value to what BUSINESS users want for analytics.
  • 7. Data Gaps too! • Code, Developers vs. Analysts w/ Excel, Dashboards • MR, Rich Graph/Access vs. Simple 2D (tables, charts) • Hierarchical, Unstructured vs. Filtering and easy analytics
  • 8. Levels of Aggregation • SAME DATA AT VARIOUS LEVELS OF AGGREGATION: HUGELY IMPORTANT IN REAL LIFE IMPLEMENTATIONS! • (figure labels: 1 ROW • 10K • 1 MILLION TO 100 MILLION • 1 BILLION ROWS • 100 BILLION)
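A minimal sketch of what "same data at various levels of aggregation" can mean in practice. The event tuples, column order, and the `rollup` helper are illustrative assumptions, not taken from the slides; the point is only that one raw dataset yields several progressively smaller summaries.

```python
from collections import defaultdict

# Hypothetical raw events: (day, country, page, hits).
RAW_EVENTS = [
    ("2011-08-01", "US", "/home", 3),
    ("2011-08-01", "US", "/docs", 2),
    ("2011-08-01", "DE", "/home", 1),
    ("2011-08-02", "US", "/home", 4),
    ("2011-08-02", "DE", "/docs", 5),
]


def rollup(rows, key_indexes):
    """Sum the last column grouped by the chosen columns.

    Fewer grouping columns gives a coarser (smaller) aggregation level.
    """
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[-1]
    return dict(totals)


# Three levels over the same data, from finest to coarsest.
by_day_country = rollup(RAW_EVENTS, (0, 1))
by_day = rollup(RAW_EVENTS, (0,))
grand_total = rollup(RAW_EVENTS, ())
```

A dashboard would typically read from the coarsest level that still answers the question, and fall back to finer levels only for drill-down.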
  • 9. Architectures • NoSQL reports • NoSQL thru and thru • NoSQL + MySQL • NoSQL as ETL Source • NoSQL programs in BI Tools • NoSQL via BI Database (SQL)
  • 10. NoSQL reports • Pay Developer to build applications for reports (diagram label: Apps)
      – Advantages: 100% Richness of NoSQL • Up to date, current • Excellent performance on large datasets • Custom built, beautiful reports/dashboards • Single system to manage
      – Disadvantages: $$, developer driven process • No commodity BI tools • Managing rollups/summaries • Schema-less = Harder! • Hard to integrate other reporting information
  • 11. NoSQL thru and thru • Pay Developer to build FLEXIBLE applications for reports (diagram labels: Indices/Aggs, Advanced Apps)
      – Advantages: All of NoSQL report advantages • Managed aggregations, rollups • “Guided Adhoc” available inside application • Higher performance for dashboards/summaries
      – Disadvantages: $$, developer driven process • $$, app required for aggs • No commodity BI tools • Hard to integrate other reporting information • Limited AdHoc (only developer built combinations)
  • 12. NoSQL + MySQL • Pay Developer to build FLEXIBLE applications for reports (diagram labels: ETL App, MySQL)
      – Advantages: Less IT $$ since developers aren't “building reports” • Rich, NoSQL analysis left in place (ETL + NoSQL) • Easy, Ad Hoc reporting via commodity BI tools • Easier to understand data for self service reports
      – Disadvantages: Data freshness (24 hrs old) • Once into MySQL no rich NoSQL application use (M/R) • BI Tool can connect ONLY to data in MySQL, not NoSQL • Aggregations still self managed in MySQL
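The ETL step in this architecture could look roughly like the sketch below: summarize documents, then load the summary into a relational table that BI tools can query. Everything here is an assumption for illustration: the document fields, the `daily_events` table, and the use of Python's built-in `sqlite3` as a stand-in for MySQL so the example runs without a server.

```python
import sqlite3
from collections import Counter

# Hypothetical documents as they might come out of a NoSQL store.
DOCS = [
    {"user": "a", "event": "click", "day": "2011-08-01"},
    {"user": "b", "event": "click", "day": "2011-08-01"},
    {"user": "a", "event": "view", "day": "2011-08-01"},
    {"user": "c", "event": "click", "day": "2011-08-02"},
]


def summarize(docs):
    """Count events per (day, event); stands in for the batch job on the NoSQL side."""
    return Counter((d["day"], d["event"]) for d in docs)


def load(conn, summary):
    """Write the summary into a relational table, replacing rows for keys already loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_events ("
        "day TEXT, event TEXT, n INTEGER, PRIMARY KEY (day, event))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO daily_events VALUES (?, ?, ?)",
        [(day, event, n) for (day, event), n in summary.items()],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a MySQL connection
load(conn, summarize(DOCS))

# The kind of simple query a commodity BI tool would issue.
rows = conn.execute(
    "SELECT day, SUM(n) FROM daily_events GROUP BY day ORDER BY day"
).fetchall()
```

Only the summary reaches the relational side, which is why the slide lists data freshness and loss of rich NoSQL access as the trade-offs.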
  • 13. NoSQL as ETL Data Source • NoSQL treated like any other data source (diagram labels: Informatica, Teradata)
      – Advantages: Allows use of consolidated BI tool for AdHoc • Enables integrated (combined) datasets for reporting • Aggregations often “managed” • Best of Breed tools
      – Disadvantages: ETL Development Expense • Data Latency • Loss of NoSQL language richness • Traditional DW tools are $$ • Scaling issues with DW Database
  • 14. NoSQL programs in BI Tools • Write a program in BI tool that flattens data, output into report
      – Advantages: Rich use of NoSQL native language • Direct, up to date access • Access to 100% of dataset • Leverage “guided” report parameter pages • Less expensive than apps
      – Disadvantages: Developer required to write program ($$) • Slow-er (aggs, summaries) • Lacks integration with other datasets • Still (usually) no AdHoc access
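A sketch of the "program that flattens data" idea, assuming a nested order document; the field names and the `flatten` and `explode` helpers are hypothetical. Nested objects become dotted column names, and a nested list becomes one output row per element, which is the 2D shape a report expects.

```python
def flatten(doc, parent_key="", sep="."):
    """Turn nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def explode(doc, list_field):
    """Return one flat row per element of doc[list_field], repeating the parent fields."""
    parent = {k: v for k, v in doc.items() if k != list_field}
    rows = []
    for item in doc.get(list_field, []):
        row = flatten(parent)
        row.update(flatten(item, list_field))
        rows.append(row)
    return rows


# Hypothetical document from a document store.
ORDER = {
    "id": 7,
    "customer": {"name": "Ada", "city": "Oslo"},
    "items": [{"sku": "x1", "qty": 2}, {"sku": "y9", "qty": 1}],
}

rows = explode(ORDER, "items")
```

Each row in `rows` carries `id`, `customer.name`, `customer.city`, `items.sku` and `items.qty`, so it can be fed straight into a tabular report.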
  • 15. NoSQL via BI Database (SQL) • Enable NoSQL data access via SQL (gasp!) (diagram labels: Live Query; Cached, 24hr data)
      – Advantages: Easy reports, easy (SQL) • Integration with other data • ETL is simple INSERT/MERGEs • Live, up to date access • High performance, cached data • AdHoc access to Live + Cached • Aggregations/Summaries
      – Disadvantages: Another system in between • Still needs to be refreshed, nightly • Not all capabilities for NoSQL richness available via SQL
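One way to picture "AdHoc access to Live + Cached" plus "Integration with other data" is a single SQL view over both sources, joined to an ordinary relational table. This is a sketch under assumptions: `sqlite3` stands in for the BI database, `live_events` is a plain table here (in a real deployment it would be whatever mechanism the BI database offers for querying the NoSQL store directly), and all table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the BI database
conn.executescript(
    """
    CREATE TABLE cached_events (day TEXT, customer_id INTEGER, amount INTEGER);
    CREATE TABLE live_events   (day TEXT, customer_id INTEGER, amount INTEGER);
    CREATE TABLE customers     (customer_id INTEGER PRIMARY KEY, region TEXT);

    -- One logical table for report authors, regardless of where rows come from.
    CREATE VIEW all_events AS
        SELECT day, customer_id, amount FROM cached_events
        UNION ALL
        SELECT day, customer_id, amount FROM live_events;
    """
)

# Cached rows: loaded by the nightly refresh.
conn.executemany(
    "INSERT INTO cached_events VALUES (?, ?, ?)",
    [("2011-08-01", 1, 10), ("2011-08-01", 2, 20)],
)
# Live rows: what a query against the NoSQL store would return right now.
conn.executemany("INSERT INTO live_events VALUES (?, ?, ?)", [("2011-08-02", 1, 5)])
# Other, non-NoSQL data to integrate with.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "EMEA"), (2, "APAC")])

# An ad hoc query a BI tool could generate: NoSQL-derived facts joined to other data.
by_region = conn.execute(
    """
    SELECT c.region, SUM(e.amount)
    FROM all_events e
    JOIN customers c ON c.customer_id = e.customer_id
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()
```

The report author sees only `all_events` and `customers`; whether a row was cached or fetched live is hidden behind the view.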
  • 16. Mozilla: NoSQL thru and thru (DB) • Socorro Project: Crash reports, optionally sent to Mozilla • https://crash-stats.mozilla.com
  • 17. X: NoSQL via SQL • Using “Splunk” (i.e., a commercial NoSQL-eee data aggregator, etc.) • Desire to use Tableau for advanced analytics/visualization
  • 18. Meteor Solutions: NoSQL thru and thru • Using Cloudant BigCouch solution (SaaS) • High performance set of multi purpose indices on predefined aggregations • Up to date aggregation/reports • Better fit for Social Media graph structures over relational DB • Custom built BI applications (dashboards/reports) providing a flexible guided view through data (diagram label: Advanced Apps)
  • 19. A, B, C: NoSQL + MySQL • Many, many companies (3 we've worked with) • All “web related” companies (semi structured, some, mostly volume) • Heavy lifting and storage, and “ETL/Data preparation” inside Hadoop • Push summarized, aggregated data into MySQL for analysis by easy dashboarding/BI Tools (diagram labels: ETL App, MySQL)