No sql now2011_review_of_adhoc_architectures
Usage Rights

© All Rights Reserved

No sql now2011_review_of_adhoc_architectures Presentation Transcript

  • BI/Analytics for NoSQL: Review of Architectures
  • What we'll answer in 50 minutes
      • Who is this guy?
      • How do I enable AdHoc, self-service reporting on NoSQL?
      • How do I improve the performance of dashboards on top of NoSQL?
      • How do I integrate NoSQL data with my other data that is not inside NoSQL?
      • How do I enable easy-to-build simple reports but also preserve the ability to run rich NoSQL queries?
  • Nicholas Goodman
      • Open Source BI thought leader
          – 50+ Open Source BI customer projects
          – Blogger, whitepapers, etc.
      • Entrepreneur
          – DynamoBI Corporation
          – Bayon Technologies, Inc.
      • Data geek, hacker, tinkerer, committer
      GOAL: Share perspectives, research, opinions.
      DISCLAIMER: Your Mileage ...
  • How do we answer those Qs?
  • Promise of “Big Data”
      • NoSQL/Hadoop/MapReduce systems
          – Keep more of it
          – Cost effective analysis
          – “Massive scale” data, now accessible to everyone (elastic)
          – Not just SQL queries, more complex analysis
      ACCOMPLISHED: WEB SCALE, MASSIVE NEVER BEFORE SEEN SCALE OF DATA STORAGE AND PROCESSING
  • Reality Check!
      • Petabytes? Y
      • Cheap Storage? Y
      • Raw Processing? Y
      • Rich Query Languages? Y
      • Flexible data structures? Y
      • Reliable, Fault Tolerant? Y
      • Fast Queries? N
      • Ad Hoc access? N
      • Accessibility to commodity BI tools? N
      • Easy report authoring? N
      • Levels of Aggregation? N
      • Integrated Data? N
      Big Data has solved the INFRASTRUCTURE of raw/core data storage but has provided less value to what BUSINESS users want for analytics.
  • Data Gaps too!
      • Code, Developers vs. Analysts w/ Excel, Dashboards
      • MR, Rich Graph/Access vs. Simple 2D (tables, charts)
      • Hierarchical, Unstructured vs. Filtering and easy analytics
  • Levels of Aggregation
      SAME DATA AT VARIOUS LEVELS OF AGGREGATION: HUGELY IMPORTANT IN REAL LIFE IMPLEMENTATIONS!
      (Figure: row counts at successive levels of aggregation; labels recovered: 1 row, 10K, 1 million to 100 million, 1 billion rows, 100 billion)
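A minimal sketch of what "same data at various levels of aggregation" can look like in practice. The event records, field layout, and the `rollup` helper are illustrative assumptions, not taken from the slides; the point is only that each coarser level is derived from the same detail rows and is much smaller to query.

```python
from collections import Counter

# Hypothetical raw events: (ISO timestamp, page) pairs standing in for the
# detail rows of a much larger dataset.
events = [
    ("2011-08-01T10:15:00", "/home"),
    ("2011-08-01T11:30:00", "/home"),
    ("2011-08-02T09:00:00", "/pricing"),
    ("2011-09-05T14:45:00", "/home"),
]

def rollup(rows, key_len):
    """Count events per (time bucket, page); key_len trims the timestamp."""
    return Counter((ts[:key_len], page) for ts, page in rows)

daily = rollup(events, 10)    # bucket = YYYY-MM-DD
monthly = rollup(events, 7)   # bucket = YYYY-MM
total = len(events)           # the single-row, fully aggregated level
```

A dashboard would read `monthly` or `daily`, and only drill into `events` when a user asks for detail.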
  • Architectures
      • NoSQL reports
      • NoSQL thru and thru
      • NoSQL + MySQL
      • NoSQL as ETL Source
      • NoSQL programs in BI Tools
      • NoSQL via BI Database (SQL)
  • NoSQL reports
      • Pay a developer to build applications for reports (diagram label: Apps)
      • Advantages
          – 100% richness of NoSQL
          – Up to date, current
          – Excellent performance on large datasets
          – Custom built, beautiful reports/dashboards
          – Single system to manage
      • Drawbacks
          – $$, developer driven process
          – No commodity BI tools
          – Managing rollups/summaries
          – Schema-less = Harder!
          – Hard to integrate other reporting information
  • NoSQL thru and thru
      • Pay a developer to build FLEXIBLE applications for reports (diagram labels: Indices Advanced Aggs Apps)
      • Advantages
          – All of the NoSQL report advantages
          – Managed aggregations, rollups
          – “Guided AdHoc” available inside the application
          – Higher performance for dashboards/summaries
      • Drawbacks
          – $$, developer driven process
          – $$, app required for aggs
          – No commodity BI tools
          – Hard to integrate other reporting information
          – Limited AdHoc (only developer built combinations)
  • NoSQL + MySQL
      • Pay a developer to build FLEXIBLE applications for reports (diagram labels: ETL App, MySQL)
      • Advantages
          – Less IT $$ since developers aren't “building reports”
          – Rich NoSQL analysis left in place (ETL + NoSQL)
          – Easy, Ad Hoc reporting via commodity BI tools
          – Easier to understand data for self service reports
      • Drawbacks
          – Data freshness (24 hrs old)
          – Once into MySQL, no rich NoSQL application use (M/R)
          – BI tool can connect ONLY to data in MySQL, not NoSQL
          – Aggregations still self managed in MySQL
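A rough sketch of the "ETL App" step in this architecture: summarize documents on the NoSQL side, then push the flat summary into a relational table for BI tools. Everything here is an assumption made for illustration: the document shape, the `country_summary` table, and the use of Python's built-in `sqlite3` as a stand-in for MySQL so the example runs without a server.

```python
import sqlite3
from collections import defaultdict

# Hypothetical documents as they might come out of a NoSQL store.
docs = [
    {"user": "a", "country": "US", "purchases": [{"amount": 10.0}, {"amount": 5.0}]},
    {"user": "b", "country": "US", "purchases": [{"amount": 7.5}]},
    {"user": "c", "country": "DE", "purchases": []},
]

def summarize(documents):
    """Aggregate step (NoSQL side): users and revenue per country."""
    summary = defaultdict(lambda: {"users": 0, "revenue": 0.0})
    for doc in documents:
        row = summary[doc["country"]]
        row["users"] += 1
        row["revenue"] += sum(p["amount"] for p in doc["purchases"])
    return summary

def load(conn, summary):
    """Load step: replace the flat summary table that BI tools will query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS country_summary "
        "(country TEXT PRIMARY KEY, users INTEGER, revenue REAL)"
    )
    conn.execute("DELETE FROM country_summary")
    conn.executemany(
        "INSERT INTO country_summary VALUES (?, ?, ?)",
        [(c, r["users"], r["revenue"]) for c, r in summary.items()],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # sqlite3 stands in for MySQL here
load(conn, summarize(docs))
report = conn.execute(
    "SELECT country, users, revenue FROM country_summary ORDER BY country"
).fetchall()
```

The trade-off from the slide shows up directly: `report` is easy for any SQL tool to consume, but it only reflects the documents as of the last `load` call, and the nested purchase detail is gone.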
  • NoSQL as ETL Data Source
      • NoSQL treated like any other data source (diagram labels: Informatica, Teradata)
      • Advantages
          – Allows use of a consolidated BI tool for AdHoc
          – Enables integrated (combined) datasets for reporting
          – Aggregations often “managed”
          – Best of Breed tools
      • Drawbacks
          – ETL development expense
          – Data latency
          – Loss of NoSQL language richness
          – Traditional DW tools are $$
          – Scaling issues with DW database
  • NoSQL programs in BI Tools
      • Write a program in the BI tool that flattens data and outputs it into a report
      • Advantages
          – Rich use of NoSQL native language
          – Direct, up to date access
          – Access to 100% of dataset
          – Leverage “guided” report parameter pages
          – Less expensive than apps
      • Drawbacks
          – Developer required to write program ($$)
          – Slow-er (aggs, summaries)
          – Lacks integration with other datasets
          – Still (usually) no AdHoc access
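One way to picture the "program that flattens data" is a small function that turns a nested document into flat report rows. This is a sketch under assumptions: the `flatten` helper, the dotted column naming, and the sample order document are all hypothetical, and a real BI tool would have its own scripting hooks for this.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted column names; lists fan out into rows."""
    rows = [{}]
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            parts = flatten(value, name + ".")
        elif isinstance(value, list):
            # Each list element becomes its own row; an empty list keeps one row.
            parts = [p for item in value for p in flatten({key: item}, prefix)]
            parts = parts or [{name: None}]
        else:
            parts = [{name: value}]
        # Combine the columns gathered so far with every row for this key.
        rows = [{**row, **part} for row in rows for part in parts]
    return rows

# Hypothetical hierarchical document.
order = {
    "order_id": 7,
    "customer": {"name": "Ann", "country": "US"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
}
rows = flatten(order)  # one flat row per line item
```

Note that a document with two independent lists produces the cross product of their elements, which is one reason this approach gets slow-er for aggregates and summaries.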
  • NoSQL via BI Database (SQL)
      • Enable NoSQL data access via SQL (gasp!) (diagram labels: Live Query; Cached, 24hr data)
      • Advantages
          – Easy reports, easy (SQL)
          – Integration with other data
          – ETL is simple INSERT/MERGEs
          – Live, up to date access
          – High performance, cached data
          – AdHoc access to Live + Cached
          – Aggregations/Summaries
      • Drawbacks
          – Another system in between
          – Still needs to be refreshed, nightly
          – Not all capabilities for NoSQL richness available via SQL
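The difference between "Live Query" and "Cached, 24hr data" can be sketched as follows. This is an illustration only: a Python dict stands in for the NoSQL system, `sqlite3` stands in for the BI database, and the table and function names are made up. SQLite's `INSERT OR REPLACE` plays the role of the slide's INSERT/MERGE, and "live" is simplified to re-reading the store at query time.

```python
import sqlite3

# Hypothetical document store; a dict keyed by id stands in for the NoSQL system.
store = {
    1: {"page": "/home", "hits": 3},
    2: {"page": "/pricing", "hits": 1},
}

conn = sqlite3.connect(":memory:")  # stands in for the BI database
conn.execute(
    "CREATE TABLE hits_cached (id INTEGER PRIMARY KEY, page TEXT, hits INTEGER)"
)

def refresh_cache():
    """Nightly-style refresh: merge current documents into the cached table."""
    conn.executemany(
        "INSERT OR REPLACE INTO hits_cached (id, page, hits) VALUES (?, ?, ?)",
        [(k, d["page"], d["hits"]) for k, d in store.items()],
    )
    conn.commit()

def total_hits(live=False):
    """Answer one SQL question from cached rows, or fresh rows when live=True."""
    if live:
        refresh_cache()  # simplification: 'live' re-reads the store at query time
    return conn.execute("SELECT SUM(hits) FROM hits_cached").fetchone()[0]

refresh_cache()
store[1]["hits"] = 10          # the source changes after the refresh
cached = total_hits()          # answers from the older cached rows
live = total_hits(live=True)   # picks up the change
```

Either way the report author writes plain SQL; the choice is between the speed of the cached table and the freshness of going back to the source.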
  • Mozilla: NoSQL thru and thru (DB)
      • Socorro Project: crash reports, optionally sent to Mozilla
      • https://crash-stats.mozilla.com
  • X: NoSQL via SQL
      • Using “Splunk” (i.e., a commercial NoSQL-eee data aggregator/etc.)
      • Desire to use Tableau for advanced analytics/visualization
  • Meteor Solutions: NoSQL thru and thru
      • Using Cloudant BigCouch solution (SaaS)
      • High performance set of multi-purpose indices on predefined aggregations
      • Up to date aggregations/reports
      • Better fit for social media graph structures than a relational DB
      • Custom built BI applications (dashboards/reports) providing a flexible, guided view through the data (diagram label: Advanced Apps)
  • A, B, C: NoSQL + MySQL
      • Many, many companies (3 we've worked with)
      • All “web related” companies (semi structured, some, mostly volume)
      • Heavy lifting and storage, and “ETL/data preparation”, inside Hadoop
      • Push summarized, aggregated data into MySQL for analysis by easy dashboarding/BI tools (diagram labels: ETL App, MySQL)