Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Raj Babu of AgileIss

Why and how the Big Data based Enterprise Data Lake solution, built on No-SQL and SQL technologies, has become significantly more effective at solving enterprise data challenges than its predecessor, the EDW, which tried and failed to solve the same problems using SQL databases only.

Published in: Technology

  1. DATA LAKE - RE BIRTH OF ENTERPRISE DATA THINKING. Making Big Data meaningful for all enterprises. By Raj Babu, Raj@AgileiSS.com, WWW.AGILEISS.COM. Hadoop is not for a select few, but for all enterprises.
  2. About Agile iSS: We are a BI and analytics services company serving clients in Big Data, Data Lake, BI, BI on Cloud, and BI/Analytics as a Service. Our goal is to make Big Data meaningful for all enterprises. We focus on helping clients upgrade expensive, ineffective BI solutions built on old technology to powerful, effective BI and analytics solutions with a lower TCO.
  3. Enterprise Data Lake (EDL): I have just two goals for my 25-minute presentation today, which are to convince you of the following. Big Data is not a solution only for the select few enterprises that have hundreds of TBs or ZBs of data. Big Data, through the Enterprise Data Lake (EDL), is now mainstream and should be part of the standard IT stack for all mid-size and large enterprises. EDL makes enterprise BI systems more agile, nimble, economical, and valuable.
  4. Why is an Enterprise Data Lake solution (based on Big Data and No-SQL technology), combined with traditional BI, significantly more effective as an enterprise BI and analytics solution than its predecessor, the EDW, which has tried and failed over the last two decades?
  5. Why EDW Failed? If you Google "Challenges with EDW", you will get something like this: • It takes too long to get anything done • BI is too expensive to build and manage, and never on the schedule the business wants • Our BI team and system can't implement changes fast • Over-complicated architecture • Our BI can't do anything ad hoc: it needs requirements, design, architecture, and ETL for everything, and it never gets done after all • Our BI is always incomplete and never has all the data we need • Our BI is not suitable for ad-hoc analytics
  6. Why EDW Failed, and Why EDL Is Taking Over: It is extremely expensive and practically impossible to gather requirements, design, build ETL, and store all the data needed in an EDW and data marts (DM). EDWs and data marts are optimized for data analysis by processing and storing only subsets of datasets. An EDL is designed to "RETAIN ALL DATASETS". This is the single most powerful feature of EDL, because we can never know the complete future scope of the datasets needed for analytics.
  7. Why EDL clearly wins over EDW: • Services ad-hoc requests with no latency and no development • Inexpensive, with low maintenance cost, as there is no or very minimal build effort • Minimal development team involvement, unless data is needed in a data mart • All data is in the Data Lake • Ad-hoc access to any new data, with no need for an SDLC • No more waiting: the perfect place to offload all new and ad-hoc requests • In EDL, ETL or a database is not needed for reporting or analytics • No heavy-duty ETL
  8. What is a Data Lake? If you Google "Data Lake", you will get the following results. Wiktionary: "data lake: A massive, easily accessible data repository built on (relatively) inexpensive computer hardware for storing 'Big Data'." TechTarget: "A data lake is a large object-based storage repository that holds data in its native format until it is needed." Etymology: Pentaho CTO James Dixon is credited with coining the term "data lake", as he described it in his blog entry.
  9. What is a Data Lake? (cont.) From Wiktionary: Pentaho CTO James Dixon described it in his blog entry: "If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples."
  10. What the Data Lake has to Offer [Diagram: EDL image by PWC. Labels: ETL; EDL, ODS, Warm Archive; Data Marts. Callout on the EDL: "In here all kinds of analytics happen": 85% analytics, 15% prototype reporting.]
  11. Is EDL a Product or a Tool? EDL is really a reference architecture for the enterprise BI solution, using Hadoop-based Big Data as the foundation. Many leading DB vendors now see EDL as a clear winner and are incorporating it in their offerings, calling it a Data Hub.
  12. Enterprise Data Lake (EDL) On-Premise Reference Architecture for BI & Analytics [Diagram labels: Enterprise Data; Traditional ETL; Big Data ETL; Data Lake on Hadoop (Hortonworks, Cloudera, MapR); Meta Data; Data Marts; Direct Analytics & Reporting; Analytics & Data Scientist]
  13. Enterprise Data Lake (EDL) On-Premise Reference Architecture for BI & Analytics, Stack View [Diagram labels: Enterprise Data; Traditional ETL; Data Lake on Hadoop (Hortonworks, Cloudera, MapR); Meta Data; Data Marts (three shown); Analytics & Data Scientist]
  14. Reference Architecture for EDL on Cloud or Hybrid
  15. Your EDL can be the following: • A central enterprise data repository (ODS, Data Hub) • A staging source for all systems • A warm and active data archive/vault • A Hadoop data warehouse
  16. Your EDL can serve the following: • Anyone and everyone who is impatient about getting their hands on data • Those who can't give requirements but wanted reports yesterday • Those who have no patience for ETL or report development • Analytics and data science teams • ETL teams, for staging • Teams that want to avoid buying DB capacity to store all data in the BI database • Workloads whose data volume is too high to process through a regular DB
  17. Who is supporting Data Lake or Data Hub?
  18. Explore EDL: there is nothing to lose. With EDL there is no need for expensive ETL, databases, and the long delays associated with your BI & analytics platform.
  19. Questions? Email: Raj@AgileiSS.com. Thanks, Raj Babu. www.AgileiSS.com
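
The "retain all datasets" idea in slide 6 and the ad-hoc access described in slide 7 can be illustrated with a minimal sketch. This is not the presenter's implementation: a local temporary directory stands in for a Hadoop-based lake, and the helper names (`land_raw`, `read_records`, `total_by_customer`), the file names, and the sample records are all hypothetical. It only shows the general pattern of storing data in its native format and applying structure when a question is asked.

```python
import csv
import json
import tempfile
from pathlib import Path


def land_raw(lake: Path, source: str, filename: str, payload: str) -> Path:
    """Store a dataset in the lake exactly as received (no ETL, no schema)."""
    target = lake / source / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(payload, encoding="utf-8")
    return target


def read_records(path: Path):
    """Apply structure only at read time, based on the file's native format."""
    if path.suffix == ".jsonl":
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                if line.strip():
                    yield json.loads(line)
    elif path.suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as fh:
            yield from csv.DictReader(fh)


def total_by_customer(lake: Path) -> dict:
    """An ad-hoc question asked across every dataset retained in the lake."""
    totals = {}
    for path in sorted(lake.rglob("*")):
        if path.is_file():
            for rec in read_records(path):
                customer = rec["customer"]
                totals[customer] = totals.get(customer, 0.0) + float(rec["amount"])
    return totals


# Hypothetical sample data: two sources, two native formats, no upfront modeling.
with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp)
    land_raw(lake, "web_orders", "2015-06.jsonl",
             '{"customer": "acme", "amount": 120.5}\n'
             '{"customer": "globex", "amount": 80}\n')
    land_raw(lake, "store_orders", "2015-06.csv",
             "customer,amount\nacme,30\n")
    totals = total_by_customer(lake)
    print(totals)
```

The design choice to note is that nothing is discarded or reshaped on the way in, so a question nobody anticipated can still be answered later by writing a new reader rather than a new ETL pipeline. In a real deployment the parsing and aggregation would typically run on the Hadoop stack rather than in a single Python process.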
