Combining Big Data with Existing Analytics Technologies

In this session, we will explore successful approaches to securing initial quick wins with big data analytics pilot projects without boiling the ocean (data lake). Business intelligence and big data initiatives remain the No. 1 CIO priority for the second consecutive year. In this session we look at practical options to get started by combining existing data warehouse and OLAP assets with new Hadoop data sources.
- Share popular big data analytics use cases
- Discuss modern analytics solution architecture
- How to choose the right pilot project

Published in: Data & Analytics

  1. Kickstart Big Data: Combine Existing Analytics Assets with New Hadoop Data Sources
     Jen Underwood, Founder & Principal Consultant, Impact Analytix, LLC
     jen@impactanalytix.com
     www.impactanalytix.com
     Quickly make a positive impact
     © 2014 ImpactAnalytix, LLC
  2. Agenda
     Presenter: Jen Underwood
     Title: Kickstart Big Data: Combine Existing Analytics Assets with New Hadoop Data Sources
     Tagline: Use Big Data Technologies to Leverage Your Existing Data Warehouse and BI/Analytics (OLAP) Investments
     Abstract: Explore successful approaches to securing initial quick wins with big data analytics pilot projects without boiling the ocean (data lake). Business intelligence and big data initiatives remain the No. 1 CIO priority for the second consecutive year. In this session we look at practical options to get started by combining existing data warehouse and OLAP assets with new Hadoop data sources.
     - Share popular big data analytics use cases
     - Discuss modern analytics solution architecture
     - How to choose the right pilot project
     Key Takeaways:
     - First Step – Data Warehouse Modernization for speed, scale and outcomes
     - Next Step – Analytics Optimization for simplicity, alignment and value
     - The Goal – Advanced Analytics Platform for Big Data as a Service, Feature, Source
  3. Mega-Trends
     1. Rapid Growth of Big Data
     2. Cloud Computing and Advanced Cloud Services
     3. On Demand Services
     4. Virtualization
     5. Consumerization of IT Increases
     Source: http://www.burrus.com/resources/daniel-burrus-top-twenty-technology-driven-trends-for-2013/
  4. Living in the Age of Data Explosion
     - Exponential increase in unstructured data
     - New breed of highly distributed, elastic scale non-relational databases
     - Revolutionary market shift after 40 years of relational database dominance
     - Big data requires modernizing architecture and approach to analytics
  5. What is Big Data?
  6. Big Data Analytics ≠ Traditional BI with More Data
     Relational Data
     - Volume: petabytes; 10x increase every five years
     - Variety: structured & unstructured; 85% from new data types
     - Velocity: real time; batch & streaming
  7. Big Data Analytics ≠ Traditional BI with More Data
     - Big Data is redefining the processes of managing master data, data quality, and information lifecycle management
     - Big Data is NOT replacing EDW and OLAP; it supplements those investments
     - Big Data ecosystem includes a variety of analytic technologies
       • Columnar databases, JSON, and unstructured file stores
       • Hadoop and NoSQL platforms adding SQL, search, and streaming capabilities, while NoSQL platforms are adding MPP and transactional support
       • Data tiering that aggressively leverages SSD (Flash) and DRAM
     Source: Gartner
  8. Hadoop: Move Compute to the Data
     - Inspired by Google's MapReduce
     - Infrastructure to automatically scale out storage and distributed data processing on commodity hardware
  9. Hadoop: Move Compute to the Data
     Another way to think about this shift…
     Source: Datameer
  10. Hadoop: Move Compute to the Data
      Traditional RDBMS vs. MapReduce:
      - Data size: gigabytes (terabytes) vs. petabytes (exabytes)
      - Access: interactive and batch vs. batch
      - Updates: read/write many times vs. write once, read many times
      - Structure: static schema vs. dynamic schema
      - Integrity: high (ACID) vs. low
      - Scaling: nonlinear vs. linear
      - DBA ratio: 1:40 vs. 1:3000
      Source: Tom White's Hadoop: The Definitive Guide
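To make the MapReduce side of this comparison concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word count as the example. It is an illustration only and does not use Hadoop's API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would run the map and reduce calls in parallel on the nodes that hold the data.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group emitted values by key, as the framework does between phases."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because each `map_phase` call sees only its own line and each `reduce_phase` call sees only its own key, the work can be split across machines without coordination, which is what the "linear scaling" row refers to.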
  11. Process Shift from Schema First to Schema Later
      Schema first (SLOW VALUE FROM DATA):
      1. Data arrives
      2. Derive schema
      3. Cleanse data
      4. Transform
      5. Load to EDW
      6. Analyze
      Schema later (RAPID VALUE FROM DATA):
      1. Data arrives
      2. Load to Hadoop
      3. Analyze
      4. Subsets of data loaded to EDW
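The "schema later" flow can be sketched in a few lines: raw records are stored untouched, and each analysis applies its own schema when it reads them. This is a minimal illustration, not a Hadoop or Hive implementation. The sample events, the field names, and the `read_with_schema` helper are all assumptions made up for the example.

```python
import json

# Raw events are kept exactly as they arrived; fields vary per record.
RAW_EVENTS = [
    '{"user": "a1", "action": "click", "ms": "120"}',
    '{"user": "b2", "action": "view"}',
    '{"user": "a1", "action": "click", "ms": "95", "campaign": "spring"}',
]


def read_with_schema(raw_lines, schema):
    """Project each raw record onto `schema` ({field: caster}) at read time.

    Fields missing from a record come back as None instead of failing the load.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# Two analyses read the same raw data through two different schemas.
latency_view = list(read_with_schema(RAW_EVENTS, {"user": str, "ms": int}))
campaign_view = list(read_with_schema(RAW_EVENTS, {"user": str, "campaign": str}))
```

Nothing had to be modeled before loading, and a new question only needs a new schema dictionary rather than a reload of the data.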
  12. Modern Analytics Architecture
  13. Modern Data Warehousing
  14. Changes in Data Warehousing Patterns
      - Free up the EDW from low value tasks
      - Keep 100% of the source data and historical data
      - Explore and mine data with "schema on read"
      - Tier storage by data temperature:
        • Hadoop – Cold Data (including non-relational data)
        • MPP/Columnar – Warm Data
        • In-Memory – Hot Data
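One way to picture the hot, warm, and cold split is a routing rule that picks a tier per dataset. The sketch below assumes temperature is judged by days since last access, which the slide does not state, and the thresholds `HOT_DAYS` and `WARM_DAYS` are invented values for illustration only.

```python
from datetime import date

# Hypothetical thresholds; the deck does not define what counts as hot or warm.
HOT_DAYS = 30
WARM_DAYS = 365


def storage_tier(last_accessed: date, today: date) -> str:
    """Return the tier for a dataset based on days since it was last accessed."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_DAYS:
        return "in-memory"      # hot data
    if age_days <= WARM_DAYS:
        return "mpp-columnar"   # warm data
    return "hadoop"             # cold data
```

In practice the rule might also weigh query frequency or business value, but the shape stays the same: the EDW and in-memory layers hold only what is actively used, and everything else stays in Hadoop.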
  15. Changes in Data Warehousing Patterns
      - Non-relational data (social apps, sensor and RFID, mobile apps, web apps) is stored in Hadoop
      - Relational and OLAP data (traditional schema-based data warehouse applications) is stored in the EDW
      - HDFS bridge: an enhanced query engine with external table, external data source, and external file format definitions returns results through regular T-SQL
      - Basically adding a "bridge" to Big Data from your existing investments
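The idea of the bridge is that data kept outside the warehouse can be joined to warehouse tables with ordinary SQL. The sketch below imitates that with Python's built-in `sqlite3` module as a stand-in. It is not T-SQL and not a real HDFS bridge: a real external table would be read in place, whereas here the file contents are copied into a temporary table. The table names, the `HDFS_EXPORT` sample, and the `clicks_by_customer` function are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for a delimited file sitting in HDFS (made-up sample data).
HDFS_EXPORT = "customer_id,clicks\n1,40\n2,7\n1,5\n"


def clicks_by_customer(export_text: str) -> list:
    """Join a warehouse table with rows that originate outside the warehouse."""
    conn = sqlite3.connect(":memory:")
    # The existing warehouse dimension table.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    # Stand-in for the external table: file rows exposed under a table name.
    conn.execute("CREATE TEMP TABLE ext_clicks (customer_id INTEGER, clicks INTEGER)")
    rows = csv.DictReader(io.StringIO(export_text))
    conn.executemany(
        "INSERT INTO ext_clicks VALUES (?, ?)",
        [(int(r["customer_id"]), int(r["clicks"])) for r in rows],
    )
    # From here on it is regular SQL across both sources.
    result = conn.execute(
        "SELECT c.name, SUM(e.clicks) FROM customers c "
        "JOIN ext_clicks e ON e.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()
    conn.close()
    return result
```

The point for a pilot is that analysts keep writing the SQL they already know, and only the table definition knows the data lives elsewhere.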
  16. Changes in Data Warehousing Patterns
      Big Data storage, aka Data Lake, is characterized by three key attributes:
      - Collect everything: a data lake contains all data, both raw sources over extended periods of time as well as any processed data
      - Dive in anywhere: a data lake enables users across multiple business units to refine, explore and enrich data on their terms
      - Flexible access: a data lake enables multiple data access patterns across a shared infrastructure: batch, interactive, online, search, in-memory and other processing engines
  17. Changes in Data Warehousing Patterns
      Modern MPP, Columnar and Visual Analytics Innovations:
      - Nature of Hadoop data access: historically, querying Hadoop entailed complex Java, and results came from slow batch processes, so improved tools were built to expedite Hadoop data access
      - External tables, compression, HDFS, Hive, other means: easy visual analytics tools use business-user-friendly means to access Hadoop data and often bring that data into an in-memory cache for rapid data analysis
      - Materialized Views "v2" and analytic functions: big data visual analytic tools improve upon traditional view techniques to bring big data into memory or on chip and intelligently, automatically re-use and refresh those views
  18. Why Now? What’s the big deal?
      “By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent.” – Gartner, Mark Beyer, Information Management in the 21st Century
  19. Big Data Adoption
      Source: 2014 IDG Enterprise Big Data Research. An online survey of 46 questions was used with 751 respondents randomly selected from CIO, Computerworld, CSO, InfoWorld, ITworld, and Network World subscribers, e-mail subscription lists and LinkedIn forums.
  20. Big Data Changing the Landscape
      - Beyond hype, it is imperative to understand when it is time to embrace a technology-enabled trend in its formative stages
      - Organizations are already vastly improving the quality and speed of decision making; big data is a competitive need to thrive
      - Look around you… ALL the major database vendors and analytics software providers are evolving their solution offerings for big data sources
      - New analytical solutions easily and quickly unlock the value in big data
  21. Big Data Today
  22. Areas of Business Intelligence Tools
      Source: http://www.b-eye-network.com/blogs/eckerson/archives/2013/03/a_guide_for_bi.php
  23. Unlocking the Value of Big Data
      - Today’s easy visual analytics and integration tools empower the business to make smarter decisions and generate more value from more data
      - Fast, direct, agile access to big data to analyze in-place, blend with EDW, OLAP and personal data sources, decreasing long BI backlogs for faster actionable insight
      - Less need to move large volumes of data between platforms just to ask new questions or perform predictive analytics
  24. Integrate Predictive Intelligence
      Transform business using “Smart” Apps and Reports
      - Analytic tool specific integration options
      - In-Database Predictive UDF Functions and Predictive Queries
      - PMML to exchange models
      - Programming with APIs
  25. Unlocking the Value of Big Data
      - Hunk
      - Many Others…
  26. Tableau
  27. Datameer
  28. Platfora
  29. SAS Visual Analytics
  30. Excel 2013
  31. Demos
  32. Choosing a Pilot Project
      Secure practical, initial quick wins without boiling the ocean (data lake)
  33. How to Start
      1. Develop Roadmap
      2. Plan to invest in modern infrastructure as a long-term equal analytic partner for traditional EDW and BI assets
      3. Develop a "skills matrix" and staffing plan
      4. Identify and prioritize projects that present only one or two of the extreme data challenges (volume, variety, velocity, or complexity) and include visual analytics where the business can immediately see value
      5. Gradually invest in training by partnering with experts and adding staff as needed
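Step 4 above amounts to a simple screening rule, which can be sketched as a filter over candidate projects. The `Candidate` class, its fields, and the `shortlist` function are hypothetical names for illustration, and the ordering (fewest challenges first) is an assumption about what "prioritize" could mean here.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The four extreme data challenges named in step 4.
CHALLENGES = {"volume", "variety", "velocity", "complexity"}


@dataclass
class Candidate:
    name: str
    challenges: set[str] = field(default_factory=set)
    has_visual_analytics: bool = False


def shortlist(candidates: list[Candidate]) -> list[Candidate]:
    """Keep projects with one or two challenges and a visual deliverable."""
    picks = [
        c
        for c in candidates
        if 1 <= len(c.challenges & CHALLENGES) <= 2 and c.has_visual_analytics
    ]
    # Fewest challenges first, then alphabetical for a stable order.
    return sorted(picks, key=lambda c: (len(c.challenges & CHALLENGES), c.name))
```

A project that touches three or four challenges at once, or that has nothing the business can look at, drops out of the pilot list and can wait for a later phase of the roadmap.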
  34. Why Visual Analytics
      1. Let’s face it… if the business can see it, they can immediately recognize the value
      2. Easy: a few weeks to a month to highly visible results the business can understand and truly appreciate (much more so than a cold data backup!)
      3. Choose a project with measurable ROI for a specific, valued business use case
         a. Outline the goals of a big data pilot
         b. Get assistance from experts to reduce and fast-track the learning curve and ensure initial successes
         c. Sell the vision with the end result imagery
         d. Start with a small, specific area involving one big data “V”, BUT large enough that people care about the results
  35. Key Takeaways
      Don’t be left behind. Get started now.
      - First Step – Data Warehouse Modernization for speed, scale and outcomes
      - Next Step – Analytics Optimization for simplicity, alignment and value
      - The Goal – Advanced Analytics Platform for Big Data as a Service, Feature, Source