QlikView & Big Data


Slides used during the presentation and demonstration 'QlikView & Big Data' at the Business Discovery World Tour on 9 October 2013 by Mischa van Werkhoven and Michael Robertshaw.

Big Data. We've all heard about it. We all think we should do something with it. But do we know exactly what it is and how to create value from it? How reasonable are our expectations? This session focuses on the myths of Big Data, the technologies involved, and how QlikView can be used to add relevance and context to Big Data for the end user.


  1. QlikView & Big Data. Mischa van Werkhoven, Senior Solution Architect, QlikTech; Michael Robertshaw, Senior Solution Architect, QlikTech
  2. Legal Disclaimer This Presentation contains forward-looking statements, including, but not limited to, statements regarding the value and effectiveness of QlikTech's products, the introduction of product enhancements or additional products and QlikTech's growth, expansion and market leadership, that involve risks, uncertainties, assumptions and other factors which, if they do not materialize or prove correct, could cause QlikTech's results to differ materially from those expressed or implied by such forward-looking statements. All statements, other than statements of historical fact, are statements that could be deemed forward-looking statements, including statements containing the words "predicts," "plan," "expects," "anticipates," "believes," "goal," "target," "estimate," "potential," "may," "will," "might," "could," and similar words. QlikTech intends all such forward-looking statements to be covered by the safe harbor provisions for forward-looking statements contained in Section 21E of the Exchange Act and the Private Securities Litigation Reform Act of 1995.
Actual results may differ materially from those projected in such statements due to various factors, including but not limited to: risks and uncertainties inherent in our business; our ability to attract new customers and retain existing customers; our ability to effectively sell, service and support our products; our ability to manage our international operations; our ability to compete effectively; our ability to develop and introduce new products and add-ons or enhancements to existing products; our ability to continue to promote and maintain our brand in a cost-effective manner; our ability to manage growth; our ability to attract and retain key personnel; the scope and validity of intellectual property rights applicable to our products; adverse economic conditions in general and adverse economic conditions specifically affecting the markets in which we operate; and other risks more fully described in QlikTech's publicly available filings with the Securities and Exchange Commission. Past performance is not necessarily indicative of future results. The forward-looking statements included in this presentation represent QlikTech's views as of the date of this presentation. QlikTech anticipates that subsequent events and developments will cause its views to change. QlikTech undertakes no intention or obligation to update or revise any forward-looking statements, whether as a result of new information, future events or otherwise. These forward-looking statements should not be relied upon as representing QlikTech's views as of any date subsequent to the date of this presentation. This Presentation should be read in conjunction with QlikTech's periodic reports filed with the SEC (SEC Information), including the disclosures therein of certain factors which may affect QlikTech’s future performance. 
Individual statements appearing in this Presentation are intended to be read in conjunction with and in the context of the complete SEC Information documents in which they appear, rather than as stand-alone statements. © 2013 Qlik Technologies Inc. All rights reserved. QlikTech and QlikView are trademarks or registered trademarks of Qlik Technologies Inc. or its subsidiaries in the U.S. and other countries. Other company names, product names and company logos mentioned herein are the trademarks, or registered trademarks of their owners.
  3. Key Takeaways • The Most Common Purpose of Big Data Is to Produce Small Data • Big Data is About Relevance and Context • Know What You Want to Achieve
  4. Agenda • What is Big Data? • Myths about Big Data • Gartner – Hype Cycle – Top Challenges • Who’s doing it? • What technologies are they using? • Hadoop Components • The Bloor Group – The Intelligent Thing – Cost vs Benefit • How to do it using QlikView • Demonstration “Big Data Analytics refers to analytics on data that is not able to be performed on a standard relational data warehouse in a timeframe and cost that is acceptable for its business purpose”
  5. What is Big Data?
  6. Take Action: Open Data Challenge http://www.qlikview.com/us/landing/open-data-challenge
  7. Big Data happens in every part of History • Paper: medium to write ideas and information; not enough writers to disseminate • Print: technology to distribute information; no place to store • Computer: place to store; can’t keep up with computing requirements • Internet: distributed computing globally; too many Emails to read • We always create more than we can consume!
  8. Success characterized by: Veracity, Visualization, Value. Data characterized by:
  9. The Myth of Big Data
  10. In Many Cases, Reality Looks More Like This
  11. Hype Cycle: Big Data, In-Memory Analytics
  12. Gartner – Top Big Data Challenges • You need to determine your goals/objectives • QlikView may help you with these challenges
  13. Who is doing it?
  14. Who is doing it? • Telco (What: Usage and Location Analysis, Customer Interactions, Services Data Analysis; Why: Operational Excellence) • Financial Services (What: Trading Analysis, Portfolio Analysis; Why: Improve Profit, Minimize Risk) • Utilities (What: Smart Metering Analysis; Why: Operational Excellence) • Travel and other Retail (What: Cross Sell Opportunity Realisation; Why: Increase Sales) • Customer Behaviour (What: Click Stream Analysis, Location Analysis, Social Media Sentiment Analysis; Why: Customer Experience, Loyalty, Increase Sales)
  15. What technologies are they using?
  16. What Technologies? • Hadoop – Cloudera Hadoop – HortonWorks Hadoop – Teradata Aster • Relational Technologies – Teradata – HP Vertica – IBM Netezza – EMC GreenPlum – Amazon Redshift (Postgres)
  17. Hadoop Overview • Hive: 3h17m with ODBC 2.0, 51s with ODBC 2.5 (232x faster) • Impala: 9m7s with ODBC 2.0, 11s with ODBC 2.5 (50x faster)
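
The improvement factors on slide 17 can be checked with simple arithmetic. The following is a minimal sketch that converts the quoted durations to seconds and divides the ODBC 2.0 time by the ODBC 2.5 time; the helper names `to_seconds` and `speedup` are illustrative and not part of the presentation.

```python
import re

# Seconds represented by each unit suffix used on the slide.
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(duration: str) -> int:
    """Convert a duration such as '3h17m' or '9m7s' to whole seconds."""
    parts = re.findall(r"(\d+)([hms])", duration)
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


def speedup(before: str, after: str) -> float:
    """Ratio of the older run time to the newer run time."""
    return to_seconds(before) / to_seconds(after)


# Timings as quoted on the slide: (ODBC 2.0, ODBC 2.5).
timings = {
    "Hive": ("3h17m", "51s"),
    "Impala": ("9m7s", "11s"),
}

for engine, (odbc_20, odbc_25) in timings.items():
    print(f"{engine}: {speedup(odbc_20, odbc_25):.0f}x faster")
```

Rounded to whole numbers, the ratios match the 232x and 50x figures quoted on the slide.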
  18. Big Data Expectations
  19. How Reasonable are your Expectations? [Chart: Notebook HDD, Server HDD, SSD, RAM, Hadoop and Tape compared on Performance and Cost]
  20. The Bloor Group • Hard Disk Drives (HDD): speed 3300s per TB, price $50 per TB • Solid State Storage (SSD): speed 1000-300s per TB, price $500 per TB • Random Access Memory (RAM): speed 1s per TB, price $4,500 per TB • Keep data in memory when the value obtained from processing it is high • Leave data on disk when it is inactive or the value from processing it is low
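
The trade-off on slide 20 can be made concrete by scaling the quoted per-terabyte figures to a dataset size. This is a rough sketch only: it assumes time and price scale linearly with volume, which the slide does not claim, and the `TIERS` table and `estimate` function are names invented for this illustration.

```python
# Per-terabyte figures as quoted on the slide.
# seconds_per_tb is (fastest, slowest); SSD is the only tier given as a range.
TIERS = {
    "HDD": {"seconds_per_tb": (3300, 3300), "usd_per_tb": 50},
    "SSD": {"seconds_per_tb": (300, 1000), "usd_per_tb": 500},
    "RAM": {"seconds_per_tb": (1, 1), "usd_per_tb": 4500},
}


def estimate(tier: str, terabytes: float) -> dict:
    """Scale the slide's figures linearly (an assumption) to a dataset size."""
    spec = TIERS[tier]
    fastest, slowest = spec["seconds_per_tb"]
    return {
        "min_seconds": fastest * terabytes,
        "max_seconds": slowest * terabytes,
        "usd": spec["usd_per_tb"] * terabytes,
    }


# Hypothetical 2 TB working set, chosen only to show the shape of the trade-off.
for tier in TIERS:
    print(tier, estimate(tier, 2))
```

Under these assumptions, RAM processes the same volume thousands of times faster than HDD at ninety times the price, which is why the slide recommends keeping only high-value data in memory.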
  21. How to do it using QlikView
  22. The Value in Big Data Comes from Context and Relevance [Diagram labels: Machine data, web data, cloud data; Big Data cluster; Operational systems; Data warehouse; Google BigQuery]
  23. The Value in Big Data Comes from Context and Relevance Business Discovery is about enabling the users to find their own path through a pre-defined Dataset. Structure needs to be defined by a QlikView document developer, though content could be refreshed periodically (conventionally) or impacted and triggered by the user (on demand).
  24. The Value in Big Data Comes from Context and Relevance • More History • More Categories • They’re both the same number of bricks! The same volume of data, same schema. You choose what is relevant to your analysis.
  25. Using QlikView with Big Data 1. Conventional Reloads with Document Chaining 2. Direct Discovery – Hybrid Approach 3. Reload on Demand
  26. 1. Conventional Reloads • Reload available data into multiple QVW documents segmented by Region and current Financial Year reloaded Monthly • Entry Document contains Details for All Regions for Current Period only. Reloaded Daily • Use Document Chaining to navigate to/amongst Region-Year documents • Requires a lot of Publisher capacity and Data Replication
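
The segmentation described on slide 26 can be sketched outside QlikView to show where the data replication comes from. The example below is an illustration under stated assumptions, not QlikView or Publisher behaviour: records are plain dictionaries, a calendar year stands in for the "current period", and the sample rows and the `plan_documents` helper are made up for the sketch.

```python
from collections import defaultdict


def plan_documents(records, current_year):
    """Split records into per Region-Year segments plus an entry-document subset."""
    segments = defaultdict(list)
    for record in records:
        segments[(record["region"], record["year"])].append(record)
    # The entry document holds detail for all regions, current period only.
    entry = [record for record in records if record["year"] == current_year]
    return dict(segments), entry


# Made-up rows, for illustration only.
records = [
    {"region": "EMEA", "year": 2012, "amount": 100},
    {"region": "EMEA", "year": 2013, "amount": 150},
    {"region": "APAC", "year": 2013, "amount": 80},
]

segments, entry = plan_documents(records, current_year=2013)
stored_rows = sum(len(rows) for rows in segments.values()) + len(entry)
print(sorted(segments), len(entry), stored_rows)
```

Three source rows end up stored five times across the segment documents and the entry document, which is the replication cost the slide warns about.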
  27. 2. Direct Discovery • Reload available data into multiple QVW documents segmented by Region and current Financial Year reloaded Monthly • Entry Document provides Trends for All Regions for Any Period. Dimensions reloaded Daily. QvS generates aggregate SQL to draw Charts • Use Document Chaining to navigate to/amongst Region-Year documents containing Detail • Performance dependent upon Database
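
Slide 27 notes that aggregate SQL is generated to draw charts, so the database does the summarising and only a small result set comes back. The sketch below imitates that idea with Python's built-in `sqlite3` module; it is not the SQL that QlikView actually emits, and the `Sales` table, its columns, the sample rows and the `aggregate_sql` helper are all hypothetical.

```python
import sqlite3

# In-memory database standing in for the Big Data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Region TEXT, Year INTEGER, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        ("EMEA", 2012, 100.0),
        ("EMEA", 2013, 150.0),
        ("APAC", 2013, 80.0),
        ("EMEA", 2013, 50.0),
    ],
)


def aggregate_sql(dimension: str, measure: str, table: str) -> str:
    """Build a GROUP BY query for one chart dimension and one measure.

    Identifiers are interpolated directly, so callers must pass trusted names.
    """
    return (
        f"SELECT {dimension}, SUM({measure}) FROM {table} "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )


query = aggregate_sql("Year", "Amount", "Sales")
chart_data = conn.execute(query).fetchall()
print(query)
print(chart_data)
```

Only one row per year crosses the connection, regardless of how many detail rows the table holds. That is also why the slide says performance depends on the database.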
  28. 3. On Demand Reloads • Entry Document provides some Aggregate KPIs for All Regions, but mostly just Dimension selection. • When User selects sufficient criteria, a Link is enabled to pass criteria to custom ASPX page. • ASPX page causes User document to be Reloaded with chosen criteria • User Document contains relevant subset entirely in Memory • Reload requires a little patience but then performance is great.
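
The "link is enabled once enough criteria are selected" step on slide 28 can be sketched as a small function. Everything specific here is an assumption made for illustration: the required fields, the `reload.aspx` address on `example.com` and the query parameter names are hypothetical, and the real ASPX page from the demonstration is not described in the slides.

```python
from urllib.parse import urlencode

# Hypothetical rule: the user must pick a value for each of these dimensions.
REQUIRED_FIELDS = ("Region", "Year")


def reload_link(selections, base_url="https://example.com/reload.aspx"):
    """Return a reload URL carrying the criteria, or None if too few are selected."""
    if any(not selections.get(field) for field in REQUIRED_FIELDS):
        return None  # link stays disabled
    criteria = {field: selections[field] for field in REQUIRED_FIELDS}
    return f"{base_url}?{urlencode(criteria)}"


print(reload_link({"Region": "EMEA"}))
print(reload_link({"Region": "EMEA", "Year": "2013"}))
```

Requiring a minimum set of selections keeps the reloaded subset small enough to hold entirely in memory, which is the point of the on-demand approach.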
  29. Demonstration
  30. Demo – Document Chaining
  31. Demo – Hybrid Approach - Direct Discovery
      // Direct Discovery v2
      DIRECT QUERY DIMENSION OrderID, ProductID MEASURE UnitPrice, Quantity, Discount FROM "Northwind"."dbo"."Order Details";
      // Direct Discovery v1
      DIRECT SELECT OrderID, ProductID FROM "Northwind"."dbo"."Order Details";
      // Conventional QlikView
      [Order Details]: SQL SELECT OrderID, ProductID, UnitPrice, Quantity, Discount FROM "Northwind"."dbo"."Order Details";
  32. Demo – Hybrid Approach http://demo.qlikview.com/detail.aspx?appName=American%20Birth%20Statistics.qvw
  33. Key Takeaways 1. The Most Common Purpose of Big Data Is to Produce Small Data 2. Big Data is About Relevance and Context 3. Know What You Want to Achieve
  34. Questions? Business Discovery World Tour, 9 October 2013
  35. Thank You!