Smarter Analytics: Supporting the Enterprise with Automation

The Briefing Room with Barry Devlin and WhereScape
Live Webcast on June 10, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=5230c31ab287778c73b56002bc2c51a

The data warehouse is intended to support analysis by making the right data available to the right people in a timely fashion. But conditions change all the time, and when data doesn’t keep up with the business, analysts quickly turn to workarounds. This leads to ungoverned and largely unmanaged side projects, which trade short-term wins for long-term trouble. One way to keep everyone happy is to create an integrated environment that pulls data from all sources and can automate both model development and the delivery of analyst-ready data.

Register for this episode of The Briefing Room to hear data warehousing pioneer and analyst Barry Devlin explain the critical components of a successful data warehouse environment, and how traditional approaches must be augmented to keep up with the times. He’ll be briefed by WhereScape CEO Michael Whitehead, who will showcase his company’s data warehousing automation solutions and discuss how a fast, well-managed and automated infrastructure is the key to empowering faster, smarter, repeatable decision making.

Visit InsideAnalysis.com for more information.

Published in: Technology

Smarter Analytics: Supporting the Enterprise with Automation

  1. 1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
  2. 2. Smarter Analytics: Supporting the Enterprise with Automation The Briefing Room
  3. 3. Twitter Tag: #briefr The Briefing Room Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com @eric_kavanagh
  4. 4. The Briefing Room Mission • Reveal the essential characteristics of enterprise software, good and bad • Provide a forum for detailed analysis of today’s innovative technologies • Give vendors a chance to explain their product to savvy analysts • Allow audience members to pose serious questions... and get answers!
  5. 5. The Briefing Room Topics • This Month: ANALYTICS & MACHINE LEARNING • July: INNOVATIVE TECHNOLOGY • August: BIG DATA ECOSYSTEM • 2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
  6. 6. Twitter Tag: #briefr The Briefing Room
  7. 7. Twitter Tag: #briefr The Briefing Room Analyst: Barry Devlin Dr. Barry Devlin is among the foremost authorities on business insight and one of the founders of data warehousing, having published the first architectural paper on the topic in 1988. With over 30 years of IT experience, he is a widely respected analyst, consultant, lecturer and author. His 2013 book, “Business unIntelligence—Insight and Innovation beyond Analytics and Big Data,” is available as hardcopy and e-book. Barry is founder and principal of 9sight Consulting. He specializes in the human, organizational and IT implications of deep business insight solutions that combine operational, informational and collaborative environments. A regular contributor to BeyeNETWORK and TDWI, Barry is based in Cape Town, South Africa and operates worldwide.
  8. 8. WhereScape • WhereScape is a data warehousing software company • It offers WhereScape 3D, software for planning and reality-testing data warehousing and business intelligence projects; and WhereScape RED, an integrated development environment used for building, deploying and managing data warehouses and data marts. • WhereScape RED allows developers to automate the data warehousing life cycle
  9. 9. Twitter Tag: #briefr The Briefing Room Guest: Michael Whitehead A data warehousing industry veteran, Michael Whitehead has spent more than a decade designing and building commercial data warehouses for customers in a wide variety of industries. Prior to founding WhereScape, Michael had Asia Pacific responsibilities for data warehousing for Sequent Computer Systems, Inc.
  10. 10. Michael Whitehead June 2014 Smarter Analytics
  11. 11. Why were sales down this week versus last year? Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
  12. 12. We promoted ice cream but the weather was unreasonably cold Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
  13. 13. Our competitor ran a better promotion Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
  14. 14. 1990s - Decision support system (For the time) large amounts of data, stored in various inscrutable file formats and database management systems. Want actionable information? Write a program. One program per analytical problem…. Reporting bureaus This model’s dysfunctions created the need for data warehousing…
  15. 15. 2000s - Enterprise data warehousing Separate the refinement of raw data – regardless of the source – from the delivery of subsets of that data, to various decision-making constituencies. Build a solid, scalable information delivery infrastructure for the corporation. Support variability, and change, at both ends. Apply appropriate governance, risk management, compliance mechanisms. [And stabilize the supply side of the market, in the process…] A design pattern for stable, operationalized information refining and delivery
  16. 16. The economic conditions led to a change in demographics of the people walking past my store Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
  17. 17. 2014 - big data technologies Large amounts of data, stored in various inscrutable file formats and database management systems. Want actionable information? Write a program. One program per analytical problem…. Oh, and batch-oriented. And integrate-it-yourself. Instead of JCL, Pig. Instead of CICS and Comshare, Cloudera. In what way is this model a leap forward?
  18. 18. HOW DID WE GET HERE?
  19. 19. People built data warehouses that don’t support analytics Grocery Store with Class, Walter Watzpatzkowski, 15/1/09
  20. 20. 2014 – “self service” technologies Large amounts of data, stored in various inscrutable file formats AND data warehouses. Want actionable information? Create a dataset. One dataset per analytical problem…. The newer tech is great. Is the way it is used a leap forward?
  21. 21. Automation is key for better support of analytics Smith Cannery: Extension and Experiment Station Communications Photograph Collection (p120)
  22. 22. STEPS 1. Identify attributes 2. Identify business key 3. Index business key and add a unique constraint 4. Create surrogate key with auto sequence generation 5. Index surrogate key 6. Insert zero surrogate key row with values set for each attribute 7. Add a modified timestamp column 8. Write the SQL code to Insert new business keys or Update existing business key rows. Maintain the modified timestamp 9. Create any other indexes required for querying 10. Decide best practice for index maintenance during load. Keep in situ or drop and recreate after load. 11. Document procedure Etc Etc
  23. 23. Really? 1. Identify attributes 2. Identify business key 3. Index business key and add a unique constraint 4. Create surrogate key with auto sequence generation 5. Index surrogate key 6. Insert zero surrogate key row with values set for each attribute 7. Add a modified timestamp column 8. Write the SQL code to Insert new business keys or Update existing business key rows. Maintain the modified timestamp 9. Create any other indexes required for querying 10. Decide best practice for index maintenance during load. Keep in situ or drop and recreate after load. 11. Document procedure Etc Etc
  24. 24. What can be automated? • Profiling • Model conversion • Object creation • Code generation • Indexing • Impact analysis • Documentation
  25. 25. What will it look like? The new data warehouse
  26. 26. The new data warehouse: Five Key Changes • Pooling – new types of data, staged differently than we’ve staged pampered data, in the past. • A multi-engine “logical” data warehouse: NoSQL → Not Only SQL • Support for discovery, prototyping and evaluation of analytics • Support for continuing data integration, through to the “end use” tier • Automation of the data warehousing platform’s core functionality Back to best-of-breed, customer-specific integration models
  27. 27. Conclusion Let’s not stuff it up (again) • Data people – challenge ourselves to do more, faster • Analysts – don’t give up on the data people
  28. 28. Twitter Tag: #briefr The Briefing Room Perceptions & Questions Analyst: Barry Devlin
  29. 29. Business unIntelligence: Smarter Analytics: Supporting the Enterprise with Automation Dr Barry Devlin Founder & Principal 9sight Consulting Bloor Briefing Room 10 June 2014 Copyright © 2014 9sight Consulting, All Rights Reserved
  30. 30. Analytics (and big data) emerged for business with social media and web logs § Understanding and tracking sentiment – What do you think? How do you react? – Basic analytics and BI activity on a new data source § Real-time insight into and influence on website activities – Why did you abandon your cart? – What would you most likely buy on getting a cross-sell? – Deep, real-time analytics and BI with operational integration
  31. 31. The Internet of Things adds urgency to a new automation of analytics and BI § Extends existing processes – Micro-management of supply chains and extension all the way to the consumer – Sourcing and delivery § Creates completely new business models – Often depending on analytics – Motor insurance → encouragement & prevention – Hospital care → health monitoring
  32. 32. The biz-tech ecosystem reflects the complexity of today’s business. [Diagram labels: Business; Information; Technology; Speed of decision and appropriate action; Customer interaction and technical savvy; Competition; Mobile devices; Information abundance and variety; Market flexibility and uncertainty; Externally-sourced information]
  33. 33. The architecture for the biz-tech ecosystem consists of information pillars. § Single architecture for all types of data/information – Mix/match technology as needed – Relational, NoSQL, Hadoop, etc. § Integration of sources and stores – Instantiation gathers measures, events, messages and transactions – Assimilation integrates stored info. – Reification virtualizes access § Data flows as fast as needed and reconciled when necessary – No unnecessary storage or transformations – (Contrast layered data architecture) [Diagram labels: Reification; Assimilation; Instantiation; Process-mediated (data); Human-sourced (information); Machine-generated (data); Context-setting (information); Transactional (data); Transactions; Measures; Events; Messages]
  34. 34. Information pillars can be mapped to today’s BI and analytics tools and environments. § Process-mediated data – Traditional computing – Via data entry, cleansing processes – Relational databases § Machine-generated data – Output of machines and sensors – The Internet of Things – NoSQL, Streaming, (RDBMS) § Human-sourced information – Subjectively interpreted record of personal experiences – From Tweets to Videos – Hadoop, Enterprise Content Management [Diagram labels: EDW; BI; Oper. Analytics; Pred. Analytics; OLTP; Assimilation; Instantiation; Process-mediated (data); Human-sourced (information); Machine-generated (data); Context-setting (information); Transactional (data); Transactions; Measures; Events; Messages]
  35. 35. From BI to Business unIntelligence § Information, knowledge and meaning – Understanding real world context § Process, predefined and emergent – Automating the creation and use of information § Beyond bounded rationality – How decisions are really made § http://bit.ly/BunI-Technics : 25% discount with code “BIInsights25”
  36. 36. Dr Barry Devlin Founder & Principal 9sight Consulting Thank you! Additional resources § All articles and white papers available at: http://bit.ly/9sight_papers § Blogs at: http://bit.ly/BD_Blog § Follow me on Twitter: @BarryDevlin Copyright © 2014 9sight Consulting, All Rights Reserved
  37. 37. Questions (1) 1. The Enterprise Data Warehousing architecture of the 2000s (I would say 1990s) was driven by the business need for consistency / reconciliation of data from many sources. It’s perhaps suboptimal for timeliness (real-time data) and maintenance (multiple layers of ETL function). How can the sort of automation you’re proposing help in these two areas? 2. You compare 1980s and 2014 approaches asking how this model is a “leap forward.” One difference is users’ (data scientists) skills with technology. Wouldn’t automation disempower such users? 3. What would a warehouse that “supports Analytics” look like? 4. You say “Automation is the key for better support of analytics,” but how does automation support the agility and flexibility needed for analytics? 5. A big idea in analytics is “model on read.” Automation typically requires/provides “model on write.” How do you address these very opposite needs?
  38. 38. Questions (2) 6. Your pooling tier reminds me of the “Data Lake” – of which I’m not a big fan! Why would I want to bring “pampered data” (I assume traditional data) through this pool? Seems like an additional / unnecessary step? 7. What engines (other than SQL) do you envisage? Which do / will you support? 8. Can you describe what the linkage between the different engines means? If integration, how is it done? 9. What data integration support do you envisage in the “end use” tier? 10. Overall, how do you see your existing products evolving to implement the various aspects of this architecture? Does the relational database remain the core component, or do you envisage a more central role for Hadoop, as in Cloudera’s Enterprise Data Hub?
  39. 39. Twitter Tag: #briefr The Briefing Room
  40. 40. The Briefing Room Upcoming Topics • This Month: ANALYTICS & MACHINE LEARNING • July: INNOVATIVE TECHNOLOGY • August: BIG DATA ECOSYSTEM • www.insideanalysis.com/webcasts/the-briefing-room • 2014 Editorial Calendar at www.insideanalysis.com
  41. 41. Twitter Tag: #briefr THANK YOU for your ATTENTION! The Briefing Room
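
The dimension-building steps listed on slide 22 (business key, surrogate key, zero-key row, modified timestamp, insert-or-update load), together with the "Code generation" item on slide 24, can be sketched roughly as follows. This is only an illustration in Python with the standard-library sqlite3 module; it is not WhereScape RED's method or output. The helper functions, the table and column names (dim_product, product_code), the "UNKNOWN" placeholder values and the choice of SQLite are all assumptions made for the example, and steps 9 to 11 (query indexes, index maintenance during load, documentation) are left out.

```python
import sqlite3
from datetime import datetime, timezone


def generate_dimension_ddl(table, business_key, attributes):
    """Generate CREATE TABLE text from metadata (hypothetical helper)."""
    attribute_cols = "".join(f"    {name} TEXT,\n" for name in attributes)
    return (
        f"CREATE TABLE {table} (\n"
        # Steps 4-5: surrogate key with auto sequence; SQLite looks it up
        # through the table's own B-tree, so no separate index is created.
        f"    {table}_key INTEGER PRIMARY KEY AUTOINCREMENT,\n"
        # Steps 2-3: business key; UNIQUE also creates an index in SQLite.
        f"    {business_key} TEXT NOT NULL UNIQUE,\n"
        # Step 1: descriptive attributes (assumed to be non-empty).
        f"{attribute_cols}"
        # Step 7: modified timestamp column.
        "    modified_ts TEXT NOT NULL\n"
        ")"
    )


def create_dimension(conn, table, business_key, attributes, ts):
    """Create the table and insert the zero surrogate key row (step 6)."""
    conn.execute(generate_dimension_ddl(table, business_key, attributes))
    columns = [f"{table}_key", business_key] + list(attributes) + ["modified_ts"]
    values = [0, "UNKNOWN"] + ["Unknown"] * len(attributes) + [ts]
    marks = ", ".join("?" for _ in columns)
    conn.execute(
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({marks})", values
    )


def load_dimension(conn, table, business_key, attributes, rows, ts):
    """Step 8: update rows whose business key exists, insert the rest."""
    set_clause = ", ".join(f"{name} = ?" for name in attributes)
    update_sql = (
        f"UPDATE {table} SET {set_clause}, modified_ts = ? "
        f"WHERE {business_key} = ?"
    )
    columns = [business_key] + list(attributes) + ["modified_ts"]
    marks = ", ".join("?" for _ in columns)
    insert_sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({marks})"
    for row in rows:
        attr_values = [row[name] for name in attributes]
        cur = conn.execute(update_sql, attr_values + [ts, row[business_key]])
        if cur.rowcount == 0:  # business key not seen before
            conn.execute(insert_sql, [row[business_key]] + attr_values + [ts])


# Usage: an illustrative product dimension in an in-memory database.
conn = sqlite3.connect(":memory:")
now = datetime.now(timezone.utc).isoformat()
attrs = ["name", "category"]
create_dimension(conn, "dim_product", "product_code", attrs, now)
load_dimension(conn, "dim_product", "product_code", attrs,
               [{"product_code": "P1", "name": "Vanilla", "category": "Ice cream"}], now)
# Reloading the same business key updates the existing row instead of adding one.
load_dimension(conn, "dim_product", "product_code", attrs,
               [{"product_code": "P1", "name": "Vanilla Bean", "category": "Ice cream"}], now)
for record in conn.execute("SELECT * FROM dim_product ORDER BY dim_product_key"):
    print(record)
```

Because the DDL and load statements are derived from a small metadata description (table name, business key, attribute list), the same functions can be reused for any similar dimension, which is the kind of repetition the slides argue should be automated rather than hand-coded each time.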
