The role of Data Virtualisation in your EIM strategy


A description of Data Virtualisation and a discussion of the role it can play in an Enterprise Information Management (EIM) strategy.

Published in: Technology, Business


  1. How do you want your data served? The role of Data Virtualisation in your EIM Strategy. Christopher Bradley, IPL (Intelligent Business).
  2. Presenter: Chris Bradley, Head of Business Consulting. +44 1225 475000. @InfoRacer. My blog: Information Management, Life & Petrol.
  3. Introduction & Agenda
  4. Chris Bradley
     Summary:
     • 30 years Information Management experience
     • MOD, Volvo, Thorn EMI, Coopers & Lybrand, IPL
     • Sample clients: BP, Enterprise Oil, Statoil, Exxon Mobil, Audit Commission, MoD, Merrill Lynch, Barclays, DoD, Imperial Tobacco, GSK …
     • Experience: Data Governance, Master Data Management, Enterprise Information Management
     • Author & conference speaker
     • CDMP (Master), CBIP, Prince2, APM
     • Director DAMA UK & MPO
     • BeyeNetwork Expert Channel Author, “Information Asset Management”
     Recent speaking engagements:
     • DAMA International (DAMA / Wilshire), March 5th – 8th 2007, Boston, MA: “Data as a service”; “Panel of Data Modelling experts”
     • CDi_MDM Summit (IRM UK), April 30 – May 2nd 2007, London: “A Data Architecture for Data Governance”
     • DAMA UK, June 15th 2007, London: “Data Modelling – Where did it all go wrong?”
     • Data Governance Conference (Debtech / Wilshire), June 25 – 28, 2007, San Francisco, CA: “Data Architecture for Governance – case study”
     • IPL & Embarcadero seminar series (Bristol, London, Manchester, Edinburgh), October 2007: “Data Modelling – Where did it all go wrong?”
     • DQ/IM & DAMA Europe (IRM London), November 2007: “Data Modelling as a service”
     • Data Governance Conference (Debtech / Wilshire), Florida, December 2007: “Data Governance 2.0”
     • DAMA International (DAMA / Wilshire), March 16th – 21st 2008, San Diego, CA: “Modelling for SoA”; “XML and data models”; “Information Challenges and Solutions”
     • BPM Europe (IRM), September 2008, London: “BPMN for Dummies”
     • DAMA Europe (IRM / DAMA), November 2008, London: “Data Modelling as a service”
     • Data Governance Europe Symposia (IRM / Debtech; London), February 2009: “How do you get a Business person to read a Data Model?”
     • Webinar series (Embarcadero Technologies & IPL), Oct 2008 – Feb 2009: “The New Formula for Success – Moving Data Modelling beyond the Database”
     • Data Rage 2009, March 17-19 2009: “Evolve or Die – Modelling is not just for DBMS’s anymore”; “Data Modelling as a service”
     • Enterprise Data World International (DAMA / Wilshire), April 5th – 12th 2009, Tampa, FL: “Exploiting Models for effective SAP implementations”; chairing panel of experts “Keeping modelling relevant”; panel of experts “Issues in information internationalisation”; “Modelling is not just for RDBMS’s”
     • DAMA UK & BCS Data Management Group, June 11th 2009, London: “Evolve or Die - Data Modelling is not just for DBMS’s”
     • BPM Europe (IRM), September 2009, London: ½ day workshop “An introduction to Data and the BPMN”
     • Data Migration Matters, October 1st 2009, London: “Designing for Success”
     • Data Management & Information Management Europe (DAMA / IRM), November 2-5 2009, London: “Modelling is NOT just for DBMS’s anymore”; “Meet the Metadata Professional Organisation”
     • Enterprise Data World International (DAMA / Wilshire), March 14th – 19th 2010, San Francisco, CA: “How to communicate with the business using high level models”
     • IPL & DataFlux Seminar Series (IPL / DataFlux), March 26th 2010, Bath, UK: “The Information Advantage – Exploiting Information Management For The Business”
     • BeyeNETWORK Webinar (CA / BeyeNETWORK), March 31st 2010: “Communicating with the Business through high level data models”
     • Enterprise Architecture Europe (IRM), June 16th – 18th 2010, London: ½ day workshop “The Evolution of Enterprise Data Modelling”
     • ECIM Exploration & Production, September 13th – 15th 2010, Haugesund, Norway: “Establishing Data Modelling as a Service in BP”
     • Information Management in Pharmaceuticals, September 15th 2010, London: “Clinical Information Management – Are we the cobblers children?”
     • BPM Europe (IRM), September 27th – 29th 2010, London: “Learning to Love BPMN 2.0”; “BPMN for Dummies”
     • DAMA Scandinavia, October 26th – 27th 2010, Stockholm: “Incorporating ERP Systems into your overall Models & Information Architecture”
     • Data Management & Information Management Europe (DAMA / IRM), November 2010, London: “Data Governance Challenges in a Major Multi National”
     • Data Governance & MDM Europe (DAMA / IRM), March 2011, London: “Clinical Information Data Governance”
     • Enterprise Data World International (DAMA / Wilshire), April 2011, Chicago, IL: “How do you want yours served? – the role of Data Virtualisation and Open Source BI”
     October 1st 2009, The Kings Fund, London
  5. Chris Bradley
     Recent publications:
     • Database Marketing Magazine, February 2009: “Preventing a Data Disaster”
     • Data Modelling For The Business – A Handbook for aligning the business with IT using high-level data models; Technics Publishing; ISBN 978-0-9771400-7-7; Level/dp/0977140075/ref=sr_1_4?ie=UTF8&s=books&qid=1235660979&sr=1-4
     • BeyeNETWORK “Chris Bradley Expert Channel”, Information Asset Management:
       • Article “Data Modelling is NOT just for DBMS’s” (July 2009) and (August 2009)
       • Article “Information Management Deficiency Syndrome” (September 2009)
       • Article “Drowning in spreadsheets” (September 2009)
       • Article “Seven deadly sins of data modelling” (October 2009)
       • Article “How do you want yours served (data that is)” (December 2009)
     • Article “How Do You Want Your Data Served?”, Conspectus Magazine (February 2010)
     • Article “10 easy steps to evaluate Data Modelling tools”, Information Management (March 2010)
     • Article “Big Data, Same Problems”, TechTarget (July 2011)
  6. Agenda
     1. An Enterprise Information Management Framework
     2. What is Data Virtualisation?
     3. 5 ways where EII / Data Virtualisation can add value to Data Warehousing
     4. 6 key considerations when deciding upon data migration and take on (ETL vs EII or both?)
     5. Information Management issues in the BI world
     6. IM Certification & Competencies
  7. 1. IPL’s Information Architecture Framework
     Architecture: orderly arrangement and structure for the assets.
     Framework: Purpose; Components of the Architecture.
     Framework diagram, top to bottom:
     • Goals; Principles
     • People: Governance; Planning
     • Process: Quality; Lifecycle Management; Services Infrastructure
     • Structure: Models / Taxonomy; Catalog / Meta data
     • Data Types: Structured Transaction Data; Unstructured Data; Master Data; MI/BI Data; Technical Data
  8. Information Architecture Framework Components
     1. Goals / Principles
     2. Governance
     3. Planning (Information Asset Strategy and Roadmap)
     4. Information Quality
     5. Life Cycle Management
     6. Services Infrastructure (Data Integration, Distribution, etc.)
     7. Information Models (includes Information relationship models)
     8. Information Catalog / Meta Data
     9. Master Data Management Services
  9. Information Architecture is one of the four components of the overall Enterprise Architecture
     • Business Architecture: business strategy, organization, and core business processes
     • Applications Architecture: ERP, etc.
     • Information Architecture: Enterprise Data Model & Catalog, etc.
     • Technology Architecture: desktop, network, data centre strategy
  10. Turning data into Business wisdom
      • Data: 10,000 feet
      • Information: Your current altitude is 10,000 feet
      • Knowledge: There is a mountain ahead, peak of 12,000 feet
      • Wisdom: Climb immediately to 15,000 feet
  11. Now – That should clear up a few things around here! Businesses NEED a common vocabulary for communication.
  12. 2. What is Data Virtualisation? A primer .....
  13. Virtualise
  14. Genres of Virtualisation
      • Data Virtualisation: abstracts data from location and complexity (RDBMS, Data Warehouses, Web Services, Packages, Excel)
      • Storage Virtualisation: abstracts logical storage from physical storage (Disk 1, Disk 2, Disk 3, Disk 4)
      • Application / Server Virtualisation: abstracts logical apps & servers from physical apps & servers (Application 1, Application 2, Server 1, Server 2)
  15. Key Purpose of Virtualisation
      • Overcome (mask) complexity: hardware; software
      • Improve agility: new solutions; existing solutions
      • Reduce costs: operating; new development
  16. Data Virtualisation in a Nutshell
      • Consumers: BI, MI and Reporting; Portals and Dashboards; Enterprise Search; Custom Apps
      • Delivery: Star; SQL; Web Services
      • Virtual layer: Virtual Data Marts; Virtual Operational Data Stores; Relational Views; Shareable Data Services; Data Model
      • Sources: Legacy Mainframes; Packages; RDBMS; Web Services; Files
  17. What are the Business challenges DV addresses?
      • Business Challenges: Mergers & Acquisitions; Cost Savings; Sales Growth; Risk Reduction
      • Business Solutions
      • Integration Challenge: Complexity; Disparity; Data Location; Performance; Completeness; Security, Quality, Governance
      • Data Sources
  18. What DV Does: Data Virtualisation
  19. Typical Data Integration Architectures
      • Consumers: BI Tools/Apps.; Master Data Mgmt.; Operational Apps.; Inter-enterprise
      • Integration styles: Physical Movement and Consolidation (ETL, CDC); Abstraction / Virtual Consolidation (Data Federation); Synchronization and Propagation (Messaging)
      • Shared foundations: Common Design, Admin., Governance; Common Metadata; Common Connectivity
      Pace of business change & requirement for agility demands that organizations support multiple styles of data integration.
  20. How DV differs
      • Physical Movement and Consolidation (ETL, CDC)
        • Middleware: ETL. Purpose: DB → DB. Attribute: Scheduled.
        • Middleware: CDC. Purpose: DB → DB. Attribute: Event Driven.
      • Abstraction / Virtual Consolidation (Data Federation)
        • Middleware: Data Virtualization. Purpose: DB → Application. Attribute: On Demand.
      • Synchronization and Propagation (Messaging)
        • Middleware: EAI / ESB. Purpose: Application → Application. Attribute: Event Driven.
  21. How DV Works – Example Scenario
      1) I need to build an application that looks like this…
      2) The view or data service needs to look like this…
      3) And the data comes from these sources…
  22. Traditional Integration with ETL and Data Warehouses
      Traditional approach:
      1. Design entire DW schema
      2. Develop ETL
      3. Refresh on batch basis
      4. Application gets data from DW
      Issues:
      • Slow development cycle
      • Replicated data
      • Batch latencies
      • Physical stores overhead
  23. Data Virtualisation design
      Design steps (in the data model layer):
      1. Discover data
      2. Model individual view/service
      3. Validate view/service
      Benefits:
      • Faster time to solution
      • Easy to learn and use tools
      • Extensible / reusable objects
      • Conform data to a standard data model
  24. Data Virtualisation Production
      Production steps:
      1. Application invokes request
      2. Optimized data access and retrieval (single query), handled by the optimizer
      3. Deliver data to application
      Benefits:
      • Less replication
      • High performance
      • Up-to-the-minute data
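The production steps on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular DV product works: the two sources (an in-memory SQLite table standing in for an RDBMS, a CSV string standing in for a file), the function name `customer_orders_view` and every column name are hypothetical, and there is no query optimizer here, only on-demand retrieval without a physical copy.

```python
import csv
import io
import sqlite3

# Source 1 (assumed): an operational RDBMS, simulated with in-memory SQLite.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
crm.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Source 2 (assumed): a flat file of orders, simulated with a CSV string.
ORDERS_CSV = "customer_id,amount\n1,100.0\n1,50.0\n2,75.5\n"


def customer_orders_view(customer_id):
    """Virtual view: read both sources at request time; nothing is replicated."""
    row = crm.execute(
        "SELECT name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    orders = [
        float(r["amount"])
        for r in csv.DictReader(io.StringIO(ORDERS_CSV))
        if int(r["customer_id"]) == customer_id
    ]
    return {"customer": row[0], "order_count": len(orders), "total": sum(orders)}


# Step 1: the application invokes a request; step 3: data is delivered to it.
result = customer_orders_view(1)
print(result)
```

Because the view reads the sources on every call, a change in either source is visible on the next request, which is the "up-to-the-minute data" benefit listed on the slide.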
  25. Data Virtualisation Production with Caching
      Production steps:
      1. Cache essential data
      2. Application invokes request
      3. Optimized data access and retrieval (leveraging cached data), handled by the optimizer and cache
      4. Deliver data to application
      Benefits:
      • Removes network constraints
      • 24x7 availability
      • Optimal performance
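The caching variant can be sketched as a thin wrapper around any view function. The class name `CachedView`, the time-to-live policy and the injectable clock are assumptions made for this example; real DV products offer their own cache refresh strategies, which the slides do not describe.

```python
import time


class CachedView:
    """Wrap a view function and reuse each result for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (expiry_time, value)

    def get(self, key):
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # served from cache; the source is not contacted
        value = self._fetch(key)  # miss or expired entry: go to the source
        self._cache[key] = (now + self._ttl, value)
        return value


calls = []


def slow_source(key):
    """Hypothetical expensive source call; records each invocation."""
    calls.append(key)
    return {"key": key, "fetch_number": len(calls)}


fake_now = [0.0]  # controllable clock so the example is deterministic
view = CachedView(slow_source, ttl_seconds=60, clock=lambda: fake_now[0])

first = view.get("customer-1")
second = view.get("customer-1")  # within the TTL: answered from the cache
fake_now[0] = 61.0
third = view.get("customer-1")   # TTL expired: fetched from the source again
print(len(calls))
```

The trade-off is the usual one: a longer time-to-live reduces load on the sources and keeps the view available when a source is down, at the cost of data currency.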
  26. 3. Five example usage patterns: where Data Virtualisation can add value to Data Warehousing
  27. Prototyping Data Warehouse Development
      In traditional DW development, the time taken for schema changes, adding new data sources and providing data federation is often considerable.
      Use DV to prototype a development environment, rapidly building a virtual DW rather than a physical one.
      Reports, dashboards and so on can be built on the virtual DW.
      After prototyping, the physical DW can be introduced if the usage merits.
      (Sources shown: Packages, Databases, Files, XML.)
  28. Enriching the DW ETL Process
      Frequently, new data sources, particularly from ERPs, are required in the DW.
      Often the ETL lacks data access capabilities to complex sources.
      Tight processing windows may require access, aggregation & federation activities to be performed prior to the ETL process.
      Powerful data access capabilities of EII provide rich access and federation capabilities which can present virtual views to the ETL process, which continues as though using a simpler data source.
  29. Federating Data Warehouses
      Many organisations have more than one DW.
      Is the information in each DW completely discrete?
      Data Virtualisation provides powerful options to federate multiple DWs by creating an integrated view across them.
      This has particular relevance in providing rapid cross-warehouse views following a merger or acquisition.
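As a rough illustration of an integrated view across two warehouses, the sketch below attaches two separate in-memory SQLite databases to one connection and defines a temporary view over both. The schema names `dw_a` and `dw_b`, the `sales` table and its columns are hypothetical; SQLite merely stands in for two independent warehouses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each ATTACH of ':memory:' creates a separate in-memory database.
conn.execute("ATTACH DATABASE ':memory:' AS dw_a")
conn.execute("ATTACH DATABASE ':memory:' AS dw_b")

conn.execute("CREATE TABLE dw_a.sales (region TEXT, revenue REAL)")
conn.execute("CREATE TABLE dw_b.sales (region TEXT, revenue REAL)")
conn.executemany("INSERT INTO dw_a.sales VALUES (?, ?)", [("UK", 120.0), ("US", 300.0)])
conn.executemany("INSERT INTO dw_b.sales VALUES (?, ?)", [("UK", 80.0), ("DE", 40.0)])

# A TEMP view may reference tables in any attached database, so it can act
# as the integrated cross-warehouse view without copying any rows.
conn.execute("""
    CREATE TEMP VIEW all_sales AS
        SELECT 'dw_a' AS source_dw, region, revenue FROM dw_a.sales
        UNION ALL
        SELECT 'dw_b' AS source_dw, region, revenue FROM dw_b.sales
""")

totals = dict(conn.execute(
    "SELECT region, SUM(revenue) FROM all_sales GROUP BY region ORDER BY region"
).fetchall())
print(totals)
```

Keeping a `source_dw` column in the view preserves lineage, which helps when the two warehouses disagree about the same entity after a merger.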
  30. DW Extension
      Business users require data from outside the Data Warehouse so they can meet reporting and operational needs:
      • Historical data from the warehouse and up-to-the-minute data from transaction systems or operational data stores is required.
      • Summarized data from the warehouse and drill-down detail from transaction systems or operational data stores is required.
      Data Virtualisation can extend existing Data Warehouses quickly and easily to work around the fact that key data users need resides outside the consolidated data warehouse.
  31. Complete Master Data View
      MDM applications alone cannot fully support all requirements, as data exists outside of the MDM hub.
      Complementary data integration solutions are needed to deal with data maintained outside of MDM hubs, often in complex, disparate data silos.
      DV can extend the Master Data and provide a complete 360° view by using master data from the hub as the foreign key to quickly and easily federate master data with additional transactional and historical data to get a complete single view of master data.
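The idea of using the master record's key as the foreign key for federation can be sketched with plain Python structures. All names and records below (`mdm_hub`, `transactions`, `history`, `complete_view`) are hypothetical stand-ins for an MDM hub and the silos outside it.

```python
# Assumed master data hub records, keyed by master customer id.
mdm_hub = {
    "C001": {"name": "Acme Ltd", "segment": "Enterprise"},
    "C002": {"name": "Globex", "segment": "SMB"},
}

# Assumed data held outside the hub, carrying the master id as a foreign key.
transactions = [
    {"customer_id": "C001", "amount": 250.0},
    {"customer_id": "C001", "amount": 100.0},
    {"customer_id": "C002", "amount": 40.0},
]
history = [{"customer_id": "C001", "year": 2009, "revenue": 900.0}]


def complete_view(master_id):
    """Federate hub, transactional and historical data for one master record."""
    master = mdm_hub[master_id]
    return {
        "master_id": master_id,
        **master,
        "transactions": [
            t["amount"] for t in transactions if t["customer_id"] == master_id
        ],
        "history": [h for h in history if h["customer_id"] == master_id],
    }


view = complete_view("C001")
print(view)
```

The hub stays the single authority for the master attributes; the view only reads from the other stores and never writes master data back.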
  32. 4. Data migration and take on: 6 key considerations (ETL vs EII / DV or both?)
  33. Some Migration Considerations
      • What data have we got?
      • E-discovery
      • Data owners vs. users
      • What other data do we require?
      • Source model vs target model
      • Move all the data or leave some in place?
      • Do we use EII vs ETL (or even both)?
  34. EII or ETL? 1. Will the data be replicated in both the DW and the Operational System?
      • Will data need to be updated in one or both locations?
      • If data is physically in two locations, beware of regulatory & compliance issues (e.g. SOX, HIPAA, Basel II, FDA etc.)
  35. EII or ETL? 2. Data Governance
      • Is the data only to be managed in the originating Operational System?
      • What is the certainty that the DW will be a reporting DW only (vs an Operational DW)?
  36. EII or ETL? 3. Currency of the data, i.e. does it need to be up to the minute?
      • How up to date are the data requirements of the DW?
      • Is there a need to see the operational data?
  37. EII or ETL? 4. Time to solution, i.e. how quickly is the solution required?
      • Immediate requirement?
      • Confirmed users & usage? vs.
      • Flexible, emerging requirements?
  38. EII or ETL? 5. What is the life expectancy of the source system(s)?
      • Are the source systems likely to be retired?
      • Will new systems be commissioned?
      • Are new sources required?
  39. EII or ETL? 6. Need for historical / summary / aggregate data
      • How much historical data is required in the DW solution?
      • How much aggregated / summary data is required in the DW solution?
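The six considerations above can be captured as a simple checklist. The sketch below is only one possible reading: the slides pose the questions without stating which answer favours which approach, so the mapping of each answer to EII or ETL, the field names and the tallying rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    replication_is_a_compliance_risk: bool  # consideration 1
    managed_only_in_source_system: bool     # consideration 2
    needs_up_to_the_minute_data: bool       # consideration 3
    requirements_still_emerging: bool       # consideration 4
    source_systems_to_be_retired: bool      # consideration 5
    needs_history_or_aggregates: bool       # consideration 6


def suggest_approach(req):
    """Return 'EII', 'ETL', 'both' or 'undecided' (assumed heuristic)."""
    # Assumption: these answers favour leaving data in place (EII / DV).
    eii_signals = sum([
        req.replication_is_a_compliance_risk,
        req.managed_only_in_source_system,
        req.needs_up_to_the_minute_data,
        req.requirements_still_emerging,
    ])
    # Assumption: these answers favour physically moving data (ETL).
    etl_signals = sum([
        req.source_systems_to_be_retired,
        req.needs_history_or_aggregates,
    ])
    if eii_signals and etl_signals:
        return "both"
    if etl_signals:
        return "ETL"
    if eii_signals:
        return "EII"
    return "undecided"


example = Requirements(
    replication_is_a_compliance_risk=True,
    managed_only_in_source_system=True,
    needs_up_to_the_minute_data=True,
    requirements_still_emerging=False,
    source_systems_to_be_retired=False,
    needs_history_or_aggregates=True,
)
print(suggest_approach(example))
```

A checklist like this is a conversation aid rather than a decision engine; in practice each consideration would be weighted by the organisation's own priorities.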
  40. 5. BI & Information Management: maybe spreadsheets aren’t such a good solution after all!
  41. Effective IM is crucial today
      • Higher volumes of data generated by organisations: information is all pervasive – if you don’t have a strategy to manage it, you will certainly drown in it
      • Proliferation of data-centric systems: ERP, CRM, ECM…
      • Greater demand for reliable information: accurate business intelligence is vital to gain competitive advantage, support planning/resourcing and monitor key business functions
      • Tighter regulatory compliance: far more responsibility is now placed on organisations to ensure they store, manage, audit and protect their data (SOX, Basel, Solvency II, HIPAA, FDA ...)
      • Business change is no longer optional – it’s inevitable: mergers/acquisitions, market forces, technological advances…
  42. Excel, BI and IM!
      Several users within a business are adept at manipulating large data extracts in Excel:
      • Easily derive new fields
      • Pivot data
      • Aggregate data
      • Produce charts and dashboards
      “All good”, you might say?
  43. Excel, BI and IM!
      • A “new” copy of the source data is now in your spreadsheet
      • You are now (unwittingly) a data steward!
      • What are the rules & calculations for derivations?
      • Where does the additional data come from?
      • Charts / graphs potentially disconnected from data
      • Distribution leading to data duplication & amendment
      • What’s the lineage & provenance of the data now?
  44. A Happy Path?
      • Go back to the source
      • Avoid “Cottage Industry” reporting
      • Record metadata regarding the extract and don’t change its values
      • If you must correct data, correct at source
      • Ensure calculations make sense and are properly annotated and tested
      • Clearly label distributed versions vs originals. Identify versions
      • Don’t re-issue your local copy of the source data - redirect any data requests to the source
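The advice to record metadata about an extract and leave its values unchanged can be made concrete with a checksum. This is a minimal sketch: the record fields, the function names and the sample source name `finance_dw.sales` are hypothetical, and a real organisation would store such records in its metadata catalogue.

```python
import hashlib
from datetime import datetime, timezone


def record_extract(source_name, extracted_bytes, extracted_by):
    """Build a metadata record for a data extract, including a content checksum."""
    return {
        "source": source_name,
        "extracted_by": extracted_by,
        "extracted_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(extracted_bytes).hexdigest(),
        "row_count": len(extracted_bytes.splitlines()) - 1,  # minus header row
    }


def is_unchanged(metadata, current_bytes):
    """True if the local copy still matches what was originally extracted."""
    return hashlib.sha256(current_bytes).hexdigest() == metadata["sha256"]


extract = b"id,amount\n1,100\n2,250\n"
meta = record_extract("finance_dw.sales", extract, "analyst@example.com")

edited = extract.replace(b"250", b"251")  # someone "corrects" a local value
print(is_unchanged(meta, extract), is_unchanged(meta, edited))
```

If the checksum no longer matches, the spreadsheet has diverged from its source, which is the moment to go back and correct the data at source instead.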
  45. 6. Certification & Competencies
  46. What is CDMP?
      • CDMP stands for “Certified Data Management Professional”.
      • It is the only non-proprietary, widely recognized data management certification.
      • The certification program was jointly constructed by DAMA International (DAMA) and the Institute for Certification of Computer Professionals (ICCP).
      • DAMA owns the CDMP certification; ICCP administers and delivers the exams and provides all record keeping.
  47. Why do I need it?
      “Certification, in itself, is not a goal, but Professionalism is.” Dr. Paul M. Pair, ICCP Fellow
      • Chart “Why People Certify”: Credential; Company Requirement; Professional Growth; Self Evaluation; Financial Reward; Other
      • Chart “Primary Achievement Resulting from Certification”: Increase in Salary; Credibility within Organisation; Credibility with Customers; Greater Self Esteem; Solve Problems Quicker
      Source: ICCP Research Study (Athabasca University)
  48. Which Specialty Exam?
  49. IPL’s Information Management Framework
      • 1: Goals; Principles
      • 2: Governance; 3: Planning
      • 4: Quality; 5: Lifecycle Management; 6: Infrastructure Services
      • 7: Models / Taxonomy; 8: Catalog / Meta data
      • Data layer (9, 10): Structured Transaction Data; Unstructured Data; Master Data; MI/BI Data; Technical Data
  50. Maturity Model – Information Governance (framework component 2)
      • Level 1 - Initial: No clear data ownership assigned. Data Owners, if any, evolve on their own during project rollouts (i.e. self appointed data owners). No standard tools or documentation available for use across the whole enterprise.
      • Level 2 - Repeatable: Data Ownership Model does not exist. Owners commissioned in the short-term for specific projects & initiatives. Often department or silo focused, leading to ownership by “Data Teams” or “Super Users” that manage “all” data.
      • Level 3 - Defined: Defined Data Ownership Model exists. Ownership Model is loosely applied to key data entities. Limited collaboration. Not fully bought in to data ownership at an enterprise level.
      • Level 4 - Managed: Data Ownership Model is implemented for the key data entities. Collaboration between stakeholders in place. Governance process regularly reviews this model and its application, updating and improving as needed. Benefits begin to be realised.
      • Level 5 - Optimised: Data Ownership Model has been extended such that the majority of data assets are under active stewardship. Effective governance process employed by stakeholders & stewards. Well defined standards adopted.
  51. Maturity Model – Quality (framework component 4)
      • Level 1 - Initial: Limited awareness within the enterprise of the importance of information quality. Very few, if any, processes in place to measure quality of information. Data is often not trusted by business users. Limited understanding of good versus bad quality. Identified issues are not consistently managed.
      • Level 2 - Repeatable: The quality of few data sources is measured in an ad hoc manner. A number of different tools used to measure quality. The activity is driven by projects or departments.
      • Level 3 - Defined: Quality measures have been defined for some key data sources. Specific tools adopted to measure quality with some standards in place. The processes for measuring quality are applied at consistent intervals. Data issues are addressed where critical.
      • Level 4 - Managed: Data quality is measured for all key data sources on a regular basis. Quality metrics information is published via dashboards etc. Active management of data issues through the data ownership model ensures issues are often resolved. Quality considerations baked into the SDLC.
      • Level 5 - Optimised: The measurement of data quality is embedded in many business processes across the enterprise. Data quality issues addressed through the data ownership model. Data quality issues fed back to be fixed at source.
  52. Maturity Model – Master Data (framework component 9)
      • Level 1 - Initial: Limited awareness of MDM. Master Data domains have not been defined across the enterprise. Silo based approach to data models means multiple definitions of potential master data entities, such as customer, exist.
      • Level 2 - Repeatable: The impact of master data issues gains recognition within the enterprise. Limited scope for managing master data due to lack of Data Ownership Model. Project or department based initiatives attempt to understand the enterprise’s master data. No MDM strategy defined.
      • Level 3 - Defined: Definition of an MDM strategy is in progress. Master data domains have been identified. Several domains are targeted for delivering master data to specific applications or projects. Differing products may be adopted in these silos for MDM. Senior management support for MDM grows.
      • Level 4 - Managed: A complete MDM strategy has been defined and adopted. MDM joined up with data governance and data quality initiatives. Robust business rules defined for master data domains. Data cleansing and standardisation performed in the MDM hub. Specific products adopted for MDM. Master data models defined.
      • Level 5 - Optimised: A fully integrated MDM hub exists and has been adopted across the enterprise for all key master data domains. The hub controls access to master data entities. Many applications access the MDM Hub through a service layer. Business users are fully responsible for master data.
  53. As-Is (radar chart, scale 0 to 5). Axes: IM Principles; Data Governance; IM Planning; Data Quality; IM Lifecycle Management; Integration & Access; Models & Taxonomy; Catalog & Metadata; Master Data Management; Business Intelligence.
  55. Summary
      • Data Virtualisation opens up a brave new world
      • For data migration, ETL isn’t “the only way”
      • Effective Information Management is crucial
  56. Contact details: Chris Bradley, Business Consulting Director. +44 1225 475000. @InfoRacer. My blog: Information Management, Life & Petrol.
  57. Further information
      Articles including:
      • Seven deadly sins of data modelling
      • The IT Credibility Crunch
      • Information Management Deficiency Syndrome
      • Modelling is not just for DBMS’s
      • Data mining - where’s my hard hat?
      • Master data mix-ups
      • Drowning in spreadsheets
      • Why bother with a semantic layer?
      • Business Intelligence in a cold climate
      • Data Management is everybody’s business
      • Information superstition
      Download from: