
Data mart consolidation



This paper details the process of DMC at eight different organizations while capturing the keys to success from each. These case studies were specifically selected to demonstrate several variations on the concept of consolidation. While there is no such thing as a “cookie-cutter” DMC process, there are common best practices and lessons to be shared.




  1. White Paper. Data Mart Consolidation: Repenting for Sins of the Past. William McKnight, McKnight Consulting Group, www.mcknightcg.com
  2. Data Mart Consolidation (DMC): Contents

Part 1: Data Mart Consolidation (DMC): The Business Rationale
Building the Case for Data Mart Consolidation
The Benefits of the Program Approach to Data Warehousing
Desired Outcomes of Data Mart Consolidation
Approaches to Data Mart Consolidation

Part 2: The Interviews
3M: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment
Delta Air Lines: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment
Michigan Department of Community Health: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment
Healthcare Insurance Company: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment

  3. Contents (continued)

Royal Bank of Canada: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment
Major Telecommunications Company: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment
Anthem Blue Cross Blue Shield: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment
Sekisui Systems Corporation: The Pre-Consolidation Environment; Reasons for Consolidation; The Consolidation Project; The Benefits Realized; The Post-Consolidation Environment

Part 3: Best Practices for Data Mart Consolidation
Best Practices for DMC
Customer-Reported Keys to DMC Success
Author’s Additional Keys to DMC Success

About the Author
  4. Part 1: Data Mart Consolidation (DMC): The Business Rationale

Building the Case for Data Mart Consolidation

For much of the last decade, conventional theories surrounding decision support architectures have focused more on cost than business benefit. Lack of Return on Investment (ROI) quantification has resulted in platform selection criteria being focused on perceived minimization of initial system cost rather than maximizing lasting value to the enterprise. Often these decisions are made within departmental boundaries without consideration of an overarching data warehousing strategy.

This reasoning has led many organizations down the eventual path of data mart proliferation. This represents the creation of non-integrated data sets developed to address specific application needs, usually with an inflexible design. In the vast majority of cases, data mart proliferation is not the result of a chosen architectural strategy, but a consequence due to lack of an architectural strategy.

To further complicate matters, the recent economic environment and ensuing budget reduction cycles have forced IT managers to find ways of squeezing every drop of performance out of their systems while still managing to meet users’ needs. In other words, we’re all being asked to do more with less. Wouldn’t it be great to follow in others’ footsteps and learn from their successes while still being considered a thought leader?

The good news is that the data warehousing market is now mature enough that there are successes and best practices to be leveraged. There are proven methods to reduce costs, gain efficiencies, and increase the value of enterprise data. Pioneering organizations have found a way to save millions of dollars while providing their users with integrated, consistent, and timely information. The path that led to these results started with a rapidly emerging trend in data warehousing today – Data Mart Consolidation (DMC).
I’ve learned that companies worldwide are embracing DMC as a way to save large amounts of money while still providing high degrees of business value with ROI. DMC is an answer to the issues many face today. There is a way to cut BI costs and continue to deliver business value with BI. Others have done it and I’m going to share how they did it in this paper.

This paper details the process of DMC at eight different organizations while capturing the keys to success from each. These case studies were specifically selected to demonstrate several variations on the concept of consolidation. While there is no such thing as a “cookie-cutter” DMC process, there are common best practices and lessons to be shared.

The Benefits of the Program Approach to Data Warehousing

Tenets of sound business practices apply to data warehousing. One of these is the necessity to accomplish an objective in the most efficient manner. What is the most efficient way to accomplish data warehousing objectives? It’s the way that builds a data warehouse to solve specific needs, but does so in a manner that leverages previous investment in the architecture, tools, processes, and people and does not prohibit future growth. This enables an efficient, programmatic approach to data warehousing created to serve information to the enterprise. By leveraging an integrated data warehousing approach you will realize efficiencies generated by economies of scale.
  5. Efficiency as it relates to DMC comes in three primary forms. There are true cost efficiencies involving the hardware, software and personnel carrying costs of the environment and switching the costs over to a more manageable expense stream. Many in this study referred to these as “IT benefits” but lower Total Cost of Ownership (TCO) and economies of scale are business benefits as well. With one data warehousing program as opposed to many, fewer resources and processes need to be supported in an enterprise.

Secondly, there are efficiencies associated with having a “single version of the truth” to reference as opposed to engaging in internal “data warfare” or spending most of the “analysis” time searching for data or “making do” with undesirable, outdated data. As the interviews will attest, many companies were engaged in “data warfare,” but it’s not simply a matter of whose data is better. In many organizations, the best data is not accessible or the users are not trained on the access method. A central warehouse helps set aside the politics of whose data is better by establishing a consistent, trustworthy source of information. Creating a “single version of the truth” drives internal efficiencies by focusing resources on the value-added activities of business rather than data gathering activities.

Thirdly, there are system efficiencies to be gained by eliminating redundant processes. For example, although many are using the file delivery capabilities of operational systems to feed data to their data warehousing environment, getting data out of the source is still one of the most difficult tasks in data warehousing.
Usually the first extract request is not met with “open arms.” A second or third one can be impossible. This leads many to a “single extract, many load” architecture which solves some problems but not others.

Fortunately for those who have met the challenges, data warehousing has proved itself time and time again as a valid conduit for delivering data and data analysis into business processes and thereby improving them while helping the company achieve their stated goals. DMC allows organizations to reap the benefits of integrated, centralized data warehousing while delivering significant cost savings through internal efficiencies. In essence, it is the grand slam of IT initiatives.

Desired Outcomes of Data Mart Consolidation

Data warehousing is a process, not a project, and a journey rather than a destination. This applies to DMC as well. The case studies below represent several forms that DMC can take including merging data marts into a new warehouse, picking an existing warehouse/mart and merging other warehouse/marts into it, and moving analytical functionality from other databases onto a data warehouse. The consolidation itself can leverage existing designs and re-route Extract, Transform & Load (ETL) processes into the consolidated warehouse or consolidate designs as well as the platform.

This paper provides a framework of DMC reference points, lays out options for DMC and provides best practices for those considering, planning or doing some form of DMC.

Approaches to Data Mart Consolidation

Approaches and steps to DMC as well as maturity levels with DMC emerged from the interviews.

1. Rehosting – The process of picking up database designs and ETL “lock, stock and barrel” and moving it to a different platform either as an effort to gain performance or cost advantages. Often the rehosting will be done onto a platform with existing data constructs, thereby expanding the utility of the platform.
  6. 2. Rearchitecting – The process of merging database designs and therefore the data acquisition strategy for the data as well. Rearchitecting may involve picking the best model components from various models and/or it may involve more zero-based approaches, starting from scratch, that use requirements as the basis for the new model.

Part 2: The Interviews

3M

The Pre-Consolidation Environment

3M is a multi-faceted company that had a data mart environment which represented its diversity. Before the consolidation, they had 40 major data marts, several smaller ones, and some previous failed attempts at a data warehouse in the environment. Previous attempts at a more encompassing data warehouse had proved to be too constrained and inflexible to make it very far so a data mart environment had perpetuated over the years. The marts were solving numerous business objectives including decision support, financial and sales reporting.

There were 25 different platforms in place, “just about everything” according to Al Messerli, the former Director of the Enterprise Information Management Group at 3M and now with Allen Messerli Enterprise Systems, LLC. This included many UNIX, some Windows NT and some mainframe systems. All together, it was many terabytes and the environment had grown tremendously over 30 years so obviously it began well prior to market acceptance of data warehousing. This was a firmly entrenched environment many years in the making and it was going to be a challenge to consolidate it!

ETL was being done mostly through “pushes” from the operational environments with data pickup and movement through proprietary methods. There were also all kinds of data access tools and methods deployed in the pre-consolidated environment. As a result, major subject areas of sales, product, and customer were duplicated across these data marts and not in a consistent manner.
As a matter of fact, the main reason for the consolidation was the inconsistent results and inability to get a corporate-wide view of customers, which was creating enormous business pain. Without this one face to the customer, 3M was unable to get complete customer information due to the distributed nature of the data.

Reasons for Consolidation

Another major reason for consolidation was a very large opportunity to reduce environment carrying costs by eliminating data marts. 3M did a complete financial impact analysis of the consolidation. ROI was expected to be $20M per year! Indirect expense reductions from internal efficiencies were also projected to accrue. In addition, the consolidated warehouse was expected to help meet market and customer penetration as well as sales growth goals.

The project was made very visible to the user community. “Everybody knew” according to Messerli. The idea for consolidation was primarily from certain individuals in IT. Cultural resistance was faced and a year-long sell cycle from the C-level throughout the organization was required. Ironically, most of the resistance was from others in IT. The business saw the benefits more readily. Culture needed to be substantially changed to make it work for the enterprise and this required lots of selling “from the top-down and bottom-up”.
  7. This took the form of chalk talks, hands-on sessions, user groups and data trusteeship (a form of data stewardship). With 40+ data marts, there was a huge need to provide many mart users with a comfort level around a data warehouse environment and the concept of data sharing.

Data security would actually be improved with the ability to apply a consistent security policy at the data warehouse level and implement business unit specific security around subject areas.

The Consolidation Project

Since it was impossible to pick one from among the 40 marts to use as the conduit for the data warehouse, 3M built a brand new data warehouse from scratch to accomplish its consolidation objectives. It was going to be totally comprehensive, with atomic level detail on all business subject areas and constructs from the existing marts incorporated over time so the mart platforms could be retired.

The ETL was completely redesigned. In building the new warehouse, 3M made sure the new environment would include all the old functionality and then some. They did some zero-based analysis around business needs for a warehouse and how to construct the warehouse. It turned out that no pre-existing subject area in any mart was selected to move verbatim into the new warehouse.

The extract load on the source systems was not materially affected by the DMC since those systems had mostly been programmed to push ample data out previously and this was not changed. Furthermore, they were previously extracting detailed data so that was maintained.

3M normalized the new data model. The data warehouse team did extensive data comparisons between the legacy marts and the new warehouse to demonstrate that the data warehouse was correct (or if the numbers were different, that the data warehouse was “better”).

Each migration was a separate project and in total it took several years to get the functionality of all 40 into the warehouse. All 40 marts are now gone.
Data outages were managed with parallel runs, causing only glitches in a very complex undertaking. The team had top-down support after the year-long sell cycle for the effort and a “no choice” budget allocation back to business units.

The Benefits Realized

The benefits indeed were “many and large” and exceeded investment by quite a few times over. Benefits came in many business areas including procurement, finance, sales, marketing, supply chain and e-business.

The Post-Consolidation Environment

The consolidation is now complete. 3M chose Teradata for the data warehouse platform. Teradata was deemed to be the only solution that scaled to the eventual size and users they would have in a consolidated environment comprising hundreds of source systems, 5,000 tables, and 20,000 daily users. Scalability was the major driver behind this decision. They now have a consolidated and manageable set of data access tools and do ETL “one way.” The data warehouse is now 15 TB of total disk space and has over 10,000 users. The marts were eliminated. Many of their platforms were obsolete according to Messerli.
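The data comparisons between legacy marts and the new warehouse, and the parallel runs used during cutover, lend themselves to a simple automated check. The following is a minimal sketch under stated assumptions, not 3M's actual method: the function name `reconcile`, the idea of comparing per-key totals (for example, sales by month), and the tolerance value are all hypothetical and chosen only for illustration.

```python
from decimal import Decimal


def reconcile(legacy_totals, warehouse_totals, tolerance=Decimal("0.01")):
    """Return the keys whose totals disagree between two systems.

    Both arguments are hypothetical dicts mapping a business key
    (e.g. a reporting period) to a Decimal total. A key is reported
    when it is missing from either side or when the totals differ
    by more than the tolerance.
    """
    mismatches = {}
    for key in sorted(set(legacy_totals) | set(warehouse_totals)):
        legacy = legacy_totals.get(key)
        warehouse = warehouse_totals.get(key)
        if (
            legacy is None
            or warehouse is None
            or abs(legacy - warehouse) > tolerance
        ):
            mismatches[key] = (legacy, warehouse)
    return mismatches
```

Each reported mismatch would still need a human decision: as the interview notes, a difference may mean the warehouse is wrong or that the warehouse figure is the better one.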
  8. The environment continues to evolve with more business functions, subject areas, users, and subsidiaries coming on board. The new warehouse environment has opened up the data to channel partners and customers on a self-service basis. Corporate mandates support the shared, centralized warehouse concept now and 100% of ongoing data warehouse efforts go into the centralized, mission-critical data warehouse.

Top 3 keys to DMC success:
1. Getting complete buy-in from executives and throughout the organization
2. Good data standardization and a good data model
3. Good user tools to help facilitate user buy-in

Delta Air Lines

The Pre-Consolidation Environment

Delta had three databases called data warehouses by their users. All three were on Teradata and served Financial, Marketing and Flight data interests, respectively. There were only 50 users in total for all the warehouses.

The Financial warehouse was used for financial analysis. The 12 users primarily accessed the 100 GB warehouse with a modern data access tool. The Flight data warehouse supported revenue management – the effectiveness and profitability of flights. Its 12 users accessed the 700 GB warehouse primarily through a data mining tool. The largest of the warehouses was Marketing. It was used to look at frequent flyer information in order to adjust and judge the effectiveness of marketing programs. The 500 GB were accessed with both a modern data access tool and a data mining tool. None of the warehouses leveraged a packaged ETL tool.

Reasons for Consolidation

Ticket, flight and financial data were duplicated in the pre-consolidation environment and they were materially inconsistent in their representation of this data. This approach didn’t provide an accurate, consistent view of the same subject. This was not specifically traced to negative ROI impact but there was a general feeling of dissatisfaction and data disagreement within the user community.
There were separate staffs for each warehouse. A goal of consolidation was to bring the warehouse under one group, which caused consternation. Typically IT groups were functionally aligned and were the single points of contact for the business units. Consolidating caused different groups (functional and warehouse) to be making contact with the users and this had to be managed.

Additionally, there was conflict over which tool to use and when to use it. There was a desire to get to a standard tool set and develop a training program to help the casual user.

The main reason for consolidation was not cost savings, but was to get to an “enterprise view – a single source of the truth” according to Wayne Hyde, former IT Vice President at Delta Air Lines and now with Reflection Technologies. This would eliminate competition regarding whose data is best, which was previously left to IT to figure out. The consolidated warehouse would help put people on a common goal instead of being in competition.
  9. Bottom line improvement had to be demonstrated by getting data in the hands of lots of people besides the financial analysts. “If only 60 people have access, they will be overworked. But get the data to hundreds of thousands of people who can engage the data in an ad hoc fashion at the time they are performing business processes, they can exploit the data to perform better and impact costs, processes, fraud and recover revenue” according to Hyde.

IT did the analysis of corporate pain points and decided on DMC. The stated goal of the project was not the end-all data warehouse, but focused on consolidating the 3 existing warehouses and “let the future chips fall where they may.”

Several “IT” benefits were also expected including saving machine cycles by loading one copy of the data (vs. many), redeploying people to more productive value-adding work as opposed to redundant work, and better leveraging machine capacity. For example, during the DMC process, it was determined that different groups were trying to perform the same analysis!

In order to get DMC going, Delta took an ROI view of inefficiencies, redundancies, and software licenses. They did not establish quantifiable business ROI objectives for the initial transition, but asked the business for the ROI when determining what priority to train users for the new warehouse.

The Consolidation Project

Pleased with Teradata to date, Delta stuck with Teradata for the consolidated warehouse. The initial step was to consolidate platforms and copy the data warehouse designs for Flight and Financial data onto Marketing’s platform.

Once standard tools were selected, the team used zero-based analysis of business requirements to define data warehousing needs.
The users overwhelmed the data warehouse team with demand.

However, according to Hyde, “Replatforming reports is like trading cars but still using the car for the same routes. It might be a nicer car but it does nothing for ROI – just psychological benefits. You’ve got to provide some kind of incremental capability. One is changing the dimension of timeliness. There are some benefits from data marts but you still have different business units making decisions with different data. People need to look at the negative impacts of data marts.”

Delta ended up with multiple development teams organized under a central data warehouse team. They had a business specific team that did specific reports, ad hoc analysis and dashboard building. The platform consolidation took 18-24 months and yielded 60% - 70% of the enterprise view, the rest of which would be added over time.

Interestingly, they did not do parallel runs with the older warehouses. They just cut over after the platform movement and dealt with any issues. Extract loads on the source systems actually increased over time since the new data warehouse identified needs over and above those that the previous warehouses uncovered.

The Benefits Realized

There are numerous benefits cited for the consolidation but a good example is in Revenue Management. Delta Air Lines was able to contest tens of millions of promotional dollars that were claimed by travel agents. This analysis was made possible through a consolidated environment with a common view of the data. However, the real value was giving access to data to hundreds of thousands of people, not just a select few.
  10. The Post-Consolidation Environment

The consolidation is complete and the two warehouses that were consolidated are history. The DMC of the three warehouses also led to a total of 27 marts being eliminated. Delta Air Lines is focused on its architected data warehouse now, which is 4 TB usable data on Teradata and uses an ETL tool in places with an entirely different data access tool than before. Users were consolidated from the Finance and Flight data warehouses and the user base has grown over time to 4,000 users.

Top 3 keys to DMC success:
1. Having a strategic vision of where you are going from an enterprise view of the data
2. Having a delivery of new capabilities, not just the old. Need NEW capabilities to establish new points of memorable value to be tied to the effort.
3. Senior level understanding of the vision (sponsorship)

“Miss any one and you can be dead. If you have the strategy without the sponsorship, you can get started but not finish. If you have strategy without delivery, you’ll be condemned” according to Hyde.

Michigan Department of Community Health (DCH)

Pre-Consolidation Environment

Starting in 1994, DCH began storing Medicaid paid claims on their data warehouse, which maintains 5 years’ worth of paid claims. They have 1.2 million Medicaid recipients and the majority of claims are paid through managed care. In 1998, they also started receiving encounter data and accumulated 66 million encounter data records to date, which are records of interactions between members and care providers.

David McLaury is the Director for Project Development and Implementation. The Department of Community Health represents the largest user of the data warehouse environment in Michigan. The State of Michigan operates an enterprise data warehouse, which multiple state agencies utilize. It is a Teradata implementation.
The department also operates a number of Oracle operational databases that were being used for analytical work in addition to operational needs. Users did not and could not have robust data access tools due to how the tools would interfere with the system’s primary operational purpose.

Reasons for Consolidation

The main reason that these operational databases were consolidated into the data warehouse was to provide better query and analytical capabilities. By consolidating these databases onto the warehouse, they are now also able to move information onto a new data mart, which uses a MedStat schema and is also run by the department.

The idea for the DMC came from business needs. McLaury chairs a departmental committee that oversees the project and approved the DMC. The user community was actively involved in the consolidation, including acquiring the necessary federal funding to support the project.
  11. One goal for the DMC was to create an integrated data warehouse environment that they could manageably add onto over time and was available to department managers for all kinds of programs, not just those known initially.

There was concern about losing control and especially about security. Data owners must sign off on new users and these users must sign usage agreements. These programs helped assure that owners still felt like owners and alleviated cultural resistance.

The Consolidation Project

DCH created new data flows from the databases to the data warehouse. The additional data and emphasis on the data warehouse supported additional data cleansing activity. Data requirements were re-gathered and analysis was done on the requirements to understand what operational data was required for analytical purposes.

Some database redesign was necessary in the move but some legacy designs were good enough, even for analytical purposes. The consolidation will take 2 years and is being done by stepwise movement of the data from the operational databases into the data warehouse.

The Benefits Realized

The benefits of DMC have been broad-based, especially in analytical areas. An example is the ability to cross-compare Medicaid paid claims and encounter data to other departmental data sets.

The users are still adjusting to having access to more data. While many still take the approach of accessing the same data as before only in a different database, access to multi-source data for program purposes will over time provide the biggest benefits of the DMC. The more data is added, the more benefits will grow.
This will include data such as long-term care, nursing facilities, mental health services, substance abuse services, and dental services over the next year.

The Post-Consolidation Environment

The Oracle operational databases were and are still available, but they are not nearly as attractive for analytical purposes because the data warehouse is now available with clean, integrated, and historical data modeled for access and analytics. Reporting is also being moved to the data warehouse. The data warehouse (combined with the MedStat data mart) is 500 GB with 270 users.

BULL is the state’s contracted entity for the Teradata warehouse. This decision was originally made through competitive bids. As a scalable platform available to a variety of leading tools, Teradata was kept in place for the added data the DMC brought into the data warehouse.

Top 3 keys to DMC Success:
1. Leadership and agreement that you have to do DMC
2. Show the ROI for DMC before and after
3. Have sufficient funding for the effort
  12. Healthcare Insurance Company

The Pre-Consolidation Environment

Before the merger of the two companies that formed this health care insurance company, there was a mainframe data warehouse at one and a Teradata data warehouse at the other. The Teradata data warehouse actually acted more like an Operational Data Store (ODS) in that its data was immediately available to users after the data was generated in the operational systems.

After the merger, this Teradata data warehouse became the feeder system for the mainframe data warehouse. Eventually, both of the systems gave way to a new Teradata data warehouse – one destined to be this company’s consolidated data warehouse. In addition, there is still another data warehouse in the environment that is not yet part of the consolidation effort.

Prior to any consolidation, this company had three different ETL processes, three sets of definitions, some of the same data in three places, some critical data missing from the warehouses, and customer tracking being done in multiple data warehouses. There was “extra everybody effort with extra cost on users – joining data from different systems and learning different systems” according to the Director of the Data Warehouse.

Each data warehouse had different reconciliation masters (one to the general ledger, one to invoices and one to cash). So the data did not easily reconcile and there was a cost associated with bringing it all together. In most cases, the atomic level detail was captured everywhere although the summaries and some minor aspects were different between the warehouses. For example, the Financial data warehouse has 90% currency-type fields so there are shorter records but it is still detailed.

There were also homonyms and synonyms if you looked across the warehouse environment, which created confusion for the users who frequently had to access data across different warehouses to accomplish a business objective.
Reasons for Consolidation

While direct carrying cost reduction was expected, this was not the most important or the largest benefit. Although originally perceived as IT cost savings (because IT came up with the project idea), DMC was positioned to provide business benefit. IT savings alone would not have justified it.

The architectural goal of an enterprise-wide data warehouse was made very visible to the user community. They had a business owner of the project and a steering committee.

Interestingly, there were more privacy issues when the data resided on 3 different data warehouses than there were after the initial consolidation was completed!

The Consolidation Project

The DMC thus far has consisted of rehosting the (former) mainframe data warehouse to Teradata for performance reasons and also feeding a separate schema from the ODS-like data warehouse – two separate schemas for the two pre-merged organizations – but at least sitting in the same Teradata instance. This allowed the pre-merger Teradata data warehouse to focus solely on the organization’s needs for an ODS.

The parties involved recommended benchmarking to make sure the chosen DMC environment would perform as advertised. Although they’d had Teradata for almost 10 years, they ran benchmarks prior to confirming its selection for the consolidated environment. Teradata solved an immediate pain point by delivering a 5-fold performance increase compared to the mainframe data warehouse.

Moving the existing ETL streams, access environments and database designs to the consolidated platform was the first step of the DMC. Most of their data transformation
happens in the mainframe operational environment anyway, so the Extract and Transformation stayed the same. Only the Load changed for the DMC. The number of extracts has, however, been reduced by the consolidation. To ensure integrity, a parallel run of about three months occurred for each pre-consolidated warehouse.

The consolidation of the third data warehouse (previously mentioned as outside the scope of consolidation thus far) and the redesign of the schemas remain to be accomplished. So, while many of the challenges in the pre-consolidation environment have been met by the DMC efforts to date, there is still much work to be done.

The Benefits Realized

The larger benefits for this DMC came from the business perspective, specifically more timely data to make better decisions and the ability to turn around requests quickly by not having to reconcile data and prove use of the “right” data. An example of this is profiling providers and determining whether members are being treated appropriately. This was improved upon by consolidating the data warehouse environment.

The Post-Consolidation Environment

The mainframe cycles were re-dedicated to OLTP-type work. The warehouse is providing detailed data to support complex and diverse user queries in a manageable way. There will be more to this DMC story since it is not complete. Stay tuned.

Top 3 keys to DMC Success:
1. High levels of business customer support: it’s not all IT
2. Know going into a DMC that you are fixing a business problem
3. Benchmark to determine the best platform to use

Royal Bank of Canada (RBC) Financial Group

The Pre-Consolidation Environment

RBC Financial Group had a 2.5 TB data warehouse along with several predominant data marts, some of which pre-dated the data warehouse. These marts ran on heterogeneous database platforms.
These data marts were loaded from a combination of source systems, flat files and the enterprise data warehouse. There were numerous ways to load and access data, different staffs for the different marts and the warehouse, and a variety of vendor tools deployed to access the data.

Systems and Technology within RBC Financial Group conducted a health check on the data warehouse environment. As a result, the decision was made to transform to a hub and spoke environment, which would simplify the ETL and processing and optimize resource utilization. According to Mohammad Rifaie, the Group Manager of Information Resource Management at the RBC Financial Group, “Data integration is absolutely critical to create a ‘Single Version of the Truth’ whereby all business information/data is unified and shared across all functional departments. This enterprise-wide view of our customer behavior along with operational data will allow for analysis and insight that was not possible before. A consistent, single view of our data should improve sales, reduce operational costs, increase customer retention and satisfaction, and ultimately lead to maximized profitability.”

Reasons for Consolidation

Technological constraints imposed by existing multiple processing platforms made it very difficult to share data. As a result, much data was replicated. This also resulted in
duplication in resources and processes, which led to a higher cost of ownership and a greater potential for inconsistency. An impartial assessment of the data warehouse environment by an analyst group advocated consolidation onto a Teradata platform if RBC Financial Group was to reduce costs, improve the effectiveness of the environment and realize its strategic objectives.

Besides prohibitively higher operating costs, different processing environments prevented RBC Financial Group from leveraging all sources of information. The data stored in independent data marts usually encompassed one or two subject areas (sales, marketing, customer service) and failed to provide an integrated environment that allowed the various pockets of information to be shared and leveraged across the organization.

RBC Financial Group had top-down support and strong executive sponsorship for the effort. Both were cited as keys to success. Although it was not stated that the new data warehouse would be the final architecture for data warehousing, that’s how it worked out. RBC Financial Group will now have a physical mart only for geographical purposes. “DMC is like having a rearview mirror AND a front windshield,” according to Rifaie.

The Consolidation Project

The first step was to port the data to the single platform, then “rationalize” the data, removing duplicate data and unneeded ETL. They did not redesign initially; they “forklifted” the existing designs. Then they redesigned and rearchitected. They have just finished the redesign of the client subject area, which is the most widely used, and are now re-doing the ETL to load the new tables and removing the legacy constructs. Arrangement, Product and other subject areas will be done this way as well.

RBC Financial Group chose an existing platform to consolidate onto.
They had analyst help in choosing the solution for their DMC and they chose Teradata due to multiple areas of savings and benefits, including high availability and reliability. They had 99.995% availability in the 7 previous years with Teradata, which Rifaie says is “built for data warehousing and they have compression and economical indexing. TCO is low for Teradata.” Cultural challenges were overcome by keeping the focus on TCO and nothing else.

The technical re-porting took 4 months (with one more mart to go). The team had to “steal machine cycles whenever they could – after midnight, weekends, etc.” to keep from impacting user environments. There was no impact on the source systems for DMC, since the systems put out files for the data mart/warehouse environment to pick up (both before and after the DMC).

Parallel runs with the legacy marts and warehouse lasted 1 month after the queries were converted to the new warehouse, during which time they were able to procure a written sign-off from every client of the data warehouse. The nodes, disk, and software that the marts and warehouse resided on were then deployed elsewhere.

The Benefits Realized

“Data Warehousing is about repenting for the sins of the past,” according to Rifaie. “The data warehouse is corporate memory. Redundant data is difficult to control. In a data mart, the primary key might be a numeric identifier column but it might be different in another mart where it might be dual-columns. It will be problematic to join the data from these two.”

For example, once the Business and Personal Marketing data marts are on a single Teradata platform with the EDW, there will be additional revenue and cost-avoidance opportunities. This will be followed by a subsequent data rationalization project to eliminate unnecessary data and process duplication between the EDW and the data marts.
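The key mismatch Rifaie describes, a single numeric identifier in one mart and a two-column key in another, is commonly handled with a crosswalk table that maps one key form to the other. The following is only a minimal sketch of that idea in Python. The mart names, column names (`client_id`, `branch_no`, `account_no`) and sample rows are hypothetical and are not taken from RBC's environment.

```python
# Hypothetical rows from two marts that identify the same clients differently.
marketing_mart = [
    {"client_id": 1001, "segment": "personal"},
    {"client_id": 1002, "segment": "business"},
]
service_mart = [
    {"branch_no": 12, "account_no": 555, "open_tickets": 2},
    {"branch_no": 40, "account_no": 777, "open_tickets": 0},
    {"branch_no": 99, "account_no": 1, "open_tickets": 5},  # no known mapping
]

# Hypothetical crosswalk built during consolidation:
# (branch_no, account_no) -> client_id
crosswalk = {(12, 555): 1001, (40, 777): 1002}


def join_on_client(marketing_rows, service_rows, key_map):
    """Join the two marts on client_id, translating the two-column key first.

    Returns (joined_rows, unmatched_service_rows).
    """
    by_client = {row["client_id"]: row for row in marketing_rows}
    joined, unmatched = [], []
    for row in service_rows:
        client_id = key_map.get((row["branch_no"], row["account_no"]))
        if client_id is None or client_id not in by_client:
            unmatched.append(row)  # needs manual review before rationalization
            continue
        joined.append({**by_client[client_id], "open_tickets": row["open_tickets"]})
    return joined, unmatched


joined, unmatched = join_on_client(marketing_mart, service_mart, crosswalk)
```

Rows that fail to map are returned separately rather than silently dropped, so they can be reviewed during a rationalization step.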
DMC also positions RBC Financial Group to handle new data and business requirements more effectively. These include business centricity, effortless scalability, high user concurrency, ease of access, complex and ad hoc query performance, data centralization, fast fail-safe data load utilities, capability to handle multiple subject areas, open access, integrated metadata, generic modelling, data-source neutrality, and software addressing all critical components of the architecture.

By consolidating data marts and the enterprise data warehouse onto the same platform, RBC Financial Group has been able to improve overall profitability by:
• Lowering the total cost to own, operate, and expand the data warehouse environment
• Reducing the requirement for scarce and expensive skill sets
• Enabling data integration across functional areas
• Improving efficiency in making data available to meet changing business requirements
• Providing an enterprise-wide “single version of the truth” spanning from customer information to actionable data
• Facilitating easier implementation of Data Governance and Privacy and Confidentiality
• Shortening the supply chain for data access so they can see a client’s complete relationship to the bank in one place, which has helped improve client relationships

The Post-Consolidation Environment

There is one ETL tool with one way to do ETL now. The data warehouse is 3 TB and supports 2,500 to 3,000 users.

Top 3 keys to DMC Success:
1. Base a business case on real savings. Architecture doesn’t sell. Avoid technical terms. Build the case on savings of FTEs, operations and strict TCO.
2. Make sure to obtain support of business partners at the highest level.
3. Make sure to communicate with users about schedules and changes. Do hand-holding. You may need to change their queries for them. Get sign-off.
Have communication surveys and parallel runs.

Major Telecommunications Company

The Pre-Consolidation Environment

The pre-consolidation environment had over 70 “reporting systems” which served business unit-specific purposes. There was nothing that, from a 10,000-foot perspective, resembled a data warehouse. None of the marts had enough of a footprint to be considered “major” from the big picture perspective.

Each mart had a “handful” of users, and people used whatever they had informally learned held the data they needed and that they could get access to. Their choices were not always best for their needs, but without an organized approach, this was the environment.
The reasons for the marts were as vast and numerous as the platforms. The environment had grown over “centuries” and, without central management, it was difficult to tell how large the environment really was, although it could easily be surmised to be multiple terabytes. Only when an inventory was done for evaluating DMC opportunities did this company realize the extent of the problems.

ETL was hand-coded since individual projects could not justify the purchase of a tool. Data access tools and methods were numerous and not very robust. Additionally, support staffs were numerous and not dedicated since they resided in business areas.

Major subject areas were duplicated across the pre-consolidation environment but, more importantly, business functions were also duplicated. Not only were they duplicated, as you would imagine with so many marts, they were inconsistently duplicated. The main problems were data duplication and confusion as opposed to inconsistent representation problems.

Reasons for Consolidation

The idea for DMC actually came from the business side. This project had to focus on direct expense reduction as the main key to success. This meant achieving the goal of reducing technical support and maintenance requirements.

Top-down support for the anticipated savings helped them deal with cultural resistance, which can be the age-old conflict between centralization and decentralization. Interestingly, privacy was no more an issue in the new environment than it was in the old environment.

The Consolidation Project

They built a new data warehouse on Teradata to absorb all the data marts. They rerouted the existing streams but added others that were necessary for the consolidation. They’ve done “a little redesign as we go and we’ll see when we’re done if more is necessary,” according to the leader of the effort.
The ETLs and database design were changed but, like several other DMCs, they relied on source system file outputs, so the extract load on the source systems was not affected much. For a few of the mart consolidations, parallel runs of 1 month were done.

The complete consolidation of all the marts, which is still occurring, will take approximately three years in total. Phase 1 delivered 22 of the data marts and took two years, which included scoping, planning and financial analysis. They have picked up momentum and experience and anticipate finishing the remaining 48+ in the next year.

The Benefits Realized

The users have acknowledged that the new tools are better and there are many benefits, especially in the financial reporting environment. They get financial insight they never had before, especially into their quote-to-cash cycle. They can now make more immediate decisions with consistent data usage, less data latency and less redundancy. Overall, it’s a more efficient business operation.

The Post-Consolidation Environment

The Teradata warehouse and standardized ETL and data access tools dominate the new environment, which was consolidated around this one set of tools. There are 2.3 TB of usable disk, which will be doubled by the project’s end. There are 5,000 users and the data-using community has greatly expanded with this project.

Teradata was chosen for the DMC since it was already being used for some internal applications and they were not sure how well alternative products would hold up. Scalability was “Very important. With Teradata you know you can easily keep adding
nodes, but with SMP, you can add CPUs but you get to diminishing returns. They start battling among themselves if you get too many of them.”

Normal growth has occurred in the warehouse since it went into production, but they are more focused on bringing the other marts in, not advancing what they’ve already brought in. While some of the marts still exist, they serve no purpose. All will be gone soon. The success of this DMC is assured.

Top 3 keys to DMC Success:
1. Executive buy-in
2. Proper data management and data modeling techniques
3. A team that is knowledgeable with the chosen toolsets

Anthem Blue Cross Blue Shield

The Pre-Consolidation Environment

Anthem’s consolidation was a result of the merger of Blue Cross Blue Shield companies in Indiana, Kentucky, Ohio, and Connecticut in the period 1993-97. There were three incompatible data warehouses that needed to be brought together to provide a consolidated view of the business.

Each Blue Cross Blue Shield plan used its data warehouse for pricing, understanding treatments, some fraud and abuse, group reporting, utilization, underwriting, provider contracting and affairs management, exposure, mandatory government reporting, and experience analysis. They were initially implemented in the several years prior to the merger, some as far back as 1990.

Two were mainframe warehouses and one was on Teradata. Each was hundreds of gigabytes in size. “In a way we were fortunate that each state had a data warehouse. Everyone was used to using a data warehouse and there being a data warehouse around, but also each state had its own representations of data, its own technologies and its own way of doing things,” according to the author.

ETL was done with COBOL code and there were numerous data access tools and methods.
Many of these methods were of the heavy-lifting, programming variety, such as Visual Basic, Microsoft Access, Q&E, PowerBuilder, and CLIST applications.

Major subject areas were duplicated in the environment since each data warehouse was developed not only by different staffs, but by different staffs in different companies. The models were vastly different.

For example, the representation of a customer was by policy in one, by customer ID in another and by something else in the third. This inconsistent representation worked for each independent company, but when the companies got together, it presented problems.

Reasons for Consolidation

DMC was sold mainly on the idea of integrating data to get cross-company views, from which Anthem would have much richer data for functions like fraud detection and claims re-routing to best-of-breed providers. The idea came from a combination of IT, the business, and Teradata. Each state was vested in its representations and the use of its data. Each was a $1B+ organization, so this was not a small effort.
The project was made very visible to the user community. The chief actuary of the consolidated company, who represented much of the usage, was also the executive sponsor. Still, there was cultural resistance that was appeased by keeping their data warehouses alive while the new data warehouse was built.

“There are always privacy issues when dealing with healthcare data. Some state specific data is not accessible by all – only by personnel in that state. Even though the companies were merged, it would take quite some time to merge all business processes,” per the author.

The Consolidation Project

The goal of the DMC was to establish one data warehouse that would, by default, receive data from all data warehouses in Blue Cross Blue Shield plans that were being merged with Anthem. Ohio’s was the most recently built data warehouse. It was built on Teradata and the chief sponsor of the project was Ohio’s chief actuary in the pre-consolidation environment. For this reason, and the good experience to date with Teradata, it was selected as the platform for the ADW (Anthem data warehouse).

Anthem began bringing data in from the other data warehouses. New streams were created for the Indiana and Kentucky plan data and each subject area went through the process of data element comparison, logical modeling, database design, code value comparison, data transformation, and implementation for the design.

The database design, ETL processes, and access environments were changed. Each subject area was redesigned to represent the single version of the truth. Anthem wanted one scalable data model for absorption of new data sources or new Anthem, Inc. acquisitions. “Being able to access, review, analyze and share data across the company made all the difference between success and failure,” according to the author.

The consolidation took about one year, although this included other normal development and creation of value-added functionality.
Top-down support, combined with parallel runs to make the transition smooth, helped overcome cultural resistance.

The Benefits Realized

There were many benefits of the consolidated data warehouse. Some were related to the consolidation of the prior data warehouses and some are related to the ongoing developments on the data warehouse.

The DMC helped Anthem win new business, and so generate income, because of its flexibility and reporting capabilities. The cost of care was lowered by $250M annually by using the ADW to identify patterns in the data that allowed Anthem to build better networks and craft the network reimbursement arrangements in different ways. The ADW was instrumental in reducing the cost of products for policyholders and members (i.e., pay VALID bills ONCE). The ADW is used to ensure practitioners are licensed to perform, and Anthem was able to secure lower costs from providers by dealing with them based on their profitability as determined by the data warehouse.

Anthem was also able to reduce Caesarean section rates, improve the results from coronary bypass surgeries and improve staff productivity. “I don’t think much of this could have been accomplished without a single version of the truth,” according to the author.

The Post-Consolidation Environment

The initial consolidation of the 3 states into 1 data warehouse is complete.
The multi-terabyte Teradata warehouse has hundreds of users. The former warehouses were still in place well after the DMC given their new role of feeding the ADW. Other Blue Cross Blue Shield plans still need to be brought in, so this is a work in progress.

Teradata was chosen because of Ohio Blue Cross Blue Shield’s good experience with Teradata and its known scalability. If the chosen solution had been unable to handle the large workload, the shared concept would have died and Anthem would have stayed with separate data warehouses, which means they wouldn’t have gained half of what they did with the ADW, and would have wasted millions!

Top 3 Keys to DMC Success:
1. Strong, active executive sponsorship keeping the project out of internal politics
2. Source the data warehouse from operational systems, not existing data warehouse/data marts
3. Create a program with standards and processes

Sekisui Systems Corp.

The Pre-Consolidation Environment

Sekisui had seven data marts distributed to branch offices and fed from a central data warehouse. These supported a variety of business functions, such as increasing the frequency of effective customer calls by saving time in creating meeting materials, automating the sales cycle, and providing information directly to selected customers.

The environment has grown over time in both data size and number of users. Before the DMC, the marts had 110 GB of total disk space with 66 GB used. There were 400 users.

Despite the number of marts, they managed to keep consistency among the DBMS, the ETL, and data access for all of them. They also had only 1 DBA for all seven marts, but there were still efficiencies to be gained from DMC.

Reasons for Consolidation

One anticipated benefit was cost reduction by consolidating the machines from seven branch offices.
It was time to replace some of these anyway due to obsolescence, further opening the door to DMC.

Another benefit was to unify the system operation and further standardize the operating skill of the enterprise on a platform they could grow with. “Since the Teradata warehouse was already constructed, we wanted to standardize the operating skill on Teradata by consolidating the data marts to Teradata,” according to Masaaki Kondo, Director of the Corporate Group Systems Division at Sekisui.

The ROI was estimated by comparing the costs of continuing to license the branch office machines to the cost of a consolidated approach. The System Operating Department Manager (IT) came up with the idea for DMC at Sekisui. There wasn’t any cultural resistance.

The Consolidation Project

Sekisui consolidated onto the existing Teradata data warehouse by redesigning the entire system. Extract loads on the operational systems were reduced with the DMC. The project took 6 months, just as expected.
The Benefits Realized

The project is complete and the data marts for the sales database systems are consolidated. The planned benefit, direct expense reduction, was achieved. Support costs were reduced and all access is now against the data warehouse.

The Post-Consolidation Environment

Scalability was crucial for data expansion. Concurrency was also immensely important since usage concentrated around 9 a.m. system-wide. If the DBMS were unable to handle the workload, Sekisui would be isolated from information on member daily sales activities and division managers’ sales results. All organizations in the Sekisui group using the system would be affected.

Top 3 keys to DMC Success:
1. Create an organized data warehouse (not data marts) which is best suited to your goals
2. Educate the end users on the project and secure their agreement
3. Unify codes and subjects in a consolidated environment
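The third key, unifying codes, can be illustrated with a small lookup that translates each source mart's local code values into one shared vocabulary. This is a hedged sketch only: the source names (`branch_a`, `branch_b`), the code values and the `unify_code` helper are hypothetical illustrations, not Sekisui's actual codes or tooling.

```python
# Hypothetical per-source code tables: each mart used its own values
# for the same business concept (here, an order status).
CODE_MAPS = {
    "branch_a": {"01": "OPEN", "02": "CLOSED"},
    "branch_b": {"O": "OPEN", "C": "CLOSED"},
}


def unify_code(source, raw_code):
    """Translate a source-specific code into the consolidated code.

    Unknown codes are flagged as "UNKNOWN" rather than guessed, so they
    can be resolved with the data's business owners.
    """
    mapping = CODE_MAPS[source]
    return mapping.get(raw_code.strip().upper(), "UNKNOWN")


print(unify_code("branch_a", "01"))
print(unify_code("branch_b", " c "))
print(unify_code("branch_b", "X"))
```

Keeping the mapping as data rather than as logic scattered through ETL jobs makes it easier to review with end users, which fits the second key about securing their agreement.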
Part 3: Best Practices for DMC

Key Findings from the Interviews:
1. The number of marts/warehouses consolidated ranged from 3 to 70 with a median of 7.5.
2. The majority of environments had duplicate and inconsistent data across the pre-consolidation environment.
3. The primary reason for DMC varied, with very strong opinions for the reasons cited! Five quoted business rationale, such as creating a consolidated view of customers, as the main reason, while three quoted IT cost reductions as the main reason.
4. All performed at least some manner of rearchitecting, although several made this a later-stage step that came after rehosting.
5. Except for the case where the consolidated databases had operational functions to perform in the environment as well, only one kept the consolidated marts/warehouses in the environment after the DMC. The old platforms were redeployed to other uses or, in most cases, eliminated.
6. Every DMC was made very visible to the user community. These projects required a great deal of support, which most received from the highest levels of the organization. It was not possible to accomplish DMC objectives in a skunkworks manner.
7. Very few user data access outages were reported. Most DMC programs took great caution to transition users smoothly to the new environment.
8. Five programs credited IT with the idea for the DMC. The other three credited the business with the initiative.
9. All said scalability was important to the data warehousing decision. Many referenced the sudden increase in data and users that the warehouse would be taking on after the DMC as putting scalability at the top of the criteria list.
10. Almost every DMC faced some degree of cultural resistance to the idea of consolidating and centralizing. Most of this was adeptly dealt with through attaining top-down support and cultivating user interests throughout the project.
The majority of resistance went away as soon as early benefits of the DMC were realized.
11. The DMC efforts changed the impact on operational systems very little.
DMC can be used to put in place a scalable, integrated, multi-application data warehouse that absorbs all analytical-type activity in an organization, or it can be used to “simply” get an antiquated system out of the environment by moving its function to a system still under support from its vendor. Regardless of the ambition, many DMC efforts eventually lead to the first goal. The act of initiating the consolidation idea within an organization seems to spawn more and more consolidation.

For those organizations that are considering DMC and will have the opportunity to plan its success, some best practices emerged from the interviews as well as anecdotal evidence. The keys are also applicable to newer data warehouse efforts or those being revamped to a centralized data warehouse environment.

Customer-Reported Keys to DMC Success
1. Get top-down support. This was cited as the #1 key to success in 5 of the cases and was a top 3 key in all but one case.
2. Fix a problem. Whether you justify on cost savings or a business benefit (or both), the DMC should fix a major, known problem that can be quantified in business terms.
3. Have data standards and a sound data model.
4. Pick the right tools and platform. Put DMC on a scalable platform. The data volume managed within a single database will grow sharply and immediately with DMC, and future efforts will continue to grow the environment. Note also that many took the opportunity of changing platforms to change data access and ETL tools as well.
5. Set expectations and communicate with users. There is no such thing as overcommunication in a DMC project. This is about the users, and care needs to be taken to migrate them without any disruption in their ability to access data.

Author’s Additional Keys to DMC Success
1. Don’t just rehost, rearchitect.
This time of transition is also an opportunity to reevaluate the data warehouse program according to established best practices: a time to evaluate what is and isn’t working and to take full advantage of the new platform and the migration process.
2. Starve the pre-consolidated marts of attention and resources. Negotiate the condition for user sign-off prior to DMC. Make sure all utility is removed from the marts.
3. Justify on either platform cost savings, business benefits or both. The larger the project, the more difficult the technical challenge of DMC and the more evident the platform cost savings. It is always easiest to justify on cost savings, but business benefit based on delivering new capabilities can be significant.
4. Expect and plan for cultural resistance. Ownership, as a concept in the former environment, may now be designated at a subject area level as opposed to a data mart level. Carry forward security and stewardship designations and responsibilities to the consolidated data warehouse. This may even be a time to improve these programs.
5. Consolidate ETL and access tools too. Re-gathering requirements for a DMC is an opportunity to ensure that tools are still compatible with the new platform and are the most fit for purpose.
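Several of the interviewed organizations ran the legacy and consolidated environments in parallel before obtaining user sign-off. One way to make a sign-off condition concrete is a reconciliation check that compares the same control totals from both environments. The sketch below assumes each environment can produce a dictionary of totals keyed by measure name. The `reconcile` function, the measure names and the figures are hypothetical and do not come from any of the case studies.

```python
from decimal import Decimal


def reconcile(legacy, consolidated, tolerance=Decimal("0.00")):
    """Compare control totals from a parallel run.

    Returns a list of (measure, legacy_value, consolidated_value) for every
    measure that is missing on one side or differs by more than `tolerance`.
    An empty list means the two environments agree.
    """
    mismatches = []
    for measure in sorted(set(legacy) | set(consolidated)):
        old = legacy.get(measure)
        new = consolidated.get(measure)
        if old is None or new is None or abs(old - new) > tolerance:
            mismatches.append((measure, old, new))
    return mismatches


# Hypothetical control totals from one day of a parallel run.
legacy_totals = {"claims_paid": Decimal("100.00"), "invoices": Decimal("250.50")}
new_totals = {"claims_paid": Decimal("100.00"), "invoices": Decimal("250.75")}

for measure, old, new in reconcile(legacy_totals, new_totals):
    print(f"{measure}: legacy={old} consolidated={new}")
```

`Decimal` is used instead of floating point so that currency totals compare exactly. A nonzero `tolerance` can be passed when small, agreed rounding differences are acceptable.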
About the Author

William McKnight is founder and president of McKnight Consulting Group, a consulting firm specializing in data warehousing solutions. William is an internationally recognized expert in data warehousing and MDM with more than 15 years of experience architecting and managing information and technology services for G2000 organizations.

William is a frequent and highly rated speaker at major worldwide conferences and private events, providing instruction on customer intimacy, return-on-investment, architecture, business integration, and other business intelligence strategic and architecture issues. He is a well-published author and a columnist in Information Management for the column “Information Management Leadership”.

A regularly featured expert on data warehouse/business intelligence and MDM at major conferences, William is widely quoted on data warehousing and has been featured on several prominent expert panels. An expert witness, skills evaluation author and a judge for best practices competitions, William is the former executive of a recognized best practices information management program.

5960 West Parker Road
Suite 278, #133
Plano, TX 75093
(214) 514-1444