Enterprise Information Management: Strategy, Best Practices & Technologies on Your Path to Success


Authored by Frank Dravis, Baseline Consulting, this paper discusses: (1) EIM strategy development and (2) enabling information management technology. Understanding these two areas is crucial to starting, planning and executing an EIM initiative.

Published in: Business, Technology

Data Management White Paper
Enterprise Information Management: Strategy, Best Practices & Technologies on Your Path to Success
by Frank Dravis
Sponsored by
Contents

Executive Summary
    The Business Value of EIM
    Getting Started
EIM Strategy
    What Goes into an EIM Strategy
    In Favor of Pragmatism
EIM Best Practices
    IT and Business Collaboration
    Trusted Information
    Enterprise-wide Reuse and Standards
    Data Governance
    Taken Together
Requirements for Information Management
    SOA Support
    Centralized Data Management
    Complete Functionality
    Seamless Integration
    Ease of Use
Information Management Software
    ETL
    Data Quality
    Metadata Management
    Master Data Management (MDM)
In Closing
Executive Summary

When faced with information management issues, particularly those in a cross-functional setting, many business and IT professionals turn, albeit often unwittingly, toward Enterprise Information Management (EIM). EIM is the effort and practice of reaching across all data and application silos embedded in the organization’s operating infrastructure, then binding those repositories together into one effective information management environment where information is delivered to the person who needs it, when they need it, and how they need it.

EIM, as the term denotes, spans the entire corporation, regardless of size, from a small, 30-person garment maker to a 50,000-person multinational manufacturer. Agility, accuracy, and completeness of data delivery are the three primary objectives. An EIM initiative will often be launched well after the organization has implemented its patchwork infrastructure of disparate repositories and applications, signifying a creeping recognition that data integration is broader than individual systems and organizations. As data management practices evolve and become adopted, companies realize that they can use their information more effectively if they take their overall information architecture to the next level: one in which disparate, siloed repositories and applications are instead planned and designed to interoperate and deliver information quickly, completely, and in the correct context.

An entire book would be needed to cover EIM in the depth and breadth that it deserves. The goal of this paper is to paint the EIM landscape, noting its components but focusing on the importance of an overarching EIM strategy that targets corporate objectives while at the same time offering cross-functional support.
Knowing that EIM exists is the first step towards understanding how business issues fit in the information picture. With that overall view, the business and IT manager will be better equipped to discuss, compose requirements, and draft designs for the modern information management environment.

Given the breadth of the EIM domain, which is essentially any policy, practice, process, or technology that manages information, this paper will delve into two areas that can deliver immediate value to the reader today: (1) EIM strategy development and (2) enabling information management technology. Understanding these two areas is crucial to starting, planning, and executing an EIM initiative. The strategy lays out the blueprint of the EIM initiative, communicating the vision, goals, and prioritized projects. And while there are other important technology concepts in EIM, such as data warehousing and data security, only a corporate-wide data management vision can bind disparate, heterogeneous data sources together in a framework for access and sharing of data. This is a fundamental goal of EIM. As such, we will discuss metadata management, master data management, data quality, and data migration, all of which play important roles in integrating and managing data.
The Business Value of EIM

EIM is about managing information assets across the entire enterprise. The enterprise can be large or small, with several divisions or business units, or it can be a single functional entity. Whatever its scope, EIM involves fostering, creating, and maintaining practices that allow the business to optimize data access and usage regardless of where the data resides and what functional entity needs it.

First and foremost, EIM exists to support business objectives. This means business drivers are used to form the EIM strategy, and the strategy is tightly linked to corporate goals such as profit, revenue, and share value. To aid in the attainment of business objectives, various operational barriers must be overcome. One barrier that EIM is uniquely suited to breach is the difference in data definitions, business rules, and even jargon between functional entities. Resolving data anomalies such as semantic inconsistencies, duplicate or missing data, and inaccurate values is one of the drivers of EIM. This implies implementing processes and infrastructure that allow different business units or functions to communicate and share data in a common vernacular.

Let’s face it, manufacturing sees a ‘product’ as a part on the shop floor. Marketing considers a product to be one of the company’s many offerings. And accounting will insist it is a line entry in the general ledger. These are semantic differences. EIM, specifically the data integration, metadata, and master data management (MDM) elements, seeks to bridge those semantics through practices and technology that first expose the differences via metadata, then integrate the diverse data entities into common objects, and then turn them into master reference data used as the basis for information understanding across all business functions.

Few organizations have the budget or wherewithal to implement an EIM strategy across all lines of business and all data volumes in one fell swoop. Instead, the best approach is to pick the problems that EIM can address, prioritize them, and then implement that portion of the EIM strategy that delivers the highest value in the quickest timeframe. In this way, EIM benefits can be reaped early on in the initiative and used to credit, justify, support, and even fund further incremental EIM projects in the strategy portfolio.

Benefits of an EIM Initiative

What EIM offers | The benefits
Alignment of business goals with information architecture | Ensures the ROI of all subsequent information projects
Faster access, across the enterprise, to crucial data | Improved and more timely decision making
Common and shared data definitions | Reduction in time spent debating the meaning and purpose of data elements
Improved data quality | All enterprise operations run more effectively
Impact analysis and data lineage across a complete information supply chain | Aids compliance with corporate and governmental reporting requirements
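The three-step bridging of semantics described in this section (expose the differences via metadata, integrate the entities into common objects, then turn them into master reference data) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the department names come from the text, but the field names, the sample product, and the helper functions `to_common_terms` and `build_master_reference` are hypothetical and do not represent any particular MDM product.

```python
# Hypothetical departmental records: each function describes the same
# product in its own vocabulary (all field names are illustrative).
SOURCE_RECORDS = [
    ("manufacturing", {"part_no": "P-100", "part_desc": "Pump housing, shop-floor part"}),
    ("marketing", {"offering_id": "P-100", "offering_name": "Infusion Pump"}),
    ("accounting", {"gl_item": "P-100", "gl_line": "4010 Pump revenue"}),
]

# Step 1: metadata that exposes the semantic differences by mapping each
# source field onto a shared business term.
FIELD_METADATA = {
    "manufacturing": {"part_no": "product_id", "part_desc": "shop_floor_description"},
    "marketing": {"offering_id": "product_id", "offering_name": "catalog_name"},
    "accounting": {"gl_item": "product_id", "gl_line": "ledger_entry"},
}


def to_common_terms(source, record):
    """Step 2: translate one departmental record into the common vocabulary."""
    mapping = FIELD_METADATA[source]
    return {mapping[name]: value for name, value in record.items()}


def build_master_reference(source_records):
    """Step 3: consolidate translated records into one master entry per product."""
    master = {}
    for source, record in source_records:
        common = to_common_terms(source, record)
        product_id = common["product_id"]
        master.setdefault(product_id, {}).update(common)
    return master


master = build_master_reference(SOURCE_RECORDS)
```

In this sketch the three departmental views collapse into a single master entry keyed by the shared product identifier, while each department's own attribute survives under a common name. A real implementation would also need matching logic for records that do not share a clean identifier.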
Take, for instance, a medical equipment supplier’s first foray into EIM. It was considered a smashing success by customers and IT practitioners alike. By first collecting the information on their thousands of products into one master repository, and then cleansing and standardizing the individual records, they were able to match and consolidate the products into a hierarchical tree. Instead of the data being segmented according to specialty catalog, which resisted vendor and product comparisons, they could now see which vendors offered the best price performance in general and which offered the best price for unique categories. The distributor was able to streamline catalog production, reduce the number of catalogs, and offer a better product mix in the catalogs that remained. The ability to refine business rules about products and vendors and to deploy data quickly meant not only better decision making, but also enhanced collaboration between product line units.

Ultimately, the key benefit of an EIM initiative is the creation of an effective and dynamic information management environment with robust facilities for data creation, collection, summarization, sharing, and reporting. The ultimate goal is maximizing business performance through access to trustworthy and authoritative business information.

Getting Started

A common question is “How do I get started with EIM?” Interestingly, the adoption and maturity of EIM appear to be moving in lockstep with data quality. When data quality adoption began accelerating in the mid-2000s, practitioners changed their question from “Why should I care about data quality?” to “How do I get started?” The same evolution is occurring with EIM.

Creating an EIM strategy is the way to get started. With the strategy in hand, the next steps follow classic IT project management: build a program plan and, within the plan, begin to drill down and define the kernels of the individual projects, as shown in Figure 1.

[Figure 1: EIM Program Development]
Ultimately, the purpose of creating the strategy and building the program is to formalize EIM within the organization. Developing an awareness campaign informs stakeholders of the benefits of EIM and how it will accelerate the attainment of corporate goals. As with data quality, the success of an EIM initiative comes quickest when the organization is already feeling business pain because of poorly understood, defined, or integrated data. Those organizations that want to excel eventually demand a strategy for dealing with the problem.

As EIM is formalized through strategy development, approval of the strategy by senior management establishes the charter for the EIM initiative. Once approved, the program plan aligns resources, priorities, and schedules to the individual projects. At some point during the second or third project, it will have become clear that EIM has been operationalized. The charter and strategy are in place. The program plan is being executed, and data governance activities are creating and refining policies, business rules, and even metrics to measure the success of the business. These business metrics are key, as many measurements will not have been available before the initiative was started. These metrics will provide a newfound transparency into how well the business operates. The information delivered by these metrics should be used to highlight the EIM initiative and form the basis for new justifications to expand the program beyond the initial pilot projects.

An individual project can be large, like launching a CRM system, or small, like creating a data stewardship council. It all depends on the project scope. The detailed specifications for each project are then developed, and each project is prioritized according to business impact, return on investment (ROI), and executive support. This structured and metrics-based prioritization process will help bubble candidate projects to the top.
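The metrics-based prioritization described above can be sketched as a simple weighted score. This is a minimal illustration under stated assumptions, not a method prescribed by the paper: the 1-to-5 rating scale, the weights in `WEIGHTS`, the sample ratings, and the names `CandidateProject`, `score`, and `prioritize` are all hypothetical. Only the three criteria (business impact, ROI, and executive support) and the two example projects come from the text.

```python
from dataclasses import dataclass


@dataclass
class CandidateProject:
    name: str
    business_impact: int    # assumed scale: 1 (low) to 5 (high)
    roi: int                # assumed scale: 1 (low) to 5 (high)
    executive_support: int  # assumed scale: 1 (low) to 5 (high)


# Hypothetical weights; each organization would choose its own.
WEIGHTS = {"business_impact": 0.4, "roi": 0.4, "executive_support": 0.2}


def score(project):
    """Weighted sum of the three prioritization criteria."""
    return (
        WEIGHTS["business_impact"] * project.business_impact
        + WEIGHTS["roi"] * project.roi
        + WEIGHTS["executive_support"] * project.executive_support
    )


def prioritize(projects):
    """Return the projects ordered from highest to lowest score."""
    return sorted(projects, key=score, reverse=True)


candidates = [
    CandidateProject("Launch a CRM system", business_impact=5, roi=3, executive_support=2),
    CandidateProject("Create a data stewardship council", business_impact=3, roi=4, executive_support=5),
]
ranked = prioritize(candidates)
```

With these illustrative ratings, the small stewardship council project outranks the large CRM launch because it has stronger ROI and executive support. The value of such a scheme lies less in the arithmetic than in forcing business and IT to agree on the ratings.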
If you are new to EIM, pick the smallest projects first and schedule them to complete one after the other. To quote Applegate, et al., in Corporate Information Strategy and Management:

“Infrastructure that lends itself to incremental improvement enjoys favorable management attributes; for example, investment and implementation risks are easier to manage when improvements involve a series of many small steps rather than a few ‘all or nothing’ steps. Incremental improvement also facilitates experimentation and learning.” [1]

Of course, achieving such a lofty objective requires an understanding not only of how heterogeneous and disparate a company’s data is, but also of the associated business impacts. An EIM strategy, developed jointly by business and IT, is the best first step.

[1] Applegate, Austin, McFarlan, Corporate Information Strategy and Management, 7th Edition (McGraw Hill, 2007).
EIM Strategy

Not surprisingly, many organizations implement portions of EIM without realizing it. A common example is a firm that was desperate to provide a sales contact and pipeline tracking tool to its diverse and geographically distributed sales force. The firm wanted a system that all salespeople could use, that stored all data in a single centralized repository, that included standardized and robust reporting for both contacts and weekly activity, and that was Web-accessible. The solution was a sales force automation (SFA) application, and it was deployed across the enterprise.

When the information demands of a corporate function are implemented in a way that benefits the business, the application is considered a success. That is, until the next EIM challenge is tackled. In this example, the assumption is that the architects and planners of the SFA application designed it to operate and integrate well within the firm’s existing infrastructure. After the firm implemented SFA, it turned its attention to building a more effective marketing organization and wanted to deploy customer relationship management (CRM). Now the question became: How will the SFA and CRM systems interoperate? And what about the product information system that manufacturing was considering?

So far, siloed pieces and parts had been implemented without any overall vision or strategy. Corporate or functional goals (if visible) were being addressed in isolation from each other. Conflicts invariably arise over funding, interfaces, roles, and objectives, and instead of a collaborative EIM environment, infighting and bickering over span of control, budgets, and development schedules ensue. Without a plan, any progress towards EIM will be as much by luck as by design, given the failure rates of so many large-scale system implementations.
The answer to this problem is to create an EIM strategy.

What Goes into an EIM Strategy?

Before an organization can build any type of strategy, it needs to have a vision of where it wants to go and a set of goals that support, drive, and measure success towards that vision. This vision, along with goals, is absolutely crucial for forming and directing the EIM initiative.

For example, if the vision of the organization is to have a 360-degree view of the customer so it can increase revenues through improved customer intelligence, then an EIM strategy might include a customer data integration (CDI) effort, data quality automation, and the acquisition of an analytical CRM tool. The information architecture planning will take into account the data infrastructure and policies necessary to support this vision. In this case, corporate strategy, where the vision and goals are laid out and articulated, serves as input into the EIM strategy. From the corporate strategy, the CIO, IT director, and their business unit counterparts analyze each directive and formulate what needs to change in the information systems, and how, to meet those directives.

Often it will be the mid-level managers who first grapple with the concept of an EIM strategy because they are the ones who will most likely be directed to execute on specific goals. These managers may work in either business or IT, and will usually be the first to document the deficiencies (gaps) in the existing infrastructure. This gap analysis and resolution planning is the first stage of EIM strategy development, but the planners need to recognize it as such, lest they architect yet another isolated data silo.
As Figure 2 illustrates, an effective EIM strategy must address the four quadrants of an information infrastructure:

Quadrant | Including | Best practice
People | Roles, responsibilities, and incentives | IT and business collaboration
Processes | Practices, workflows, and data flows | Trusted information
Policies | Data standards and business rules | Data governance
Technology | Interoperability (SOA), data sharing, ease of use | Enterprise-wide reuse and standards

Figure 2: The Four Quadrants of EIM

People: Information is consumed by people. Moreover, it is the people in the organization who establish the vision and goals for the initiative, staff the processes, dictate the policies, and deploy the technology. Therefore, the “people” aspect of an EIM strategy considers the roles of IT and business managers, their specific responsibilities, and the incentives they are given to achieve EIM objectives. A best practice that epitomizes the People quadrant is IT and business collaboration, which will be explored further in the Best Practices section below.

Processes: An EIM strategy will answer, at least at a high level, how a chain of information operations should interact. An information operation is any process that uses data, such as a direct marketing campaign, an order entry system, or a customer dashboard. The strategy will bind together the People quadrant with the Processes quadrant to define who manages and participates in a given workflow. A key process, and hence best practice, is the creation and maintenance of trusted data. After all, what value is EIM if you can’t trust the information it delivers?

Policies: Closely related to People, but in a separate quadrant, are Policies.
Perhaps the quadrant with the least exposure, the Policies category comprises business rules and data governance, the latter of which is seeing increasing awareness of late. The reason that organizations are awash with data, processes (either broken or working), and applications is that there are no formal or published guidelines that govern information rules and policies. The classic question of “What is the definition of a customer?” is answered by the data governance function. How can disparate operations efficiently cooperate on business goals if they can’t agree on business rules and definitions? Policies and data standards set by the organization for its unique context are the foundation upon which the people, processes, and technology are constructed.

Technology: The last and probably most visible of the quadrants is Technology. The simple fact is that paper and pencil went the way of the buggy whip when it comes to managing information, and today’s spreadsheets are close behind. Technology, including software applications, databases, and middleware, among others, is the quadrant responsible for information delivery. However, technology can quickly become inefficient and unbearably complex if not managed, and an EIM strategy brings focus to what could otherwise be chaos. The Technology quadrant of EIM needs to define the interoperability of business applications, how and when data should be secured and shared, and what level of complexity is acceptable to the users. A key best practice for this quadrant is enterprise reuse and standards. As we’ll see below, the goals for technology in an EIM strategy are ease of use, ability to share data, complete functionality, and integration with other EIM components.

In Favor of Pragmatism

Don’t let the breadth of EIM scare you. Any organized and holistic progress you can make is better than no progress at all. For example, an EIM strategy, especially in the beginning, can be large or small, have multiple phases, and have a long or short horizon, but it will always be living and dynamic. If there is one strategy that will evolve with an organization, it is the EIM strategy. No other system employed by the business is more dynamic than its information systems.
There are several reasons for this:

• The tremendous and continuous growth of data volumes;
• The rapid advance of information technology;
• The increased rate of new systems development efforts;
• The rise of external data sources, resulting from mergers and acquisitions and from partners and customers;
• Evolving data formats, including unstructured data; and
• An increased business urgency to accelerate the pace of competitive differentiation.

The above list reflects tremendous forces on the information infrastructure. Plan a regular review cycle, perhaps every three months, and no longer than every six. Plan to improve, expand, and refine the strategy. For every change to the corporation’s business strategy and goals, there will also be corresponding changes to the EIM strategy. One is responsible for delivering on the other.
EIM Best Practices

The best practices in EIM are as numerous as the types of benefits they deliver. In this paper, we choose four practices, one for each information management quadrant, that every IT and business leader should understand.

IT and Business Collaboration

If you’ve ever sat in a meeting where business managers complained that IT delivered applications that didn’t meet their needs, or the business managers didn’t understand IT’s project prioritization process, then you’ve been witness to the lack of business/IT collaboration. In those situations, each side assumes it knows what the other is doing or what it needs and goes marching off in blissful ignorance. What has been lost is the fact that one side is the customer and the other side is the supplier, and both are partners in achieving the organization’s goals. How can IT help the business if it doesn’t ask the business for its goals, needs, and requirements? And how can the business ease the IT burden if it doesn’t prioritize by explaining the goals, needs, and requirements of the business? We’re not talking about one email message sent to the CIO from the VP of sales and marketing. We are talking about constant and regular communication between all echelons, with the players so enmeshed that you have to look at their business cards to tell them apart.

Collaboration between IT and business is by far the most important EIM best practice. You know business and IT collaboration is a success when the joint team meets for its weekly project review and the “business” asks “IT” questions, and “IT” asks the “business” questions. Each side is completely aware of the other’s issues. While this may be the height of collaboration, an indicator of solid progress is when the two sides can speak in shorthand and not feel compelled to explain all the minutiae of their various challenges. They’ve gotten past it.

No EIM initiative will be a success unless some portions of business and IT communicate back and forth regularly, in writing and in person. It is true that IT can guess at the needs of the business without their input and, given enough tries, will deliver an application that the business can use. Email and Internet connectivity are two examples of communication channels, but both are commodity services and neither offers a competitive advantage. Only through rigorous collaboration will business and IT define requirements for systems that optimize performance for their unique organization and culture.
Trusted Information

Beyond people themselves, the foundation of any company is the knowledge used to conduct business. It’s that fundamental. For some of us, this can be a scary thought. This is why a goal of EIM, pursued through data quality, data profiling, data integration, and other functions, is to enhance the measurable integrity (that is, the trust) of the information. How is trusted information created? It is created through the use of a series of processes that ensure:

• The data is captured accurately (with no errors, transpositions, etc.);
• The captured data adheres to corporate data standards (formats and definitions);
• The data is moved, integrated, and summarized as needed when needed;
• The data is matched and consolidated to the hierarchical levels and context required;
• The data lineage can be traced to its origins;
• The data is maintained and cleansed over time as it ages; and
• The data serves the business requirements that drive its access and use.

Without trust, the significant investment in enterprise-class IT systems, such as CRM or ERP systems, will be squandered because the business users will instead invest in and rely on their own private data stores, typically spreadsheets. Business productivity degrades to the level of individual management and interpretation of data. Most companies are not only seeking the use of sanctioned and meaningful information, they are hoping that information will result in competitive advantage.

Can the information be used to make critical decisions? You need go no further than healthcare, with its patient treatment records and family medical histories, to understand what trust is. When the doctor looks at the online medical records, she will make a potentially life-changing decision based on what is stored in that system. CFOs, CEOs, and other business leaders make their decisions based on data too.
Therefore, a best practice of EIM is to ensure data integrity is maintained throughout the information supply chain.

Enterprise-wide Reuse and Standards

The very nature of EIM dictates that information and IT assets deliver the greatest value when they are leveraged across the entire enterprise. This provides for economies of scale, the sharing of data, the uniform spread of technology, and the effective use of trained and experienced staff. A goal of any EIM initiative is to ensure that an application developed for customer support, for example, can be accessed and used by marketing. After all, why reinvent the wheel? It is true that wheels come in different sizes and are made of different materials, but proper EIM planning takes that into account and ensures that a version of the same application, with adjustments to the user interface and data model, can be delivered to marketing with a minimum of work. In so doing, customer support and marketing are essentially using the same system and data, but adjusted within a tolerance the two functions can support. This means that business applications, particularly data integration applications such as CRM, CDI, and MDM, can be deployed faster and serve a wider community.

Information management technologies have evolved to the point where the platforms they are built upon can support a wider range of business operations, often accessible from a single repository (see the Information Management Software section below). The platform approach to delivering data quality and data integration functionality, for example, standardizes data delivery. Now marketing, customer support, and sales departments can all expect the same behavior and consistent results from a cleansing operation. Substantial work is invested by data stewards in data definition and business rule development. These definitions and rules are captured and stored within a system in a structured and sustainable way.

EIM practice would dictate that those business rules be made available across the enterprise so that other functions, such as marketing, can standardize on those definitions and not have to replicate the weeks’ or months’ worth of “pick and shovel” work to create them. Moreover, smart EIM teams, through a common application platform, will allow marketing to inherit those rules and change them to suit their own specific needs. Marketing can then publish its own set of business rules to the enterprise, making the data environment deeper and richer with managed vertical content. All of this follows corporate standards invoked through the data systems via the user interface, rules repositories, and data models.
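The idea of one function inheriting enterprise business rules and adjusting only what it needs can be sketched with Python's standard `collections.ChainMap`. This is an illustrative assumption about how a rules repository might behave, not a description of any specific platform: the rule names, the threshold values, and the helper `effective_rules` are hypothetical.

```python
from collections import ChainMap

# Hypothetical enterprise-wide rules, defined once by the data stewards.
ENTERPRISE_RULES = {
    "country_code_format": "ISO 3166-1 alpha-2",
    "phone_number_format": "E.164",
    "duplicate_match_threshold": 0.90,
}

# Marketing overrides only the rule it needs to change (assumed value).
MARKETING_OVERRIDES = {"duplicate_match_threshold": 0.80}


def effective_rules(overrides, base):
    """Layer departmental overrides on top of the shared enterprise rules.

    ChainMap looks up each key in `overrides` first and falls back to
    `base`, so the enterprise definitions are reused rather than copied.
    """
    return ChainMap(overrides, base)


marketing_rules = effective_rules(MARKETING_OVERRIDES, ENTERPRISE_RULES)
```

Here marketing gets its own matching threshold while still sharing the enterprise formats, and the enterprise rule set itself is left unchanged. Because the layered view reads through to the shared definitions, a later update to an enterprise rule is visible to marketing without any re-copying.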
Data Governance

In the book Customer Data Integration: Reaching a Single Version of the Truth, the authors state:

The goal of data governance is to establish and maintain a corporate-wide agenda for data, one of joint decision making and collaboration for the good of the corporation rather than the individuals or departments, and one of balancing business innovation and flexibility with IT standards and efficiencies.2

2 Dyche, Levy, Customer Data Integration: Reaching a Single Version of the Truth (John Wiley and Sons, 2006), pg. 151

This goal emphasizes the importance of policy making around corporate information. If you’ve ever heard a manager say “We back up our data whenever we can” or “The quality of our data is okay. It could be better, but there is no one driving that,” you have just heard a failure of data governance. It is the purview of the data governance function to establish, amongst a myriad of other policies, the sanctioned definitions and acceptable level of quality for corporate data.

Data governance must be done in a well-planned and cross-functional manner. It is also implemented up and down the organizational hierarchy, so that the data stewards who regularly manipulate and fix the data can raise their issues and propose tactics, while business directors and executives can set goals and propose policies. In the middle of the governance function, the proposed policies meet the nascent
tactics and the two come together, over time establishing a robust policy and rules system that meets the needs of the organization by “…balancing business innovation and flexibility with IT standards and efficiencies.” Implicit in that quote is a refusal to restrict organizational growth with needless straitjacket regulations, and a commitment instead to set standard processes that deliver greater value.

Taken Together

These best practices can be implemented separately and incrementally, but they gain exponential value as other practices are added to the EIM framework, which gradually puts on muscle. Bottom line: EIM is not built overnight. It is built every day, and with each sunset, some small part has been added, and with each sunrise, there is the promise to add more.

Requirements for Information Management

In order for a suite of information management applications to support the demands of a robust EIM environment, there is a high-level set of requirements the suite should satisfy:

• Service-Oriented Architecture (SOA) support;
• Centralized data management;
• A complete solution for a given chain of operations;
• Easy or existing integration with other applications; and
• Ease of use for all use cases.

These requirements are about deploying an easy-to-use solution for any part of the EIM problem domain across the enterprise, and ensuring the targeted users applaud its effectiveness. So as practitioners go about either building or buying components of their EIM infrastructure, they should keep these five requirements firmly in mind, and bake them into the specification process to the extent possible. Consider it, if you will, part of the standard EIM recipe.

SOA Support

The ultimate purpose of SOA is to provide an application-independent interface layer to IT architectures that connect multiple data silos across the enterprise.
SOA is modern-day middleware, only this instantiation is proving to be more effective and is gaining broader adoption because it is evolving into an industry standard. Industry standards are good for EIM because anything that eases and simplifies data sharing and operational integration makes EIM easier to implement. In the past, attempts at EIM have been problematic because integrating between data silos took substantial effort and time. SOA directly attacks this decades-old problem.

Moreover, SOA is not just about requesting and receiving data; it is also bi-directional. Data sources can call published services via SOA to perform specific functions, such as launching a series of data audit tests when an event is triggered. SOA makes EIM operations richer because they can both pass information and invoke procedures. In the grand scheme of things, organizations become more agile.
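The publish, subscribe, and trigger behavior described here can be illustrated with a small in-process stand-in for a service layer. This is only a sketch under stated assumptions: a real SOA deployment would expose services over a network protocol, and the `ServiceRegistry` class, the `batch_loaded` event, and the `audit_missing_email` service are hypothetical names invented for the example.

```python
from typing import Callable, Dict, List

Service = Callable[[dict], dict]


class ServiceRegistry:
    """In-process stand-in for an SOA service layer (hypothetical)."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}
        self._subscriptions: Dict[str, List[str]] = {}

    def publish(self, name: str, handler: Service) -> None:
        # Make a capability callable by name, independent of the caller.
        self._services[name] = handler

    def subscribe(self, event: str, service_name: str) -> None:
        # Ask for a published service to run whenever the event fires.
        self._subscriptions.setdefault(event, []).append(service_name)

    def call(self, name: str, payload: dict) -> dict:
        return self._services[name](payload)

    def trigger(self, event: str, payload: dict) -> List[dict]:
        # A data source raises an event; every subscribed service is invoked.
        return [self.call(name, payload) for name in self._subscriptions.get(event, [])]


def audit_missing_email(payload: dict) -> dict:
    # Hypothetical data audit test: count rows without an email value.
    rows = payload["rows"]
    missing = sum(1 for row in rows if not row.get("email"))
    return {"test": "audit_missing_email", "rows": len(rows), "missing_email": missing}


registry = ServiceRegistry()
registry.publish("audit_missing_email", audit_missing_email)
registry.subscribe("batch_loaded", "audit_missing_email")

results = registry.trigger(
    "batch_loaded",
    {"rows": [{"email": "a@example.com"}, {"email": ""}, {}]},
)
```

The caller that raises `batch_loaded` needs no knowledge of which audits exist; new audit services can be published and subscribed without changing it.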
Organizations implementing SOA do so to reduce costs through reuse, change systems faster, or modernize their system architecture. SOA agility means new applications or services can be brought online and have their capabilities published; existing operations can then subscribe to them without disrupting existing applications. Integrating disparate applications is now substantially easier: SOA removes significant time and cost from systems development; the standard SOA connectivity isolates and abstracts programmatic interfaces; and it reduces the drama of system maintenance and upgrades.

Centralized Data Management

A challenge to information management is the distributed nature of the applications and systems that generate and use company data. While substantial effort is regularly invested in getting databases, marts, applications and warehouses to “share” their data, there is always equal pressure to create new silos, temporary or permanent, for very good reasons.

While a company’s data systems may grow like buildings in cities, there is no reason the management of the data in those systems should remain disjointed. Similar to how buildings are connected by telecommunications and roads, and managed by zoning restrictions and centralized property management firms, so too can distributed data systems be interconnected and centrally managed. Business intelligence (BI) and data integration competency centers, data governance councils, data stewardship programs, metadata management, and other efforts are all components of a common data management infrastructure. The benefits of this approach are substantial:

• Formal data management organizations are sanctioned by the company’s leadership, and therefore, their responsibilities are more apt to be recognized by both the business and IT.
• As roles and responsibilities are clearly defined, an enterprise-focused data management organization is more able to justify and absorb them.
• Policies and procedures are standardized once and practiced continually.
• Metadata and business rules have a central point of reference.
• Systems of record are identified, prioritized, and recognized as key data sources.
• Technology maintenance is streamlined and is more cost effective.
• Data provisioning is an enterprise-based service, thus leveraging specialized skills and data reuse across projects and systems. The resulting cost savings can be substantial.

In the pursuit of centralized data management, firms will create solutions and application architectures that can access and manage the content of many different systems in a sustained and repeatable way. Sometimes, as in the case of master data management, the data will be regularly pulled from the distributed systems, cleansed, matched, consolidated, and enhanced in a central location so that it can then be published (pushed out) to the distributed systems as master reference content. These efforts are evolving as key components of IT architectures where each solution has greater and broader capabilities.
Moreover, the “toolbox” approach provides for simplicity. Having a platform, or set of tools, that supports a majority of the processing needs reduces the installation footprint, maintenance burden, training efforts, and operational complexity; it also increases the sharing of business rules and standardizes the services offered. When combined with SOA, the toolbox approach becomes even more powerful in three ways:

• Connecting to Web-enabled data sources is simplified. There is no need for complex SQL scripts or knowledge of proprietary application or database interfaces to access the data.
• Via the SOA connectivity, data management tools can be called from other applications as a service, again eliminating the need for a proprietary API to access the platform.
• No single platform offers all the functionality needed by an EIM initiative. SOA allows for a blueprint to augment existing capabilities with plug-in modules. Through this surrogate relationship, the platform can serve as the larger framework upon which to build third-party functionality when appropriate.

Complete Functionality

The platform leads us straight to the next ingredient: a complete solution for a given chain of operations. Information management vendors that offer a single platform make it significantly easier to add new functionality. All of the processing “overhead,” such as grid computing, parallel processing, user interface (UI), rules and metadata repositories, processing engines, and so on, is taken care of by the platform. When practitioners build out their EIM infrastructure, they look for solutions that provide them with the greatest breadth to reduce data acquisition and provisioning time, complexity, and installation costs.
Moreover, the more complete the solution, the more efficient their development efforts. Anytime a separate function has to be “stitched in” to fill a processing void, costs increase and additional failure points are introduced. So the completeness of the solution is not only about being the most functional, but also about achieving the lowest risk of implementation.

Seamless Integration

There comes a point where the functional boundary of the platform will be reached and a handoff to the next application is needed. Unfortunately, a technology platform is constrained by the elegance of its design and the amount of development resources applied to it. It can’t be expected to do everything. For example, consider the migration from a source system to an MDM hub to a data warehouse. No single, discrete platform today supports the multi-functional capabilities of robust extract and transformation with operational data reconciliation and analytical and query support. There are, however, world-class solutions for each of these, and vendors are providing tightly-coupled integration solutions between these separate applications and platforms. Such solutions can take a variety of forms, from predefined SOA calls to code-level callouts. Most often, the strongest integration between
separate applications will be within the product line of a single vendor, such as SAP: their ETL product, SAP® BusinessObjects™ Data Integrator, integrates with SAP NetWeaver® Master Data Management, which in turn is coupled with their SAP BusinessObjects business intelligence solutions. One advantage of steering towards products with existing external integrations is that the practitioner can comfortably and incrementally expand and scale the environment, knowing that for the next component the integration point exists and has been tested.

Ease of Use

Ease of use is a common refrain from all business application users. All EIM (BI, ERP, etc.) software should be easy to use. The judges are not IT, but rather those people who have to run the application as part of their work. Consider the wide variety of applications the typical sales operations manager uses during the typical work week. First, there is the full Microsoft Office suite of Excel, Word, PowerPoint, Visio, Outlook, etc. Then there are the sales force automation solution, the CRM application, and the web browser. With the plethora of applications and increasing complexity of the modern workplace, ease of use in software becomes a matter of personal productivity.

The pressures on IT staff are no different. IT management does not want to buy yet another product that requires intensive training and significant subsequent practice. Neither IT nor the business wants to invest in a solution that requires a high degree of specialization. Ultimately, ease of use is about speed to return on investment (ROI). The faster a person can learn an application, the sooner the organization accelerates towards profit and revenue targets. Sadly, ease of use is the most overlooked of all EIM requirements, and yet the one with the most measurable and tangible returns.
Information Management Software

The information management software discussion centers on applications and technologies closely related to data integration. There are important reasons why an organization is encouraged to consider starting its EIM effort with data integration:

• Companies across industries, particularly those accustomed to frequent mergers and acquisitions, have heterogeneous data environments. Extracting value from those data systems demands that their data be integrated; otherwise the data is isolated to the few users and applications with access to those silos. Data integration is the core technology for sharing data across the enterprise.
• Integration offers a relatively quick return on IT investment. It leverages existing data systems to extract and move data to where it is needed today. To a certain extent, a robust data integration strategy can overcome weaknesses in the existing information architecture (deployed repositories) until newer repositories can be put in place.
• The movement of data within an organization is constant and crucial to business operations. Developing strong capabilities in this area increases enterprise agility, which improves the organization’s ability to react to unforeseen circumstances.
• Any time data is moved from point A to point B, there is an opportunity to improve it. A common complaint by government agencies is that they cannot change the data because they don’t own the source systems. The information value chain within government agencies can cover many departments, with the original source system beyond the span of control. Modern data integration technology solves this dilemma by allowing data transformations on the fly, as the data is moved. The changes to the data can either be saved or discarded, knowing that the next time the data is moved the same transformation can be applied.

In essence, data integration is a key building block of EIM. Yet even the data integration technology space is broad. There is ETL (extract/transform/load), EII (enterprise information integration), EAI (enterprise application integration), database replication, and the simplest of all, FTP (file transfer protocol). Surrounding the data integration space or closely related to it are data quality, metadata management, text analytics, and master data management. The master reference data process shown in Figure 3 highlights the interaction of these technologies in a typical IT environment and shows how they fit into a major EIM operation:

[Figure 3: A Master Reference Data Environment. Labels recoverable from the diagram: ETL Process Development; ETL Process; Data Quality; MDM; Data Governance Process; Call Log & Text Files; Data Extraction; Text Analysis; Entities/Actions List; Transformations; Data Cleansing; Matching & Consolidation; Data Loading; Data Modeling; Hierarchy Mgmt.; Authoring; Publishing; Metadata Management; Metadata Repository.]

The overall purpose of the above process is to collect data from the point of capture and load it into an MDM system where a reporting or analytical application (BI) can access the master data and provide an enterprise-wide view of the information in the context needed.
ETL

The ETL process is at the heart of data integration. Most often, data integration entails the movement of data, not just accessing data in place. Moreover, as can be seen in the master reference data (MRD) diagram in Figure 3, ETL can serve as the framework upon which other EIM functionality can be included in the process flow. In the diagram, two different source systems (the backend of an e-commerce website and the call logs for a warranty center) are accessed. One has structured data in the form of database tables, and the other has unstructured data in the form of text files. The ETL program will internally route the data to the appropriate transform, one of them being a sophisticated text analysis (unstructured data processing) program that is linked to the ETL application through SOA or an interface API. The ability of the ETL program to interface with external programs is one of the requirements for information management software.

After the text analysis function extracts the desired data, the ETL program takes over and merges the two disparate data streams into one structured data stream where a myriad of transformations can be applied. This in itself is a major boon to EIM. In years past, practitioners had to struggle with complex and convoluted processes to extract data from freeform textual data and then compromise on how it was stored with structured data. With an estimated 80% of the world’s data in unstructured data sources, an EIM strategy will eventually have to address it.

Following the native ETL transformations, the ETL application can route the single data stream to data quality processing in the same way it did for text analytics. However, more data integration vendors are building single application frameworks that natively support greater portions of the EIM domain, and the first easy step in this direction is embedding data quality functionality. After data quality processing, the ETL application is ready to load the cleansed data stream into the MDM solution.
Typically, the data is deposited into a staging area isolated from the heart of the MDM repository. For EIM, ETL has served the crucial role of moving, transforming, and loading captured data from one end of the enterprise (e.g., an order entry website) all the way to the corporate master reference data system. Organizations will have different architectures, and some will have a data warehouse in the process flow, but regardless, the technology of choice to perform data movement is ETL.
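The flow walked through in this section (extract from a structured source and an unstructured source, run text analysis, merge into one stream, transform, apply data quality, and load into a staging area) can be sketched as a chain of small Python generators. This is an illustrative toy, not how any ETL product is implemented: the record layout, the call-note wording matched by `CALL_PATTERN`, and every function name are assumptions made for the example.

```python
import re
from itertools import chain
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, str]

# Hypothetical wording of a warranty call note, e.g.
# "Call from jane DOE about product WX-200: unit will not power on".
CALL_PATTERN = re.compile(
    r"Call from (?P<customer>.+?) about product (?P<product>[\w-]+)",
    re.IGNORECASE,
)


def extract_orders(rows: Iterable[Record]) -> Iterator[Record]:
    # Structured source: rows already have named columns.
    for row in rows:
        yield {"customer": row["customer"], "product": row["sku"], "source": "web_orders"}


def extract_call_log(lines: Iterable[str]) -> Iterator[Record]:
    # Unstructured source: a stand-in for the text analysis step.
    for line in lines:
        match = CALL_PATTERN.search(line)
        if match:  # lines the pattern cannot parse are skipped in this sketch
            yield {"customer": match["customer"], "product": match["product"], "source": "call_log"}


def transform(records: Iterable[Record]) -> Iterator[Record]:
    # Native ETL transformations applied to the single merged stream.
    for rec in records:
        yield {
            "customer": " ".join(rec["customer"].split()).title(),
            "product": rec["product"].strip().upper(),
            "source": rec["source"],
        }


def cleanse(records: Iterable[Record]) -> Iterator[Record]:
    # Data quality step: drop records missing a required field.
    for rec in records:
        if rec["customer"] and rec["product"]:
            yield rec


def load(records: Iterable[Record], staging: List[Record]) -> int:
    # Load into a staging area kept separate from the master repository.
    before = len(staging)
    staging.extend(records)
    return len(staging) - before


orders = [{"customer": "  john smith ", "sku": "wx-100"}, {"customer": "", "sku": "wx-300"}]
calls = ["Call from jane DOE about product WX-200: unit will not power on", "Caller hung up"]

staging: List[Record] = []
merged = chain(extract_orders(orders), extract_call_log(calls))
loaded = load(cleanse(transform(merged)), staging)
```

Because each stage consumes and yields records lazily, a further step such as matching could be inserted between `cleanse` and `load` without changing the other stages.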
Data Quality

Building trusted information is an EIM best practice. Organizations build and maintain trusted data at every step in the data supply chain. The concept of the data supply chain has no greater relevance than in the EIM context. Figure 4 shows how data quality technology intersects with a classic data supply chain:

[Figure 4: Quality Across the Data Supply Chain. Data quality operations (profiling, parsing, standardization, cleansing, matching and consolidation) span the stages of the data supply chain: Contact, Sales, Develop, Manufacture, Deliver, Support, Analyze, then repeat the chain.]

In Figure 4 we can see that data quality operations exist at every major stage in the chain. Each stage is an opportunity to create, enhance, or just maintain the level of trust in the data. The sooner data quality issues are corrected in the chain, the sooner the firm benefits from greater trust. For example, validating and standardizing data at the initial point of contact with the customer, such as a website where they can enter their information, benefits every downstream operation no matter how far-reaching the enterprise. You can multiply the benefit by the count of all the subsequent operations that use the data. Conversely, the longer an organization waits to cleanse and improve data integrity, the more upfront operations are sub-optimized because of data defects impacting their effectiveness.

Moreover, the earlier the data is cleansed, the less cleansing costs later on, because the count, type, and, most importantly, complexity of data quality problems are all lower. Rather than letting problems build up to the point where correcting them in the data warehouse becomes a large task, tackling the issues as they arise makes each operation simpler. Following the incremental improvement approach, data quality operations lend themselves to pilot project implementations.
Use the success of each pilot to build out the data quality infrastructure as part of your EIM strategy.
Metadata Management

Metadata is data about data. It tells us such useful things as when a table was extracted from a data source, what transformations were performed on each field, what user ran the transformations, when they did it, and where the data was moved. If the CFO wants to know how the quarterly financials became corrupted, the IT director will be very interested in the migration log tables to answer this question.

There are at least three general types of metadata,3 depending on whose definition you use: business, application, and database. Regardless of how you define the specific contexts, metadata is the information a firm will use to decide on the usefulness of a given data set in its decision-making and business operations. Data quality metrics that quantify number of defects, percentages of blank or null fields, cardinality, minimum and maximum values, and outliers against business rules are all metadata attributes that a data steward will use to judge the information. Capturing, storing, and analyzing this information is fundamental to building trusted information. Metadata management software must be able to serve this function.

Moreover, to be useful, metadata needs to be tracked backwards in the information supply chain via data lineage and tracked forwards via impact analysis. These are the two key operations of metadata management. Data lineage allows the CIO to see where the data came from, when it arrived, and what was done to it before being used in the financial reports. Impact analysis flips the coin over and allows the IT analyst to see what reports use a field of data that requires a calculation change. With this visibility, the analyst can go to report stakeholders and notify them of the pending change before they find it in a report.
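Data lineage and impact analysis are two traversals of the same graph: lineage walks upstream from a report to its sources, and impact analysis walks downstream from a field to everything that depends on it. The sketch below assumes metadata has been captured as simple (source, target) pairs; the `LineageGraph` class and the field names are hypothetical and do not reflect any particular metadata repository.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


class LineageGraph:
    """Hypothetical metadata store: directed edges from a source field to a target field."""

    def __init__(self, edges: Iterable[Tuple[str, str]]) -> None:
        self._downstream: Dict[str, Set[str]] = defaultdict(set)
        self._upstream: Dict[str, Set[str]] = defaultdict(set)
        for source, target in edges:
            self._downstream[source].add(target)
            self._upstream[target].add(source)

    @staticmethod
    def _walk(start: str, links: Dict[str, Set[str]]) -> Set[str]:
        # Depth-first traversal that collects every node reachable from start.
        seen: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in links.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return seen

    def lineage(self, node: str) -> Set[str]:
        # Backwards: everything this field or report was derived from.
        return self._walk(node, self._upstream)

    def impact(self, node: str) -> Set[str]:
        # Forwards: everything that would be affected if this field changed.
        return self._walk(node, self._downstream)


graph = LineageGraph([
    ("orders.amount", "staging.amount"),
    ("staging.amount", "warehouse.revenue"),
    ("fx.rate", "warehouse.revenue"),
    ("warehouse.revenue", "report.quarterly_financials"),
    ("warehouse.revenue", "report.sales_dashboard"),
])

sources_of_financials = graph.lineage("report.quarterly_financials")
affected_by_staging_change = graph.impact("staging.amount")
```

A production repository would also record who ran each transformation and when; here only the dependency structure is modeled.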
Master Data Management (MDM)

At the apex of data integration software is MDM. As shown in the master reference data (MRD) diagram in Figure 3, two disparate data sources are loaded into the MDM system. In practice, many different source systems may be involved. The MDM system reconciles (matches, standardizes, and consolidates) new input data with its current master reference data, and then stores the master repository in a data model flexible enough to support multiple hierarchies. Certainly, MDM is much more than technology, as it encompasses policies, practices, and systems that create an infrastructure for collecting, storing, and managing master reference data. However, no discussion on EIM software is complete without MDM software. MDM software deployment can take four forms: three of them are domain-specific (customer, product, financial), and the fourth is a generalist version that seeks to support all domains and comes with the necessary generalizations.

3 Dyche, E-Data: Turning Data Into Information With Data Warehousing (Addison Wesley, 2000), pg. 148-149
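The reconcile step (standardize, match, consolidate) can be made concrete with a deliberately naive sketch. It assumes customer records with name, email, and phone fields, matches on exact email (falling back to phone), and keeps the first non-empty value per field as its survivorship rule. Real MDM systems use far richer matching; the function names and the matching rule here are assumptions for illustration only.

```python
from typing import Dict, Iterable

Record = Dict[str, str]


def standardize(record: Record) -> Record:
    # Normalize formatting so equivalent values compare equal.
    return {
        "name": " ".join(record.get("name", "").split()).title(),
        "email": record.get("email", "").strip().lower(),
        "phone": "".join(ch for ch in record.get("phone", "") if ch.isdigit()),
    }


def match_key(record: Record) -> str:
    # Naive exact-match assumption: email identifies a customer, else phone.
    return record["email"] or record["phone"]


def reconcile(master: Dict[str, Record], incoming: Iterable[Record]) -> Dict[str, Record]:
    for raw in incoming:
        rec = standardize(raw)
        key = match_key(rec)
        if not key:
            continue  # no usable identifier; a real system might route this to a steward
        golden = master.setdefault(key, {"name": "", "email": "", "phone": ""})
        for field_name, value in rec.items():
            # Survivorship rule for this sketch: existing master values win,
            # incoming values only fill gaps.
            if value and not golden[field_name]:
                golden[field_name] = value
    return master


master = {
    "jane@example.com": {"name": "Jane Doe", "email": "jane@example.com", "phone": ""},
}
incoming = [
    {"name": "JANE  DOE", "email": " Jane@Example.com ", "phone": "(555) 010-2000"},
    {"name": "john smith", "email": "", "phone": "555-010-3000"},
    {"name": "nobody"},
]
reconcile(master, incoming)
```

After the run, the existing Jane Doe record has gained a phone number, a new master record exists for John Smith, and the record with no identifier has been left out.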
For EIM, an MDM system offers great advantages. It not only serves as the system of record for customer or product data, collecting and consolidating it from all reaches of the enterprise, including multiple data warehouses, but it also allows the data stewards to design and create data models that roll up to hierarchies that can be adjusted at will for a specific view. These views, or context-sensitive hierarchies, can be saved and used by different corporate functions as their own operations dictate. Marketing, sales, and manufacturing can all view the product hierarchy, from suppliers to chemical composition to distribution channel, as needed. Then, when a hierarchy is placed “in production,” the master data can be published to subscribing applications, where it is either pushed or pulled out to downstream operations.

Along with publishing to external systems, the MDM system, through SOA or another type of integration, can serve as the hub or repository that provisions cleansed and reconciled data to a business intelligence (BI) environment. Indeed, one of the “entry points” for MDM software is often to cleanse and reconcile master data to readily support improved reporting and analytics.

In Closing

Organizations are facing increasing complexity in their operational and data environments. New data sources, unstructured data, and more data than ever before are creating a perfect storm of information overload (also known as “infoglut”). New regulatory requirements for transparency and confidentiality add a layer of rules that compound complexity.
Customers’ demands for faster service and more relevant conversations stress front-office applications, while parallel demands by internal users place even greater strain on back-office systems. And the technologies used to implement the environment are constantly evolving and becoming more sophisticated, but not necessarily easier to use. Meanwhile, competitive pressures are never-ending, with companies continually raising the bar through their own adoption of information integration and deployment strategies.

All of these pressures have combined to render information management more urgent than ever. Before your company discovers that its data quality and deployment practices have been marginalized to the point of ineffectiveness, consider adopting EIM. Only through holistic and systematic planning encompassing the best practices discussed in this paper can your corporate data contribute to revenue growth and strategic fulfillment.
About the Author

Frank Dravis is a senior consultant at Baseline Consulting, a business analytics and data integration services firm. Frank has twenty-one years of experience in enterprise information management (EIM) and data quality solutions design, implementation, and consulting. At Baseline Consulting he specializes in data integration, data quality, and data governance solutions, advising key clients and industry vendors on these and other technology strategies.

Prior to joining Baseline Consulting, Frank served as VP of EIM Strategy at Business Objects/SAP, where he researched and aided in the formulation of EIM and data quality market strategies. Principal among those efforts was the planning of CDI/master data management in the EIM suite. As a benefit of the research, Frank delivered data quality best-practice advice and consulting to Business Objects’ extensive list of industry-leading clients. He is a frequent writer, blogger and industry speaker on EIM topics.

Prior to Business Objects, Frank held such positions as VP of Development and VP of Information Quality at Firstlogic, Inc., where he led the IQ Assurance Strategic Data Quality consulting program, contributing thought leadership and practice management in addition to data profiling program management. Frank holds an M.B.A. from the University of Wisconsin-La Crosse, and a B.S. degree in computer science.
Baseline Consulting is a management and technology consulting firm specializing in data integration and business analytic services to help companies enhance the value of enterprise data and improve the performance of their business. Baseline’s proven, structured approaches uniquely position us to help clients achieve self-sufficiency in designing, delivering, and managing data as a corporate asset.

Baseline Consulting Group
15300 Ventura Blvd., Suite 523
Sherman Oaks, CA 91403
1-818-906-7638
www.baseline-consulting.com

© 2008 Baseline Consulting Group. All Rights Reserved.