Implementing Enterprise Information Programs with MDM-CDI
  • There is no standard definition of MDM or of master data, and “ecosystems” (also known as information exchanges or fusion centers) are a new term to most people, although we have been involved in several of these across three continents. Almost all MDM definitions include people, business processes that need to be “fixed” and managed, and technologies that implement data quality business rules and workflow/task management.
  • The space was first defined in early 2005, yet the MDM market is already sized at more than $5 billion, with predicted growth surpassing $10B in the coming years. For comparison, Forrester sized the BPMS market at $1.2B in 2005 with expected growth to $2.7B in 2009. Enterprise content management (ECM) was sized at $2.2B in 2005 with expected growth to $3.9B in 2008.
  • The problem is complex because it is multidimensional. Some of the dimensions and components are shown in the picture: customer data integration to move to customer-centric processes, information quality, legacy systems, data visibility and security concerns, etc. Taking everything into account upfront is impossible, but if you scope the effort very narrowly it will never be an enterprise solution, only another flavor of a silo. It is also difficult to sell to the business: EDM and MDM are often perceived as purely infrastructure initiatives (cost centers) with limited or unclear business benefit. Some examples include spending on roadmaps. No single vendor provides a solution for all enterprise data problems, because of the great variety of problems, industries, corporate cultures, platforms, and standards, along with continuously changing market dynamics and business requirements. And it is not just about vendors: it is about a vision of the solution and the ability to assemble that solution from the best vendor components and products, giving the enterprise flexibility.
  • Key points here: MDM is defined through a single version of the truth. What is it? There are other types of master data, to be defined by each business.
  • Organizations are missing party-level data, both from the structure perspective and from the business process perspective.
  • This change to customer centricity occurs in all industry verticals; it is a horizontal trend. There are four groups of areas where MDM-CDI helps. Examples to walk through: business development and marketing campaigns (the credit card/student example: not only customer recognition but also relationships); marketing campaign efficiency; “spinners”; redundant data entry for FA; inability to implement privacy laws (how do you collect and store opt-in/opt-out information?); risk-adjusted return on investment; operational inefficiencies at account opening/contract creation and maintenance, and how this information is propagated from contract creation to billing systems; consolidated billing statements; a government program in the UK. Need for risk-adjusted returns.
  • Some examples from our customers. Most of these started with CDI and are only now beginning to discuss moving into item-centric MDM. The value of that isn’t apparent to them yet, so a “product-centric MDM” strategy is slower, but we’re seeing several of our customers begin Location MDM initiatives as well as other non-customer data management. Examples: real-time customer data integration for B2C and B2B marketing and sales enablement in high tech; real-time tracking of customers and prescriptions across 3500 stores for WalMart; saving millions in cleansing and integrating data, even after it had been cleansed by an external data provider (such as Experian or Harte Hanks); joining together customers across nearly a dozen different hotel brands under the portfolio of one hospitality company; catching “bad guys” and sharing data across scores of agencies across jurisdictional, political, agency, and cultural boundaries; improving patient data across insurance and provider ecosystems.
  • You can see that James or Jim Jones appears to this company as three different people because there is different demographic data for him. People move and get married; their data changes. Go through the four bullets. 1. Non-intrusive: no new unique identifiers, no forced standardization, no code at the source systems. This is why our implementations take 3 to 6 months. 2. Different applications have different data needs; we provide the right data to the right application, again in real time. By linking all these data and creating the golden record we are “cleaning” and deduplicating the data. Duplication rates fall to well under 1% in a typical installation. One credit card issuer discovered that 23% of its customers shared an address with another cardholder, which enabled the company to dramatically decrease marketing and mailing costs. 3. We then maintain these linkages, so that every time Bill Conroy knocks on the door we make sure that another duplicate never appears.
  • Let’s move on to another topic of this presentation: why do we need MDM? Is it really something new or just a new buzzword? A friend of mine, a person with strong ETL experience, raised an argument in our conversation. He said: MDM is defined through the ability to create and maintain a single version of truth for enterprise entities such as Customer, Supplier, Product, etc. We have been doing this for years for data warehouses, data marts, and CRM solutions. Then why is MDM new? What is the single most important and disruptive new thing about MDM? My answer was: “Modern probabilistic MDM technologies significantly enhance traditional deterministic ETL.” Traditional deterministic ETL is often not capable of providing high-quality dimensions for data warehouses and data marts. The value of the data warehouse’s ability to slice and dice is greatly reduced when deterministic ETL falls short in building complex dimensions and the relationships and hierarchies within them. Indeed, it is not easy to slice and dice if your data warehousing dimensions are crooked. This is where probabilistic MDM comes in. Potentially every complex data warehousing dimension may need a data hub extension based on probabilistic technology, as shown in the next slide.
  • For each complex dimension you may want to consider a data hub. This is a logical view and does not mean that different products are required for different hubs. This picture emphasizes the notion of Analytical MDM. Initiatives frequently begin with this and then move deeper into more operational and collaborative MDM, where probabilistic technology helps real-time synchronization in the OLTP world, e.g. to support account opening and client on-boarding.
  • This slide provides a more detailed view of matching and linking. A lot of science has gone into this, including the work of multiple PhDs. The goal is to have the algorithm think like a person when records are compared. Cases it must handle: nicknames (Bill → William, Larry → Lawrence); misspelled values (missed letters, transposed letters, extra records); transposed first name and last name; anonymous values; validation for TIN and credit card number; name change and address change; uniqueness of values. The process includes the following stages. First, optimization for statistical comparison: the engine creates comparison strings internally, with case conversion, removal of extra spaces, standardization of phone numbers, and removal of anonymous values. Standardization functions are available: generic functions, address functions (country specific and address component specific), date functions, and name functions. Second, it finds all potential matches by bucketing, so that only pairs that make sense to compare are scored; this is what makes scalability linear rather than quadratic. One record can be in multiple buckets. Third, it compares and produces a weighted score. Fourth, it links.
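The first two stages described above (building a normalized comparison copy, then bucketing so that only plausible pairs are scored) can be sketched as follows. This is a minimal illustration, not the vendor's engine: the nickname table, the list of anonymous values, the bucket keys, and the sample records are all assumptions made for the example.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical lookup tables; a real engine would ship far larger ones.
NICKNAMES = {"bill": "william", "larry": "lawrence", "jim": "james"}
ANONYMOUS = {"", "unknown", "n/a", "none"}


def normalize(record):
    """Build a derived comparison copy; the source record is left intact."""
    out = {}
    for key, value in record.items():
        value = re.sub(r"\s+", " ", str(value)).strip().lower()
        if value in ANONYMOUS:
            continue  # anonymous values carry no matching evidence
        if key == "phone":
            value = re.sub(r"\D", "", value)  # keep digits only
        if key == "first":
            value = NICKNAMES.get(value, value)  # map nickname to formal name
        out[key] = value
    return out


def bucket_keys(rec):
    """One record may fall into several buckets (assumed keys: phone, last-name prefix)."""
    keys = []
    if "phone" in rec:
        keys.append(("phone", rec["phone"]))
    if "last" in rec:
        keys.append(("last3", rec["last"][:3]))
    return keys


def candidate_pairs(records):
    """Return index pairs that share at least one bucket; only these get scored."""
    buckets = defaultdict(set)
    for idx, rec in enumerate(records):
        for key in bucket_keys(rec):
            buckets[key].add(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


# Illustrative input data only.
raw = [
    {"first": "Bill", "last": "Conroy", "phone": "(480) 473-5620"},
    {"first": "William", "last": "Conroy ", "phone": "480.473.5620"},
    {"first": "Larry", "last": "Jones", "phone": "N/A"},
]
records = [normalize(r) for r in raw]
pairs = candidate_pairs(records)
```

With bucketing, the third record is never compared with the first two because it shares no bucket with them, which is the point of the "linear vs quadratic" remark: the number of scored pairs grows with bucket sizes rather than with the square of the record count.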
  • 1. A hierarchy is a certain kind of relationship.
  • MDM can’t be separated from data quality. The relationship between DQ and MDM is two-fold. On the one hand, MDM can’t be implemented without a certain level of data quality for the matching attributes. On the other hand, matching and deduplication are by their nature data quality concerns. There has been a lot of discussion about data governance lately, and data governance is indeed very important. Data governance defines data policies, standards, and key data processing and access rules. Once a data governance organization is created, it should be able to execute these policies: there should be mechanisms and controls that enable the organization to enforce and monitor the policies and standards. Data governance and data stewardship need a framework, procedures, and supporting technologies for continual data quality improvement. If information quality is not continually monitored and maintained, it will deteriorate. The deterioration rate depends on the specifics of the industry and on a given company’s business processes and technology maturity, but an estimated deterioration rate of 2 percent per month is quite typical. Initiate Inspector supports a number of data stewardship functions, including checks on the validity of key identifiers and handling of conditions such as potential linkage, merge, etc. The current strategy is to provide a comprehensive data governance framework with the ability to review and resolve data quality issues and to monitor overall data quality progress.
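To make the 2 percent per month figure concrete, here is a small worked calculation. It assumes the deterioration compounds monthly and that nothing is corrected in the meantime; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def usable_fraction(months, monthly_decay=0.02):
    """Fraction of records still accurate after `months` with no stewardship.

    Assumes the decay compounds: each month 2% of the still-good records go stale.
    """
    return (1.0 - monthly_decay) ** months


for months in (6, 12, 24):
    print(f"{months:>2} months: {usable_fraction(months):.1%} still accurate")
```

Under these assumptions roughly 78 percent of the records are still accurate after one year, so about a fifth of the data has gone stale, which is why continual monitoring matters more than a one-time cleanup.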
  • This is only to illustrate the idea, not a product presentation. Moreover, I don’t believe in one tool; what is needed is a baseline tool plus reusable components/services, so that the functionality can be changed as needed. Search and inspect is the reactive mode: let’s edit the records for Cynthia Johnson. We can add new records if needed. Resolve is the proactive activity: potential duplicates, potential linkages, relationships.
  • Service Oriented Architecture is a software design and implementation architecture of loosely coupled, coarse-grained, and reusable services. Each of these three qualifiers has a specific meaning. Loosely coupled means that if we define services for use in a certain combination and sequence, we should be able to use the same services in other call sequences or combinations. Loosely coupled is of course the opposite of monolithic. Coarse-grained means that services should be defined at a high enough level, typically higher than a single database table. Fine-grained services are important too, since they may provide flexibility. Reusable is pretty straightforward: service design should be optimized for reuse. Services are exposed on the network and can be reused by multiple applications. Web Services is one of the most popular standards. It offers a standard foundation for reuse of functionality, data access, and data management within a federated enterprise and with services provided externally. It is quite common for modern enterprises to have SOA initiatives, committees, and councils. It is much less common to see an SOA initiative work closely with information management initiatives. This constitutes a significant problem: SOA initiatives and information management initiatives must cooperate closely. Indeed, 60% of all services are data services.
  • At a high level, data services do two primary things. First, data services make interaction with data format agnostic. No matter how the data is stored, whether in a relational database such as Oracle, Teradata, SQL Server, Sybase, or Informix, in MS Access, Excel, XML, a flat file, or a packaged application, the web services standard makes data access independent of the data format. The second point is more profound but also more difficult to execute: data services make interaction with data system and application agnostic. With this architecture, data services metadata is responsible for locating the right records. Even if an underlying system is replaced with a new system, IT’s responsibility and its SLA with the business community demand that the existing services and their characteristics be preserved.
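Both points can be illustrated with a small sketch: callers use one service contract, the storage format behind it varies, and a routing table (standing in for the data services metadata) decides which system holds a given record. All class names, the id-prefix routing convention, and the sample data are assumptions made for this example.

```python
import csv
import io
import sqlite3
from typing import Dict, Optional, Protocol


class CustomerSource(Protocol):
    """The contract every backing system must satisfy, whatever its storage format."""

    def get_customer(self, customer_id: str) -> Optional[dict]: ...


class SqlSource:
    """Backed by a relational table."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def get_customer(self, customer_id: str) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return {"id": row[0], "name": row[1]} if row else None


class CsvSource:
    """Backed by a flat file (held in memory here to keep the sketch self-contained)."""

    def __init__(self, text: str):
        self.rows = {r["id"]: r for r in csv.DictReader(io.StringIO(text))}

    def get_customer(self, customer_id: str) -> Optional[dict]:
        row = self.rows.get(customer_id)
        return {"id": row["id"], "name": row["name"]} if row else None


class CustomerDataService:
    """Published service; routing metadata maps an id prefix to the owning system."""

    def __init__(self, routes: Dict[str, CustomerSource]):
        self.routes = routes

    def get_customer(self, customer_id: str) -> Optional[dict]:
        source = self.routes[customer_id.split("-")[0]]
        return source.get_customer(customer_id)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES ('CRM-1', 'James Jones')")

service = CustomerDataService({
    "CRM": SqlSource(conn),
    "LEG": CsvSource("id,name\nLEG-7,ABC Inc.\n"),
})
```

If the flat-file system were later replaced, only the entry in the routing table would change; `service.get_customer(...)` and its result shape stay the same, which is the preservation of existing services that the note calls for.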
  • The Data Hub acts as a services platform that supports two major groups of services: internal, infrastructure-type services that maintain Data Hub data integrity and enable the necessary functionality; and external, published services. The latter category maps well to the business functions that can leverage the CDI Data Hub. These services are often considered business services, and the Data Hub exposes them for consumption by users and applications.
  • Having service-oriented capabilities and a data services framework is almost a must for a modern enterprise. These capabilities allow the enterprise to reduce data redundancy: we don’t have to copy data to a new location but can use services pointed at the right data sources. Business, data governance, and data stewards can define SLAs via services. Data services provide the required level of abstraction, so data governance and data stewards need not work at the data and data model levels. Services provide standardized interaction within the enterprise and with external and vendor-provided services. One of the strongest arguments for data services is increased development productivity and the agility to support evolving requirements. On the left we list considerations that restrict the use of services. Performance problems can be significant, especially if multiple pieces of information have to be joined. If implemented with web services, solutions do not support transactional integrity for synchronous processes, which means that compensating transactions are required. Use of data services requires strong governance and educated users; otherwise, as soon as you provide services they can be misused. This requires a new SOA culture in the enterprise. SOA initiatives typically do not meet expectations if they are not supported by a sound data strategy.
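The remark about compensating transactions can be sketched as follows: when a multi-step process spans services that cannot share one transaction, each completed step registers an "undo" action, and a failure triggers those actions in reverse order. The step names and the simulated failure are hypothetical and exist only for the example.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    If an action fails, run the compensations of the steps that already
    completed, in reverse order, and report failure.
    """
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        for compensate in reversed(done):
            compensate()
        return False
    return True


log = []


def create_party():
    log.append("party created")


def delete_party():
    log.append("party deleted")


def open_account():
    # Simulated failure of a downstream service call.
    raise RuntimeError("billing service unavailable")


def close_account():
    log.append("account closed")


ok = run_with_compensation([
    (create_party, delete_party),
    (open_account, close_account),
])
```

Here the account step fails, so the already-created party is removed again and `log` ends up recording the creation followed by its compensation. Unlike a database rollback, the intermediate state was briefly visible to other callers, which is part of why the note lists this as a restriction.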
  • A brief overview of MIKE2.0 covers concepts such as phases, activities, tasks, and the usage model. The complexity and global nature of EDM is adequately addressed by the open source MIKE2.0 methodology. If Internet connectivity is available, we will give a brief overview of MIKE2.0 online.
  • The MIKE2.0 Usage Model provides a guide for determining which phases, activities, and tasks from the overall methodology process are used for different categories of Enterprise Information Management engagements. The IM engagements in MIKE2.0 are categorized as follows: Business Intelligence; Information Asset Management; Access, Search and Content Delivery; Enterprise Content Management; Enterprise Data Management; Information Strategy, Architecture and Governance; Composite Core Solution Offerings. MDM-CDI falls under Enterprise Data Management.
  • For each MIKE2.0 activity and task there is a description, deliverables, resources, and other useful information.
  • 80% of failed projects fail due to socialization issues and not due to technology. Socialization is shown here as a three-dimensional concept: level of ownership, stakeholders, and lifecycle phases and releases. It helps program managers build a communications plan.
  • Don’t try to see everything through: it is too complex and unmanageable. Don’t put too much emphasis on the current state; this will save time. The initiative should be transparent to your stakeholders, who want to know how it will help them achieve their goals.
  • The slide compares two scenarios: blue with the registry hub and red with the master hub. With the registry approach it is typical to have the first business results within 4 to 6 months. If the business wants the master hub as the target state, it can be reached in 18 to 20 months. If you try to go straight to the master hub solution (start with the hub), business value is obtained much more slowly and the risk is much higher. Developing markets may be an exception.
  • In the first year and first implementation phase, the number of legacy systems integrated in the scope of MDM-CDI is limited (typically 1-3). How do you accelerate the on-boarding of new systems in subsequent phases, given that it is not unusual for 20-50 systems to be in scope of MDM-CDI integration? A well-defined set of system on-boarding standards and procedures determines the common rules that each legacy system should comply with in order to be integrated into the evolving MDM-CDI solution. This enables a repeatable on-boarding process and sustainable, accelerated growth of the solution in terms of the number of systems and LOBs. It preserves the integrity and consistency of the MDM-CDI solution and improves data governance. Enables highly sustainable pace of
  • So far probabilistic match and link technologies have been used primarily to establish that two or more records represent the same entity (the same individual, supplier, product, company, etc.). In some scenarios it is important to dynamically identify relationships between records based on common attribute values. These relationships can be established by probabilistic technologies even when we deal with fuzzy matches. Fuzzy matches can’t be established by applying SQL or other deterministic methods. Examples of systemically inferred relationships include the dynamic definition of households and the identification of individuals in positions of power in a business your organization may deal with. Integration of MDM tools with business rules engines will open new capabilities in applications such as client on-boarding, account opening and maintenance, policy issuance, etc. Comprehensive data stewardship solutions also require business rules engine components that allow the organization to configure and customize the data conditions that trigger sending records to data stewardship queues for resolution. As we discussed today, probabilistic MDM technologies should be better integrated with ETL technologies and major messaging vendor products to enable higher quality BI solutions. Metadata integration is a hot topic. With a variety of products, each of them metadata driven, how do we reuse the metadata and develop sound standards and practices for code translation semantics to support interoperability? Finally, from the enterprise perspective it is not satisfactory that each vendor product has its own independent data security and visibility structures. Data security and visibility should typically be externalized from the tools and supported in a centralized fashion, for instance by LDAP. Vendor solutions should be able to consume data visibility and security rules that are owned externally.
  • Transcript

    • 1. Implementing Enterprise Information Programs with MDM-CDI & SOA. Larry Dubov, Sr. Director, Sales Consulting & Architecture. New York, NY, May 15, 2008. [email_address] DAMA-NY 2008
    • 2. Agenda
        • Definitions
        • Why MDM and EDM Now and Key Challenges
        • MDM and Data Hub Capabilities
        • Data Stewardship Framework and Information Quality
        • SOA and Data Services: Strengths and Weaknesses
        • Information Management Methodology
        • Lessons Learned and Accelerators
        • What are the next disruptive things in EDM and MDM?
    • 3. What is Master Data Management (MDM)?
        • Master Data Management (MDM) is a framework of processes and technologies aimed at creating and maintaining an authoritative, reliable, sustainable, accurate, and secure data environment that represents a “single version of the truth,” an accepted system of record used both intra- and inter-enterprise across a diverse set of application systems, lines of business, and user communities.
        • Master data are those data that are foundational to business processes, are usually widely distributed, contribute directly to the success of an organization when well managed, and pose the most risk when not well managed.
    • 4. Customer Data Integration (CDI) is the “Entry Point” to MDM
        • Party (CDI). CDI focus is on individual & organizational entities: Customers, Prospects, Patients, People, Citizens, Employees, Vendors, Suppliers, Trading Partners
        • Product (PIM). MDM expands the problem to include new entities: Products, Equipment, Financial Assets, Vessels, Containers, Weapons, Locations, Drugs, Vehicles
    • 5. Key Challenges
        • Very complex, multidimensional, and multi-disciplined and can be risky
        • Difficult to sell data initiatives to the business and executive management
        • No single vendor provides a comprehensive solution
        • These factors mandate development and reliance on sound models, open integration standards, and methodologies in building holistic solutions from multiple best-of-breed components
        Diagram labels: New Customer & Relationship Centric Business Processes; CDI Consuming Applications; Customer Identification, Correlation & Grouping; Service Oriented Architecture; Business & Operational Reporting; Exception Capture & Processing; Data Acquisition, Distribution & Synchronization (Batch & Real Time); Visibility, Security, Confidentiality, Compliance; Metadata Management; Data Governance & Standards; Information Quality; External Data Providers
    • 6. Enterprise Customer Data Managed by Lines of Business
        • Enterprises are organized by lines of business and manage customer information in a product-centric model with overlapping customer domains
        Diagram: Business Lines 1 through 5, each with its own Business Line Products and Business Line Customers
    • 7. So, What’s the MDM-CDI Focus? The Need to Transition from Product/Account to Party…it’s a Big Deal
        • Current State (account grouping):
            • Account 123: Product 1; E-Statement; KYC; Acct & Client Docs Approval
            • Account 456: Product 2; E-Statement; KYC; Acct. & Client Docs Approval
            • Account 789: Product 3; Paper Statement; KYC; Acct & Client Docs Approval
        • Future State (household and client grouping, derived):
            • Account 123: Product 1; Acct. Attr. Only
            • Account 456: Product 2; Acct. Attr. Only
            • Account 789: Product 3; Acct. Attr. Only
            • Client level: E-Statement; KYC; Client Doc Approval
            • Client level: Paper Statement; KYC; Client Doc Approval
        Diagram labels: Joe, Mary, Mary 1, Mary 2; relationships Spouse, Owner, Owner (Joint), Beneficiary
    • 8. Drivers
        • Business area: Business Development, Sales & Marketing. Drivers:
            • Cross sell/up sell to existing customers
            • Effectiveness of marketing campaigns
            • Recurring revenue from existing customers
            • Retain “good” customers by reducing attrition rates
            • Recognize “bad” customers
        • Business area: Operational Efficiency. Drivers:
            • Account setup costs
            • Customer acquisition costs
            • Administrative overhead of redundant data entry
            • Operation costs: duplication, redundancies, transaction errors, data processing errors & exceptions
            • Failed tactical initiatives
            • Reduces costs of planned initiatives due to CDI
        • Business area: Risk, Privacy, Compliance & Control. Drivers:
            • Risk Management
            • Accurate Books & Records
            • Compliance with AML & KYC Regulations
            • Compliance with corporate standards and policies
            • Regulatory fines and penalties
        • Business area: Customer Service. Drivers:
            • Account setup time
            • Customer service time
            • Customer intelligence and level of service
            • Consolidated statements
    • 9. MDM Is Adding Value When You…
        • … Purchase Software
        • … Pick Up Your Prescription
        • … Apply for a Loan
        • … Check In & Earn Points
        • … Identify Risks
        • … File Insurance Claims
        • … Register a Patient
    • 10. MDM and Data Hub Capabilities
    • 11. Single Version of The Truth
        • Source systems: Sales, Customer Support, E.R.
        • Source record: J. Jones (Name); 35 West 15th Street (Address); Toledo, OH (Address 2)
        • Source record: James Jones (Name); 35 W. 15th Street (Address); Toledo, OH (Address 2)
        • Source record: Jim Jones (Name); 35 West 15th St. (Address); Toledo, OH (Address 2)
        • Merge and Persist or Composite View: James (First); Jones (Last); 35 West 15th Street (Address); Toledo (City); OH (State); M (Gender); 30 (Age)
    • 12. Single Version of Truth: Commercial Customers
        • Provides accurate, real-time access to complete customer or entity data across disparate sources, systems and networks
        • Source systems: D&B, AIU, Back Office
        • Source record: Name ABC Incorporated; Addr 9146 E VIA DEL SOL, NETOWN, CA 45883; Cont. Joe Smith; Phone 480-473-5620
        • Source record: Name ABCC Incorp.; Cont. Joseph Smithe; Phone 304-473-5602
        • Source record: Name ABC Inc.; Cont. Will Jones; Phone 480-473-5620
        • Source record: Name AB&C; Addr 9146 VIA DEL SOL, NETOWN, CA 45883; Phn 480-473-5620
        • Trusted System of Record: Name ABC Inc.; Addr 9146 VIA DEL SOL, NETOWN, CA 45883; Cont. Joseph Smith; Cont. Will Jones; Phone 480-473-5620
    • 13. Why Do We Need MDM? (DW is Only as Good as Its Dimensions)
        • Dimensions shown: Customer, Product, Time
        • Can we really ‘slice and dice?’
        • Traditional Deterministic ETL may not be sufficient…
        • This is where Probabilistic MDM enabled by Data Hubs comes in
    • 14. Star Schema Hubs
        • If your MDM solution is BI driven, align your MDM solution with complex DW dimensions
        Diagram labels: Customer Hub, Account Hub, Branch Hub, Product Hub; dimensions Customer, Account, Branch, Product, Time, Status; Facts
    • 15. The Initiate MDS Solution in an Enterprise Architecture: Web Services Initiate Master Data Service ™ Transaction Support Security & Audit Trail Data Stewardship Batch Support Matching Accuracy Performance Scalability Master Data Views Relationships Hierarchies APIs Loads & Extracts Orchestrated Services Messaging Events & Alerts Orders CRM Call Center Sales Profiles Web Self-Service Marketing
    • 16. Implementation Styles
        • Consolidation Style: Link, Co-Exist, Combine
        • Integration Style: Batch, Near Real-Time, Real-Time
        • Ownership Style: Registry/Slave, Hybrid, Transaction/Master
    • 17. More on Matching and Linking
        • Step 1: Optimizes data for statistical comparisons
            • Normalizes & compacts data, creates derived data layer, source data remains intact
            • Phonetic equivalences, tokenization, nicknames, etc.
        • Step 2: Finds all the potential matches
            • Casts a wide net: all matches on current or historical attributes, prevents misses
            • Partial matches, reversals, anonymous values, etc.
        • Step 3: Scores accurately via probabilistic statistics
            • Compares attributes one-by-one and produces a weighted score (likelihood ratio) for each pair of records
            • Frequency weights specific to your business
            • Edit distance, proximity of match
            • Allows custom deterministic rules, e.g. false positive filters
        • Step 4: Custom threshold settings
            • Single or dual threshold models
            • Link, don’t link, don’t know; “learns” from manual input
            • Manage cost/quality trade-offs
            • Manage the linkages, workflow review
        Diagram labels: Lowest possible score, Highest possible score, Don’t link, Manual review, Link, Lowest threshold, Upper threshold, Should be linked, Should not be linked
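Steps 3 and 4 above can be sketched with a Fellegi-Sunter style log-likelihood score and a dual threshold. This is a generic textbook formulation chosen for illustration, not a description of the product's actual algorithm; the per-attribute probabilities, the threshold values, and the sample records are assumptions.

```python
import math

# Assumed per-attribute probabilities:
#   m = P(values agree | records are the same entity)
#   u = P(values agree | records are different entities)
WEIGHTS = {
    "last": (0.95, 0.01),
    "first": (0.90, 0.05),
    "zip": (0.85, 0.10),
}
LOWER, UPPER = 0.0, 8.0  # hypothetical thresholds, tuned per installation


def score(a, b):
    """Sum of log2 likelihood ratios over the attributes present in both records."""
    total = 0.0
    for attr, (m, u) in WEIGHTS.items():
        if attr not in a or attr not in b:
            continue  # a missing value is no evidence either way
        if a[attr] == b[attr]:
            total += math.log2(m / u)  # agreement adds weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts weight
    return total


def decide(a, b):
    """Dual-threshold model: link, don't link, or send to a data steward."""
    s = score(a, b)
    if s >= UPPER:
        return "link"
    if s <= LOWER:
        return "don't link"
    return "manual review"


# Illustrative, already-normalized records.
james = {"last": "jones", "first": "james", "zip": "43604"}
james_dup = {"last": "jones", "first": "james", "zip": "43604"}
mary = {"last": "jones", "first": "mary", "zip": "43604"}
joe = {"last": "smith", "first": "joe", "zip": "43604"}
```

With these numbers, full agreement clears the upper threshold, two people sharing a surname and ZIP code land in the manual review band, and a pair agreeing only on ZIP falls below the lower threshold. Moving the two thresholds closer together shrinks the review queue at the price of more automatic errors, which is the cost/quality trade-off named in Step 4. A fuller version would replace the exact equality test with edit distance and use value-frequency-specific weights.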
    • 18. Hierarchy Management <ul><li>The term “hierarchy” is used here in the strict sense: a simple hierarchy with exactly one root and exactly one parent for each non-root node within a given hierarchy </li></ul><ul><li>Typically one hierarchy is selected as the Master and used as a foundation (e.g. D&amp;B or custom) </li></ul><ul><li>There is a notion of source precedence / tree of truth </li></ul><ul><li>High-performance matching to build the hierarchy: 40MM records in 15 minutes </li></ul><ul><li>One original source-system record (member) or single-version-of-truth record (entity) can belong to multiple hierarchies (e.g. corporate for D&amp;B and geography with territories, regions, etc.) </li></ul><ul><li>A data steward can edit the hierarchy manually (e.g. when a merger is known in advance) </li></ul><ul><li>When the merger update later arrives from the Master source, the data steward can reconcile the source merge with the node previously created manually </li></ul><ul><li>Hierarchy query and navigation use methods that navigate to a node, to the node’s immediate children and the entire sub-tree below it, or from the node up and across (to be checked) </li></ul><ul><li>The product can export a hierarchy (e.g. to build a DW dimension) </li></ul>
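The single-root, single-parent rule and the navigation methods described above can be illustrated with a small sketch. The class and method names are hypothetical and are not the product's API. A record that belongs to several hierarchies would simply appear in several `Hierarchy` instances.

```python
class Hierarchy:
    """A simple hierarchy: one root, exactly one parent per non-root node."""

    def __init__(self, root):
        self.root = root
        self.parent = {root: None}
        self.children = {root: []}

    def add(self, node, parent):
        if node in self.parent:
            raise ValueError(f"{node!r} already has a place in this hierarchy")
        if parent not in self.parent:
            raise KeyError(f"unknown parent {parent!r}")
        self.parent[node] = parent
        self.children[node] = []
        self.children[parent].append(node)

    def subtree(self, node):
        """All descendants of node, in depth-first preorder."""
        out = []
        stack = list(reversed(self.children[node]))
        while stack:
            current = stack.pop()
            out.append(current)
            stack.extend(reversed(self.children[current]))
        return out

    def ancestors(self, node):
        """Path from node's parent up to the root."""
        out = []
        current = self.parent[node]
        while current is not None:
            out.append(current)
            current = self.parent[current]
        return out


# IDs follow the D&B example later in the deck: 1 is the root,
# 2 to 5 report to 1, and 6 reports to 2.
dnb = Hierarchy(1)
for node, parent in [(2, 1), (3, 1), (4, 1), (5, 1), (6, 2)]:
    dnb.add(node, parent)
```

Here `dnb.children[1]` gives the immediate children of the root, `dnb.subtree(1)` walks everything below it, and `dnb.ancestors(6)` navigates upward.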
    • 19. <ul><li>Understand &amp; Visualize Customer Relationships </li></ul>Hierarchy: Management &amp; Services <ul><li>Establish business &amp; consumer hierarchies </li></ul><ul><li>Resolve logical master from multiple internal or third party source hierarchies </li></ul><ul><li>Rule based hierarchy &amp; relationship creation and management </li></ul><ul><li>Maintain individual &amp; organizational hierarchies through web application to support active data stewardship </li></ul>Hierarchy Source Data Customer Data
    • 20. Initial Hierarchy Harmonization: Original State of Customer Records <ul><li>Customer’s Organization Records are Fragmented </li></ul><ul><li>Utility of hierarchies housed in SAP &amp; other apps is inconsistent </li></ul>Disclaimer: (Example Data Only) 9155925804 8272 Gateway Blvd E El Paso, TX 79907-1511 Buell Motorcycles 30 SAP 80 7017756098 Business Highway 81 N Grand Forks, ND 58203 Andy’s Harley-Davidson 20 SAP 40 .COM SAP SAP SAP Source: 2626422020 2815 BUELL DR East Troy, WI 53120-1366 Buell Motorcycle Co 30 5099286811 6815 E TRENT AVE Spokane Valley, WA 99212-1252 Shumate Harley Davidson 20 60 4145353500 3700 W Juneau Ave Milwaukee, WI 53208-2865 Harley-Davidson Motor Co 20 4143424680 3700 W Juneau Ave Milwaukee, WI 53208-2865 Harley-Davidson Inc 10 Phone: Address: Name: Bill To: ID: 80 90 30 20 40 50 60 10 70 Existing (SAP) Relationships Shipping Location Pricing Legend:
    • 21. Initial Hierarchy Harmonization: DNB Reference Source 4143424680 3700 W Juneau Ave Milwaukee, WI 532082865 Harley-Davidson Motor Company Inc 1 DNB 5 7758863393 4150 Technology Way Carson City, NV 897062009 Harley-Davidson Credit Corp 2 DNB 6 DNB DNB DNB DNB Source: 1865719000 6000 Garsington Rd Oxford, Oxfordshire OX4 2DQ Harley-Davidson Europe LTD 1 4 2626422020 2815 Buell Dr East Troy, WI 531201366 Buell Motorcycle Company 1 3 3123689501 150 S Wacker Dr Chicago, IL 606064103 Harley-Davidson Financial Services, Inc 1 2 4143424680 3700 W Juneau Ave Milwaukee, WI 532082865 Harley-Davidson Inc 1 Phone: Address: Name: Parent: ID: 1 2 3 4 5 6 4143424680 3700 W Juneau Ave Milwaukee, WI 532082865 Harley-Davidson Motor Company Inc 1 DNB 5 DNB Source: 4143424680 3700 W Juneau Ave Milwaukee, WI 532082865 Harley-Davidson Inc 1 Phone: Address: Name: Parent: ID:
    • 22. Initial Hierarchy Harmonization: Target State 80 90 30 20 40 50 60 1 2 3 4 5 6 10 70 .COM DNB Source: 4143424680 3700 W. Juneau Ave. Milwaukee, WI 53208-2865 Harley-Davidson Inc. 10 4143424680 3700 W. Juneau Ave. Milwaukee, WI 532082865 Harley-Davidson Inc. 1 Phone: Address: Name: Parent: ID: SAP DNB Source: 4145353500 3700 W. Juneau Ave. Milwaukee, WI 53208-2865 Harley-Davidson Motor Co. 20 4143424680 3700 W. Juneau Ave. Milwaukee, WI 532082865 Harley-Davidson Motor Company Inc. 1 5 Phone: Address: Name: Parent: ID: SAP DNB Source: 2626422020 2815 BUELL DR East Troy, WI 53120-1366 Buell Motorcycle Co. 30 2626422020 2815 Buell Dr. East Troy, WI 531201366 Buell Motorcycle Company 1 3 Phone: Address: Name: Parent: ID:
    • 23. Resolve &amp; Rationalize Hierarchies for Immediate Impact 20 40 50 60 70 80 90 30 <ul><li>Accurate Relationships Will Drive Revenue: DNB members </li></ul><ul><li>that are not yet Customers </li></ul><ul><li>Misaligned Pricing or Territories: Unassigned or incorrect </li></ul><ul><li>track codes and rep assignments </li></ul><ul><li>Incomplete Customer Profiles: Matched &amp; organized </li></ul><ul><li>records required for accurate analytics </li></ul><ul><li>Will be properly included and targeted in marketing campaigns and in sales activity </li></ul>Improved Customer Satisfaction, Improved Sales Coverage, Improved Sales Operations <ul><li>Deliver complete customer relationships to Data Warehouse and Marketing Analytics apps </li></ul>ID – Pricing Code : 20 – KFDRR 70 – [missing] Harley Davidson: Global Account Info : Customer Locations / Bus Units 9 Current Yearly Purchasing $900K Added Potential $300K 10 1 2 3 4 5 6 20 70 2 4 6
    • 24. New Account Creation Scenarios
      ID | Parent | Name | Address | Phone | Source
      A | | Andya Harley Davidson | HWY 81 N Grand Forks, ND 58206 | | New
      B | | Buell Motor Cycles | 8272 Gateway Boulevard El Paso, TX 77907 | 592-5804 | New
      C | | Schumate Harley Davidson | 7001 E Trent Ave Spokane, WA 99212 | | New
    • 25. Improve Account Creation Process &amp; Data Quality <ul><li>Duplicate prevention </li></ul><ul><li>Pricing alignment </li></ul><ul><li>Territory assignment </li></ul>20 40 50 60 70 80 90 30 10 1 2 3 4 5 6 Pricing Code : KFFDE KFFDE Sales Territory : 840123 Assign Correct Track Code to New Account KFFDE Assign Correct Territory to New Account 840123 [new] SAP Source: HWY 81 N. Grand Forks, ND 58206 Andya Harley Davidson A 7017756098 Business Highway 81 N. Grand Forks, ND 58203 Andy’s Harley-Davidson 20 40 Phone: Address: Name: Parent: ID: A B C Duplicate: DO NOT ADD! [new] SAP Source: 8272 Gateway Boulevard El Paso, TX 77907 Buell Motor Cycles B N / A 8272 Gateway Blvd. E. El Paso, TX 79907-1511 Buell Motorcycles 2 6 Track Code: Address: Name: Parent: ID: [new] .COM Source: 7001 E. Trent Ave. Spokane, WA 99212 Schumate Harley Davidson C 840123 6815 E. TRENT AVE. Spokane Valley, WA 99212-1252 Shumate Harley Davidson 30 90 Territory: Address: Name: Parent: ID:
    • 26. Relationship Management <ul><li>A relationship is a much looser construct than a hierarchy. Relationships can be used to associate people with groups, products with categories, etc. </li></ul><ul><li>Relationships support one-to-many and many-to-many associations </li></ul><ul><li>Relationships can be symmetric or asymmetric </li></ul><ul><li>For each relationship type, its cardinality (one-to-many or many-to-many) is defined along with its symmetry (symmetric or asymmetric) </li></ul><ul><li>When a relationship type is defined, the types of records that can be related are also defined </li></ul>Many-to-Many One-to-Many Asymmetric Symmetric
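A relationship type as described above carries three pieces of definition: the record types it may connect, its cardinality, and its symmetry. The sketch below shows one way to model that; the class, its fields, and the example type names are hypothetical and are not the product's data model.

```python
from dataclasses import dataclass, field


@dataclass
class RelationshipType:
    name: str
    from_type: str       # record type allowed on the left ("one") side
    to_type: str         # record type allowed on the right ("many") side
    many_to_many: bool   # False means one-to-many
    symmetric: bool
    links: set = field(default_factory=set)

    def relate(self, a, b):
        """Link records a and b, each given as a (record_type, record_id) tuple."""
        if (a[0], b[0]) != (self.from_type, self.to_type):
            raise TypeError(f"{self.name} cannot relate {a[0]} to {b[0]}")
        if (a, b) in self.links:
            return  # already linked, nothing to do
        # One-to-many: a record on the "many" side may have only one counterpart.
        if not self.many_to_many and any(t == b for _, t in self.links):
            raise ValueError(f"{b} is already related under {self.name}")
        self.links.add((a, b))

    def related(self, a, b):
        """True if a is related to b; symmetric types also match the reverse."""
        if (a, b) in self.links:
            return True
        return self.symmetric and (b, a) in self.links
```

For example, a `member_of` type from `person` to `group` would be many-to-many and asymmetric, while a `partner_of` type between two `person` records would be symmetric.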
    • 27. Information Quality and Data Stewardship Framework
    • 28. Approaches to Information Quality <ul><li>“Upstream”: at the point of entry of customer information </li></ul><ul><ul><li>Better validation </li></ul></ul><ul><ul><li>A change in business process is likely required </li></ul></ul><ul><ul><li>Changes in applications, workflows and data flows </li></ul></ul><ul><li>“Downstream” </li></ul><ul><ul><li>Focus on ETL </li></ul></ul><ul><ul><li>Includes data stewardship </li></ul></ul><ul><ul><li>Less invasive – does not require changes in business processes </li></ul></ul><ul><ul><li>Less effective – always working against the incoming flow of dirty data </li></ul></ul><ul><li>A combination of the “Upstream” and “Downstream” approaches is required for best results </li></ul>
    • 29. Moving Resolution of IQ Issues Closer to Point of Entry: Account Opening &amp; Client On-boarding Customer Information File Householding System Product 1: Account 1: Account Attributes Customer Attributes Account 2: Account Attributes Customer Attributes Product 2: Account 3: Account Attributes Customer Attributes Product 3: Account 4: Account Attributes Customer Attributes Account 5: Account Attributes Customer Attributes Product 1: Account 1: Account Attributes Customer Attributes Account 2: Account Attributes Customer Attributes Product 2: Account 3: Account Attributes Customer Attributes Product 3: Account 4: Account Attributes Customer Attributes Account 5: Account Attributes Customer Attributes Account-centric: Customer-centric: Data Entry Data Enrichment Vendor Data Entry Data Hub
    • 30. Enterprise Data Stewardship Framework Policies, standards, processes, roles, responsibilities, metrics, &amp; controls Information Technology <ul><li>Performs ongoing data quality task resolution </li></ul><ul><li>Improves data entry validation </li></ul><ul><li>Configures data quality task generation &amp; reports </li></ul>Data Stewards Data Quality Improvement Loop <ul><li>Validity of Identifiers </li></ul><ul><li>Updates overlaying identity </li></ul><ul><li>Link </li></ul><ul><li>Merge </li></ul><ul><li>Conflicts between match &amp; hierarchy / relationships associations </li></ul><ul><li>Summary reports on data quality metrics </li></ul>Technology Supports Customizable Data Quality Task Resolution Queues: Data Governance Systems
    • 31. Initiate TM Inspector: High Level Capabilities
    • 32. Initiate TM Inspector: Primary Tasks
    • 33. Initiate TM Inspector: “Potential Linkage”
    • 34. Initiate TM Inspector: “Potential Linkage” – Select and Action (Page 1 of 2)
    • 35. Service Oriented Architecture and Data Services: Strengths and Weaknesses
    • 36. Service Oriented Architecture <ul><li>Software design and implementation architecture characterized by the following: </li></ul><ul><ul><li>Logical View – abstraction of loosely coupled, reusable business needs and functions </li></ul></ul><ul><ul><li>Message Orientation – exchange between provider and requester </li></ul></ul><ul><ul><li>Description Orientation – machine-processable metadata to support interoperability </li></ul></ul><ul><ul><li>Granularity – combination of coarse-grained and fine-grained </li></ul></ul><ul><ul><li>Network Orientation – typically used over a network </li></ul></ul><ul><ul><li>Platform Neutral </li></ul></ul><ul><li>Software architecture in which software components are exposed as services on the network, which can be reused as necessary for different applications </li></ul><ul><li>When implemented with web services, offers a standard foundation for functionality reuse and data access within a federated enterprise and for services provided externally </li></ul><ul><li>What do SOA initiatives have in common with EDM or MDM? </li></ul><ul><ul><li>Data Services </li></ul></ul>
    • 37. Data Services: Format Agnostic &amp; Source Agnostic Data Management Data Sources List Reports Data Services Metadata End User Interfaces with the System by Requesting a data source agnostic Business/Data Service Executive Manager Analyst Find Client Report Premium Revenue Create New Account Create New Report Data Services Metadata translates the coarse-grained business service call into an orchestrated set of data-location-aware service calls. The Metadata processes the parameters of the request together with the security access eligibility and data visibility parameters of the requestor. Transform Data CRUD Record Capture Exception Calculate Get Best Source Data and Process It Execute and Return to Requestor 2 3 4 Note: Metadata insulates user applications and business services from data sources. Thus the data sources can be changed or replaced seamlessly, with no changes in user interfaces or user experience 1
    • 38. Data Hub: SOA Architecture Viewpoint Legacy Data Store Other Data Store Data Provider Service Interface Data Provider Service Interface CDI Customer Hub Data Provider Service Interface Service Consumers – Business Applications External (Exposed) Services Layer Internal Services Layer <ul><li>Individual Identification &amp; Recognition </li></ul><ul><li>Privacy Preference Capture &amp; Notification </li></ul><ul><li>Identification of Associations, Roles &amp; Relationships </li></ul><ul><li>Customer Grouping Management </li></ul><ul><li>Compliance Notification &amp; Reporting </li></ul><ul><li>Customer Information Maintenance </li></ul><ul><li>Customer Insight Reporting </li></ul><ul><li>Key Services </li></ul><ul><li>Third-Party Data Interface Service </li></ul><ul><li>Data Synchronization &amp; Queue Management </li></ul><ul><li>Data Archival &amp; Versioning </li></ul><ul><li>Coordination &amp; Orchestration </li></ul><ul><li>Visibility, Entitlements, Privacy &amp; Security </li></ul><ul><li>Rules / Workflow Administration </li></ul><ul><li>Metadata Management </li></ul><ul><li>Transaction Logging &amp; Auditing </li></ul><ul><li>Content Management &amp; Caching </li></ul><ul><li>Event / Notification Management </li></ul><ul><li>Error Management </li></ul>
    • 39. Reference Architecture Process Mgmt. Orchestration Contact Mgmt. Campaign Mgmt. Relationship Mgmt. Document Mgmt. Business Processes Layer ILLUSTRATIVE Hub Data Management Layer Party Mgmt. Client/Suspect Identification Profile Mgmt. Grouping Mgmt. Enrichment &amp; Sustaining Hub Data Rules Layer Rules Capture &amp; Mgmt. Synchronization Rules Identity Matching Aggregation &amp; Split Rules Visibility Rules Transformation Rules Hub Data Quality Layer Data Quality Mgmt. GUID Mgmt. Address Standardization Reporting Transformation &amp; Lineage Hub Systems Services Layer Transaction &amp; State Mgmt. Security Visibility Services Orchestration Legacy Connectivity Persistence Synchronization Data Sources
    • 40. Pros &amp; Cons of SOA: When to Use or Not to Use SOA Cons <ul><li>Performance problems if multiple pieces of information are to be joined </li></ul><ul><li>If implemented with web services, solutions do not support transactional integrity for synchronous processes; compensating transactions required </li></ul><ul><li>SOA initiatives don’t meet expectations if not supported by data strategy </li></ul><ul><li>Use of data services requires strong governance and a new culture for the data governors, data stewards, and testers </li></ul>Pros <ul><li>Reduced data redundancy </li></ul><ul><li>Business, data governance and data stewards can define SLA via services </li></ul><ul><li>Data services provide level of abstraction – no need to work at the data and data model levels </li></ul><ul><li>Standardized interaction within the enterprise, external and vendor provided services </li></ul><ul><li>Increased productivity of development and agility to support evolving requirements </li></ul>When implemented properly, SOA and data services provide significant benefits for MDM and EDM
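The slide above notes that web-service-based solutions need compensating transactions in place of transactional integrity. A minimal sketch of the pattern follows: each step is paired with an undo action, and when a later step fails, the undo actions of the completed steps run in reverse order. The function name and the step structure are hypothetical illustrations, not part of any product or standard API.

```python
def run_with_compensation(steps):
    """Run (action, compensate) pairs in order.

    If an action raises, the compensations of all previously completed
    actions run in reverse order, and the original error is re-raised.
    """
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        raise


# Illustrative use: the second service call fails, so the first is undone.
log = []


def fail_hub_update():
    raise RuntimeError("hub update service unavailable")


steps = [
    (lambda: log.append("create account"), lambda: log.append("delete account")),
    (fail_hub_update, lambda: log.append("revert hub update")),
]

try:
    run_with_compensation(steps)
except RuntimeError:
    log.append("reported failure")
```

Compensation is not a rollback: other consumers may observe the intermediate state between the action and its undo, which is one reason the slide lists this as a cost of the approach.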
    • 41. Testing Data Hub (Not Just Data But Also Services) <ul><ul><li>Testing SOAP messages </li></ul></ul><ul><ul><li>Testing WSDL files and using them for test plan generation </li></ul></ul><ul><ul><li>Web service consumer and producer emulation </li></ul></ul><ul><ul><li>Testing the publish, find, and bind capabilities of a SOA </li></ul></ul><ul><ul><li>Testing the asynchronous capabilities of Web services </li></ul></ul><ul><ul><li>Testing dynamic run-time capabilities of Web Services </li></ul></ul><ul><ul><li>Web services orchestration testing </li></ul></ul><ul><ul><li>Web service versioning testing </li></ul></ul>
    • 42. Information Management Methodology
    • 43. Importance of Information Management Methodology <ul><li>Implementation of enterprise information management projects (DW, ODS, MDM, CDI, etc.) requires a well-structured methodology </li></ul><ul><li>Methodologies used for data-intensive projects are different from traditional application development methodologies </li></ul>
    • 44. Methodology – Need and Overview <ul><li>MIKE2.0 (Method for an Integrated Knowledge Environment) is an open source methodology for Enterprise Information Management ( www.openmethodology.org ) </li></ul><ul><ul><li>Developed by BearingPoint </li></ul></ul><ul><ul><li>Available to the Open Source Community since December 2006 </li></ul></ul><ul><ul><li>Transition to Open Source “Creative Commons” license completed in May 2007 </li></ul></ul><ul><ul><li>Over 2000 online pages and growing </li></ul></ul><ul><ul><li>Contains Phases, Activities and Tasks </li></ul></ul><ul><li>Open Source MIKE2.0 allows the global community to use this methodology right now for their Information Development initiatives </li></ul><ul><li>Organizations and individuals can sign up to become contributing members of MIKE2.0 </li></ul>
    • 45. Mike2.0 – Usage Model Details
    • 46. Mike2.0 SAFE Architecture SAFE (Strategic Architecture for the Federated Enterprise) provides the technology solution framework for MIKE2.0.
    • 47. Lessons Learned and Accelerators
    • 48. The Three Dimensional Socialization Roadmap <ul><li>Ownership: Demonstrated commitment to the change and accountability </li></ul><ul><li>Buy-in: Agreement with the concepts and ideas &amp; expressed support </li></ul><ul><li>Understanding: Internalizing the concepts and ideas and grasping the implications of the change </li></ul><ul><li>Awareness: Becoming cognizant and developing a sense of appreciation for the change </li></ul>Ownership Buy-in Understanding Awareness Security Front Office Back Office Senior Management Technology &amp; Infrastructure Legacy Systems Lifecycle Phases/Releases Level of Involvement Stakeholders Training Testing Development Planning Helps Program Managers Build Communications Plan
    • 49. Typical Implementation Work Streams: Organizing for Success &amp; Breaking the Problem Down <ul><li>It is much easier to discuss, define and plan MDM-CDI when the problem is broken down into more manageable areas and specialty domains </li></ul><ul><li>Master Entity Identification </li></ul><ul><li>Entity groups &amp; relationships </li></ul><ul><li>Data governance, standards, quality, &amp; compliance </li></ul><ul><li>Data architecture </li></ul><ul><li>Metadata management &amp; administrative applications </li></ul><ul><li>Initial data load </li></ul><ul><li>Inbound data processing (batch &amp; real-time) </li></ul><ul><li>Outbound data processing (batch &amp; real-time) </li></ul><ul><li>Changes to legacy systems &amp; applications </li></ul><ul><li>Visibility &amp; security </li></ul><ul><li>Exception processing </li></ul><ul><li>Infrastructure </li></ul><ul><li>Data Hub applications </li></ul><ul><li>Reporting requirements of a stratified user community </li></ul><ul><li>Testing </li></ul><ul><li>Release management </li></ul><ul><li>Deployment </li></ul><ul><li>Training </li></ul>Helps Program Managers Build Project Plan
    • 50. Complexity vs. Manageability Helps Program Managers Define Phases &amp; Releases Manageability Complexity Plan a Release Here Critical Point
    • 51. Fastest ROI Time - Months Business Value &amp; ROI High Low Potential End State CDI Start 6 12 18 24 Start w/ Master Initial Phase ROI Initial Resolve… Relevant Relationships in the Data Synchronize… Data, Systems, Processes &amp; People Master… Your Data Start w/ Resolve
    • 52. Implementation Continuum Cross Reference Management Customer Data Access Customer Data Synchronization Customer Transaction Management Customer Process Management <ul><li>Establish and maintain a trusted source for analysis </li></ul><ul><li>Provide people with on demand search </li></ul><ul><li>Transactional applications built on top of customer definition across sources </li></ul><ul><li>Transference of record ownership – Hub owned and maintained </li></ul><ul><li>Manage business process associated with customer data/ transaction management </li></ul><ul><li>Provide bi-directional update between sources of customer information through messaging, APIs and other integration methods </li></ul><ul><li>Create linkages amongst all records </li></ul><ul><li>Prepare data for new systems </li></ul>Batch/ Analytical Real-time/ Operational Pure Registry Mastered
    • 53. Develop Repeatable Initiative On-boarding Processes &amp; Templates Program Initiation: Program Planning &amp; Definition Program Initiated DW DEVELOPMENT CDI-MDM DATA PROFILING &amp; DATA QUALITY DATA SERVICES FRAMEWORK DATA GOVERNANCE &amp; STEWARDSHIP DATA MODELING On-boarding Initiative 3 On-boarding Initiative 2 On-boarding Initiative 1 <ul><li>Business Case &amp; Value Proposition </li></ul><ul><li>Business Requirements </li></ul><ul><li>Target State Solution </li></ul><ul><li>Detailed Roadmap </li></ul><ul><li>Data Governance, Standards, Data Quality </li></ul><ul><li>Architectural Principles </li></ul>Global Geography Enterprise Information Program
    • 54. Develop Repeatable MDM Systems On-boarding Processes and Templates <ul><li>In the first year and first implementation phase, the number of legacy systems integrated in the scope of MDM-CDI is limited (typically 2-3) </li></ul><ul><li>How can on-boarding of new systems be accelerated in subsequent phases, given that it is not unusual for 20-50 systems to be in scope of MDM-CDI integration? </li></ul><ul><li>A well-defined set of system on-boarding standards and procedures determines common rules that each legacy system should comply with to be integrated into the evolving MDM-CDI solution </li></ul><ul><ul><li>Enables a repeatable on-boarding process and sustainable, accelerated solution growth in terms of the number of systems and LOBs </li></ul></ul><ul><ul><li>Preserves integrity and consistency of the MDM-CDI solution </li></ul></ul><ul><ul><li>Improves data governance </li></ul></ul><ul><ul><li>Enables highly sustainable pace of </li></ul></ul>
    • 55. Two Schools of Thought on Hub Data Model <ul><li>Hub with Out-of-the-box Data Model </li></ul><ul><li>Pros </li></ul><ul><li>Seems attractive to have the “right” data model out-of-the-box </li></ul><ul><li>The product has some pre-built coarse-grained business transactions </li></ul><ul><li>Cons </li></ul><ul><li>How flexible is it in supporting ongoing changes? </li></ul><ul><li>Overhead of having multiple entities and attributes that are never used by your specific solution </li></ul><ul><li>Data Model Agnostic Hub Product </li></ul><ul><li>Pros </li></ul><ul><li>Flexible enough to accommodate any data model and its changes </li></ul><ul><li>Can generate fine-grained services on top of any data model </li></ul><ul><li>Cons </li></ul><ul><li>Development work required to build coarse-grained services to support composite transactions </li></ul><ul><li>Possible performance and maintenance impact due to additional metadata lookups </li></ul>
    • 56. Hub Implementation: Buy vs. Build vs. Data Enrichment Partner <ul><li>The traditional “buy or build” question is typically resolved in favor of “buy” </li></ul><ul><li>An additional consideration is the use of an External Data Enrichment vendor </li></ul><ul><ul><li>Can we outsource the primary function of the CDI hub and perform customer matching externally? </li></ul></ul><ul><li>Use of an External Data Enrichment partner has its own pros and cons </li></ul><ul><ul><li>Pros </li></ul></ul><ul><ul><ul><li>Higher match accuracy that is based on the Knowledge Base (US NCOA and other Libraries) </li></ul></ul></ul><ul><ul><ul><li>Ability to recognize new customers and prospects </li></ul></ul></ul><ul><ul><ul><li>Additional data from the Knowledge Base – “data enrichment” </li></ul></ul></ul><ul><ul><li>Cons </li></ul></ul><ul><ul><ul><li>Need to share customer data with an external vendor </li></ul></ul></ul><ul><ul><ul><li>Capabilities and Knowledge Base quality depend on the country – domestic better than international </li></ul></ul></ul><ul><ul><ul><li>Additional cost </li></ul></ul></ul>
    • 57. Focus on Data Mapping <ul><li>Data Mapping is an activity that “maps” the legacy system attributes to the new customer-centric model and vice versa </li></ul><ul><li>This activity is performed by business analysts </li></ul><ul><li>The produced data maps are used by ETL and EAI developers </li></ul><ul><li>The mapping process is time-consuming, can cause numerous errors and can be on the critical path of development </li></ul><ul><li>A data mapping vendor product can help accelerate delivery </li></ul><ul><ul><li>Drag-and-drop interface </li></ul></ul><ul><ul><li>Open source mapping metadata </li></ul></ul><ul><ul><li>Ability to integrate the mapping metadata with ETL, EAI and EII tools and share the metadata rules </li></ul></ul><ul><ul><li>Ability to reverse the transformation rules when possible </li></ul></ul>
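The idea of mapping metadata with reversible rules can be sketched as a small rule table. Each rule names a legacy attribute, a hub attribute, a forward transformation, and an inverse where one exists. All attribute names, formats, and helper functions here are hypothetical examples, not taken from any vendor product.

```python
from typing import Callable, NamedTuple, Optional


class Rule(NamedTuple):
    source: str                               # legacy system attribute
    target: str                               # customer-centric hub attribute
    forward: Callable[[str], str]             # legacy -> hub
    inverse: Optional[Callable[[str], str]]   # hub -> legacy, if reversible


def mdy_to_iso(value: str) -> str:
    """'MMDDYYYY' -> 'YYYY-MM-DD' (assumed legacy date layout)."""
    return f"{value[4:]}-{value[:2]}-{value[2:4]}"


def iso_to_mdy(value: str) -> str:
    """'YYYY-MM-DD' -> 'MMDDYYYY'."""
    year, month, day = value.split("-")
    return month + day + year


RULES = [
    # Trimming and re-casing lose information, so this rule has no inverse.
    Rule("CUST_NM", "name", lambda s: s.strip().title(), None),
    Rule("OPEN_DT", "opened_on", mdy_to_iso, iso_to_mdy),
]


def to_hub(legacy: dict) -> dict:
    """Apply every rule whose source attribute is present."""
    return {r.target: r.forward(legacy[r.source]) for r in RULES if r.source in legacy}


def to_legacy(hub: dict) -> dict:
    """Apply only the reversible rules, in the hub-to-legacy direction."""
    return {
        r.source: r.inverse(hub[r.target])
        for r in RULES
        if r.inverse is not None and r.target in hub
    }
```

Keeping the rules as data rather than code is what would let ETL, EAI, and EII tools share them, as the slide suggests.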
    • 58. Creation and Protection of Test Data <ul><li>Sensitive customer data must be anonymized (obfuscated, cloaked) to disguise it from unauthorized personnel in test and development environments. </li></ul><ul><li>Some anonymization techniques are as follows: </li></ul><ul><ul><ul><li>Masking Data </li></ul></ul></ul><ul><ul><ul><li>Substitution </li></ul></ul></ul><ul><ul><ul><li>Shuffling Records </li></ul></ul></ul><ul><ul><ul><li>Number Variance </li></ul></ul></ul><ul><ul><ul><li>Gibberish Generation </li></ul></ul></ul><ul><ul><ul><li>Encryption / Decryption </li></ul></ul></ul><ul><li>Key challenges in using data anonymization techniques include: </li></ul><ul><ul><li>Ability to preserve logic potentially embedded in data to ensure that application logic continues to function </li></ul></ul><ul><ul><li>Need to provide a consistent transformation outcome for the same data </li></ul></ul>
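Two of the listed techniques, masking and substitution, can be sketched so that they also address the two challenges named above. The masking keeps the value's format so that format-dependent application logic still works, and the substitution is keyed so the same input always yields the same replacement. The key, the replacement pool, and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for a test environment; a real key would be managed securely.
SECRET = b"test-env-only-key"


def substitute(value: str, pool: list) -> str:
    """Deterministically replace value with an entry from pool.

    The keyed hash gives the same replacement for the same input,
    so joins across anonymized tables still line up.
    """
    digest = hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]


def mask_digits(value: str, keep_last: int = 4) -> str:
    """Replace all but the last keep_last digits with 'X', keeping punctuation."""
    total = sum(ch.isdigit() for ch in value)
    out, seen = [], 0
    for ch in value:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen > total - keep_last else "X")
        else:
            out.append(ch)
    return "".join(out)


NAMES = ["Acme Cycles", "Northwind Motors", "Contoso Riders"]
print(mask_digits("414-342-4680"))                 # XXX-XXX-4680
print(substitute("Harley-Davidson Inc", NAMES))    # same pool entry on every run
```

Distinct inputs can collide on the same pool entry when the pool is small, so a real substitution set would need to be large enough for the uniqueness the application expects.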
    • 59. Some Key Reasons for Project Failure to Avoid at Project Kick-off <ul><li>Lack of executive support and budgetary commitment </li></ul><ul><li>Lack of cooperation and/or coordination between business and technology </li></ul><ul><li>Lack of consuming applications – “if we build, they will come…” </li></ul><ul><li>Lack of end-user adoption </li></ul><ul><li>Underestimation of legacy impact </li></ul><ul><li>Insufficient socialization throughout the enterprise to include all stakeholders at the right level </li></ul><ul><li>Underestimation of the need for layered architecture provided by SOA </li></ul><ul><li>Gaps in data governance, stewardship, and information quality strategy </li></ul><ul><li>Miscalculated staffing needs </li></ul>
    • 60. What are the Next Disruptive Things in MDM and EDM? <ul><li>Match and link evolution: From entities to relationships </li></ul><ul><li>Integration of MDM with Business Rules Engines and Work Flows </li></ul><ul><li>Data Stewardship Framework </li></ul><ul><li>Metadata Integration </li></ul><ul><li>Externalization of data visibility and security </li></ul>
    • 61. ? Q&amp;A
