Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Data modeling for the business

Data modelling for the business half day workshop presented at the Enterprise Data & Business Intelligence conference in London on November 3rd 2014
chris.bradley@dmadvisors.co.uk

Related Books

Free with a 30 day trial from Scribd

See all

Related Audiobooks

Free with a 30 day trial from Scribd

See all

Data modeling for the business

  1. 1. P A G E 1 Data Modelling for the Business The Critical Role of a Conceptual Data Model C H R I S T O P H E R B R A D L E Y ( C D M P F E L L O W ) chris.bradley@dmadvisors.co.uk
  2. 2. P A G E 2
  3. 3. P / 3 Christopher Bradley I N F O R M AT I O N M A N A G E M E N T S T R AT E G I S T Information Management, Life & Petrol http://infomanagementlifeandpetrol.blogspot.com @InfoRacer uk.linkedin.com/in/christophermichaelbradley/ +44 7973 184475 (mobile) Chris.Bradley@DMAdvisors.co.uk +44 1225 923000 (office)
  4. 4. P / 4 Christopher Bradley Chris has 36 years of Information Management experience & is a leading Independent Information Management strategy advisor. In the Information Management field, Chris works with prominent organizations including HSBC, Celgene, GSK, Pfizer, Icon, Quintiles, Total, Barclays, ANZ, GSK, Shell, BP, Statoil, Riyad Bank & Aramco. He addresses challenges faced by large organisations in the areas of Data Governance, Master Data Management, Information Management Strategy, Data Quality, Metadata Management and Business Intelligence. He is President of DAMA UK, DAMA- I DMBOK 2 author. In April 2016 he became the inaugural CDMP Fellow, and received the DAMA lifetime professional achievement award. He is an author & examiner for CDMP, a Fellow of the Chartered Institute of Management Consulting (now IC) a member of the MPO, and SME Director of the DM Board. A recognised thought-leader in Information Management Chris is the author of numerous papers, books, including sections of DMBoK 2.0, a columnist, a frequent contributor to industry publications and member of several IM standards authorities. He leads an experts channel on the influential BeyeNETWORK, is a sought after speaker at major international conferences, and is the co-author of “Data Modelling For The Business – A Handbook for aligning the business with IT using high-level data models”. He also blogs frequently on Information Management (and motorsport).
  5. 5. P / 5
  6. 6. Recent PresentationsDAMA Sydney: August 2016 Sydney; “Data Quality, The Impact of Dirty Data?” DAMA Melbourne: August 2016 Melbourne; “Data Quality, Are You Feeling Lucky?” DAMA Canberra: August 2016 Canberra; “Data Quality, Are You Feeling Lucky?” Australian Computer Society (ACS): August 2016 Canberra; “Data Quality Matters For Machines Too” MDM DG Europe: May 2016 London; “Data Governance by Stealth” Webinar: May 2016; “Data Modelling is not JUST for DBMS design” Enterprise Data World: April 2016 San Diego; “DAMA CDMP Workshop” DAMA Webinar: April 2016; “Data Integration & Interoperability” Disciplines of the DAMA DMBoK” DAMA Webinar: March 2016; “Document & Content Management” Disciplines of the DAMA DMBoK” DAMA Webinar: February 2016; “Data Operations Management” Disciplines of the DAMA DMBoK” Oil & Gas Data Management Conference: February 2016, London; “Developing An Information Strategy To Align With In-Flight Business Programs” DAMA Webinar: January 2016; “Data Governance” Disciplines of the DAMA DMBoK” DAMA Webinar: December 2015; “Information Lifecycle Management” Disciplines of the DAMA DMBoK” DAMA Webinar: November 2015; “MetaData Management” Disciplines of the DAMA DMBoK” Enterprise Data & BI Europe (IRM): November 2015, London; “Is the Data Asset really different?” & “CDMP Examination Preparation” & “Data Management Room 101” DAMA Webinar: October 2015; “Data Risk & Security” Disciplines of the DAMA DMBoK” DAMA Webinar: September 2015; “Data Warehousing & Business Intelligence” Disciplines of the DAMA DMBoK” DAMA Webinar: August 2015; “Data Quality Management” Disciplines of the DAMA DMBoK” BCS & DAMA Seminar: June 2015; “Is the Data Asset really different?” DAMA Webinar: June 2015; “Data Modelling” Disciplines of the DAMA DMBoK” PRISME Pharmaceutical Congress: May 2015, Basel, CH; “Building & exploiting a Pharmaceutical Industry consensus data model” MDM DG Europe (IRM): May 2015, London; “CDMP Examination Preparation” & “Data Governance By Stealth?, Can you ‘sell’ Data Governance if the stakeholders don’t get it?” DAMA Webinar: April 2015; “Master & Reference Data Management” Disciplines of the DMBoK” Enterprise Data World: April 2015, Washington DC USA; “Data Modelling For The Business” and “Evaluating Information Management Tools” DAMA Webinar: February 2015; “An Introduction to the Information Disciplines of the DMBoK” Dataversity Webinar: February 2015; “How to successfully introduce Master & Reference data management” Petroleum Information Management Summit 2015: February 2015, Berlin DE, “How to succeed with MDM and Data Governance” Enterprise Data & Business Intelligence 2014: (IRM), November 2014, London, UK “Data Modelling 101” Enterprise Data World: (DataVersity), May 2014, Austin, Texas, “MDM Architectures & How to identify the right Subject Area & tooling for your MDM strategy” E&P Information Management Dubai: (DMBoard),17-19 March 2014, Dubai, UAE “Master Data Management Fundamentals, Architectures & Identify the starting Data Subject Areas” DAMA Australia: (DAMA-A),18-21 November 2013, Melbourne, Australia “DAMA DMBoK 2.0”, “Information Management Fundamentals” 1 day workshop” Data Management & Information Quality Europe: (IRM Conferences), 4-6 November 2013, London, UK; “Data Modelling Fundamentals” ½ day workshop: “Myths, Fairy Tales & The Single View” Seminar; “Imaginative Innovation - A Look to the Future” DAMA Panel Discussion IPL / Embarcadero series: June 2013, London, UK, “Implementing Effective Data Governance” Riyadh Information Exchange: May 2013, Riyadh, Saudi Arabia; “Big Data – What’s the big fuss?” Enterprise Data World: (Wilshire Conferences), May 2013, San Diego, USA, “Data and Process Blueprinting – A practical approach for rapidly optimising Information Assets” Data Governance & MDM Europe: (IRM Conferences), April 2013, London, “Selecting the Optimum Business approach for MDM success…. Case study with Statoil” E&P Information Management: (SMI Conference), February 2013, London, “Case Study, Using Data Virtualisation for Real Time BI & Analytics” E&P Data Governance: (DMBoard / DG Events), January 2013, Marrakech, Morocco, “Establishing a successful Data Governance program” Big Data 2: (Whitehall), December 2012, London, “The Pillars of successful knowledge management” Financial Information Management Association (FIMA): (WBR), November 2012, London; “Data Strategy as a Business Enabler” Data Modeling Zone: (Technics), November 2012, Baltimore USA; “Data Modelling for the business” Data Management & Information Quality Europe: (IRM), November 2012, London; “All you need to know to prepare for DAMA CDMP professional certification” ECIM Exploration & Production: September 2012, Haugesund, Norway: “Enhancing communication through the use of industry standard models; case study in E&P using WITSML” Preparing the Business for MDM success: Threadneedles Executive breakfast briefing series, July 2012, London Big Data – What’s the big fuss?: (Whitehall), Big Data & Analytics, June 2012, London, Enterprise Data World International: (DAMA / Wilshire), May 2012, Atlanta GA, “A Model Driven Data Governance Framework For MDM - Statoil Case Study” “When Two Worlds Collide – Data and Process Architecture Synergies” (rated best workshop in conference); “Petrochemical Information Management utilising PPDM in an Enterprise Information Architecture” Data Governance & MDM Europe: (DAMA / IRM), April 2012, London, “A Model Driven Data Governance Framework For MDM - Statoil Case Study” AAPG Exploration & Production Data Management: April 2012, Dead Sea Jordan; “A Process For Introducing Data Governance into Large Enterprises” PWC & Iron Mountain Corporate Information Management: March 2012, Madrid; “Information Management & Regulatory Compliance” DAMA Scandinavia: March 2012, Stockholm, “Reducing Complexity in Information Management” (rated best presentation in conference) Ovum IT Governance & Planning: March 2012, London; “Data Governance – An Essential Part of IT Governance” American Express Global Technology Conference: November 2011, UK, “All An Enterprise Architect Needs To Know About Information Management” FIMA Europe (Financial Information Management):, November 2011, London; “Confronting The Complexities Of Financial Regulation With A Customer Centric Approach; Applying a Master Data Management And Data Governance Process In Clydesdale Bank “ Data Management & Information Quality Europe: (DAMA / IRM), November 2011, London, “Assessing & Improving Information Management Effectiveness – Cambridge University Press Case Study”; “Too Good To Be True? – The Truth About Open Source BI” ECIM Exploration & Production: September 12th 14th 2011, Haugesund, Norway: “The Role Of Data Virtualisation In Your EIM Strategy” Enterprise Data World International: (DAMA / Wilshire), April 2011, Chicago IL; “How Do You Want Yours Served? – The Role Of Data Virtualisation And Open Source BI” Data Governance & MDM Europe: (DAMA / IRM), March 2011, London; “Clinical Information Data Governance”
  7. 7. P / 7 Recent Publications Book: “Data Modelling For The Business – A Handbook for aligning the business with IT using high-level data models”; Technics Publishing; ISBN 978-0-9771400-7-7; http://www.amazon.com/Data-Modeling-Business-Handbook-High-Level Book: “DAMA Data Management Body Of Knowledge 2.0” ; Technics Publishing; ISBN TBD Article: Back to the future for Data management? September 2015 Article: Is the “Data Asset” really different? July 2015 Article: A visit to the vet & a BA flight reminded me about Data Governance; June 2015 White Paper: “Information is at the heart of ALL Architecture disciplines”,; March 2014 White Paper: “Are you ready for Big Data ?”, November 2013 Article: The Bookbinder, the Librarian & a Data Governance story ; July 2013 Article: Data Governance is about Hearts and Minds, not Technology January 2013 White Paper: “The fundamentals of Information Management”, January 2013 White Paper: “Knowledge Management – From justification to delivery”, December 2012 Article: “Chief INFORMATION Officer? Not really” Article, November 2012 White Paper: “Running a successful Knowledge Management Practice” November 2012 White Paper: “Big Data Projects are not one man shows” June 2012 Article: “IPL & Statoil’s innovative approach to Master Data Management in Statoil”, Oil IT Journal, May 2012 White Paper: “Data Modelling is NOT just for DBMS’s” April 2012 Article: “Data Governance in the Financial Services Sector” FSTech Magazine, April 2012 Article: “Data Governance, an essential component of IT Governance" March 2012 Article: “Leveraging a Model Driven approach to Master Data Management in Statoil”, Oil IT Journal, February 2012 Article: “How Data Virtualization Helps Data Integration Strategies” BeyeNETWORK (December 2011) Article: “Approaches & Selection Criteria For organizations approaching data integration programmes” TechTarget (November 2011) Article: Big Data – Same Problems? BeyeNETWORK and TechTarget. (July 2011) Article “10 easy steps to evaluate Data Modelling tools” Information Management, (March 2010) Article “How Do You Want Your Data Served?” Conspectus Magazine (February 2010) Article “How do you want yours served (data that is)” (BeyeNETWORK January 2010) Article “Seven deadly sins of data modelling” (BeyeNETWORK October 2009) Article “Data Modelling is NOT just for DBMS’s” Part 1 BeyeNETWORK July 2009 and Part 2 BeyeNETWORK August 2009 Web Channel: BeyeNETWORK “Chris Bradley Expert Channel” Information Asset Management http://www.b-eye-network.co.uk/channels/1554/ Article: “Preventing a Data Disaster” February 2009, Database Marketing Magazine
  8. 8. Data Modelling Foundation (1 day) Introductory Intermediate Advanced / Deep Dive Advanced Data Modelling (3 days) Integrated Business Process, Data Requirements and Discovery (5 days) DAMA-I CDMP Exam Cram & Certification (3 days) Level Information Management For The Business (½ and 1 day) Multiple Levels of Training to meet your needs Data Modelling Fundamentals (3 days) The “client Way” Information Management Mentoring Information Management Fundamentals (4 or 5 days) Data Quality Management Implementation & Practice (1 and 2 day) Data Warehouse & Business Intelligence Implementation & Practice (1 and 2 day) Reference & Master Data Management Implementation & Practice (1 and 2 day) Data Governance Implementation & Practice (1 and 2 day) Data Integration Implementation & Practice (1 and 2 day) Introduction to Information Management (3 days) MetaData Management Practitioner (1 day)
  9. 9. P A G E 9 AGENDA › What is a Conceptual Data Model › Why is a Conceptual Data Model Important? › Data Models and Information Management › Aligning with Business Goals & Motivations › Data Modeling Basics
  10. 10. P A G E 1 0 Who Are You? Survey HOW WOULD YOU DESCRIBE YOUR ROLE? Businessperson or Business Analyst Business Intelligence Analyst or Developer Data Architect, Data Modeler, or Data Analyst DBA or Technical IT A combination of the above Other
  11. 11. P A G E 1 1 I am new to data modelling I know a little about data modelling I have a lot of experience with data modelling Poll: How Familiar are you with Data Modelling?
  12. 12. P A G E 1 2 What is a Conceptual Data Model?
  13. 13. P A G E 1 3 What is a Conceptual Data Model? › A Conceptual Data Model (CDM) uses simple graphical images to describe core concepts and principles of an organization and what they mean › The main audience of a CDM is businesspeople › A CDM is used to facilitate communication › A CDM is used to facilitate integration between systems, processes, and organizations › It needs to be high-level enough to be intuitive, but still capture the rules and definitions needed to create database systems PRECISION FLEXIBILITY
  14. 14. P A G E 1 4 THIS? THIS? or We all use Models to Gain Understanding
  15. 15. P A G E 1 5 We all use Models Conceptual
  16. 16. P A G E 1 6 We all use Models Logical
  17. 17. P A G E 1 7 We all use Models Physical
  18. 18. P A G E 1 8 Customer “A Picture is Worth a Thousand Words” EXAMPLES OF CONCEPTUAL DATA MODELS
  19. 19. P A G E 1 9 “A Picture is Worth a Thousand Words” Product Customer Location Order Raw Material Ingredient Region EXAMPLES OF CONCEPTUAL DATA MODELS
  20. 20. P A G E 2 0 “A Picture is Worth a Thousand Words” EXAMPLES OF CONCEPTUAL DATA MODELS
  21. 21. P A G E 2 1 “A Picture is Worth a Thousand Words” EXAMPLES OF CONCEPTUAL DATA MODELS
  22. 22. P A G E 2 2 Conceptual Data Models apply to Dimensional Data Modelling as well. “A Picture is Worth a Thousand Words” EXAMPLES OF CONCEPTUAL DATA MODELS
  23. 23. P A G E 2 3 “A Picture is Worth a Thousand Words” EXAMPLES OF CONCEPTUAL DATA MODELS
  24. 24. P A G E 2 4 “A Picture is Worth a Thousand Words” EXAMPLES OF CONCEPTUAL DATA MODELS
  25. 25. P A G E 2 5 Is Notation Important? › Many Notations can be used to express a high- level data model › The choice of notation depends on purpose and audience › For data-related initiatives, such as Master Data Management (MDM) and Data Warehousing (DW): » ER modeling using IE (Information Engineering) is our choice of notation » It is important that your high-level model uses a tool that can generate DDL, or can import/export with a tool that can » A repository-based solution helps with reuse and standards for enterprise-wide initiatives
  26. 26. P A G E 2 6 Multiple Models – Multiple Purposes – Multiple Audiences Conceptual Logical Physical Audience Businessperson 3NF Data Architect DBA , Developer Purpose Communication & Definition of Business Terms & Rules Clarification & Detail of Business Rules & Data Structures Technical Implementation on a Physical Database Business Concepts Data Entities Physical Tables Repository & Model-Driven
  27. 27. P A G E 2 7 Building Models Top Down vs. Bottom Up › Models can be built » Top-Down » Bottom-Up » Using a Hybrid Approach – Middle Out Conceptual Logical Physical Repository & Model-Driven
  28. 28. P A G E 2 8 Step 1: Speak with the business representatives. Document the key business requirements and agree on high-level scope. BRD Step 2: Create detailed business requirement specification document with subscriber data requirement, business process and business rules BRD Step 3: Understand and document the business major attributes and definitions from business subject matter experts. Create logical data model. Logical data model Tables & foreign key constraints Step 4: Verify the logical data model with the stakeholders and create physical data model Step 5: Implement using the created physical model Top-down Approach
  29. 29. P A G E 2 9 Step 5: Try to understand the business meanings of probable attributes and entities that may be candidates for logical data model Step 4: Document the meanings of columns, and tables from IT subject matter experts Step 3: Find out foreign key relationships between tables, from IT subject matter experts & verify findings A near logical data model (accuracy unknown) Tables & foreign key constraints Step 2: Profile data by browsing and analyzing the data from tables. Scan through the ETL to find out hidden relationships & constraints. Step 1: Reverse engineer the database schema that is already implemented. Bottom-up Approach
  30. 30. P A G E 3 0 Create Different Models for Different Audiences Business -> Conceptual
  31. 31. P A G E 3 1 Create Different Models for Different Audiences Technical -> Physical
  32. 32. P A G E 3 2 Database Script (DDL) vs. Physical Data Model product_id INTEGER NOT NULL, product_name VARCHAR(50) NULL, product_price NUMBER NULL); ALTER TABLE PRODUCT ADD ( PRIMARY KEY (product_id) ) ; CREATE TABLE DEPARTMENT ( department_id INTEGER NOT NULL, department_name VARCHAR(50) NULL); ALTER TABLE DEPARTMENT ADD ( PRIMARY KEY (department_id) ) ; CREATE TABLE EMPLOYEE (employee_id INTEGER NOT NULL, department_id INTEGER NOT NULL, employee_fname VARCHAR(50) NULL, employee_lname VARCHAR(50) NULL, employee_ssn CHAR(9) NULL); ALTER TABLE EMPLOYEE ADD ( PRIMARY KEY (employee_id) ) ; CREATE TABLE CUSTOMER ( customer_id INTEGER NOT NULL, customer_name VARCHAR(50) NULL, customer_address VARCHAR(150) NULL, customer_city VARCHAR(50) NULL, customer_state CHAR(2) NULL, customer_zip CHAR(9) NULL); ALTER TABLE CUSTOMER ADD ( PRIMARY KEY (customer_id) ) ; CREATE TABLE ZORDER ( zorder_id INTEGER NOT NULL, employee_id INTEGER NOT NULL, customer_id INTEGER NOT NULL, zorder_date DATE NULL); ALTER TABLE ZORDER ADD ( PRIMARY KEY (zorder_id) ) ; “A picture is worth a thousand words”
  33. 33. P A G E 3 3 CONCEPTUAL LOGICAL PHYSICAL Defines the scope, audience, context for information Defines key business concepts and their definitions Represents core business rules and data relationships at a detailed level Main purpose is for communication and agreement of scope and context Main purpose is for communication and agreement of definitions and business logic Provides enough detail for subsequent first cut physical design Relationships optional. If shown, represent hierarchy. Many-to-Many relationships OK Many-to-Many relationships resolved Cardinality not shown Cardinality shown Cardinality shown No attributes shown Attributes are optional. If shown, can be composite attributes to convey business meaning. Attributes required and all attributes are atomic. Primary and foreign keys defined. Not normalized (Relational models) Not normalized (Relational models) Fully normalized (Relational models) Subject names should represent high-level data subjects or functional areas of the business Concept names should use business terminology Entity names may be more abstract Subjects link to 1-M HDMs Many concepts are supertypes, although subtypes may be shown for clarity Supertypes all broken out to include sub- types ‘One pager’ Should be a ‘one pager’ May be larger than one page Business-driven Cross-functional & more senior people involved in HDM process with fewer IT. Multiple smaller groups of specialists and IT folks involved in LDM process. Informal notation ‘Looser’ notation required – some format construct needed, but ultimate goal is to be understood by a business user Formal notation required < 20 objects < 100 objects > 100 objects Comparing Conceptual, Logical & Physical Models
  34. 34. P A G E 3 4 Roles Culture 3NF DBAs, Data Architects and Executives are different creatures DBA • Cautious • Analytical • Structured • Doesn’t like to talk • “Just let me code!” Data Architect/Data Modeler • Analytical • Structured • Passionate • “Big Picture” focused • Likes to Talk • “Let me tell you about my data model!” Business Executive • Results-Oriented • “Big Picture” focused • Little Time • “How is this going to help me?” • “I don’t care about your data model.” • “I don’t have time.”
  35. 35. P A G E 3 5 Data Drives the Business – Make sure it’s Correct In today’s information age, data drives key business decisions. Executives ask questions such as: › How many customers do I have? › What is total revenue by region for last fiscal year? › Which products drove the most revenue this quarter? Behind the answers to those questions lies a data model: › Documenting the source and structure of data » What database(s) store customer information » How are these databases structured to store customer information › Defining key business terms » What is a product? e.g. Finished goods only? Raw materials? › Regulating business rules » Can a customer have more than one account?
  36. 36. P A G E 3 6 Information in Context There’s more to data than meets the eye I’d like a report showing all of our customers SUPPORT ENGINEER A person’s not a customer if they don’t have an active maintenance account. SALES A customer is someone who wants to buy our product. SYBASE DB2 ORACLE SQL SERVER MS SQL AZURE INFORM IX TERADA TA SAP DBA Which customer database do you want me to pull this from? We have 25. BUSINESS EXECUTIVE DATA ARCHITECT And, by the way, the databases all store customer information in a different format. “CUST_NM” on DB2, “cust_last_nm” on Oracle, etc. It’s a mess. ACCOUNTING A customer is someone who owns our product. HUMAN RESOURCES My customers are internal employees.
  37. 37. P A G E 3 7 The Challenge › You’ve been tasked to assist in the creation of a Data Warehouse › Trying to obtain a single view of ‘customer’ › Technical and political challenges exist » Numerous systems have been built already—different platforms and databases » Parties cannot agree on a single definition of what a ‘customer’ is Solution: Need to build a Conceptual Data Model
  38. 38. P A G E 3 8 Building a Conceptual Data Model › Our challenge is to achieve a ‘single version of the truth’ for Customer information › We have 6 different systems with customer information in them: » 2 on Oracle » 1 on DB2 » 1 using legacy IDMS » 1 SAP system » 1 using MS SQL Server DB2 Oracle Oracle IDMS SQL Server SAP
  39. 39. P A G E 3 9 Building a Conceptual Data Model › We start with a very simple CDM, with just one object on it, called “Customer”. › We use an data model and show business definitions Too Simple?? Customer A person or organization that has purchased at least one of our products and has an active account.
  40. 40. P A G E 4 0 Too simple? Our team thought so, so went ahead and focused on the technical integration, including: › Reverse engineering a physical model from each system › Profiling the data to ensure data type consistency › Creating ETL scripts › Migrating the data into the data warehouse › Building a reporting system off of the data
  41. 41. P A G E 4 1 Focusing on the Business › This implementation went “perfectly”, with no errors in the scripts, no data type inconsistencies, no delays in schedule, etc. › We built a complex BI reporting system to show our upper management the results. › We even sent out a welcome email to all of our customers, giving them a 50% off coupon, and thanking them for their support.
  42. 42. P A G E 4 2 Focusing on the Business Until we showed the report to the business sponsor: › We can’t have 2000 customers in this region! I know we only have around 400! › Why is Jones’ Tire on this list? They are still evaluating our product! Sales was negotiating a 10% discount with them, and you just sent them a 50% coupon!?!? › You just spent all of that money in IT to build this report with bad data???
  43. 43. P A G E 4 3 Back to the Drawing Board After doing an extensive review of the six source systems, and talking with the system owners we discovered that: › The DB2 system was actually used by Sales to track their prospective “customers” › These “customers” didn’t match our definition—they didn’t own a product of ours!! Customer A person or organization who does not currently own any of your products and who is potentially interested in purchasing one or more of our products. Customer A person or organization that has purchased at least one of our products and has an active account.
  44. 44. P A G E 4 4 Oops! We were mixing current customers, with prospects (non-customers). › We just sent a discount coupon to 1600 of the wrong people! › We gave upper management a report showing the wrong figure for our total number of customers! › We are now significantly over budget to have to go back and fix this!! We started over, this time with a Conceptual Data Model
  45. 45. P A G E 4 5 Achieving Consensus We created a report of the various definitions of customer… And verified with the various stakeholders that: › There were 2 (and only 2 definitions) of customer › Sales was OK with calling their “customer” a “prospect” Customer A person or organization who does not currently own any of your products and who is potentially interested in purchasing one or more of our products. Customer A person or organization that has purchased at least one of our products and has an active account.
  46. 46. P A G E 4 6 Resolving Differences Our new high-level data model looked like this:
  47. 47. P A G E 4 7 Identify Model Stakeholders Make sure ALL relevant parties are involved in the design process – Get buy-in!
  48. 48. P A G E 4 8 Identify Model Purpose › Key to success of any project is finding the right pain-point and solving it. › Make sure your model focuses on a particular pain point, i.e. migrating an application or understanding an area of the business Existing Proposed Business “Today, we have a poor customer list.” “By next quarter, we want to identify a quality list of potential new customer prospects in the Northeast region.” Application “The legacy Account Management system is managed in DB2.” “We need to migrate the legacy Account Management system to SQL Server, for finance reasons.”
  49. 49. P A G E 4 9 A CDM Facilitates Communication Focus on your (business) audience › Intuitive display › Capture the business rules and definitions in your model Simplicity does not mean lack of importance › A simple model can express important concepts › Ignoring the key business definitions can have negative affects A model or tool is only part of the solution › Communication is key › Process and Best Practices are critical to achieve consensus and buy-in A Conceptual Data Model Facilitates Communication between Business and IT
  50. 50. P A G E 5 0 Communication is the Main Goal of a Conceptual Data Model › Wouldn’t it be helpful if we did this in daily life, too? › i.e. “Let’s go on a family holiday!” PERSON CONCEPT DEFINITION Father Vacation An opportunity to take the time to achieve new goals Mother Vacation Time to relax and read a book Jane Vacation A chance to get outside and exercise Bobby Vacation Time to be with friends Donna Vacation More time to build data models
  51. 51. P A G E 5 1 Tactics for Improving Communication Start small › Don’t try & boil the ocean › Gain consensus – then drill into more detail & widen scope Re purpose the models and data › Mask detail › Avoid Data Modeling geek language Don’t become a methodology obsessive! Go out of your way to engage with stakeholders › Lunch & learn › Webinars
  52. 52. P A G E 5 2 The basics of a good workshop / interview BEFORE › Choose attendees carefully for a broad representation › Reserve interviews for senior/time-limited stakeholders › Do your homework! Research. Prepare questions DURING › Use the attendees’ own language, avoid jargon › Use open questions, listen, and playback understanding AFTERWARDS › Photograph whiteboard / flipchart output › Write up your notes ASAP, omit nothing yet TOP DOWN FACT FINDING
  53. 53. P A G E 5 3 Collect: Gather the Nouns Students enroll for a course by submitting an application via our web portal, providing their name, date of birth, email, selected course and card details. TopTrainingCorp arranges for distribution of the necessary payment to the relevant examination centre and certification body. All our courses are approved by the relevant certification body. Instructors deliver our courses over 3 days after which the student sits 2 examinations consisting of 40 multiple-choice questions. What the business says… What the analyst hears… Student blah blah course blah blah application blah blah name, email, selected course, card details. Blah blah payment blah blah examination centre blah blah blah certification body. blah blah blah course blah blah blah certification body. Instructor blah blah blah course blah blah blah student blah blah blah examination blah blah question.
  54. 54. P A G E 5 4 Workshop Activity INSURECo Description INSURECo assists Customers manage their risk. In exchange for a constant stream of Premiums, INSURECo pays Customers a sum of money upon the occurrence of a predetermined event, such as a natural catastrophe, a car crash, or a doctor's visit. INSURECo create value by pooling and redistributing various types of risk. It does this by collecting liabilities (i.e. premiums) from everyone that it insures and then paying them out to the few that actually need them. INSURECo can then effectively redistribute those liabilities to entities faced with some sort of event-driven crisis, where they will ostensibly need more cash than they currently have on hand. As not everyone within the pool will actually suffer an event requiring the total use of all of their premiums, this pooling and redistribution function lowers the total cost of risk management for everyone in the pool. INSURECo can generate profits in two ways: • By charging enough premiums to cover the expected payouts that they will have to cover over the life of the policy • By earning investment returns ("the float") using the collected premiums INSURECo’s earnings ratio leans heavily towards investment returns. A large percentage of premiums are paid out which assists in attracting larger customer volumes and liabilities. … KEY DATA THINGS? Jeff has been an INSURECo customer since starting his landscaping business in 2001. Insurance is extremely important Jeff. Jeff requires Commercial Liability coverage to protect himself in case of damage done to a client's property. Jeff also requires to have his equipment/tools insured. This includes a number of vehicles that will be driven for business on a commercial auto policy. As Jeff is responsible for a number of staff it was important for his commercial policy to include options for covering specific items such as worker's compensation. Jeff – Business Owner / Contract Landscaper I N S U R E C O C U S T O M E R P R O F I L E
  55. 55. P A G E 5 5 Workshop Activity Organising the Terms Break into small groups. Using the INSURECo company description, identify the “nouns” or “concepts” or “things” that are important to the business. Tip: It’s helpful to circle the key nouns in the text. Create a Post It Note “Data Thing” for each of the key terms that are important for the business. Time: 20 minutes THINGS (CONCEPTS) Customer Concept #2 Concept #3 Etc.
  56. 56. P A G E 5 6 Workshop Activity INSURECo Description… INSURECo assists Customers manage their risk. In exchange for a constant stream of Premiums, INSURECo pays Customers a sum of money upon the occurrence of a predetermined event, such as a natural catastrophe, a car crash, or a doctor's visit. INSURECo create value by pooling and redistributing various types of risk. It does this by collecting liabilities (i.e. premiums) from everyone that it insures and then paying them out to the few that actually need them. INSURECo can then effectively redistribute those liabilities to entities faced with some sort of event- driven crisis, where they will ostensibly need more cash than they currently have on hand. As not everyone within the pool will actually suffer an event requiring the total use of all of their premiums, this pooling and redistribution function lowers the total cost of risk management for everyone in the pool. INSURECo can generate profits in two ways: • By charging enough premiums to cover the expected payouts that they will have to cover over the life of the policy • By earning investment returns ("the float") using the collected premiums INSURECo’s earnings ratio leans heavily towards investment returns. A large percentage of premiums are paid out which assists in attracting larger customer volumes and liabilities. … KEY DATA THINGS?
  57. 57. P A G E 5 7 Break LET’S TAKE 15 MINUTES FOR A BREAK!
  58. 58. P A G E 5 8 Data Modelling & Information Management
  59. 59. P A G E 5 9 DAMA DMBOK Framework DATA ARCHITECTURE MANAGEMENT DATA DEVELOPMENT DATABASE OPERATIONS MANAGEMENT DATA SECURITY MANAGEMENT REFERENCE & MASTER DATA MANAGEMENT DATA QUALITY MANAGEMENT META DATA MANAGEMENT DOCUMENT & CONTENT MANAGEMENT DATA WAREHOUSE & BUSINESS INTELLIGENCE MANAGEMENT DATA GOVERNANCE Knowledge Areas
  60. 60. P A G E 6 0 DATA ARCHITECTURE MANAGEMENT DATA DEVELOPMENT DATABASE OPERATIONS MANAGEMENT DATA SECURITY MANAGEMENT REFERENCE & MASTER DATA MANAGEMENT DATA QUALITY MANAGEMENT META DATA MANAGEMENT DOCUMENT & CONTENT MANAGEMENT DATA WAREHOUSE & BUSINESS INTELLIGENCE MANAGEMENT DATA GOVERNANCE › Enterprise Data Modelling › Value Chain Analysis › Related Data Architecture › External Codes › Internal Codes › Customer Data › Product Data › Dimension Management › Acquisition › Recovery › Tuning › Retention › Purging › Standards › Classifications › Administration › Authentication › Auditing › Analysis › Data modelling › Database Design › Implementation › Strategy › Organisation & Roles › Policies & Standards › Issues › Valuation › Architecture › Implementation › Training & Support › Monitoring & Tuning › Acquisition & Storage › Backup & Recovery › Content Management › Retrieval › Retention › Architecture › Integration › Control › Delivery › Specification › Analysis › Measurement › Improvement Where does Data Modelling fit?
  61. 61. P A G E 6 1 Data Modelling for Data Warehousing (DW) and Business Intelligence (BI) DATA WAREHOUSE BI REPORT: CUSTOMERS BY REGION DATA WAREHOUSING BI REPORTING › What is the definition of customer? › Where is the data stored? › How is it structured? › Who uses or owns the data? Show me all customers by region › What do I want to report on? › How do I optimize the database for these reports? A Data Model is the “Intelligence behind Business Intelligence” – Understand source and target data systems – Define business rules – Optimize data structures to align queries with reports Data Warehouse
  62. 62. P A G E 6 2 Scenario: Who is our “Customer”? Has this ever happened to you? › You’ve had a particular credit card for 10 years, and you regularly pay your full balance every month. › In addition to your monthly bill, you also receive a separate letter asking you to “Sign up for our credit card! Great Interest Rate!” Why does this happen? Don’t they know that you already own a card? How could a data model help?
  63. 63. P A G E 6 3 Who is our “Customer”? Look familiar?
  64. 64. P A G E 6 4 Data Modelling for Application Development › The majority of today’s applications are data-driven › A data model is a key part of the application development lifecycle › Reuse of common data objects helps promote › Increased efficiency – don’t “reinvent the wheel” › Better collaboration › Increased quality and consistency
  65. 65. P A G E 6 5 Scenario: Expense Tracking Application “Bug” Acme corporation has an Expense Billing application for its employees. › For each business trip, employees were asked to specify which department should be billed. › This caused a problem for employees whose job role spanned several departments. A product manager, for example, needed to bill Development for business trips to manage product releases, but needed to bill Sales for business trips to support customers. › With the existing database structure, this was impossible. › When management asked for an estimate of what it would cost to correct this, the expense was in the thousands of euros. How could a data model help?
  66. 66. P A G E 6 6 Expense Tracking Application “Bug” › When the original database system was built, the developer assumed that each employee works for a single department. › He structured the system in such a way that only a single department name could be associated with each employee. › This could have been fixed with a simple data model design.
  67. 67. P A G E 6 7 Expense Tracking Application “Bug” Each employee can bill expenses to multiple departments (Better implemented as: )
  68. 68. P A G E 6 8 Data Modelling for Metadata Management Metadata is “Information in Context” A Data Model helps Provide this Context Customer Name Company City Year Purchased Joe Smith Komputers R Us New York 1970 Mary Jones Big Bank Co London 1999 Proful Bishwal Little Bank Inc Mumbai 1998 Ming Lee My Favorite Store Beijing 2001 METADATA DATA Who is using the data? Who owns the data? Where is the data stored? When was it last updated? What are the business definitions? Why are we tracking the data? What is the structure and format of the data? METADATA
  69. 69. P A G E 6 9 Enterprise Architecture provides a high-level view of the people, processes, applications, and data of an organization Putting data in business context: › How does data link to the rest of my organization? › If I change data, what business processes are affected? Data Modelling for Enterprise Architecture
  70. 70. P A G E 7 0 Data Modelling for Database Management › Know what data you have: Create a visual inventory of database systems › Know what your data means: Communicate key business requirements between business and IT stakeholders › Support data consistency: Build consistent database structures with common definitions SYBASEORACLE DB2 TERA- DATA SQL SERVER MySQL SQL SERVER MySQL SYBASE DB2 ORACLE TERA- DATA
  71. 71. P A G E 7 1 Data Modelling for Master Data Management Master Data Management strives to create a “single version of the truth” for key business data: customer, product, etc. Using a central data model helps define: › Common business definitions › Common data structures › Data lineage between defined “version of the truth” and real-world implementations
  72. 72. P A G E 7 2 MDM: Plan Big – Implement Small Business Initiative / Project 1 Some of MD area A needed here Some of MD area B needed here Business Initiative / Project 2 More of MD area A needed here Some of MD area C needed here More of MD area B needed here Business Initiative / Project 3 More of MD area C needed here More of MD area B needed here More of MD area A needed here Business Initiative / Project 4 Lots of MD area D needed here More of MD area B needed here More of MD area A needed here Project4 later includes MD for area D MDM program cannot deliver Data Subject Area D at this time for Project 4. Project 4 gains exemption to add this MD later IN THE CONTEXT OF THE BIG PICTURE VIA THE CONCEPTUAL DATA MODEL IMPLEMENT MDM IN ALIGNMENT WITH BUSINESS INITIATIVES
  73. 73. P A G E 7 3 Data Modelling for Data Governance › Data, like money, is a corporate asset, and needs to be managed accordingly. › Like an auditing department for finance, data governance provides the guidelines, accountability and regulations around data management. › A Data Model can help define: » Who is accountable for data (e.g. Data Steward)? » Who is using data? » What is the lineage and traceability of data? » What is the proper definition of key business information? » When was the data last updated?
  74. 74. P A G E 7 4 Data Modelling for Data Lineage: e.g. SOX Financial reporting & auditing requirements Near real time reporting Accuracy of Financial statements Effectiveness of internal controls INFORMATION MANAGEMENT ESSENTIAL DATA LINEAGE IMPLICATIONS DATA GOVERNANCE IMPLICATIONS DATA QUALITY & DEFINITIONS IMPLICATIONS
  75. 75. P A G E 7 5 Data Modelling for Package / ERP systems Data Requirements For Configuration & Fit For Purpose Evaluation Data Integration & Governance Legacy Data Take On Master Data Integration DATA MODEL
  76. 76. P A G E 7 6 Data Modelling For Packages / ERP Systems Data lineage (particularly important with Data Lineage & SOX compliance issues) Master Data alignment For Data migration / take on Identifying gaps For requirements gathering ... But what if we’ve got to use package X? CUSTOMER ORDER CUSTOMER ORDER VS
  77. 77. P A G E 7 7 Data Modelling for Data Virtualisation Virtual Operational Data Stores Shareable Data Services D A T A MOD E L S QL W E B S E R VIC ES S T A R Virtual Data Marts Relational Views LEGACY MAINFRAMES FILES RDBMS WEB SERVICES PACKAGES BI, MI AND REPORTING CUSTOM APPS PORTALS & DASHBOARDS ENTERPRISE SEARCH
  78. 78. P A G E 7 8 Data Modelling Spans All Disciplines Data is at the heart of ALL architecture disciplines Data has to be understood to be managed Different levels of models for different purposes It’s NOT just for DBMS design Data models are not (just) art Professional development: certification & training All of the Architecture disciplines use the language (and rules) of the data model
  79. 79. P A G E 7 9 Aligning with Business Motivation
  80. 80. P A G E 8 0 Validate and Refine Business Goals › A Goal is a statement about a state or condition of the enterprise to be brought about or sustained through appropriate Means. › A Goal amplifies a Vision — that is, it indicates what must be satisfied on a continuing basis to effectively attain the Vision. › A Goal should be narrow — focused enough that it can be quantified by Objectives. › A Vision, in contrast, is too broad or grand for it to be specifically measured directly by Objectives. However, determining whether a statement is a Vision or a Goal is often impossible without in depth knowledge of the context and intent of the business planners. In light of the mission and vision and the influencer pressures, validate and refine the goals of the organisation
  81. 81. P A G E 8 1 Look for Levers Derive a set of measurable levers of business value and growth by cascading down the drivers of income in your business. › The levers are intended to be durable even as business strategy shifts. Value levers indicate which business dimensions need to be analysed for change projects. › Business consultants use the matrix to understand which business architecture dimensions have the greatest impact on each lever, focusing attention on those dimensions most relevant to the levers in focus. Look for levers that can help you address the goals
  82. 82. P A G E 8 2 Improvement Levers Example Increase price Increase volume Improve mix Improve process Reduce cost of inputs Improve warehouse utilisation Increase productivity Decrease staffing Optimize scheduling Optimize physical network Decrease staffing Use alternative distribution Lower Customer Service & Order Management Costs Lower I/S costs Lower Finance / Accounting costs Lower HR costs Improve capital planning/ investment process Reduce inventories Reduce A/R increase A/P o Profit-driven marketing efforts: • Target “best” customers • Offer “best” product mix • Improve pricing management • Proactive production planning for inventory management • Most profitable capacity allocation/utilization o Reduced sales management layers o Focus on high-profit accounts o Improved inventory flow visibility • Lower transportation costs • Higher facilities utilization • Less “fire fighting” o Better carrier evaluation/mgmt. o Higher quality Customer Service o Improved Supply Chain visibility • Improved order fill rates • Significantly lower cost • More consistent service • Faster problem resolution o Improved capital stewardship • Increased capital productivity • Reduced inventory investment • Reduced receivables investment o Automated PO requisitions o Improved information for evaluating vendors o Automation of some scheduling functions o Single point of entry eliminates data re-entry and improves accuracy o Faster data reconciliation o Automated billing processes o Automated payroll processes o Moderately lower safety stock inventory o Moderately improved A/R and A/P management Increase revenues Decrease costs Reduce selling costs Reduce distribution costs Reduce administrative costs Increase gross profit Decrease operating expenses Capital deployment Cost of capital Increase net operating profit after tax (NOPAT) (I/S) Improve capital allocation (B/S) Enterprise Value Map VALUE LEVERS TRANSFORMATION BENEFIT (Outcome) AUTOMATION BENEFIT
  83. 83. P A G E 8 3 Goals, drivers and levers are tightly integrated The value map will help identify enterprise aligned business unit drivers and leverage points Examples • Revenue enhancement • Margin enhancement • Operating efficiencies • Working capital management • Investment capital productivity • Capital structure optimization CORPORATE LEVEL-SPECIFIC Examples • Pricing strategy • Product assortment • Departmental emphasis • Product cost • Store operating costs • Corporate administrative costs • Operating cash reserves • Inventory management • Accounts payable management • Store base • Leases • Intangible assets • Distribution assets BUSINESS UNIT-SPECIFIC Examples • Purchase frequency • Household penetration • Transaction size • Gross margin • Store operating expense % • SG&A expense % • Distribution cost per case • Days cash on hand • Inventory turns-store and DC • Account payable cycle • Store-level profitability • Debt/Equity ratio • Annual capital investment OPERATING VALUE DRIVERS Earning loyalty & trust with customers & community Make our Processes simpler and faster Empower the frontline to Deliver Integrated Financial Solutions Deliver new and relevant products Create a performance culture REVENUE COSTS WORKING CAPITAL FIXED CAPITAL NOPLAT INVESTED CAPITAL EVM Goals and Drivers help define value drivers and improvement levers
  84. 84. P A G E 8 4 › Your team has been brought on board to help address some of the information related stakeholder concerns › INSURECo’s CIO intuitively understands some of these concerns are due to poor management of data and information across the organisation › The CIO has engaged your team to assist him to understand how he can best get control of the situation and, as a result, elevate a number of stakeholder concerns › The CIO is specifically interested in understanding what information management practices need to put in place to avoid finding himself in this position again in the future Workshop Activity Problem Briefing
  85. 85. P A G E 8 5 Example Decomposition What are our corporate goals? What are the priorities and battlegrounds given our corporate goals? What IT assets and data do we need to support these capabilities? How will our business model change over the next three to five years? What are the key capabilities that will maximize value creation in the business? How do we optimize our IT operating model to deliver the required business capabilities? Earn all Our customers’ business Drive a strong Customer culture Enhanced branch Capability and Power in frontline Transform Service delivery And processes Customer centricity Front office empowerment Channel and Product operational excellence Customer Profile management Relationship management Customer analytics Offer design Product management Integrated Data store Enterprise Service bus Channel platforms Product platforms Security platforms End-user computing Customer analytics engine Universal customer master Integrated sales & Service front end Internet platform transformation Core banking transformation Vision Strategic agenda Business objectives Business capabilities IT capabilities IT investments
  86. 86. P A G E 8 6 Problem Briefing “Workflow is inadequate for Product Development, Underwriting and Legal & Compliance” “First contact resolution is an important goal for Enquiries and Complaints” “Rolling out new products to customers is expensive and takes a long time. Product Releases should be a BAU activity ” “No correlation between claims and reinsurance systems” “Management Reporting is complex and uses a lot of disparate systems and spreadsheets to produce” “We have key person dependencies in a number of teams” “Errors are only picked up at policy anniversary” “Knowledge, Document and Content Management is inadequate” “50% of errors are associated with either missing a process step or sending paperwork to the wrong place” Are the strategic programs aligned, or for that matter, are they the right strategic programs? Are we investing in the right areas across the enterprise? Is my investment portfolio balanced across my strategic and tactical issues Where can we take advantage of synergies across the major strategic programs? There is a lot of activity going on out there, how do I know we are doing the right things? INSURECo Stakeholder Concerns
  87. 87. P A G E 8 7 Workshop Activity Describe the problem space using Text-based & Visual models › Think about “a day in the life” of INSURECo with specific focus on the environment and problem space. List these as text › Flip over the card and now draw a picture of the problem › Post the picture on the wall and explain it to the class. Take note of common elements › Time: 20 minutes › After descriptions, the group must reflect on the similarities and differences and work towards a shared understanding of the problem space › Document this shared understanding on the flip chart and post on the wall INDEX CARD – FRONT AND BACK
  88. 88. P A G E 8 8 Data Modelling Basics
  89. 89. P A G E 8 9 Data Modelling does not have to be Complicated! If you can write a sentence, you can build a data model. If you understand how your business works, you can build a data model. Businesspeople should be involved in the development of data models, because only they understand the business needs and rules. Understanding data modelling basics will help the Business better communicate with IT
  90. 90. P A G E 9 0 What is an Entity? Entity: A classification of the types of objects found in the real world -- persons, places, things, concepts and events – of interest to the enterprise. 1 1 DAMA Dictionary of Data Management WHO? WHERE? WHEN? HOW?WHY? WHAT? The “Who, What, Where, When, Why” of the Organization
  91. 91. P A G E 9 1 Entities are the “Nouns” of the Organization › Who? Employee, Customer, Student, Vendor › What? Product, Service, Raw Material, Course › Where? Location, Address, Country › When? Fiscal Period, Year, Time, Semester › Why? Transaction, Inquiry, Order, Claim, Credit, Debit › How? Invoice, Contract, Agreement, Document
  92. 92. P A G E 9 2 Sample Entities Product Customer Location Order Raw Material Ingredient Region
  93. 93. P A G E 9 3 Baker Cakes, Inc. › Baker Cakes is a family-run business whose main stakeholder is Bob Baker, the owner/operator of Baker Cakes. Bob is in charge of making most decisions, from database design to icing color selection. › In our example, we’re building a new application for Baker Cakes, who is looking to build a new application to manage their data.
  94. 94. P A G E 9 4 What Entities are needed for this Cake Company? Exercise: List some entities that would be needed for Baker Cakes, Inc. to manage their corporate data — Customer — Product — Invoice — Ingredient — Raw Material — Flavor
  95. 95. P A G E 9 5 Attributes An Attribute is a piece of information about or a characteristic of an Entity. Attributes Entity Employee • Employee Identifier • Employee Last Name • Employee First Name • Employee Hire Date • Employee Signed Employment Contract • Employee Drivers License Photo Entity Attributes
  96. 96. P A G E 9 6 Relationships are the “Verbs” of the Organization Relationships define the data-centric Business Rules of an organization › An employee can work for more than one department. › A customer can have more than one account. › Sales are reported monthly. › A department can contain more than one employee. — Relationships are the “verbs” in a sentence • A department can contain more than one employee. Defining Business Rules — Relationships are the “lines” on a data model Contain Department Employee
  97. 97. P A G E 9 7 Contain Department Employee Cardinality = “How Many?” ‘Cardinality’ is the term used to describe the numeric relationships between the entities on either end of the relationship line. A department can contain more than one employee.
  98. 98. P A G E 9 8 Deciphering Cardinality Think of how a child might answer the question “How many?” › One = 1 finger › More than one = several fingers Contain Department Employee
  99. 99. P A G E 9 9 Optionality = Required ‘Optionality’ describes what may or must happen in a relationship. A department may contain more than one employee. i.e. “zero, one, or many” A department must contain more than one employee. i.e. “one, or many” i.e. “Must” or “May” Contain Department Employee Contain Department Employee
  100. 100. P A G E 1 0 0 Data Modeling is Similar to Diagramming a Sentence 1. Place boxes around the “nouns”, or entities. 2. Underline the verb 3. Circle the “how many” qualifier. 4. Look for optionality words (e.g. “may/must”) A department may contain more than one employee. Contain Department Employee
  101. 101. P A G E 1 0 1 Exercise: Practice “Reading” the following Relationships/Business Rules › Each Employee may process one or many Transactions. › Each Transaction must be processed by one Employee. › Each Customer may place one or many Orders. › Each Order must be placed by one Customer. Process Employee Transaction Place Customer Order
  102. 102. P A G E 1 0 2 Exercise: Create a Data Model from a Business Rule Create a Data Model for the following business rule: A customer may own more than one account. Thought for this exercise: Can an account have more than one customer associated with it? If so, how would you show this?
  103. 103. P A G E 1 0 3 Exercise: Create a Data Model from a Business Rule Create a Data Model for the following business rule: A customer may own more than one account. Own Customer Account
  104. 104. P A G E 1 0 4 What are Keys? A Key is an attribute or group of attributes that uniquely identifies an Entity. For example, — A Customer is identified by a Customer ID — A Student is identified by Last Name, First Name, and Date of Birth
  105. 105. P A G E 1 0 5 Candidate Keys › What attributes might uniquely identify an entity? Let’s use Customer as an example. › What might uniquely identify an individual customer? Is Last Name + First Name enough? — Could there be 2 customers named John Smith?  Probably Is Last Name + First Name + Date of Birth enough? — Could there be 2 customers named John Smith born on 1 June, 1963?  Less Likely, but Possible Is Last Name + First Name + Date of Birth + Address enough? — Could there be 2 customers named John Smith born on 1 June, 1963 living at 1 Earl’s Court, London, UK?  Even Less Likely. Possible, but how many attributes do we want to use?
  106. 106. P A G E 1 0 6 Candidate Keys: Natural vs. Surrogate › The keys we just identified would be classified as natural keys. › Natural keys are based on business rules and logic that determine how an individual instance can be uniquely identified. › As we’ve seen, natural keys can become unwieldy, requiring a number of attributes, which makes queries difficult. — Surrogate keys are often used instead, which are system-generated unique identifiers. e.g. Customer ID, Product ID, etc. — While surrogate keys are more efficient, important business rules are lost when they are used. It’s a balancing act.
  107. 107. P A G E 1 0 7 Alternate Keys Alternate Keys can be defined to keep the business rules of a natural key, while still achieving the efficiency of a surrogate key. In the example below: › Student Number is the primary key (surrogate key) › Student Last Name + Student First Name + Student Date Of Birth is the natural key Alternate keys can be used to retrieve information more quickly in a database (e.g. index by last name + first name + date of birth). Student Student Number Student First Name (AK1.2) Student Last Name (AK1.1) Student Date of Birth (AK1.3)
  108. 108. P A G E 1 0 8 Primary Keys A Primary Key is an attribute or group of attributes that uniquely identifies an Entity. A primary key is shown “above the line” in a data model. Primary Key Customer Customer ID Customer First Name Customer Last Name Customer Date of Birth
  109. 109. P A G E 1 0 9 Supertypes and Subtypes Supertypes and Subtypes are a mechanism for grouping objects with similar (but not identical) characteristics. For example: Individual Customer ID (FK) Customer First Name Customer Last Name Customer Gender Customer Date of Birth Corporation Customer ID (FK) Company Name Corporate Logo Customer Customer ID First Date of Purchase Common Attributes Supertype Subtype Unique Attributes Unique Attributes Subtype
  110. 110. P A G E 1 1 0 Dimensional Data Modelling
  111. 111. P A G E 1 1 1 Dimensional Data Modelling › Relational data modeling describes business rules around transactional systems. › Dimensional data modeling describes structures for aggregation and reporting, typically for business intelligence.
  112. 112. P A G E 1 1 2 Data Warehousing and BI › Typically a Data Warehouse is built for a Business Intelligence (BI) project › Need to transform and summarize the relational tables from operational systems, to a single warehouse summarized for reporting. › “ETL” is the process of Extracting, Transforming, and Loading the Warehouse. Data Warehouse ETL RELATIONAL DIMENSIONAL ERP System Sales DB Sales DB (2) Sales DB (3) Accounting DB
  113. 113. P A G E 1 1 3 Dimensional Data Modeling For BI, an easy way to think of a dimensional model is to ask yourself: “What do I want to report by?” (Apologies to grammarians!), e.g. › by month › by region › by quarter › by product The lines on a dimensional data model represent navigation paths, not business rules.
  114. 114. P A G E 1 1 4 Sample Dimensional Data Model “Star Schema” In this case, we want to report on Sales by Product, by Month, by Sales Rep, and by Region. FACT TABLE DIMENSION DIMENSION DIMENSION DIMENSION
  115. 115. P A G E 1 1 5 Dimensional Data Modeling FACTS Contain the actual values to be reported upon. e.g. Sales Figures › Few attributes (with links/keys to the dimensions) › Many values DIMENSIONS Contain the details that describe the central fact. › Many attributes › Few values
  116. 116. P A G E 1 1 6 Workshop: Employee Salary Reporting › You work in Human Resources and your company is implementing a new employee salary and benefit reporting system. › IT asks you for your requirements. You can impress them by sending them a well-formed dimensional model. › Your report needs to show your company’s total salary spending by region. You also want to know which managers are paying their employees the most. And you’ll want to see which roles in the organization make the most money. › Create a conceptual data model for this reporting system. Don’t worry about entering definitions for the concepts at this point—just show the concepts that you need to report on and report by. (Hint: this will be a three-sided star).
  117. 117. P A G E 1 1 7 Example: Employee Salary Reporting Your conceptual data model for the new reporting system should look something like this: Salary Manager Region Role
  118. 118. P A G E 1 1 8 SUMMARY › A data model can help increase Communication with the business owners and IT staff to and achieve better results › Starting with a robust data model helps alleviate many of the data quality and data-related errors commonly seen in business today. › Aligning with Business Motivation and Drivers is key to success of any initiative › Data Modeling does not have to be complicated and can be understood by most businesspeople › Data Modeling is at the core of most Information Management challenges

    Be the first to comment

    Login to see the comments

  • jumayjuma

    Dec. 1, 2016
  • williammdavis

    Dec. 3, 2016
  • ABHIPSASARKAR

    May. 11, 2017
  • ozkaya

    May. 13, 2017
  • JaneRobertsCDMP

    Oct. 31, 2017
  • MopMex

    Oct. 31, 2017
  • designerDATA

    Jan. 11, 2018
  • hansmel

    Feb. 13, 2018
  • RockYou1

    May. 6, 2018
  • MuhammadKhan650

    Jul. 16, 2018
  • TREASARS

    Aug. 15, 2019
  • NilaHines

    Sep. 4, 2019
  • IaoAndy

    Oct. 2, 2019
  • djibrilndiaye

    Oct. 26, 2019
  • KhanhNguyen122

    Nov. 26, 2019
  • IyyappanDevadoss

    Dec. 17, 2019
  • mraveendranathan

    Apr. 6, 2020
  • AndriusUrbonavicius

    May. 29, 2020
  • manpritsingh77

    Jul. 12, 2020
  • bv2600

    Apr. 7, 2021

Data modelling for the business half day workshop presented at the Enterprise Data & Business Intelligence conference in London on November 3rd 2014 chris.bradley@dmadvisors.co.uk

Views

Total views

20,281

On Slideshare

0

From embeds

0

Number of embeds

184

Actions

Downloads

1,463

Shares

0

Comments

0

Likes

57

×