How to read a data model
 


This is a presentation on how to read a data model and understand the data and business rules it contains. It is intended for non-technical people.


    How to read a data model: Presentation Transcript

    • How to read a data model? (slide 1)
    • By: Sanjay Sharma, Consulting Enterprise and Data Architect. E-mail: data_arch@yahoo.com (slide 2)
    • Goal (slide 3)
      • To develop basic literacy about data models.
        – To understand what a data model contains.
        – To understand how the information in it can be used more effectively.
      • We will not touch upon the technicalities of developing a data model.
    • Session structure (slide 4)
      • What a data model is, why it is needed, and its context.
      • Different types of data models.
      • Semantics of data models.
      • How to read a data model.
      • How to use data models more effectively.
      • Questions and answers.
    • Why model? (slide 5)
      • John Boyd (1927-1997), military strategist and thinker.
      • Described as the most original military thinker since Sun Tzu (600 BC).
      • OODA (Observe, Orient, Decide, Act) loop: every organization or organism uses an OODA loop to adapt to its surroundings and survive.
    • Why model? (slide 6)
      • Observation is information gathering.
      • Orientation is developing a mental framework of the information by understanding its structure and relationships.
      • Models are observation as well as orientation tools that use symbols for real-world facts.
      • Models are effective because the human mind absorbs more information visually than textually.
      • Models in business and IT: Enterprise Models, Business Process Models, Workflow Models, Interaction Models, Network Models, etc.
    • Why model data? (slide 7)
      • Data is a distinct component of an information system; the other component is application logic.
      • It needs to be described so that it is clearly and precisely communicated to all stakeholders: information analysts, application developers, data analysts, database administrators, etc.
      • Every data element must have a defined business purpose.
      • A data model is an unambiguous and precise description of data, its structure and its relationships, agreed upon by all stakeholders.
    • What is a data model? (slide 8)
      • It is a paper sheet with coloured rectangles and a tangled web of crow's-feet lines joining them...
      • For a given information system, it is a graphical representation of data elements, their relationships and the constraints governing the data.
      • [Diagram: example ER diagram showing the entities InReportOther, InReportHospital, HospitalConsultationReport, HospitalOtherReport and HospitalImagingReport with typed attributes such as inReportID: int NOT NULL (FK)]
    • What is the context? (slide 9)
    • Types of data models (slide 10)
      • Data can be described from different perspectives: Object-Role Models, Entity-Relationship Diagrams (ERDs), Data Flow Diagrams (DFDs), UML Class Diagrams, etc.
      • Entity-Relationship (ER) diagrams are the most popular for data modeling because they can easily be converted into relational database designs.
      • [Diagram: ER diagram with the entities InReportHospital, HospitalConsultationReport, HospitalOtherReport, HospitalImagingReport, InReportPsychTest, InReportAmb, NeuroPsychTest and GlasgowComaScale, each carrying an inReportID foreign key]
    • Types of ERD: domain model (slide 11)
      • Domain Model (Subject Area Model): a very high-level (10,000 feet) conceptual model showing the major entities and their relationships in a business or problem domain.
      • Only entities are shown.
    • Scope of domain models (slide 12)
      • Business Domain Models or Business Subject Area Models: very high level, covering the entire business.
      • Application Domain Models or Application Subject Area Models: covering an application or package.
    • Types of ERD: logical models (slide 13)
      • Logical Models: show entities and their logical relationships for a given information system.
      • [Diagram: logical model with the entities CLAIM FILE, TOTAL LOSS REQUEST RECORD, CLAIM FILE ESTIMATE GROUP, ESTIMATE DAIS CHUNKS, VEHICLE REPAIR LOG and ESTIMATE PRINT IMAGE LINE, with business-language attribute names such as Claim File Id, ICBC Claim Number and Vehicle Repair Log Car In Date]
    • Types of ERD: physical models (slide 14)
      • Physical Models: show the physical implementation of the logical model at the data storage level.
      • Contain columns for implementing relationships and fast data access.
      • Most tools can create schema scripts from physical models.
      • [Diagram: physical model with the tables CLAIM_FILE, AUTOSOURCE_REQUEST, VEHICLE_REPAIR_LOG and AS_REQ_CONDITION, with typed columns such as CLF_ID: DECIMAL(15,0) NOT NULL and VRL_CLF_ID: DECIMAL(15,0) NOT NULL (FK)]
    • Semantics of data models (slide 15)
      • Data models use graphical notations and text strings called ‘verb phrases’.
      • The semantics of the notations depend on the modeling technique followed and the tool being used.
    • Entities (slide 16)
      • A thing of significance to the business for which data has to be stored and manipulated.
      • Nouns representing objects, events, concepts, relationships, actions...
      • Represented as rectangles in data models.
      • Examples: Insurance Policy, Claim, Vehicle, Event, etc.
    • Entity sub-types (slide 17)
      • Some entities have many sub-types.
      • PERSON and ORGANIZATION are sub-types of the PARTY entity.
      • FULL TIME EMPLOYEE and CONTRACT EMPLOYEE are sub-types of the EMPLOYEE entity.
      • They are depicted either as contained in the main entity or as children of the main entity.
      • [Diagram: Party with sub-types Person and Organization; Employee with sub-types Full Time and Contract]
    • Attributes (slide 18)
      • The properties of entities for which data has to be collected and stored.
      • Attributes are represented as text strings inside the entities in data models.
      • Examples: policy holder's name, event date, claim amount, etc.
    • Relationships (slide 19)
      • Relationships represent how entities interact and create, use, modify or delete each other.
      • They are represented by different types of lines going from one entity to another.
      • [Diagram: examples of solid and dashed relationship lines]
    • Cardinality of relationship (slide 20)
      • The cardinality of a relationship is the number of instances of the entities at the two ends of the relationship.
      • It is represented by three domain values: Zero, One or Many.
      • It may be shown as a circle, a vertical line and a crow's foot at the end of relationship lines, or by some other symbol.
      • Sometimes it is represented as ‘0’, ‘1’ or ‘n’ on relationship lines.
      • [Diagram: Policy (1) related to Claim (0..n); Product related to Line Item]
    • Optionality of relationships (slide 21)
      • Optionality means whether the entity ‘may be present’ or ‘must be present’ in the relationship.
      • It may be represented by a solid or a broken part of the relationship line (or in some other way).
      • [Diagram: Policy joined to Claim by a line that is part solid, part broken]
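Cardinality and optionality together set a minimum and a maximum on how many related instances are allowed. The following is a minimal Python sketch of that idea, not taken from the slides: the function names `allowed_range` and `count_is_valid` are hypothetical, and "many" is assumed to mean "no upper limit".

```python
from typing import Optional, Tuple


def allowed_range(mandatory: bool, many: bool) -> Tuple[int, Optional[int]]:
    """Translate optionality and cardinality into (minimum, maximum) counts.

    A maximum of None stands for 'many' (assumed here to mean no upper limit).
    """
    minimum = 1 if mandatory else 0   # 'must be' -> at least one, 'may be' -> zero allowed
    maximum = None if many else 1     # 'many' -> unbounded, otherwise at most one
    return minimum, maximum


def count_is_valid(count: int, mandatory: bool, many: bool) -> bool:
    """Check an observed number of related instances against the rule."""
    minimum, maximum = allowed_range(mandatory, many)
    if count < minimum:
        return False
    return maximum is None or count <= maximum


# A POLICY may have zero, one or many CLAIMs (optional, many).
print(count_is_valid(0, mandatory=False, many=True))   # True
# A CLAIM must belong to only one POLICY (mandatory, one).
print(count_is_valid(2, mandatory=True, many=False))   # False
```

Reading a relationship end then comes down to two questions: is zero allowed, and is more than one allowed.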
    • Self-referencing relationships (slide 22)
    • Verb phrases (slide 23)
      • Verb phrases describe the relationship between two entities, read from one entity to the other in both directions.
      • [Diagram: Organization ‘employs’ Employee and Employee ‘works for’ Organization; Policy Holder ‘makes’ Claim and Claim is ‘paid to’ Policy Holder]
    • Keys (slide 24)
      • Keys are for navigating through data: information retrieval.
      • Primary keys: a primary key is a group of attributes that uniquely identifies an entity instance. Every entity has exactly one primary key.
      • Foreign keys: attributes used to navigate from one entity to another. FK attributes implement relationships; they are copies of the parent entity's primary key held in the child entity.
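To see keys at work, here is a small sketch using Python's built-in `sqlite3` module. The `policy` and `claim` tables and their columns are hypothetical, loosely based on the Policy and Claim examples in the slides. The primary key rejects a duplicate policy, and the foreign key column in `claim` rejects a claim that points at a policy that does not exist.

```python
import sqlite3


def build_policy_db() -> sqlite3.Connection:
    """Create an in-memory database with a parent (policy) and a child (claim)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.execute(
        "CREATE TABLE policy ("
        " policy_id INTEGER PRIMARY KEY,"      # primary key: identifies one policy
        " holder_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE claim ("
        " claim_id INTEGER PRIMARY KEY,"
        " policy_id INTEGER NOT NULL REFERENCES policy(policy_id),"  # foreign key
        " claim_amount REAL)"
    )
    return conn


def insert_ok(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row was accepted, False if a key rule rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_policy_db()
print(insert_ok(conn, "INSERT INTO policy VALUES (?, ?)", (1, "A. Holder")))  # accepted
print(insert_ok(conn, "INSERT INTO claim VALUES (?, ?, ?)", (10, 1, 500.0)))  # accepted
print(insert_ok(conn, "INSERT INTO claim VALUES (?, ?, ?)", (11, 99, 75.0)))  # no policy 99
print(insert_ok(conn, "INSERT INTO policy VALUES (?, ?)", (1, "B. Holder")))  # duplicate key
```

Following the `policy_id` value in a claim row back to the `policy` table is the "navigation" that the slide refers to.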
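    • Relationships: identifying vs. non-identifying (slide 25)
      • In an identifying relationship, the parent entity is needed to identify the child entity.
      • In a non-identifying relationship, the child entity can be identified without its parent.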
    • Domains (slide 26)
      • A named set of data values, all of the same data type, from which the actual value for an attribute instance is drawn.
      • Every attribute must be defined on exactly one underlying domain. Multiple attributes may be based on the same underlying domain.
      • Examples of domains:
        – Gender: M, F
        – Province: Varchar(2): BC, AB, ON, NF, QC, MB, SK, YT
        – Short Description: Varchar(40)
        – Long Description: Varchar(2000)
        – Unique Identifier: Integer(9)
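A domain can be thought of as a named rule that every value of an attribute must pass. The sketch below is one hypothetical way to express that in Python; the `Domain` class and the attribute names `product_name` and `service_name` are illustrative assumptions, while the Gender and Short Description rules follow the examples on the slide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Domain:
    """A named set of allowed values, described by a membership test."""
    name: str
    is_valid: Callable[[object], bool]

    def check(self, value):
        """Return the value unchanged if it belongs to the domain, else raise."""
        if not self.is_valid(value):
            raise ValueError(f"{value!r} is not in domain {self.name}")
        return value


# Domains taken from the slide's examples.
GENDER = Domain("Gender", lambda v: v in {"M", "F"})
SHORT_DESCRIPTION = Domain(
    "Short Description", lambda v: isinstance(v, str) and len(v) <= 40
)

# Two different attributes based on the same underlying domain.
product_name = SHORT_DESCRIPTION.check("Winter tyre")
service_name = SHORT_DESCRIPTION.check("Tyre fitting")

try:
    GENDER.check("X")
except ValueError as err:
    print(err)
```

Defining the rule once and reusing it for every attribute on that domain is what keeps the same kind of data consistent across a model.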
    • Cost of wrong domains (slide 27)
      • NASA's Mars Climate Orbiter, launched in 1998, was lost on arrival at Mars in 1999. One piece of software supplied thruster impulse data in US customary units (pound-force seconds) while the software consuming it expected SI units (newton seconds). Total cost: $327.6 million.
      • The first European Ariane 5 expendable launch vehicle blew up about 37 seconds after launch in 1996. A wrong use of domains (integer vs. float) caused an integer overflow. Total cost quoted: $8 billion, a figure that matches the scale of the whole Ariane 5 development programme rather than the direct loss of the rocket and its payload.
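Both failures can be illustrated with a short Python sketch. It is not the flight software: the class and function names are hypothetical. It only shows how giving each unit its own type, and checking a number's range before narrowing it, turns a silent domain mismatch into an immediate error. The conversion factor is the standard value of one pound-force in newtons, and the 16-bit range reflects the common description of the Ariane failure as a floating-point value that did not fit a 16-bit signed integer.

```python
from dataclasses import dataclass

LBF_TO_NEWTON = 4.4482216152605  # newtons in one pound-force


@dataclass(frozen=True)
class PoundForceSeconds:
    value: float


@dataclass(frozen=True)
class NewtonSeconds:
    value: float


def to_newton_seconds(impulse: PoundForceSeconds) -> NewtonSeconds:
    """Convert an impulse from US customary units to SI units."""
    return NewtonSeconds(impulse.value * LBF_TO_NEWTON)


def apply_impulse(impulse: NewtonSeconds) -> float:
    """Accept impulses only in the SI domain; anything else is rejected."""
    if not isinstance(impulse, NewtonSeconds):
        raise TypeError("impulse must be given in newton seconds")
    return impulse.value


INT16_MIN, INT16_MAX = -32768, 32767


def to_int16(x: float) -> int:
    """Narrow a float to a 16-bit signed integer, refusing values that do not fit."""
    n = int(x)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in a 16-bit signed integer")
    return n


print(apply_impulse(to_newton_seconds(PoundForceSeconds(1.0))))
```

Passing `PoundForceSeconds(1.0)` straight to `apply_impulse`, or calling `to_int16(40000.0)`, raises an error instead of producing a wrong number.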
    • Types of notations (slide 28)
      • Different semantic notations are available for ER diagramming:
        – Chen notation
        – IDEF1X
        – Information Engineering
        – Barker notation
    • Types of notations: IDEF1X (slide 29)
      • [Diagram: IDEF1X symbol chart showing independent and dependent entities, identifying relationships (solid lines) and non-identifying relationships (dashed lines), category discriminators (complete and incomplete), many-to-many and zero-one-or-many relationships, the ‘Z’ and ‘P’ cardinality markers, and optional and mandatory attributes]
    • Types of notations: IDEF1X (slide 30)
      • Supported by most of the available tools.
      • More geared towards developing the physical database design.
      • Needs combinations of notations to capture rules.
      • These combinations are not easily understood by business people, which makes them difficult to use in JAD sessions.
    • IDEF1X model (slide 31)
    • Types of notations: Information Engineering (IE) (slide 32)
      • [Diagram: IE symbol chart showing entities with attributes, super types and sub types, identifying and non-identifying relationships, one-to-many, many-to-many, zero-or-one, one-and-only-one and zero-one-or-many cardinalities, and the exclusive OR used in the Finkelstein variation]
    • Types of notations: Information Engineering (IE) (slide 33)
      • Two variations: Clive Finkelstein and James Martin.
      • Different tools implement different variations of the notation.
      • In the original version, attributes are not shown on the entities but in a separate document, such as Martin's Bubble Chart.
      • Supported by most of the available modeling tools.
      • Easy-to-understand notation.
      • Suitable for JAD sessions.
    • IE model (slide 34)
    • Types of notations: Barker (slide 35)
      • [Diagram: Barker symbol chart showing entities, solid and dashed lines for optionality, one-or-more, zero-or-one, zero-or-more and one-to-one relationships, exclusive OR, and super types with sub types]
    • Types of notations: Barker (slide 36)
      • A # before an attribute marks a unique identifier attribute.
      • Solid circles mark required attributes.
      • Blank circles mark optional attributes.
      • Sub types are mutually exclusive.
      • Sub types are always complete.
      • A line across a relationship means the relationship is identifying.
    • Types of notations: Barker (slide 37)
      • Developed by Richard Barker in the UK in 1986.
      • Adopted by Oracle for its CASE methodology.
      • Simple and easily understood by business people.
      • Not supported by all tools.
    • Barker model (slide 38)
    • (slide 39)
    • Reading business rules (slide 40)
      • Template: Each <Entity 1> {may be | must be} <relationship> {zero | only one | one or more} <Entity 2>
        – {may be | must be} is the optionality.
        – <relationship> is the verb phrase.
        – {zero | only one | one or more} is the cardinality.
      • An EMPLOYEE must be staff of only one DEPARTMENT.
      • A DEPARTMENT may be composed of one or more EMPLOYEE.
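The reading template is mechanical enough to be written as a small function. This is a minimal sketch, assuming a hypothetical `Rule` record that holds the five parts of the template; it rebuilds the sentence for the EMPLOYEE and DEPARTMENT example from the slide.

```python
from dataclasses import dataclass

# The three cardinality wordings allowed by the template on the slide.
CARDINALITIES = ("zero", "only one", "one or more")


@dataclass(frozen=True)
class Rule:
    entity_1: str
    mandatory: bool      # optionality: True -> 'must be', False -> 'may be'
    verb_phrase: str     # the text written on the relationship line
    cardinality: str     # one of CARDINALITIES
    entity_2: str


def read_rule(rule: Rule) -> str:
    """Render one relationship end as a business-rule sentence."""
    if rule.cardinality not in CARDINALITIES:
        raise ValueError(f"unknown cardinality: {rule.cardinality!r}")
    optionality = "must be" if rule.mandatory else "may be"
    return (
        f"Each {rule.entity_1} {optionality} {rule.verb_phrase} "
        f"{rule.cardinality} {rule.entity_2}"
    )


print(read_rule(Rule("EMPLOYEE", True, "staff of", "only one", "DEPARTMENT")))
print(read_rule(Rule("DEPARTMENT", False, "composed of", "one or more", "EMPLOYEE")))
```

Every relationship line yields two such sentences, one read in each direction.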
    • Reading business rules (slide 41)
      • A CLAIM FILE may contain zero, one or more TOTAL LOSS REQUEST RECORD.
      • A TOTAL LOSS REQUEST RECORD must be on only one CLAIM FILE.
    • Reading business rules (slide 42)
      • A CLAIM FILE may have vehicle detail in zero, one or more VEHICLE RECORD.
      • A VEHICLE RECORD must be (..?..) one and only one CLAIM FILE.
    • Reading a data model (slide 43)
      • Find out which notation is being used.
      • Get a chart of the notation giving the graphical representations and their descriptions.
      • Look at the important entities in the model: the entities at the center of many relationships.
      • Look at the definition of each entity. The definition should convey the role the entity plays in the business.
      • Following the relationship lines and reading the verb phrases, move from one entity to another.
      • Note the relationships implemented in the model.
      • Note the cardinality and optionality rules.
      • Read the business rules implemented for the entities.
    • Let us read a data model (slide 44)
    • Reading a data model: gleaning the business rules (slide 45)
      • It is an attributed logical model.
      • It uses Information Engineering (IE) notation.
      • A PARTY may place zero, one or many PURCHASE ORDER.
      • A PURCHASE ORDER must be received from only one PARTY.
      • A PARTY must be of either PERSON or ORGANIZATION type.
      • A PURCHASE ORDER may contain zero, one or many LINE ITEM.
      • A LINE ITEM must be placed on only one PURCHASE ORDER.
      • A PRODUCT may be on zero, one or more LINE ITEM.
      • A LINE ITEM must show only one PRODUCT.
      • A PRODUCT may be of SOURCED PRODUCT or SERVICE type.
      • Party Identifier is the key identifier for PARTY.
      • Product Identifier is the key identifier for PRODUCT.
    • Reading a data model: gleaning the business rules (slide 46)
      • Purchase Order Number combined with Party Identifier is the primary identifier for PURCHASE ORDER.
      • Line Item Number, Product Identifier, Party Identifier and Purchase Order Number combined are the primary identifier for LINE ITEM.
      • Surname is an attribute of PERSON only.
      • Business Number is an attribute of ORGANIZATION only.
      • Sourced From is an attribute of SOURCED PRODUCT only.
      • Cost Amount is an attribute of SOURCED PRODUCT only.
      • Service Location is an attribute of SERVICE only.
      • Rate Per Hour is an attribute of SERVICE only.
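The combined identifiers listed above can be sketched as a schema. The example below uses Python's built-in `sqlite3` module. Table and column names are hypothetical renderings of the entity and attribute names on the slide, and only the key columns are included. Because the purchase order is identified by party plus order number, two parties may reuse the same order number, but one party may not.

```python
import sqlite3

# Hypothetical physical rendering of PARTY, PRODUCT, PURCHASE ORDER and LINE ITEM.
SCHEMA = """
CREATE TABLE party (party_id INTEGER PRIMARY KEY);
CREATE TABLE product (product_id INTEGER PRIMARY KEY);
CREATE TABLE purchase_order (
    party_id  INTEGER NOT NULL REFERENCES party(party_id),
    po_number INTEGER NOT NULL,
    PRIMARY KEY (party_id, po_number)
);
CREATE TABLE line_item (
    party_id         INTEGER NOT NULL,
    po_number        INTEGER NOT NULL,
    product_id       INTEGER NOT NULL REFERENCES product(product_id),
    line_item_number INTEGER NOT NULL,
    PRIMARY KEY (party_id, po_number, product_id, line_item_number),
    FOREIGN KEY (party_id, po_number)
        REFERENCES purchase_order(party_id, po_number)
);
"""


def build_order_db() -> sqlite3.Connection:
    """Create the four tables in an in-memory database with FK checks on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def accepted(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row satisfies the key rules, False otherwise."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_order_db()
for party_id in (1, 2):
    conn.execute("INSERT INTO party VALUES (?)", (party_id,))
conn.execute("INSERT INTO product VALUES (?)", (7,))

po = "INSERT INTO purchase_order VALUES (?, ?)"
print(accepted(conn, po, (1, 100)))  # first order 100 for party 1
print(accepted(conn, po, (2, 100)))  # same number, different party
print(accepted(conn, po, (1, 100)))  # same party and number again
```

A line item for an order that does not exist, such as party 1 with order 999, is rejected in the same way by the composite foreign key.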
    • Reading a data model: deriving real value (slide 47)
      • A very important exercise for flushing out hidden and missing business rules, which minimizes ‘later day change requests’.
      • The value is in the critical examination of business rules.
        – A PURCHASE ORDER must be received from only one PARTY:
          • Can a party transfer its purchase order to another party?
          • What if a party is dissolved, merged or acquired by another party after placing a purchase order? Do we need to know about the original party?
          • Can two parties place a combined order to obtain a volume discount?
        – Business Number is an attribute of ORGANIZATION only:
          • There are individuals who are incorporated and have a business number. Should we capture their business number?
        – A PRODUCT may be of SOURCED PRODUCT or SERVICE type:
          • What about sourced products requiring installation service and support? Should we invoice service on a separate purchase order?
    • Avoiding the high cost of change (slide 48)
    • Data models: maximizing ROI (slide 49)
      • Make data modeling a mandatory part of the development life cycle.
      • Standardize on one data modeling tool so that everybody is familiar with its semantics.
      • Provide training to users in the modeling tool and its semantics.
      • Capture additional business rules in separate documents for completeness.
      • Keep data models up to date.
    • Further reading (slide 50)
      • Help section of the data modeling tools: most tools come with good supporting documentation on modeling methodology and notations.
      • Data Model Patterns: Conventions of Thought, by David C. Hay
      • Data Modeling Made Simple: A Practical Guide for Business and IT Professionals, by Steve Hoberman
      • Data Modeling for the Business: A Handbook for Aligning the Business with IT using High-Level Data Models (Take It with You Guides), by Steve Hoberman, Donna Burbank and Chris Bradley
    • Thank you for joining (slide 51)