Terry Bunio
Data Modeling – Tales from the trenches
Thank you to our Sponsors
@tbunio
tbunio@protegra.com
agilevoyageur.com
www.protegra.com
Who Am I?
• Terry Bunio
• Database Administrator
– Oracle
– SQL Server 6, 6.5, 7, 2000, 2005, 2008, 2012
– Informix
– ADABAS
• Data Modeler/Architect
– Investors Group, LPL Financial, Manitoba
Blue Cross, Assante Financial, CI Funds,
Mackenzie Financial
– Normalized and Dimensional
• Agilist
– Innovation Gamer, Team Member, SQL
Developer, Test writer, Sticky Sticker, Project
Manager, PMO on SAP Implementation
Agenda
• Data Modeling Hubris
– Multi-language reference tables
– “All Claims”
– Recursion
Once upon a time
• Worked on a project for a
client in Luxembourg
• Interesting point
– Luxembourg has three official
languages
• Luxembourgish
• French
• German
Once upon a time
• Needed to create multilingual
descriptions for reference tables
• At the time only English
and French were required
• Convinced the team that we would
soft model the language
(store it as data, not as columns)
Once upon a time
• These tables also had
independent surrogate keys for
all reference table values
Once upon a time
• It wasn’t fun
• Queries performed terribly and
were overly complex
• Never used the extra flexibility
and we eventually replaced the
functionality with an English
and French description field
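A minimal sketch of the two designs described above, using Python's built-in sqlite3. All table and column names (claim_status, claim_status_text, description_en, and so on) are hypothetical and not taken from the actual project. It only illustrates how the soft-modeled language needs several joins to return what the fixed design returns from a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Soft-modeled design (hypothetical names): one text row per value per language.
CREATE TABLE language (language_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE claim_status (status_id INTEGER PRIMARY KEY);
CREATE TABLE claim_status_text (
    status_id   INTEGER NOT NULL REFERENCES claim_status(status_id),
    language_id INTEGER NOT NULL REFERENCES language(language_id),
    description TEXT NOT NULL
);
INSERT INTO language VALUES (1, 'English'), (2, 'French');
INSERT INTO claim_status VALUES (10);
INSERT INTO claim_status_text VALUES (10, 1, 'Open'), (10, 2, 'Ouvert');

-- Fixed design: one description column per required language.
CREATE TABLE claim_status_fixed (
    status_id      INTEGER PRIMARY KEY,
    description_en TEXT NOT NULL,
    description_fr TEXT NOT NULL
);
INSERT INTO claim_status_fixed VALUES (10, 'Open', 'Ouvert');
""")

# Soft model: four joins to get both descriptions for one status.
soft = conn.execute("""
    SELECT en.description, fr.description
    FROM claim_status s
    JOIN claim_status_text en ON en.status_id = s.status_id
    JOIN language l_en ON l_en.language_id = en.language_id
                      AND l_en.name = 'English'
    JOIN claim_status_text fr ON fr.status_id = s.status_id
    JOIN language l_fr ON l_fr.language_id = fr.language_id
                      AND l_fr.name = 'French'
    WHERE s.status_id = 10
""").fetchone()

# Fixed model: no joins at all.
fixed = conn.execute(
    "SELECT description_en, description_fr "
    "FROM claim_status_fixed WHERE status_id = 10"
).fetchone()

print(soft, fixed)
```

Both queries return the same pair of descriptions. The soft model only pays off if new languages are actually added, which in this story never happened.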
Once upon a time
• Not my design
• Once saw a database that
actually stored all text fields on
one table
– You joined to the table with the
Primary Key from the description
table
• Some queries joined to the
name table over 10 times.
All Claims
All Claims
• Anyone work with SAP?
• Their tables are not tables as
much as large flat files
• Record type and other
extremely codified fields
• Really hard to make sense of
All Claims
• To make it easier on
developers we created an
All_claims table that would join
all the relevant data together
and also do some filtering
All Claims
• This became quite the beast of
an object
• Became a focal point for
performance tuning
• No one could access the data
until it was loaded
All Claims
• We eventually had to develop
a net change process as we
couldn’t reload all the records
every day
• Ended up being very
successful
– Lot of heartache
– Extremely talented developer
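The slides do not say how the net change process worked. The sketch below is only one possible shape for it, under the assumption that both the previous load and the current source extract can be keyed by a claim identifier. The function name net_change and the sample records are hypothetical.

```python
def net_change(previous, current):
    """Compare two keyed snapshots and return only what changed.

    previous, current: dicts mapping a record key to a tuple of values.
    Returns (inserts, updates, deletes).
    """
    # Keys that appear only in the new extract.
    inserts = {k: v for k, v in current.items() if k not in previous}
    # Keys in both snapshots whose values differ.
    updates = {
        k: v for k, v in current.items()
        if k in previous and previous[k] != v
    }
    # Keys that disappeared from the new extract.
    deletes = [k for k in previous if k not in current]
    return inserts, updates, deletes


# Hypothetical claim records: key -> (status, amount)
previous = {1: ("open", 100.0), 2: ("paid", 250.0), 3: ("open", 75.0)}
current = {1: ("open", 100.0), 2: ("paid", 260.0), 4: ("open", 40.0)}

inserts, updates, deletes = net_change(previous, current)
print(inserts, updates, deletes)
```

Only claims 2, 3 and 4 would need to be touched in the target table. Claim 1 is unchanged and is skipped, which is the whole point of not reloading every record every day.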
Recursion
Recursion
• Usually used to model multiple
levels of an object
– Office structure
– Organization Hierarchy
– Etc…
Recursion
• Looking back…
– Seemed to be an intellectual
exercise
– Can I figure out a way to
dynamically model this?
Recursion
• Question is:
– Does the data need a dynamic
model?
– Looking back
• The models were 99% stable
• Dynamic model was being done
for the future
• Definitely over-engineering
Recursion
• So what?
– Complexity in retrieving data
• Especially for reports
– The data would need to have
multiple levels and the ability to
move between different multiple
levels frequently for me to model
the data recursively like this
again
Recursion
• Why not just model the data in
a fixed way and deal with
changes as needed?
– Region
– Division
– Department
• Whoops! Just add Sub-Division
when required and convert
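A small sketch of the trade-off, using Python's built-in sqlite3. The tables org_unit and department and their contents are hypothetical. The fixed table is deliberately flattened for brevity and is not meant as a fully normalized design. It shows that the recursive model needs a recursive query to answer "where does this department sit?", while the fixed model answers it with a plain SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Recursive design: every level is a row that points at its parent.
CREATE TABLE org_unit (
    unit_id   INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES org_unit(unit_id),
    name      TEXT NOT NULL
);
INSERT INTO org_unit VALUES
    (1, NULL, 'West'), (2, 1, 'Retail'), (3, 2, 'Claims');

-- Fixed design: one column per known level.
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    region        TEXT NOT NULL,
    division      TEXT NOT NULL,
    department    TEXT NOT NULL
);
INSERT INTO department VALUES (3, 'West', 'Retail', 'Claims');
""")

# Recursive model: walk up from the department to the top of the tree.
recursive_path = [row[0] for row in conn.execute("""
    WITH RECURSIVE chain(unit_id, parent_id, name, depth) AS (
        SELECT unit_id, parent_id, name, 0
        FROM org_unit WHERE unit_id = 3
        UNION ALL
        SELECT o.unit_id, o.parent_id, o.name, c.depth + 1
        FROM org_unit o JOIN chain c ON o.unit_id = c.parent_id
    )
    SELECT name FROM chain ORDER BY depth DESC
""")]

# Fixed model: the levels are just columns.
fixed_path = list(conn.execute(
    "SELECT region, division, department "
    "FROM department WHERE department_id = 3"
).fetchone())

print(recursive_path, fixed_path)
```

Adding a Sub-Division level to the fixed design means one new column and a one-time conversion, which is the cost the slide suggests paying only when the need is real.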
Agenda
• Data Modeling Mistakes
– Anthropomorphism
– Over-Engineering
– Keys
• GUIDs
• Surrogate/Real Keys
• Composite Keys
– Deleted Records
– Nulls
– History
– Recursion
Definition
• “A database model is a
specification describing
how a database is
structured and used” –
Wikipedia
Definition
• “A data model describes how
the data entities are related
to each other in the real
world” – Terry (5 years ago)
• “A data model describes how
the data entities are related
to each other in the
application” – Terry (today)
Data Model
Characteristics
• Organize/Structure
like Data Elements
• Define relationships
between Data Entities
• Highly Cohesive
• Loosely Coupled
Relational
• Relational Analysis
– Database design is usually in
Third Normal Form
– Database is optimized for
transaction processing (OLTP)
– Normalized tables are optimized
for modification rather than
retrieval
Normal forms
• 1st - Under first normal form, all
occurrences of a record type must contain
the same number of fields.
• 2nd - Second normal form is violated
when a non-key field is a fact about a
subset of a key. It is only relevant when
the key is composite
• 3rd - Third normal form is violated when
a non-key field is a fact about another
non-key field
Source: William Kent - 1982
Normal Forms for the
Layman
• 1st – Table only represents
one type of data
– No row types
• 2nd – Field does not depend
on only a part of the Primary
Key
• 3rd – Field depends only on
the Primary Key
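As an illustration of the 2nd normal form rule above, here is a minimal sketch using Python's built-in sqlite3. The order_item_bad table is hypothetical. Its product_name column depends only on product_id, which is just one part of the composite key, so the same fact is stored once per order and can drift out of sync.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Violates 2nd normal form: product_name is a fact about product_id only,
-- not about the whole key (order_id, product_id).
CREATE TABLE order_item_bad (
    order_id     INTEGER NOT NULL,
    product_id   INTEGER NOT NULL,
    quantity     INTEGER NOT NULL,
    product_name TEXT NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO order_item_bad VALUES
    (1, 7, 2, 'Widget'),
    (2, 7, 5, 'Widget');
""")

# Renaming the product through one order row leaves the other row stale.
conn.execute(
    "UPDATE order_item_bad SET product_name = 'Widget v2' "
    "WHERE order_id = 1 AND product_id = 7"
)

names_for_product_7 = conn.execute(
    "SELECT COUNT(DISTINCT product_name) "
    "FROM order_item_bad WHERE product_id = 7"
).fetchone()[0]

print(names_for_product_7)  # one product, two different names
```

Moving product_name to a separate product table keyed by product_id alone would remove the anomaly, because the name would then be stored exactly once.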
Remember
• Remember to ask ourselves
when we are modeling
• Do either of the options
contradict the normal forms?
• Usually we model past 3rd
normal form based on other
biases
Anthropomorphism
#1 Mistake in
Data Modeling
• Modeling something
to take on human
characteristics or
characteristics of
our world
Amazon
Amazon
• Warehouse is organized
totally randomly
• Although humans think the
items should be ordered in
some way, it does not help
storage or retrieval in any way
– In fact it hurts it by creating ‘hot
spots’ for in-demand items
Data Model
Anthropomorphism
• We sometimes
create objects in
our Data Models
as they exist in the
real world, not in
the applications
Data Model
Anthropomorphism
• This is usually the case for
physical objects in the real
world
– Companies/Organizations
– People
– Addresses
– Phone Numbers
– Emails
Data Model
Anthropomorphism
• Why?
– Do we ever need to consolidate all
people, addresses, or emails?
• Rarely
– We usually report based on other
filter criteria
– So why do we try to place like real
world items on one table when
applications treat them differently?
Over Engineering
Over Engineering
• Additional flexibility that is
not required does not
simplify the solution, it overly
complicates the solution
Over Engineering
• These are usually tables that
have multiple mutually
exclusive foreign keys
– Only one is filled at any one time
• Why not just create separate
join tables?
– Doesn’t violate any normal forms
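A sketch of both shapes using Python's built-in sqlite3. The claim, policy and document tables are hypothetical examples. The first link table carries two mutually exclusive foreign keys and needs a CHECK constraint to keep them honest. The separate join tables need no nullable keys and no such rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claim    (claim_id    INTEGER PRIMARY KEY);
CREATE TABLE policy   (policy_id   INTEGER PRIMARY KEY);
CREATE TABLE document (document_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Over-engineered: one link table, only one of the two keys may be filled.
CREATE TABLE document_link (
    document_id INTEGER NOT NULL REFERENCES document(document_id),
    claim_id    INTEGER REFERENCES claim(claim_id),
    policy_id   INTEGER REFERENCES policy(policy_id),
    CHECK ((claim_id IS NULL) <> (policy_id IS NULL))
);

-- Alternative: one join table per relationship.
CREATE TABLE claim_document (
    claim_id    INTEGER NOT NULL REFERENCES claim(claim_id),
    document_id INTEGER NOT NULL REFERENCES document(document_id)
);
CREATE TABLE policy_document (
    policy_id   INTEGER NOT NULL REFERENCES policy(policy_id),
    document_id INTEGER NOT NULL REFERENCES document(document_id)
);

INSERT INTO claim VALUES (1);
INSERT INTO policy VALUES (1);
INSERT INTO document VALUES (1, 'Accident report');
INSERT INTO claim_document VALUES (1, 1);
""")

# The single link table has to reject rows that fill both keys.
try:
    conn.execute("INSERT INTO document_link VALUES (1, 1, 1)")
    both_keys_rejected = False
except sqlite3.IntegrityError:
    both_keys_rejected = True

# With separate join tables the query names exactly the relationship it wants.
claim_titles = conn.execute("""
    SELECT d.title
    FROM claim_document cd
    JOIN document d ON d.document_id = cd.document_id
    WHERE cd.claim_id = 1
""").fetchall()

print(both_keys_rejected, claim_titles)
```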
Keys
GUIDs
• Oscar winner for worst choice
for a Primary Key ever
• Selected based on over-
engineering because there
would never be duplicates
GUIDs
• In the meantime they caused
excessive index length, user
frustration, and complex query
execution plans
• Just say no.
GUIDs
• Especially don’t use them on
tables with few records
• Who says all the Primary Keys
in a database need to be of
the same type?
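To put a rough number on the index length point, here is a small sketch using Python's standard uuid module. The 4-byte integer is an assumption (a typical 32-bit integer key), and real storage sizes depend on the database engine and column type.

```python
import uuid

guid = uuid.uuid4()

# A GUID stored as its usual hyphenated text form.
guid_as_text_bytes = len(str(guid).encode("ascii"))

# A GUID stored in compact binary form.
guid_as_binary_bytes = len(guid.bytes)

# A 32-bit integer key (assumed size, for comparison).
int_key_bytes = len((2_147_483_647).to_bytes(4, "big"))

print(guid_as_text_bytes, guid_as_binary_bytes, int_key_bytes)
```

Every index entry and every foreign key that references the table repeats the key, so the difference is multiplied across the whole model.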
Surrogate Keys
• Surrogate Keys are a huge
benefit
• Straight Integer keys are
probably the most common
– Users are the most used to
integer keys as well
• Same as bank account, credit
cards, other account information
Surrogate Keys
• The exception
– Don’t, don’t, don’t use Surrogate
keys for Reference or Support
tables
– Causes needless lookups for
clients, SQL queries, and for
reports
Surrogate Keys
• Do we really need to assign a
numeric Primary Key for
Gender and Province codes?
– Especially since these values
very rarely change
– Might make sense for reference
tables that change more
frequently.
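A minimal sketch of the needless lookup, using Python's built-in sqlite3. All table names are hypothetical. With a surrogate key on the province table, getting the province code for a person takes a join. With the code itself as the key, it is already on the person row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Surrogate-keyed reference table: the code hides behind a number.
CREATE TABLE province_surrogate (
    province_id INTEGER PRIMARY KEY,
    code        TEXT NOT NULL UNIQUE
);
CREATE TABLE person_a (
    person_id   INTEGER PRIMARY KEY,
    province_id INTEGER NOT NULL REFERENCES province_surrogate(province_id)
);
INSERT INTO province_surrogate VALUES (1, 'MB'), (2, 'ON');
INSERT INTO person_a VALUES (100, 1);

-- Code-keyed reference table: the code is the key.
CREATE TABLE province (
    code TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE person_b (
    person_id     INTEGER PRIMARY KEY,
    province_code TEXT NOT NULL REFERENCES province(code)
);
INSERT INTO province VALUES ('MB', 'Manitoba'), ('ON', 'Ontario');
INSERT INTO person_b VALUES (100, 'MB');
""")

# Surrogate key: a join is needed just to read the code.
with_lookup = conn.execute("""
    SELECT p.code
    FROM person_a a
    JOIN province_surrogate p ON p.province_id = a.province_id
    WHERE a.person_id = 100
""").fetchone()[0]

# Code as key: no join needed.
without_lookup = conn.execute(
    "SELECT province_code FROM person_b WHERE person_id = 100"
).fetchone()[0]

print(with_lookup, without_lookup)
```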
Composite Keys
Composite Keys
Composite Keys
• A 2nd normal form violation
requires a Composite Key
– Remove Composite Keys and you
remove the possibility of that
violation
• Just a bad idea, as a key with
inherent meaning is a Primary
Key that can change
Deleted Records
• Are we soft deleting or hard
deleting records?
• Used to like soft deleting as
you never lost data
• But this can make queries a
nightmare with needing to filter
on deleted records for every
table in a query
Deleted Records
• Soft delete indicators also
perform quite poorly when
included in an index due to them
only having two values
– Or else you need to add the
deleted indicator to many
indexes
– Both are inefficient
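A short sketch of the query problem, using Python's built-in sqlite3. The customer and invoice tables and the is_deleted column are hypothetical. It shows how the answer changes depending on how many of the tables in the join remember to filter on the deleted indicator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    is_deleted  INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    is_deleted  INTEGER NOT NULL DEFAULT 0
);
-- Bob is soft deleted; invoice 11 is soft deleted.
INSERT INTO customer VALUES (1, 'Ann', 0), (2, 'Bob', 1);
INSERT INTO invoice VALUES (10, 1, 0), (11, 1, 1), (12, 2, 0);
""")

base = """
    SELECT COUNT(*)
    FROM invoice i
    JOIN customer c ON c.customer_id = i.customer_id
"""

# No filter: deleted rows leak into the result.
unfiltered = conn.execute(base).fetchone()[0]

# Filter on one table only: still counts the deleted customer's invoice.
half_filtered = conn.execute(
    base + " WHERE i.is_deleted = 0"
).fetchone()[0]

# Filter on every table in the query: the intended answer.
fully_filtered = conn.execute(
    base + " WHERE i.is_deleted = 0 AND c.is_deleted = 0"
).fetchone()[0]

print(unfiltered, half_filtered, fully_filtered)
```

Three different counts from what looks like the same question, and each extra table in the join adds another filter that can be forgotten.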
Nulls
Nulls
• Nulls are evil
• Do whatever you can to avoid
nulls
– Column Defaults
– Domain Defaults
– Did I mention defaults?
Nulls
• Nulls can complicate queries
just like deleted indicators
• Are probably also the number
one cause of devious, mind-
bending defects
– Think of the time you will save!
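One example of the kind of defect meant here, sketched with Python's built-in sqlite3. The member tables and the 'UNKNOWN' default are hypothetical. A filter such as status <> 'CLOSED' silently drops rows where status is NULL, while a column default keeps those rows visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Nullable column: member 3 has no status.
CREATE TABLE member_nullable (
    member_id INTEGER PRIMARY KEY,
    status    TEXT
);
INSERT INTO member_nullable VALUES (1, 'ACTIVE'), (2, 'CLOSED'), (3, NULL);

-- Same data with a column default instead of NULL.
CREATE TABLE member_defaulted (
    member_id INTEGER PRIMARY KEY,
    status    TEXT NOT NULL DEFAULT 'UNKNOWN'
);
INSERT INTO member_defaulted (member_id, status)
    VALUES (1, 'ACTIVE'), (2, 'CLOSED');
INSERT INTO member_defaulted (member_id) VALUES (3);
""")

# NULL <> 'CLOSED' is not true, so member 3 disappears from the result.
not_closed_nullable = conn.execute(
    "SELECT COUNT(*) FROM member_nullable WHERE status <> 'CLOSED'"
).fetchone()[0]

# With a default, member 3 is counted as expected.
not_closed_defaulted = conn.execute(
    "SELECT COUNT(*) FROM member_defaulted WHERE status <> 'CLOSED'"
).fetchone()[0]

print(not_closed_nullable, not_closed_defaulted)
```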
Nulls
• For this reason, Nulls are the
first thing that goes when
we create a Self Service Data
Warehouse
History
History
• Where and how should we
store history?
• Transaction tables are easy
– They usually have always been
historical tables
• But what about tables like
person and address?
History
• Few options
– Create history record on same
table
– Create history record on history
table for each table
– Create history record on one
audit table
– Don’t store it and let the Data
Warehouse worry about it
History on same table
• Keeps the number of tables in
your database to a minimum
• Keeps queries cleaner
• Complicates queries as you
now need to include/exclude
history records
– And you will need to add
additional data information
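A sketch of history kept on the same table, using Python's built-in sqlite3. The valid_from and valid_to columns and the '9999-12-31' end marker are assumptions about what the additional information might look like, not the author's stated design. Every query that wants the current row has to carry the extra filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Current and historical rows live side by side (hypothetical columns).
CREATE TABLE person_address (
    row_id     INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL,
    city       TEXT NOT NULL,
    valid_from TEXT NOT NULL,
    valid_to   TEXT NOT NULL DEFAULT '9999-12-31'
);
INSERT INTO person_address (person_id, city, valid_from, valid_to)
    VALUES (1, 'Winnipeg', '2010-01-01', '2014-06-30');
INSERT INTO person_address (person_id, city, valid_from)
    VALUES (1, 'Toronto', '2014-07-01');
""")

# Without a filter, history rows come back with the current one.
all_rows = conn.execute(
    "SELECT COUNT(*) FROM person_address WHERE person_id = 1"
).fetchone()[0]

# The current address needs an explicit exclusion of history rows.
current_city = conn.execute(
    "SELECT city FROM person_address "
    "WHERE person_id = 1 AND valid_to = '9999-12-31'"
).fetchone()[0]

print(all_rows, current_city)
```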
History on separate table
• Dirties up the database as you
create a history copy of every
table in the database
• Some Queries are cleaner
• Some Queries now need to join
twice as many tables though!
History on Audit table
• Queries are cleaner
• Database is cleaner
• But depending on the solution,
you may end up having one
absolutely huge table to parse
through.
History in Data
Warehouse
• Perhaps the cleanest option
• Requires a commitment to
infrastructure
• Latency may also become an
issue
Let's play a game
Questions?

Data modeling tips from the trenches

  • 1. Terry Bunio Data Modeling – Tales from the trenches Thank you to our Sponsors
  • 3. Who Am I? • Terry Bunio • Data Base Administrator – Oracle – SQL Server 6,6.5,7,2000,2005,2008,2012 – Informix – ADABAS • Data Modeler/Architect – Investors Group, LPL Financial, Manitoba Blue Cross, Assante Financial, CI Funds, Mackenzie Financial – Normalized and Dimensional • Agilist – Innovation Gamer, Team Member, SQL Developer, Test writer, Sticky Sticker, Project Manager, PMO on SAP Implementation
  • 4.
  • 5.
  • 6. Agenda • Data Modeling Hubris – Multi-language reference tables – “All Claims” – Recursion
  • 7. Once upon a time • Worked on a project for a client in Luxembourg • Interesting point – Luxembourg has four official languages • English • French • German • Flemish (I think)
  • 8. Once upon a time • Need to create multi-lingual descriptions for reference table • Currently only required English and French • Convinced team that we would soft model the language
  • 9. Once upon a time • These tables also had independent surrogate kets for all reference table values
  • 10.
  • 11. Once upon a time • It wasn’t fun • Queries performed terribly and were overly complex • Never used the extra flexibility and we eventually replaced the functionality with an English and French description field
  • 12.
  • 13. Once upon a time • Not my design • Once saw a database that actually stored all text fields on one table – You joined to the table with the Primary Key from the description table • Some queries joined to the name table over 10 times.
  • 15. All Claims • Anyone work with SAP? • Their tables are not tables as much as large flat files • Record type and other extremely codified fields • Really hard to make sense of
  • 16. All Claims • To make it easier on developers we created an All_claims table that would join all the relative data together and also do some filtering
  • 18. All Claims • This became quite the beast of an object • Became a focal point for performance tuning • No one could access the data until it was loaded
  • 19. All Claims • We eventually had to develop a net change process as we couldn’t reload all the records every day • Ended up being very successful – Lot of heartache – Extremely talented developer
  • 21. Recursion • Usually used to model multiple levels of an object – Office structure – Organization Hierarchy – Etc…
  • 22. Recursion • Looking back… – Seemed to be an intellectual exercise – Can I figure out a way to dynamically model this?
  • 23. Recursion • Question is: – Does the data need a dynamic model? – Looking back • The models were 99% stable • Dynamic model was being done for the future • Definitely over engineering
  • 24. Recursion • So what? – Complexity in retrieving data • Especially for reports – The data would need to have many levels, and items would need to move between levels frequently, for me to model the data recursively like this again
  • 25. Recursion • Why not just model the data in a fixed way and deal with changes as needed – Region – Division – Department • Whoops! Just add Sub-Division when required and convert
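
The retrieval cost discussed in the Recursion slides can be illustrated with a small sketch that contrasts a self-referencing table with a fixed Region/Division/Department table. It uses Python's `sqlite3`; the table names `org_unit` and `department` and the sample rows are assumptions made for illustration only.

```python
import sqlite3


def build_org_models():
    """Create a recursive and a fixed hierarchy (hypothetical names and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Recursive model: each unit points at its parent; the root has none.
        CREATE TABLE org_unit (
            unit_id   INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES org_unit (unit_id),
            name      TEXT NOT NULL
        );
        INSERT INTO org_unit VALUES (1, NULL, 'West'), (2, 1, 'Retail'), (3, 2, 'Claims');

        -- Fixed model: one column per known level.
        CREATE TABLE department (
            department TEXT PRIMARY KEY,
            division   TEXT NOT NULL,
            region     TEXT NOT NULL
        );
        INSERT INTO department VALUES ('Claims', 'Retail', 'West');
        """
    )
    return conn


def path_recursive(conn, name):
    """Walk up the parent links with a recursive common table expression."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain (unit_id, parent_id, name, depth) AS (
            SELECT unit_id, parent_id, name, 0 FROM org_unit WHERE name = ?
            UNION ALL
            SELECT p.unit_id, p.parent_id, p.name, c.depth + 1
            FROM org_unit p
            JOIN chain c ON p.unit_id = c.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


def path_fixed(conn, name):
    """The fixed model answers the same question with a single-row lookup."""
    row = conn.execute(
        "SELECT region, division, department FROM department WHERE department = ?",
        (name,),
    ).fetchone()
    return list(row)
```

Both functions return the same top-down path. The recursive version is the one that report writers have to understand and tune.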
  • 26. Agenda • Data Modeling Mistakes – Anthropomorphism – Over-Engineering – Keys • GUIDs • Surrogate/Real Keys • Composite Keys – Deleted Records – Nulls – History – Recursion
  • 27. Definition • “A database model is a specification describing how a database is structured and used” – Wikipedia
  • 28. Definition • “A data model describes how the data entities are related to each other in the real world” – Terry (5 years ago) • “A data model describes how the data entities are related to each other in the application” – Terry (today)
  • 29. Data Model Characteristics • Organize/Structure like Data Elements • Define relationships between Data Entities • Highly Cohesive • Loosely Coupled
  • 30. Relational • Relational Analysis – Database design is usually in Third Normal Form – Database is optimized for transaction processing. (OLTP) – Normalized tables are optimized for modification rather than retrieval
  • 31. Normal forms • 1st - Under first normal form, all occurrences of a record type must contain the same number of fields. • 2nd - Second normal form is violated when a non-key field is a fact about a subset of a key. It is only relevant when the key is composite • 3rd - Third normal form is violated when a non-key field is a fact about another non-key field Source: William Kent - 1982
  • 32. Normal Forms for the Layman • 1st – Table only represents one type of data – No row types • 2nd – Field does not depend on only a part of the Primary Key • 3rd – Field depends only on the Primary Key
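
A short sketch of what a second normal form violation costs in practice, using Python's `sqlite3`. The order-line tables below are hypothetical and were chosen only because `product_name` is a fact about part of the composite key.

```python
import sqlite3


def second_normal_form_demo():
    """Return how many names product P1 has in the flat and normalized designs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Violates 2NF: product_name depends only on product_code,
        -- which is just one part of the composite key.
        CREATE TABLE order_line_flat (
            order_id     INTEGER NOT NULL,
            product_code TEXT NOT NULL,
            product_name TEXT NOT NULL,
            quantity     INTEGER NOT NULL,
            PRIMARY KEY (order_id, product_code)
        );
        INSERT INTO order_line_flat VALUES (1, 'P1', 'Widget', 2), (2, 'P1', 'Widget', 5);

        -- Normalized: the name is stored once, keyed by product_code.
        CREATE TABLE product (
            product_code TEXT PRIMARY KEY,
            product_name TEXT NOT NULL
        );
        CREATE TABLE order_line (
            order_id     INTEGER NOT NULL,
            product_code TEXT NOT NULL REFERENCES product (product_code),
            quantity     INTEGER NOT NULL,
            PRIMARY KEY (order_id, product_code)
        );
        INSERT INTO product VALUES ('P1', 'Widget');
        INSERT INTO order_line VALUES (1, 'P1', 2), (2, 'P1', 5);
        """
    )
    # A rename that touches only one order line leaves the flat table inconsistent.
    conn.execute("UPDATE order_line_flat SET product_name = 'Gadget' WHERE order_id = 1")
    # The same rename in the normalized design is a single-row update.
    conn.execute("UPDATE product SET product_name = 'Gadget' WHERE product_code = 'P1'")

    flat_names = conn.execute(
        "SELECT COUNT(DISTINCT product_name) FROM order_line_flat WHERE product_code = 'P1'"
    ).fetchone()[0]
    normalized_names = conn.execute(
        """
        SELECT COUNT(DISTINCT p.product_name)
        FROM order_line o
        JOIN product p ON p.product_code = o.product_code
        WHERE o.product_code = 'P1'
        """
    ).fetchone()[0]
    return flat_names, normalized_names
```

After the partial rename, the flat table holds two different names for the same product, while the normalized design still holds one.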
  • 33. Remember • Remember to ask ourselves when we are modeling • Does either of the options contradict the normal forms? • Usually we model past 3rd normal form based on other biases
  • 35. #1 Mistake in Data Modeling • Modeling something to take on human characteristics or characteristics of our world
  • 37. Amazon • Warehouse is organized totally randomly • Although humans think the items should be ordered in some way, it does not help storage or retrieval in any way – In fact it hurts it by creating ‘hot spots’ for in-demand items
  • 38. Data Model Anthropomorphism • We sometimes create objects in our Data Models as they exist in the real world, not in the applications
  • 39. Data Model Anthropomorphism • This is usually the case for physical objects in the real world – Companies/Organizations – People – Addresses – Phone Numbers – Emails
  • 40. Data Model Anthropomorphism • Why? – Do we ever need to consolidate all people, addresses, or emails? • Rarely – We usually report based on other filter criteria – So why do we try to place like real world items on one table when applications treat them differently?
  • 42. Over Engineering • Additional flexibility that is not required does not simplify the solution, it overly complicates the solution
  • 43. Over Engineering • These are usually tables that have multiple mutually exclusive foreign keys – Only one is filled at any one time • Why not just create separate join tables? – Doesn’t violate any normal forms
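
The "multiple mutually exclusive foreign keys" pattern and the separate-join-table alternative might look like the sketch below. It uses Python's `sqlite3`, and the person/company address tables are hypothetical examples rather than anything shown in the deck.

```python
import sqlite3


def build_address_tables():
    """Create the generic table and the separate join tables (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person  (person_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        INSERT INTO person  VALUES (1, 'Ann');
        INSERT INTO company VALUES (1, 'Acme');

        -- "Flexible" design: two nullable foreign keys, only one may be filled.
        -- The CHECK counts how many owner columns are NULL and requires exactly one.
        CREATE TABLE address_generic (
            address_id INTEGER PRIMARY KEY,
            person_id  INTEGER REFERENCES person (person_id),
            company_id INTEGER REFERENCES company (company_id),
            street     TEXT NOT NULL,
            CHECK ((person_id IS NULL) + (company_id IS NULL) = 1)
        );

        -- Alternative: one join table per owner type, no nullable keys, no CHECK.
        CREATE TABLE person_address (
            person_id INTEGER NOT NULL REFERENCES person (person_id),
            street    TEXT NOT NULL
        );
        CREATE TABLE company_address (
            company_id INTEGER NOT NULL REFERENCES company (company_id),
            street     TEXT NOT NULL
        );
        """
    )
    return conn


def both_owners_rejected(conn):
    """Try to fill both foreign keys on the generic table; the CHECK should refuse."""
    try:
        conn.execute(
            "INSERT INTO address_generic (person_id, company_id, street) "
            "VALUES (1, 1, '1 Main St')"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

The generic table only stays consistent because of the extra constraint. With separate join tables the rule "one owner per row" is expressed by the structure itself.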
  • 44. Keys
  • 45. GUIDs • Oscar winner for worst choice for a Primary Key ever • Selected based on over engineering because there would never be duplicates
  • 46. GUIDs • In the meantime they caused excessive index length, user frustration, and complex query execution plans • Just say no.
  • 47. GUIDs • Especially don’t use them on tables with few records • Who says all the Primary Keys in a database need to be of the same type?
  • 48. Surrogate Keys • Surrogate Keys are a huge benefit • Straight Integer keys are probably the most common – Users are the most used to integer keys as well • Same as bank account, credit cards, other account information
  • 49. Surrogate Keys • The exception – Don’t, don’t, don’t use Surrogate keys for Reference or Support tables – Causes needless lookups for clients, SQL queries, and for reports
  • 50. Surrogate Keys • Do we really need to assign a numeric Primary Key for Gender and Province codes? – Especially since these values very rarely change – Might make sense for reference tables that change more frequently.
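
The "needless lookups" point about reference tables can be shown with a small sketch in Python's `sqlite3`. The province and person tables and their columns are hypothetical; the only assumption is that the province code itself ('MB') is stable enough to serve as the key.

```python
import sqlite3


def build_province_models():
    """Create a surrogate-keyed and a code-keyed reference table (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Surrogate key on the reference table: person rows store a meaningless number.
        CREATE TABLE province_surrogate (
            province_id INTEGER PRIMARY KEY,
            code        TEXT NOT NULL UNIQUE,
            name        TEXT NOT NULL
        );
        CREATE TABLE person_surrogate (
            person_id   INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            province_id INTEGER NOT NULL REFERENCES province_surrogate (province_id)
        );
        INSERT INTO province_surrogate VALUES (7, 'MB', 'Manitoba');
        INSERT INTO person_surrogate VALUES (1, 'Ann', 7);

        -- Code as the key: person rows store the value people actually read.
        CREATE TABLE province (
            code TEXT PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE person (
            person_id     INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            province_code TEXT NOT NULL REFERENCES province (code)
        );
        INSERT INTO province VALUES ('MB', 'Manitoba');
        INSERT INTO person VALUES (1, 'Ann', 'MB');
        """
    )
    return conn


def province_code_with_surrogate(conn, person_id):
    """A join is required just to show the province code."""
    row = conn.execute(
        """
        SELECT r.code
        FROM person_surrogate p
        JOIN province_surrogate r ON r.province_id = p.province_id
        WHERE p.person_id = ?
        """,
        (person_id,),
    ).fetchone()
    return row[0]


def province_code_natural(conn, person_id):
    """The code is already on the person row; no lookup is needed."""
    row = conn.execute(
        "SELECT province_code FROM person WHERE person_id = ?", (person_id,)
    ).fetchone()
    return row[0]
```

The join to the reference table is still available when the full name is needed, but it is no longer mandatory for every query and report.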
  • 53. Composite Keys • Composite Keys are needed to violate 2nd normal form – Remove Composite Keys, and you remove the possibility of that violation • Just a bad idea, as the key has inherent meaning, which means the Primary Key can change
  • 54. Deleted Records • Are we soft deleting or hard deleting records? • Used to like soft deleting as you never lost data • But this can make queries a nightmare, since you need to filter out deleted records for every table in a query
  • 55. Deleted Records • Soft deleted records also perform quite poorly when included in an index due to them only having two values – Or else you need to add the deleted indicator to many indexes – Both are inefficient
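
A minimal sketch of the filtering burden that soft deletes add, written with Python's `sqlite3`. The `claim` and `claimant` tables, the `is_deleted` column, and the sample rows are all hypothetical.

```python
import sqlite3


def soft_delete_counts():
    """Return join row counts with no filter, one filter, and both filters."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE claimant (
            claimant_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            is_deleted  INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE claim (
            claim_id    INTEGER PRIMARY KEY,
            claimant_id INTEGER NOT NULL REFERENCES claimant (claimant_id),
            is_deleted  INTEGER NOT NULL DEFAULT 0
        );
        -- Claimant 2 is soft deleted; so is claim 2.
        INSERT INTO claimant VALUES (1, 'Ann', 0), (2, 'Bob', 1);
        INSERT INTO claim VALUES (1, 1, 0), (2, 1, 1), (3, 2, 0);
        """
    )
    base = "SELECT COUNT(*) FROM claim c JOIN claimant p ON p.claimant_id = c.claimant_id"
    unfiltered = conn.execute(base).fetchone()[0]
    claim_only = conn.execute(base + " WHERE c.is_deleted = 0").fetchone()[0]
    both = conn.execute(
        base + " WHERE c.is_deleted = 0 AND p.is_deleted = 0"
    ).fetchone()[0]
    return unfiltered, claim_only, both
```

Forgetting the filter on either table silently returns rows that belong to deleted records, and every additional joined table adds one more predicate to remember.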
  • 56. Nulls
  • 57. Nulls • Nulls are evil • Do whatever you can to avoid nulls – Column Defaults – Domain Defaults – Did I mention defaults?
  • 58. Nulls • Nulls can complicate queries just like deleted indicators • Probably also are the number one cause of devious, mind-bending defects – Think of the time you will save!
  • 59. Nulls • For this reason, Nulls are the first thing that goes when you create a Self Service Data Warehouse
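
One common way a Null produces a "mind-bending" defect is a not-equal filter, which drops Null rows without any warning. The sketch below shows this with Python's `sqlite3` and contrasts it with a column default. The claim tables and the 'UNKNOWN' default are hypothetical choices.

```python
import sqlite3


def not_closed_counts():
    """Count claims whose status is not 'CLOSED' in a nullable and a defaulted table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Nullable status: claim 3 has no status recorded.
        CREATE TABLE claim_nullable (
            claim_id INTEGER PRIMARY KEY,
            status   TEXT
        );
        INSERT INTO claim_nullable VALUES (1, 'OPEN'), (2, 'CLOSED'), (3, NULL);

        -- Same data, but the column has a default instead of allowing NULL.
        CREATE TABLE claim_defaulted (
            claim_id INTEGER PRIMARY KEY,
            status   TEXT NOT NULL DEFAULT 'UNKNOWN'
        );
        INSERT INTO claim_defaulted VALUES (1, 'OPEN'), (2, 'CLOSED');
        INSERT INTO claim_defaulted (claim_id) VALUES (3);
        """
    )
    # NULL <> 'CLOSED' evaluates to NULL, not true, so claim 3 is filtered out.
    nullable = conn.execute(
        "SELECT COUNT(*) FROM claim_nullable WHERE status <> 'CLOSED'"
    ).fetchone()[0]
    # 'UNKNOWN' <> 'CLOSED' is true, so claim 3 is counted.
    defaulted = conn.execute(
        "SELECT COUNT(*) FROM claim_defaulted WHERE status <> 'CLOSED'"
    ).fetchone()[0]
    return nullable, defaulted
```

The same predicate returns a different answer on the same logical data, which is the kind of defect that is hard to spot in a report.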
  • 61. History • Where and how should we store history? • Transaction tables are easy – They usually have always been historical tables • But what about tables like person and address?
  • 62. History • Few options – Create history record on same table – Create history record on history table for each table – Create history record on one audit table – Don’t store it and let the Data Warehouse worry about it
  • 63. History on same table • Keeps the number of tables in your database to a minimum • Keeps queries cleaner • Complicates queries as you now need to include/exclude – And you will need to add additional data information
  • 64. History on separate table • Dirties up the database as you create a history copy of every table in the database • Some Queries are cleaner • Some Queries now need to join twice as many tables though!
  • 65. History on Audit table • Queries are cleaner • Database is cleaner • But depending on the solution, you may end up having one absolutely huge table to parse through.
  • 66. History in Data Warehouse • Perhaps the cleanest option • Requires a commitment to infrastructure • Latency may also become an issue
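
Of the four history options, the "history table for each table" variant is the easiest to sketch. The version below uses Python's `sqlite3` and a trigger to copy the old row on every update. The deck does not say how history rows were written, so the trigger, the `person_history` table, and the `replaced_at` column are assumptions for illustration.

```python
import sqlite3


def history_table_demo():
    """Update one person and return the current name plus the captured history rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        -- One history table per base table, holding the values that were replaced.
        CREATE TABLE person_history (
            person_id   INTEGER NOT NULL,
            name        TEXT NOT NULL,
            replaced_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        -- Copy the old row into the history table whenever a person is updated.
        CREATE TRIGGER person_keep_history
        AFTER UPDATE ON person
        FOR EACH ROW
        BEGIN
            INSERT INTO person_history (person_id, name)
            VALUES (OLD.person_id, OLD.name);
        END;
        INSERT INTO person VALUES (1, 'Ann Lee');
        """
    )
    conn.execute("UPDATE person SET name = 'Ann Smith' WHERE person_id = 1")
    current = conn.execute(
        "SELECT name FROM person WHERE person_id = 1"
    ).fetchone()[0]
    history = conn.execute(
        "SELECT person_id, name FROM person_history ORDER BY rowid"
    ).fetchall()
    return current, history
```

Queries against `person` stay free of date or version filters, at the cost of one extra table per base table and a second join whenever the history is needed.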
  • 67. Let's play a game