Anchor ModelingA Technique for Information under Evolution Lars Rönnbäck @ GSE Nordics June 7–9, 2011
INFORMATION MONEY LOVE Google ngram viewerGraph showing relative occurrence of wordsin literature over the last centuryInformation is rapidly becoming the mostimportant asset
c litusHera C 5 00.B “Panta rhei” Everything flows
Evolving InformationChanging contentChanging structureChanging constraintsChanging interpretation Theres a big difference between saying: "ThisChanging origins information has a 95% reliability" and "ThisChanging reliability information is 100% reliable".
What is a database?The purpose of a database is to store a bodyof information and allow searches over it.The purpose of a temporal database is tostore a body of information under evolutionand allow historical searches over it. t a re no , we But e yet! ther
What is a Data Warehouse?Integrates information from many sourcesKeeps a history of changesProvides “one version of the truth”Enables reporting, ad-hoc analysis, miningCalculates and stores new information
The dilemmaMany sources andmany users naturally lack ofresult in many tim ion adherence er atchanges e ov rad g de 25% 55%Dimensional Modeling 20%NormalizedHaphazard lack of adherence
Patch or Redo?Patching works initially to cope with newrequirementsMaintenance costs usually rise proportionallyto the lifetime of the data warehouseRedoing is unavoidable at some point(and for dimensional modeling sometimes accounted for)The average lifetime is ﬁve yearsThe return of investment should and couldbe much better with a longer lifetime!
What is Anchor Modeling?Anchor Modeling combines normalization andemulation to provide an agile databasemodeling technique for evolving informationthat is implementable in current relationaldatabases.Most, if not all, of what Anchor Modeling isdoing in its physical (relational)representation could be "hidden" from theend-user in a true temporal database.
Technologies Entity-Relationshipone- Modeling to-one Sixth Normal Form Tables Temporal Database Emulation
History Best Paper Award @ ER’09 Maria Petia Lars Olle Paul Bergholtz Wohed research Rönnbäck RegardtJohannesson DW consulting SU DW DKE GSE DW MDM EDW DW TDWI WWW ER09 TOOL AMW 03 04 05 06 07 08 09 10 11
PhilosophyMake modeling free from assumptionsMake modeling agile and iterativeMake evolution non-destructiveDo not duplicate informationDo not alter existing informationDecouple metadata from the modelProvide a simple interface for queries
EvolutioChanging content Anchor M n infull support [6NF + time of change] odelingChanging structurefull support (through extensions) [non-destructive schema evolution]Changing constraintsminimal support [only primary and foreign keys]Changing interpretationachievable [explicitly modeled]Changing originsrestricted support [using metadata]Changing reliabilityrestricted support [using metadata]
ionin g Posit odeling Domain chor M driven An modelingData Vault/ODS/3NF (Inmon) mimics Anchor Modeling reality mimics mimics Use- Data structure searches case driven driven modeling modeling Dimensional Modeling (Kimball)
Basic Notions Attributes – properties Example: The surname of a Person <#42, ‘Rönnbäck’, 2004-06-19> Anchors – entities Example: A Person <#42> (holds only identities of entities) Knots – shared properties Example: The gender of a Person Ties – relationships <#1, ‘Male’> + <#42, #1> Example: The children of a Person <#42, #4711>
Historization<#42, ‘Samuelsson’, 1972-08-20> closed interval historical information<#42, ‘Rönnbäck’, 2004-06-19> open interval current information Historization is done using the time of change as the start of an interval implicitly closed by another instance of Note tha t UPDAT the same identity with a later never al lowed in E is time of change. anchor databas an e
The Modeling Tool www.anchormodeling.com/modelerOpen SourceOnline (HTML5)Free to useIn the Cloud EM O!XML Interchange Format DAutomatic generation of SQL scriptsInteractive (force-directed) Layout Engine
Important BeneﬁtsHandles evolving information (keeping the integrity intact)Increases longevity (databases with long life expectancy)Simpliﬁes modeling concepts (less prone to error)Enables modular and iterative developmentNeeds no translation logic to the physical layerAutomates generation of scriptsNo downtime when upgrading databasesScans only relevant data during searchesSparse data cause no gaps (no null values)
More Information : Twitter: anchormod Homepage m eling ode ling.co w. anchormht tp://ww od eling Tool Tutorials . M B log . Video deling.com E -mail: lars.r onnback@anchormo LinkedIn Groups: Anchor Modeling Temporal Data Modeling Temporal Data