SlideShare a Scribd company logo
1 of 52
SQL Bits X

Temporal Snapshot Fact
Table
A new approach to data snapshots
Davide Mauri
dmauri@solidq.com
Mentor - SolidQ
EXEC sp_help ā€žDavide Mauriā€Ÿ
ā€¢ Microsoft SQL Server MVP
ā€¢ Works with SQL Server from 6.5, on BI from
  2003
ā€¢ Specialized in Data Solution Architecture,
  Database Design, Performance Tuning, BI
ā€¢ President of UGISS (Italian SQL Server UG)
ā€¢ Mentor @ SolidQ
ā€¢ Twitter: mauridb
ā€¢ Blog: http://sqlblog.com/blogs/davide_mauri
Agenda
ā€¢ The problem
ā€¢ The possible Ā«classicĀ» solutions
ā€¢ Limits of Ā«classicĀ» solutions
ā€¢ A temporal approach
ā€¢ Implementing the solution
ā€¢ Technical & Functional Challenges
     (and their resolution ļŠ)
ā€¢ Conclusions
The Problem
The request
ā€¢ Our customer (an insurance company)
  needed a BI solution that would allow
  them to do some ā€œsimpleā€ things:
  ā€“ Analyze the value/situation of all insurances
    at a specific point in time
  ā€“ Analyze all the data from 1970 onwards
  ā€“ Analyze data on daily basis
    ā€¢ Absolutely no weekly or monthly aggregation
The environment
ā€¢ On average they have 3.000.000
  documents related to insurances that need
  to be stored.
  ā€“ Each day.
ā€¢ Data is mainly stored in a DB2 mainframe
  ā€“ Data is (sometimes) stored in a ā€œtemporalā€
    fashion
    ā€¢ For each document a new row is created each
      time something changes
    ā€¢ Each row has ā€œvalid_fromā€ and ā€œvalid_toā€ columns
Letā€Ÿs do the math
ā€¢ To keep the daily snapshot
  ā€“ of 3.000.000 documents
  ā€“ for 365 days
  ā€“ for 40 years


ā€¢ A fact table of near 44 BILLION rows
  would be needed
The possible solutions
ā€¢ How can the problem be solved?
  ā€“ PDW and/or Brute Force was not an option


ā€¢ Unfortunately none of the three known
  approaches can work here
  ā€“ Transactional Fact Table
  ā€“ Periodic Snapshot Fact Table
  ā€“ Accumulating Snapshot Fact Table


ā€¢ A new approach is needed!
The classic solutions
 And why they cannot be used here
Transaction Fact Table
ā€¢ One row per fact occurring at a certain point
  in time
ā€¢ We donā€Ÿt have facts for every day. For
  example, for the document 123, no changes
  were made on 20 August 2011
  ā€“ Analysis on that date would not show the
    document 123
  ā€“ ā€œLastNonEmptyā€ aggregation cannot help since
    dates donā€Ÿt have children
  ā€“ MDX can come to the rescue here butā€¦
     ā€¢ Very high complexity
     ā€¢ Potential performance problems
Accumulating Snapshot F.T.
ā€¢ One fact row per entity in the fact table
  ā€“ Each row has a lot of date columns that store
    all the changes that happen during the entityā€Ÿs
    lifetime
  ā€“ Each column date represents a point in time
    where something changed
     ā€¢ The number of changes must be known in
       advance
ā€¢ With so many date columns it can be
  difficult to manage the ā€œAnalysis Dateā€
                                                  11
Periodic Snapshot Fact Table
ā€¢ The fact table holds a Ā«snapshotĀ» of all
  data for a specific point in time
     ā€“ Thatā€Ÿs the perfect solution, since doing a
       snapshot each day would completely solve
       the problem
     ā€“ Unfortunately there is just too much data
     ā€“ We need to keep the idea but reduce the
       amount of data



BIA-406-S | Temporal Snapshot Fact Table            12
A temporal approach
 And how it can be applied to BI
Temporal Data
ā€¢ The snapshot fact table is a good starting
  point
  ā€“ Keeping a snapshot of a document for each
    day is just a waste of space
  ā€“ And also a big performance killer
ā€¢ Changes to document donā€Ÿt happen too
  often
  ā€“ Fact tables should be temporal in order to
    avoid data duplication
  ā€“ Ideally, each fact row needs a ā€œvalid_fromā€
    and ā€œvalid_toā€ column
Temporal Data
ā€¢ With Temporal Snapshot Fact Table
  (TSFT) each row represents a fact that
  occurred during a time interval not at a
  point in time.
ā€¢ Time Intervals will be right-opened in order
  to simplify calculations:
            [valid_from, valid_to)
      Which means:
            valid_from <= x < valid_to
Temporal Data
ā€¢ Now that the idea is set, we have to solve
  several problems:
  ā€“ Functional Challenges
     ā€¢ SSAS doesnā€Ÿt support the concept of intervals, nor
       the between filter
     ā€¢ The user just wants to define the Analysis Date,
       and doesnā€Ÿt want to deal with ranges
  ā€“ Technical Challenges
     ā€¢ Source data may come from several tables and
       has to be consolidated in a small number of TSFT
     ā€¢ The concept of interval changes the rules of the
       ā€œjoinā€ game a little bitā€¦
Technical Challenges
ā€¢ Source data may come from table with
  different designs:
      temporal tables
      non-temporal tables
ā€¢ Temporal Tables
  ā€“ Ranges from two different temporal tables
    may overlap
ā€¢ Non-temporal tables
  ā€“ There may be some business rules that
    require us to Ā«breakĀ» the existing ranges
Technical Challenges
ā€¢ Before starting to solve the problem, letā€Ÿs
  generalize it a bit (in order to be able to study
  it theoretically)
ā€¢ Imagine a company with this environment
  ā€“ There are Items & Sub Items
  ā€“ Items have 1 or more Sub Items
     ā€¢ Each Sub Item has a value in money
     ā€¢ Value may change during its lifetime
  ā€“ Customers have to pay an amount equal to the
    sum of all Sub Items of an Item
     ā€¢ Customers may pay when they want during the year,
       as long as the total amount is paid by the end of the
       year
Data Diving
Technical Challenges
ā€¢ When joining data, keys are no longer
  sufficient
ā€¢ Time Intervals must also be managed
  correctly, otherwise incorrect results may
  be created
ā€¢ Luckily time operators are well-known (but
  not yet implemented)
  ā€“ overlaps, contains, meets, ecc. ecc.
Technical Challenges
ā€¢ CONTAINS (or DURING) Operator:

             b1                  e1

                       i1
                  b2        e2

                       i2

ā€¢ Means:
           b1<=b2 AND e1>=e2
Technical Challenges
ā€¢ OVERLAPS Operator:
           b1             e1



                i1
                     b2             e2



                               i2


ā€¢ Means:
            b1<=e2 AND b2 <=e1
ā€¢ Since weā€Ÿre using right-opened intervals, a
  little change is needed here:
            b1<e2 AND b2 <e1
Technical Challenges
ā€¢ Before going any further itā€Ÿs best we start
  to Ā«seeĀ» the data weā€Ÿre going to use


                      2008                                                  2009                                                           2010

                     Nov       Dec   Jan   Feb       Mar       Apr   May   Jun   Jul       Aug   Sep       Oct       Nov   Dec       Jan       Feb       Mar

 Item 1                    A                                                           B                                                                 C

          Item 1.1         A               B                         C                                 D                                   B             C

          Item 1.2         A                               B                               C                     D               E                   C

 Item 2                        A                 B                               C                               D                         C
          Item 2.1             A                 B                               C                               D                         C
JOINs & Co.
Technical Challenges
ā€¢ As you have seen, even with just three
  tables, the joins are utterly complex!

ā€¢ But bearing this in mind, we can now find
  an easier, logically equivalent, solution.

ā€¢ Drawing a timeline and visualizing the
  result helps a lot here.
Technical Challenges


  Range for one Item


  Range for the Sub Item


  Payment Dates                           Intervals will represent a
So, on the other way round,
                                             time where nothing
 dates represent a time in
                                                  changed
which something changed!



  Ā«SummarizedĀ» Status
                              1   2   3     4          5       6
Technical Challenges
ā€¢ In simpler words we need to find the
  minimum set of time intervals for each
  entity
  ā€“ The concept of "granularity" must be changed
    in order to take into account the idea of time
    intervals
  ā€“ All fact tables, for each stored entity, must
    have the same granularity for the time interval
     ā€¢ interval granularity must be the same ā€œhorizontallyā€
       (across tables)
     ā€¢ interval granularity must be specific for each entity
       (across rows: ā€œverticallyā€)
Technical Challenges
ā€¢ The problem can be solved by
  Ā«refactoringĀ» data.
ā€¢ Just like refactoring code, but working on
  data
  ā€“ Rephrased idea: ā€œdisciplined technique for
    restructuring existing temporal data, altering
    its internal structure without changing its
    valueā€
     ā€¢ http://en.wikipedia.org/wiki/Code_refactoring
Technical Challenges
ā€¢ Hereā€Ÿs how:
ā€¢ Step 1
  ā€“ For each item, for
    each temporal table,
    union all the distinct
    ranges
Technical Challenges
ā€¢ Step 2
  ā€“ For each item, take all the
    distinct:
  ā€“ Dates from the previous step
  ā€“ Dates that are known to
    Ā«breakĀ» a range
  ā€“ The first day of each year
    used in ranges
Technical Challenges
ā€¢ Step 3
  ā€“ For each item, create
    the new ranges, doing
    the following steps
     ā€¢ Order the rows by date
     ā€¢ Take the Ā«kĀ» row and
       the Ā«k+1Ā» row
     ā€¢ Create the range
            [Datek, Datek+1)

  The New LEAD operator is exactly
  what we need! ļŠ
Technical Challenges
ā€¢ When the time granularity is the day, the
  right-opened range helps a lot here.
  ā€“ Eg: letā€Ÿs say the at some point, for an entity,
    you have the following dates that has to be
    turned into ranges:
Technical Challenges
ā€¢ With a closed range intervals can be
  ambiguous, especially single day intervals:




  ā€“ The meaning is not explicitly clear and we
    would have to take care of it, making the ETL
    phase more complex.
  ā€“ Itā€Ÿs better to make everything explicit
Technical Challenges
ā€¢ So, to avoid ambiguity, with a closed range
  the resulting set of intervals also needs to
  take time into account:




  ā€“ What granularity to use for time?
     ā€¢ Hour? Minute? Microseconds?
  ā€“ Is just a waste of space
  ā€“ Date or Int datatype cannot be used
Technical Challenges
ā€¢ With a right-opened interval everything is
  easier:




  ā€“ Date or Int datatype are enough
  ā€“ No time used, so no need to deal with it
Technical Challenges
ā€¢ Step 4
  ā€“ Ā«ExplodeĀ» all the temporal source tables so
    that all data will use the same (new) ranges,
    using the CONTAINS operator:
Technical Challenges
ā€¢ The original Range:


Becomes the equivalent
set of ranges:



If we would have chosen a daily
snapshot, 425 rows would have been
generatedā€¦not only 12!
(Thatā€Ÿs a 35:1 ratio!)
Technical Challenges
ā€¢ Now we have all the source data
  conforming to a common set of intervals
  ā€“ This means that we can just join tables
    without having to do complex temporal joins


ā€¢ All fact tables will also have the same
  interval granularity

ā€¢ We solved all the technical problems! ļŠ
SSIS In Action
Functional Challenges
ā€¢ SSAS doesnā€Ÿt support time intervals butā€¦

ā€¢ Using some imagination we can say that
  ā€“ An interval is made of 1 to ā€œnā€ dates
  ā€“ A date belongs to 1 to ā€œnā€ intervals


ā€¢ We can model this situation with a Many-
  To-Many relationship!
Functional Challenges
ā€¢ The Ā«interval dimensionĀ» will then be
  hidden and the user will just see the
  Ā«Analysis DateĀ» dimension

ā€¢ When such date is selected, all the fact
  rows that have an interval containing that
  date will be selected too, giving us the
  solution weā€Ÿre looking for ļŠ
Functional Challenges

   Date      The user wants
 Dimension   to analyze data
             on 10 Aug 2009



                                                   Fact Table
    Factless                   DateRange
Date-DateRange                 Dimension

                                SSAS uses all
        10 Aug is               the found ranges
        contained in two                            All the (three)
        Ranges                                      rows related to
                                                    those ranges are
                                                    read
All Togheter!
The final recap
 And a quick recipe to make everything easier
Steps for the DWH
ā€¢ For each entity:
  ā€“ Get all ā€œvalid_fromā€ and ā€œvalid_toā€ columns
    from all tables
  ā€“ Get all the dates that will break intervals from
    all tables
  ā€“ Take all the gathered dates one time (remove
    duplicates) and generate new intervals
  ā€“ Ā«UnpackĀ» original tables in order to use the
    newly generated intervals
Steps for the CUBE
ā€¢ Generate the usual Date dimension
  ā€“ Generate a Ā«Date RangeĀ» dimension
  ā€“ Generate a Factless Date-DataRange table
  ā€“ Generate the fact table with reference to the
    Date Range dimension
  ā€“ Create the M:N relationship
BIA-406-S | Temporal Snapshot Fact Table   47
Conclusion and Improvements
 Some ideas for the future
Conclusion and improvements
ā€¢ The defined pattern can be applied each time
  a daily analysis is needed and data is not
  additive
  ā€“ So each time you need to ā€œsnapshotā€ something

ā€¢ The approach can also be used if source
  data is not temporal, as long as you can turn
  it into a temporal format

ā€¢ Performance can be improved by
  partitioning the cube by year or even by
  month
Conclusion and improvements
ā€¢ At the end is a very simple solution ļŠ

ā€¢ Everything youā€Ÿve seen is the summary of
  several months of work of the Italian
  Team. Thanks guys!
Questions?
Thanks!




Ā© 2012 SolidQ   52

More Related Content

What's hot

How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...ScyllaDB
Ā 
High Performance and Scalability Database Design
High Performance and Scalability Database DesignHigh Performance and Scalability Database Design
High Performance and Scalability Database DesignTung Ns
Ā 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
Ā 
Time-Series Apache HBase
Time-Series Apache HBaseTime-Series Apache HBase
Time-Series Apache HBaseHBaseCon
Ā 
Build Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks StreamingBuild Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks StreamingDatabricks
Ā 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation Brett VanderPlaats
Ā 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQLEDB
Ā 
Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!
Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!
Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!Memoori
Ā 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Philip Fisher-Ogden
Ā 
Flink Forward San Francisco 2019: Building Financial Identity Platform using ...
Flink Forward San Francisco 2019: Building Financial Identity Platform using ...Flink Forward San Francisco 2019: Building Financial Identity Platform using ...
Flink Forward San Francisco 2019: Building Financial Identity Platform using ...Flink Forward
Ā 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Amazon Web Services
Ā 
RedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel HochmanRedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel HochmanRedis Labs
Ā 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL DatabaseHeman Hosainpana
Ā 
OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...
OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...
OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...Altinity Ltd
Ā 
The Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceThe Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceAbdelmonaim Remani
Ā 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...Mihai Criveti
Ā 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
Ā 
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...confluent
Ā 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
Ā 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB ClusterMongoDB
Ā 

What's hot (20)

How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...
Ā 
High Performance and Scalability Database Design
High Performance and Scalability Database DesignHigh Performance and Scalability Database Design
High Performance and Scalability Database Design
Ā 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
Ā 
Time-Series Apache HBase
Time-Series Apache HBaseTime-Series Apache HBase
Time-Series Apache HBase
Ā 
Build Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks StreamingBuild Real-Time Applications with Databricks Streaming
Build Real-Time Applications with Databricks Streaming
Ā 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation
Ā 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Ā 
Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!
Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!
Tagging with AI to Enable Data-Driven Smart Building Applications at Scale!
Ā 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014
Ā 
Flink Forward San Francisco 2019: Building Financial Identity Platform using ...
Flink Forward San Francisco 2019: Building Financial Identity Platform using ...Flink Forward San Francisco 2019: Building Financial Identity Platform using ...
Flink Forward San Francisco 2019: Building Financial Identity Platform using ...
Ā 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Ā 
RedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel HochmanRedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
Ā 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
Ā 
OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...
OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...
OSA Con 2022 - Apache Iceberg_ An Architectural Look Under the Covers - Alex ...
Ā 
The Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceThe Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot Persistence
Ā 
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
Ā 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
Ā 
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...
From Postgres to Event-Driven: using docker-compose to build CDC pipelines in...
Ā 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
Ā 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB Cluster
Ā 

Viewers also liked

Data modeling facts
Data modeling factsData modeling facts
Data modeling factsDr. Dipti Patil
Ā 
SQL Server 2016 Temporal Tables
SQL Server 2016 Temporal TablesSQL Server 2016 Temporal Tables
SQL Server 2016 Temporal TablesDavide Mauri
Ā 
SQL Server 2016 What's New For Developers
SQL Server 2016  What's New For DevelopersSQL Server 2016  What's New For Developers
SQL Server 2016 What's New For DevelopersDavide Mauri
Ā 
A 100 dal genocidio georges ruyssen civiltĆ  cattolica
A 100 dal genocidio georges ruyssen   civiltĆ  cattolicaA 100 dal genocidio georges ruyssen   civiltĆ  cattolica
A 100 dal genocidio georges ruyssen civiltĆ  cattolicaFrancesco Occhetta
Ā 
Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)Davide Mauri
Ā 
Iris Multi-Class Classifier with Azure ML
Iris Multi-Class Classifier with Azure MLIris Multi-Class Classifier with Azure ML
Iris Multi-Class Classifier with Azure MLDavide Mauri
Ā 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine LearningDavide Mauri
Ā 
Data warehousing
Data warehousingData warehousing
Data warehousingAllen Woods
Ā 
Data flow ii extract
Data flow   ii extractData flow   ii extract
Data flow ii extractDr. Dipti Patil
Ā 
Real Time Power BI
Real Time Power BIReal Time Power BI
Real Time Power BIDavide Mauri
Ā 
AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)Davide Mauri
Ā 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveDavide Mauri
Ā 
Data modeling dimensions
Data modeling dimensionsData modeling dimensions
Data modeling dimensionsDr. Dipti Patil
Ā 
Agile Data Warehousing
Agile Data WarehousingAgile Data Warehousing
Agile Data WarehousingDavide Mauri
Ā 
Research methodology
Research methodologyResearch methodology
Research methodologyDr. Dipti Patil
Ā 
Harnessing the Hadoop Ecosystem Optimizations in Apache Hive
Harnessing the Hadoop Ecosystem Optimizations in Apache HiveHarnessing the Hadoop Ecosystem Optimizations in Apache Hive
Harnessing the Hadoop Ecosystem Optimizations in Apache HiveQubole
Ā 
Lean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultLean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultDaniel Upton
Ā 
Dashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDavide Mauri
Ā 
Azure Machine Learning (Italian)
Azure Machine Learning (Italian)Azure Machine Learning (Italian)
Azure Machine Learning (Italian)Davide Mauri
Ā 
Query Store and live Query Statistics
Query Store and live Query StatisticsQuery Store and live Query Statistics
Query Store and live Query StatisticsSolidQ
Ā 

Viewers also liked (20)

Data modeling facts
Data modeling factsData modeling facts
Data modeling facts
Ā 
SQL Server 2016 Temporal Tables
SQL Server 2016 Temporal TablesSQL Server 2016 Temporal Tables
SQL Server 2016 Temporal Tables
Ā 
SQL Server 2016 What's New For Developers
SQL Server 2016  What's New For DevelopersSQL Server 2016  What's New For Developers
SQL Server 2016 What's New For Developers
Ā 
A 100 dal genocidio georges ruyssen civiltĆ  cattolica
A 100 dal genocidio georges ruyssen   civiltĆ  cattolicaA 100 dal genocidio georges ruyssen   civiltĆ  cattolica
A 100 dal genocidio georges ruyssen civiltĆ  cattolica
Ā 
Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)
Ā 
Iris Multi-Class Classifier with Azure ML
Iris Multi-Class Classifier with Azure MLIris Multi-Class Classifier with Azure ML
Iris Multi-Class Classifier with Azure ML
Ā 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine Learning
Ā 
Data warehousing
Data warehousingData warehousing
Data warehousing
Ā 
Data flow ii extract
Data flow   ii extractData flow   ii extract
Data flow ii extract
Ā 
Real Time Power BI
Real Time Power BIReal Time Power BI
Real Time Power BI
Ā 
AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)
Ā 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep Dive
Ā 
Data modeling dimensions
Data modeling dimensionsData modeling dimensions
Data modeling dimensions
Ā 
Agile Data Warehousing
Agile Data WarehousingAgile Data Warehousing
Agile Data Warehousing
Ā 
Research methodology
Research methodologyResearch methodology
Research methodology
Ā 
Harnessing the Hadoop Ecosystem Optimizations in Apache Hive
Harnessing the Hadoop Ecosystem Optimizations in Apache HiveHarnessing the Hadoop Ecosystem Optimizations in Apache Hive
Harnessing the Hadoop Ecosystem Optimizations in Apache Hive
Ā 
Lean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultLean Data Warehouse via Data Vault
Lean Data Warehouse via Data Vault
Ā 
Dashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BI
Ā 
Azure Machine Learning (Italian)
Azure Machine Learning (Italian)Azure Machine Learning (Italian)
Azure Machine Learning (Italian)
Ā 
Query Store and live Query Statistics
Query Store and live Query StatisticsQuery Store and live Query Statistics
Query Store and live Query Statistics
Ā 

Similar to Temporal Snapshot Fact Tables

SR&ED for Agile Startups
SR&ED for Agile StartupsSR&ED for Agile Startups
SR&ED for Agile StartupsEd Levinson
Ā 
data structure
data structuredata structure
data structureSakshi Jain
Ā 
BigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and futureBigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and futureNir Rubinstein
Ā 
Biug 20112026 dimensional modeling and mdx best practices
Biug 20112026   dimensional modeling and mdx best practicesBiug 20112026   dimensional modeling and mdx best practices
Biug 20112026 dimensional modeling and mdx best practicesItay Braun
Ā 
Case Study Real Time Olap Cubes
Case Study Real Time Olap CubesCase Study Real Time Olap Cubes
Case Study Real Time Olap Cubesmister_zed
Ā 
VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...
VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...
VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...vtunotesbysree
Ā 
Modeling, It's Not Just For Calendars and Energy
Modeling, It's Not Just For Calendars and EnergyModeling, It's Not Just For Calendars and Energy
Modeling, It's Not Just For Calendars and EnergyIllinois ASHRAE
Ā 
Software Project Management (lecture 4)
Software Project Management (lecture 4)Software Project Management (lecture 4)
Software Project Management (lecture 4)Syed Muhammad Hammad
Ā 
Software Project Management lecture 9
Software Project Management lecture 9Software Project Management lecture 9
Software Project Management lecture 9Syed Muhammad Hammad
Ā 
Oracle R12 Upgrade Lessons Learned
Oracle R12 Upgrade Lessons LearnedOracle R12 Upgrade Lessons Learned
Oracle R12 Upgrade Lessons Learnedbpellot
Ā 
Soft Landings Conference Case Study 2
Soft Landings Conference Case Study 2Soft Landings Conference Case Study 2
Soft Landings Conference Case Study 2BSRIA
Ā 
Advanced Lean Training Manual Toolkit.ppt
Advanced Lean Training Manual Toolkit.pptAdvanced Lean Training Manual Toolkit.ppt
Advanced Lean Training Manual Toolkit.pptThinL389917
Ā 
Metrics-Based Process Mapping
Metrics-Based Process MappingMetrics-Based Process Mapping
Metrics-Based Process MappingTKMG, Inc.
Ā 
Look into communication
Look into communicationLook into communication
Look into communicationAn Le
Ā 
Asufe juniors-training session2
Asufe juniors-training session2Asufe juniors-training session2
Asufe juniors-training session2Omar Ahmed
Ā 
From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...
From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...
From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...Tathagat Varma
Ā 
Performant Django - Ara Anjargolian
Performant Django - Ara AnjargolianPerformant Django - Ara Anjargolian
Performant Django - Ara AnjargolianHakka Labs
Ā 

Similar to Temporal Snapshot Fact Tables (20)

SR&ED for Agile Startups
SR&ED for Agile StartupsSR&ED for Agile Startups
SR&ED for Agile Startups
Ā 
data structure
data structuredata structure
data structure
Ā 
BigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and futureBigQuery at AppsFlyer - past, present and future
BigQuery at AppsFlyer - past, present and future
Ā 
Biug 20112026 dimensional modeling and mdx best practices
Biug 20112026   dimensional modeling and mdx best practicesBiug 20112026   dimensional modeling and mdx best practices
Biug 20112026 dimensional modeling and mdx best practices
Ā 
Case Study Real Time Olap Cubes
Case Study Real Time Olap CubesCase Study Real Time Olap Cubes
Case Study Real Time Olap Cubes
Ā 
Project Scheduling
Project SchedulingProject Scheduling
Project Scheduling
Ā 
VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...
VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...
VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN...
Ā 
Modeling, It's Not Just For Calendars and Energy
Modeling, It's Not Just For Calendars and EnergyModeling, It's Not Just For Calendars and Energy
Modeling, It's Not Just For Calendars and Energy
Ā 
Software Project Management (lecture 4)
Software Project Management (lecture 4)Software Project Management (lecture 4)
Software Project Management (lecture 4)
Ā 
Software Project Management lecture 9
Software Project Management lecture 9Software Project Management lecture 9
Software Project Management lecture 9
Ā 
Oracle R12 Upgrade Lessons Learned
Oracle R12 Upgrade Lessons LearnedOracle R12 Upgrade Lessons Learned
Oracle R12 Upgrade Lessons Learned
Ā 
Soft Landings Conference Case Study 2
Soft Landings Conference Case Study 2Soft Landings Conference Case Study 2
Soft Landings Conference Case Study 2
Ā 
Advanced Lean Training Manual Toolkit.ppt
Advanced Lean Training Manual Toolkit.pptAdvanced Lean Training Manual Toolkit.ppt
Advanced Lean Training Manual Toolkit.ppt
Ā 
Metrics-Based Process Mapping
Metrics-Based Process MappingMetrics-Based Process Mapping
Metrics-Based Process Mapping
Ā 
Oracle Primavera P6 PRO R8 Tips & tricks
Oracle Primavera P6 PRO R8 Tips & tricksOracle Primavera P6 PRO R8 Tips & tricks
Oracle Primavera P6 PRO R8 Tips & tricks
Ā 
Look into communication
Look into communicationLook into communication
Look into communication
Ā 
Asufe juniors-training session2
Asufe juniors-training session2Asufe juniors-training session2
Asufe juniors-training session2
Ā 
From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...
From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...
From Waterfall to Weekly Releases: A Case Study in using Evo and Kanban (2004...
Ā 
Week3 to-design
Week3 to-designWeek3 to-design
Week3 to-design
Ā 
Performant Django - Ara Anjargolian
Performant Django - Ara AnjargolianPerformant Django - Ara Anjargolian
Performant Django - Ara Anjargolian
Ā 

More from Davide Mauri

Azure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstartAzure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstartDavide Mauri
Ā 
Agile Data Warehousing
Agile Data WarehousingAgile Data Warehousing
Agile Data WarehousingDavide Mauri
Ā 
Dapper: the microORM that will change your life
Dapper: the microORM that will change your lifeDapper: the microORM that will change your life
Dapper: the microORM that will change your lifeDavide Mauri
Ā 
When indexes are not enough
When indexes are not enoughWhen indexes are not enough
When indexes are not enoughDavide Mauri
Ā 
Building a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with AzureBuilding a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with AzureDavide Mauri
Ā 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveDavide Mauri
Ā 
Azure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSONAzure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSONDavide Mauri
Ā 
SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2Davide Mauri
Ā 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream AnalyticsDavide Mauri
Ā 
Azure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applicationsAzure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applicationsDavide Mauri
Ā 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsDavide Mauri
Ā 
SQL Server 2016 JSON
SQL Server 2016 JSONSQL Server 2016 JSON
SQL Server 2016 JSONDavide Mauri
Ā 
Back to the roots - SQL Server Indexing
Back to the roots - SQL Server IndexingBack to the roots - SQL Server Indexing
Back to the roots - SQL Server IndexingDavide Mauri
Ā 
Schema less table & dynamic schema
Schema less table & dynamic schemaSchema less table & dynamic schema
Schema less table & dynamic schemaDavide Mauri
Ā 
BIML: BI to the next level
BIML: BI to the next levelBIML: BI to the next level
BIML: BI to the next levelDavide Mauri
Ā 
Data Science Overview
Data Science OverviewData Science Overview
Data Science OverviewDavide Mauri
Ā 
Delayed durability
Delayed durabilityDelayed durability
Delayed durabilityDavide Mauri
Ā 
Hekaton: In-memory tables
Hekaton: In-memory tablesHekaton: In-memory tables
Hekaton: In-memory tablesDavide Mauri
Ā 

More from Davide Mauri (19)

Azure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstartAzure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstart
Ā 
Agile Data Warehousing
Agile Data WarehousingAgile Data Warehousing
Agile Data Warehousing
Ā 
Dapper: the microORM that will change your life
Dapper: the microORM that will change your lifeDapper: the microORM that will change your life
Dapper: the microORM that will change your life
Ā 
When indexes are not enough
When indexes are not enoughWhen indexes are not enough
When indexes are not enough
Ā 
Building a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with AzureBuilding a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with Azure
Ā 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep Dive
Ā 
Azure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSONAzure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSON
Ā 
SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2
Ā 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream Analytics
Ā 
Azure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applicationsAzure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applications
Ā 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream Analytics
Ā 
SQL Server 2016 JSON
SQL Server 2016 JSONSQL Server 2016 JSON
SQL Server 2016 JSON
Ā 
Back to the roots - SQL Server Indexing
Back to the roots - SQL Server IndexingBack to the roots - SQL Server Indexing
Back to the roots - SQL Server Indexing
Ā 
Schema less table & dynamic schema
Schema less table & dynamic schemaSchema less table & dynamic schema
Schema less table & dynamic schema
Ā 
BIML: BI to the next level
BIML: BI to the next levelBIML: BI to the next level
BIML: BI to the next level
Ā 
Data juice
Data juiceData juice
Data juice
Ā 
Data Science Overview
Data Science OverviewData Science Overview
Data Science Overview
Ā 
Delayed durability
Delayed durabilityDelayed durability
Delayed durability
Ā 
Hekaton: In-memory tables
Hekaton: In-memory tablesHekaton: In-memory tables
Hekaton: In-memory tables
Ā 

Recently uploaded

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
Ā 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
Ā 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
Ā 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
Ā 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
Ā 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
Ā 
šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜RTylerCroy
Ā 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
Ā 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
Ā 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
Ā 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
Ā 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
Ā 
Scaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationScaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationRadu Cotescu
Ā 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
Ā 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
Ā 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
Ā 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
Ā 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
Ā 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel AraĆŗjo
Ā 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
Ā 

Recently uploaded (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Ā 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Ā 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
Ā 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Ā 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
Ā 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
Ā 
šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜
Ā 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
Ā 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Ā 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
Ā 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
Ā 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Ā 
Scaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationScaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organization
Ā 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
Ā 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
Ā 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
Ā 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
Ā 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
Ā 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Ā 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
Ā 

Temporal Snapshot Fact Tables

  • 1. SQL Bits X Temporal Snapshot Fact Table A new approach to data snapshots Davide Mauri dmauri@solidq.com Mentor - SolidQ
  • 2. EXEC sp_help ā€žDavide Mauriā€Ÿ ā€¢ Microsoft SQL Server MVP ā€¢ Works with SQL Server from 6.5, on BI from 2003 ā€¢ Specialized in Data Solution Architecture, Database Design, Performance Tuning, BI ā€¢ President of UGISS (Italian SQL Server UG) ā€¢ Mentor @ SolidQ ā€¢ Twitter: mauridb ā€¢ Blog: http://sqlblog.com/blogs/davide_mauri
  • 3. Agenda ā€¢ The problem ā€¢ The possible Ā«classicĀ» solutions ā€¢ Limits of Ā«classicĀ» solutions ā€¢ A temporal approach ā€¢ Implementing the solution ā€¢ Technical & Functional Challenges (and their resolution ļŠ) ā€¢ Conclusions
  • 5. The request ā€¢ Our customer (an insurance company) needed a BI solution that would allow them to do some ā€œsimpleā€ things: ā€“ Analyze the value/situation of all insurances at a specific point in time ā€“ Analyze all the data from 1970 onwards ā€“ Analyze data on daily basis ā€¢ Absolutely no weekly or monthly aggregation
  • 6. The environment ā€¢ On average they have 3.000.000 documents related to insurances that need to be stored. ā€“ Each day. ā€¢ Data is mainly stored in a DB2 mainframe ā€“ Data is (sometimes) stored in a ā€œtemporalā€ fashion ā€¢ For each document a new row is created each time something changes ā€¢ Each row has ā€œvalid_fromā€ and ā€œvalid_toā€ columns
  • 7. Letā€Ÿs do the math ā€¢ To keep the daily snapshot ā€“ of 3.000.000 documents ā€“ for 365 days ā€“ for 40 years ā€¢ A fact table of near 44 BILLION rows would be needed
  • 8. The possible solutions ā€¢ How can the problem be solved? ā€“ PDW and/or Brute Force was not an option ā€¢ Unfortunately none of the three known approaches can work here ā€“ Transactional Fact Table ā€“ Periodic Snapshot Fact Table ā€“ Accumulating Snapshot Fact Table ā€¢ A new approach is needed!
  • 9. The classic solutions And why they cannot be used here
  • 10. Transaction Fact Table ā€¢ One row per fact occurring at a certain point in time ā€¢ We donā€Ÿt have facts for every day. For example, for the document 123, no changes were made on 20 August 2011 ā€“ Analysis on that date would not show the document 123 ā€“ ā€œLastNonEmptyā€ aggregation cannot help since dates donā€Ÿt have children ā€“ MDX can come to the rescue here butā€¦ ā€¢ Very high complexity ā€¢ Potential performance problems
  • 11. Accumulating Snapshot F.T. ā€¢ One fact row per entity in the fact table ā€“ Each row has a lot of date columns that store all the changes that happen during the entityā€Ÿs lifetime ā€“ Each column date represents a point in time where something changed ā€¢ The number of changes must be known in advance ā€¢ With so many date columns it can be difficult to manage the ā€œAnalysis Dateā€ 11
  • 12. Periodic Snapshot Fact Table ā€¢ The fact table holds a Ā«snapshotĀ» of all data for a specific point in time ā€“ Thatā€Ÿs the perfect solution, since doing a snapshot each day would completely solve the problem ā€“ Unfortunately there is just too much data ā€“ We need to keep the idea but reduce the amount of data BIA-406-S | Temporal Snapshot Fact Table 12
  • 13. A temporal approach And how it can be applied to BI
  • 14. Temporal Data ā€¢ The snapshot fact table is a good starting point ā€“ Keeping a snapshot of a document for each day is just a waste of space ā€“ And also a big performance killer ā€¢ Changes to document donā€Ÿt happen too often ā€“ Fact tables should be temporal in order to avoid data duplication ā€“ Ideally, each fact row needs a ā€œvalid_fromā€ and ā€œvalid_toā€ column
  • 15. Temporal Data ā€¢ With Temporal Snapshot Fact Table (TSFT) each row represents a fact that occurred during a time interval not at a point in time. ā€¢ Time Intervals will be right-opened in order to simplify calculations: [valid_from, valid_to) Which means: valid_from <= x < valid_to
  • 16. Temporal Data ā€¢ Now that the idea is set, we have to solve several problems: ā€“ Functional Challenges ā€¢ SSAS doesnā€Ÿt support the concept of intervals, nor the between filter ā€¢ The user just wants to define the Analysis Date, and doesnā€Ÿt want to deal with ranges ā€“ Technical Challenges ā€¢ Source data may come from several tables and has to be consolidated in a small number of TSFT ā€¢ The concept of interval changes the rules of the ā€œjoinā€ game a little bitā€¦
  • 17. Technical Challenges ā€¢ Source data may come from table with different designs: temporal tables non-temporal tables ā€¢ Temporal Tables ā€“ Ranges from two different temporal tables may overlap ā€¢ Non-temporal tables ā€“ There may be some business rules that require us to Ā«breakĀ» the existing ranges
  • 18. Technical Challenges ā€¢ Before starting to solve the problem, letā€Ÿs generalize it a bit (in order to be able to study it theoretically) ā€¢ Imagine a company with this environment ā€“ There are Items & Sub Items ā€“ Items have 1 or more Sub Items ā€¢ Each Sub Item has a value in money ā€¢ Value may change during its lifetime ā€“ Customers have to pay an amount equal to the sum of all Sub Items of an Item ā€¢ Customers may pay when they want during the year, as long as the total amount is paid by the end of the year
  • 20. Technical Challenges ā€¢ When joining data, keys are no longer sufficient ā€¢ Time Intervals must also be managed correctly, otherwise incorrect results may be created ā€¢ Luckily time operators are well-known (but not yet implemented) ā€“ overlaps, contains, meets, ecc. ecc.
  • 21. Technical Challenges ā€¢ CONTAINS (or DURING) Operator: b1 e1 i1 b2 e2 i2 ā€¢ Means: b1<=b2 AND e1>=e2
  • 22. Technical Challenges ā€¢ OVERLAPS Operator: b1 e1 i1 b2 e2 i2 ā€¢ Means: b1<=e2 AND b2 <=e1 ā€¢ Since weā€Ÿre using right-opened intervals, a little change is needed here: b1<e2 AND b2 <e1
  • 23. Technical Challenges ā€¢ Before going any further itā€Ÿs best we start to Ā«seeĀ» the data weā€Ÿre going to use 2008 2009 2010 Nov Dec Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Jan Feb Mar Item 1 A B C Item 1.1 A B C D B C Item 1.2 A B C D E C Item 2 A B C D C Item 2.1 A B C D C
  • 25. Technical Challenges ā€¢ As you have seen, even with just three tables, the joins are utterly complex! ā€¢ But bearing this in mind, we can now find an easier, logically equivalent, solution. ā€¢ Drawing a timeline and visualizing the result helps a lot here.
  • 26. Technical Challenges Range for one Item Range for the Sub Item Payment Dates Intervals will represent a So, on the other way round, time where nothing dates represent a time in changed which something changed! Ā«SummarizedĀ» Status 1 2 3 4 5 6
  • 27. Technical Challenges ā€¢ In simpler words we need to find the minimum set of time intervals for each entity ā€“ The concept of "granularity" must be changed in order to take into account the idea of time intervals ā€“ All fact tables, for each stored entity, must have the same granularity for the time interval ā€¢ interval granularity must be the same ā€œhorizontallyā€ (across tables) ā€¢ interval granularity must be specific for each entity (across rows: ā€œverticallyā€)
  • 28. Technical Challenges ā€¢ The problem can be solved by Ā«refactoringĀ» data. ā€¢ Just like refactoring code, but working on data ā€“ Rephrased idea: ā€œdisciplined technique for restructuring existing temporal data, altering its internal structure without changing its valueā€ ā€¢ http://en.wikipedia.org/wiki/Code_refactoring
  • 29. Technical Challenges ā€¢ Hereā€Ÿs how: ā€¢ Step 1 ā€“ For each item, for each temporal table, union all the distinct ranges
  • 30. Technical Challenges ā€¢ Step 2 ā€“ For each item, take all the distinct: ā€“ Dates from the previous step ā€“ Dates that are known to Ā«breakĀ» a range ā€“ The first day of each year used in ranges
  • 31. Technical Challenges ā€¢ Step 3 ā€“ For each item, create the new ranges, doing the following steps ā€¢ Order the rows by date ā€¢ Take the Ā«kĀ» row and the Ā«k+1Ā» row ā€¢ Create the range [Datek, Datek+1) The New LEAD operator is exactly what we need! ļŠ
  • 32. Technical Challenges ā€¢ When the time granularity is the day, the right-opened range helps a lot here. ā€“ Eg: letā€Ÿs say the at some point, for an entity, you have the following dates that has to be turned into ranges:
  • 33. Technical Challenges ā€¢ With a closed range intervals can be ambiguous, especially single day intervals: ā€“ The meaning is not explicitly clear and we would have to take care of it, making the ETL phase more complex. ā€“ Itā€Ÿs better to make everything explicit
  • 34. Technical Challenges ā€¢ So, to avoid ambiguity, with a closed range the resulting set of intervals also needs to take time into account: ā€“ What granularity to use for time? ā€¢ Hour? Minute? Microseconds? ā€“ Is just a waste of space ā€“ Date or Int datatype cannot be used
  • 35. Technical Challenges ā€¢ With a right-opened interval everything is easier: ā€“ Date or Int datatype are enough ā€“ No time used, so no need to deal with it
  • 36. Technical Challenges ā€¢ Step 4 ā€“ Ā«ExplodeĀ» all the temporal source tables so that all data will use the same (new) ranges, using the CONTAINS operator:
  • 37. Technical Challenges ā€¢ The original Range: Becomes the equivalent set of ranges: If we would have chosen a daily snapshot, 425 rows would have been generatedā€¦not only 12! (Thatā€Ÿs a 35:1 ratio!)
  • 38. Technical Challenges ā€¢ Now we have all the source data conforming to a common set of intervals ā€“ This means that we can just join tables without having to do complex temporal joins ā€¢ All fact tables will also have the same interval granularity ā€¢ We solved all the technical problems! ļŠ
  • 40. Functional Challenges ā€¢ SSAS doesnā€Ÿt support time intervals butā€¦ ā€¢ Using some imagination we can say that ā€“ An interval is made of 1 to ā€œnā€ dates ā€“ A date belongs to 1 to ā€œnā€ intervals ā€¢ We can model this situation with a Many- To-Many relationship!
  • 41. Functional Challenges ā€¢ The Ā«interval dimensionĀ» will then be hidden and the user will just see the Ā«Analysis DateĀ» dimension ā€¢ When such date is selected, all the fact rows that have an interval containing that date will be selected too, giving us the solution weā€Ÿre looking for ļŠ
  • 42. Functional Challenges Date The user wants Dimension to analyze data on 10 Aug 2009 Fact Table Factless DateRange Date-DateRange Dimension SSAS uses all 10 Aug is the found ranges contained in two All the (three) Ranges rows related to those ranges are read
  • 44. The final recap And a quick recipe to make everything easier
  • 45. Steps for the DWH ā€¢ For each entity: ā€“ Get all ā€œvalid_fromā€ and ā€œvalid_toā€ columns from all tables ā€“ Get all the dates that will break intervals from all tables ā€“ Take all the gathered dates one time (remove duplicates) and generate new intervals ā€“ Ā«UnpackĀ» original tables in order to use the newly generated intervals
  • 46. Steps for the CUBE ā€¢ Generate the usual Date dimension ā€“ Generate a Ā«Date RangeĀ» dimension ā€“ Generate a Factless Date-DataRange table ā€“ Generate the fact table with reference to the Date Range dimension ā€“ Create the M:N relationship
  • 47. BIA-406-S | Temporal Snapshot Fact Table 47
  • 48. Conclusion and Improvements Some ideas for the future
  • 49. Conclusion and improvements ā€¢ The defined pattern can be applied each time a daily analysis is needed and data is not additive ā€“ So each time you need to ā€œsnapshotā€ something ā€¢ The approach can also be used if source data is not temporal, as long as you can turn it into a temporal format ā€¢ Performance can be improved by partitioning the cube by year or even by month
  • 50. Conclusion and improvements ā€¢ At the end is a very simple solution ļŠ ā€¢ Everything youā€Ÿve seen is the summary of several months of work of the Italian Team. Thanks guys!

Editor's Notes

  1. FromBol: ā€œThe member value is evaluated as the value of its last child along the time dimension that contains dataā€
  2. http://www.kimballgroup.com/html/designtipsPDF/DesignTips2002/KimballDT37ModelingPipeline.pdf
  3. Show SQL Queries
  4. Show the SSIS solution to explode the tables
  5. Show why the M2Msolutionworks (using T-SQL queries)Show the olapviewsShow the SSAS solution
  6. Review the complete solution from start to end (maybeusing *much* more data?)