This document discusses using Oracle Business Intelligence Enterprise Edition (OBIEE) and the Data Vault data modeling technique to virtualize a business intelligence environment in an agile way. Data Vault provides a flexible and adaptable modeling approach that allows for rapid changes. OBIEE allows for the virtualization of dimensional models built on a Data Vault foundation, enabling quick iteration and delivery of reports and dashboards to users. Together, Data Vault and OBIEE provide an agile approach to business intelligence.
Worst Practices in Data Warehouse Design - Kent Graziano
This presentation was given at OakTable World 2014 (#OTW14) in San Francisco. After many years of designing data warehouses and consulting on data warehouse architectures, I have seen a lot of bad design choices by supposedly experienced professionals. A sense of professionalism, confidentiality agreements, and some sense of common decency have prevented me from calling people out on some of this. No more! In this session I will walk you through a typical bad design like many I have seen. I will show you what I see when I reverse engineer a supposedly complete design, walk through what is wrong with it, and discuss options to correct it. This will be a test of your knowledge of data warehouse best practices: see if you can recognize these worst practices.
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling - Kent Graziano
This is the presentation I gave at OakTable World 2013 in San Francisco. #OTW13 was held at the Children's Creativity Museum next to the Moscone Convention Center and was in parallel with Oracle OpenWorld 2013.
The session discussed our attempts to be more agile in designing enterprise data warehouses and how the Data Vault Data Modeling technique helps in that approach.
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions - Kent Graziano
From a talk I gave at WWDVC and ECO in 2015 about how we built virtual dimensions (views) on a Data Vault-style data warehouse (see Data Warehousing in the Real World for full details on that architecture).
My presentation on the Visual Data Vault modeling language, presented during WWDVC 2014 in St. Albans, VT, USA.
To download the Visio stencils, check out
http://www.doerffler.com/know-how/data-vault/visual-data-vault/
and http://www.visualdatavault.com
Wonder what this data mesh stuff is all about? What are the principles of data mesh? Can you or should you consider data mesh as the approach for your analytics platform? And most importantly: how can Snowflake help?
Given in Montreal on 14-Dec-2021
Agile Methods and Data Warehousing (2016 update) - Kent Graziano
This presentation takes a look at the Agile Manifesto and the 12 Principles of Agile Development and discusses how these apply to Data Warehousing and Business Intelligence projects. Several examples and details from my past experience are included. Includes more details on using Data Vault as well. (I gave this presentation at OUGF14 in Helsinki, Finland and again in 2016 for TDWI Nashville.)
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS - Kent Graziano
(This is the talk I gave at Houston DAMA and Agile Denver BI meetups)
At a past client, in order to meet timelines to fulfill urgent, unmet reporting needs, I found it necessary to build a virtualized Operational Data Store as the first phase of a new Data Vault 2.0 project. This allowed me to deliver new objects quickly and incrementally to the report developer so we could quickly show the business users their data. In order to limit the need for refactoring in later stages of the data warehouse development, I chose to build this virtualization layer on top of a Type 2 persistent staging layer. All of this was done using Oracle SQL Developer Data Modeler (SDDM) against (gasp!) a Microsoft SQL Server database. In this talk I will show you the architecture for this approach and the rationale, and then the tricks I used in SDDM to build all the stage tables and views very quickly. In the end you will see actual SQL code for a virtual ODS that can easily be translated to an Oracle database.
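The pattern described above can be sketched in miniature: a Type 2 persistent staging table keeps every version of each source row, and a view exposes only the current version of each key as the virtual ODS object. This is a hypothetical illustration using SQLite and invented table and column names, not the client's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Type 2 persistent staging table: every load appends a new version row
# rather than updating in place, so full history is preserved.
cur.execute("""
    CREATE TABLE psa_customer (
        customer_id   INTEGER NOT NULL,  -- business key
        load_dts      TEXT    NOT NULL,  -- load timestamp
        customer_name TEXT,
        city          TEXT,
        PRIMARY KEY (customer_id, load_dts)
    )
""")
cur.executemany(
    "INSERT INTO psa_customer VALUES (?, ?, ?, ?)",
    [
        (1, "2015-01-01", "Acme Corp", "Houston"),
        (1, "2015-06-01", "Acme Corp", "Denver"),  # customer moved
        (2, "2015-03-15", "Globex", "Chicago"),
    ],
)

# The virtual ODS object is just a view that exposes the latest version
# of each business key -- no physical table, no extra load job to maintain.
cur.execute("""
    CREATE VIEW ods_customer AS
    SELECT p.customer_id, p.customer_name, p.city
    FROM psa_customer p
    WHERE p.load_dts = (SELECT MAX(p2.load_dts)
                        FROM psa_customer p2
                        WHERE p2.customer_id = p.customer_id)
""")

rows = cur.execute(
    "SELECT * FROM ods_customer ORDER BY customer_id").fetchall()
print(rows)  # one current row per customer
```

Because the ODS layer is only views, later refactoring (say, swapping the staging layer for Raw Vault tables) does not disturb the report developer's objects.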
How to Save Piles of $$$ by Creating the Best Data Model the First Time (Ksc... - Kent Graziano
A good data model, done right the first time, can save you time and money. We have all seen the charts on the increasing cost of finding a mistake/bug/error late in a software development cycle. Would you like to reduce, or even eliminate, your risk of finding one of those errors late in the game? Of course you would! Who wouldn't? Nobody plans to miss a requirement or make a bad design decision (well nobody sane anyway). No data modeler or database designer worth their salt wants to leave a model incomplete or incorrect. So what can you do to minimize the risk?
In this talk I will show you a best practice approach to developing your data models and database designs that I have been using for over 15 years. It is a simple, repeatable process for reviewing your data models. It is one that even a non-modeler could follow. I will share my checklist of what to look for and what to ask the data modeler (or yourself) to make sure you get the best possible data model. As a bonus I will share how I use SQL Developer Data Modeler (a no-cost data modeling tool) to collect the information and report it.
Agile Data Engineering: Introduction to Data Vault 2.0 (2018) - Kent Graziano
(Updated slides used for the North Texas DAMA meetup, Oct 2018.) As we move more and more towards the need for everyone to do Agile Data Warehousing, we need a data modeling method that can be agile with us. Data Vault Data Modeling is an agile data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is a hybrid approach using the best of 3NF and dimensional modeling. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for over 15 years and is now growing in popularity. The purpose of this presentation is to provide attendees with an introduction to the components of the Data Vault Data Model, what they are for, and how to build them. The examples will give attendees the basics:
• What the basic components of a DV model are
• How to build and design structures incrementally, without constant refactoring
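For readers new to the terminology, the basic components of a Data Vault model (hubs for business keys, links for relationships, satellites for descriptive history) can be sketched as toy DDL. This is a minimal illustration with invented table and column names, not material from the deck; real Data Vault 2.0 designs add hash diffs, multiple satellites per hub, and more:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hub: one row per unique business key, nothing else.
cur.execute("""
    CREATE TABLE hub_customer (
        customer_hkey TEXT PRIMARY KEY,      -- surrogate (hash) key
        customer_id   TEXT NOT NULL UNIQUE,  -- business key
        load_dts      TEXT NOT NULL,
        record_source TEXT NOT NULL
    )
""")
cur.execute("""
    CREATE TABLE hub_order (
        order_hkey    TEXT PRIMARY KEY,
        order_id      TEXT NOT NULL UNIQUE,
        load_dts      TEXT NOT NULL,
        record_source TEXT NOT NULL
    )
""")

# Link: records a relationship between hubs; new relationships are
# new rows, so the model extends without refactoring.
cur.execute("""
    CREATE TABLE lnk_customer_order (
        link_hkey     TEXT PRIMARY KEY,
        customer_hkey TEXT NOT NULL REFERENCES hub_customer,
        order_hkey    TEXT NOT NULL REFERENCES hub_order,
        load_dts      TEXT NOT NULL,
        record_source TEXT NOT NULL
    )
""")

# Satellite: descriptive attributes, versioned by load date (history).
cur.execute("""
    CREATE TABLE sat_customer (
        customer_hkey TEXT NOT NULL REFERENCES hub_customer,
        load_dts      TEXT NOT NULL,
        customer_name TEXT,
        city          TEXT,
        record_source TEXT NOT NULL,
        PRIMARY KEY (customer_hkey, load_dts)
    )
""")

tables = [r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

Adding a new source or attribute means adding a new satellite or link table, which is what makes the approach incremental.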
Top Five Cool Features in Oracle SQL Developer Data Modeler - Kent Graziano
This is the presentation I gave at OUGF14 in Helsinki, Finland in June 2014.
Oracle SQL Developer Data Modeler (SDDM) has been around for a few years now and is up to version 4.x. It really is an industrial-strength data modeling tool that can be used for any data modeling task you need to tackle. Over the years I have found quite a few features and utilities in the tool that I rely on to make me more efficient (and agile) in developing my models. This presentation will demonstrate at least five of these features, tips, and tricks for you. I will walk through things like modifying the delivered reporting templates, how to create and apply object naming templates, how to use a table template and transformation script to add audit columns to every table, and how to use the new metadata export tool, plus several other cool things you might not know are there. Since there will likely be patches and new releases before the conference, there is a good chance there will be some new things for me to show you as well. This might be a bit of a whirlwind demo, so get SDDM installed on your device and bring it to the session so you can follow along.
Not to be confused with Oracle Database Vault (a commercial database security product), Data Vault Modeling is a specific data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for the last 10 years but is still not widely known or understood. The purpose of this presentation is to provide attendees with a detailed introduction to the technical components of the Data Vault Data Model, what they are for, and how to build them. The examples will give attendees the basics of how to build and design structures when using the Data Vault modeling technique. The target audience is anyone wishing to explore implementing a Data Vault-style data model for an Enterprise Data Warehouse, Operational Data Warehouse, or Dynamic Data Integration Store. See more content like this by following my blog http://kentgraziano.com or follow me on Twitter @kentgraziano.
Speeding Time to Insight with a Modern ELT Approach - Databricks
The availability of new tools in the modern data stack is changing the way data teams operate. Specifically, the modern data stack supports an “ELT” approach for managing data, rather than the traditional “ETL” approach. In an ELT approach, data sources are automatically loaded in a normalized state into Delta Lake, and opinionated transformations happen in the data destination using dbt. This workflow allows data analysts to move more quickly from raw data to insight, while creating repeatable data pipelines robust to changes in the source datasets. In this presentation, we’ll illustrate how easy it is for even a data analytics team of one to develop an end-to-end data pipeline. We’ll load data from GitHub into Delta Lake, then use pre-built dbt models to feed a daily Redash dashboard on sales performance by manager, and use the same transformed models to power the data science team’s predictions of future sales by segment.
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling - Kent Graziano
This is a presentation I gave at OUGF14 in Helsinki, Finland.
Data Vault Data Modeling is an agile data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is a hybrid approach using the best of 3NF and dimensional modeling. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for the last 10 years but is still not widely known or understood. The purpose of this presentation is to provide attendees with a detailed introduction to the components of the Data Vault Data Model, what they are for, and how to build them. The examples will give attendees the basics of how to build and design structures incrementally, without constant refactoring, when using the Data Vault modeling technique. This technique works well for:
• Building the Enterprise Data Warehouse repository in a CIF architecture
• Building a Persistent Staging Area (PSA) in a Kimball Bus Architecture
• Building your data model incrementally, one sprint at a time using a repeatable technique
• Providing a model that is easily extensible without the need to re-engineer existing structures or load processes
Every business today wants to leverage data to drive strategic initiatives with machine learning, data science and analytics — but runs into challenges from siloed teams, proprietary technologies and unreliable data.
That’s why enterprises are turning to the lakehouse because it offers a single platform to unify all your data, analytics and AI workloads.
Join our How to Build a Lakehouse technical training, where we’ll explore how to use Apache Spark™, Delta Lake, and other open source technologies to build a better lakehouse. This virtual session will include concepts, architectures and demos.
Here’s what you’ll learn in this 2-hour session:
How Delta Lake combines the best of data warehouses and data lakes for improved data reliability, performance and security
How to use Apache Spark and Delta Lake to perform ETL processing, manage late-arriving data, and repair corrupted data directly on your lakehouse
I gave this presentation at Bill Inmon's Advanced Architecture Conference in 2011 in Evergreen, Colorado. This presentation covers a new breed of data warehousing called Operational Data Warehousing. These are the next steps in business intelligence towards self-service BI, enabling users to do more with their enterprise data warehouse solution. Specifically, it talks about how the Data Vault model fits into this picture.
If you would like to use the slides, please e-mail me first, I'd be happy to discuss it with you.
Part 2 of a two-part presentation that I gave in 2009, this presentation covers more about unstructured data and operational Data Vault components. Yes, even then I was commenting on how this market would evolve. If you want to use these slides, please let me know, and add "(C) Dan Linstedt, all rights reserved, http://LearnDataVault.com" in a VISIBLE fashion on your slides.
I gave this presentation at OUGF14 in Helsinki, Finland and again for TDWI Nashville. This presentation takes a look at the Agile Manifesto and the 12 Principles of Agile Development and discusses how these apply to Data Warehousing and Business Intelligence projects. Several examples and details from my past experience are included.
Data Vault Modeling and Methodology introduction that I provided to a Montreal event in September 2011. It covers an introduction and overview of the Data Vault components for Business Intelligence and Data Warehousing. I am Dan Linstedt, the author and inventor of Data Vault Modeling and methodology.
If you use the images anywhere in your presentations, please credit http://LearnDataVault.com as the source (me).
Thank-you kindly,
Daniel Linstedt
Data Vault 2.0: Using MD5 Hashes for Change Data Capture - Kent Graziano
This presentation was given at OakTable World 2014 (#OTW14) in San Francisco as a short TED-style 10-minute talk. In it I introduce Data Vault 2.0 and its innovative approach to doing change data capture in a data warehouse using MD5 hash columns.
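The core idea, sketched here with invented column values: concatenate the descriptive columns of each incoming row, hash them with MD5, and compare the result to the hash stored from the prior load. Only rows whose hash differs represent a change worth loading as a new version; comparing one hash replaces comparing every column.

```python
import hashlib

def row_hashdiff(values, delimiter="||"):
    """MD5 over the concatenated, normalized descriptive columns."""
    payload = delimiter.join(
        "" if v is None else str(v).strip().upper() for v in values
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

# Hash stored from the prior load vs. hashes of incoming rows.
stored = row_hashdiff(["Acme Corp", "Houston"])
unchanged = row_hashdiff(["Acme Corp", "Houston"])
changed = row_hashdiff(["Acme Corp", "Denver"])

print(stored == unchanged)  # equal hashes: no change, skip the row
print(stored == changed)    # hashes differ: load a new version
```

The delimiter and upper-casing shown are one possible normalization convention; the key point is that both sides of the comparison must use the same rules.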
Demystifying Data Warehousing as a Service (GLOC 2019) - Kent Graziano
Extended deck from the 2019 GLOC event in Cleveland. Discusses what a DWaaS is, the top 10 features of Snowflake that deliver it, and a checklist of questions to ask when choosing a cloud-based data warehouse.
Most NoSQL technologies focus on achieving scale-out ability by building their architecture around a simple, distributed hash key-value store. This works well for partitioning simple data, but in reality your information models are not simple. As a result, you may have to build enormous layers of code to manage an explicit structure baked into the persistence tier. In this session, take a look at a NoSQL solution which allows you to store naturally clustered, richly linked object networks beneath your key-partitioned roots. The result is that you do not have to write extensive code to deal with the physical structure in the persistence tier, even when dealing with complex information models like predictive models, time series, recursive relations, and compositions. We will explore how such an implementation works in practice by looking at a case study of an advanced model analytics and visualization solution built on the clustered NoSQL database Versant Database Engine.
Agile BI via Data Vault and Modelstorming - Daniel Upton
Audience: Business Intelligence Architects, Project Managers and Sponsors. This slideshow accompanies a video presentation of the same name, available at http://youtu.be/e0cHFdeGEeE.
[Given at DAMA WI, Nov 2018] With the increasing prevalence of semi-structured data from IoT devices, web logs, and other sources, data architects and modelers have to learn how to interpret and project data from things like JSON. While the concept of loading data without upfront modeling is appealing to many, ultimately, in order to make sense of the data and use it to drive business value, we have to turn that schema-on-read data into a real schema! That means data modeling! In this session I will walk through both simple and complex JSON documents, decompose them, then turn them into a representative data model using Oracle SQL Developer Data Modeler. I will show you how they might look using both traditional 3NF and data vault styles of modeling. In this session you will:
1. See what a JSON document looks like
2. Understand how to read it
3. Learn how to convert it to a standard data model
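The decomposition step above can be sketched as follows: scalar attributes of the JSON document become a parent row, and each element of a nested array becomes a child row carrying a foreign key back to the parent, ready to map to 3NF tables or a hub-and-satellite pair. The document and names here are invented for illustration:

```python
import json

doc = json.loads("""
{
  "customer_id": 42,
  "name": "Acme Corp",
  "addresses": [
    {"type": "billing",  "city": "Houston"},
    {"type": "shipping", "city": "Denver"}
  ]
}
""")

# Scalar attributes map to columns of the parent (customer) table.
customer_row = {k: v for k, v in doc.items()
                if not isinstance(v, (list, dict))}

# Each element of the nested array maps to a row of a child (address)
# table, carrying the parent's key as a foreign key.
address_rows = [{"customer_id": doc["customer_id"], **addr}
                for addr in doc["addresses"]]

print(customer_row)
print(address_rows)
```

Deeper nesting is handled the same way, one level at a time, which is why even complex documents decompose into a conventional relational (or Data Vault) model.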
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)Kent Graziano
(updated slides used for North Texas DAMA meetup Oct 2018) As we move more and more towards the need for everyone to do Agile Data Warehousing, we need a data modeling method that can be agile with us. Data Vault Data Modeling is an agile data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is a hybrid approach using the best of 3NF and dimensional modeling. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for over 15 years and is now growing in popularity. The purpose of this presentation is to provide attendees with an introduction to the components of the Data Vault Data Model, what they are for and how to build them. The examples will give attendees the basics:
• What the basic components of a DV model are
• How to build, and design structures incrementally, without constant refactoring
Top Five Cool Features in Oracle SQL Developer Data ModelerKent Graziano
This is the presentation I gave at OUGF14 in Helsinki, Finland in June 2014.
Oracle SQL Developer Data Modeler (SDDM) has been around for a few years now and is up to version 4.x. It really is an industrial strength data modeling tool that can be used for any data modeling task you need to tackle. Over the years I have found quite a few features and utilities in the tool that I rely on to make me more efficient (and agile) in developing my models. This presentation will demonstrate at least five of these features, tips, and tricks for you. I will walk through things like modifying the delivered reporting templates, how to create and applying object naming templates, how to use a table template and transformation script to add audit columns to every table, and using the new meta data export tool and several other cool things you might not know are there. Since there will likely be patches and new releases before the conference, there is a good chance there will be some new things for me to show you as well. This might be a bit of a whirlwind demo, so get SDDM installed on your device and bring it to the session so you can follow along.
Not to be confused with Oracle Database Vault (a commercial db security product), Data Vault Modeling is a specific data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for the last 10 years but is still not widely known or understood. The purpose of this presentation is to provide attendees with a detailed introduction to the technical components of the Data Vault Data Model, what they are for and how to build them. The examples will give attendees the basics for how to build, and design structures when using the Data Vault modeling technique. The target audience is anyone wishing to explore implementing a Data Vault style data model for an Enterprise Data Warehouse, Operational Data Warehouse, or Dynamic Data Integration Store. See more content like this by following my blog http://kentgraziano.com or follow me on twitter @kentgraziano.
Speeding Time to Insight with a Modern ELT ApproachDatabricks
The availability of new tools in the modern data stack is changing the way data teams operate. Specifically, the modern data stack supports an “ELT” approach for managing data, rather than the traditional “ETL” approach. In an ELT approach, data sources are automatically loaded in a normalized state into Delta Lake and opinionated transformations happen in the data destination using dbt. This workflow allows data analysts to move more quickly from raw data to insight, while creating repeatable data pipelines robust to changes in the source datasets. In this presentation, we’ll illustrate how easy it is for even a data analytics team of one to to develop an end-to-end data pipeline. We’ll load data from GitHub into Delta Lake, then use pre-built dbt models to feed a daily Redash dashboard on sales performance by manager, and use the same transformed models to power the data science team’s predictions of future sales by segment.
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingKent Graziano
This is a presentation I gave at OUGF14 in Helsinki, Finland.
Data Vault Data Modeling is an agile data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is a hybrid approach using the best of 3NF and dimensional modeling. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for the last 10 years but is still not widely known or understood. The purpose of this presentation is to provide attendees with a detailed introduction to the components of the Data Vault Data Model, what they are for and how to build them. The examples will give attendees the basics for how to build, and design structures incrementally, without constant refactoring, when using the Data Vault modeling technique. This technique works well for:
• Building the Enterprise Data Warehouse repository in a CIF architecture
• Building a Persistent Staging Area (PSA) in a Kimball Bus Architecture
• Building your data model incrementally, one sprint at a time using a repeatable technique
• Providing a model that is easily extensible without need to re-engineer existing structure or load processes
Every business today wants to leverage data to drive strategic initiatives with machine learning, data science and analytics — but runs into challenges from siloed teams, proprietary technologies and unreliable data.
That’s why enterprises are turning to the lakehouse because it offers a single platform to unify all your data, analytics and AI workloads.
Join our How to Build a Lakehouse technical training, where we’ll explore how to use Apache SparkTM, Delta Lake, and other open source technologies to build a better lakehouse. This virtual session will include concepts, architectures and demos.
Here’s what you’ll learn in this 2-hour session:
How Delta Lake combines the best of data warehouses and data lakes for improved data reliability, performance and security
How to use Apache Spark and Delta Lake to perform ETL processing, manage late-arriving data, and repair corrupted data directly on your lakehouse
I gave this presentation at the Advanced Architecture Conference, Bill Inmon, 2011 in Evergreen, Colorado. This presentation covers a new breed of data warehousing called Operational Data Warehousing. These are the next steps in business intelligence towards self-service BI and enabling users to do more with their enterprise data warehouse solution. Specifically, it talks about how the Data Vault model fits in to this picture.
If you would like to use the slides, please e-mail me first, I'd be happy to discuss it with you.
Part 2 of a 2 part presentation that I did in 2009, this presentation covers more about unstructured data, and operational data vault components. YES, even then I was commenting on how this market will evolve. IF you want to use these slides, please let me know, and add: "(C) Dan Linstedt, all rights reserved, http://LearnDataVault.com" in a VISIBLE fashion on your slides.
I gave this presentation at OUGF14 in Helsinki, Finland and again for TDWI Nashville. This presentation takes a look at the Agile Manifesto and the 12 Principles of Agile Development and discusses how these apply to Data Warehousing and Business Intelligence projects. Several examples and details from my past experience are included.
Data Vault Modeling and Methodology introduction that I provided to a Montreal event in September 2011. It covers an introduction and overview of the Data Vault components for Business Intelligence and Data Warehousing. I am Dan Linstedt, the author and inventor of Data Vault Modeling and methodology.
If you use the images anywhere in your presentations, please credit http://LearnDataVault.com as the source (me).
Thank-you kindly,
Daniel Linstedt
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureKent Graziano
This presentation was given at OakTable World 2014 (#OTW14) in San Francisco as a short Ted-style 10 minute talk. In it I introduce Data Vault 2.0 and its innovative approach to doing change data capture in a data warehouse by using MD5 Hash columns.
Demystifying Data Warehousing as a Service (GLOC 2019)Kent Graziano
Extended deck from the 2019 GLOC event in Cleveland. Discusses what a DWaaS is, the top 10 features of Snowflake that represent that, and a check list for what questions to ask when choosing a cloud based data warehouse.
The bulk of the NoSQL Technologies focus on achieving scale-out ability by building their architecture around a simple, distributed hash, key-value store. This works well for partitioning simple data, but in reality, your information models are not simple. As a result, you may have to build enormous layers of code to manage an explicit structure baked into the persistence tier. In this session, take a look at a NoSQL solution which allows you to store naturally clustered, richly linked object networks beneath your key partitioned roots. The result is that you do not have to write extensive code to deal with the physical structure in the persistence tier even when dealing with complex information models like predictive models, timeseries, recursive relations, compositions, etc. We will explore how such an implementation works in practice by looking at a case study of an advanced model analytics and visualization solution built on the clustered NoSQL database solution Versant Database Engine.
Agile BI via Data Vault and ModelstormingDaniel Upton
Audience: Business Intelligence Architects, Project Managers and Sponsors. This slideshow accompanies a video presentation of the same name, available at http://youtu.be/e0cHFdeGEeE.
[Given at DAMA WI, Nov 2018] With the increasing prevalence of semi-structured data from IoT devices, web logs, and other sources, data architects and modelers have to learn how to interpret and project data from things like JSON. While the concept of loading data without upfront modeling is appealing to many, ultimately, in order to make sense of the data and use it to drive business value, we have to turn that schema-on-read data into a real schema! That means data modeling! In this session I will walk through both simple and complex JSON documents, decompose them, then turn them into a representative data model using Oracle SQL Developer Data Modeler. I will show you how they might look using both traditional 3NF and data vault styles of modeling. In this session you will:
1. See what a JSON document looks like
2. Understand how to read it
3. Learn how to convert it to a standard data model
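The decomposition step can be illustrated with a toy example: top-level scalars and nested objects flatten into a parent entity, while each nested array becomes a child entity keyed back to the parent (a classic 1:M relationship in a 3NF model). A minimal Python sketch; the document shape and field names are invented for illustration, not taken from the session:

```python
import json

doc = json.loads("""
{
  "order_id": "SO-1001",
  "customer": {"id": "CUST-01", "name": "Acme"},
  "lines": [
    {"sku": "A1", "qty": 2},
    {"sku": "B7", "qty": 1}
  ]
}
""")

# Top-level scalars and nested objects flatten into the parent entity's row
order_row = {
    "order_id": doc["order_id"],
    "customer_id": doc["customer"]["id"],    # FK to a separate customer entity
    "customer_name": doc["customer"]["name"],
}

# Each element of a nested array becomes a row in a child entity,
# carrying the parent's key back with it
line_rows = [{"order_id": doc["order_id"], **line} for line in doc["lines"]]

print(len(line_rows))  # 2 child rows for one parent row
```

The same walk works for deeper nesting: each level of array becomes another child entity, keyed by its parent's identifier.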
Given at Oracle Open World 2011: Not to be confused with Oracle Database Vault (a commercial db security product), Data Vault Modeling is a specific data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It has been in use globally for over 10 years now but is not widely known. The purpose of this presentation is to provide an overview of the features of a Data Vault modeled EDW that distinguish it from the more traditional third normal form (3NF) or dimensional (i.e., star schema) modeling approaches used in most shops today. Topics will include dealing with evolving data requirements in an EDW (i.e., model agility), partitioning of data elements based on rate of change (and how that affects load speed and storage requirements), and where it fits in a typical Oracle EDW architecture. See more content like this by following my blog http://kentgraziano.com or follow me on twitter @kentgraziano.
Agile Data Engineering - Intro to Data Vault Modeling (2016) - Kent Graziano
(Updated deck) As we move more and more towards the need for everyone to do Agile Data Warehousing, we need a data modeling method that can be agile with us. Data Vault Data Modeling is an agile data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is a hybrid approach using the best of 3NF and dimensional modeling. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for over 10 years but is still not widely known or understood. The purpose of this presentation is to provide attendees with an introduction to the components of the Data Vault Data Model, what they are for and how to build them. The examples will give attendees the basics:
• What the basic components of a DV model are
• How to build and design structures incrementally, without constant refactoring
These are the slides from my talk at Data Day Texas 2016 (#ddtx16).
The world of data warehousing has changed! With the advent of Big Data, Streaming Data, IoT, and The Cloud, what is a modern data management professional to do? It may seem to be a very different world with different concepts, terms, and techniques. Or is it? Lots of people still talk about having a data warehouse or several data marts across their organization. But what does that really mean today in 2016? How about the Corporate Information Factory (CIF), the Data Vault, an Operational Data Store (ODS), or just star schemas? Where do they fit now (or do they)? And now we have the Extended Data Warehouse (XDW) as well. How do all these things help us bring value and data-based decisions to our organizations? Where do Big Data and the Cloud fit? Is there a coherent architecture we can define? This talk will endeavor to cut through the hype and the buzzword bingo to help you figure out what part of this is helpful. I will discuss what I have seen in the real world (working and not working!) and a bit of where I think we are going and need to go in 2016 and beyond.
Demystifying Data Warehouse as a Service (DWaaS) - Kent Graziano
This is from the talk I gave at the 30th Anniversary NoCOUG meeting in San Jose, CA.
We all know that data warehouses and best practices for them are changing dramatically today. As organizations build new data warehouses and modernize established ones, they are turning to Data Warehousing as a Service (DWaaS) in hopes of taking advantage of the performance, concurrency, simplicity, and lower cost of a SaaS solution or simply to reduce their data center footprint (and the maintenance that goes with that).
But what is a DWaaS really? How is it different from traditional on-premises data warehousing?
In this talk I will:
• Demystify DWaaS by defining it and its goals
• Discuss the real-world benefits of DWaaS
• Discuss some of the coolest features in a DWaaS solution as exemplified by the Snowflake Elastic Data Warehouse.
Building Modern Data Platform with Microsoft Azure - Dmitry Anoshin
This presentation covers Cloud history and Microsoft Azure data analytics capabilities. It also includes a real-world example of DW modernization. Finally, we will look at an alternative solution on Azure using Snowflake and Matillion ETL.
Best practices to deliver data analytics to the business with Power BI - Satya Shyam K Jayanty
Bring your data to life with Power BI visualization and insights!
With the changing landscape of Power BI features, it is essential to master configuration and deployment practices within your data platform to ensure you stay on par with compliance and security requirements. In this session we will go from the basics to advanced tricks in this landscape:
How do you deploy Power BI?
How do you implement configuration parameters and package BI features as part of an Office 365 rollout in your organisation?
What are the newest features and enhancements in the Power BI landscape?
How do you manage on-premises vs. cloud connectivity?
How can you help and support the Power BI community as well?
Within the objectives of this session, cloud computing is another aspect of this technology that makes it possible to get data to the end user within a few clicks. We will review how to manage and connect on-premises data to cloud capabilities, taking full advantage of data catalogue features while keeping data secure per Information Governance standards. Beyond the nuts and bolts, performance is another aspect every admin has to keep up with, so we will look at a few settings for maximizing performance and optimizing access to data. You will gain understanding and insight into the range of tools available for your Business Intelligence needs, and a showcase will demonstrate where to begin and how to proceed in the BI world.
- D BI A Consulting
consulting@dbia.uk
SharePoint migrations rarely turn out as you plan them. They are sometimes risky and too often take longer than planned. Over the last 10 years of migrating from SharePoint 2003, 2007, 2010 to the latest versions of SharePoint/Office 365 we’ve seen a consistent theme -- organizations underestimate the complexity and level of effort required for a successful, smooth migration.
Whether you are planning to complete your own migration, or engaging a vendor to assist, this webinar will discuss precautions you can take to avoid the slippery slope experienced in SharePoint migrations.
Join Jill Hannemann, Adam Levithan and our special guest Ryan Tully from Metalogix as they:
- Go through the assessment steps to understand the full landscape of your existing SharePoint environment
- Review methodologies for moving content from one environment to the next
- Outline precautions you should take in migrating to either SharePoint 2013 on-premise or online
My slide deck about the Common Data Service and Common Data Model. This technology is under development, so the content is subject to change; it reflects the service as of 4/13/2018.
Big Data Expo 2015 - Barnsten: Why Data Modelling is Essential - BigDataExpo
Learn tips and tricks for handling data modeling in your Big Data environment. Mark will show how modeling adds value to the business and how to make your Big Data landscape transparent across the organization.
You will see the latest modeling techniques for Big Data and the different types of modeling notations, and you will learn how to integrate data modeling into your BI environment.
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI - Denodo
Watch full webinar here: https://bit.ly/3zVJRRf
According to Dresner Advisory’s 2020 Self-Service Business Intelligence Market Study, 62% of responding organizations say self-service BI is critical for their business. If we look deeper into the need for today’s self-service BI, it goes beyond executives and business users being enabled by IT for self-service dashboarding or report generation. Predictive analytics, self-service data preparation, and collaborative data exploration are all facets of new-generation self-service BI. While the democratization of data for self-service BI holds many benefits, strict data governance becomes increasingly important alongside it.
In this session we will discuss:
- The latest trends and scopes of self-service BI
- The role of logical data fabric in self-service BI
- How Denodo enables self-service BI for a wide range of users
- Customer case study on self-service BI
Want to know more about the Common Data Model and Common Data Service? Need to understand the difference between CDS for Apps and CDS for Analytics? Feel free to use these slides and send me your feedback.
Enterprise Data World 2018 - Building Cloud Self-Service Analytical Solution - Dmitry Anoshin
This session will cover building a modern data warehouse by migrating from a traditional DW platform into the cloud, using Amazon Redshift and the cloud ETL tool Matillion to provide self-service BI for the business audience. It will cover the technical migration path of a DW with PL/SQL ETL to Amazon Redshift via Matillion ETL, with a detailed comparison of modern ETL tools. Moreover, this talk will focus on working backward through the process, i.e., starting from the business audience and the needs that drive changes in the old DW. Finally, it will cover the idea of self-service BI, and the author will share a step-by-step plan for building an efficient self-service environment using the modern BI platform Tableau.
No doubt visualization of data is a key component of our industry. The path data travels from its creation until it takes shape in a chart is sometimes obscure and overlooked, as it tends to live on the engineering side (when volume is relevant), an area Data Scientists tend to visit but the usual Web/Marketing Data Analyst does not. Nowadays there are many options for taming that journey and making the best of it, and they don't require extensive engineering knowledge. Small or Big Data, let's see what "Store, Extract, Transform, Load, Visualize" is all about.
Businesses cannot compete without data. Every organization produces and consumes it. Data trends are hitting the mainstream and businesses are adopting buzzwords such as Big Data, data vault, data scientist, etc., to seek solutions for their fundamental data issues. Few realize that the importance of any solution, regardless of platform or technology, relies on the data model supporting it. Data modeling is not an optional task for an organization’s data remediation effort. Instead, it is a vital activity that supports the solution driving your business.
This webinar will address emerging trends around data model application methodology, as well as trends around the practice of data modeling itself. We will discuss abstract models and entity frameworks, as well as the general shift from data modeling being segmented to becoming more integrated with business practices.
Takeaways:
How are anchor modeling, data vault, etc. different and when should I apply them?
Integrating data models to business models and the value this creates
Application development (Data first, code first, object first)
It is a fascinating, explosive time for enterprise analytics.
It is from the position of analytics leadership that the mission will be executed and company leadership will emerge. The data professional is absolutely sitting on the performance of the company in this information economy and has an obligation to demonstrate the possibilities and originate the architecture, data, and projects that will deliver analytics. After all, no matter what business you’re in, you’re in the business of analytics.
The coming years will be full of big changes in enterprise analytics and Data Architecture. William will kick off the fourth year of the Advanced Analytics series with a discussion of the trends winning organizations should build into their plans, expectations, vision, and awareness now.
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data Prep and Data Blending - Denodo
Watch the full session: Denodo DataFest 2016 sessions: https://goo.gl/Bvmvc9
Data prep and data blending are terms that have come to prominence over the last year or two. On the surface, they appear to offer functionality similar to data virtualization…but there are important differences!
In this session, you will learn:
• How data virtualization complements or contrasts technologies such as data prep and data blending
• Pros and cons of functionality provided by data prep, data catalog and data blending tools
• When and how to use these different technologies to be most effective
This session is part of the Denodo DataFest 2016 event. You can also watch more Denodo DataFest sessions on demand here: https://goo.gl/VXb6M6
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote - Caserta
The “Big Data era” has ushered in an avalanche of new technologies and approaches for delivering information and insights to business users. What is the role of the cloud in your analytical environment? How can you make your migration as seamless as possible? This closing keynote, delivered by Joe Caserta, a prominent consultant who has helped many global enterprises adopt Big Data, provided the audience with the inside scoop needed to supplement data warehousing environments with data intelligence—the amalgamation of Big Data and business intelligence.
This presentation was given as the closing keynote at DBTA's annual Data Summit in NYC.
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
1. Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Kent Graziano, Data Warrior LLC
Stewart Bryson, Rittman Mead
2. Kent Graziano
Twitter: @KentGraziano
Certified Data Vault Master
Oracle ACE Director, Oracle BI&DW
Data Architecture and Data Warehouse Specialist
● 30+ years in IT
● 20+ years of Oracle-related work
● 15+ years of data warehousing experience
Co-Author of 2 Books
● The Business of Data Vault Modeling
● The Data Model Resource Book (1st Edition)
Editor of “The” Data Vault Book
● Super Charge Your Data Warehouse
Co-Chair of the BI/DW SIG for ODTUG
Past President of the Oracle Development Tools User Group and the Rocky Mountain Oracle User Group
3. Stewart Bryson
• Twitter: @StewartBryson
• Oracle ACE in BI/DW
• Oracle BI/DW Architect and Delivery Specialist
• Community Speaker and Enthusiast
• Writer for the Rittman Mead Blog: http://www.rittmanmead.com/blog
• US Conference Chair of the Rittman Mead BI Forum
• Developer of the Transcend Framework
• Email: stewart.bryson@rittmanmead.com
• Real Time BI with Kevin & Stewart
‣ iTunes: http://bit.ly/realtimebi
‣ YouTube: http://www.youtube.com/user/realtimebi
4. About Rittman Mead
• Oracle BI and DW Partner
• World leader in solutions delivery and innovation in Oracle BI
• Approximately 70 consultants worldwide
• Offices in the US (Atlanta), Europe, Australia, and India
• Skills in a broad range of supporting Oracle BI tools
‣ OBIEE
‣ OBIA
‣ ODI EE
‣ Essbase, Oracle OLAP
‣ GoldenGate
‣ Exadata
‣ Endeca
5. Questions We Hope to Answer
• Data Vault
‣ What is Data Vault?
‣ Why would I choose Data Vault over competing technologies?
• Oracle Information Management Reference Architecture
‣ What are the core components of the Reference Architecture?
‣ Is there possibly an acronym for that?
• Oracle Business Intelligence
‣ What is OBIEE?
‣ Why does OBIEE work so well with Data Vault?
• What is Agile BI and how does this help?
6. Oracle Information Management Reference Architecture
Staging Layer
● Change tables for Oracle GoldenGate
● Reject tables for Data Quality
● External tables for file feeds
Foundation Layer
● Transactional granularity maintained
● Process neutral: no user or business requirements
● Just recording what happened
Access and Performance Layer
● Dimensional model
● “Star Schemas”
● Process specific: targeting user and business requirements
7. What is Data Vault Trying to Solve?
What are our other Enterprise Data Warehouse options?
● Third Normal Form (3NF): complex primary keys (PKs) with cascading snapshot dates
● Star Schema (Dimensional): difficult to reengineer fact tables for granularity changes
Difficult to get it right the first time
Not adaptable to rapid business change
NOT AGILE!
8. Data Vault: Definition
The Data Vault is a detail-oriented, historical-tracking, and uniquely linked set of normalized tables that support one or more functional areas of business.
It is a hybrid approach encompassing the best of breed between third normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses.
Dan Linstedt, “Defining the Data Vault,” TDAN.com article
10. What is the Foundation Layer?
• Basis for a long-term, enterprise-scale data warehouse
• Must be atomic-level data
‣ A historical source of facts
‣ No user requirements applied
• Not based on any one data source or system
• Single point of integration
• Flexible
• Extensible
• Provides data to the access/reporting layer
‣ Based on targeted business requirements
‣ Can be virtual
11. Standard Approach to Agile Business Intelligence
• Design iterations around smaller chunks
‣ Iteration 1: Interviews and user requirements
‣ Iteration 2: Logical modeling
‣ Iteration 3: ETL development
‣ Iteration 4: Front-end development
• Requires 4 iterations before we get any usable content
12. Manifesto for Agile Software Development
“We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more.”
http://agilemanifesto.org/
13. Applying the Agile Manifesto to BI Development
User stories instead of requirements documents
● The user asks for content or functionality through a narrative
● Typically includes the current version of the report
Time-boxed iterations
● Each iteration has a standard length
● Choose one or more user stories to fit in that iteration
Rework is part of the game
● There are no “missed requirements”... only those that haven’t been delivered yet.
14. What is Our Approach?
Model iteratively
● Use the Data Vault data modeling technique
● Create basic components, then add over time
Virtualize the Access Layer
● Don’t waste time building facts and dimensions up front
● ETL and testing take too long
● “Project” objects using the pattern-based DV model with the OBIEE BMM
Users see real reports with real data
16. 1. Hubs = Business Keys
Hubs = unique lists of business keys
Business keys are used to TRACK and IDENTIFY key information
17. 2. Links = Associations
Links = transactions and associations
They are used to hook together multiple sets of information
18. 3. Satellites = Descriptors
Satellites provide context for the Hubs and the Links
19. Flexibility (Agility) and Productivity
• Adding new components to the EDW has NEAR ZERO impact on:
‣ Existing loading processes
‣ Existing data model
‣ Existing reporting & BI functions
‣ Existing source systems
‣ Existing star schemas and data marts
• Standardized modeling rules
‣ Highly repeatable and learnable modeling technique
‣ Allows automation of models, loads, and extracts
‣ Can use a BI meta-layer to virtualize the reporting structures
‣ OBIEE Business Model and Mapping tool
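Virtualizing a reporting structure can be as simple as a database view that projects the current satellite row for each hub key; a semantic layer such as OBIEE's BMM then maps onto the view instead of a physically loaded dimension. A hedged sketch with Python and sqlite3 (all names are illustrative, and a production view would add ghost records, multiple satellites, etc.):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE hub_customer (
    hub_customer_key INTEGER PRIMARY KEY,
    customer_bk TEXT NOT NULL UNIQUE,
    load_dts TEXT NOT NULL
);
CREATE TABLE sat_customer (
    hub_customer_key INTEGER NOT NULL,
    load_dts TEXT NOT NULL,
    name TEXT, city TEXT,
    PRIMARY KEY (hub_customer_key, load_dts)
);
INSERT INTO hub_customer VALUES (1, 'CUST-001', '2014-01-01');
INSERT INTO sat_customer VALUES (1, '2014-01-01', 'Acme', 'Denver');
INSERT INTO sat_customer VALUES (1, '2014-06-01', 'Acme Corp', 'Houston');

-- Virtual Type 1 dimension: the latest satellite row per hub key.
-- No ETL: the 'dimension' is just a projection over the Data Vault.
CREATE VIEW dim_customer AS
SELECT h.hub_customer_key AS customer_key,
       h.customer_bk,
       s.name, s.city
FROM hub_customer h
JOIN sat_customer s
  ON s.hub_customer_key = h.hub_customer_key
 AND s.load_dts = (SELECT MAX(load_dts)
                   FROM sat_customer s2
                   WHERE s2.hub_customer_key = s.hub_customer_key);
""")

row = cur.execute("SELECT name, city FROM dim_customer").fetchone()
print(row)  # only the current version: ('Acme Corp', 'Houston')
```

Because the projection is pattern-based (hub supplies the key, satellite supplies the attributes), the same view template can be stamped out, or generated, for every dimension.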
20. What is OBIEE?
• Dashboards, ad-hoc reporting, alerts, Microsoft Office integration
‣ High-quality graphical, role/user-based views
‣ Multiple views of the same data
• Point-and-click ease of use
• Common Enterprise Information Model
‣ Unified semantic/logical view of data from multiple sources
‣ Heterogeneous database access
‣ True enterprise deployment
• Alerts, scheduling, and distribution
21. Where Does OBIEE Fit?
• OBIEE is the Information Access Layer
• The BI abstraction layer allows us to “bypass” the creation of the Access & Performance Layer
• We “virtualize” the dependent data marts
22. Flow of Data Through the Three-Layer Semantic Model
Simplification of the data model
Integration of disparate data sources
Addition of business logic and calculations
Addition of aggregate sources
24. OBIEE Tips and Tricks (Discovered by Stewart Bryson)
• Create folders in the Physical Layer
‣ Separate Hubs, Links, and Satellites
‣ Each has distinct uses
• Hubs
‣ Business keys
‣ Used in defining primary keys and level keys
• Links
‣ Used in extending the Logical Table Source (LTS)
‣ Never referenced in display columns or measures
• Satellites
‣ Use these for attributes and measures
‣ Anything displayed to the user
43. Organizations Using Data Vault
• WebMD Health Services
• Anthem Blue Cross Blue Shield
• Denver Public Schools
• Independent Purchasing Cooperative (IPC, Miami), owner of Subway
• Kaplan
• US Defense Department
• Colorado Springs Utilities
• State Court of Wyoming
• Federal Express
• US Dept. of Agriculture
44. Summary
• Data Vault provides a data modeling technique that allows:
‣ Model agility, enabling rapid changes and additions
‣ Productivity, enabling low-complexity systems with high-value output at a rapid pace
‣ Easy projections of dimensional models
• OBIEE provides:
‣ A framework for Agile BI
‣ Rapid development of a virtualized layer on a Data Vault model
45. Super Charge Your Data Warehouse
Available on Amazon.com in soft cover or Kindle format
Now also available in PDF at LearnDataVault.com
Hint: Kent is the Technical Editor
46. Kscope Special for LearnDataVault
Go to http://learndatavault.com/kscope13
Discount coupons for:
● The Super Charge book
● The DV Implementation course
● The DV using Informatica course
48. Contact Information
Kent Graziano, The Oracle Data Warrior
Data Warrior LLC
Kent.graziano@att.net
Blog: http://kentgraziano.com
Twitter: @KentGraziano
Stewart Bryson, US Managing Director
Rittman Mead
stewart.bryson@rittmanmead.com
www.rittmanmead.com
Twitter: @stewartbryson
T: +44 (0) 8446 697 995